Compare commits

...

989 Commits

Author SHA1 Message Date
Thiago Marques
941f96ce0e add helpers to select kernel in some distributions 2021-10-19 20:17:09 +00:00
Thiago Marques
720324afab Merge remote-tracking branch 'upstream/master' 2021-10-19 16:58:24 +00:00
Andrii Nakryiko
92c1e61a60 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   0e545dbaa2797133f57bf8387e8f74cd245cedea
Checkpoint bpf-next commit: 5319255b8df9271474bc9027cabf82253934f28d
Baseline bpf commit:        d0c6416bd7091647f6041599f396bfa19ae30368
Checkpoint bpf commit:      8d6c414cd2fb74aa6812e9bfec6178f8246c4f3a

Hou Tao (1):
  libbpf: Support detecting and attaching of writable tracepoint program

 src/libbpf.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

--
2.30.2
2021-10-11 12:30:36 -07:00
Hou Tao
133e3603ec libbpf: Support detecting and attaching of writable tracepoint program
Program on writable tracepoint is BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
but its attachment is the same as BPF_PROG_TYPE_RAW_TRACEPOINT.
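
A minimal usage sketch (hedged: the tracepoint name and context type below
are hypothetical, and "raw_tp.w/" is assumed to be the writable raw
tracepoint section prefix recognized by this change):

  /* BPF side: program type inferred from the writable raw_tp section name */
  SEC("raw_tp.w/bpf_testmod_test_writable")
  int handle_writable(void *ctx)
  {
      return 0;
  }

  /* user space: attachment path is the same as for read-only raw tracepoints */
  link = bpf_program__attach(prog);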

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211004094857.30868-3-hotforest@gmail.com
2021-10-11 12:30:36 -07:00
Andrii Nakryiko
f9f6e92458 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   38261f369fb905552ebdd3feb9699c0788fd3371
Checkpoint bpf-next commit: 0e545dbaa2797133f57bf8387e8f74cd245cedea
Baseline bpf commit:        571fa247ab411f3233eeaaf837c6e646a513b9f8
Checkpoint bpf commit:      d0c6416bd7091647f6041599f396bfa19ae30368

Alexei Starovoitov (1):
  libbpf: Make gen_loader data aligned.

Andrii Nakryiko (2):
  libbpf: Add API that copies all BTF types from one BTF object to
    another
  libbpf: Fix memory leak in strset

Grant Seltzer (1):
  libbpf: Add API documentation convention guidelines

Hengqi Chen (3):
  libbpf: Support uniform BTF-defined key/value specification across all
    BPF maps
  libbpf: Deprecate bpf_object__unload() API since v0.6
  libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7

Kumar Kartikeya Dwivedi (5):
  libbpf: Fix skel_internal.h to set errno on loader retval < 0
  libbpf: Support kernel module function calls
  libbpf: Resolve invalid weak kfunc calls with imm = 0, off = 0
  libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations
  libbpf: Fix segfault in light skeleton for objects without BTF

Toke Høiland-Jørgensen (1):
  libbpf: Properly ignore STT_SECTION symbols in legacy map definitions

 docs/libbpf_naming_convention.rst |  40 ++++
 src/bpf.c                         |   1 +
 src/bpf_gen_internal.h            |  16 +-
 src/btf.c                         | 132 ++++++++++++-
 src/btf.h                         |  22 +++
 src/gen_loader.c                  | 317 +++++++++++++++++++++++++-----
 src/libbpf.c                      | 165 ++++++++++++----
 src/libbpf.h                      |  36 ++--
 src/libbpf.map                    |   5 +
 src/libbpf_internal.h             |   3 +
 src/skel_internal.h               |   6 +-
 src/strset.c                      |   1 +
 12 files changed, 633 insertions(+), 111 deletions(-)

--
2.30.2
2021-10-06 14:40:18 -07:00
Andrii Nakryiko
b05bace770 libbpf: Fix memory leak in strset
Free struct strset itself, not just its internal parts.

Fixes: 90d76d3ececc ("libbpf: Extract internal set-of-strings datastructure APIs")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211001185910.86492-1-andrii@kernel.org
2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi
7022527a7b libbpf: Fix segfault in light skeleton for objects without BTF
When fed an empty BPF object, bpftool gen skeleton -L crashes at
btf__set_fd() since it assumes presence of obj->btf; however, for
the sequence below clang adds no .BTF section (hence no BTF).

Reproducer:

  $ touch a.bpf.c
  $ clang -O2 -g -target bpf -c a.bpf.c
  $ bpftool gen skeleton -L a.bpf.o
  /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
  /* THIS FILE IS AUTOGENERATED! */

  struct a_bpf {
	struct bpf_loader_ctx ctx;
  Segmentation fault (core dumped)

The same occurs for files compiled without BTF info, i.e. without
clang's -g flag.

Fixes: 67234743736a (libbpf: Generate loader program out of BPF ELF file.)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210930061634.1840768-1-memxor@gmail.com
2021-10-06 14:40:18 -07:00
Hengqi Chen
d3169df794 libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7
Deprecate bpf_{map,program}__{prev,next} APIs. Replace them with
a new set of APIs named bpf_object__{prev,next}_{program,map} which
follow the libbpf API naming convention ([0]). No functionality changes.

  [0] Closes: https://github.com/libbpf/libbpf/issues/296
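
As a hedged illustration, program iteration under the new naming convention
looks roughly like this (the deprecated form took the program first and the
object second):

  struct bpf_program *prog = NULL;

  /* old: prog = bpf_program__next(prog, obj); */
  while ((prog = bpf_object__next_program(obj, prog))) {
      /* ... inspect or configure prog ... */
  }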

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211003165844.4054931-2-hengqi.chen@gmail.com
2021-10-06 14:40:18 -07:00
Hengqi Chen
d665ca0bb0 libbpf: Deprecate bpf_object__unload() API since v0.6
BPF objects are not reloadable after unload. Users are expected to use
bpf_object__close() to unload and free up resources in one operation.
No need to expose bpf_object__unload() as a public API, deprecate it
([0]).  Add bpf_object__unload() as an alias to internal
bpf_object_unload() and replace all bpf_object__unload() uses to avoid
compilation errors.

  [0] Closes: https://github.com/libbpf/libbpf/issues/290

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211002161000.3854559-1-hengqi.chen@gmail.com
2021-10-06 14:40:18 -07:00
Grant Seltzer
744fd961c7 libbpf: Add API documentation convention guidelines
This adds a section to the documentation for libbpf
naming convention which describes how to document
API features in libbpf, specifically the format of
which API doc comments need to conform to.
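
Roughly, the convention calls for doxygen/sphinx-compatible comments of the
following shape (a sketch, not quoted verbatim from the guidelines):

  /**
   * @brief **bpf_map__fd()** gets the file descriptor of the given map
   * @param map the BPF map instance
   * @return the file descriptor; negative error code otherwise
   */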

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211004215644.497327-1-grantseltzer@gmail.com
2021-10-06 14:40:18 -07:00
Andrii Nakryiko
13ebb60ab6 libbpf: Add API that copies all BTF types from one BTF object to another
Add a bulk copying API, btf__add_btf(), that speeds up and simplifies
appending entire contents of one BTF object to another one, taking care
of copying BTF type data, adjusting resulting BTF type IDs according to
their new locations in the destination BTF object, as well as copying
and deduplicating all the referenced strings and updating all the string
offsets in new BTF types as appropriate.

This API is intended to be used from tools that are generating and
otherwise manipulating BTFs generically, such as pahole. In pahole's
case, this API is useful for speeding up parallelized BTF encoding, as
it allows pahole to offload all the intricacies of BTF type copying to
libbpf and handle the parallelization aspects of the process.
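
A hedged usage sketch: btf__add_btf() returns the ID assigned to the first
appended type in the destination BTF, which the caller can use to remap any
references it holds:

  int first_id = btf__add_btf(dst_btf, src_btf);

  if (first_id < 0)
      /* handle error */;
  /* type N of src_btf is now type (first_id + N - 1) of dst_btf */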

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/bpf/20211006051107.17921-2-andrii@kernel.org
2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi
3298393748 libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations
This change updates the BPF syscall loader to relocate BTF_KIND_FUNC
relocations, with support for weak kfunc relocations. The general idea
is to move map_fds to loader map, and also use the data for storing
kfunc BTF fds. Since both reuse the fd_array parameter, they need to be
kept together.

For map_fds, we reserve MAX_USED_MAPS slots in a region, and for kfunc,
we reserve MAX_KFUNC_DESCS. This is done so that insn->off has more
chances of being <= INT16_MAX than treating data map as a sparse array
and adding fd as needed.

When the MAX_KFUNC_DESCS limit is reached, we fall back to the sparse
array model, so that as long as it does remain <= INT16_MAX, we pass an
index relative to the start of fd_array.

We store all ksyms in an array where we try to avoid calling the
bpf_btf_find_by_name_kind helper, and also reuse the BTF fd that was
already stored. This also speeds up the loading process compared to
emitting calls in all cases, in later tests.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-9-memxor@gmail.com
2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi
0141d9dded libbpf: Resolve invalid weak kfunc calls with imm = 0, off = 0
Preserve these calls as it allows verifier to succeed in loading the
program if they are determined to be unreachable after dead code
elimination during program load. If not, the verifier will fail at
runtime. This is done for ext->is_weak symbols similar to the case for
variable ksyms.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-8-memxor@gmail.com
2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi
7dde8f8f8d libbpf: Support kernel module function calls
This patch adds libbpf support for kernel module function calls.
The fd_array parameter is used during BPF program load to pass module
BTFs referenced by the program. insn->off is set to index into this
array, but starts from 1, because insn->off as 0 is reserved for
btf_vmlinux.

We try to use existing insn->off for a module, since the kernel limits
the maximum distinct module BTFs for kfuncs to 256, and also because
index must never exceed the maximum allowed value that can fit in
insn->off (INT16_MAX). In the future, if kernel interprets signed offset
as unsigned for kfunc calls, this limit can be increased to UINT16_MAX.

Also introduce a btf__find_by_name_kind_own helper to start searching
from module BTF's start id when we know that the BTF ID is not present
in vmlinux BTF (in find_ksym_btf_id).

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-7-memxor@gmail.com
2021-10-06 14:40:18 -07:00
Hengqi Chen
962f241379 libbpf: Support uniform BTF-defined key/value specification across all BPF maps
A bunch of BPF maps do not support specifying BTF types for key and value.
This is non-uniform and inconvenient[0]. Currently, libbpf uses retry
logic which removes BTF type IDs when BPF map creation fails. Instead
of retrying, this commit recognizes those specialized maps and removes
the BTF type IDs when creating the BPF map.

  [0] Closes: https://github.com/libbpf/libbpf/issues/355
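
For example (a sketch; the perf event array is just one assumed instance of
such a specialized map), key/value types can now be declared uniformly and
libbpf drops the BTF type IDs itself where the kernel would reject them:

  struct {
      __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
      __uint(max_entries, 64);
      __type(key, int);
      __type(value, int);
  } events SEC(".maps");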

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210930161456.3444544-2-hengqi.chen@gmail.com
2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi
8c2d905ff4 libbpf: Fix skel_internal.h to set errno on loader retval < 0
When the loader indicates an internal error (result of a checked bpf
system call), it returns the result in attr.test.retval. However, tests
that rely on ASSERT_OK_PTR on NULL (returned from light skeleton) may
miss that NULL denotes an error if errno is set to 0. This would result
in the skel pointer being NULL while ASSERT_OK_PTR returns 1, leading to
a SEGV on dereference of skel, because libbpf_get_error relies on the
assumption that errno is always set in case of error for ptr == NULL.

In particular, this was observed for the ksyms_module test. When
executed using `./test_progs -t ksyms`, prior tests manipulated errno
and the test didn't crash when it failed at ksyms_module load, while
using `./test_progs -t ksyms_module` crashed due to errno being
untouched.

Fixes: 67234743736a (libbpf: Generate loader program out of BPF ELF file.)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210927145941.1383001-11-memxor@gmail.com
2021-10-06 14:40:18 -07:00
Toke Høiland-Jørgensen
531be4d879 libbpf: Properly ignore STT_SECTION symbols in legacy map definitions
The previous patch to ignore STT_SECTION symbols only added the ignore
condition in one of them. This fails if there's more than one map
definition in the 'maps' section, because the subsequent modulus check will
fail, resulting in error messages like:

libbpf: elf: unable to determine legacy map definition size in ./xdpdump_xdp.o

Fix this by also ignoring STT_SECTION in the first loop.

Fixes: c3e8c44a9063 ("libbpf: Ignore STT_SECTION symbols in 'maps' section")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210929213837.832449-1-toke@redhat.com
2021-10-06 14:40:18 -07:00
Alexei Starovoitov
f49c65d216 libbpf: Make gen_loader data aligned.
Align gen_loader data to 8 byte boundary to make sure union bpf_attr,
bpf_insns and other structs are aligned.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210927145941.1383001-9-memxor@gmail.com
2021-10-06 14:40:18 -07:00
Sergei Iudin
b0feb9b4d5 Remove caching of pahole test result
The initial idea was to run it hourly, but when it runs hourly it produces a
lot of useless noise, so we had to switch it to daily. When we run it daily,
caching makes much less sense and only makes debugging more complex.
2021-10-04 11:20:32 -07:00
Andrii Nakryiko
e671a47bc2 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   69cd823956ba8ce266a901170b1060db8073bddd
Checkpoint bpf-next commit: 38261f369fb905552ebdd3feb9699c0788fd3371
Baseline bpf commit:        bc23f724481759d0fac61dfb5ce979af2190bbe0
Checkpoint bpf commit:      571fa247ab411f3233eeaaf837c6e646a513b9f8

Andrii Nakryiko (15):
  libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id()
  libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs
  libbpf: Allow skipping attach_func_name in
    bpf_program__set_attach_target()
  libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7
  libbpf: Constify all high-level program attach APIs
  libbpf: Fix memory leak in legacy kprobe attach logic
  libbpf: Refactor and simplify legacy kprobe code
  libbpf: Add legacy uprobe attaching support
  libbpf: Add "tc" SEC_DEF which is a better name for "classifier"
  libbpf: Refactor internal sec_def handling to enable pluggability
  libbpf: Reduce reliance of attach_fns on sec_def internals
  libbpf: Refactor ELF section handler definitions
  libbpf: Complete SEC() table unification for
    BPF_APROG_SEC/BPF_EAPROG_SEC
  libbpf: Add opt-in strict BPF program section name handling logic
  selftests/bpf: Switch sk_lookup selftests to strict SEC("sk_lookup")
    use

Dave Marchevsky (4):
  bpf: Add bpf_trace_vprintk helper
  libbpf: Modify bpf_printk to choose helper based on arg count
  libbpf: Use static const fmt string in __bpf_printk
  bpf: Clarify data_len param in bpf_snprintf and bpf_seq_printf
    comments

Grant Seltzer (1):
  libbpf: Add doc comments in libbpf.h

Kumar Kartikeya Dwivedi (1):
  libbpf: Fix segfault in static linker for objects without BTF

Matteo Croce (1):
  bpf: Update bpf_get_smp_processor_id() documentation

Toke Høiland-Jørgensen (1):
  libbpf: Ignore STT_SECTION symbols in 'maps' section

 include/uapi/linux/bpf.h |  18 +-
 src/bpf_helpers.h        |  51 ++-
 src/libbpf.c             | 882 +++++++++++++++++++++++----------------
 src/libbpf.h             | 106 +++--
 src/libbpf_common.h      |   5 +
 src/libbpf_internal.h    |   7 +
 src/libbpf_legacy.h      |   9 +
 src/linker.c             |   8 +-
 8 files changed, 679 insertions(+), 407 deletions(-)

--
2.30.2
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
151e3cb314 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-09-29 12:05:46 -07:00
Kumar Kartikeya Dwivedi
2cfeea135c libbpf: Fix segfault in static linker for objects without BTF
When a BPF object is compiled without BTF info (without -g),
trying to link such objects using bpftool causes a SIGSEGV due to
btf__get_nr_types accessing obj->btf which is NULL. Fix this by
checking for the NULL pointer and returning an error.

Reproducer:
$ cat a.bpf.c
extern int foo(void);
int bar(void) { return foo(); }
$ cat b.bpf.c
int foo(void) { return 0; }
$ clang -O2 -target bpf -c a.bpf.c
$ clang -O2 -target bpf -c b.bpf.c
$ bpftool gen obj out a.bpf.o b.bpf.o
Segmentation fault (core dumped)

After fix:
$ bpftool gen obj out a.bpf.o b.bpf.o
libbpf: failed to find BTF info for object 'a.bpf.o'
Error: failed to link 'a.bpf.o': Unknown error -22 (-22)

Fixes: a46349227cd8 (libbpf: Add linker extern resolution support for functions and global variables)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210924023725.70228-1-memxor@gmail.com
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
3fe8ee2edb selftests/bpf: Switch sk_lookup selftests to strict SEC("sk_lookup") use
Update "sk_lookup/" definition to be a stand-alone type specifier,
with backwards-compatible prefix match logic in non-libbpf-1.0 mode.

Currently in selftests, all the "sk_lookup/<whatever>" uses just use
<whatever> to encode a unique name, which is redundant since the BPF
program's name (its C function name) already uniquely and descriptively
identifies the intended use of such BPF programs.

With libbpf's SEC_DEF("sk_lookup") definition updated, switch existing
sk_lookup programs to use "unqualified" SEC("sk_lookup") section names,
with no random text after it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-11-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
823881b4f6 libbpf: Add opt-in strict BPF program section name handling logic
Implement strict ELF section name handling for BPF programs. It utilizes
`libbpf_set_strict_mode()` framework and adds new flag: LIBBPF_STRICT_SEC_NAME.

If this flag is set, libbpf will enforce exact section name matching for
a lot of program types that previously allowed just partial prefix
match. E.g., if previously SEC("xdp_whatever_i_want") was allowed, now
in strict mode only SEC("xdp") will be accepted, which makes SEC("")
definitions cleaner and more structured. SEC() now won't be used as yet
another way to uniquely encode BPF program identifier (for that
C function name is better and is guaranteed to be unique within
bpf_object). Now SEC() is strictly BPF program type and, depending on
program type, extra load/attach parameter specification.

Libbpf completely supports multiple BPF programs in the same ELF
section, so multiple BPF programs of the same type/specification easily
co-exist together within the same bpf_object scope.

Additionally, a new (for now internal) convention is introduced: a section
name can be a stand-alone exact BPF program type specifier, but can also
have extra parameters after a '/' delimiter. An example of such a section
is "struct_ops", which can be specified by itself but also allows specifying
the intended operation to attach to, e.g., "struct_ops/dctcp_init". Note
that "struct_ops_some_op" is not allowed. Such a section definition is
spelled as "struct_ops+".

This change is part of libbpf 1.0 effort ([0], [1]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/271
  [1] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling
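
A hedged sketch of opting in; with the flag set, only exact section names
such as SEC("xdp") are accepted for the affected program types:

  /* user space, before opening any bpf_object */
  libbpf_set_strict_mode(LIBBPF_STRICT_SEC_NAME);

  /* BPF side: exact program type name, no arbitrary suffix */
  SEC("xdp")
  int xdp_prog(struct xdp_md *ctx)
  {
      return XDP_PASS;
  }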

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-10-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
20c1aa11b1 libbpf: Complete SEC() table unification for BPF_APROG_SEC/BPF_EAPROG_SEC
Complete SEC() table refactoring towards unified form by rewriting
BPF_APROG_SEC and BPF_EAPROG_SEC definitions with
SEC_DEF(SEC_ATTACHABLE_OPT) (for optional expected_attach_type) and
SEC_DEF(SEC_ATTACHABLE) (mandatory expected_attach_type), respectively.
Drop BPF_APROG_SEC, BPF_EAPROG_SEC, and BPF_PROG_SEC_IMPL macros after
that, leaving SEC_DEF() macro as the only one used.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-9-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
62ea715f71 libbpf: Refactor ELF section handler definitions
Refactor ELF section handler definitions table to use a set of flags and
unified SEC_DEF() macro. This allows for more succinct and table-like
set of definitions, and makes it easier to extend the logic without
adding more verbosity (this is utilized in later patches in the series).

This approach is also making libbpf-internal program pre-load callback
not rely on bpf_sec_def definition, which demonstrates that future
pluggable ELF section handlers will be able to achieve similar level of
integration without libbpf having to expose extra types and APIs.

For starters, update SEC_DEF() definitions and make them more succinct.
Also convert BPF_PROG_SEC() and BPF_APROG_COMPAT() definitions to
a common SEC_DEF() use.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-8-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
3c5c62097e libbpf: Reduce reliance of attach_fns on sec_def internals
Move closer to not relying on bpf_sec_def internals that won't be part
of public API, when pluggable SEC() handlers will be allowed. Drop
pre-calculated prefix length, and in various helpers don't rely on this
prefix length availability. Also minimize reliance on knowing
bpf_sec_def's prefix for the few places where section prefix shortcuts are
supported (e.g., tp vs tracepoint, raw_tp vs raw_tracepoint).

Given checking some string for having a given string-constant prefix is
such a common operation and so annoying to be done with pure C code, add
a small macro helper, str_has_pfx(), and reuse it throughout libbpf.c
where prefix comparison is performed. With __builtin_constant_p() it's
possible to have a convenient helper that checks some string for having
a given prefix, where prefix is either string literal (or compile-time
known string due to compiler optimization) or just a runtime string
pointer, which is quite convenient and saves a lot of typing and string
literal duplication.
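
The helper is roughly the following (a sketch; the actual definition lives in
libbpf's internal headers):

  #define str_has_pfx(str, pfx) \
      (strncmp(str, pfx, __builtin_constant_p(pfx) ? sizeof(pfx) - 1 : strlen(pfx)) == 0)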

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-7-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
fba5f02401 libbpf: Refactor internal sec_def handling to enable pluggability
Refactor internals of libbpf to allow adding custom SEC() handling logic
easily from outside of libbpf. To that effect, each SEC()-handling
registration sets mandatory program type/expected attach type for
a given prefix and can provide three callbacks called at different
points of BPF program lifetime:

  - init callback for right after bpf_program is initialized and
  prog_type/expected_attach_type is set. This happens during
  bpf_object__open() step, close to the very end of constructing
  bpf_object, so all the libbpf APIs for querying and updating
  bpf_program properties should be available;

  - pre-load callback is called right before BPF_PROG_LOAD command is
called in the kernel. This callback has the ability to set both
  bpf_program properties, as well as program load attributes, overriding
  and augmenting the standard libbpf handling of them;

  - optional auto-attach callback, which makes a given SEC() handler
  support auto-attachment of a BPF program through bpf_program__attach()
  API and/or BPF skeletons <skel>__attach() method.

Each callback gets a `long cookie` parameter passed in, which is
specified during SEC() handling. This can be used by callbacks to lookup
whatever additional information is necessary.

This is not yet completely ready to be exposed to the outside world,
mainly due to non-public nature of struct bpf_prog_load_params. Instead
of making it part of public API, we'll wait until the planned low-level
libbpf API improvements for BPF_PROG_LOAD and other typical bpf()
syscall APIs, at which point we'll have a public, probably OPTS-based,
way to fully specify BPF program load parameters, which will be used as
an interface for custom pre-load callbacks.

But this change itself is already a good first step to unify the BPF
program handling logic even within libbpf itself. As one example, all
the extra per-program type handling (sleepable bit, attach_btf_id
resolution, unsetting optional expected attach type) is now more obvious
and is gathered in one place.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-6-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
bb833f8129 libbpf: Add "tc" SEC_DEF which is a better name for "classifier"
As argued in [0], add "tc" ELF section definition for SCHED_CLS BPF
program type. "classifier" is a misleading terminology and should be
migrated away from.

  [0] https://lore.kernel.org/bpf/270e27b1-e5be-5b1c-b343-51bd644d0747@iogearbox.net/
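
A minimal sketch; both spellings resolve to BPF_PROG_TYPE_SCHED_CLS, with
"tc" being the preferred one going forward:

  SEC("tc")
  int tc_prog(struct __sk_buff *skb)
  {
      return 0; /* TC_ACT_OK */
  }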

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-2-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Toke Høiland-Jørgensen
cb12e83136 libbpf: Ignore STT_SECTION symbols in 'maps' section
When parsing legacy map definitions, libbpf would error out when
encountering an STT_SECTION symbol. This becomes a problem because some
versions of binutils will produce SECTION symbols for every section when
processing an ELF file, so BPF files run through 'strip' will end up with
such symbols, making libbpf refuse to load them.

There's not really any reason why erroring out is strictly necessary, so
change libbpf to just ignore SECTION symbols when parsing the ELF.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210927205810.715656-1-toke@redhat.com
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
09b528e847 libbpf: Add legacy uprobe attaching support
Similarly to recently added legacy kprobe attach interface support
through tracefs, support attaching uprobes using the legacy interface if
host kernel doesn't support newer FD-based interface.

For uprobes, the event name consists of a "libbpf_" prefix, PID, sanitized
binary path, and offset within that binary. Structurally, the code is
aligned with the kprobe logic refactoring in the previous patch. struct
bpf_link_perf is re-used and all the same legacy_probe_name and
legacy_is_retprobe fields are used to ensure proper cleanup on
bpf_link__destroy().

Users should be aware, though, that on old kernels which don't support
FD-based interface for kprobe/uprobe attachment, if the application
crashes before bpf_link__destroy() is called, uprobe legacy
events will be left in tracefs. This is the same limitation as with
legacy kprobe interfaces.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-5-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
9ecc0dcc19 libbpf: Refactor and simplify legacy kprobe code
Refactor legacy kprobe handling code to follow the same logic as uprobe
legacy logic added in the next patch:
  - add append_to_file() helper that makes it simpler to work with
    tracefs file-based interface for creating and deleting probes;
  - move out probe/event name generation outside of the code that
    adds/removes it, which simplifies bookkeeping significantly;
  - change the probe name format to start with "libbpf_" prefix and
    include offset within kernel function;
  - switch 'unsigned long' to 'size_t' for specifying kprobe offsets,
    which is consistent with how uprobes define that, simplifies
    printf()-ing internally, and also avoids unnecessary complications on
    architectures where sizeof(long) != sizeof(void *).

This patch also implicitly fixes the problem with invalid open() error
handling present in poke_kprobe_events(), which (the function) this
patch removes.

Fixes: ca304b40c20d ("libbpf: Introduce legacy kprobe events support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-4-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
246b61780b libbpf: Fix memory leak in legacy kprobe attach logic
In some error scenarios legacy_probe string won't be free()'d. Fix this.
This was reported by Coverity static analysis.

Fixes: ca304b40c20d ("libbpf: Introduce legacy kprobe events support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-2-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Grant Seltzer
cdf14ff22b libbpf: Add doc comments in libbpf.h
This adds comments above functions in libbpf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org

These doc comments are for:
- bpf_object__find_map_by_name()
- bpf_map__fd()
- bpf_map__is_internal()
- libbpf_get_error()
- libbpf_num_possible_cpus()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210918031457.36204-1-grantseltzer@gmail.com
2021-09-29 12:05:46 -07:00
Dave Marchevsky
e1966114cc bpf: Clarify data_len param in bpf_snprintf and bpf_seq_printf comments
Since the data_len in these two functions is a byte len of the preceding
u64 *data array, it must always be a multiple of 8. If this isn't the
case both helpers error out, so let's make the requirement explicit so
users don't need to infer it.
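
A hedged sketch of the calling convention for bpf_snprintf(); data_len is the
byte size of the u64 array and therefore always a multiple of 8:

  __u64 args[] = { 42, 1337 };
  char buf[64];

  /* data_len == sizeof(args) == 16 bytes here */
  bpf_snprintf(buf, sizeof(buf), "a=%d b=%d", args, sizeof(args));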

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-10-davemarchevsky@fb.com
2021-09-29 12:05:46 -07:00
Dave Marchevsky
989d7189cd libbpf: Use static const fmt string in __bpf_printk
The __bpf_printk convenience macro was using a 'char' fmt string holder
as it predates support for globals in libbpf. Move to more efficient
'static const char', but provide a fallback to the old way via
BPF_NO_GLOBAL_DATA so users on old kernels can still use the macro.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-6-davemarchevsky@fb.com
2021-09-29 12:05:46 -07:00
Dave Marchevsky
18922504c3 libbpf: Modify bpf_printk to choose helper based on arg count
Instead of being a thin wrapper which calls into bpf_trace_printk,
libbpf's bpf_printk convenience macro now chooses between
bpf_trace_printk and bpf_trace_vprintk. If the arg count (excluding
format string) is >3, use bpf_trace_vprintk, otherwise use the older
helper.

The motivation behind this added complexity - instead of migrating
entirely to bpf_trace_vprintk - is to maintain good developer experience
for users compiling against new libbpf but running on older kernels.
Users who are passing <=3 args to bpf_printk will see no change in their
bytecode.

__bpf_vprintk functions similarly to BPF_SEQ_PRINTF and BPF_SNPRINTF
macros elsewhere in the file - it allows use of bpf_trace_vprintk
without manual conversion of varargs to u64 array. Previous
implementation of bpf_printk macro is moved to __bpf_printk for use by
the new implementation.

This does change behavior of bpf_printk calls with >3 args in the "new
libbpf, old kernels" scenario. Before this patch, attempting to use 4
args to bpf_printk results in a compile-time error. After this patch,
using bpf_printk with 4 args results in a trace_vprintk helper call
being emitted and a load-time failure on older kernels.
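
For example (a sketch), the same macro now covers both cases, at the cost of
needing bpf_trace_vprintk() in the running kernel for the longer form:

  /* 2 args: still compiles down to bpf_trace_printk() */
  bpf_printk("pid=%d comm=%s", pid, comm);

  /* 4 args: emitted as a bpf_trace_vprintk() call */
  bpf_printk("pid=%d cpu=%d key=%d val=%d", pid, cpu, key, val);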

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-5-davemarchevsky@fb.com
2021-09-29 12:05:46 -07:00
Dave Marchevsky
029273039d bpf: Add bpf_trace_vprintk helper
This helper is meant to be "bpf_trace_printk, but with proper vararg
support". Follow bpf_snprintf's example and take a u64 pseudo-vararg
array. Write to /sys/kernel/debug/tracing/trace_pipe using the same
mechanism as bpf_trace_printk. The functionality of this helper was
requested in the libbpf issue tracker [0].

[0] Closes: https://github.com/libbpf/libbpf/issues/315

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-4-davemarchevsky@fb.com
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
b177fac2e2 libbpf: Constify all high-level program attach APIs
Attach APIs shouldn't need to modify bpf_program/bpf_map structs, so
change all struct bpf_program and struct bpf_map pointers to const
pointers. This is completely backwards compatible with no functional
change.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-8-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
3e7c04669e libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7
bpf_object_open_opts.attach_prog_fd makes a pretty strong assumption
that bpf_object contains either only single freplace BPF program or all
of BPF programs in BPF object are freplaces intended to replace
different subprograms of the same target BPF program. This seems
confusing, overly presumptuous, and limiting.

We've had bpf_program__set_attach_target() API which allows more
fine-grained control over this, on a per-program level. As such, mark
open_opts.attach_prog_fd as deprecated starting from v0.7, so that we
have one more universal way of setting freplace targets. With the previous
change allowing a NULL attach_func_name argument, and especially combined
with BPF skeletons, bpf_program__set_attach_target() is arguably a more
convenient and explicit API as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-7-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
599b999f5d libbpf: Allow skipping attach_func_name in bpf_program__set_attach_target()
Allow to use bpf_program__set_attach_target to only set target attach
program FD, while letting libbpf to use target attach function name from
SEC() definition. This might be useful for some scenarios where
bpf_object contains multiple related freplace BPF programs intended to
replace different sub-programs in target BPF program. In such case all
programs will have the same attach_prog_fd, but different
attach_func_name. It's convenient to specify such target function names
declaratively in SEC() definitions, but attach_prog_fd is a dynamic
runtime setting.

To simplify such scenario, allow bpf_program__set_attach_target() to
delay BTF ID resolution till the BPF program load time by providing NULL
attach_func_name. In that case the behavior will be similar to using
bpf_object_open_opts.attach_prog_fd (which is marked deprecated since
v0.7), but has the benefit of allowing more control by user in what is
attached to what. Such setup allows having BPF programs attached to
different target attach_prog_fd with target functions still declaratively
recorded in BPF source code in SEC() definitions.

Selftests changes in the next patch should make this more obvious.
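
A hedged sketch of the resulting flow (program and target names are
hypothetical):

  /* BPF side: target function name recorded declaratively */
  SEC("freplace/xdp_dummy_prog")
  int new_impl(struct xdp_md *ctx)
  {
      return XDP_PASS;
  }

  /* user space: only the target prog FD is supplied; NULL attach_func_name
   * delays BTF ID resolution until load time */
  err = bpf_program__set_attach_target(prog, target_prog_fd, NULL);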

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-5-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
e856a7d560 libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs
It's irrelevant and hasn't been doing anything for a long while now.
Deprecate it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-4-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
0a76ce1395 libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id()
Don't perform another search for sec_def inside
libbpf_find_attach_btf_id(), as each recognized bpf_program already has
prog->sec_def set.

Also remove unnecessary NULL check for prog->sec_name, as it can never
be NULL.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-2-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Matteo Croce
1eba3a6470 bpf: Update bpf_get_smp_processor_id() documentation
BPF programs run with migration disabled regardless of preemption, as
they are protected by migrate_disable(). Update the uapi documentation
accordingly.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210914235400.59427-1-mcroce@linux.microsoft.com
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
7d365e49f3 ci: use -a/-d with exact string match for black/white-listing
Doing substring matches allows accidental new tests to be enabled,
when they are not supposed to be. E.g., whitelisting "xdp" allows new
"xdpwall" test on 5.5.0, which wasn't supposed to happen.

Cc: Yucong Sun <fallentree@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-09-29 11:15:38 -07:00
Andrii Nakryiko
980777cc16 vmtest: temporarily blacklist spammy selftests
There is a problem in bpf-next tree which causes get_stack_raw_tp and
few other selftests to produce tons of kernel warnings, timing out and
failing CI test runs. Blacklist until bpf tree, which has a fix, is
merged into bpf-next.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
24dbcb3a30 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   47bb27a20d6ea22cd092c1fc2bb4fcecac374838
Checkpoint bpf-next commit: 69cd823956ba8ce266a901170b1060db8073bddd
Baseline bpf commit:        3776f3517ed94d40ff0e3851d7ce2ce17b63099f
Checkpoint bpf commit:      bc23f724481759d0fac61dfb5ce979af2190bbe0

Andrii Nakryiko (4):
  libbpf: Make libbpf_version.h non-auto-generated
  libbpf: Ensure BPF prog types are set before relocations
  libbpf: Simplify BPF program auto-attach code
  libbpf: Minimize explicit iterator of section definition array

Grant Seltzer (1):
  libbpf: Add sphinx code documentation comments

Quentin Monnet (1):
  libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API
    deprecations

Rafael David Tinoco (1):
  libbpf: Introduce legacy kprobe events support

Rocco Yue (1):
  ipv6: add IFLA_INET6_RA_MTU to expose mtu value

Song Liu (1):
  bpf: Introduce helper bpf_get_branch_snapshot

Vadim Fedorenko (1):
  bpf: Add hardware timestamp field to __sk_buff

Yonghong Song (4):
  btf: Change BTF_KIND_* macros to enums
  bpf: Support for new btf kind BTF_KIND_TAG
  libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag
  libbpf: Add support for BTF_KIND_TAG

 include/uapi/linux/bpf.h     |  24 +++
 include/uapi/linux/btf.h     |  55 ++++--
 include/uapi/linux/if_link.h |   1 +
 src/btf.c                    |  84 +++++++-
 src/btf.h                    |  87 +++++++++
 src/btf_dump.c               |   3 +
 src/libbpf.c                 | 359 ++++++++++++++++++++++++-----------
 src/libbpf.map               |   5 +
 src/libbpf_common.h          |  19 ++
 src/libbpf_internal.h        |   2 +
 src/libbpf_version.h         |   9 +
 11 files changed, 505 insertions(+), 143 deletions(-)
 create mode 100644 src/libbpf_version.h

--
2.30.2
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
42da89eb16 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-09-17 14:03:39 -07:00
Grant Seltzer
627cbb395b libbpf: Add sphinx code documentation comments
This adds comments above five functions in btf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210915021951.117186-1-grantseltzer@gmail.com
2021-09-17 14:03:39 -07:00
Yonghong Song
7e7f59d658 libbpf: Add support for BTF_KIND_TAG
Add BTF_KIND_TAG support for parsing and dedup.
Also added sanitization for BTF_KIND_TAG. If BTF_KIND_TAG is not
supported in the kernel, sanitize it to INTs.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223025.246687-1-yhs@fb.com
2021-09-17 14:03:39 -07:00
Yonghong Song
966ba8918d libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag
This patch renames functions btf_{hash,equal}_int() to
btf_{hash,equal}_int_tag() so they can be reused for
BTF_KIND_TAG support. There is no functionality change for
this patch.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223020.245829-1-yhs@fb.com
2021-09-17 14:03:39 -07:00
Yonghong Song
fad7357469 bpf: Support for new btf kind BTF_KIND_TAG
LLVM14 added support for a new C attribute ([1])
  __attribute__((btf_tag("arbitrary_str")))
This attribute will be emitted to dwarf ([2]) and pahole
will convert it to BTF. Or for bpf target, this
attribute will be emitted to BTF directly ([3], [4]).
The attribute is intended to provide additional
information for
  - struct/union type or struct/union member
  - static/global variables
  - static/global function or function parameter.

For linux kernel, the btf_tag can be applied
in various places to specify user pointer,
function pre- or post- condition, function
allow/deny in certain context, etc. Such information
will be encoded in vmlinux BTF and can be used
by verifier.

The btf_tag can also be applied to bpf programs
to help global verifiable functions, e.g.,
specifying preconditions, etc.

This patch added basic parsing and checking support
in kernel for new BTF_KIND_TAG kind.

 [1] https://reviews.llvm.org/D106614
 [2] https://reviews.llvm.org/D106621
 [3] https://reviews.llvm.org/D106622
 [4] https://reviews.llvm.org/D109560
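
For illustration (a hedged sketch with hypothetical tag strings and
declarations), the attribute can annotate struct members, variables, and
function parameters:

  #define __tag(str) __attribute__((btf_tag(str)))

  struct pkt_ctx {
      void *user_buf __tag("user");           /* struct member */
  };

  int g_counter __tag("rcu_protected");       /* global variable */

  int check(struct task_struct *task __tag("ctx"));  /* function parameter */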

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223015.245546-1-yhs@fb.com
2021-09-17 14:03:39 -07:00
Yonghong Song
a6188fc5b4 btf: Change BTF_KIND_* macros to enums
Change BTF_KIND_* macros to enums so they are encoded in dwarf and
appear in vmlinux.h. This will make it easier for bpf programs
to use these constants without macro definitions.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223009.245307-1-yhs@fb.com
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
d328ba7768 libbpf: Minimize explicit iterator of section definition array
Remove almost all the code that explicitly iterated BPF program section
definitions in favor of using find_sec_def(). The only remaining user of
section_defs is libbpf_get_type_names that has to iterate all of them to
construct its result.

Having one internal API entry point for section definitions will
simplify further refactorings around libbpf's program section
definitions parsing.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-5-andrii@kernel.org
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
27d14b6e3b libbpf: Simplify BPF program auto-attach code
Remove the need to explicitly pass bpf_sec_def for auto-attachable BPF
programs, as it is already recorded at bpf_object__open() time for all
recognized type of BPF programs. This further reduces number of explicit
calls to find_sec_def(), simplifying further refactorings.

No functional changes are done by this patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-4-andrii@kernel.org
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
f3c744997f libbpf: Ensure BPF prog types are set before relocations
Refactor bpf_object__open() sequencing to perform BPF program type
detection based on SEC() definitions before we get to relocations
collection. This makes more information about the BPF program available by
the time we get to, say, struct_ops relocation gathering. This,
subsequently, simplifies struct_ops logic and removes the need to
perform extra find_sec_def() resolution.

With this patch libbpf will require all struct_ops BPF programs to be
marked with SEC("struct_ops") or SEC("struct_ops/xxx") annotations.
Real-world applications are already doing that through something like
selftests's BPF_STRUCT_OPS() macro. This change streamlines libbpf's
internal handling of SEC() definitions and is in the spirit of
upcoming libbpf-1.0 section strictness changes ([0]).

  [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-3-andrii@kernel.org
2021-09-17 14:03:39 -07:00
Rafael David Tinoco
749b3942a0 libbpf: Introduce legacy kprobe events support
Allow kprobe tracepoint events creation through legacy interface, as the
kprobe dynamic PMUs support, used by default, was only created in v4.17.

Store legacy kprobe name in struct bpf_perf_link, instead of creating
a new "subclass" off of bpf_perf_link. This is ok as it's just two new
fields, which are also going to be reused for legacy uprobe support in
follow up patches.
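
From the caller's perspective nothing changes; a hedged sketch (the probed
kernel function is just an example):

  struct bpf_link *link;

  /* on pre-4.17 kernels libbpf now transparently falls back to the
   * legacy tracefs kprobe events interface */
  link = bpf_program__attach_kprobe(prog, false /* retprobe */, "do_nanosleep");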

Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210912064844.3181742-1-rafaeldtinoco@gmail.com
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
8ade99a6f8 libbpf: Make libbpf_version.h non-auto-generated
Turn previously auto-generated libbpf_version.h header into a normal
header file. This prevents various tricky Makefile integration issues,
simplifies the overall build process, and also allows further extending
it with more versioning-related APIs in the future.

To prevent accidental out-of-sync versions as defined by libbpf.map and
libbpf_version.h, Makefile checks their consistency at build time.

Simultaneously with this change bump libbpf.map to v0.6.

Also undo adding libbpf's output directory into include path for
kernel/bpf/preload, bpftool, and resolve_btfids, which is not necessary
because libbpf_version.h is just a normal header like any other.

Fixes: 0b46b7550560 ("libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210913222309.3220849-1-andrii@kernel.org
2021-09-17 14:03:39 -07:00
Song Liu
0c0f4a57da bpf: Introduce helper bpf_get_branch_snapshot
Introduce bpf_get_branch_snapshot(), which allows tracing programs to get
branch trace from hardware (e.g. Intel LBR). To use the feature, the
user needs to create a perf_event with proper branch_record filtering
on each cpu, and then call bpf_get_branch_snapshot in the bpf function.
On Intel CPUs, the VLBR event (raw event 0x1b00) can be used for this.
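
A hedged sketch of the call from within a BPF program:

  struct perf_branch_entry entries[16] = {};
  long sz;

  /* returns the number of bytes written into entries, or a negative error */
  sz = bpf_get_branch_snapshot(entries, sizeof(entries), 0 /* flags */);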

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210910183352.3151445-3-songliubraving@fb.com
2021-09-17 14:03:39 -07:00
Vadim Fedorenko
03f31a6aed bpf: Add hardware timestamp field to __sk_buff
BPF programs may want to know hardware timestamps if NIC supports
such timestamping.

Expose this data as hwtstamp field of __sk_buff the same way as
gso_segs/gso_size. This field could be accessed from the same
programs as tstamp field, but it's read-only field. Explicit test
to deny access to padding data is added to bpf_skb_is_valid_access.

Also update BPF_PROG_TEST_RUN tests of the feature.
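
A hedged sketch of reading the new field from a TC program:

  SEC("tc")
  int read_hwtstamp(struct __sk_buff *skb)
  {
      /* read-only; expected to be 0 if the NIC did not timestamp the packet */
      if (skb->hwtstamp)
          bpf_printk("hw ts: %llu", skb->hwtstamp);
      return 0;
  }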

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210909220409.8804-2-vfedorenko@novek.ru
2021-09-17 14:03:39 -07:00
Quentin Monnet
f8983f7fb0 libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations
Introduce a macro LIBBPF_DEPRECATED_SINCE(major, minor, message) to prepare
the deprecation of two API functions. This macro marks functions as deprecated
when libbpf's version reaches the values passed as arguments.

As part of this change libbpf_version.h header is added with recorded major
(LIBBPF_MAJOR_VERSION) and minor (LIBBPF_MINOR_VERSION) libbpf version macros.
They are now part of libbpf public API and can be relied upon by user code.
libbpf_version.h is installed system-wide along other libbpf public headers.

Due to this new build-time auto-generated header, in-kernel applications
relying on libbpf (resolve_btfids, bpftool, bpf_preload) are updated to
include libbpf's output directory as part of a list of include search paths.
Better fix would be to use libbpf's make_install target to install public API
headers, but that clean up is left out as a future improvement. The build
changes were tested by building kernel (with KBUILD_OUTPUT and O= specified
explicitly), bpftool, libbpf, selftests/bpf, and resolve_btfids builds. No
problems were detected.

Note that because of the constraints of the C preprocessor we have to write
a few lines of macro magic for each version used to prepare deprecation (0.6
for now).

Also, use LIBBPF_DEPRECATED_SINCE() to schedule deprecation of
btf__get_from_id() and btf__load(), which are replaced by
btf__load_from_kernel_by_id() and btf__load_into_kernel(), respectively,
starting from future libbpf v0.6. This is part of libbpf 1.0 effort ([0]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/278
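
A hedged sketch of how such a scheduled deprecation reads in a public header:

  LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_from_kernel_by_id instead")
  LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);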

Co-developed-by: Quentin Monnet <quentin@isovalent.com>
Co-developed-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210908213226.1871016-1-andrii@kernel.org
2021-09-17 14:03:39 -07:00
Rocco Yue
514bb47ac5 ipv6: add IFLA_INET6_RA_MTU to expose mtu value
The kernel provides a "/proc/sys/net/ipv6/conf/<iface>/mtu"
file, which can temporarily record the mtu value of the last
received RA message when the RA mtu value is lower than the
interface mtu, but this proc has following limitations:

(1) when the interface mtu (/sys/class/net/<iface>/mtu) is
updated, mtu6 (/proc/sys/net/ipv6/conf/<iface>/mtu) will
be updated to the value of the interface mtu;
(2) mtu6 (/proc/sys/net/ipv6/conf/<iface>/mtu) only affects
ipv6 connections, not ipv4.

Therefore, when the mtu option is carried in the RA message,
there will be a problem that the user sometimes cannot obtain
RA mtu value correctly by reading mtu6.

After this patch set, if a RA message carries the mtu option,
you can send a netlink msg which nlmsg_type is RTM_GETLINK,
and then by parsing the attribute of IFLA_INET6_RA_MTU to
get the mtu value carried in the RA message received on the
inet6 device. In addition, you can also get a link notification
when ra_mtu is updated, so userspace doesn't have to poll.

In this way, if the MTU values that the device receives from
the network in the PCO IPv4 and the RA IPv6 procedures are
different, the user can obtain the correct ipv6 ra_mtu value
and compare the value of ra_mtu and ipv4 mtu, then the device
can use the lower MTU value for both IPv4 and IPv6.

Signed-off-by: Rocco Yue <rocco.yue@mediatek.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210827150412.9267-1-rocco.yue@mediatek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
05d95ef6fa makefile: add libbpf_version.h to the list of installed headers
libbpf_version.h is a new header that is installed along other public
libbpf API headers.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
eda1ebe520 ci: use test_progs -l option to generate the list of whitelisted tests
Use this list of enabled tests as a whitelist, so that we don't have to
keep updating BLACKLIST-5.5.0 anymore. I'll keep BLACKLIST-5.5.0 for
now, because it serves as a nice historic log of which tests depend on
which kernels.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-09-17 14:03:39 -07:00
grantseltzer
7c26fe30f3 Add enum to be displayed in documentation
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
2021-09-16 11:39:22 -07:00
Andrii Nakryiko
5579664205 libbpf: Fix build with latest gcc/binutils with LTO
After updating to binutils 2.35, the build began to fail with an
assembler error. A bug was opened on the Red Hat Bugzilla a few days
later for the same issue.

Work around the problem by using the new `symver` attribute (introduced
in GCC 10) as needed instead of assembler directives.

This addresses Red Hat ([0]) and OpenSUSE ([1]) bug reports, as well as libbpf
issue ([2]).

  [0]: https://bugzilla.redhat.com/show_bug.cgi?id=1863059
  [1]: https://bugzilla.opensuse.org/show_bug.cgi?id=1188749
  [2]: Closes: https://github.com/libbpf/libbpf/issues/338

Co-developed-by: Patrick McCarty <patrick.mccarty@intel.com>
Co-developed-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Patrick McCarty <patrick.mccarty@intel.com>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210907221023.2660953-1-andrii@kernel.org
2021-09-08 16:01:51 -07:00
Matt Smith
860b201cd0 libbpf: Change bpf_object_skeleton data field to const pointer
This change was necessary to enforce the implied contract
that bpf_object_skeleton->data should not be mutated.  The data
will be cast to `void *` during assignment to handle the case
where a user is compiling with older libbpf headers to avoid
a compiler warning of `const void *` data being cast to `void *`

Signed-off-by: Matt Smith <alastorze@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210901194439.3853238-2-alastorze@fb.com
2021-09-08 16:01:51 -07:00
Toke Høiland-Jørgensen
50e13993a1 libbpf: Don't crash on object files with no symbol tables
If libbpf encounters an ELF file that has been stripped of its symbol
table, it will crash in bpf_object__add_programs() when trying to
dereference the obj->efile.symbols pointer.

Fix this by erroring out of bpf_object__elf_collect() if it is not
able to find the symbol table.

v2:
  - Move check into bpf_object__elf_collect() and add nice error message

Fixes: 6245947c1b3c ("libbpf: Allow gaps in BPF program sections to support overriden weak functions")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210901114812.204720-1-toke@redhat.com
2021-09-08 16:01:51 -07:00
Andrii Nakryiko
72fd44da53 ci: blacklist few new selftests for 5.5
Blacklist netns_cookie, sockopt_qos_to_cc, task_pt_regs for 5.5.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-31 14:21:44 -07:00
Andrii Nakryiko
d6e9681b0d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   d20b41115ad53293201cc07ee429a38740cb056b
Checkpoint bpf-next commit: 47bb27a20d6ea22cd092c1fc2bb4fcecac374838
Baseline bpf commit:        3776f3517ed94d40ff0e3851d7ce2ce17b63099f
Checkpoint bpf commit:      3776f3517ed94d40ff0e3851d7ce2ce17b63099f

Daniel Xu (1):
  bpf: Add bpf_task_pt_regs() helper

Dave Marchevsky (1):
  bpf: Migrate cgroup_bpf to internal cgroup_bpf_attach_type enum

 include/uapi/linux/bpf.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

--
2.30.2
2021-08-31 14:21:44 -07:00
Andrii Nakryiko
517762deca sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-08-31 14:21:44 -07:00
Daniel Xu
1670e6100b bpf: Add bpf_task_pt_regs() helper
The motivation behind this helper is to access userspace pt_regs in a
kprobe handler.

uprobe's ctx is the userspace pt_regs. kprobe's ctx is the kernelspace
pt_regs. bpf_task_pt_regs() allows accessing userspace pt_regs in a
kprobe handler. The final case (kernelspace pt_regs in uprobe) is
pretty rare (usermode helper) so I think that can be solved later if
necessary.

More concretely, this helper is useful in doing BPF-based DWARF stack
unwinding. Currently the kernel can only do framepointer based stack
unwinds for userspace code. This is because the DWARF state machines are
too fragile to be computed in kernelspace [0]. The idea behind
DWARF-based stack unwinds w/ BPF is to copy a chunk of the userspace
stack (while in prog context) and send it up to userspace for unwinding
(probably with libunwind) [1]. This would effectively enable profiling
applications with -fomit-frame-pointer using kprobes and uprobes.

[0]: https://lkml.org/lkml/2012/2/10/356
[1]: https://github.com/danobi/bpf-dwarf-walk

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/e2718ced2d51ef4268590ab8562962438ab82815.1629772842.git.dxu@dxuuu.xyz
2021-08-31 14:21:44 -07:00
Dave Marchevsky
ed529685db bpf: Migrate cgroup_bpf to internal cgroup_bpf_attach_type enum
Add an enum (cgroup_bpf_attach_type) containing only valid cgroup_bpf
attach types and a function to map bpf_attach_type values to the new
enum. Inspired by netns_bpf_attach_type.

Then, migrate cgroup_bpf to use cgroup_bpf_attach_type wherever
possible.  Functionality is unchanged as attach_type_to_prog_type
switches in bpf/syscall.c were preventing non-cgroup programs from
making use of the invalid cgroup_bpf array slots.

As a result struct cgroup_bpf uses 504 fewer bytes relative to when its
arrays were sized using MAX_BPF_ATTACH_TYPE.

bpf_cgroup_storage is notably not migrated as struct
bpf_cgroup_storage_key is part of uapi and contains a bpf_attach_type
member which is not meant to be opaque. Similarly, bpf_cgroup_link
continues to report its bpf_attach_type member to userspace via fdinfo
and bpf_link_info.

To ease disambiguation, bpf_attach_type variables are renamed from
'type' to 'atype' when changed to cgroup_bpf_attach_type.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210819092420.1984861-2-davemarchevsky@fb.com
2021-08-31 14:21:44 -07:00
Andrii Nakryiko
c92a5d043e ci: don't hang on kernel crashes in qemu
Backport of kernel-patches PR ([0]), done by @thefallentree.

  [0] https://github.com/kernel-patches/vmtest/pull/29

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-30 12:52:51 -07:00
Thiago Marques
aea40f7179 Merge remote-tracking branch 'upstream/master' 2021-08-20 01:33:45 +00:00
grantseltzer
8bdc267e7b sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   3c3bd542ffbb2ac09631313ede46ae66660ae550
Checkpoint bpf-next commit: d20b41115ad53293201cc07ee429a38740cb056b
Baseline bpf commit:        3776f3517ed94d40ff0e3851d7ce2ce17b63099f
Checkpoint bpf commit:      3776f3517ed94d40ff0e3851d7ce2ce17b63099f

Grant Seltzer (1):
  libbpf: Rename libbpf documentation index file

 docs/{libbpf.rst => index.rst}    | 8 ++++++++
 docs/libbpf_naming_convention.rst | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)
 rename docs/{libbpf.rst => index.rst} (75%)

--
2.31.1
2021-08-18 15:22:43 -07:00
Grant Seltzer
d0c398be4f libbpf: Rename libbpf documentation index file
This patch renames a documentation libbpf.rst to index.rst. In order
for readthedocs.org to pick this file up and properly build the
documentation site.

It also changes the title type of the ABI subsection in the
naming convention doc. This is so that readthedocs.org doesn't treat this
section as a separate document.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210818151313.49992-1-grantseltzer@gmail.com
2021-08-18 15:22:43 -07:00
grantseltzer
7d9cc837ef Fix path to Doxygen source code input
Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
2021-08-18 12:28:09 -07:00
Andrii Nakryiko
a3c0cc19d4 ci: blacklist new selftests on 5.5
Blacklist bpf_cookie, perf_link, and xdp_bonding selftests on 5.5 kernel.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
7c6d34a2c9 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   372642ea83ff1c71a5d567a704c912359eb59776
Checkpoint bpf-next commit: 3c3bd542ffbb2ac09631313ede46ae66660ae550
Baseline bpf commit:        7c4a22339e7ce7b6ed473a8e682da622c3a774ee
Checkpoint bpf commit:      3776f3517ed94d40ff0e3851d7ce2ce17b63099f

Andrii Nakryiko (8):
  bpf: Implement minimal BPF perf link
  bpf: Allow to specify user-provided bpf_cookie for BPF perf links
  bpf: Add bpf_get_attach_cookie() BPF helper to access bpf_cookie value
  libbpf: Remove unused bpf_link's destroy operation, but add dealloc
  libbpf: Use BPF perf link when supported by kernel
  libbpf: Add bpf_cookie support to bpf_link_create() API
  libbpf: Add bpf_cookie to perf_event, kprobe, uprobe, and tp attach
    APIs
  libbpf: Add uprobe ref counter offset support for USDT semaphores

Hangbin Liu (1):
  bonding: add new option lacp_active

Hao Luo (1):
  libbpf: Support weak typed ksyms.

Randy Dunlap (1):
  libbpf, doc: Eliminate warnings in libbpf_naming_convention

Robin Gögge (1):
  libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT

grantseltzer (1):
  bpf: Reconfigure libbpf docs to remove unversioned API

 docs/libbpf_api.rst               |  27 ----
 docs/libbpf_naming_convention.rst |   4 +-
 include/uapi/linux/bpf.h          |  25 ++++
 include/uapi/linux/if_link.h      |   1 +
 src/bpf.c                         |  32 ++++-
 src/bpf.h                         |   8 +-
 src/libbpf.c                      | 229 +++++++++++++++++++++++-------
 src/libbpf.h                      |  75 ++++++++--
 src/libbpf.map                    |   3 +
 src/libbpf_internal.h             |  32 +++--
 src/libbpf_probes.c               |   4 +-
 11 files changed, 332 insertions(+), 108 deletions(-)
 delete mode 100644 docs/libbpf_api.rst

--
2.30.2
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
a69c52bb11 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
6d67d53143 libbpf: Add uprobe ref counter offset support for USDT semaphores
When attaching to uprobes through the perf subsystem, it's possible to specify
the offset of a so-called USDT semaphore, which is just a reference-counted u16
used by the kernel to keep track of how many tracers are attached to a given
location. Support for this feature was added in [0], so just wire this through
uprobe_opts. This is important for implementing USDT attachment and
tracing through libbpf's bpf_program__attach_uprobe_opts() API.

  [0] a6ca88b241d5 ("trace_uprobe: support reference counter in fd-based uprobe")
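
A minimal usage sketch (not from this patch set); the binary path, probe
offset, and semaphore offset are placeholders that would normally be
obtained elsewhere (e.g., from the target's ELF notes):

#include <bpf/libbpf.h>

/* Hedged sketch: attach a uprobe with a USDT semaphore ref counter offset.
 * All paths and offsets here are hypothetical placeholders.
 */
static struct bpf_link *attach_with_semaphore(struct bpf_program *prog)
{
	DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, uopts,
		.ref_ctr_offset = 0x1234 /* hypothetical semaphore offset */);

	return bpf_program__attach_uprobe_opts(prog, -1 /* any pid */,
					       "/usr/bin/app" /* hypothetical */,
					       0x5678 /* probe offset */, &uopts);
}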

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210815070609.987780-16-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
91259bc676 libbpf: Add bpf_cookie to perf_event, kprobe, uprobe, and tp attach APIs
Wire through bpf_cookie for all attach APIs that use perf_event_open under the
hood:
  - for kprobes, extend existing bpf_kprobe_opts with bpf_cookie field;
  - for perf_event, uprobe, and tracepoint APIs, add their _opts variants and
    pass bpf_cookie through opts.

For kernels that don't support BPF_LINK_CREATE for perf_events (and thus
don't support bpf_cookie either), return an error and log a warning for the user.
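
A hedged sketch of the kprobe side of this; the probed function and cookie
value below are placeholders:

#include <bpf/libbpf.h>

/* Hedged sketch: pass a user-chosen cookie via the extended kprobe opts. */
static struct bpf_link *attach_with_cookie(struct bpf_program *prog)
{
	DECLARE_LIBBPF_OPTS(bpf_kprobe_opts, kopts,
		.bpf_cookie = 42 /* hypothetical value looked up later in the prog */);

	return bpf_program__attach_kprobe_opts(prog, "tcp_sendmsg", &kopts);
}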

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210815070609.987780-12-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
a3f8c5a306 libbpf: Add bpf_cookie support to bpf_link_create() API
Add ability to specify bpf_cookie value when creating BPF perf link with
bpf_link_create() low-level API.

Given BPF_LINK_CREATE command is growing and keeps getting new fields that are
specific to the type of BPF_LINK, extend libbpf side of bpf_link_create() API
and corresponding OPTS struct to accommodate such changes. Add extra checks to
prevent using incompatible/unexpected combinations of fields.
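
A hedged sketch of the low-level usage; the cookie value is arbitrary and
`prog_fd`/`perf_fd` are assumed to exist already:

#include <bpf/bpf.h>

/* Hedged sketch: create a BPF perf link carrying a bpf_cookie. */
static int create_perf_link(int prog_fd, int perf_fd)
{
	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
		.perf_event.bpf_cookie = 0xdeadbeef /* arbitrary */);

	return bpf_link_create(prog_fd, perf_fd, BPF_PERF_EVENT, &opts);
}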

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210815070609.987780-11-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
d23679b415 libbpf: Use BPF perf link when supported by kernel
Detect kernel support for BPF perf link and prefer it when attaching to
perf_event, tracepoint, kprobe/uprobe. Underlying perf_event FD will be kept
open until BPF link is destroyed, at which point both perf_event FD and BPF
link FD will be closed.

This preserves current behavior in which perf_event FD is open for the
duration of bpf_link's lifetime and user is able to "disconnect" bpf_link from
underlying FD (with bpf_link__disconnect()), so that bpf_link__destroy()
doesn't close underlying perf_event FD. When BPF perf link is used, disconnect
will keep both perf_event and bpf_link FDs open, so it will be up to
(advanced) user to close them. This approach is demonstrated in bpf_cookie.c
selftests, added in this patch set.
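
A hedged sketch of that advanced usage; `pefd` is assumed to be a perf_event
FD obtained via perf_event_open():

#include <bpf/libbpf.h>

/* Hedged sketch: with BPF perf links, disconnect keeps both the perf_event
 * FD and the BPF link FD open for the caller to manage.
 */
static void detach_keep_fds(struct bpf_program *prog, int pefd)
{
	struct bpf_link *link = bpf_program__attach_perf_event(prog, pefd);

	if (libbpf_get_error(link))
		return;
	bpf_link__disconnect(link); /* don't auto-close FDs on destroy */
	bpf_link__destroy(link);    /* frees the link object only */
}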

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210815070609.987780-10-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
9923f25600 libbpf: Remove unused bpf_link's destroy operation, but add dealloc
bpf_link->destroy() isn't used by any code, so remove it. Instead, add the ability
to override the deallocation procedure, with the default doing a plain free(link). This
is necessary for cases when we want to "subclass" struct bpf_link to keep
extra information, as is the case in the next patch adding struct
bpf_link_perf.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210815070609.987780-9-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
40160ed4d4 bpf: Add bpf_get_attach_cookie() BPF helper to access bpf_cookie value
Add new BPF helper, bpf_get_attach_cookie(), which can be used by BPF programs
to get access to a user-provided bpf_cookie value, specified during BPF
program attachment (BPF link creation) time.

Naming is hard, though. With the concept being named "BPF cookie", I've
considered calling the helper:
  - bpf_get_cookie() -- seems too unspecific and easily mistaken with socket
    cookie;
  - bpf_get_bpf_cookie() -- too much tautology;
  - bpf_get_link_cookie() -- would be ok, but while we create a BPF link to
    attach BPF program to BPF hook, it's still an "attachment" and the
    bpf_cookie is associated with BPF program attachment to a hook, not a BPF
    link itself. Technically, we could support bpf_cookie with old-style
    cgroup programs. So I ultimately rejected it in favor of
    bpf_get_attach_cookie().

Currently all perf_event-backed BPF program types support
bpf_get_attach_cookie() helper. Follow-up patches will add support for
fentry/fexit programs as well.
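
A hedged BPF-side sketch; the attach point and the printed message are
placeholders:

/* Hedged sketch: read the attachment-time cookie inside the program. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "GPL";

SEC("kprobe/tcp_sendmsg")
int handle(void *ctx)
{
	__u64 cookie = bpf_get_attach_cookie(ctx);

	bpf_printk("invoked with cookie %llu", cookie);
	return 0;
}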

While at it, mark bpf_tracing_func_proto() as static to make it obvious that
it's only used from within the kernel/trace/bpf_trace.c.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210815070609.987780-7-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
7b22fc4cdb bpf: Allow to specify user-provided bpf_cookie for BPF perf links
Add ability for users to specify custom u64 value (bpf_cookie) when creating
BPF link for perf_event-backed BPF programs (kprobe/uprobe, perf_event,
tracepoints).

This is useful for cases when the same BPF program is used for attaching and
processing invocation of different tracepoints/kprobes/uprobes in a generic
fashion, but such that each invocation is distinguished from each other (e.g.,
BPF program can look up additional information associated with a specific
kernel function without having to rely on function IP lookups). This enables
new use cases to be implemented simply and efficiently that previously were
possible only through code generation (and thus multiple instances of almost
identical BPF program) or compilation at runtime (BCC-style) on target hosts
(even more expensive resource-wise). For uprobes it is not even possible in
some cases to know the function IP beforehand (e.g., when attaching to a shared
library without PID filtering, in which case base load address is not known
for a library).

This is done by storing u64 bpf_cookie in struct bpf_prog_array_item,
corresponding to each attached and run BPF program. Given cgroup BPF programs
already use two 8-byte pointers for their needs and cgroup BPF programs don't
have (yet?) support for bpf_cookie, reuse that space through union of
cgroup_storage and new bpf_cookie field.

Make it available to kprobe/tracepoint BPF programs through bpf_trace_run_ctx.
This is set by BPF_PROG_RUN_ARRAY, used by kprobe/uprobe/tracepoint BPF
program execution code, which luckily is now also split from
BPF_PROG_RUN_ARRAY_CG. This run context will be utilized by a new BPF helper
giving access to this user-provided cookie value from inside a BPF program.
Generic perf_event BPF programs will access this value from perf_event itself
through passed in BPF program context.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/bpf/20210815070609.987780-6-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
152882e17a bpf: Implement minimal BPF perf link
Introduce a new type of BPF link - BPF perf link. This brings perf_event-based
BPF program attachments (perf_event, tracepoints, kprobes, and uprobes) into
the common BPF link infrastructure, allowing one to list all active perf_event
based attachments, auto-detach the BPF program from the perf_event when the link's
FD is closed, and get generic BPF link fdinfo/get_info functionality.

BPF_LINK_CREATE command expects perf_event's FD as target_fd. No extra flags
are currently supported.

Force-detaching and atomic BPF program updates are not yet implemented, but
with perf_event-based BPF links we now have a common framework for this without
the need to extend the ioctl()-based perf_event interface.

One interesting consideration is a new value for bpf_attach_type, which
BPF_LINK_CREATE command expects. Generally, it's either 1-to-1 mapping from
bpf_attach_type to bpf_prog_type, or many-to-1 mapping from a subset of
bpf_attach_types to one bpf_prog_type (e.g., see BPF_PROG_TYPE_SK_SKB or
BPF_PROG_TYPE_CGROUP_SOCK). In this case, though, we have three different
program types (KPROBE, TRACEPOINT, PERF_EVENT) using the same perf_event-based
mechanism, so it's many bpf_prog_types to one bpf_attach_type. I chose to
define a single BPF_PERF_EVENT attach type for all of them and adjust
link_create()'s logic for checking correspondence between attach type and
program type.

The alternative would be to define three new attach types (e.g., BPF_KPROBE,
BPF_TRACEPOINT, and BPF_PERF_EVENT), but that seemed like unnecessary overkill,
and BPF_KPROBE would cause naming conflicts with the BPF_KPROBE() macro defined
by libbpf. So I chose not to do this, avoiding unnecessary proliferation of
bpf_attach_type enum values and the naming conflicts.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/bpf/20210815070609.987780-5-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Hao Luo
0e7520949e libbpf: Support weak typed ksyms.
Currently, weak typeless ksyms default to zero when they don't
exist in the kernel. However, weak typed ksyms are rejected by libbpf
if they cannot be resolved. This means that if a bpf object contains
the declaration of a nonexistent weak typed ksym, it will be rejected
even if there is no program that references the symbol.

Nonexistent weak typed ksyms can also default to zero just like
typeless ones. This allows programs that access weak typed ksyms to be
accepted by the verifier, if the accesses are guarded. For example,

extern const int bpf_link_fops3 __ksym __weak;

/* then in BPF program */

if (&bpf_link_fops3) {
   /* use bpf_link_fops3 */
}

If the actual use of a nonexistent typed ksym is not guarded properly,
the verifier would see that the register is not PTR_TO_BTF_ID and wouldn't
allow it to be used for direct memory reads or passed to BPF helpers.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210812003819.2439037-1-haoluo@google.com
2021-08-17 08:18:57 -07:00
Randy Dunlap
c3f7daaab5 libbpf, doc: Eliminate warnings in libbpf_naming_convention
Use "code-block: none" instead of "c" for non-C-language code blocks.
Removes these warnings:

  lnx-514-rc4/Documentation/bpf/libbpf/libbpf_naming_convention.rst:111: WARNING: Could not lex literal_block as "c". Highlighting skipped.
  lnx-514-rc4/Documentation/bpf/libbpf/libbpf_naming_convention.rst:124: WARNING: Could not lex literal_block as "c". Highlighting skipped.

Fixes: f42cfb469f9b ("bpf: Add documentation for libbpf including API autogen")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210802015037.787-1-rdunlap@infradead.org
2021-08-17 08:18:57 -07:00
Hangbin Liu
1a1e7a0612 bonding: add new option lacp_active
Add an option lacp_active, which is similar to team's runner.active.
This option specifies whether to send LACPDU frames periodically. If set
on, the LACPDU frames are sent along with the configured lacp_rate
setting. If set off, the LACPDU frames act as "speak when spoken to".

Note that the LACPDU state frames will still be sent when a port is initialized or unbound.

v2: remove module parameter

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
827963ffb3 sync: fix up docs sync path mapping
Kernel docs from Documentation/bpf/libbpf go straight to docs/ under libbpf.
Also ignore libbpf-only parts of docs subdir.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-16 22:37:57 -07:00
Andrii Nakryiko
4ab24e7d62 docs: initial set of libbpf docs
Add libbpf-related .rst files before they started being synced automatically.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-16 22:37:57 -07:00
grantseltzer
b2a63c974d docs: reconfigure libbpf documentation syncing
This adds documentation files, including ones for autogenerating API
documentation based on code comments in the source code that's pulled
in via the mirror.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
2021-08-16 22:30:23 -07:00
Quentin Monnet
88649fe655 ci: run script to test bpftool types/options sync
When new eBPF program, map, or attach types are added to the kernel,
bpftool needs to be updated in order to support the related features.
These updates should add the new types to the code itself, but also to
the help messages, documentation, and bash completion. Given that it is
easy to omit one of those, a script has been created to attempt to
validate that all parts have been consistently updated.

This new script for bpftool is hosted in the kernel repository, amongst
the BPF selftests. But it is not called from the Makefile, and not run
along with the other selftests. If it was, all patches updating the BPF
UAPI would require the relevant changes in bpftool at the same time, _in
the same patches_, which is not desirable.

To ensure that bpftool's parts remain in sync, let's run this script
from the CI. This patch adds a new section to the run.sh script, focused
on bpftool, and calling the new test_bpftool_synctypes.py.
2021-08-16 15:16:08 -07:00
Sergei Iudin
1778e0b1bd Make CI tests compatible with vanilla kernel tree
This is required to migrate kernel-patches CI to use this code
instead of a fork.
2021-08-11 16:06:23 -07:00
Andrii Nakryiko
64f027efda ci: restore all temporary disabled tests
Upstream bpf-next should be good, so no temporary blocked tests should remain.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-09 13:54:27 -07:00
Yucong Sun
6bf8babb33 Add a test step to produce a minimal binary using libbpf.
This patch adds a test step that links a minimal program against the produced
libbpf library, making sure that the library works.
2021-08-09 13:54:19 -07:00
Rafael David Tinoco
70ad3e8314 makefile: fix missing object for static compilation
Makefile needs relo_core object added to objects list to avoid static
linking errors when doing static compilation:

/bin/ld: .../libbpf.a(libbpf.o): in function `bpf_core_apply_relo':
.../libbpf/src/libbpf.c:5134: undefined reference to `bpf_core_apply_relo_insn'

Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com>
2021-08-09 13:54:19 -07:00
Andrii Nakryiko
dbdd8f3b34 ci: make CI build log less verbose
Only keep stderr output in case of errors for kernel and selftests builds.
Having a multi-thousand-line output isn't useful and slows down Github
Actions' log view UI.

Also quiet down wget's "progress bar" output. While at the same time see some
totals from tar, just for the fun of it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-09 13:54:19 -07:00
Andrii Nakryiko
52e96052a2 ci: blacklist newly migrated netcnt selftest
Seems like netcnt uses some map operations not supported by 5.5.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-09 13:54:19 -07:00
Andrii Nakryiko
41db5534d8 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   807b8f0e24e6004984094e1bcbbd2b297011a085
Checkpoint bpf-next commit: 372642ea83ff1c71a5d567a704c912359eb59776
Baseline bpf commit:        d6371c76e20d7d3f61b05fd67b596af4d14a8886
Checkpoint bpf commit:      a02215ce72a37a19a690803b23b091186ee4f7b2

Alexei Starovoitov (4):
  libbpf: Cleanup the layering between CORE and bpf_program.
  libbpf: Split bpf_core_apply_relo() into bpf_program independent
    helper.
  libbpf: Move CO-RE types into relo_core.h.
  libbpf: Split CO-RE logic into relo_core.c.

Daniel Xu (1):
  libbpf: Do not close un-owned FD 0 on errors

Evgeniy Litvinenko (1):
  libbpf: Add bpf_map__pin_path function

Hengqi Chen (1):
  libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf

Jason Wang (1):
  libbpf: Fix comment typo

Jiri Olsa (3):
  libbpf: Fix func leak in attach_kprobe
  libbpf: Allow decimal offset for kprobes
  libbpf: Export bpf_program__attach_kprobe_opts function

Martynas Pumputis (1):
  libbpf: Fix race when pinning maps in parallel

Quentin Monnet (4):
  libbpf: Return non-null error on failures in libbpf_find_prog_btf_id()
  libbpf: Rename btf__load() as btf__load_into_kernel()
  libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id()
  libbpf: Add split BTF support for btf__load_from_kernel_by_id()

Robin Gögge (1):
  libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT

 src/btf.c             |   50 +-
 src/btf.h             |   12 +-
 src/libbpf.c          | 1419 +++--------------------------------------
 src/libbpf.h          |   16 +
 src/libbpf.map        |    7 +
 src/libbpf_internal.h |   81 +--
 src/libbpf_probes.c   |    4 +-
 src/relo_core.c       | 1295 +++++++++++++++++++++++++++++++++++++
 src/relo_core.h       |  100 +++
 9 files changed, 1561 insertions(+), 1423 deletions(-)
 create mode 100644 src/relo_core.c
 create mode 100644 src/relo_core.h

--
2.30.2
2021-08-09 13:54:14 -07:00
Andrii Nakryiko
54a7bc87d5 ci: restore all temporary disabled tests
Upstream bpf-next should be good, so no temporary blocked tests should remain.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-06 20:09:33 -07:00
Yucong Sun
9979463ccf Remove shared linking for now 2021-08-06 15:04:04 -07:00
Yucong Sun
b91ca01922 Add a test step to produce a minimal binary using libbpf.
This patch adds a test step that links a minimal program against the produced
libbpf library, making sure that the library works.
2021-08-06 15:04:04 -07:00
Rafael David Tinoco
8ded7c6db0 makefile: fix missing object for static compilation
Makefile needs relo_core object added to objects list to avoid static
linking errors when doing static compilation:

/bin/ld: .../libbpf.a(libbpf.o): in function `bpf_core_apply_relo':
.../libbpf/src/libbpf.c:5134: undefined reference to `bpf_core_apply_relo_insn'

Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com>
2021-08-06 13:50:20 -07:00
Andrii Nakryiko
7df4ea0f0d ci: make CI build log less verbose
Only keep stderr output in case of errors for kernel and selftests builds.
Having a multi-thousand-line output isn't useful and slows down Github
Actions' log view UI.

Also quiet down wget's "progress bar" output. While at the same time see some
totals from tar, just for the fun of it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-04 23:54:46 -07:00
Daniel Xu
02efadd0b0 libbpf: Do not close un-owned FD 0 on errors
Before this patch, btf_new() was liable to close an arbitrary FD 0 if
BTF parsing failed. This was because:

* btf->fd was initialized to 0 through the calloc()
* btf__free() (in the `done` label) closed any FDs >= 0
* btf->fd is left at 0 if parsing fails

This issue was discovered on a system using libbpf v0.3 (without
BTF_KIND_FLOAT support) but with a kernel that had BTF_KIND_FLOAT types
in BTF. Thus, parsing fails.

While this patch technically doesn't fix any issues because upstream libbpf
has BTF_KIND_FLOAT support, it'll help prevent issues in the future if
more BTF types are added. It also allows the fix to be backported to
older libbpf versions.

Fixes: 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness")
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/5969bb991adedb03c6ae93e051fd2a00d293cf25.1627513670.git.dxu@dxuuu.xyz
2021-08-04 18:27:12 -07:00
Andrii Nakryiko
02333ba360 ci: blacklist newly migrated netcnt selftest
Seems like netcnt uses some map operations not supported by 5.5.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-04 18:27:12 -07:00
Robin Gögge
2805c2a4ca libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT
This patch fixes the probe for BPF_PROG_TYPE_CGROUP_SOCKOPT,
so the probe reports accurate results when used by e.g.
bpftool.

Fixes: 4cdbfb59c44a ("libbpf: support sockopt hooks")
Signed-off-by: Robin Gögge <r.goegge@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20210728225825.2357586-1-r.goegge@gmail.com
2021-08-04 18:27:12 -07:00
Andrii Nakryiko
6921017d25 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   807b8f0e24e6004984094e1bcbbd2b297011a085
Checkpoint bpf-next commit: 372642ea83ff1c71a5d567a704c912359eb59776
Baseline bpf commit:        d6371c76e20d7d3f61b05fd67b596af4d14a8886
Checkpoint bpf commit:      a02215ce72a37a19a690803b23b091186ee4f7b2

Alexei Starovoitov (4):
  libbpf: Cleanup the layering between CORE and bpf_program.
  libbpf: Split bpf_core_apply_relo() into bpf_program independent
    helper.
  libbpf: Move CO-RE types into relo_core.h.
  libbpf: Split CO-RE logic into relo_core.c.

Daniel Xu (1):
  libbpf: Do not close un-owned FD 0 on errors

Evgeniy Litvinenko (1):
  libbpf: Add bpf_map__pin_path function

Hengqi Chen (1):
  libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf

Jason Wang (1):
  libbpf: Fix comment typo

Jiri Olsa (3):
  libbpf: Fix func leak in attach_kprobe
  libbpf: Allow decimal offset for kprobes
  libbpf: Export bpf_program__attach_kprobe_opts function

Martynas Pumputis (1):
  libbpf: Fix race when pinning maps in parallel

Quentin Monnet (4):
  libbpf: Return non-null error on failures in libbpf_find_prog_btf_id()
  libbpf: Rename btf__load() as btf__load_into_kernel()
  libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id()
  libbpf: Add split BTF support for btf__load_from_kernel_by_id()

Robin Gögge (1):
  libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT

 src/btf.c             |   50 +-
 src/btf.h             |   12 +-
 src/libbpf.c          | 1419 +++--------------------------------------
 src/libbpf.h          |   16 +
 src/libbpf.map        |    7 +
 src/libbpf_internal.h |   81 +--
 src/libbpf_probes.c   |    4 +-
 src/relo_core.c       | 1295 +++++++++++++++++++++++++++++++++++++
 src/relo_core.h       |  100 +++
 9 files changed, 1561 insertions(+), 1423 deletions(-)
 create mode 100644 src/relo_core.c
 create mode 100644 src/relo_core.h

--
2.30.2
2021-08-04 18:27:12 -07:00
Hengqi Chen
e65d128903 libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf
Add two new APIs: btf__load_vmlinux_btf and btf__load_module_btf.
btf__load_vmlinux_btf is just an alias to the existing API named
libbpf_find_kernel_btf, rename to be more precisely and consistent
with existing BTF APIs. btf__load_module_btf can be used to load
module BTF, add it for completeness. These two APIs are useful for
implementing tracing tools and introspection tools. This is part
of the effort towards libbpf 1.0 ([0]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/280
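
A hedged usage sketch; the module name is a placeholder:

#include <bpf/btf.h>

/* Hedged sketch: load vmlinux BTF, then a module's BTF on top of it. */
static void load_kernel_btf(void)
{
	struct btf *vmlinux_btf = btf__load_vmlinux_btf();
	struct btf *mod_btf = btf__load_module_btf("nf_conntrack" /* hypothetical */,
						   vmlinux_btf);

	/* ... use the BTF objects ... */
	btf__free(mod_btf);
	btf__free(vmlinux_btf);
}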

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210730114012.494408-1-hengqi.chen@gmail.com
2021-08-04 18:27:12 -07:00
Quentin Monnet
512b472d97 libbpf: Add split BTF support for btf__load_from_kernel_by_id()
Add a new API function btf__load_from_kernel_by_id_split(), which takes
a pointer to a base BTF object in order to support split BTF objects
when retrieving BTF information from the kernel.

Reference: https://github.com/libbpf/libbpf/issues/314

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210729162028.29512-8-quentin@isovalent.com
2021-08-04 18:27:12 -07:00
Quentin Monnet
73788dd22f libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id()
Rename function btf__get_from_id() as btf__load_from_kernel_by_id() to
better indicate what the function does. Change the new function so that,
instead of requiring a pointer to the pointer to update and returning
with an error code, it takes a single argument (the id of the BTF
object) and returns the corresponding pointer. This is more in line with
the existing constructors.
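
A hedged sketch of the new calling convention; the BTF id is assumed to come
from elsewhere (e.g., bpf_prog_info):

#include <bpf/btf.h>
#include <bpf/libbpf.h>

/* Hedged sketch: the new API returns the pointer directly; errors are
 * encoded in the returned pointer.
 */
static struct btf *get_btf(__u32 id)
{
	struct btf *btf = btf__load_from_kernel_by_id(id);

	if (libbpf_get_error(btf))
		return NULL;
	return btf;
}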

The other tools calling the (soon-to-be) deprecated btf__get_from_id()
function will be updated in a future commit.

References:

- https://github.com/libbpf/libbpf/issues/278
- https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#btfh-apis

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210729162028.29512-4-quentin@isovalent.com
2021-08-04 18:27:12 -07:00
Quentin Monnet
a180eb551e libbpf: Rename btf__load() as btf__load_into_kernel()
As part of the effort to move towards a v1.0 for libbpf, rename
btf__load() function, used to "upload" BTF information into the kernel,
as btf__load_into_kernel(). This new name better reflects what the
function does.
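
A hedged before/after sketch of the rename:

#include <bpf/btf.h>

/* Hedged sketch: same semantics as the old btf__load(btf), new name. */
static int upload_btf(struct btf *btf)
{
	return btf__load_into_kernel(btf);
}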

References:

- https://github.com/libbpf/libbpf/issues/278
- https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#btfh-apis

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210729162028.29512-3-quentin@isovalent.com
2021-08-04 18:27:12 -07:00
Quentin Monnet
9d2b7e471b libbpf: Return non-null error on failures in libbpf_find_prog_btf_id()
Variable "err" is initialised to -EINVAL so that this error code is
returned when something goes wrong in libbpf_find_prog_btf_id().
However, a recent change in the function made use of the variable in
such a way that it is set to 0 if retrieving linear information on the
program is successful, and this 0 value remains if we error out on
failures at later stages.

Let's fix this by setting err to -EINVAL later in the function.

Fixes: e9fc3ce99b34 ("libbpf: Streamline error reporting for high-level APIs")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210729162028.29512-2-quentin@isovalent.com
2021-08-04 18:27:12 -07:00
Martynas Pumputis
3a0fc666ef libbpf: Fix race when pinning maps in parallel
When loading multiple programs in parallel that use the same to-be-pinned
map, it is possible that two instances of the loader will call
bpf_object__create_maps() at the same time. If the map doesn't exist
when both instances call bpf_object__reuse_map(), then one of the
instances will fail with EEXIST when calling bpf_map__pin().

Fix the race by retrying reusing a map if bpf_map__pin() returns
EEXIST. The fix is similar to the one in iproute2: e4c4685fd6e4 ("bpf:
Fix race condition with map pinning").

Before retrying the pinning, we don't do any special cleaning of the
internal map state. Closer code inspection revealed that it's not
required:

    - bpf_object__create_map(): map->inner_map is destroyed after a
      successful call, map->fd is closed if pinning fails.
    - bpf_object__populate_internal_map(): created map elements are
      destroyed upon close(map->fd).
    - init_map_slots(): slots are freed after their initialization.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210726152001.34845-1-m@lambda.lt
2021-08-04 18:27:12 -07:00
Jason Wang
7c25b1d569 libbpf: Fix comment typo
Remove the repeated word 'the' in line 48.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210727115928.74600-1-wangborong@cdjrlc.com
2021-08-04 18:27:12 -07:00
Alexei Starovoitov
d41e821ccf libbpf: Split CO-RE logic into relo_core.c.
Move CO-RE logic into separate file.
The internal interface between libbpf and CO-RE is through
bpf_core_apply_relo_insn() function and few structs defined in relo_core.h.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721000822.40958-5-alexei.starovoitov@gmail.com
2021-08-04 18:27:12 -07:00
Alexei Starovoitov
2fe57e40ac libbpf: Move CO-RE types into relo_core.h.
In order to make a clean split of CO-RE logic move its types
into independent header file.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721000822.40958-4-alexei.starovoitov@gmail.com
2021-08-04 18:27:12 -07:00
Alexei Starovoitov
f81dbd3475 libbpf: Split bpf_core_apply_relo() into bpf_program independent helper.
bpf_core_apply_relo() doesn't need to know bpf_program internals
and hashmap details.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721000822.40958-3-alexei.starovoitov@gmail.com
2021-08-04 18:27:12 -07:00
Alexei Starovoitov
035fd6aca0 libbpf: Cleanup the layering between CORE and bpf_program.
CO-RE processing functions don't need to know 'struct bpf_program' details.
Cleanup the layering to eventually be able to move CO-RE logic into a separate file.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721000822.40958-2-alexei.starovoitov@gmail.com
2021-08-04 18:27:12 -07:00
Evgeniy Litvinenko
e44c8486c6 libbpf: Add bpf_map__pin_path function
Add bpf_map__pin_path, so that the inconsistently named
bpf_map__get_pin_path can be deprecated later. This is part of the
effort towards libbpf v1.0: https://github.com/libbpf/libbpf/issues/307

Also, add a selftest for the new function.
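
A hedged usage sketch of the new getter:

#include <stdio.h>
#include <bpf/libbpf.h>

/* Hedged sketch: bpf_map__pin_path() mirrors bpf_map__get_pin_path(). */
static void show_pin_path(const struct bpf_map *map)
{
	const char *path = bpf_map__pin_path(map);

	if (path)
		printf("map pinned at %s\n", path);
}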

Signed-off-by: Evgeniy Litvinenko <evgeniyl@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210723221511.803683-1-evgeniyl@fb.com
2021-08-04 18:27:12 -07:00
Jiri Olsa
14f5433b2e libbpf: Export bpf_program__attach_kprobe_opts function
Export bpf_program__attach_kprobe_opts as a public API.

Rename bpf_program_attach_kprobe_opts to bpf_kprobe_opts and turn it into OPTS
struct.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20210721215810.889975-4-jolsa@kernel.org
2021-08-04 18:27:12 -07:00
Jiri Olsa
d7a2de020b libbpf: Allow decimal offset for kprobes
Allow specifying a decimal offset in the SEC macro, like:
  SEC("kprobe/bpf_fentry_test7+5")

Add selftest for that.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20210721215810.889975-3-jolsa@kernel.org
2021-08-04 18:27:12 -07:00
Jiri Olsa
bb92e7ab4d libbpf: Fix func leak in attach_kprobe
Add missing free() for func pointer in attach_kprobe function.

Fixes: a2488b5f483f ("libbpf: Allow specification of "kprobe/function+offset"")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20210721215810.889975-2-jolsa@kernel.org
2021-08-04 18:27:12 -07:00
Yucong Sun
3f22535d56 Fix text grouping issue on github actions
GitHub Actions grouping is broken because we were outputting "::endgroup" where
it needs "::endgroup::". This patch also adds some additional grouping around
the container setup phase, making the output easier to read.
2021-08-04 12:18:41 -07:00
Andrii Nakryiko
f8ab8bde8e ci: bump nightly Clang/LLVM version to 14
CI started to fail with missing clang-13 warning. Bump the version to 14.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-03 12:44:00 -07:00
Sergei Iudin
506a544834 ci: improve CI UX a little by setting names and hiding debug 2021-07-29 17:35:23 -07:00
Andrii Nakryiko
ec2c78c034 README: remove Travis CI build badge
We no longer use Travis CI.
2021-07-29 14:22:49 -07:00
Andrii Nakryiko
030ff87857 ci: fix log folding in Github Actions
Backport kernel-patches fix for the same issue ([0] and [1]).

  [0] 17c596fe8b
  [1] a38de685c7

Cc: Sergei Iudin <siudin@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-27 19:26:06 -07:00
Andrii Nakryiko
0db006d28e ci: add cancel-in-progress behavior for main test CI workflow
Make sure that only the latest enqueued workflow is running for any given PR
and/or branch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-22 23:22:42 -07:00
Andrii Nakryiko
6e6f18ac5d ci: add Github Actions status badge
Add badge to show the Github Actions test.yml workflow status.
2021-07-21 11:20:34 -07:00
Andrii Nakryiko
deca7932c3 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   08f71a1e39a1f07a464ac782d9b612d6a74c7015
Checkpoint bpf-next commit: 807b8f0e24e6004984094e1bcbbd2b297011a085
Baseline bpf commit:        d6371c76e20d7d3f61b05fd67b596af4d14a8886
Checkpoint bpf commit:      d6371c76e20d7d3f61b05fd67b596af4d14a8886

Alan Maguire (2):
  libbpf: Avoid use of __int128 in typed dump display
  libbpf: Propagate errors when retrieving enum value for typed data
    display

 src/btf_dump.c | 103 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 68 insertions(+), 35 deletions(-)

--
2.30.2
2021-07-21 11:16:01 -07:00
Alan Maguire
ebcae72279 libbpf: Propagate errors when retrieving enum value for typed data display
When retrieving the enum value associated with typed data during
"is data zero?" checking in btf_dump_type_data_check_zero(), the
return value of btf_dump_get_enum_value() is not passed to the caller
if the function returns a non-zero (error) value.  Currently, 0
is returned if the function returns an error.  We should instead
propagate the error to the caller.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626770993-11073-4-git-send-email-alan.maguire@oracle.com
2021-07-21 11:16:01 -07:00
Alan Maguire
64362b8896 libbpf: Avoid use of __int128 in typed dump display
__int128 is not supported for some 32-bit platforms (arm and i386).
__int128 was used in carrying out computations on bitfields which
aid display, but the same calculations could be done with __u64
with the small effect of not supporting 128-bit bitfields.

With these changes, a big-endian issue with casting 128-bit integers
to 64-bit for enum bitfields is solved also, as we now use 64-bit
integers for bitfield calculations.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626770993-11073-2-git-send-email-alan.maguire@oracle.com
2021-07-21 11:16:01 -07:00
Michal Suchanek
df01b246df README: State the source origin more prominently.
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
2021-07-20 14:45:49 -07:00
Michal Suchanek
6eb5e25905 Makefile: Default LIBSUBDIR to lib64 on 64bit architectures.
commit a82a66e ("Extend build and add install rules to Makefile") adds
special handling for LIBSUBDIR on x86_64. Expand this to all
architectures with 64 in their name, which suggests a 32-bit variant exists,
and s390x, which is the 64-bit extension of s390.

Fixes: #337
Fixes: a82a66e ("Extend build and add install rules to Makefile")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
2021-07-20 14:45:49 -07:00
Andrii Nakryiko
a603965dad sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   068dfc655b666b54e08fc3d7108b309d7f906d34
Checkpoint bpf-next commit: 08f71a1e39a1f07a464ac782d9b612d6a74c7015
Baseline bpf commit:        a6c39de76d709f30982d4b80a9b9537e1d388858
Checkpoint bpf commit:      d6371c76e20d7d3f61b05fd67b596af4d14a8886

Alan Maguire (3):
  libbpf: Clarify/fix unaligned data issues for btf typed dump
  libbpf: Fix compilation errors on ppc64le for btf dump typed data
  libbpf: Btf typed dump does not need to allocate dump data

Martynas Pumputis (1):
  libbpf: Fix removal of inner map in bpf_object__create_map

 src/btf_dump.c | 41 ++++++++++++++++++++++++++++++-----------
 src/libbpf.c   | 10 ++++------
 2 files changed, 34 insertions(+), 17 deletions(-)

--
2.30.2
2021-07-19 17:45:10 -07:00
Martynas Pumputis
f61c3b318b libbpf: Fix removal of inner map in bpf_object__create_map
If creating an outer map of a BTF-defined map-in-map fails (via
bpf_object__create_map()), then its previously created inner map
won't be destroyed.

Fix this by ensuring that the destroy routines are not bypassed in the
case of a failure.

Fixes: 646f02ffdd49c ("libbpf: Add BTF-defined map-in-map support")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210719173838.423148-2-m@lambda.lt
2021-07-19 17:45:10 -07:00
Alan Maguire
8235032464 libbpf: Btf typed dump does not need to allocate dump data
By using the stack for this small structure, we avoid the need
for freeing memory in error paths.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626475617-25984-4-git-send-email-alan.maguire@oracle.com
2021-07-19 17:45:10 -07:00
Alan Maguire
dc2c53b7f6 libbpf: Fix compilation errors on ppc64le for btf dump typed data
__s64 can be defined as either long or long long, depending on the
architecture. On ppc64le it's defined as long, giving this error:

 In file included from btf_dump.c:22:
btf_dump.c: In function 'btf_dump_type_data_check_overflow':
libbpf_internal.h:111:22: error: format '%lld' expects argument of
type 'long long int', but argument 3 has type '__s64' {aka 'long int'}
[-Werror=format=]
  111 |  libbpf_print(level, "libbpf: " fmt, ##__VA_ARGS__); \
      |                      ^~~~~~~~~~
libbpf_internal.h:114:27: note: in expansion of macro '__pr'
  114 | #define pr_warn(fmt, ...) __pr(LIBBPF_WARN, fmt, ##__VA_ARGS__)
      |                           ^~~~
btf_dump.c:1992:3: note: in expansion of macro 'pr_warn'
 1992 |   pr_warn("unexpected size [%lld] for id [%u]\n",
      |   ^~~~~~~
btf_dump.c:1992:32: note: format string is defined here
 1992 |   pr_warn("unexpected size [%lld] for id [%u]\n",
      |                             ~~~^
      |                                |
      |                                long long int
      |                             %ld

Cast to size_t and use %zu instead.

Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626475617-25984-3-git-send-email-alan.maguire@oracle.com
2021-07-19 17:45:10 -07:00
Alan Maguire
fb3809e940 libbpf: Clarify/fix unaligned data issues for btf typed dump
If data is packed, data structures can store it outside of usual
boundaries.  For example a 4-byte int can be stored on an unaligned
boundary in a case like this:

struct s {
	char f1;
	int f2;
} __attribute((packed));

...the int is stored at an offset of one byte.  Some platforms have
problems dereferencing data that is not aligned with its size, and
code exists to handle most cases of this for BTF typed data display.
However, pointer display was missed, and a simple helper to test
"ptr_is_aligned(data, data_sz)" would help clarify this code.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626475617-25984-2-git-send-email-alan.maguire@oracle.com
2021-07-19 17:45:10 -07:00
Fejes Ferenc
74d3571880 Update README.md 2021-07-19 16:43:05 -07:00
Fejes Ferenc
be570b29c1 Update README.md
Manjaro is a popular and friendly Arch-based distro. Recently they also enabled BTF support: https://forum.manjaro.org/t/co-re-support-in-kernel/46134/19

I can confirm that:
[user@pc ~]$ uname -a
Linux pc 5.12.16-1-MANJARO #1 SMP PREEMPT Sun Jul 11 13:23:34 UTC 2021 x86_64 GNU/Linux
[user@pc ~]$ ls -la /sys/kernel/btf/vmlinux
-r--r--r-- 1 root root 4226769 jul   17 15.27 /sys/kernel/btf/vmlinux
2021-07-19 16:43:05 -07:00
Sergei Iudin
9aa71e1040 Run apt-get update as a first step for GH actions
otherwise the container may contain stale cached repo metadata
2021-07-19 14:57:35 -07:00
Andrii Nakryiko
b3ffd258fc vmtest: blacklist 5.5 selftests
Add a few new selftests to the blacklist. They can't succeed on 5.5.
Also temporarily remove btf_dump for 4.9 due to newly added data dumping
subtests.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-19 11:36:37 -07:00
Andrii Nakryiko
4447ac82d4 ci: temporary work-around to get green CI builds back
Temporarily disable tc_bpf tests that seem to have regressed.

Temporarily and artificially bump the pahole version from 1.21 to 1.22 to get
per-CPU BTF data built.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-19 11:36:37 -07:00
Andrii Nakryiko
8fa229c455 ci: disable -Wstringop-truncation for GCC10 configurations as well
We used to have it disabled for GCC8, but now GCC10 falsely reports the same
warnings, so disable stringop-truncation warnings for GCC10 as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-19 11:36:37 -07:00
Andrii Nakryiko
8a670b7422 vmtest: regenerate latest vmlinux.h
This is necessary to make runqslower compile with task->__state field on old
kernels, for which we don't have an actual vmlinux.h.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-19 11:36:37 -07:00
Andrii Nakryiko
21f90f61b0 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f42cfb469f9b4a1c002a03cce3d9329376800a6f
Checkpoint bpf-next commit: 068dfc655b666b54e08fc3d7108b309d7f906d34
Baseline bpf commit:        61e8aeda9398925f8c6fc290585bdd9727d154c4
Checkpoint bpf commit:      a6c39de76d709f30982d4b80a9b9537e1d388858

Alan Maguire (2):
  libbpf: Allow specification of "kprobe/function+offset"
  libbpf: BTF dumper support for typed data

Alexei Starovoitov (2):
  bpf: Sync tools/include/uapi/linux/bpf.h
  bpf: Introduce bpf timers.

Jiri Olsa (3):
  bpf: Add bpf_get_func_ip helper for tracing programs
  bpf: Add bpf_get_func_ip helper for kprobe programs
  libbpf: Add bpf_program__attach_kprobe_opts function

Jonathan Edwards (1):
  libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading

Kumar Kartikeya Dwivedi (2):
  libbpf: Add request buffer type for netlink messages
  libbpf: Switch to void * casting in netlink helpers

Kuniyuki Iwashima (1):
  bpf: Fix a typo of reuseport map in bpf.h.

Martynas Pumputis (1):
  libbpf: Fix reuse of pinned map on older kernel

Shuyi Cheng (2):
  libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts'
  libbpf: Fix the possible memory leak on error

Toke Høiland-Jørgensen (1):
  libbpf: Restore errno return for functions that were already returning
    it

 include/uapi/linux/bpf.h |  85 +++-
 src/btf.h                |  19 +
 src/btf_dump.c           | 819 ++++++++++++++++++++++++++++++++++++++-
 src/libbpf.c             | 146 ++++++-
 src/libbpf.h             |   9 +-
 src/libbpf.map           |   1 +
 src/netlink.c            | 115 +++---
 src/nlattr.c             |   2 +-
 src/nlattr.h             |  38 +-
 9 files changed, 1117 insertions(+), 117 deletions(-)

--
2.30.2
2021-07-16 17:05:44 -07:00
Andrii Nakryiko
c8b1d14b03 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-07-16 17:05:44 -07:00
Alan Maguire
c0b2ceba1d libbpf: BTF dumper support for typed data
Add a BTF dumper for typed data, so that the user can dump a typed
version of the data provided.

The API is

int btf_dump__dump_type_data(struct btf_dump *d, __u32 id,
                             void *data, size_t data_sz,
                             const struct btf_dump_type_data_opts *opts);

...where the id is the BTF id of the data pointed to by the "void *"
argument; for example the BTF id of "struct sk_buff" for a
"struct skb *" data pointer.  Options supported are

 - a starting indent level (indent_lvl)
 - a user-specified indent string which will be printed once per
   indent level; if NULL, tab is chosen but any string <= 32 chars
   can be provided.
 - a set of boolean options to control dump display, similar to those
   used for BPF helper bpf_snprintf_btf().  Options are
        - compact : omit newlines and other indentation
        - skip_names: omit member names
        - emit_zeroes: show zero-value members

Default output format is identical to that dumped by bpf_snprintf_btf(),
for example a "struct sk_buff" representation would look like this:

(struct sk_buff){
	(union){
		(struct){
			.next = (struct sk_buff *)0xffffffffffffffff,
			.prev = (struct sk_buff *)0xffffffffffffffff,
		(union){
			.dev = (struct net_device *)0xffffffffffffffff,
			.dev_scratch = (long unsigned int)18446744073709551615,
		},
	},
...

If the data structure is larger than the *data_sz*
number of bytes that are available in *data*, as much
of the data as possible will be dumped and -E2BIG will
be returned.  This is useful as tracers will sometimes
not be able to capture all of the data associated with
a type; for example a "struct task_struct" is ~16k.
Being able to specify that only a subset is available is
important for such cases.  On success, the amount of data
dumped is returned.
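
A hedged usage sketch; it assumes `d` was created with btf_dump__new(),
`btf` is the corresponding BTF object, and `data` holds (part of) a
struct sk_buff:

#include <bpf/btf.h>

/* Hedged sketch: dump `data` as a struct sk_buff in compact form. */
static int dump_skb(struct btf_dump *d, const struct btf *btf,
		    void *data, size_t data_sz)
{
	DECLARE_LIBBPF_OPTS(btf_dump_type_data_opts, opts, .compact = true);
	__s32 id = btf__find_by_name_kind(btf, "sk_buff", BTF_KIND_STRUCT);

	if (id < 0)
		return id;
	return btf_dump__dump_type_data(d, id, data, data_sz, &opts);
}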

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626362126-27775-2-git-send-email-alan.maguire@oracle.com
2021-07-16 17:05:44 -07:00
Shuyi Cheng
bd25fc7df1 libbpf: Fix the possible memory leak on error
If the strdup() fails then we need to call bpf_object__close(obj) to
avoid a resource leak.

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626180159-112996-3-git-send-email-chengshuyi@linux.alibaba.com
2021-07-16 17:05:44 -07:00
Shuyi Cheng
4920031c88 libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts'
btf_custom_path allows developers to load custom BTF which libbpf will
subsequently use for CO-RE relocation instead of vmlinux BTF.

Having btf_custom_path in bpf_object_open_opts one can directly use the
skeleton's <objname>_bpf__open_opts() API to pass in the btf_custom_path
parameter, as opposed to using bpf_object__load_xattr() which is slated to be
deprecated ([0]).

This work continues previous work started by another developer ([1]).

  [0] https://lore.kernel.org/bpf/CAEf4BzbJZLjNoiK8_VfeVg_Vrg=9iYFv+po-38SMe=UzwDKJ=Q@mail.gmail.com/#t
  [1] https://yhbt.net/lore/all/CAEf4Bzbgw49w2PtowsrzKQNcxD4fZRE6AKByX-5-dMo-+oWHHA@mail.gmail.com/
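
A hedged sketch of opening an object with custom BTF; the object and BTF
paths are placeholders:

#include <bpf/libbpf.h>

/* Hedged sketch: point CO-RE relocation at a custom BTF file. */
static struct bpf_object *open_with_custom_btf(void)
{
	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
		.btf_custom_path = "/tmp/custom_vmlinux.btf" /* hypothetical */);

	return bpf_object__open_file("prog.bpf.o" /* hypothetical */, &opts);
}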

Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626180159-112996-2-git-send-email-chengshuyi@linux.alibaba.com
2021-07-16 17:05:44 -07:00
Alan Maguire
8fa50e86c1 libbpf: Allow specification of "kprobe/function+offset"
kprobes can be placed on most instructions in a function, not
just the entry, and ftrace and bpftrace support the function+offset
notation for probe placement.  Adding parsing of func_name
into func+offset to bpf_program__attach_kprobe() allows the
user to specify

SEC("kprobe/bpf_fentry_test5+0x6")

...for example, and the offset can be passed to perf_event_open_probe()
to support kprobe attachment.
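
The same notation can also be used when attaching from user space; a rough
editorial sketch (prog is an already-loaded kprobe program):

	struct bpf_link *link;

	/* attach 6 bytes past the entry of bpf_fentry_test5 */
	link = bpf_program__attach_kprobe(prog, false /* not a kretprobe */,
					  "bpf_fentry_test5+0x6");
	if (libbpf_get_error(link))
		/* handle attach error */;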

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-8-jolsa@kernel.org
2021-07-16 17:05:44 -07:00
Jiri Olsa
330a158982 libbpf: Add bpf_program__attach_kprobe_opts function
Adding bpf_program__attach_kprobe_opts that does the same
as bpf_program__attach_kprobe, but takes opts argument.

Currently opts struct holds just retprobe bool, but we will
add new field in following patch.

The function is not exported, so there's no need to add
size to the struct bpf_program_attach_kprobe_opts for now.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-7-jolsa@kernel.org
2021-07-16 17:05:44 -07:00
Jiri Olsa
a524ae0bbf bpf: Add bpf_get_func_ip helper for kprobe programs
Adding bpf_get_func_ip helper for BPF_PROG_TYPE_KPROBE programs,
so it's now possible to call bpf_get_func_ip from both kprobe and
kretprobe programs.

Taking the caller's address from 'struct kprobe::addr', which is
defined for both kprobe and kretprobe.
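
A minimal kprobe program using the helper could look like this (editorial
sketch):

	SEC("kprobe/bpf_fentry_test1")
	int BPF_KPROBE(on_entry)
	{
		/* IP of the probed function, works for kprobe and kretprobe */
		__u64 ip = bpf_get_func_ip(ctx);

		bpf_printk("entered function at 0x%lx", ip);
		return 0;
	}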

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-5-jolsa@kernel.org
2021-07-16 17:05:44 -07:00
Jiri Olsa
97e2a9c9a1 bpf: Add bpf_get_func_ip helper for tracing programs
Adding bpf_get_func_ip helper for BPF_PROG_TYPE_TRACING programs,
specifically for all trampoline attach types.

The trampoline stores the caller's IP address at the (ctx - 8) address,
so there's no reason to actually call the helper; instead, fix up
the call instruction and return the [ctx - 8] value directly.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-4-jolsa@kernel.org
2021-07-16 17:05:44 -07:00
Alexei Starovoitov
bef77595ca bpf: Introduce bpf timers.
Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded
in hash/array/lru maps as a regular field and helpers to operate on it:

// Initialize the timer.
// First 4 bits of 'flags' specify clockid.
// Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed.
long bpf_timer_init(struct bpf_timer *timer, struct bpf_map *map, int flags);

// Configure the timer to call 'callback_fn' static function.
long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn);

// Arm the timer to expire 'nsec' nanoseconds from the current time.
long bpf_timer_start(struct bpf_timer *timer, u64 nsec, u64 flags);

// Cancel the timer and wait for callback_fn to finish if it was running.
long bpf_timer_cancel(struct bpf_timer *timer);

Here is how BPF program might look like:
struct map_elem {
    int counter;
    struct bpf_timer timer;
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1000);
    __type(key, int);
    __type(value, struct map_elem);
} hmap SEC(".maps");

static int timer_cb(void *map, int *key, struct map_elem *val);
/* val points to particular map element that contains bpf_timer. */

SEC("fentry/bpf_fentry_test1")
int BPF_PROG(test1, int a)
{
    struct map_elem *val;
    int key = 0;

    val = bpf_map_lookup_elem(&hmap, &key);
    if (val) {
        bpf_timer_init(&val->timer, &hmap, CLOCK_REALTIME);
        bpf_timer_set_callback(&val->timer, timer_cb);
        bpf_timer_start(&val->timer, 1000 /* call timer_cb in 1 usec */, 0);
    }
    return 0;
}

This patch adds helper implementations that rely on hrtimers
to call bpf functions as timers expire.
The following patches add necessary safety checks.

Only programs with CAP_BPF are allowed to use bpf_timer.

The amount of timers used by the program is constrained by
the memcg recorded at map creation time.

The bpf_timer_init() helper needs an explicit 'map' argument because inner maps
are dynamic and not known at load time, while bpf_timer_set_callback() receives
a hidden 'aux->prog' argument supplied by the verifier.

The prog pointer is needed to do refcnting of bpf program to make sure that
program doesn't get freed while the timer is armed. This approach relies on
"user refcnt" scheme used in prog_array that stores bpf programs for
bpf_tail_call. The bpf_timer_set_callback() will increment the prog refcnt which is
paired with bpf_timer_cancel() that will drop the prog refcnt. The
ops->map_release_uref is responsible for cancelling the timers and dropping
prog refcnt when user space reference to a map reaches zero.
This uref approach is done to make sure that Ctrl-C of user space process will
not leave timers running forever unless the user space explicitly pinned a map
that contained timers in bpffs.

bpf_timer_init() and bpf_timer_set_callback() will return -EPERM if map doesn't
have user references (is not held by open file descriptor from user space and
not pinned in bpffs).

The bpf_map_delete_elem() and bpf_map_update_elem() operations cancel
and free the timer if given map element had it allocated.
"bpftool map update" command can be used to cancel timers.

The 'struct bpf_timer' is explicitly __attribute__((aligned(8))) because
a '__u64 :64' bitfield has only 1-byte alignment despite occupying 8 bytes of padding.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210715005417.78572-4-alexei.starovoitov@gmail.com
2021-07-16 17:05:44 -07:00
Kuniyuki Iwashima
6f7839f477 bpf: Fix a typo of reuseport map in bpf.h.
Fix s/BPF_MAP_TYPE_REUSEPORT_ARRAY/BPF_MAP_TYPE_REUSEPORT_SOCKARRAY/ typo
in bpf.h.

Fixes: 2dbb9b9e6df6 ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210714124317.67526-1-kuniyu@amazon.co.jp
2021-07-16 17:05:44 -07:00
Alexei Starovoitov
90aba5e582 bpf: Sync tools/include/uapi/linux/bpf.h
Commit 47316f4a3053 missed updating tools/.../bpf.h.
Sync it.

Fixes: 47316f4a3053 ("bpf: Support input xdp_md context in BPF_PROG_TEST_RUN")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-07-16 17:05:44 -07:00
Martynas Pumputis
4dc3aeb072 libbpf: Fix reuse of pinned map on older kernel
When loading a BPF program with a pinned map, the loader checks whether
the pinned map can be reused, i.e. their properties match. To obtain
those properties of the pinned map, the loader invokes BPF_OBJ_GET_INFO_BY_FD
and then does the comparison.

Unfortunately, on < 4.12 kernels the BPF_OBJ_GET_INFO_BY_FD is not
available, so loading the program fails with the following error:

	libbpf: failed to get map info for map FD 5: Invalid argument
	libbpf: couldn't reuse pinned map at
		'/sys/fs/bpf/tc/globals/cilium_call_policy': parameter
		mismatch"
	libbpf: map 'cilium_call_policy': error reusing pinned map
	libbpf: map 'cilium_call_policy': failed to create:
		Invalid argument(-22)
	libbpf: failed to load object 'bpf_overlay.o'

To fix this, fall back to deriving the map properties via
/proc/$PID/fdinfo/$MAP_FD if BPF_OBJ_GET_INFO_BY_FD fails with EINVAL,
which can be taken as an indicator that the kernel doesn't support
the latter.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210712125552.58705-1-m@lambda.lt
2021-07-16 17:05:44 -07:00
Toke Høiland-Jørgensen
4ce0551ee5 libbpf: Restore errno return for functions that were already returning it
The update to streamline libbpf error reporting intended to change all
functions to return the errno as a negative return value if
LIBBPF_STRICT_DIRECT_ERRS is set. However, if the flag is *not* set, the
return value changes for the two functions that were already returning a
negative errno unconditionally: bpf_link__unpin() and perf_buffer__poll().

This is a user-visible API change that breaks applications; so let's revert
these two functions back to unconditionally returning a negative errno
value.

Fixes: e9fc3ce99b34 ("libbpf: Streamline error reporting for high-level APIs")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210706122355.236082-1-toke@redhat.com
2021-07-16 17:05:44 -07:00
Kumar Kartikeya Dwivedi
f8411901c4 libbpf: Switch to void * casting in netlink helpers
Netlink helpers I added in 8bbb77b7c7a2 ("libbpf: Add various netlink
helpers") used char * casts everywhere, and there were a few more that
existed from before.

Convert all of them to void * cast, as it is treated equivalently by
clang/gcc for the purposes of pointer arithmetic and to follow the
convention elsewhere in the kernel/libbpf.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210619041454.417577-2-memxor@gmail.com
2021-07-16 17:05:44 -07:00
Kumar Kartikeya Dwivedi
9ff2b76693 libbpf: Add request buffer type for netlink messages
Coverity complains about OOB writes to nlmsghdr. There is no OOB as we
write to the trailing buffer, but static analyzers and compilers may
rightfully be confused as the nlmsghdr pointer has subobject provenance
(and hence subobject bounds).

Fix this by using an explicit request structure containing the nlmsghdr,
struct tcmsg/ifinfomsg, and attribute buffer.

Also switch nh_tail (renamed to req_tail) to cast req * to char * so
that it can be understood as arithmetic on pointer to the representation
array (hence having same bound as request structure), which should
further appease analyzers.

As a bonus, callers don't have to pass sizeof(req) all the time now, as
size is implicitly obtained using the pointer. While at it, also reduce
the size of attribute buffer to 128 bytes (132 for ifinfomsg using
functions due to the padding).

Summary of problem:

  Even though C standard allows interconvertibility of pointer to first
  member and pointer to struct, for the purposes of alias analysis it
  would still consider the first as having pointer value "pointer to T"
  where T is type of first member hence having subobject bounds,
  allowing analyzers within reason to complain when object is accessed
  beyond the size of pointed to object.

  The only exception to this rule may be when a char * is formed to a
  member subobject. It is not possible for the compiler to be able to
  tell the intent of the programmer that it is a pointer to member
  object or the underlying representation array of the containing
  object, so such diagnosis is suppressed.
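
The resulting request layout is roughly the following (editorial sketch;
the exact type and field names in netlink.c may differ):

	struct libbpf_nla_req {
		struct nlmsghdr nh;
		union {
			struct ifinfomsg ifinfo;
			struct tcmsg tc;
		};
		char buf[128];	/* trailing attribute buffer */
	};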

Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210619041454.417577-1-memxor@gmail.com
2021-07-16 17:05:44 -07:00
Jonathan Edwards
df023f5cfc libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading
eBPF has been backported for RHEL 7 w/ kernel 3.10-940+ [0]. However only
the following program types are supported [1]:

  BPF_PROG_TYPE_KPROBE
  BPF_PROG_TYPE_TRACEPOINT
  BPF_PROG_TYPE_PERF_EVENT

For libbpf this causes an EINVAL return during the bpf_object__probe_loading
call which only checks to see if programs of type BPF_PROG_TYPE_SOCKET_FILTER
can load.

The following will try BPF_PROG_TYPE_TRACEPOINT as a fallback attempt before
erroring out. BPF_PROG_TYPE_KPROBE was not a good candidate because on some
kernels it requires knowledge of the LINUX_VERSION_CODE.

  [0] https://www.redhat.com/en/blog/introduction-ebpf-red-hat-enterprise-linux-7
  [1] https://access.redhat.com/articles/3550581

Signed-off-by: Jonathan Edwards <jonathan.edwards@165gc.onmicrosoft.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210619151007.GA6963@165gc.onmicrosoft.com
2021-07-16 17:05:44 -07:00
Andrii Nakryiko
ae62c159ec include: initial sync of pkt_cls.h and pkt_sched.h
Add pkt_cls.h and pkt_sched.h to include/uapi/linux.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-16 14:22:07 -07:00
Yonghong Song
8bf016110e sync uapi headers linux/pkt_cls.h and linux/pkt_sched.h
Let us sync linux/{pkt_cls.h,pkt_sched.h} to libbpf repo.
Otherwise, on ubuntu 16.04, system headers will be picked up
and this will result in compilation error like:
  .../netlink.c:416:23: error: ‘TC_H_CLSACT’ undeclared (first use in this function)
     *parent = TC_H_MAKE(TC_H_CLSACT,
                         ^
  .../netlink.c:418:9: error: ‘TC_H_MIN_INGRESS’ undeclared (first use in this function)
           TC_H_MIN_INGRESS : TC_H_MIN_EGRESS);
           ^
  .../netlink.c:418:28: error: ‘TC_H_MIN_EGRESS’ undeclared (first use in this function)
           TC_H_MIN_INGRESS : TC_H_MIN_EGRESS);
                              ^
  .../netlink.c: In function ‘__get_tc_info’:
  .../netlink.c:522:11: error: ‘TCA_BPF_ID’ undeclared (first use in this function)
    if (!tbb[TCA_BPF_ID])
             ^

Signed-off-by: Yonghong Song <yhs@fb.com>
2021-07-12 14:01:21 -07:00
Yucong Sun
d3e4039a0a create ondemand vmtest workflow 2021-07-09 14:09:51 -07:00
Jussi Mäki
dd34504b43 vmtest: Set CONFIG_BONDING=y in latest.config
This is preparation for the XDP bonding patch set [1] to avoid having to mangle the kernel configuration from vmtest.sh.
 
[1]: https://lore.kernel.org/bpf/202106221509.kwNvAAZg-lkp@intel.com/T/#m4635dc0003944f38a54059b11147ab46abeffa13

Signed-off-by: Jussi Maki <joamaki@gmail.com>
2021-07-08 15:18:32 -07:00
Andrii Nakryiko
bec2ae0c6e sync: update rewritten bpf-next SHA
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-06 13:31:12 -07:00
Andrii Nakryiko
1d6106cf45 ci: blacklist few new tests on 5.5
tc_redirect and migrate_reuseport use new functionality not present on 5.5

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-06-18 13:05:10 -07:00
Andrii Nakryiko
95e51c1dbe ci: disable fail-fast for Github Actions tests
Make sure we run all of the tests even if some of them fail. This allows
testing each of them independently, especially the slow kernel LATEST test.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-06-18 13:05:10 -07:00
Andrii Nakryiko
db132757c9 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   cf68fa431d5da7ef0b5ea142dd603611696cbd44
Checkpoint bpf-next commit: f540a7d2c37f9ae0867de0a14bf06cf50b63d65e
Baseline bpf commit:        11fc79fc9f2e395aa39fa5baccae62767c5d8280
Checkpoint bpf commit:      61e8aeda9398925f8c6fc290585bdd9727d154c4

Kumar Kartikeya Dwivedi (2):
  libbpf: Remove unneeded check for flags during tc detach
  libbpf: Set NLM_F_EXCL when creating qdisc

Kuniyuki Iwashima (3):
  bpf: Support BPF_FUNC_get_socket_cookie() for
    BPF_PROG_TYPE_SK_REUSEPORT.
  bpf: Support socket migration by eBPF.
  libbpf: Set expected_attach_type for BPF_PROG_TYPE_SK_REUSEPORT.

Lorenz Bauer (1):
  libbpf: Fail compilation if target arch is missing

Wang Hai (1):
  libbpf: Simplify the return expression of bpf_object__init_maps
    function

grantseltzer (1):
  Add documentation for libbpf including API autogen

 include/uapi/linux/bpf.h |  16 ++++
 src/README.rst           | 168 ---------------------------------------
 src/bpf_tracing.h        |  46 ++++++++++-
 src/libbpf.c             |   9 ++-
 src/netlink.c            |   4 +-
 5 files changed, 64 insertions(+), 179 deletions(-)
 delete mode 100644 src/README.rst

--
2.30.2
2021-06-18 13:05:10 -07:00
grantseltzer
41cddf18f4 Add documentation for libbpf including API autogen
This patch is meant to start the initiative to document libbpf.
It includes .rst files which are text documentation describing building,
API naming convention, as well as an index to generated API documentation.

In this approach the generated API documentation is produced by the kernel's
existing documentation system, which uses sphinx. The resulting docs
would then be synced to kernel.org/doc.

You can test this by running `make htmldocs` and serving the html in
Documentation/output. Since libbpf does not yet have comments in kernel
doc format, see kernel.org/doc/html/latest/doc-guide/kernel-doc.html for
an example so you can test this.

The advantage of this approach is to use the existing sphinx
infrastructure that the kernel has, and have libbpf docs in
the same place as everything else.

The current plan is to have the libbpf mirror sync the generated docs
and version them based on the libbpf releases which are cut on github.

This patch includes the addition of libbpf_api.rst which pulls comment
documentation from header files in libbpf under tools/lib/bpf/. The comment
docs would be of the standard kernel doc format.

Signed-off-by: grantseltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210618140459.9887-2-grantseltzer@gmail.com
2021-06-18 13:05:10 -07:00
Lorenz Bauer
f883bbf3f4 libbpf: Fail compilation if target arch is missing
bpf2go is the Go equivalent of libbpf skeleton. The convention is that
the compiled BPF is checked into the repository to facilitate distributing
BPF as part of Go packages. To make this portable, bpf2go by default
generates both bpfel and bpfeb variants of the C.

Using bpf_tracing.h is inherently non-portable since the fields of
struct pt_regs differ between platforms, so CO-RE can't help us here.
The only way of working around this is to compile for each target
platform independently. bpf2go can't do this by default since there
are too many platforms.

Define the various PT_... macros when no target can be determined and
turn them into compilation failures. This works because bpf2go always
compiles for bpf targets, so the compiler fallback doesn't kick in.
Conditionally define __BPF_MISSING_TARGET so that we can inject a
more appropriate error message at build time. The user can then
choose which platform to target explicitly.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210616083635.11434-1-lmb@cloudflare.com
2021-06-18 13:05:10 -07:00
Kuniyuki Iwashima
db8982bcaa libbpf: Set expected_attach_type for BPF_PROG_TYPE_SK_REUSEPORT.
This commit introduces a new section (sk_reuseport/migrate) and sets the
appropriate expected_attach_type for each of the two sections of
BPF_PROG_TYPE_SK_REUSEPORT programs.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210612123224.12525-11-kuniyu@amazon.co.jp
2021-06-18 13:05:10 -07:00
Kuniyuki Iwashima
d1571ab5ce bpf: Support socket migration by eBPF.
This patch introduces a new bpf_attach_type for BPF_PROG_TYPE_SK_REUSEPORT
to check if the attached eBPF program is capable of migrating sockets. When
the eBPF program is attached, we run it for socket migration if the
expected_attach_type is BPF_SK_REUSEPORT_SELECT_OR_MIGRATE or
net.ipv4.tcp_migrate_req is enabled.

Currently, the expected_attach_type is not enforced for the
BPF_PROG_TYPE_SK_REUSEPORT type of program. Thus, this commit follows the
earlier idea in the commit aac3fc320d94 ("bpf: Post-hooks for sys_bind") to
fix up the zero expected_attach_type in bpf_prog_load_fixup_attach_type().

Moreover, this patch adds a new field (migrating_sk) to sk_reuseport_md to
select a new listener based on the child socket. migrating_sk varies
depending on whether it is migrating a request in the accept queue or during
the 3WHS.

  - accept_queue : sock (ESTABLISHED/SYN_RECV)
  - 3WHS         : request_sock (NEW_SYN_RECV)

In the eBPF program, we can select a new listener by
BPF_FUNC_sk_select_reuseport(). Also, we can cancel migration by returning
SK_DROP. This feature is useful when listeners have different settings at
the socket API level or when we want to free resources as soon as possible.

  - SK_PASS with selected_sk: select it as the new listener
  - SK_PASS with selected_sk NULL: fall back to random selection
  - SK_DROP: cancel the migration.
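
An editorial sketch of such a program (the sockarray map and key selection
are illustrative assumptions, not part of this patch):

	struct {
		__uint(type, BPF_MAP_TYPE_REUSEPORT_SOCKARRAY);
		__uint(max_entries, 16);
		__type(key, __u32);
		__type(value, __u64);
	} reuseport_map SEC(".maps");	/* assumed to be populated by user space */

	SEC("sk_reuseport/migrate")
	int select_or_migrate(struct sk_reuseport_md *md)
	{
		__u32 key = 0;

		if (!md->migrating_sk)
			return SK_PASS;
		/* pick a new listener for the migrating request */
		if (bpf_sk_select_reuseport(md, &reuseport_map, &key, 0) == 0)
			return SK_PASS;
		return SK_DROP; /* cancel the migration */
	}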

There is a noteworthy point. We select a listening socket in three places,
but we do not have struct skb at closing a listener or retransmitting a
SYN+ACK. On the other hand, some helper functions do not expect skb to be NULL
(e.g. skb_header_pointer() in BPF_FUNC_skb_load_bytes(), skb_tail_pointer()
in BPF_FUNC_skb_load_bytes_relative()). So we allocate an empty skb
temporarily before running the eBPF program.

Suggested-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/netdev/20201123003828.xjpjdtk4ygl6tg6h@kafai-mbp.dhcp.thefacebook.com/
Link: https://lore.kernel.org/netdev/20201203042402.6cskdlit5f3mw4ru@kafai-mbp.dhcp.thefacebook.com/
Link: https://lore.kernel.org/netdev/20201209030903.hhow5r53l6fmozjn@kafai-mbp.dhcp.thefacebook.com/
Link: https://lore.kernel.org/bpf/20210612123224.12525-10-kuniyu@amazon.co.jp
2021-06-18 13:05:10 -07:00
Kuniyuki Iwashima
03b0787342 bpf: Support BPF_FUNC_get_socket_cookie() for BPF_PROG_TYPE_SK_REUSEPORT.
We will call sock_reuseport.prog for socket migration in the next commit,
so the eBPF program has to know which listener is closing to select a new
listener.

We can currently get a unique ID of each listener in the userspace by
calling bpf_map_lookup_elem() for BPF_MAP_TYPE_REUSEPORT_SOCKARRAY map.

This patch makes the pointer of sk available in sk_reuseport_md so that we
can get the ID by BPF_FUNC_get_socket_cookie() in the eBPF program.
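
For example (editorial sketch), inside an sk_reuseport program the closing
listener's cookie can be obtained as:

	/* md is the struct sk_reuseport_md * context */
	__u64 cookie = bpf_get_socket_cookie(md->sk);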

Suggested-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/netdev/20201119001154.kapwihc2plp4f7zc@kafai-mbp.dhcp.thefacebook.com/
Link: https://lore.kernel.org/bpf/20210612123224.12525-9-kuniyu@amazon.co.jp
2021-06-18 13:05:10 -07:00
Kumar Kartikeya Dwivedi
a1bd8104a9 libbpf: Set NLM_F_EXCL when creating qdisc
This got lost during the refactoring across versions. We always use
NLM_F_EXCL when creating some TC object, so reflect what the function
says and set the flag.

Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210612023502.1283837-3-memxor@gmail.com
2021-06-18 13:05:10 -07:00
Kumar Kartikeya Dwivedi
ccead28901 libbpf: Remove unneeded check for flags during tc detach
Coverity complained about this being unreachable code. It is right
because we already enforce flags to be unset, so a check validating
the flag value is redundant.

Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210612023502.1283837-2-memxor@gmail.com
2021-06-18 13:05:10 -07:00
Wang Hai
0b59d75ecd libbpf: Simplify the return expression of bpf_object__init_maps function
There is no need for special treatment of the 'ret == 0' case.
This patch simplifies the return expression.

Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210609115651.3392580-1-wanghai38@huawei.com
2021-06-18 13:05:10 -07:00
Sergei Iudin
a5ee05d505 Run pahole staging once a day 2021-06-17 17:49:56 -07:00
Sergei Iudin
42ebbbce7d test pahole 2021-06-17 13:38:13 -07:00
Sergei Iudin
26497b9a88 Add coverity workflow 2021-06-15 16:13:44 -07:00
Sergei Iudin
5d5af3f07e Migrate libbpf ci to GH actions
changes to the docker command require running it in non-interactive mode
2021-06-15 14:13:57 -07:00
Andrii Nakryiko
899c45baa2 travis-ci: extend 5.5 blacklist
Blacklist few more recent selftests.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
95008d47dd Makefile: sync Makefile with upstream
Add gen_loader.o to list of built object files. Complete the list of installed
headers.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
13acc0af00 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f18ba26da88a89db9b50cb4ff47fadb159f2810b
Checkpoint bpf-next commit: cf68fa431d5da7ef0b5ea142dd603611696cbd44
Baseline bpf commit:        d0c0fe10ce6d87734b65c18dc8f4bcae3f4dbea4
Checkpoint bpf commit:      11fc79fc9f2e395aa39fa5baccae62767c5d8280

Alexei Starovoitov (12):
  bpf: Introduce bpf_sys_bpf() helper and program type.
  libbpf: Support for syscall program type
  bpf: Introduce fd_idx
  bpf: Add bpf_btf_find_by_name_kind() helper.
  bpf: Add bpf_sys_close() helper.
  libbpf: Change the order of data and text relocations.
  libbpf: Add bpf_object pointer to kernel_supports().
  libbpf: Preliminary support for fd_idx
  libbpf: Generate loader program out of BPF ELF file.
  libbpf: Cleanup temp FDs when intermediate sys_bpf fails.
  libbpf: Introduce bpf_map__initial_value().
  bpf: Add cmd alias BPF_PROG_RUN

Andrii Nakryiko (4):
  libbpf: Add libbpf_set_strict_mode() API to turn on libbpf 1.0
    behaviors
  libbpf: Streamline error reporting for low-level APIs
  libbpf: Streamline error reporting for high-level APIs
  libbpf: Move few APIs from 0.4 to 0.5 version

Denis Salopek (2):
  bpf: Add lookup_and_delete_elem support to hashtab
  bpf: Extend libbpf with bpf_map_lookup_and_delete_elem_flags

Florent Revest (1):
  libbpf: Move BPF_SEQ_PRINTF and BPF_SNPRINTF to bpf_helpers.h

Hangbin Liu (1):
  xdp: Extend xdp_redirect_map with broadcast support

Kev Jackson (1):
  libbpf: Fixes incorrect rx_ring_setup_done

Michal Suchanek (1):
  libbpf: Fix pr_warn type warnings on 32bit

Stanislav Fomichev (1):
  libbpf: Skip bpf_object__probe_loading for light skeleton

 include/uapi/linux/bpf.h |  66 ++-
 src/bpf.c                | 179 +++++---
 src/bpf.h                |   2 +
 src/bpf_gen_internal.h   |  41 ++
 src/bpf_helpers.h        |  66 +++
 src/bpf_prog_linfo.c     |  18 +-
 src/bpf_tracing.h        |  62 +--
 src/btf.c                | 302 ++++++-------
 src/btf_dump.c           |  14 +-
 src/gen_loader.c         | 729 +++++++++++++++++++++++++++++++
 src/libbpf.c             | 909 +++++++++++++++++++++++++--------------
 src/libbpf.h             |  14 +
 src/libbpf.map           |   8 +
 src/libbpf_errno.c       |   7 +-
 src/libbpf_internal.h    |  55 +++
 src/libbpf_legacy.h      |  59 +++
 src/linker.c             |  22 +-
 src/netlink.c            |  81 ++--
 src/ringbuf.c            |  26 +-
 src/skel_internal.h      | 123 ++++++
 src/xsk.c                |   2 +-
 21 files changed, 2135 insertions(+), 650 deletions(-)
 create mode 100644 src/bpf_gen_internal.h
 create mode 100644 src/gen_loader.c
 create mode 100644 src/libbpf_legacy.h
 create mode 100644 src/skel_internal.h

--
2.30.2
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
1b9138e452 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-06-08 13:04:27 -07:00
Kev Jackson
2da7f66d3f libbpf: Fixes incorrect rx_ring_setup_done
When calling xsk_socket__create_shared(), the logic at line 1097 marks a
boolean flag true within the xsk_umem structure to track setup progress
in order to support multiple calls to the function.  However, instead of
marking umem->tx_ring_setup_done, the code incorrectly sets
umem->rx_ring_setup_done.  This leads to improper behaviour when
creating and destroying xsk and umem structures.

Multiple calls to this function are documented as supported.

Fixes: ca7a83e2487a ("libbpf: Only create rx and tx XDP rings when necessary")
Signed-off-by: Kev Jackson <foamdino@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/YL4aU4f3Aaik7CN0@linux-dev
2021-06-08 13:04:27 -07:00
Michal Suchanek
9d5ac4931d libbpf: Fix pr_warn type warnings on 32bit
The printed value is ptrdiff_t and is formatted with %ld. This works on
64-bit but produces a warning on 32-bit. Fix the format specifier to %td.

Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210604112448.32297-1-msuchanek@suse.de
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
5bfbb36440 libbpf: Move few APIs from 0.4 to 0.5 version
Official libbpf 0.4 release doesn't include three APIs that were tentatively
put into 0.4 section. Fix libbpf.map and move these three APIs:

  - bpf_map__initial_value;
  - bpf_map_lookup_and_delete_elem_flags;
  - bpf_object__gen_loader.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210603004026.2698513-2-andrii@kernel.org
2021-06-08 13:04:27 -07:00
Florent Revest
343f63e245 libbpf: Move BPF_SEQ_PRINTF and BPF_SNPRINTF to bpf_helpers.h
These macros are convenient wrappers around the bpf_seq_printf and
bpf_snprintf helpers. They are currently provided by bpf_tracing.h which
targets low level tracing primitives. bpf_helpers.h is a better fit.

The __bpf_narg and __bpf_apply are needed in both files and provided
twice. __bpf_empty isn't used anywhere and is removed from bpf_tracing.h

Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210526164643.2881368-1-revest@chromium.org
2021-06-08 13:04:27 -07:00
Hangbin Liu
0dccb885a3 xdp: Extend xdp_redirect_map with broadcast support
This patch adds two flags BPF_F_BROADCAST and BPF_F_EXCLUDE_INGRESS to
extend xdp_redirect_map for broadcast support.

With BPF_F_BROADCAST the packet will be broadcast to all the interfaces
in the map. With BPF_F_EXCLUDE_INGRESS the ingress interface will be
excluded when broadcasting.
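
A minimal XDP program using the new flags might look like this (editorial
sketch; dev_map is an assumed BPF_MAP_TYPE_DEVMAP filled by user space):

	struct {
		__uint(type, BPF_MAP_TYPE_DEVMAP);
		__uint(max_entries, 32);
		__uint(key_size, sizeof(__u32));
		__uint(value_size, sizeof(__u32));
	} dev_map SEC(".maps");	/* assumed devmap populated by user space */

	SEC("xdp")
	int xdp_broadcast(struct xdp_md *ctx)
	{
		/* clone the packet to every interface in dev_map except the ingress one */
		return bpf_redirect_map(&dev_map, 0,
					BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS);
	}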

When getting the devices in dev hash map via dev_map_hash_get_next_key(),
there is a possibility that we fall back to the first key when a device
was removed. This will duplicate packets on some interfaces. So just walk
the whole buckets to avoid this issue. For dev array map, we also walk the
whole map to find valid interfaces.

Function bpf_clear_redirect_map() was removed in
commit ee75aef23afe ("bpf, xdp: Restructure redirect actions").
Add it back as we need to use ri->map again.

With test topology:
  +-------------------+             +-------------------+
  | Host A (i40e 10G) |  ---------- | eno1(i40e 10G)    |
  +-------------------+             |                   |
                                    |   Host B          |
  +-------------------+             |                   |
  | Host C (i40e 10G) |  ---------- | eno2(i40e 10G)    |
  +-------------------+             |                   |
                                    |          +------+ |
                                    | veth0 -- | Peer | |
                                    | veth1 -- |      | |
                                    | veth2 -- |  NS  | |
                                    |          +------+ |
                                    +-------------------+

On Host A:
 # pktgen/pktgen_sample03_burst_single_flow.sh -i eno1 -d $dst_ip -m $dst_mac -s 64

On Host B(Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz, 128G Memory):
Use xdp_redirect_map and xdp_redirect_map_multi in samples/bpf for testing.
All the veth peers in the NS have an XDP_DROP program loaded. The
forward_map max_entries in xdp_redirect_map_multi is modified to 4.

Testing the performance impact on the regular xdp_redirect path with and
without patch (to check impact of additional check for broadcast mode):

5.12 rc4         | redirect_map        i40e->i40e      |    2.0M |  9.7M
5.12 rc4         | redirect_map        i40e->veth      |    1.7M | 11.8M
5.12 rc4 + patch | redirect_map        i40e->i40e      |    2.0M |  9.6M
5.12 rc4 + patch | redirect_map        i40e->veth      |    1.7M | 11.7M

Testing the performance when cloning packets with the redirect_map_multi
test, using a redirect map size of 4, filled with 1-3 devices:

5.12 rc4 + patch | redirect_map multi  i40e->veth (x1) |    1.7M | 11.4M
5.12 rc4 + patch | redirect_map multi  i40e->veth (x2) |    1.1M |  4.3M
5.12 rc4 + patch | redirect_map multi  i40e->veth (x3) |    0.8M |  2.6M

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20210519090747.1655268-3-liuhangbin@gmail.com
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
8e3a63ea48 libbpf: Streamline error reporting for high-level APIs
Implement changes to error reporting for high-level libbpf APIs to make them
less surprising and less error-prone to users:
  - in all the cases when error happens, errno is set to an appropriate error
    value;
  - in libbpf 1.0 mode, all pointer-returning APIs return NULL on error and
    error code is communicated through errno; this applies both to APIs that
    already returned NULL before (so now they communicate more detailed error
    codes), as well as for many APIs that used ERR_PTR() macro and encoded
    error numbers as fake pointers.
  - in legacy (default) mode, those APIs that were returning ERR_PTR(err),
    continue doing so, but still set errno.

With these changes, errno can be always used to extract actual error,
regardless of legacy or libbpf 1.0 modes. This is utilized internally in
libbpf in places where libbpf uses its own high-level APIs.
libbpf_get_error() is adapted to handle both cases completely transparently to
end-users (and is used by libbpf consistently as well).
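
In practice, callers that need to work in both modes can keep using
libbpf_get_error(), e.g. (editorial sketch):

	struct bpf_link *link = bpf_program__attach(prog);
	long err = libbpf_get_error(link);

	if (err) {
		/* err is a negative errno in both legacy and 1.0 modes */
		fprintf(stderr, "attach failed: %ld\n", err);
		link = NULL;
	}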

More context, justification, and discussion can be found in "Libbpf: the road
to v1.0" document ([0]).

  [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54nNTY

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210525035935.1461796-5-andrii@kernel.org
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
7c7ba067fc libbpf: Streamline error reporting for low-level APIs
Ensure that low-level APIs behave uniformly across the libbpf as follows:
  - in case of an error, errno is always set to the correct error code;
  - when libbpf 1.0 mode is enabled with LIBBPF_STRICT_DIRECT_ERRS option to
    libbpf_set_strict_mode(), return -Exxx error value directly, instead of -1;
  - by default, until libbpf 1.0 is released, keep returning -1 directly.

More context, justification, and discussion can be found in "Libbpf: the road
to v1.0" document ([0]).

  [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54nNTY

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210525035935.1461796-4-andrii@kernel.org
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
12eb2666d9 libbpf: Add libbpf_set_strict_mode() API to turn on libbpf 1.0 behaviors
Add libbpf_set_strict_mode() API that allows application to simulate libbpf
1.0 breaking changes before libbpf 1.0 is released. This will help users
migrate gradually and with confidence.

For now only ALL or NONE options are available, subsequent patches will add
more flags. This patch is preliminary for selftests/bpf changes.
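
Typical usage at application startup (editorial sketch):

	/* opt in to all libbpf 1.0 behaviors up front */
	if (libbpf_set_strict_mode(LIBBPF_STRICT_ALL))
		fprintf(stderr, "failed to enable libbpf strict mode\n");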

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210525035935.1461796-2-andrii@kernel.org
2021-06-08 13:04:27 -07:00
Denis Salopek
234dea015b bpf: Extend libbpf with bpf_map_lookup_and_delete_elem_flags
Add bpf_map_lookup_and_delete_elem_flags() libbpf API in order to use
the BPF_F_LOCK flag with the map_lookup_and_delete_elem() function.
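
For example (editorial sketch; map_fd, key and value are assumed to exist):

	/* atomically read and delete the element under the map's spin lock */
	int err = bpf_map_lookup_and_delete_elem_flags(map_fd, &key,
						       &value, BPF_F_LOCK);
	if (err)
		/* element not found, or the map has no spin lock */;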

Signed-off-by: Denis Salopek <denis.salopek@sartura.hr>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/15b05dafe46c7e0750d110f233977372029d1f62.1620763117.git.denis.salopek@sartura.hr
2021-06-08 13:04:27 -07:00
Denis Salopek
c3c2e52201 bpf: Add lookup_and_delete_elem support to hashtab
Extend the existing bpf_map_lookup_and_delete_elem() functionality to
hashtab map types, in addition to stacks and queues.
Create a new hashtab bpf_map_ops function that does lookup and deletion
of the element under the same bucket lock and add the created map_ops to
bpf.h.

Signed-off-by: Denis Salopek <denis.salopek@sartura.hr>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/4d18480a3e990ffbf14751ddef0325eed3be2966.1620763117.git.denis.salopek@sartura.hr
2021-06-08 13:04:27 -07:00
Stanislav Fomichev
b79c698300 libbpf: Skip bpf_object__probe_loading for light skeleton
I'm getting the following error when running 'gen skeleton -L' as
regular user:

libbpf: Error in bpf_object__probe_loading():Operation not permitted(1).
Couldn't load trivial BPF program. Make sure your kernel supports BPF
(CONFIG_BPF_SYSCALL=y) and/or that RLIMIT_MEMLOCK is set to big enough
value.

Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210521030653.2626513-1-sdf@google.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
546199a723 bpf: Add cmd alias BPF_PROG_RUN
Add BPF_PROG_RUN command as an alias to BPF_PROG_TEST_RUN to better
indicate the full range of use cases done by the command.

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210519014032.20908-1-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
b44566c71b libbpf: Introduce bpf_map__initial_value().
Introduce bpf_map__initial_value() to read initial contents
of mmaped data/rodata/bss maps.
Note that bpf_map__set_initial_value() doesn't allow modifying the
kconfig map, while bpf_map__initial_value() does allow reading
its values.
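
For example (editorial sketch; skel is an assumed skeleton with an .rodata map):

	size_t sz;
	/* returns a pointer to the mmaped initial contents and its size */
	void *ro = bpf_map__initial_value(skel->maps.rodata, &sz);

	if (ro)
		/* inspect the sz bytes of initial .rodata contents */;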

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-17-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
594960b3db libbpf: Cleanup temp FDs when intermediate sys_bpf fails.
Fix loader program to close temporary FDs when intermediate
sys_bpf command fails.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-16-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
694a70c522 libbpf: Generate loader program out of BPF ELF file.
The BPF program loading process performed by libbpf is quite complex
and consists of the following steps:
"open" phase:
- parse elf file and remember relocations, sections
- collect externs and ksyms including their btf_ids in prog's BTF
- patch BTF datasec (since llvm couldn't do it)
- init maps (old style map_def, BTF based, global data map, kconfig map)
- collect relocations against progs and maps
"load" phase:
- probe kernel features
- load vmlinux BTF
- resolve externs (kconfig and ksym)
- load program BTF
- init struct_ops
- create maps
- apply CO-RE relocations
- patch ld_imm64 insns with src_reg=PSEUDO_MAP, PSEUDO_MAP_VALUE, PSEUDO_BTF_ID
- reposition subprograms and adjust call insns
- sanitize and load progs

During this process libbpf does sys_bpf() calls to load BTF, create maps,
populate maps and finally load programs.
Instead of actually doing the syscalls generate a trace of what libbpf
would have done and represent it as the "loader program".
The "loader program" consists of single map with:
- union bpf_attr(s)
- BTF bytes
- map value bytes
- insns bytes
and a single bpf program that passes bpf_attr(s) and data into the bpf_sys_bpf() helper.
Executing such a "loader program" via the bpf_prog_test_run() command will
replay the sequence of syscalls that libbpf would have done, which will result
in the same maps being created and programs loaded as specified in the elf file.
The "loader program" removes libelf and the majority of the libbpf dependency from
the program loading process.

kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.

The order of relocate_data and relocate_calls had to change, so that
bpf_gen__prog_load() can see all relocations for a given program with
correct insn_idx-es.
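
From the API side, generation is requested before load, roughly like this
(editorial sketch; obj is an assumed struct bpf_object *, error handling trimmed):

	DECLARE_LIBBPF_OPTS(gen_loader_opts, gen_opts);
	int err = bpf_object__gen_loader(obj, &gen_opts);

	if (!err)
		err = bpf_object__load(obj);
	/* after load, gen_opts.data and gen_opts.insns describe the loader program */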

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-15-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
c96f2f1b29 libbpf: Preliminary support for fd_idx
Prep libbpf to use FD_IDX kernel feature when generating loader program.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-14-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
ac2095783a libbpf: Add bpf_object pointer to kernel_supports().
Add a pointer to 'struct bpf_object' to kernel_supports() helper.
It will be used in the next patch.
No functional changes.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-13-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
fecf2cf6dd libbpf: Change the order of data and text relocations.
In order to be able to generate loader program in the later
patches change the order of data and text relocations.
Also improve the test to include data relos.

If the kernel supports "FD array" the map_fd relocations can be processed
before text relos since generated loader program won't need to manually
patch ld_imm64 insns with map_fd.
But ksym and kfunc relocations can only be processed after all calls
are relocated, since loader program will consist of a sequence
of calls to bpf_btf_find_by_name_kind() followed by patching of btf_id
and btf_obj_fd into corresponding ld_imm64 insns. The locations of those
ld_imm64 insns are specified in relocations.
Hence process all data relocations (maps, ksym, kfunc) together after call relos.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-12-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
c1f36fb3e3 bpf: Add bpf_sys_close() helper.
Add bpf_sys_close() helper to be used by the syscall/loader program to close
intermediate FDs and other cleanup.
Note this helper must never be allowed inside fdget/fdput bracketing.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-11-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
6eac86910c bpf: Add bpf_btf_find_by_name_kind() helper.
Add new helper:
long bpf_btf_find_by_name_kind(char *name, int name_sz, u32 kind, int flags)
Description
	Find BTF type with given name and kind in vmlinux BTF or in module's BTFs.
Return
	Returns btf_id and btf_obj_fd in lower and upper 32 bits.

It will be used by loader program to find btf_id to attach the program to
and to find btf_ids of ksyms.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-10-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
64a654f398 bpf: Introduce fd_idx
Typical program loading sequence involves creating bpf maps and applying
map FDs into bpf instructions in various places in the bpf program.
This job is done by libbpf that is using compiler generated ELF relocations
to patch certain instruction after maps are created and BTFs are loaded.
The goal of fd_idx is to allow bpf instructions to stay immutable
after compilation. At load time the libbpf would still create maps as usual,
but it wouldn't need to patch instructions. It would store map_fds into
__u32 fd_array[] and would pass that pointer to sys_bpf(BPF_PROG_LOAD).

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-9-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
34eb4fb3f1 libbpf: Support for syscall program type
Trivial support for syscall program type.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-5-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
007709011e bpf: Introduce bpf_sys_bpf() helper and program type.
Add placeholders for bpf_sys_bpf() helper and new program type.
Make sure to check that expected_attach_type is zero for future extensibility.
Allow tracing helper functions to be used in this program type, since they will
only execute from user context via bpf_prog_test_run.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-2-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Yonghong Song
db9614b6bd libbpf: Add support for new llvm bpf relocations
LLVM patch https://reviews.llvm.org/D102712
narrowed the scope of existing R_BPF_64_64
and R_BPF_64_32 relocations, and added three
new relocations, R_BPF_64_ABS64, R_BPF_64_ABS32
and R_BPF_64_NODYLD32. The main motivation is
to make relocations linker friendly.

This change, unfortunately, breaks libbpf build,
and we will see errors like below:
  libbpf: ELF relo #0 in section #6 has unexpected type 2 in
     /home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpf_tcp_nogpl.o
  Error: failed to link
     '/home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpf_tcp_nogpl.o':
     Unknown error -22 (-22)
The new relocation R_BPF_64_ABS64 is generated
and libbpf linker sanity check doesn't understand it.
Relocation section '.rel.struct_ops' at offset 0x1410 contains 1 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name
0000000000000018  0000000700000002 R_BPF_64_ABS64         0000000000000000 nogpltcp_init

Look at the selftests/bpf/bpf_tcp_nogpl.c,
  void BPF_STRUCT_OPS(nogpltcp_init, struct sock *sk)
  {
  }

  SEC(".struct_ops")
  struct tcp_congestion_ops bpf_nogpltcp = {
          .init           = (void *)nogpltcp_init,
          .name           = "bpf_nogpltcp",
  };
The new llvm relocation scheme categorizes 'nogpltcp_init' reference
as R_BPF_64_ABS64 instead of R_BPF_64_64 which is used to specify
ld_imm64 relocation in the new scheme.

Let us fix the linker sanity checking by including
R_BPF_64_ABS64 and R_BPF_64_ABS32. There is no need to
check R_BPF_64_NODYLD32 which is used for .BTF and .BTF.ext.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210522162341.3687617-1-yhs@fb.com
2021-05-24 21:24:56 -07:00
Andrii Nakryiko
57375504c6 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   c1cccec9c63637c4c5ee0aa2da2850d983c19e88
Checkpoint bpf-next commit: f18ba26da88a89db9b50cb4ff47fadb159f2810b
Baseline bpf commit:        c9a7c013569d73881199a7a3011c03336f592cc8
Checkpoint bpf commit:      d0c0fe10ce6d87734b65c18dc8f4bcae3f4dbea4

Andrii Nakryiko (1):
  libbpf: Reject static entry-point BPF programs

Kumar Kartikeya Dwivedi (2):
  libbpf: Add various netlink helpers
  libbpf: Add low level TC-BPF management API

 src/libbpf.c   |   5 +
 src/libbpf.h   |  44 ++++
 src/libbpf.map |   5 +
 src/netlink.c  | 568 +++++++++++++++++++++++++++++++++++++++++--------
 src/nlattr.h   |  48 +++++
 5 files changed, 587 insertions(+), 83 deletions(-)

--
2.30.2
2021-05-17 17:12:06 -07:00
Kumar Kartikeya Dwivedi
d71ff87a2d libbpf: Add low level TC-BPF management API
This adds functions that wrap the netlink API used for adding, manipulating,
and removing traffic control filters.

The API summary:

A bpf_tc_hook represents a location where a TC-BPF filter can be attached.
This means that creating a hook leads to creation of the backing qdisc,
while destruction either removes all filters attached to a hook, or destroys
qdisc if requested explicitly (as discussed below).

The TC-BPF API functions operate on this bpf_tc_hook to attach, replace,
query, and detach tc filters. All functions return 0 on success, and a
negative error code on failure.

bpf_tc_hook_create - Create a hook
Parameters:
	@hook - Cannot be NULL, ifindex > 0, attach_point must be set to
		proper enum constant. Note that parent must be unset when
		attach_point is one of BPF_TC_INGRESS or BPF_TC_EGRESS. Note
		that as an exception BPF_TC_INGRESS|BPF_TC_EGRESS is also a
		valid value for attach_point.

		Returns -EOPNOTSUPP when hook has attach_point as BPF_TC_CUSTOM.

bpf_tc_hook_destroy - Destroy a hook
Parameters:
	@hook - Cannot be NULL. The behaviour depends on value of
		attach_point. If BPF_TC_INGRESS, all filters attached to
		the ingress hook will be detached. If BPF_TC_EGRESS, all
		filters attached to the egress hook will be detached. If
		BPF_TC_INGRESS|BPF_TC_EGRESS, the clsact qdisc will be
		deleted, also detaching all filters. As before, parent must
		be unset for these attach_points, and set for BPF_TC_CUSTOM.

		It is advised that if the qdisc is operated on by many programs,
		then the program at least check that there are no other existing
		filters before deleting the clsact qdisc. An example is shown
		below:

		DECLARE_LIBBPF_OPTS(bpf_tc_hook, .ifindex = if_nametoindex("lo"),
				    .attach_point = BPF_TC_INGRESS);
		/* set opts as NULL, as we're not really interested in
		 * getting any info for a particular filter, but just
	 	 * detecting its presence.
		 */
		r = bpf_tc_query(&hook, NULL);
		if (r == -ENOENT) {
			/* no filters */
			hook.attach_point = BPF_TC_INGRESS|BPF_TC_EGRESS;
			return bpf_tc_hook_destroy(&hook);
		} else {
			/* failed or r == 0, the latter means filters do exist */
			return r;
		}

		Note that there is a small race between checking for no
		filters and deleting the qdisc. This is currently unavoidable.

		Returns -EOPNOTSUPP when hook has attach_point as BPF_TC_CUSTOM.

bpf_tc_attach - Attach a filter to a hook
Parameters:
	@hook - Cannot be NULL. Represents the hook the filter will be
		attached to. Requirements for ifindex and attach_point are
		same as described in bpf_tc_hook_create, but BPF_TC_CUSTOM
		is also supported.  In that case, parent must be set to the
		handle where the filter will be attached (using BPF_TC_PARENT).
		E.g. to set parent to 1:16 like in tc command line, the
		equivalent would be BPF_TC_PARENT(1, 16).

	@opts - Cannot be NULL. The following opts are optional:
		* handle   - The handle of the filter
		* priority - The priority of the filter
			     Must be >= 0 and <= UINT16_MAX
		Note that when left unset, they will be auto-allocated by
		the kernel. The following opts must be set:
		* prog_fd - The fd of the loaded SCHED_CLS prog
		The following opts must be unset:
		* prog_id - The ID of the BPF prog
		The following opts are optional:
		* flags - Currently only BPF_TC_F_REPLACE is allowed. It
			  allows replacing an existing filter instead of
			  failing with -EEXIST.
		The following opts will be filled by bpf_tc_attach on a
		successful attach operation if they are unset:
		* handle   - The handle of the attached filter
		* priority - The priority of the attached filter
		* prog_id  - The ID of the attached SCHED_CLS prog
		This way, the user can know what the auto allocated values
		for optional opts like handle and priority are for the newly
		attached filter, if they were unset.

		Note that some other attributes are set to fixed default
		values listed below (this holds for all bpf_tc_* APIs):
		protocol as ETH_P_ALL, direct action mode, chain index of 0,
		and class ID of 0 (this can be set by writing to the
		skb->tc_classid field from the BPF program).

bpf_tc_detach
Parameters:
	@hook - Cannot be NULL. Represents the hook the filter will be
		detached from. Requirements are same as described above
		in bpf_tc_attach.

	@opts - Cannot be NULL. The following opts must be set:
		* handle, priority
		The following opts must be unset:
		* prog_fd, prog_id, flags

bpf_tc_query
Parameters:
	@hook - Cannot be NULL. Represents the hook where the filter lookup will
		be performed. Requirements are same as described above in
		bpf_tc_attach().

	@opts - Cannot be NULL. The following opts must be set:
		* handle, priority
		The following opts must be unset:
		* prog_fd, prog_id, flags
		The following fields will be filled by bpf_tc_query upon a
		successful lookup:
		* prog_id

Some usage examples (using BPF skeleton infrastructure):

BPF program (test_tc_bpf.c):

	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>

	SEC("classifier")
	int cls(struct __sk_buff *skb)
	{
		return 0;
	}

Userspace loader:

	struct test_tc_bpf *skel = NULL;
	int fd, r;

	skel = test_tc_bpf__open_and_load();
	if (!skel)
		return -ENOMEM;

	fd = bpf_program__fd(skel->progs.cls);

	DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook, .ifindex =
			    if_nametoindex("lo"), .attach_point =
			    BPF_TC_INGRESS);
	/* Create clsact qdisc */
	r = bpf_tc_hook_create(&hook);
	if (r < 0)
		goto end;

	DECLARE_LIBBPF_OPTS(bpf_tc_opts, opts, .prog_fd = fd);
	r = bpf_tc_attach(&hook, &opts);
	if (r < 0)
		goto end;
	/* Print the auto allocated handle and priority */
	printf("Handle=%u", opts.handle);
	printf("Priority=%u", opts.priority);

	opts.prog_fd = opts.prog_id = 0;
	bpf_tc_detach(&hook, &opts);
end:
	test_tc_bpf__destroy(skel);

This is equivalent to doing the following using tc command line:
  # tc qdisc add dev lo clsact
  # tc filter add dev lo ingress bpf obj foo.o sec classifier da
  # tc filter del dev lo ingress handle <h> prio <p> bpf
... where the handle and priority can be found using:
  # tc filter show dev lo ingress

Another example replacing a filter (extending prior example):

	/* We can also choose both (or one), let's try replacing an
	 * existing filter.
	 */
	DECLARE_LIBBPF_OPTS(bpf_tc_opts, replace_opts, .handle =
			    opts.handle, .priority = opts.priority,
			    .prog_fd = fd);
	r = bpf_tc_attach(&hook, &replace_opts);
	if (r == -EEXIST) {
		/* Expected, now use BPF_TC_F_REPLACE to replace it */
		replace_opts.flags = BPF_TC_F_REPLACE;
		return bpf_tc_attach(&hook, &replace_opts);
	} else if (r < 0) {
		return r;
	}
	/* There must be no existing filter with these
	 * attributes, so cleanup and return an error.
	 */
	replace_opts.prog_fd = replace_opts.prog_id = 0;
	bpf_tc_detach(&hook, &replace_opts);
	return -1;

To obtain info of a particular filter:

	/* Find info for filter with handle 1 and priority 50 */
	DECLARE_LIBBPF_OPTS(bpf_tc_opts, info_opts, .handle = 1,
			    .priority = 50);
	r = bpf_tc_query(&hook, &info_opts);
	if (r == -ENOENT)
		printf("Filter not found");
	else if (r < 0)
		return r;
	printf("Prog ID: %u", info_opts.prog_id);
	return 0;

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> # libbpf API design
[ Daniel: also did major patch cleanup ]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210512103451.989420-3-memxor@gmail.com
2021-05-17 17:12:06 -07:00
Kumar Kartikeya Dwivedi
01515b8f05 libbpf: Add various netlink helpers
This change introduces a few helpers to wrap open coded attribute
preparation in netlink.c. It also adds a libbpf_netlink_send_recv() that
is useful to wrap send + recv handling in a generic way. Subsequent patch
will also use this function for sending and receiving a netlink response.
The libbpf_nl_get_link() helper has been removed instead, moving socket
creation into the newly named libbpf_netlink_send_recv().

Every nested attribute's closure must happen using the helper
nlattr_end_nested(), which sets its length properly. NLA_F_NESTED is
enforced using nlattr_begin_nested() helper. Other simple attributes
can be added directly.

The maxsz parameter corresponds to the size of the request structure
which is being filled in, so for instance with req being:

  struct {
	struct nlmsghdr nh;
	struct tcmsg t;
	char buf[4096];
  } req;

Then, maxsz should be sizeof(req).

This change also converts the open coded attribute preparation with these
helpers. Note that the only failure the internal call to nlattr_add()
could result in the nested helper would be -EMSGSIZE, hence that is what
we return to our caller.
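
As a rough, illustrative sketch (not part of the patch text; the exact
internal helper signatures live in netlink.c and are only assumed here from
the description above, and the TCA_* attribute names are just examples),
appending one nested attribute into req.buf could look like:

	struct nlattr *nest;

	/* open a nested attribute; NLA_F_NESTED is enforced by the helper */
	nest = nlattr_begin_nested(&req.nh, sizeof(req), TCA_OPTIONS);
	if (!nest)
		return -EMSGSIZE;
	/* simple attributes are added directly, bounded by maxsz */
	if (nlattr_add(&req.nh, sizeof(req), TCA_BPF_FD, &fd, sizeof(fd)))
		return -EMSGSIZE;
	/* close the nested attribute, fixing up its length */
	nlattr_end_nested(&req.nh, nest);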

The libbpf_netlink_send_recv() call takes care of opening the socket,
sending the netlink message, receiving the response, potentially invoking
callbacks, and return errors if any, and then finally close the socket.
This allows users to avoid identical socket setup code in different places.
The only user of libbpf_nl_get_link() has been converted to make use of it.
__bpf_set_link_xdp_fd_replace() has also been refactored to use it.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
[ Daniel: major patch cleanup ]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210512103451.989420-2-memxor@gmail.com
2021-05-17 17:12:06 -07:00
Andrii Nakryiko
6028cec50c libbpf: Reject static entry-point BPF programs
Detect use of static entry-point BPF programs (those with SEC() markings) and
emit error message. This is similar to
c1cccec9c636 ("libbpf: Reject static maps") but for BPF programs.
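
For illustration only (not from the patch itself), the rejected vs. accepted
shape of an entry-point program looks roughly like this:

	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>

	/* rejected: entry-point (SEC-annotated) BPF programs must not be static */
	SEC("xdp")
	static int drop_static(struct xdp_md *ctx)
	{
		return XDP_DROP;
	}

	/* accepted: global (non-static) entry point */
	SEC("xdp")
	int drop_global(struct xdp_md *ctx)
	{
		return XDP_DROP;
	}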

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210514195534.1440970-1-andrii@kernel.org
2021-05-17 17:12:06 -07:00
Andrii Nakryiko
68695d0173 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9d31d2338950293ec19d9b095fbaa9030899dcb4
Checkpoint bpf-next commit: c1cccec9c63637c4c5ee0aa2da2850d983c19e88
Baseline bpf commit:        9683e5775c75097c46bd24e65411b16ac6c6cbb3
Checkpoint bpf commit:      c9a7c013569d73881199a7a3011c03336f592cc8

Andrii Nakryiko (4):
  libbpf: Add per-file linker opts
  libbpf: Fix ELF symbol visibility update logic
  libbpf: Treat STV_INTERNAL same as STV_HIDDEN for functions
  libbpf: Reject static maps

Arnaldo Carvalho de Melo (1):
  libbpf: Provide GELF_ST_VISIBILITY() define for older libelf

 src/libbpf.c          | 35 +++++++++++++++++++++++++----------
 src/libbpf.h          | 10 +++++++++-
 src/libbpf_internal.h |  5 +++++
 src/linker.c          | 18 +++++++++++++-----
 4 files changed, 52 insertions(+), 16 deletions(-)

--
2.30.2
2021-05-17 14:36:22 -07:00
Arnaldo Carvalho de Melo
72cdd6ed42 libbpf: Provide GELF_ST_VISIBILITY() define for older libelf
Where that macro isn't available.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/YJaspEh0qZr4LYOc@kernel.org
2021-05-17 14:36:22 -07:00
Andrii Nakryiko
b5bfbab488 libbpf: Reject static maps
Static maps never really worked with libbpf, because all such maps were always
silently resolved to the very first map. Detect static maps (both legacy and
BTF-defined) and report user-friendly error.

Tested locally by switching few maps (legacy and BTF-defined) in selftests to
static ones and verifying that now libbpf rejects them loudly.
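
For illustration only (not from the patch itself), this is the kind of
definition that now gets rejected with a clear error instead of silently
resolving to the wrong map:

	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>

	/* rejected: maps (BTF-defined or legacy) must not be static */
	static struct {
		__uint(type, BPF_MAP_TYPE_ARRAY);
		__uint(max_entries, 1);
		__type(key, __u32);
		__type(value, __u64);
	} counters SEC(".maps");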

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210513233643.194711-2-andrii@kernel.org
2021-05-17 14:36:22 -07:00
Andrii Nakryiko
1cf1c245d1 libbpf: Treat STV_INTERNAL same as STV_HIDDEN for functions
Do the same global -> static BTF update for global functions with STV_INTERNAL
visibility to turn on static BPF verification mode.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210507054119.270888-7-andrii@kernel.org
2021-05-17 14:36:22 -07:00
Andrii Nakryiko
076dd5dadb libbpf: Fix ELF symbol visibility update logic
Fix silly bug in updating ELF symbol's visibility.

Fixes: a46349227cd8 ("libbpf: Add linker extern resolution support for functions and global variables")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210507054119.270888-6-andrii@kernel.org
2021-05-17 14:36:22 -07:00
Andrii Nakryiko
6f585ab88f libbpf: Add per-file linker opts
For better future extensibility add per-file linker options. Currently
the set of available options is empty. This changes bpf_linker__add_file()
API, but it's not a breaking change as bpf_linker APIs hasn't been released
yet.
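
As an illustrative sketch of the updated API (the file names are made up;
the per-file opts struct carries no options yet, so a zero-initialized opts
or NULL both work):

	DECLARE_LIBBPF_OPTS(bpf_linker_file_opts, file_opts);
	struct bpf_linker *linker;
	int err;

	linker = bpf_linker__new("combined.bpf.o", NULL);
	if (!linker)
		return -errno;

	err = bpf_linker__add_file(linker, "prog1.bpf.o", &file_opts);
	if (!err)
		err = bpf_linker__add_file(linker, "prog2.bpf.o", NULL);
	if (!err)
		err = bpf_linker__finalize(linker);
	bpf_linker__free(linker);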

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210507054119.270888-3-andrii@kernel.org
2021-05-17 14:36:22 -07:00
Andrii Nakryiko
c5389a965b sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   87bd9e602e39585c5280556a2b6a6363bb334257
Checkpoint bpf-next commit: 9d31d2338950293ec19d9b095fbaa9030899dcb4
Baseline bpf commit:        b02265429681c9c827c45978a61a9f00be5ea9aa
Checkpoint bpf commit:      9683e5775c75097c46bd24e65411b16ac6c6cbb3

Andrii Nakryiko (2):
  libbpf: Support BTF_KIND_FLOAT during type compatibility checks in
    CO-RE
  selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro

Brendan Jackman (1):
  libbpf: Fix signed overflow in ringbuf_process_ring

Ian Rogers (1):
  libbpf: Add NULL check to add_dummy_ksym_var

 src/bpf_core_read.h | 16 ++++++++++++----
 src/libbpf.c        |  9 +++++++--
 src/ringbuf.c       | 30 +++++++++++++++++++++---------
 3 files changed, 40 insertions(+), 15 deletions(-)

--
2.30.2
2021-05-05 16:39:05 -07:00
Ian Rogers
a58b8ca93e libbpf: Add NULL check to add_dummy_ksym_var
Avoids a segv if btf isn't present. Seen on the call path
__bpf_object__open calling bpf_object__collect_externs.

Fixes: 5bd022ec01f0 (libbpf: Support extern kernel function)
Suggested-by: Stanislav Fomichev <sdf@google.com>
Suggested-by: Petar Penkov <ppenkov@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210504234910.976501-1-irogers@google.com
2021-05-05 16:39:05 -07:00
Brendan Jackman
1691c37b39 libbpf: Fix signed overflow in ringbuf_process_ring
One of our benchmarks running in (Google-internal) CI pushes data
through the ringbuf faster than userspace is able to consume
it. In this case it seems we're actually able to get >INT_MAX entries
in a single ring_buffer__consume() call. ASAN detected that cnt
overflows in this case.

Fix by using 64-bit counter internally and then capping the result to
INT_MAX before converting to the int return type. Do the same for
the ring_buffer__poll().

Fixes: bf99c936f947 (libbpf: Add BPF ring buffer support)
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210429130510.1621665-1-jackmanb@google.com
2021-05-05 16:39:05 -07:00
Andrii Nakryiko
242842b34c selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro
Fix BPF_CORE_READ_BITFIELD() macro used for reading CO-RE-relocatable
bitfields. Missing breaks in a switch caused 8-byte reads always. This can
confuse libbpf because it does strict checks that memory load size corresponds
to the original size of the field, which in this case quite often would be
wrong.

After fixing that, we run into another problem, which is quite subtle, so worth
documenting here. The issue is in Clang optimization and CO-RE relocation
interactions. Without that asm volatile construct (also known as
barrier_var()), Clang will re-order BYTE_OFFSET and BYTE_SIZE relocations and
will apply BYTE_OFFSET 4 times for each switch case arm. This will result in
the same error from libbpf about mismatch of memory load size and original
field size. I.e., if we were reading u32, we'd still have *(u8 *), *(u16 *),
*(u32 *), and *(u64 *) memory loads, three of which will fail. Using
barrier_var() forces Clang to apply BYTE_OFFSET relocation first (and once) to
calculate p, after which value of p is used without relocation in each of
switch case arms, doing an appropriately-sized memory load.

Here's the list of relevant relocations and pieces of generated BPF code
before and after this patch for test_core_reloc_bitfields_direct selftests.

BEFORE
=====
 #45: core_reloc: insn #160 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32
 #46: core_reloc: insn #167 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #47: core_reloc: insn #174 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #48: core_reloc: insn #178 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #49: core_reloc: insn #182 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32

     157:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll
     159:       7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1
     160:       b7 02 00 00 04 00 00 00 r2 = 4
; BYTE_SIZE relocation here                 ^^^
     161:       66 02 07 00 03 00 00 00 if w2 s> 3 goto +7 <LBB0_63>
     162:       16 02 0d 00 01 00 00 00 if w2 == 1 goto +13 <LBB0_65>
     163:       16 02 01 00 02 00 00 00 if w2 == 2 goto +1 <LBB0_66>
     164:       05 00 12 00 00 00 00 00 goto +18 <LBB0_69>

0000000000000528 <LBB0_66>:
     165:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     167:       69 11 08 00 00 00 00 00 r1 = *(u16 *)(r1 + 8)
; BYTE_OFFSET relo here w/ WRONG size        ^^^^^^^^^^^^^^^^
     168:       05 00 0e 00 00 00 00 00 goto +14 <LBB0_69>

0000000000000548 <LBB0_63>:
     169:       16 02 0a 00 04 00 00 00 if w2 == 4 goto +10 <LBB0_67>
     170:       16 02 01 00 08 00 00 00 if w2 == 8 goto +1 <LBB0_68>
     171:       05 00 0b 00 00 00 00 00 goto +11 <LBB0_69>

0000000000000560 <LBB0_68>:
     172:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     174:       79 11 08 00 00 00 00 00 r1 = *(u64 *)(r1 + 8)
; BYTE_OFFSET relo here w/ WRONG size        ^^^^^^^^^^^^^^^^
     175:       05 00 07 00 00 00 00 00 goto +7 <LBB0_69>

0000000000000580 <LBB0_65>:
     176:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     178:       71 11 08 00 00 00 00 00 r1 = *(u8 *)(r1 + 8)
; BYTE_OFFSET relo here w/ WRONG size        ^^^^^^^^^^^^^^^^
     179:       05 00 03 00 00 00 00 00 goto +3 <LBB0_69>

00000000000005a0 <LBB0_67>:
     180:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     182:       61 11 08 00 00 00 00 00 r1 = *(u32 *)(r1 + 8)
; BYTE_OFFSET relo here w/ RIGHT size        ^^^^^^^^^^^^^^^^

00000000000005b8 <LBB0_69>:
     183:       67 01 00 00 20 00 00 00 r1 <<= 32
     184:       b7 02 00 00 00 00 00 00 r2 = 0
     185:       16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71>
     186:       c7 01 00 00 20 00 00 00 r1 s>>= 32
     187:       05 00 01 00 00 00 00 00 goto +1 <LBB0_72>

00000000000005e0 <LBB0_71>:
     188:       77 01 00 00 20 00 00 00 r1 >>= 32

AFTER
=====

 #30: core_reloc: insn #132 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #31: core_reloc: insn #134 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32

     129:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll
     131:       7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1
     132:       b7 01 00 00 08 00 00 00 r1 = 8
; BYTE_OFFSET relo here                     ^^^
; no size check for non-memory dereferencing instructions
     133:       0f 12 00 00 00 00 00 00 r2 += r1
     134:       b7 03 00 00 04 00 00 00 r3 = 4
; BYTE_SIZE relocation here                 ^^^
     135:       66 03 05 00 03 00 00 00 if w3 s> 3 goto +5 <LBB0_63>
     136:       16 03 09 00 01 00 00 00 if w3 == 1 goto +9 <LBB0_65>
     137:       16 03 01 00 02 00 00 00 if w3 == 2 goto +1 <LBB0_66>
     138:       05 00 0a 00 00 00 00 00 goto +10 <LBB0_69>

0000000000000458 <LBB0_66>:
     139:       69 21 00 00 00 00 00 00 r1 = *(u16 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^
     140:       05 00 08 00 00 00 00 00 goto +8 <LBB0_69>

0000000000000468 <LBB0_63>:
     141:       16 03 06 00 04 00 00 00 if w3 == 4 goto +6 <LBB0_67>
     142:       16 03 01 00 08 00 00 00 if w3 == 8 goto +1 <LBB0_68>
     143:       05 00 05 00 00 00 00 00 goto +5 <LBB0_69>

0000000000000480 <LBB0_68>:
     144:       79 21 00 00 00 00 00 00 r1 = *(u64 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^
     145:       05 00 03 00 00 00 00 00 goto +3 <LBB0_69>

0000000000000490 <LBB0_65>:
     146:       71 21 00 00 00 00 00 00 r1 = *(u8 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^
     147:       05 00 01 00 00 00 00 00 goto +1 <LBB0_69>

00000000000004a0 <LBB0_67>:
     148:       61 21 00 00 00 00 00 00 r1 = *(u32 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^

00000000000004a8 <LBB0_69>:
     149:       67 01 00 00 20 00 00 00 r1 <<= 32
     150:       b7 02 00 00 00 00 00 00 r2 = 0
     151:       16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71>
     152:       c7 01 00 00 20 00 00 00 r1 s>>= 32
     153:       05 00 01 00 00 00 00 00 goto +1 <LBB0_72>

00000000000004d0 <LBB0_71>:
     154:       77 01 00 00 20 00 00 00 r1 >>= 32

Fixes: ee26dade0e3b ("libbpf: Add support for relocatable bitfields")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Lorenz Bauer <lmb@cloudflare.com>
Link: https://lore.kernel.org/bpf/20210426192949.416837-4-andrii@kernel.org
2021-05-05 16:39:05 -07:00
Andrii Nakryiko
6b14cfa56e libbpf: Support BTF_KIND_FLOAT during type compatibility checks in CO-RE
Add BTF_KIND_FLOAT support when doing CO-RE field type compatibility check.
Without this, relocations against float/double fields will fail.

Also adjust one error message to emit instruction index instead of less
convenient instruction byte offset.

Fixes: 22541a9eeb0d ("libbpf: Add BTF_KIND_FLOAT support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Lorenz Bauer <lmb@cloudflare.com>
Link: https://lore.kernel.org/bpf/20210426192949.416837-3-andrii@kernel.org
2021-05-05 16:39:05 -07:00
Andrii Nakryiko
b8b1faa3d4 travis: fix libelf-dev build-dep issues by using aptitude instead of apt-get
Use aptitude to actually see what's wrong with the dependencies. And it
actually magically resolves whatever minor version conflicts there are.

The big surprise came from the apparent difference in build-dep command
behavior. Aptitude's build-dep doesn't seem to install the libelf-dev
package itself. Adding explicit `aptitude install libelf-dev` after build-dep
solves the issue for now.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-05-05 13:57:07 -07:00
Andrii Nakryiko
9e123fa5d2 vmtests: fix libc6 dependency and remove explicit libelf-dev install
Force libc6 dependency version.

Drop explicit libelf-dev install command, as it should be pre-installed by
Travis CI already, according to .travis.yaml.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-04-30 09:39:10 -07:00
Andrii Nakryiko
4ccc1f0b9f vmtest: blacklist 2 tests on 5.5
Blacklist linked_vars, which expects typed ksym support. Blacklist snprintf,
which expects the new bpf_snprintf() helper.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
b9278634aa sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   1e1032b0c4afaed7739a6681ff6b4cb120b82994
Checkpoint bpf-next commit: 87bd9e602e39585c5280556a2b6a6363bb334257
Baseline bpf commit:        a14d273ba15968495896a38b7b3399dba66d0270
Checkpoint bpf commit:      b02265429681c9c827c45978a61a9f00be5ea9aa

Alexei Starovoitov (1):
  libbpf: Remove unused field.

Andrii Nakryiko (11):
  libbpf: Add bpf_map__inner_map API
  libbpf: Suppress compiler warning when using SEC() macro with externs
  libbpf: Mark BPF subprogs with hidden visibility as static for BPF
    verifier
  libbpf: Allow gaps in BPF program sections to support overridden weak
    functions
  libbpf: Refactor BTF map definition parsing
  libbpf: Factor out symtab and relos sanity checks
  libbpf: Make few internal helpers available outside of libbpf.c
  libbpf: Extend sanity checking ELF symbols with externs validation
  libbpf: Tighten BTF type ID rewriting with error checking
  libbpf: Add linker extern resolution support for functions and global
    variables
  libbpf: Support extern resolution for BTF-defined maps in .maps
    section

Ciara Loftus (1):
  libbpf: Fix potential NULL pointer dereference

Daniel Borkmann (1):
  bpf: Sync bpf headers in tooling infrastructure

Florent Revest (3):
  bpf: Add a bpf_snprintf helper
  libbpf: Initialize the bpf_seq_printf parameters array field by field
  libbpf: Introduce a BPF_SNPRINTF helper macro

Pedro Tammela (1):
  libbpf: Clarify flags in ringbuf helpers

Toke Høiland-Jørgensen (1):
  bpf: Return target info when a tracing bpf_link is queried

 include/uapi/linux/bpf.h |   83 ++-
 src/bpf_helpers.h        |   19 +-
 src/bpf_tracing.h        |   58 +-
 src/btf.c                |    5 -
 src/libbpf.c             |  396 +++++++-----
 src/libbpf.h             |    1 +
 src/libbpf.map           |    1 +
 src/libbpf_internal.h    |   45 ++
 src/linker.c             | 1270 ++++++++++++++++++++++++++++++++------
 src/xsk.c                |    3 +
 10 files changed, 1512 insertions(+), 369 deletions(-)

--
2.30.2
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
af47e6c199 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
b2c06aec99 libbpf: Support extern resolution for BTF-defined maps in .maps section
Add extra logic to handle map externs (only BTF-defined maps are supported for
linking). Re-use the map parsing logic used during bpf_object__open(). Map
externs are currently restricted to always match complete map definition. So
all the specified attributes will be compared (down to pinning, map_flags,
numa_node, etc). In the future this restriction might be relaxed with no
backwards compatibility issues. If any attribute is mismatched between extern
and actual map definition, linker will report an error, pointing out which one
mismatches.

The original intent was to allow for extern to specify attributes that matters
(to user) to enforce. E.g., if you specify just key information and omit
value, then any value fits. Similarly, it should have been possible to enforce
map_flags, pinning, and any other possible map attribute. Unfortunately, that
means that multiple externs can be only partially overlapping with each other,
which means linker would need to combine their type definitions to end up with
the most restrictive and fullest map definition. This requires an extra amount
of BTF manipulation which at this time was deemed unnecessary and would
require further extending generic BTF writer APIs. So that is left for future
follow ups, if there is demand for that. But the idea seems interesting
and useful, so I want to document it here.

Weak definitions are also supported, but are pretty strict as well, just
like externs: all weak map definitions have to match exactly. In the follow up
patches this most probably will be relaxed, with __weak map definitions being
able to differ between each other (with non-weak definition always winning, of
course).
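
For illustration only (a sketch, not part of the patch), an extern map that
would be matched against its full definition at static-link time might look
like this:

	/* obj1.bpf.c: the actual definition */
	struct {
		__uint(type, BPF_MAP_TYPE_HASH);
		__uint(max_entries, 128);
		__type(key, __u32);
		__type(value, __u64);
	} shared_map SEC(".maps");

	/* obj2.bpf.c: extern declaration, must match the definition exactly */
	extern struct {
		__uint(type, BPF_MAP_TYPE_HASH);
		__uint(max_entries, 128);
		__type(key, __u32);
		__type(value, __u64);
	} shared_map SEC(".maps");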

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-13-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
29e4840915 libbpf: Add linker extern resolution support for functions and global variables
Add BPF static linker logic to resolve extern variables and functions across
multiple linked together BPF object files.

For that, linker maintains a separate list of struct glob_sym structures,
which keeps track of few pieces of metadata (is it extern or resolved global,
is it a weak symbol, which ELF section it belongs to, etc) and ties together
BTF type info and ELF symbol information and keeps them in sync.

With adding support for extern variables/funcs, it's now possible for some
sections to contain both extern and non-extern definitions. This means that
some sections may start out as ephemeral (if only externs are present and thus
there is not corresponding ELF section), but will be "upgraded" to actual ELF
section as symbols are resolved or new non-extern definitions are appended.

Additional care is taken to not duplicate extern entries in sections like
.kconfig and .ksyms.

Given libbpf requires BTF type to always be present for .kconfig/.ksym
externs, linker extends this requirement to all the externs, even those that
are supposed to be resolved during static linking and which won't be visible
to libbpf. With BTF information always present, static linker will check not
just ELF symbol matches, but entire BTF type signature match as well. That
logic is stricter than BPF CO-RE checks. It probably should be re-used by
.ksym resolution logic in libbpf as well, but that's left for follow up
patches.

To make it unnecessary to rewrite ELF symbols and minimize BTF type
rewriting/removal, ELF symbols that correspond to externs initially will be
updated in place once they are resolved. Similarly for BTF type info, VAR/FUNC
and var_secinfo's (sec_vars in struct bpf_linker) are staying stable, but
types they point to might get replaced when extern is resolved. This might
leave some left-over types (even though we try to minimize this for common
cases of having extern funcs with no argument names vs concrete functions with
names properly specified). That can be addressed later with a generic BTF
garbage collection. That's left for a follow up as well.

Given BTF type appending phase is separate from ELF symbol
appending/resolution, special struct glob_sym->underlying_btf_id variable is
used to communicate resolution and rewrite decisions. 0 means
underlying_btf_id needs to be appended (it's not yet in final linker->btf), <0
values are used for temporary storage of source BTF type ID (not yet
rewritten), so -glob_sym->underlying_btf_id is BTF type id in obj-btf. But by
the end of linker_append_btf() phase, that underlying_btf_id will be remapped
and will always be > 0. This is the ugliest part of the whole process, but
keeps the other parts much simpler due to stability of sec_var and VAR/FUNC
types, as well as ELF symbol, so please keep that in mind while reviewing.

BTF-defined maps require some extra custom logic, which is addressed separately
in the next patch to keep this one smaller and easier to review.
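
As a rough illustration of what this enables (names are made up, sketch only),
an extern variable and function in one object get resolved against concrete
definitions in another object when both are linked together:

	/* lib.bpf.c: concrete definitions */
	int shared_counter = 0;

	int bump(int delta)
	{
		shared_counter += delta;
		return shared_counter;
	}

	/* main.bpf.c: externs resolved by the static linker */
	extern int shared_counter;
	extern int bump(int delta);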

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-12-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
7078c5eae4 libbpf: Tighten BTF type ID rewriting with error checking
It should never fail, but if it does, it's better to know about this rather
than end up with nonsensical type IDs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-11-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
692ae888bc libbpf: Extend sanity checking ELF symbols with externs validation
Add logic to validate extern symbols, plus some other minor extra checks, like
ELF symbol #0 validation, general symbol visibility and binding validations.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-10-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
24b5d82967 libbpf: Make few internal helpers available outside of libbpf.c
Make skip_mods_and_typedefs(), btf_kind_str(), and btf_func_linkage() helpers
available outside of libbpf.c, to be used by static linker code.

Also do few cleanups (error code fixes, comment clean up, etc) that don't
deserve their own commit.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-9-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
4dcf439178 libbpf: Factor out symtab and relos sanity checks
Factor out logic for sanity checking SHT_SYMTAB and SHT_REL sections into
separate functions. They are already quite extensive and are suffering from too
deep indentation. Subsequent changes will extend SYMTAB sanity checking
further, so it's better to factor each into a separate function.

No functional changes are intended.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-8-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
d9b9d4a43a libbpf: Refactor BTF map definition parsing
Refactor BTF-defined maps parsing logic to allow it to be nicely reused by BPF
static linker. Further, at least for BPF static linker, it's important to know
which attributes of a BPF map were defined explicitly, so provide a bit set
for each known portion of BTF map definition. This allows BPF static linker to
do a simple check when dealing with extern map declarations.

The same capabilities allow distinguishing attributes explicitly set to zero
(e.g., __uint(max_entries, 0)) vs the case of not specifying it at all (no
max_entries attribute at all). Libbpf is currently not utilizing that, but it
could be useful for backwards compatibility reasons later.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-7-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
7b106ea4b1 libbpf: Allow gaps in BPF program sections to support overridden weak functions
Currently libbpf is very strict about parsing BPF program instruction
sections. No gaps are allowed between sequential BPF programs within a given
ELF section. Libbpf enforced that by keeping track of the next section offset
that should start a new BPF (sub)program and cross-checks that by searching
for a corresponding STT_FUNC ELF symbol.

But this is too restrictive once we allow weak BPF programs and link
together two or more BPF object files. In such case, some weak BPF programs
might be "overridden" by either non-weak BPF program with the same name and
signature, or even by another weak BPF program that just happened to be linked
first. That, in turn, leaves BPF instructions of the "lost" BPF (sub)program
intact, but there is no corresponding ELF symbol, because no one is going to
be referencing it.

Libbpf already correctly handles such cases in the sense that it won't append
such dead code to actual BPF programs loaded into kernel. So the only change
that needs to be done is to relax the logic of parsing BPF instruction
sections. Instead of assuming next BPF (sub)program section offset, iterate
available STT_FUNC ELF symbols to discover all available BPF subprograms and
programs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-6-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
3319982d34 libbpf: Mark BPF subprogs with hidden visibility as static for BPF verifier
Define __hidden helper macro in bpf_helpers.h, which is a short-hand for
__attribute__((visibility("hidden"))). Add libbpf support to mark BPF
subprograms marked with __hidden as static in BTF information to enforce BPF
verifier's static function validation algorithm, which takes more information
(caller's context) into account during a subprogram validation.
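
For illustration only (not from the patch), a subprogram marked __hidden keeps
hidden ELF visibility but is presented to the verifier as static:

	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>

	/* validated with the caller's context thanks to static linkage in BTF */
	__hidden int scale(int x)
	{
		return x * 2;
	}

	SEC("xdp")
	int prog(struct xdp_md *ctx)
	{
		return scale(1) == 2 ? XDP_PASS : XDP_DROP;
	}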

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-5-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
2e430712f5 libbpf: Suppress compiler warning when using SEC() macro with externs
When used on externs, the SEC() macro will trigger a compilation warning about
inapplicable `__attribute__((used))`. That's expected for extern declarations,
so suppress it with the corresponding _Pragma.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-4-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Florent Revest
120a21852b libbpf: Introduce a BPF_SNPRINTF helper macro
Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an
array of u64, making it more natural to call the bpf_snprintf helper.
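
Illustrative usage sketch (assuming the macro is available alongside
BPF_SEQ_PRINTF via bpf_tracing.h; the tracepoint name is made up):

	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>
	#include <bpf/bpf_tracing.h>

	SEC("tp/syscalls/sys_enter_write")
	int show(void *ctx)
	{
		char out[32];
		int pid = bpf_get_current_pid_tgid() >> 32;

		/* packs the variadic args into a u64 array and calls bpf_snprintf() */
		BPF_SNPRINTF(out, sizeof(out), "pid=%d", pid);
		return 0;
	}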

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-6-revest@chromium.org
2021-04-26 16:30:18 -07:00
Florent Revest
f4da689d90 libbpf: Initialize the bpf_seq_printf parameters array field by field
When initializing the __param array with a one liner, if all args are
const, the initial array value will be placed in the rodata section but
because libbpf does not support relocation in the rodata section, any
pointer in this array will stay NULL.

Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support")
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-5-revest@chromium.org
2021-04-26 16:30:18 -07:00
Florent Revest
30755d3a1c bpf: Add a bpf_snprintf helper
The implementation takes inspiration from the existing bpf_trace_printk
helper but there are a few differences:

To allow for a large number of format-specifiers, parameters are
provided in an array, like in bpf_seq_printf.

Because the output string takes two arguments and the array of
parameters also takes two arguments, the format string needs to fit in
one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to
a zero-terminated read-only map so we don't need a format string length
arg.

Because the format-string is known at verification time, we also do
a first pass of format string validation in the verifier logic. This
makes debugging easier.
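
A hedged sketch of calling the helper directly, i.e. what the BPF_SNPRINTF
macro above automates (the tracepoint name is made up):

	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>

	SEC("tp/syscalls/sys_enter_write")
	int fmt_example(void *ctx)
	{
		static const char fmt[] = "x=%d y=%d";	/* must live in read-only data */
		__u64 args[2] = { 1, 2 };		/* parameters as an array of u64 */
		char out[32];

		/* data_len is the size of the args array in bytes */
		bpf_snprintf(out, sizeof(out), fmt, args, sizeof(args));
		return 0;
	}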

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-4-revest@chromium.org
2021-04-26 16:30:18 -07:00
Alexei Starovoitov
dda0dd6a87 libbpf: Remove unused field.
relo->processed is set, but not used. Remove it.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210415141817.53136-1-alexei.starovoitov@gmail.com
2021-04-26 16:30:18 -07:00
Toke Høiland-Jørgensen
c21f91bd35 bpf: Return target info when a tracing bpf_link is queried
There is currently no way to discover the target of a tracing program
attachment after the fact. Add this information to bpf_link_info and return
it when querying the bpf_link fd.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210413091607.58945-1-toke@redhat.com
2021-04-26 16:30:18 -07:00
Pedro Tammela
552dec12dc libbpf: Clarify flags in ringbuf helpers
In 'bpf_ringbuf_reserve()' we require the flag to be '0' at the moment.

For 'bpf_ringbuf_{discard,submit,output}' a flag of '0' might send a
notification to the process if needed.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210412192434.944343-1-pctammela@mojatatu.com
2021-04-26 16:30:18 -07:00
Daniel Borkmann
678e8c8e49 bpf: Sync bpf headers in tooling infrastructure
Synchronize tools/include/uapi/linux/bpf.h which was missing changes
from various commits:

  - f3c45326ee71 ("bpf: Document PROG_TEST_RUN limitations")
  - e5e35e754c28 ("bpf: BPF-helper for MTU checking add length input")

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
f6de59dc3e libbpf: Add bpf_map__inner_map API
The API gives access to inner map for map in map types (array or
hash of map). It will be used to dynamically set max_entries in it.
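
Illustrative usage sketch (the map name and size are made up; obj is assumed
to be an opened but not yet loaded bpf_object):

	struct bpf_map *outer, *inner;

	outer = bpf_object__find_map_by_name(obj, "outer_map");
	inner = bpf_map__inner_map(outer);
	if (inner)
		bpf_map__set_max_entries(inner, 1024);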

Signed-off-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210408061310.95877-7-yauheni.kaliuta@redhat.com
2021-04-26 16:30:18 -07:00
Ciara Loftus
823648416c libbpf: Fix potential NULL pointer dereference
Wait until after the UMEM is checked for null to dereference it.

Fixes: 43f1bc1efff1 ("libbpf: Restore umem state after socket create failure")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210408052009.7844-1-ciara.loftus@intel.com
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
02dbcbea28 travis-ci: use default docker from Focal
Stop trying to update Docker version. Focal seems to have recent enough
Docker. This fixes Debian builds.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-04-26 08:30:13 -07:00
Ilya Leoshkevich
a7502f2707 vmtest: fix error reporting
S50-run-tests uses -e, which means that it immediately exits on test
failures without writing /exitcode. Fix by temporarily turning -e off.

Another issue is that $? in S50-run-tests is not quoted, which causes
the random value from the host to be taken (in practice always 0), so
fix that as well.

Finally, this fix has a positive side effect - QEMU no longer hangs
when tests fail. This is because rcS (generated by mkrootfs.sh) also
uses -e and immediately exits, if one of the scripts that it calls
fails, without calling S99-poweroff.

Example output after the fix:

Summary: 53/184 PASSED, 5 SKIPPED, 1 FAILED
+ exitstatus=1
+ set -e
+ echo 1
+ chmod 644 /exitstatus
+ for path in /etc/rcS.d/S*
+ '[' -x /etc/rcS.d/S99-poweroff ']'
+ /etc/rcS.d/S99-poweroff
travis_fold:start:shutdown
Shutdown

starting pid 232, tty '': '/sbin/swapoff -a'
starting pid 233, tty '': '/bin/umount -a -r'
[   45.909033] EXT4-fs (vda): re-mounted. Opts: (null)
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system poweroff
[   48.932007] ACPI: Preparing to enter system sleep state S5
[   48.932785] reboot: Power down

Tests exit status: 1
travis_fold:end:shutdown

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-04-06 21:57:01 -07:00
Andrii Nakryiko
bab780e6f9 travis-ci: update to Ubuntu Focal
Update to Ubuntu Focal 20.04.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-04-06 21:18:31 -07:00
Ilya Leoshkevich
915f3abe94 travis-ci: prohibit uninitialized variables in managers/
The scripts in this directory rely on certain environment variables, so
fail if they are not set in order to improve the debugging experience.
The vmtest/ scripts already do it.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-04-06 21:18:31 -07:00
Ilya Leoshkevich
0c248143d4 travis-ci: disable GCC's -Wstringop-truncation noisy error on Ubuntu
This is the same as commit 4d86cae4f0 ("ci: disable GCC's
-Wstringop-truncation noisy error"), but for Ubuntu. Without this,
there are false positives in bpf_object__new() on Ubuntu 20.04:
this function calls strncpy() with the correct bounds, but still
triggers the warning.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-04-06 21:18:31 -07:00
Ilya Leoshkevich
9f0e42b512 vmtest: blacklist stacktrace_build_id selftest
It requires v5.9+ kernel when the test code is built with a newer
toolchain. The support was added by commit b33164f2bd1c ("bpf:
Iterate through all PT_NOTE sections when looking for build id").

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-04-06 21:18:31 -07:00
Andrii Nakryiko
174d0b7b49 vmtest: blacklist kfunc_call selftests
kfunc_call requires 5.13+ kernel.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-04-05 11:30:05 -07:00
Andrii Nakryiko
95f83b8b0c sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   155f556d64b1a48710f01305e14bb860734ed1e3
Checkpoint bpf-next commit: 1e1032b0c4afaed7739a6681ff6b4cb120b82994
Baseline bpf commit:        002322402dafd846c424ffa9240a937f49b48c42
Checkpoint bpf commit:      a14d273ba15968495896a38b7b3399dba66d0270

Andrii Nakryiko (2):
  libbpf: Preserve empty DATASEC BTFs during static linking
  libbpf: Fix memory leak when emitting final btf_ext

Ciara Loftus (3):
  libbpf: Ensure umem pointer is non-NULL before dereferencing
  libbpf: Restore umem state after socket create failure
  libbpf: Only create rx and tx XDP rings when necessary

Cong Wang (1):
  sock_map: Introduce BPF_SK_SKB_VERDICT

Hengqi Chen (1):
  libbpf: Fix KERNEL_VERSION macro

Maciej Fijalkowski (1):
  libbpf: xsk: Use bpf_link

Martin KaFai Lau (6):
  bpf: Support bpf program calling kernel function
  libbpf: Refactor bpf_object__resolve_ksyms_btf_id
  libbpf: Refactor codes for finding btf id of a kernel symbol
  libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR
  libbpf: Record extern sym relocation first
  libbpf: Support extern kernel function

Pedro Tammela (1):
  libbpf: Fix bail out from 'ringbuf_process_ring()' on error

Yang Yingliang (1):
  libbpf: Remove redundant semi-colon

 include/uapi/linux/bpf.h |   5 +
 src/bpf_helpers.h        |   2 +-
 src/libbpf.c             | 389 +++++++++++++++++++++++++++++----------
 src/linker.c             |  39 +++-
 src/ringbuf.c            |   2 +-
 src/xsk.c                | 315 ++++++++++++++++++++++++-------
 6 files changed, 574 insertions(+), 178 deletions(-)

--
2.30.2
2021-04-05 11:30:05 -07:00
Ciara Loftus
416343d95c libbpf: Only create rx and tx XDP rings when necessary
Prior to this commit xsk_socket__create(_shared) always attempted to create
the rx and tx rings for the socket. However this causes an issue when the
socket being set up is the one that shares the fd with the UMEM. If a
previous call to this function failed with this socket after the rings were
set up, a subsequent call would always fail because the rings are not torn
down after the first call and when we try to set them up again we encounter
an error because they already exist. Solve this by remembering whether the
rings were set up by introducing new bools to struct xsk_umem which
represent the ring setup status and using them to determine whether or
not to set up the rings.

Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210331061218.1647-4-ciara.loftus@intel.com
2021-04-05 11:30:05 -07:00
Ciara Loftus
4f932c1ee9 libbpf: Restore umem state after socket create failure
If the call to xsk_socket__create fails, the user may want to retry the
socket creation using the same umem. Ensure that the umem is in the
same state on exit if the call fails by:
1. ensuring the umem _save pointers are unmodified.
2. not unmapping the set of umem rings that were set up with the umem
during xsk_umem__create, since those maps existed before the call to
xsk_socket__create and should remain intact even in the event of
failure.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210331061218.1647-3-ciara.loftus@intel.com
2021-04-05 11:30:05 -07:00
Ciara Loftus
37e838f959 libbpf: Ensure umem pointer is non-NULL before dereferencing
Calls to xsk_socket__create dereference the umem to access the
fill_save and comp_save pointers. Make sure the umem is non-NULL
before doing this.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20210331061218.1647-2-ciara.loftus@intel.com
2021-04-05 11:30:05 -07:00
Pedro Tammela
d98e968707 libbpf: Fix bail out from 'ringbuf_process_ring()' on error
The current code bails out with negative and positive returns.
If the callback returns a positive return code, 'ring_buffer__consume()'
and 'ring_buffer__poll()' will return a spurious number of records
consumed, but mostly important will continue the processing loop.

This patch makes positive returns from the callback a no-op.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325150115.138750-1-pctammela@mojatatu.com
2021-04-05 11:30:05 -07:00
Hengqi Chen
d4beac571a libbpf: Fix KERNEL_VERSION macro
Add missing ')' for KERNEL_VERSION macro.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210405040119.802188-1-hengqi.chen@gmail.com
2021-04-05 11:30:05 -07:00
Yang Yingliang
bea42d49f8 libbpf: Remove redundant semi-colon
Remove redundant semi-colon in finalize_btf_ext().

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210402012634.1965453-1-yangyingliang@huawei.com
2021-04-05 11:30:05 -07:00
Cong Wang
4e8d8d5cd2 sock_map: Introduce BPF_SK_SKB_VERDICT
Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is
confusing and more importantly we still want to distinguish them
from user-space. So we can just reuse the stream verdict code but
introduce a new type of eBPF program, skb_verdict. Users are not
allowed to attach stream_verdict and skb_verdict programs to the
same map.

Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210331023237.41094-10-xiyou.wangcong@gmail.com
2021-04-05 11:30:05 -07:00
Maciej Fijalkowski
8628610c32 libbpf: xsk: Use bpf_link
Currently, if there are multiple xdpsock instances running on a single
interface and in case one of the instances is terminated, the rest of
them are left in an inoperable state because the XDP prog was unloaded
from the interface.

Consider the scenario below:

// load xdp prog and xskmap and add entry to xskmap at idx 10
$ sudo ./xdpsock -i ens801f0 -t -q 10

// add entry to xskmap at idx 11
$ sudo ./xdpsock -i ens801f0 -t -q 11

terminate one of the processes and another one is unable to work due to
the fact that the XDP prog was unloaded from interface.

To address that, step away from setting bpf prog in favour of bpf_link.
This means that refcounting of BPF resources will be done automatically
by bpf_link itself.

Provide backward compatibility by checking if underlying system is
bpf_link capable. Do this by looking up/creating bpf_link on loopback
device. If it failed in any way, stick with netlink-based XDP prog.
Otherwise, use bpf_link-based logic.

When setting up BPF resources during xsk socket creation, check whether
bpf_link for a given ifindex already exists via set of calls to
bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
and comparing the ifindexes from bpf_link and xsk socket.

For case where resources exist but they are not AF_XDP related, bail out
and ask user to remove existing prog and then retry.

Lastly, do a bit of refactoring within __xsk_setup_xdp_prog and pull out
existing code branches based on prog_id value onto separate functions
that are responsible for resource initialization if prog_id was 0 and
for lookup existing resources for non-zero prog_id as that implies that
XDP program is present on the underlying net device. This in turn makes
it easier to follow, especially the teardown part of both branches.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210329224316.17793-7-maciej.fijalkowski@intel.com
2021-04-05 11:30:05 -07:00
Andrii Nakryiko
2e51adc9bc libbpf: Fix memory leak when emitting final btf_ext
Free temporary allocated memory used to construct finalized .BTF.ext data.
Found by Coverity static analysis on libbpf's Github repo.

Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210327042502.969745-1-andrii@kernel.org
2021-04-05 11:30:05 -07:00
Martin KaFai Lau
1ccb9d99d6 libbpf: Support extern kernel function
This patch is to make libbpf able to handle the following extern
kernel function declaration and do the needed relocations before
loading the bpf program to the kernel.

extern int foo(struct sock *) __attribute__((section(".ksyms")))

In the collect extern phase, needed changes are made to
bpf_object__collect_externs() and find_extern_btf_id() to collect
extern function in ".ksyms" section.  The func in the BTF datasec also
needs to be replaced by an int var.  The idea is similar to the existing
handling in extern var.  In case the BTF may not have a var, a dummy ksym
var is added at the beginning of bpf_object__collect_externs()
if there is func under ksyms datasec.  It will also change the
func linkage from extern to global which the kernel can support.
It also assigns a param name if it does not have one.

In the collect relo phase, it will record the kernel function
call as RELO_EXTERN_FUNC.

bpf_object__resolve_ksym_func_btf_id() is added to find the func
btf_id of the running kernel.

During actual relocation, it will patch the BPF_CALL instruction with
src_reg = BPF_PSEUDO_FUNC_CALL and insn->imm set to the running
kernel func's btf_id.

The required LLVM patch: https://reviews.llvm.org/D93563

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015234.1548923-1-kafai@fb.com
2021-04-05 11:30:05 -07:00
Martin KaFai Lau
90e052e6dd libbpf: Record extern sym relocation first
This patch records the extern sym relocs first before recording
subprog relocs.  The later patch will have relocs for extern
kernel function call which is also using BPF_JMP | BPF_CALL.
It will be easier to handle the extern symbols first in
the later patch.

is_call_insn() helper is added.  The existing is_ldimm64() helper
is renamed to is_ldimm64_insn() for consistency.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015227.1548623-1-kafai@fb.com
2021-04-05 11:30:05 -07:00
Martin KaFai Lau
7ef7ed2a5d libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR
This patch renames RELO_EXTERN to RELO_EXTERN_VAR.
It is to avoid the confusion with a later patch adding
RELO_EXTERN_FUNC.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015221.1547722-1-kafai@fb.com
2021-04-05 11:30:05 -07:00
Martin KaFai Lau
7036f3356e libbpf: Refactor codes for finding btf id of a kernel symbol
This patch refactors code, that finds kernel btf_id by kind
and symbol name, to a new function find_ksym_btf_id().

It also adds a new helper __btf_kind_str() to return
a string by the numeric kind value.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015214.1547069-1-kafai@fb.com
2021-04-05 11:30:05 -07:00
Martin KaFai Lau
e5d7cbe15a libbpf: Refactor bpf_object__resolve_ksyms_btf_id
This patch refactors most of the logic from
bpf_object__resolve_ksyms_btf_id() into a new function
bpf_object__resolve_ksym_var_btf_id().
It is to get ready for a later patch adding
bpf_object__resolve_ksym_func_btf_id() which resolves
a kernel function to the running kernel btf_id.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015207.1546749-1-kafai@fb.com
2021-04-05 11:30:05 -07:00
Martin KaFai Lau
e35afcb289 bpf: Support bpf program calling kernel function
This patch adds support to BPF verifier to allow bpf program calling
kernel function directly.

The use case included in this set is to allow bpf-tcp-cc to directly
call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()").  Those
functions have already been used by some kernel tcp-cc implementations.

This set will also allow the bpf-tcp-cc program to directly call the
kernel tcp-cc implementation. For example, a bpf_dctcp may only want to
implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly
from the kernel tcp_dctcp.c instead of reimplementing (or
copy-and-pasting) them.

The tcp-cc kernel functions mentioned above will be white listed
for the struct_ops bpf-tcp-cc programs to use in a later patch.
The white listed functions are not bound to a fixed ABI contract.
Those functions have already been used by the existing kernel tcp-cc.
If any of them has changed, both in-tree and out-of-tree kernel tcp-cc
implementations have to be changed.  The same goes for the struct_ops
bpf-tcp-cc programs which have to be adjusted accordingly.

This patch is to make the required changes in the bpf verifier.

First change is in btf.c, it adds a case in "btf_check_func_arg_match()".
When the passed in "btf->kernel_btf == true", it means matching the
verifier regs' states with a kernel function.  This will handle the
PTR_TO_BTF_ID reg.  It also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET,
and PTR_TO_TCP_SOCK to its kernel's btf_id.

In the later libbpf patch, the insn calling a kernel function will
look like:

insn->code == (BPF_JMP | BPF_CALL)
insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */
insn->imm == func_btf_id /* btf_id of the running kernel */

[ For the future calling function-in-kernel-module support, an array
  of module btf_fds can be passed at the load time and insn->off
  can be used to index into this array. ]

At the early stage of verifier, the verifier will collect all kernel
function calls into "struct bpf_kfunc_desc".  Those
descriptors are stored in "prog->aux->kfunc_tab" and will
be available to the JIT.  Since this "add" operation is similar
to the current "add_subprog()" and looking for the same insn->code,
they are done together in the new "add_subprog_and_kfunc()".

In the "do_check()" stage, the new "check_kfunc_call()" is added
to verify the kernel function call instruction:
1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE.
   A new bpf_verifier_ops "check_kfunc_call" is added to do that.
   The bpf-tcp-cc struct_ops program will implement this function in
   a later patch.
2. Call "btf_check_kfunc_args_match()" to ensure the regs can be
   used as the args of a kernel function.
3. Mark the regs' type, subreg_def, and zext_dst.

At the later do_misc_fixups() stage, the new fixup_kfunc_call()
will replace the insn->imm with the function address (relative
to __bpf_call_base).  If needed, the jit can find the btf_func_model
by calling the new bpf_jit_find_kfunc_model(prog, insn).
With the imm set to the function address, "bpftool prog dump xlated"
will be able to display the kernel function calls the same way as
it displays other bpf helper calls.

gpl_compatible program is required to call kernel function.

This feature currently requires JIT.

The verifier selftests are adjusted because of the changes in
the verbose log in add_subprog_and_kfunc().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015142.1544736-1-kafai@fb.com
2021-04-05 11:30:05 -07:00
Andrii Nakryiko
a18b72b920 libbpf: Preserve empty DATASEC BTFs during static linking
Ensure that BPF static linker preserves all DATASEC BTF types, even if some of
them might not have any variable information at all. This may happen if the
compiler promotes local initialized variable contents into .rodata section and
there are no global or static functions in the program.

For example,

  $ cat t.c
  struct t { char a; char b; char c; };
  void bar(struct t*);
  void find() {
     struct t tmp = {1, 2, 3};
     bar(&tmp);
  }

  $ clang -target bpf -O2 -g -S t.c
         .long   104                             # BTF_KIND_DATASEC(id = 8)
         .long   251658240                       # 0xf000000
         .long   0

         .ascii  ".rodata"                       # string offset=104

  $ clang -target bpf -O2 -g -c t.c
  $ readelf -S t.o | grep data
     [ 4] .rodata           PROGBITS         0000000000000000  00000090

Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210326043036.3081011-1-andrii@kernel.org
2021-04-05 11:30:05 -07:00
Adam Jensen
2bd682d23e README: Mention Alpine in list of packaging distros 2021-04-04 17:50:11 -07:00
Andrii Nakryiko
99bc176337 README: update links to more up-to-date CO-RE articles
Update links to point to blog posts that have some new updates and are
generally kept more up-to-date.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-28 13:36:49 -07:00
Andrii Nakryiko
ea5752c641 Makefile: add strset.o and linker.o to build
Fix libbpf build by linking strset.o and linker.o into libbpf.a.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-25 23:40:47 -07:00
Andrii Nakryiko
1d2d2d0034 vmtests: blacklist fexit_sleep on 5.5
Blacklist fexit_sleep selftest that relies on bpf_trampoline fix in 5.12.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
582b8fe21b sync: remove libbpf_util.h from the list of headers
libbpf_util.h was removed in 7e8bbe24cb8b ("libbpf: xsk: Move barriers from
libbpf_util.h to xsk.h") upstream, so remove it from the list of installable
headers.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
3ea10e46cb sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   22541a9eeb0d968c133aaebd95fa59da3208e705
Checkpoint bpf-next commit: 155f556d64b1a48710f01305e14bb860734ed1e3
Baseline bpf commit:        6185266c5a853bb0f2a459e3ff594546f277609b
Checkpoint bpf commit:      002322402dafd846c424ffa9240a937f49b48c42

Andrii Nakryiko (11):
  libbpf: Add explicit padding to bpf_xdp_set_link_opts
  libbpf: provide NULL and KERNEL_VERSION macros in bpf_helpers.h
  libbpf: Expose btf_type_by_id() internally
  libbpf: Generalize BTF and BTF.ext type ID and strings iteration
  libbpf: Rename internal memory-management helpers
  libbpf: Extract internal set-of-strings datastructure APIs
  libbpf: Add generic BTF type shallow copy API
  libbpf: Add BPF static linker APIs
  libbpf: Add BPF static linker BTF and BTF.ext support
  libbpf: Skip BTF fixup if object file has no BTF
  libbpf: Constify few bpf_program getters

Björn Töpel (3):
  libbpf, xsk: Add libbpf_smp_store_release libbpf_smp_load_acquire
  libbpf: xsk: Remove linux/compiler.h header
  libbpf: xsk: Move barriers from libbpf_util.h to xsk.h

Jean-Philippe Brucker (2):
  libbpf: Fix arm64 build
  libbpf: Fix BTF dump of pointer-to-array-of-struct

Joe Stringer (2):
  scripts/bpf: Abstract eBPF API target parameter
  tools: Sync uapi bpf.h header with latest changes

KP Singh (1):
  libbpf: Add explicit padding to btf_dump_emit_type_decl_opts

Kumar Kartikeya Dwivedi (1):
  libbpf: Use SOCK_CLOEXEC when opening the netlink socket

Lorenz Bauer (1):
  bpf: Add PROG_TEST_RUN support for sk_lookup programs

Maciej Fijalkowski (1):
  libbpf: Clear map_info before each bpf_obj_get_info_by_fd

Namhyung Kim (1):
  libbpf: Fix error path in bpf_object__elf_init()

Pedro Tammela (1):
  libbpf: Avoid inline hint definition from 'linux/stddef.h'

Rafael David Tinoco (1):
  libbpf: Add bpf object kern_version attribute setter

Xuesen Huang (1):
  bpf: Add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH

 include/uapi/linux/bpf.h |  724 +++++++++++++-
 src/bpf_helpers.h        |   21 +-
 src/btf.c                |  714 +++++++-------
 src/btf.h                |    3 +
 src/btf_dump.c           |   10 +-
 src/libbpf.c             |   32 +-
 src/libbpf.h             |   19 +-
 src/libbpf.map           |    6 +
 src/libbpf_internal.h    |   38 +-
 src/libbpf_util.h        |   47 -
 src/linker.c             | 1944 ++++++++++++++++++++++++++++++++++++++
 src/netlink.c            |    2 +-
 src/strset.c             |  176 ++++
 src/strset.h             |   21 +
 src/xsk.c                |    5 +-
 src/xsk.h                |   87 +-
 16 files changed, 3379 insertions(+), 470 deletions(-)
 delete mode 100644 src/libbpf_util.h
 create mode 100644 src/linker.c
 create mode 100644 src/strset.c
 create mode 100644 src/strset.h

--
2.30.2
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
3118d38a2e sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-03-25 23:31:23 -07:00
Rafael David Tinoco
b09a4999d9 libbpf: Add bpf object kern_version attribute setter
Unfortunately some distros don't have their kernel version defined
accurately in <linux/version.h> due to different long term support
reasons.

It is important to have a way to override the bpf kern_version
attribute during runtime: some old kernels might still check for
kern_version attribute during bpf_prog_load().
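
A minimal user-space sketch, assuming the setter added here is
bpf_object__set_kversion() (the object path and version value are just
examples):

  #include <errno.h>
  #include <linux/version.h>
  #include <bpf/libbpf.h>

  int load_with_kver(const char *path)
  {
          struct bpf_object *obj;

          obj = bpf_object__open_file(path, NULL);
          if (libbpf_get_error(obj))
                  return -EINVAL;

          /* override kern_version before load; some old kernels still validate
           * this attribute in bpf_prog_load() */
          bpf_object__set_kversion(obj, KERNEL_VERSION(5, 4, 0));

          return bpf_object__load(obj);
  }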

Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210323040952.2118241-1-rafaeldtinoco@ubuntu.com
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
53f0e7d8ec libbpf: Constify few bpf_program getters
bpf_program__get_type() and bpf_program__get_expected_attach_type() shouldn't
modify given bpf_program, so mark input parameter as const struct bpf_program.
This eliminates unnecessary compilation warnings or explicit casts in user
programs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210324172941.2609884-1-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
d4d3a88b5a libbpf: Skip BTF fixup if object file has no BTF
Skip BTF fixup step when input object file is missing BTF altogether.

Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/bpf/20210319205909.1748642-3-andrii@kernel.org
2021-03-25 23:31:23 -07:00
KP Singh
7b1f3e310b libbpf: Add explicit padding to btf_dump_emit_type_decl_opts
Similar to
https://lore.kernel.org/bpf/20210313210920.1959628-2-andrii@kernel.org/

When DECLARE_LIBBPF_OPTS is used with inline field initialization, e.g:

  DECLARE_LIBBPF_OPTS(btf_dump_emit_type_decl_opts, opts,
    .field_name = var_ident,
    .indent_level = 2,
    .strip_mods = strip_mods,
  );

and compiled in debug mode, the compiler generates code which
leaves the padding uninitialized and triggers errors within libbpf APIs
which require strict zero initialization of OPTS structs.

Adding anonymous padding field fixes the issue.

Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210319192117.2310658-1-kpsingh@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
b156979d19 libbpf: Add BPF static linker BTF and BTF.ext support
Add .BTF and .BTF.ext static linking logic.

When multiple BPF object files are linked together, their respective .BTF and
.BTF.ext sections are merged together. BTF types are not just concatenated,
but also deduplicated. .BTF.ext data is grouped by type (func info, line info,
core_relos) and target section names, and then all the records are
concatenated together, preserving their relative order. All the BTF type ID
references and string offsets are updated as necessary, to take into account
possibly deduplicated strings and types.

BTF DATASEC types are handled specially. Their respective var_secinfos are
accumulated separately in special per-section data and then final DATASEC
types are emitted at the very end during bpf_linker__finalize() operation,
just before emitting final ELF output file.

BTF data can also provide "section annotations" for some extern variables.
Such a concept is missing in ELF, but BTF will have DATASEC types for such
special extern datasections (e.g., .kconfig, .ksyms). Such sections are called
"ephemeral" internally. Internally, the linker keeps metadata for each such
section, collecting variable information, but those sections won't be emitted
into the final ELF file.

Also, given LLVM/Clang during compilation emits BTF DATASECS that are
incomplete, missing section size and variable offsets for static variables,
BPF static linker will initially fix up such DATASECs, using ELF symbols data.
The final DATASECs will preserve section sizes and all variable offsets. This
is handled correctly by libbpf already, so won't cause any new issues. On the
other hand, it's actually a nice property to have a complete BTF data without
runtime adjustments done during bpf_object__open() by libbpf. In that sense,
BPF static linker is also a BTF normalizer.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-8-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
bd81770e10 libbpf: Add BPF static linker APIs
Introduce BPF static linker APIs to libbpf. The BPF static linker allows
performing static linking of multiple BPF object files into a single combined
resulting object file, preserving all the BPF programs, maps, global
variables, etc.

Data sections (.bss, .data, .rodata, .maps, maps, etc) with the same name are
concatenated together. Similarly, code sections are also concatenated. All the
symbols and ELF relocations are also concatenated in their respective ELF
sections and are adjusted accordingly to the new object file layout.

Static variables and functions are handled correctly as well, adjusting BPF
instructions offsets to reflect new variable/function offset within the
combined ELF section. Such relocations are referencing STT_SECTION symbols and
that stays intact.

Data sections in different files can have different alignment requirements, so
that is taken care of as well, adjusting sizes and offsets as necessary to
satisfy both old and new alignment requirements.

DWARF data sections are stripped out, currently, as is the LLVM_ADDRSIG
section, which is ignored by libbpf in bpf_object__open() anyways. So, in
a way, BPF static linker is an analogue to `llvm-strip -g`, which is a pretty
nice property, especially if resulting .o file is then used to generate BPF
skeleton.

Original string sections are ignored and instead we construct our own set of
unique strings using libbpf-internal `struct strset` API.

To reduce the size of the patch, all the .BTF and .BTF.ext processing was
moved into a separate patch.

The high-level API consists of just 4 functions:
  - bpf_linker__new() creates an instance of BPF static linker. It accepts
    output filename and (currently empty) options struct;
  - bpf_linker__add_file() takes input filename and appends it to the already
    processed ELF data; it can be called multiple times, one for each BPF
    ELF object file that needs to be linked in;
  - bpf_linker__finalize() needs to be called to dump final ELF contents into
    the output file, specified when bpf_linker was created; after
    bpf_linker__finalize() is called, no more bpf_linker__add_file() and
    bpf_linker__finalize() calls are allowed, they will return error;
  - regardless of whether bpf_linker__finalize() was called or not,
    bpf_linker__free() will free up all the used resources.

Currently, BPF static linker doesn't resolve cross-object file references
(extern variables and/or functions). This will be added in the follow up patch
set.
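
For illustration, the intended usage looks roughly like the following sketch
(signatures follow the description above; they may differ slightly between
libbpf versions):

  #include <bpf/libbpf.h>

  int link_bpf_objects(void)
  {
          struct bpf_linker *linker;
          int err;

          /* output file; the opts struct is currently empty, so NULL is fine */
          linker = bpf_linker__new("combined.bpf.o", NULL);
          if (libbpf_get_error(linker))
                  return -1;

          err = bpf_linker__add_file(linker, "prog1.bpf.o");
          if (!err)
                  err = bpf_linker__add_file(linker, "prog2.bpf.o");
          /* emit the final ELF; no add_file()/finalize() calls are allowed afterwards */
          if (!err)
                  err = bpf_linker__finalize(linker);

          bpf_linker__free(linker);
          return err;
  }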

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-7-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
4fdc36418d libbpf: Add generic BTF type shallow copy API
Add btf__add_type() API that performs shallow copy of a given BTF type from
the source BTF into the destination BTF. All the information and type IDs are
preserved, but all the strings encountered are added into the destination BTF
and corresponding offsets are rewritten. BTF type IDs are assumed to be
correct or such that will be (somehow) modified afterwards.
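
For example, copying a single type from one BTF object into another might look
like this sketch (ID remapping is left to the caller, as noted above):

  #include <errno.h>
  #include <bpf/btf.h>

  /* shallow-copy type 'src_id' from 'src' into 'dst';
   * returns the new type ID in 'dst' or a negative error */
  int copy_one_type(struct btf *dst, const struct btf *src, __u32 src_id)
  {
          const struct btf_type *t = btf__type_by_id(src, src_id);

          if (!t)
                  return -EINVAL;
          /* strings are added/deduplicated into 'dst'; referenced type IDs are
           * copied as-is and must be remapped by the caller afterwards */
          return btf__add_type(dst, src, t);
  }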

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-6-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
861ad35ceb libbpf: Extract internal set-of-strings datastructure APIs
Extract BTF logic for maintaining a set of strings data structure, used for
BTF strings section construction in writable mode, into separate re-usable
API. This data structure is going to be used by bpf_linker to maintains ELF
STRTAB section, which has the same layout as BTF strings section.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-5-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
7fc514acf1 libbpf: Rename internal memory-management helpers
Rename btf_add_mem() and btf_ensure_mem() helpers that abstract away details
of dynamically resizable memory to use libbpf_ prefix, as they are not
BTF-specific. No functional changes.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-4-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
74e94c40fe libbpf: Generalize BTF and BTF.ext type ID and strings iteration
Extract and generalize the logic to iterate BTF type ID and string offset
fields within BTF types and .BTF.ext data. Expose this internally in libbpf
for re-use by bpf_linker.

Additionally, complete strings deduplication handling for BTF.ext (e.g., CO-RE
access strings), which was previously missing. There previously was no
case of deduplicating .BTF.ext data, but bpf_linker is going to use it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-3-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
082a5c6020 libbpf: Expose btf_type_by_id() internally
btf_type_by_id() is internal-only convenience API returning non-const pointer
to struct btf_type. Expose it outside of btf.c for re-use.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-2-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
75a2e3bda8 libbpf: provide NULL and KERNEL_VERSION macros in bpf_helpers.h
Given that vmlinux.h is not compatible with headers like stddef.h, NULL poses
an annoying problem: it is defined as #define, so is not captured in BTF, so
is not emitted into vmlinux.h. This leads to users either sticking to explicit
0, or defining their own NULL (as progs/skb_pkt_end.c does).

But it's easy for bpf_helpers.h to provide (conditionally) NULL definition.
Similarly, KERNEL_VERSION is another commonly missed macro that came up
multiple times. So this patch adds both of them, along with offsetof(), that
also is typically defined in stddef.h, just like NULL.

This might cause compilation warning for existing BPF applications defining
their own NULL and/or KERNEL_VERSION already:

  progs/skb_pkt_end.c:7:9: warning: 'NULL' macro redefined [-Wmacro-redefined]
  #define NULL 0
          ^
  /tmp/linux/tools/testing/selftests/bpf/tools/include/vmlinux.h:4:9: note: previous definition is here
  #define NULL ((void *)0)
	  ^

It is trivial to fix, though, so the long-term benefits outweigh the temporary
inconveniences.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20210317200510.1354627-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
c8bfeae778 libbpf: Add explicit padding to bpf_xdp_set_link_opts
Adding such anonymous padding fixes the issue with uninitialized portions of
bpf_xdp_set_link_opts when using LIBBPF_DECLARE_OPTS macro with inline field
initialization:

DECLARE_LIBBPF_OPTS(bpf_xdp_set_link_opts, opts, .old_fd = -1);

When such code is compiled in debug mode, the compiler generates code that
leaves the padding bytes uninitialized, which triggers errors inside libbpf APIs
that do strict zero-initialization checks for OPTS structs.

Adding anonymous padding field fixes the issue.

Fixes: bd5ca3ef93cd ("libbpf: Add function to set link XDP fd while specifying old program")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210313210920.1959628-2-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Pedro Tammela
a0ad81d9c4 libbpf: Avoid inline hint definition from 'linux/stddef.h'
Linux headers might pull 'linux/stddef.h' which defines
'__always_inline' as the following:

   #ifndef __always_inline
   #define __always_inline inline
   #endif

This becomes an issue if the program picks up the 'linux/stddef.h'
definition as the macro now just hints inline to clang.

This change now enforces the proper definition for BPF programs
regardless of the include order.

Signed-off-by: Pedro Tammela <pctammela@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210314173839.457768-1-pctammela@gmail.com
2021-03-25 23:31:23 -07:00
Björn Töpel
b727e2deca libbpf: xsk: Move barriers from libbpf_util.h to xsk.h
The only user of libbpf_util.h is xsk.h. Move the barriers to xsk.h,
and remove libbpf_util.h. The barriers are used as an implementation
detail, and should not be considered part of the stable API.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210310080929.641212-3-bjorn.topel@gmail.com
2021-03-25 23:31:23 -07:00
Björn Töpel
27db7104d5 libbpf: xsk: Remove linux/compiler.h header
In commit 291471dd1559 ("libbpf, xsk: Add libbpf_smp_store_release
libbpf_smp_load_acquire") linux/compiler.h was added as a dependency
to xsk.h, which is the user-facing API. This makes it harder for
userspace applications to consume the library. Here the header
inclusion is removed, and instead {READ,WRITE}_ONCE() is added
explicitly.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210310080929.641212-2-bjorn.topel@gmail.com
2021-03-25 23:31:23 -07:00
Jean-Philippe Brucker
c14f7e5dcf libbpf: Fix BTF dump of pointer-to-array-of-struct
The vmlinux.h generated from BTF is invalid when building
drivers/phy/ti/phy-gmii-sel.c with clang:

vmlinux.h:61702:27: error: array type has incomplete element type ‘struct reg_field’
61702 |  const struct reg_field (*regfields)[3];
      |                           ^~~~~~~~~

bpftool generates a forward declaration for this struct regfield, which
compilers aren't happy about. Here's a simplified reproducer:

	struct inner {
		int val;
	};
	struct outer {
		struct inner (*ptr_to_array)[2];
	} A;

After build with clang -> bpftool btf dump c -> clang/gcc:
./def-clang.h:11:23: error: array has incomplete element type 'struct inner'
        struct inner (*ptr_to_array)[2];

Member ptr_to_array of struct outer is a pointer to an array of struct
inner. In the DWARF generated by clang, struct outer appears before
struct inner, so when converting BTF of struct outer into C, bpftool
issues a forward declaration to struct inner. With GCC the DWARF info is
reversed so struct inner gets fully defined.

That forward declaration is not sufficient when compilers handle an
array of the struct, even when it's only used through a pointer. Note
that we can trigger the same issue with an intermediate typedef:

	struct inner {
	        int val;
	};
	typedef struct inner inner2_t[2];
	struct outer {
	        inner2_t *ptr_to_array;
	} A;

Becomes:

	struct inner;
	typedef struct inner inner2_t[2];

And causes:

./def-clang.h:10:30: error: array has incomplete element type 'struct inner'
	typedef struct inner inner2_t[2];

To fix this, clear through_ptr whenever we encounter an intermediate
array, to make the inner struct part of a strong link and force full
declaration.

Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210319112554.794552-2-jean-philippe@linaro.org
2021-03-25 23:31:23 -07:00
Kumar Kartikeya Dwivedi
bbc65156d7 libbpf: Use SOCK_CLOEXEC when opening the netlink socket
Otherwise, there exists a small window between the opening and closing
of the socket fd where it may leak into processes launched by some other
thread.
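
The change boils down to requesting the flag at socket creation time, roughly:

  #include <sys/socket.h>
  #include <linux/netlink.h>

  static int open_nl_sock(void)
  {
          /* SOCK_CLOEXEC marks the fd close-on-exec atomically at creation,
           * so it cannot leak into children spawned by other threads */
          return socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);
  }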

Fixes: 949abbe88436 ("libbpf: add function to setup XDP")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210317115857.6536-1-memxor@gmail.com
2021-03-25 23:31:23 -07:00
Namhyung Kim
c903b3ab70 libbpf: Fix error path in bpf_object__elf_init()
When it failed to get section names, it should call into
bpf_object__elf_finish() like others.

Fixes: 88a82120282b ("libbpf: Factor out common ELF operations and improve logging")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210317145414.884817-1-namhyung@kernel.org
2021-03-25 23:31:23 -07:00
Jean-Philippe Brucker
186ffbe0b5 libbpf: Fix arm64 build
The macro for libbpf_smp_store_release() doesn't build on arm64, fix it.

Fixes: 291471dd1559 ("libbpf, xsk: Add libbpf_smp_store_release libbpf_smp_load_acquire")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210308182521.155536-1-jean-philippe@linaro.org
2021-03-25 23:31:23 -07:00
Björn Töpel
1d483b45fc libbpf, xsk: Add libbpf_smp_store_release libbpf_smp_load_acquire
Now that the AF_XDP rings have load-acquire/store-release semantics,
move libbpf to that as well.

The library-internal libbpf_smp_{load_acquire,store_release} are only
valid for 32-bit words on ARM64.

Also, remove the barriers that are no longer in use.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210305094113.413544-3-bjorn.topel@gmail.com
2021-03-25 23:31:23 -07:00
Xuesen Huang
d64f8d3207 bpf: Add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH
bpf_skb_adjust_room sets the inner_protocol as skb->protocol for packet
encapsulation. But that is not appropriate when pushing an Ethernet header.

Add an option to further specify encap L2 type and set the inner_protocol
as ETH_P_TEB.
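
A rough TC-BPF sketch of combining the new flag with the existing L2 encap
flags (sizes and the exact flag combination are illustrative, not taken from
this patch):

  #include <linux/bpf.h>
  #include <linux/if_ether.h>
  #include <linux/ip.h>
  #include <linux/pkt_cls.h>
  #include <bpf/bpf_helpers.h>

  SEC("tc")
  int push_eth_encap(struct __sk_buff *skb)
  {
          /* make room for an outer IPv4 + GRE + inner Ethernet header */
          __s32 len_diff = sizeof(struct iphdr) + 4 /* GRE base header */ + ETH_HLEN;
          __u64 flags = BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 |
                        BPF_F_ADJ_ROOM_ENCAP_L4_GRE |
                        BPF_F_ADJ_ROOM_ENCAP_L2(ETH_HLEN) |
                        BPF_F_ADJ_ROOM_ENCAP_L2_ETH; /* pushed L2 header is Ethernet */

          if (bpf_skb_adjust_room(skb, len_diff, BPF_ADJ_ROOM_MAC, flags))
                  return TC_ACT_SHOT;
          return TC_ACT_OK;
  }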

Suggested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Xuesen Huang <huangxuesen@kuaishou.com>
Signed-off-by: Zhiyong Cheng <chengzhiyong@kuaishou.com>
Signed-off-by: Li Wang <wangli09@kuaishou.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/bpf/20210304064046.6232-1-hxseverything@gmail.com
2021-03-25 23:31:23 -07:00
Lorenz Bauer
4f2e1ecbd9 bpf: Add PROG_TEST_RUN support for sk_lookup programs
Allow passing sk_lookup programs to PROG_TEST_RUN. User space
provides the full bpf_sk_lookup struct as context. Since the
context includes a socket pointer that can't be exposed
to user space we define that PROG_TEST_RUN returns the cookie
of the selected socket or zero in place of the socket pointer.

We don't support testing programs that select a reuseport socket,
since this would mean running another (unrelated) BPF program
from the sk_lookup test handler.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210303101816.36774-3-lmb@cloudflare.com
2021-03-25 23:31:23 -07:00
Joe Stringer
21f523f235 tools: Sync uapi bpf.h header with latest changes
Synchronize the header after all of the recent changes.

Signed-off-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210302171947.2268128-16-joe@cilium.io
2021-03-25 23:31:23 -07:00
Joe Stringer
18c0f03e2d scripts/bpf: Abstract eBPF API target parameter
Abstract out the target parameter so that upcoming commits, more than
just the existing "helpers" target can be called to generate specific
portions of docs from the eBPF UAPI headers.

Signed-off-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210302171947.2268128-10-joe@cilium.io
2021-03-25 23:31:23 -07:00
Maciej Fijalkowski
fade1c32e6 libbpf: Clear map_info before each bpf_obj_get_info_by_fd
xsk_lookup_bpf_maps(), based on prog_fd, checks whether the current prog has a
reference to an XSKMAP. A BPF prog can include insns that work on various BPF
maps, and this is covered by iterating through map_ids.

The bpf_map_info that is passed to bpf_obj_get_info_by_fd for filling
needs to be cleared at each iteration, so that it doesn't contain any
outdated fields and that is currently missing in the function of
interest.

To fix that, zero-init map_info via memset before each
bpf_obj_get_info_by_fd call.

Also, since this area of the code is being touched, and strcmp is generally
considered harmful, let's convert it to strncmp and provide the
size of the name array for the current map_info.

While at it, do s/continue/break/ once we have found the xsks_map to
terminate the search.
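
The fixed per-map check then looks roughly like this sketch (not verbatim
libbpf code):

  #include <stdbool.h>
  #include <string.h>
  #include <bpf/bpf.h>

  /* check whether the map behind 'fd' is the xsks_map */
  static bool is_xsks_map(int fd)
  {
          struct bpf_map_info map_info;
          __u32 map_len = sizeof(map_info);

          /* clear stale fields left over from a previous iteration */
          memset(&map_info, 0, map_len);
          if (bpf_obj_get_info_by_fd(fd, &map_info, &map_len))
                  return false;

          /* bounded comparison instead of plain strcmp */
          return strncmp(map_info.name, "xsks_map", sizeof(map_info.name)) == 0;
  }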

Fixes: 5750902a6e9b ("libbpf: proper XSKMAP cleanup")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20210303185636.18070-4-maciej.fijalkowski@intel.com
2021-03-25 23:31:23 -07:00
Ilya Leoshkevich
8c2c7e5bcf sync: use bpf_doc.py
In the latest bpf-next bpf_helpers_doc.py has been renamed to
bpf_doc.py.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
092a606856 Makefile: fix install flags order
Having -m flag between source and destination breaks install on MacOS, as
reported in [0]. Fix it by moving the flag to the front.

  [0] https://github.com/openwrt/openwrt/pull/3959

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-05 17:48:04 -08:00
Ilya Leoshkevich
986962fade sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   303dcc25b5c782547eb13b9f29426de843dd6f34
Checkpoint bpf-next commit: 22541a9eeb0d968c133aaebd95fa59da3208e705
Baseline bpf commit:        6185266c5a853bb0f2a459e3ff594546f277609b
Checkpoint bpf commit:      22541a9eeb0d968c133aaebd95fa59da3208e705

Ilya Leoshkevich (3):
  bpf: Add BTF_KIND_FLOAT to uapi
  libbpf: Fix whitespace in btf_add_composite() comment
  libbpf: Add BTF_KIND_FLOAT support

 include/uapi/linux/btf.h |  5 ++--
 src/btf.c                | 51 +++++++++++++++++++++++++++++++++++++++-
 src/btf.h                |  6 +++++
 src/btf_dump.c           |  4 ++++
 src/libbpf.c             | 29 ++++++++++++++++++++++-
 src/libbpf.map           |  5 ++++
 src/libbpf_internal.h    |  2 ++
 7 files changed, 98 insertions(+), 4 deletions(-)

--
2.25.1
2021-03-05 14:15:26 -08:00
Ilya Leoshkevich
471e7c241d libbpf: Add BTF_KIND_FLOAT support
The logic follows that of BTF_KIND_INT most of the time. Sanitization
replaces BTF_KIND_FLOATs with equally-sized empty BTF_KIND_STRUCTs on
older kernels, for example, the following:

    [4] FLOAT 'float' size=4

becomes the following:

    [4] STRUCT '(anon)' size=4 vlen=0

With dwarves patch [1] and this patch, the older kernels, which were
failing with the floating-point-related errors, will now start working
correctly.

[1] https://github.com/iii-i/dwarves/commit/btf-kind-float-v2

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210226202256.116518-4-iii@linux.ibm.com
2021-03-05 14:15:26 -08:00
Ilya Leoshkevich
617f781804 libbpf: Fix whitespace in btf_add_composite() comment
Remove trailing space.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210226202256.116518-3-iii@linux.ibm.com
2021-03-05 14:15:26 -08:00
Ilya Leoshkevich
473899d4f7 bpf: Add BTF_KIND_FLOAT to uapi
Add a new kind value and expand the kind bitfield.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210226202256.116518-2-iii@linux.ibm.com
2021-03-05 14:15:26 -08:00
Andrii Nakryiko
7e03685b8d vmtests: blacklist new tests on 5.5 kernel
Extend 5.5 blacklist.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-03 08:35:13 -08:00
Andrii Nakryiko
7065a809fc sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   86ce322d21eb032ed8fdd294d0fb095d2debb430
Checkpoint bpf-next commit: 303dcc25b5c782547eb13b9f29426de843dd6f34
Baseline bpf commit:        78031381ae9c88f4f914d66154f4745122149c58
Checkpoint bpf commit:      6185266c5a853bb0f2a459e3ff594546f277609b

Alexei Starovoitov (1):
  bpf: Count the number of times recursion was prevented

Florent Revest (2):
  bpf: Be less specific about socket cookies guarantees
  bpf: Expose bpf_get_socket_cookie to tracing programs

Hangbin Liu (1):
  bpf: Remove blank line in bpf helper description comment

Jesper Dangaard Brouer (2):
  bpf: bpf_fib_lookup return MTU value as output when looked up
  bpf: Add BPF-helper for MTU checking

Jonas Bonn (1):
  Revert "GTP: add support for flow based tunneling API"

Martin KaFai Lau (1):
  libbpf: Ignore non function pointer member in struct_ops

Stanislav Fomichev (1):
  libbpf: Use AF_LOCAL instead of AF_INET in xsk.c

Yonghong Song (3):
  bpf: Add bpf_for_each_map_elem() helper
  libbpf: Move function is_ldimm64() earlier in libbpf.c
  libbpf: Support subprog address relocation

 include/uapi/linux/bpf.h     | 140 +++++++++++++++++++++++++++++++++--
 include/uapi/linux/if_link.h |   1 -
 src/libbpf.c                 |  98 +++++++++++++++++++-----
 src/xsk.c                    |   2 +-
 4 files changed, 213 insertions(+), 28 deletions(-)

--
2.24.1
2021-03-03 08:35:13 -08:00
Andrii Nakryiko
18b55bc136 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-03-03 08:35:13 -08:00
Yonghong Song
712f6587c9 libbpf: Support subprog address relocation
A new relocation RELO_SUBPROG_ADDR is added to capture
subprog addresses loaded with ld_imm64 insns. Such ld_imm64
insns are marked with BPF_PSEUDO_FUNC and will be passed to
the kernel. For the bpf_for_each_map_elem() case, the kernel will
check that the to-be-used subprog address is a static
function and replace it with the proper actual jited func address.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210226204930.3885367-1-yhs@fb.com
2021-03-03 08:35:13 -08:00
Yonghong Song
60aa32b17a libbpf: Move function is_ldimm64() earlier in libbpf.c
Move function is_ldimm64() close to the beginning of libbpf.c,
so it can be reused by later code and the next patch as well.
There is no functionality change.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210226204929.3885295-1-yhs@fb.com
2021-03-03 08:35:13 -08:00
Yonghong Song
f3612e4117 bpf: Add bpf_for_each_map_elem() helper
The bpf_for_each_map_elem() helper is introduced which
iterates all map elements with a callback function. The
helper signature looks like
  long bpf_for_each_map_elem(map, callback_fn, callback_ctx, flags)
and for each map element, the callback_fn will be called. For example,
like hashmap, the callback signature may look like
  long callback_fn(map, key, val, callback_ctx)

There are two known use cases for this. One is from upstream ([1]) where
a for_each_map_elem helper may help implement a timeout mechanism
in a more generic way. Another is from our internal discussion
for a firewall use case where a map contains all the rules. The packet
data can be compared to all these rules to decide allow or deny
the packet.

For array maps, users can already use a bounded loop to traverse
elements. Using this helper avoids the need for a bounded loop. For other
types of maps (e.g., hash maps) where a bounded loop is hard or
impossible to use, this helper provides a convenient way to
operate on all elements.

For callback_fn, besides map and map element, a callback_ctx,
allocated on caller stack, is also passed to the callback
function. This callback_ctx argument can provide additional
input and allows writing to the caller's stack for output.

If the callback_fn returns 0, the helper will iterate through next
element if available. If the callback_fn returns 1, the helper
will stop iterating and returns to the bpf program. Other return
values are not used for now.
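
For example, a BPF-side use might look like this sketch (map layout and
callback names are illustrative):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  struct {
          __uint(type, BPF_MAP_TYPE_HASH);
          __uint(max_entries, 1024);
          __type(key, __u32);
          __type(value, __u64);
  } rules SEC(".maps");

  struct cb_ctx {
          __u64 matched;
  };

  /* called for each element; return 0 to continue, 1 to stop iterating */
  static __u64 count_nonzero(struct bpf_map *map, __u32 *key, __u64 *val,
                             struct cb_ctx *ctx)
  {
          if (*val)
                  ctx->matched++;
          return 0;
  }

  SEC("tc")
  int scan_rules(struct __sk_buff *skb)
  {
          struct cb_ctx ctx = {};

          /* callback_ctx lives on this program's stack */
          bpf_for_each_map_elem(&rules, count_nonzero, &ctx, 0);
          return ctx.matched ? 1 : 0;
  }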

Currently, this helper is only available with the JIT. It is possible
to make it work with the interpreter with some effort, but I leave that
as future work.

[1]: https://lore.kernel.org/bpf/20210122205415.113822-1-xiyou.wangcong@gmail.com/

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210226204925.3884923-1-yhs@fb.com
2021-03-03 08:35:13 -08:00
Hangbin Liu
587d2ab628 bpf: Remove blank line in bpf helper description comment
Commit 34b2021cc616 ("bpf: Add BPF-helper for MTU checking") added an extra
blank line in bpf helper description. This will make bpf_helpers_doc.py stop
building bpf_helper_defs.h immediately after bpf_check_mtu(), which will
affect future added functions.

Fixes: 34b2021cc616 ("bpf: Add BPF-helper for MTU checking")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20210223131457.1378978-1-liuhangbin@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-03-03 08:35:13 -08:00
Jesper Dangaard Brouer
6cc16d6401 bpf: Add BPF-helper for MTU checking
This BPF-helper bpf_check_mtu() works for both XDP and TC-BPF programs.

The SKB object is complex and the skb->len value (accessible from the
BPF-prog) also includes the length of any extra GRO/GSO segments, but
without taking into account that these GRO/GSO segments get transport
(L4) and network (L3) headers added before being transmitted. Thus,
this BPF-helper is created such that the BPF-programmer doesn't need to
handle these details in the BPF-prog.

The API is designed to help BPF-programmers who want to do packet
context size changes, which involve other helpers. These other helpers
usually do a delta size adjustment. This helper also supports a delta
size (len_diff), which allows the BPF-programmer to reuse arguments needed by
these other helpers, and to perform the MTU check prior to doing any actual
size adjustment of the packet context.

It is intentional that we allow the len adjustment to become a negative
result that will pass the MTU check. This might seem weird, but it's not
this helper's responsibility to "catch" wrong len_diff adjustments. Other
helpers will take care of these checks, if the BPF-programmer chooses to do
an actual size adjustment.
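
A rough XDP-side sketch of using the helper (names and values are
illustrative):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  SEC("xdp")
  int xdp_mtu_check(struct xdp_md *ctx)
  {
          __u32 mtu_len = 0;  /* on failure, receives the MTU that was exceeded */
          __s32 len_diff = 0; /* planned size adjustment, checked against the MTU */

          /* ifindex 0 means "use the current netdev" */
          if (bpf_check_mtu(ctx, 0, &mtu_len, len_diff, 0))
                  return XDP_DROP;
          return XDP_PASS;
  }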

V14:
 - Improve man-page desc of len_diff.

V13:
 - Enforce flag BPF_MTU_CHK_SEGS cannot use len_diff.

V12:
 - Simplify segment check that calls skb_gso_validate_network_len.
 - Helpers should return long

V9:
- Use dev->hard_header_len (instead of ETH_HLEN)
- Annotate with unlikely req from Daniel
- Fix logic error using skb_gso_validate_network_len from Daniel

V6:
- Took John's advice and dropped BPF_MTU_CHK_RELAX
- Returned MTU is kept at L3-level (like fib_lookup)

V4: Lot of changes
 - ifindex 0 now use current netdev for MTU lookup
 - rename helper from bpf_mtu_check to bpf_check_mtu
 - fix bug for GSO pkt length (as skb->len is total len)
 - remove __bpf_len_adj_positive, simply allow negative len adj

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287790461.790810.3429728639563297353.stgit@firesoul
2021-03-03 08:35:13 -08:00
Jesper Dangaard Brouer
f0753b5259 bpf: bpf_fib_lookup return MTU value as output when looked up
The BPF-helpers for FIB lookup (bpf_xdp_fib_lookup and bpf_skb_fib_lookup)
can perform an MTU check and return BPF_FIB_LKUP_RET_FRAG_NEEDED. The BPF-prog
doesn't know the MTU value that caused this rejection.

If the BPF-prog wants to implement PMTU (Path MTU Discovery) (rfc1191), it
needs to know this MTU value for the ICMP packet.

This patch changes the lookup and result struct bpf_fib_lookup to contain this MTU
value as output, via a union with 'tot_len', as this is the value used for
the MTU lookup.

V5:
 - Fixed uninit value spotted by Dan Carpenter.
 - Name struct output member mtu_result

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287789952.790810.13134700381067698781.stgit@firesoul
2021-03-03 08:35:13 -08:00
Martin KaFai Lau
642655629b libbpf: Ignore non function pointer member in struct_ops
When libbpf initializes the kernel's struct_ops in
"bpf_map__init_kern_struct_ops()", it enforces all
pointer types must be a function pointer and rejects
others.  It turns out to be too strict.  For example,
when directly using "struct tcp_congestion_ops" from vmlinux.h,
it has a "struct module *owner" member and it is set to NULL
in a bpf_tcp_cc.o.

Instead, it only needs to ensure the member is a function
pointer if it has been set (relocated) to a bpf-prog.
This patch moves the "btf_is_func_proto(kern_mtype)" check
after the existing "if (!prog) { continue; }".  The original debug
message in "if (!prog) { continue; }" is also removed since it is
no longer valid.  Besides, there is a later debug message to tell
which function pointer is set.

The "btf_is_func_proto(mtype)" has already been guaranteed
in "bpf_object__collect_st_ops_relos()" which has been run
before "bpf_map__init_kern_struct_ops()".  Thus, this check
is removed.

v2:
- Remove outdated debug message (Andrii)
  Remove because there is a later debug message to tell
  which function pointer is set.
- Following mtype->type is no longer needed. Remove:
  "skip_mods_and_typedefs(btf, mtype->type, &mtype_id)"
- Do "if (!prog)" test before skip_mods_and_typedefs.

Fixes: 590a00888250 ("bpf: libbpf: Add STRUCT_OPS support")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210212021030.266932-1-kafai@fb.com
2021-03-03 08:35:13 -08:00
Stanislav Fomichev
14a61e86f0 libbpf: Use AF_LOCAL instead of AF_INET in xsk.c
We have the environments where usage of AF_INET is prohibited
(cgroup/sock_create returns EPERM for AF_INET). Let's use
AF_LOCAL instead of AF_INET, it should perfectly work with SIOCETHTOOL.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20210209221826.922940-1-sdf@google.com
2021-03-03 08:35:13 -08:00
Florent Revest
d142d4a382 bpf: Expose bpf_get_socket_cookie to tracing programs
This needs a new helper that:
- can work in a sleepable context (using sock_gen_cookie)
- takes a struct sock pointer and checks that it's not NULL

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210111406.785541-2-revest@chromium.org
2021-03-03 08:35:13 -08:00
Florent Revest
99e6a464b8 bpf: Be less specific about socket cookies guarantees
Since "92acdc58ab11 bpf, net: Rework cookie generator as per-cpu one"
socket cookies are not guaranteed to be non-decreasing. The
bpf_get_socket_cookie helper descriptions are currently specifying that
cookies are non-decreasing but we don't want users to rely on that.

Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/bpf/20210210111406.785541-1-revest@chromium.org
2021-03-03 08:35:13 -08:00
Alexei Starovoitov
1015d47c2b bpf: Count the number of times recursion was prevented
Add per-program counter for number of times recursion prevention mechanism
was triggered and expose it via show_fdinfo and bpf_prog_info.
Teach bpftool to print it.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210033634.62081-7-alexei.starovoitov@gmail.com
2021-03-03 08:35:13 -08:00
Jonas Bonn
06ee116fb1 Revert "GTP: add support for flow based tunneling API"
This reverts commit 9ab7e76aefc97a9aa664accb59d6e8dc5e52514a.

This patch was committed without maintainer approval and despite a number
of unaddressed concerns from review.  There are several issues that
impede the acceptance of this patch and that make a reversion of this
particular instance of these changes the best way forward:

i)  the patch contains several logically separate changes that would be
better served as smaller patches (for review purposes)
ii) functionality like the handling of end markers has been introduced
without further explanation
iii) symmetry between the handling of GTPv0 and GTPv1 has been
unnecessarily broken
iv) the patchset produces 'broken' packets when extension headers are
included
v) there are no available userspace tools to allow for testing this
functionality
vi) there is an unaddressed Coverity report against the patch concerning
memory leakage
vii) most importantly, the patch contains a large amount of superfluous
churn that impedes other ongoing work with this driver

This patch will be reworked into a series that aligns with other
ongoing work and facilitates review.

Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
Acked-by: Harald Welte <laforge@gnumonks.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-03-03 08:35:13 -08:00
Andrii Nakryiko
e1a90f3768 travis-ci: switch from GCC8 to GCC10
GCC 8 builds started failing due to missing gcc-8 package. Let's switch to GCC
10 instead.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-02-22 14:00:53 -08:00
Andrii Nakryiko
f2a926ba46 Revert "vmtests: revert to Clang/LLVM 12 until Clang 13 regression is fixed"
Clang 13 now has the fix for the original regression, time to get back to
using nightly versions.

This reverts commit adaf538bca.
2021-02-22 11:36:12 -08:00
Matteo Croce
b0b5ec0006 fix typo in license name
LGPL was incorrectly named LPGL

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
2021-02-22 11:35:49 -08:00
Andrii Nakryiko
adaf538bca vmtests: revert to Clang/LLVM 12 until Clang 13 regression is fixed
Clang 13 regressed BPF code generation causing some of BPF selftests to fail.
Until that is mitigated, stick to version 12.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-02-08 20:37:11 -08:00
Andrii Nakryiko
f35e87ddc4 vmtest: switch to Clang/LLVM 13
Clang 13 became the new nightly version, so switch to it. Also do vmlinux
compilation with a bit more parallelism. And account python-docutils
installation as part of selftests build.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-02-03 14:53:19 -08:00
Matteo Croce
767d82caab install: don't preserve file owner
'cp -p' preserves file ownership, which may leave files owned by the
current user in /lib.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
2021-01-26 19:33:48 -08:00
Andrii Nakryiko
a199b85415 vmtest: blacklist atomics selftest for 5.5
5.5 doesn't support atomics.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
649f9dc746 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   3db1a3fa98808aa90f95ec3e0fa2fc7abf28f5c9
Checkpoint bpf-next commit: 86ce322d21eb032ed8fdd294d0fb095d2debb430
Baseline bpf commit:        1a3449c19407a28f7019a887cdf0d6ba2444751a
Checkpoint bpf commit:      78031381ae9c88f4f914d66154f4745122149c58

Andrii Nakryiko (5):
  libbpf: Add user-space variants of BPF_CORE_READ() family of macros
  libbpf: Add non-CO-RE variants of BPF_CORE_READ() macro family
  libbpf: Clarify kernel type use with USER variants of CORE reading
    macros
  libbpf: Support kernel module ksym externs
  libbpf: Allow loading empty BTFs

Björn Töpel (1):
  libbpf, xsk: Select AF_XDP BPF program based on kernel version

Brendan Jackman (4):
  bpf: Clarify return value of probe str helpers
  bpf: Rename BPF_XADD and prepare to encode other atomics in .imm
  bpf: Add BPF_FETCH field / create atomic_fetch_add instruction
  bpf: Add instructions for atomic_[cmp]xchg

Ian Rogers (1):
  bpf, libbpf: Avoid unused function warning on bpf_tail_call_static

Jiri Olsa (1):
  libbpf: Use string table index from index table if needed

Pravin B Shelar (1):
  GTP: add support for flow based tunneling API

 include/uapi/linux/bpf.h     |  20 +++--
 include/uapi/linux/if_link.h |   1 +
 src/bpf_core_read.h          | 169 +++++++++++++++++++++++++++--------
 src/bpf_helpers.h            |   2 +-
 src/btf.c                    |  17 ++--
 src/libbpf.c                 |  50 +++++++----
 src/xsk.c                    |  81 ++++++++++++++++-
 7 files changed, 265 insertions(+), 75 deletions(-)

--
2.24.1
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
eb56f8fb12 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-01-26 19:32:13 -08:00
Björn Töpel
6e01a23cf6 libbpf, xsk: Select AF_XDP BPF program based on kernel version
Add detection for kernel version, and adapt the BPF program based on
kernel support. This way, users will get the best possible performance
from the BPF program.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Marek Majtyka  <alardam@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210122105351.11751-4-bjorn.topel@gmail.com
2021-01-26 19:32:13 -08:00
Jiri Olsa
16d7f413e2 libbpf: Use string table index from index table if needed
For very large ELF objects (with many sections), we could
get the special value SHN_XINDEX (65535) for the elf object's string
table index - e_shstrndx.

Call elf_getshdrstrndx to get the proper string table index,
instead of reading it directly from the ELF header.
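
i.e., roughly the following sketch:

  #include <errno.h>
  #include <libelf.h>

  /* look up the section-header string table index, handling SHN_XINDEX */
  static int get_shstrndx(Elf *elf, size_t *shstrndx)
  {
          /* unlike reading ehdr.e_shstrndx directly, this also works for ELF
           * objects with very many sections, where e_shstrndx is SHN_XINDEX */
          if (elf_getshdrstrndx(elf, shstrndx))
                  return -EINVAL;
          return 0;
  }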

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210121202203.9346-4-jolsa@kernel.org
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
f037b92465 libbpf: Allow loading empty BTFs
Empty BTFs do come up (e.g., simple kernel modules with no new types and
strings, compared to the vmlinux BTF) and there is nothing technically wrong
with them. So remove unnecessary check preventing loading empty BTFs.

Fixes: d8123624506c ("libbpf: Fix BTF data layout checks and allow empty BTF")
Reported-by: Christopher William Snowhill <chris@kode54.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210110070341.1380086-2-andrii@kernel.org
2021-01-26 19:32:13 -08:00
Pravin B Shelar
4adbb7b2c7 GTP: add support for flow based tunneling API
Following patch add support for flow based tunneling API
to send and recv GTP tunnel packet over tunnel metadata API.
This would allow this device integration with OVS or eBPF using
flow based tunneling APIs.

Signed-off-by: Pravin B Shelar <pbshelar@fb.com>
Link: https://lore.kernel.org/r/20210110070021.26822-1-pbshelar@fb.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-26 19:32:13 -08:00
Brendan Jackman
d2b784d370 bpf: Add instructions for atomic_[cmp]xchg
This adds two atomic opcodes, both of which include the BPF_FETCH
flag. XCHG without the BPF_FETCH flag would naturally encode
atomic_set. This is not supported because it would be of limited
value to userspace (it doesn't imply any barriers). CMPXCHG without
BPF_FETCH would be an atomic compare-and-write. We don't have such
an operation in the kernel so it isn't provided to BPF either.

There are two significant design decisions made for the CMPXCHG
instruction:

 - To solve the issue that this operation fundamentally has 3
   operands, but we only have two register fields. Therefore the
   operand we compare against (the kernel's API calls it 'old') is
   hard-coded to be R0. x86 has similar design (and A64 doesn't
   have this problem).

   A potential alternative might be to encode the other operand's
   register number in the immediate field.

 - The kernel's atomic_cmpxchg returns the old value, while the C11
   userspace APIs return a boolean indicating the comparison
   result. Which should BPF do? A64 returns the old value. x86 returns
   the old value in the hard-coded register (and also sets a
   flag). That means return-old-value is easier to JIT, so that's
   what we use.
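
From BPF C these instructions are normally reached through compiler builtins;
a sketch, assuming Clang with an atomics-capable target CPU (e.g. -mcpu=v3):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  __u64 val; /* global variable, backed by the .bss map */

  SEC("raw_tp/sys_enter")
  int test_atomics(void *ctx)
  {
          /* lowers to BPF CMPXCHG: compare 'val' against 0, swap in 1,
           * the old value comes back (in R0) */
          __u64 old = __sync_val_compare_and_swap(&val, 0, 1);

          /* lowers to an atomic fetch-add (BPF_ADD | BPF_FETCH) since the
           * result is used */
          __u64 prev = __sync_fetch_and_add(&val, 5);

          val = old + prev;
          return 0;
  }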

Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-8-jackmanb@google.com
2021-01-26 19:32:13 -08:00
Brendan Jackman
ac86f42e4a bpf: Add BPF_FETCH field / create atomic_fetch_add instruction
The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC
instructions, in order to have the previous value of the
atomically-modified memory location loaded into the src register
after an atomic op is carried out.

Suggested-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-7-jackmanb@google.com
2021-01-26 19:32:13 -08:00
Brendan Jackman
03fbe22a59 bpf: Rename BPF_XADD and prepare to encode other atomics in .imm
A subsequent patch will add additional atomic operations. These new
operations will use the same opcode field as the existing XADD, with
the immediate discriminating different operations.

In preparation, rename the instruction mode BPF_ATOMIC and start
calling the zero immediate BPF_ADD.

This is possible (doesn't break existing valid BPF progs) because the
immediate field is currently reserved MBZ and BPF_ADD is zero.

All uses are removed from the tree but the BPF_XADD definition is
kept around to avoid breaking builds for people including kernel
headers.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@gmail.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-5-jackmanb@google.com
2021-01-26 19:32:13 -08:00
Ian Rogers
f15814c93a bpf, libbpf: Avoid unused function warning on bpf_tail_call_static
Add inline to __always_inline, making it match linux/compiler.h.
Adding this avoids an unused-function warning on bpf_tail_call_static
when compiling with -Wall.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210113223609.3358812-1-irogers@google.com
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
0db7da9a4a libbpf: Support kernel module ksym externs
Add support for searching for ksym externs not just in vmlinux BTF, but across
all module BTFs, similarly to how it's done for CO-RE relocations. Kernels
that expose module BTFs through sysfs are assumed to support new ldimm64
instruction extension with BTF FD provided in insn[1].imm field, so no extra
feature detection is performed.
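
For illustration, a typed extern backed by a module's BTF could be declared
like this hypothetical sketch (the symbol and its type are made up, not from
this patch):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  /* hypothetical variable exported by a kernel module; libbpf resolves it
   * against the module's BTF and fills insn[1].imm with the module BTF FD */
  extern const int bpf_testmod_example_cnt __ksym;

  SEC("raw_tp/sys_enter")
  int read_mod_ksym(void *ctx)
  {
          /* once resolved, the extern behaves like a regular kernel symbol */
          __u64 addr = (__u64)&bpf_testmod_example_cnt;

          return addr ? 1 : 0;
  }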

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20210112075520.4103414-7-andrii@kernel.org
2021-01-26 19:32:13 -08:00
Brendan Jackman
0de8b9a906 bpf: Clarify return value of probe str helpers
When the buffer is too small to contain the input string, these helpers
return the length of the buffer, not the length of the original string.
This tries to make the docs totally clear about that, since "the length
of the [copied ]string" could also refer to the length of the input.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210112123422.2011234-1-jackmanb@google.com
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
d52e5f5f88 libbpf: Clarify kernel type use with USER variants of CORE reading macros
Add comments clarifying that USER variants of CO-RE reading macro are still
only going to work with kernel types, defined in kernel or kernel module BTF.
This should help preventing invalid use of those macro to read user-defined
types (which doesn't work with CO-RE).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210108194408.3468860-1-andrii@kernel.org
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
a26ae1b254 libbpf: Add non-CO-RE variants of BPF_CORE_READ() macro family
BPF_CORE_READ(), in addition to handling CO-RE relocations, also allows a much
nicer way to read data structures with nested pointers. Instead of writing
a sequence of bpf_probe_read() calls to follow links, one can just write
BPF_CORE_READ(a, b, c, d) to effectively do a->b->c->d read. This is a welcome
ability when porting BCC code, which (in most cases) allows exactly the
intuitive a->b->c->d variant.

This patch adds non-CO-RE variants of BPF_CORE_READ() family of macros for
cases where CO-RE is not supported (e.g., old kernels). In such cases, the
property of shortening a sequence of bpf_probe_read()s to a simple
BPF_PROBE_READ(a, b, c, d) invocation is still desirable, especially when
porting BCC code to libbpf. Yet, no CO-RE relocation is going to be emitted.
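
For example, the following sketch reads a nested pointer chain without CO-RE
(the field chain and program type are illustrative):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_core_read.h>

  char LICENSE[] SEC("license") = "GPL";

  SEC("kprobe/do_exit")
  int trace_exit(struct pt_regs *ctx)
  {
          struct task_struct *task = (void *)bpf_get_current_task();
          struct dentry *dentry;

          /* expands into a chain of bpf_probe_read() calls, but emits no
           * CO-RE relocations */
          dentry = BPF_PROBE_READ(task, mm, exe_file, f_path.dentry);
          return dentry ? 1 : 0;
  }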

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201218235614.2284956-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
c1d4bbb8c7 libbpf: Add user-space variants of BPF_CORE_READ() family of macros
Add BPF_CORE_READ_USER(), BPF_CORE_READ_USER_STR() and their _INTO()
variations to allow reading CO-RE-relocatable kernel data structures from the
user-space. One of such cases is reading input arguments of syscalls, while
reaping the benefits of CO-RE relocations w.r.t. handling 32/64 bit
conversions and handling missing/new fields in UAPI data structs.

Suggested-by: Gilad Reti <gilad.reti@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201218235614.2284956-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-01-26 19:32:13 -08:00
Luca Boccassi
7c2a94f4f8 README: mention that Debian 11 ships with BTF support 2021-01-22 12:27:35 -08:00
Luca Boccassi
051a4009f9 pkgconfig: use literal ${prefix} to allow override
Various workflows (--define-prefix, --define-variable=prefix) require variables in
the pc file to use a literal ${prefix} so that it can be overridden. Change the Makefile
so that, by default and unless explicitly specified, it is set as expected.

Signed-off-by: Luca Boccassi <bluca@debian.org>
2021-01-03 10:41:31 -08:00
Luca Boccassi
a3a5e9688a README: point to Debian source package rather than binary
For consistency with other links

Signed-off-by: Luca Boccassi <bluca@debian.org>
2021-01-03 10:41:31 -08:00
Luca Boccassi
5569404346 README: note that Debian 11 (will) ship LLVM 11
Signed-off-by: Luca Boccassi <bluca@debian.org>
2021-01-03 10:41:31 -08:00
Andrii Nakryiko
e05f9be4f4 vmtests: temporarily disable test_maps
Disable the test_maps test until we figure out why it started failing.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
4d3535ff7b vmtests: test_maps needs more memory, so bump to 4G
Memory is cheap.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
c66a9770e3 vmtests: fix up bpf_testmod.ko generation for 5.5 and 4.9
Selftests makefile deletes local bpf_testmod.ko, so that invalidates current
approach of faking bpf_testmod.ko "generation". Instead, generate a fake
Makefile that will create an empty bpf_testmod/bpf_testmod.ko.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
8262be6034 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   5c667dca71095abec90420eb09503f35f66c9585
Checkpoint bpf-next commit: 3db1a3fa98808aa90f95ec3e0fa2fc7abf28f5c9
Baseline bpf commit:        12c8a8ca117f3d734babc3fba131fdaa329d2163
Checkpoint bpf commit:      1a3449c19407a28f7019a887cdf0d6ba2444751a

Andrii Nakryiko (2):
  bpf: Fix enum names for bpf_this_cpu_ptr() and bpf_per_cpu_ptr()
    helpers
  libbpf: Support modules in bpf_program__set_attach_target() API

Brendan Jackman (1):
  libbpf: Expose libbpf ring_buffer epoll_fd

Florent Revest (1):
  bpf: Add a bpf_sock_from_file helper

 include/uapi/linux/bpf.h | 13 ++++++--
 src/libbpf.c             | 64 +++++++++++++++++++++++++---------------
 src/libbpf.h             |  1 +
 src/libbpf.map           |  1 +
 src/ringbuf.c            |  6 ++++
 5 files changed, 59 insertions(+), 26 deletions(-)

--
2.24.1
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
182e9dde0d sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-12-20 17:00:58 -08:00
Brendan Jackman
30e2c16571 libbpf: Expose libbpf ring_buffer epoll_fd
This provides a convenient perf ringbuf -> libbpf ringbuf migration
path for users of external polling systems. It is analogous to
perf_buffer__epoll_fd.
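
A rough usage sketch, assuming an application-owned epoll instance
(app_epoll_fd is hypothetical, and sys/epoll.h is included):

  /* register libbpf's ring buffer fd with the application's epoll loop */
  int rb_fd = ring_buffer__epoll_fd(rb);
  struct epoll_event ev = { .events = EPOLLIN, .data.ptr = rb };
  epoll_ctl(app_epoll_fd, EPOLL_CTL_ADD, rb_fd, &ev);

  /* later, when rb_fd is reported ready, drain it without blocking */
  ring_buffer__consume(rb);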

Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201214113812.305274-1-jackmanb@google.com
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
ebcae62e7e libbpf: Support modules in bpf_program__set_attach_target() API
Support finding kernel targets in kernel modules when using
bpf_program__set_attach_target() API. This brings it up to par with what
libbpf supports when doing declarative SEC()-based target determination.

Some minor internal refactoring was needed to make sure vmlinux BTF can be
loaded before bpf_object's load phase.
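
A minimal sketch (prog and err are hypothetical; bpf_testmod_test_read is the
module function used by the kernel selftests for this feature):

  /* make an fentry/raw_tp program target a function from a kernel module;
   * attach_prog_fd == 0 tells libbpf to search kernel (vmlinux + module) BTF */
  err = bpf_program__set_attach_target(prog, 0, "bpf_testmod_test_read");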

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201211215825.3646154-2-andrii@kernel.org
2020-12-20 17:00:58 -08:00
Florent Revest
252ad1f3eb bpf: Add a bpf_sock_from_file helper
While eBPF programs can check whether a file is a socket by file->f_op
== &socket_file_ops, they cannot convert the void private_data pointer
to a struct socket BTF pointer. In order to do this a new helper
wrapping sock_from_file is added.

This is useful to tracing programs but also other program types
inheriting this set of helpers such as iterators or LSM programs.
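
A minimal sketch of the helper in a tracing/LSM context (the surrounding
program and the `file` pointer are illustrative):

  struct socket *sock = bpf_sock_from_file(file);
  if (sock) {
          /* sock is a BTF pointer; its fields can be read, e.g. with BPF_CORE_READ() */
  }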

Signed-off-by: Florent Revest <revest@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: KP Singh <kpsingh@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201204113609.1850150-2-revest@google.com
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
3e68c60659 bpf: Fix enum names for bpf_this_cpu_ptr() and bpf_per_cpu_ptr() helpers
Remove bpf_ prefix, which causes these helpers to be reported in verifier
dump as bpf_bpf_this_cpu_ptr() and bpf_bpf_per_cpu_ptr(), respectively. Let's
fix it while it is still possible, before UAPI freezes on these helpers.

Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
42baefba71 vmtests: update blacklist for 5.5
Blacklist selftests relying on newer kernel's features.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
46ecf7aef3 vmtest: omit building bpf_testmod.ko on non-latest kernels
Non-latest kernel versions don't build the kernel from sources, so module building
fails, despite using `make prepare`. For now, just make sure no module is
built by overwriting bpf_testmod/Makefile.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
2981bb8d26 vmtests: update vmlinux.h to latest version
Update vmlinux.h to get some of BPF UAPI constants needed for the compilation
of new selftests.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
2042df2fed sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   c6bde958a62b8ca5ee8d2c1fe429aec4ad54efad
Checkpoint bpf-next commit: 5c667dca71095abec90420eb09503f35f66c9585
Baseline bpf commit:        d3bec0138bfbe58606fc1d6f57a4cdc1a20218db
Checkpoint bpf commit:      12c8a8ca117f3d734babc3fba131fdaa329d2163

Alan Maguire (1):
  libbpf: bpf__find_by_name[_kind] should use btf__get_nr_types()

Andrei Matei (1):
  libbpf: Fail early when loading programs with unspecified type

Andrii Nakryiko (11):
  bpf: Assign ID to vmlinux BTF and return extra info for BTF in
    GET_OBJ_INFO
  libbpf: Don't attempt to load unused subprog as an entry-point BPF
    program
  libbpf: Add base BTF accessor
  libbpf: Add internal helper to load BTF data by FD
  libbpf: Refactor CO-RE relocs to not assume a single BTF object
  libbpf: Add kernel module BTF support for CO-RE relocations
  bpf: Allow to specify kernel module BTFs when attaching BPF programs
  libbpf: Factor out low-level BPF program loading helper
  libbpf: Support attachment of BPF tracing programs to kernel modules
  libbpf: Use memcpy instead of strncpy to please GCC
  libbpf: Fix ring_buffer__poll() to return number of consumed samples

Dmitrii Banshchikov (1):
  bpf: Add bpf_ktime_get_coarse_ns helper

KP Singh (5):
  bpf: Implement task local storage
  libbpf: Add support for task local storage
  bpf: Implement get_current_task_btf and RET_PTR_TO_BTF_ID
  bpf: Add bpf_bprm_opts_set helper
  bpf: Add a BPF helper for getting the IMA hash of an inode

Li RongQing (1):
  libbpf: Add support for canceling cached_cons advance

Magnus Karlsson (1):
  libbpf: Replace size_t with __u32 in xsk interfaces

Mariusz Dudek (1):
  libbpf: Separate XDP program load with xsk socket creation

Stanislav Fomichev (1):
  libbpf: Cap retries in sys_bpf_prog_load

Thomas Karlsson (1):
  macvlan: Support for high multicast packet rate

Toke Høiland-Jørgensen (1):
  libbpf: Sanitise map names before pinning

 include/uapi/linux/bpf.h     |  96 +++++-
 include/uapi/linux/if_link.h |   2 +
 src/bpf.c                    | 104 +++++--
 src/btf.c                    |  74 +++--
 src/btf.h                    |   1 +
 src/libbpf.c                 | 550 +++++++++++++++++++++++++++--------
 src/libbpf.map               |   3 +
 src/libbpf_internal.h        |  31 ++
 src/libbpf_probes.c          |   1 +
 src/ringbuf.c                |   2 +-
 src/xsk.c                    |  92 +++++-
 src/xsk.h                    |  22 +-
 12 files changed, 771 insertions(+), 207 deletions(-)

--
2.24.1
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
8c2c4c3451 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
21ae7bb113 libbpf: Fix ring_buffer__poll() to return number of consumed samples
Fix ring_buffer__poll() to return the number of non-discarded records
consumed, just like its documentation states. It's also consistent with
ring_buffer__consume() return. Fix up selftests with wrong expected results.
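
In other words (the timeout value is illustrative):

  /* returns the number of non-discarded records consumed, or a negative error */
  int n = ring_buffer__poll(rb, 100 /* timeout in ms */);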

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Fixes: cb1c9ddd5525 ("selftests/bpf: Add BPF ringbuf selftests")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201130223336.904192-1-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
b2a34784b2 libbpf: Use memcpy instead of strncpy to please GCC
Some versions of GCC are really nit-picky about strncpy() use. Use memcpy(),
as they are pretty much equivalent for the case of fixed length strings.

Fixes: e459f49b4394 ("libbpf: Separate XDP program load with xsk socket creation")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203235440.2302137-1-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
d95b12da56 libbpf: Support attachment of BPF tracing programs to kernel modules
Teach libbpf to search for BTF types in kernel modules for tracing BPF
programs. This allows attachment of raw_tp/fentry/fexit/fmod_ret/etc BPF
program types to tracepoints and functions in kernel modules.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-13-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
a1fd6dab54 libbpf: Factor out low-level BPF program loading helper
Refactor low-level API for BPF program loading to not rely on public API
types. This allows painless extension without constant efforts to cleverly not
break backwards compatibility.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-12-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
fde1be5a9c bpf: Allow to specify kernel module BTFs when attaching BPF programs
Add ability for user-space programs to specify non-vmlinux BTF when attaching
BTF-powered BPF programs: raw_tp, fentry/fexit/fmod_ret, LSM, etc. For this,
attach_prog_fd (now with the alias name attach_btf_obj_fd) should specify FD
of a module or vmlinux BTF object. For backwards compatibility reasons,
0 denotes vmlinux BTF. Only kernel BTF (vmlinux or module) can be specified.
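
Roughly, at the syscall level (func_btf_id and module_btf_fd are hypothetical
values obtained elsewhere):

  union bpf_attr attr = {};

  attr.prog_type = BPF_PROG_TYPE_TRACING;
  attr.expected_attach_type = BPF_TRACE_FENTRY;
  attr.attach_btf_id = func_btf_id;        /* type ID inside the chosen BTF */
  attr.attach_btf_obj_fd = module_btf_fd;  /* module (or vmlinux) BTF FD; 0 = vmlinux */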

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-11-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
6b08519a69 libbpf: Add kernel module BTF support for CO-RE relocations
Teach libbpf to search for candidate types for CO-RE relocations across kernel
modules BTFs, in addition to vmlinux BTF. If at least one candidate type is
found in vmlinux BTF, kernel module BTFs are not iterated. If vmlinux BTF has
no matching candidates, then find all kernel module BTFs and search for all
matching candidates across all of them.

The kernel's support for module BTFs is inferred from the support for the BTF
name pointer in BPF UAPI.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-6-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
aff8028b6e libbpf: Refactor CO-RE relocs to not assume a single BTF object
Refactor CO-RE relocation candidate search to not expect a single BTF, rather
return all candidate types with their corresponding BTF objects. This will
allow to extend CO-RE relocations to accommodate kernel module BTFs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-5-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
10e321f100 libbpf: Add internal helper to load BTF data by FD
Add a btf_get_from_fd() helper, which constructs struct btf from in-kernel BTF
data by FD. This is used for loading module BTFs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-4-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Stanislav Fomichev
8051a539d8 libbpf: Cap retries in sys_bpf_prog_load
I've seen a situation, where a process that's under pprof constantly
generates SIGPROF which prevents program loading indefinitely.
The right thing to do probably is to disable signals in the upper
layers while loading, but it still would be nice to get some error from
libbpf instead of an endless loop.

Let's add some small retry limit to the program loading:
try loading the program 5 (arbitrary) times and give up.

v2:
* 10 -> 5 retries (Andrii Nakryiko)

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201202231332.3923644-1-sdf@google.com
2020-12-04 20:04:52 -08:00
Toke Høiland-Jørgensen
691c22dc0c libbpf: Sanitise map names before pinning
When we added sanitising of map names before loading programs to libbpf, we
still allowed periods in the name. While the kernel will accept these for
the map names themselves, they are not allowed in file names when pinning
maps. This means that bpf_object__pin_maps() will fail if called on an
object that contains internal maps (such as sections .rodata).

Fix this by replacing periods with underscores when constructing map pin
paths. This only affects the paths generated by libbpf when
bpf_object__pin_maps() is called with a path argument. Any pin paths set
by bpf_map__set_pin_path() are unaffected, and it will still be up to the
caller to avoid invalid characters in those.

Fixes: 113e6b7e15e2 ("libbpf: Sanitise internal map names so they are not rejected by the kernel")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201203093306.107676-1-toke@redhat.com
2020-12-04 20:04:52 -08:00
Andrei Matei
5fe9c1217a libbpf: Fail early when loading programs with unspecified type
Before this patch, a program with unspecified type
(BPF_PROG_TYPE_UNSPEC) would be passed to the BPF syscall, only to have
the kernel reject it with an opaque invalid argument error. This patch
makes libbpf reject such programs with a nicer error message - in
particular libbpf now tries to diagnose bad ELF section names at both
open time and load time.

Signed-off-by: Andrei Matei <andreimatei1@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201203043410.59699-1-andreimatei1@gmail.com
2020-12-04 20:04:52 -08:00
Mariusz Dudek
78c76a1015 libbpf: Separate XDP program load with xsk socket creation
Add support for separation of eBPF program load and xsk socket
creation.

This is needed for the use case when you want to provide as few
privileges as possible to the data plane application that will
handle xsk socket creation and incoming traffic.

With this patch the data entity container can be run with only the
CAP_NET_RAW capability to fulfill its purpose of creating the xsk
socket and handling packets. In case your umem is larger than or
equal to the process limit for MEMLOCK, you need to either increase
the limit or add the CAP_IPC_LOCK capability.

To resolve the privileges issue, two APIs are introduced (a usage
sketch follows the list):

- xsk_setup_xdp_prog - loads the built-in XDP program. It can
also return xsks_map_fd, which is needed by the unprivileged process
to update xsks_map with the AF_XDP socket "fd"

- xsk_socket__update_xskmap - inserts an AF_XDP socket into an xskmap
for a particular xsk_socket
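
A rough split of responsibilities between the two processes (error handling
omitted; ifindex and xsk are hypothetical):

  /* privileged process: load the built-in XDP program, obtain the xskmap fd */
  int xsks_map_fd;
  int err = xsk_setup_xdp_prog(ifindex, &xsks_map_fd);

  /* unprivileged (CAP_NET_RAW) process: register its AF_XDP socket in the map */
  err = xsk_socket__update_xskmap(xsk, xsks_map_fd);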

Signed-off-by: Mariusz Dudek <mariuszx.dudek@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20201203090546.11976-2-mariuszx.dudek@intel.com
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
a741bc6479 libbpf: Add base BTF accessor
Add ability to get base BTF. It can be also used to check if BTF is split BTF.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201202065244.530571-3-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Magnus Karlsson
65e4be6f5d libbpf: Replace size_t with __u32 in xsk interfaces
Replace size_t with __u32 in the xsk interfaces that contain this.
There is no reason to have size_t since the internal variable that
is manipulated is a __u32. The following APIs are affected:

__u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx)
void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb)
__u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx)
void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb)
void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb)

The "nb" variable and the return values have been changed from size_t
to __u32.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1606383455-8243-1-git-send-email-magnus.karlsson@gmail.com
2020-12-04 20:04:52 -08:00
KP Singh
3a2739aa8a bpf: Add a BPF helper for getting the IMA hash of an inode
Provide a wrapper function to get the IMA hash of an inode. This helper
is useful in fingerprinting files (e.g executables on execution) and
using these fingerprints in detections like an executable unlinking
itself.

Since the ima_inode_hash can sleep, it's only allowed for sleepable
LSM hooks.
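
A minimal sketch in a sleepable LSM program (the hook choice is illustrative;
the usual vmlinux.h/bpf_helpers.h/bpf_tracing.h includes are assumed):

  SEC("lsm.s/bprm_committed_creds")
  int BPF_PROG(measure_exec, struct linux_binprm *bprm)
  {
          u8 digest[32];

          /* only valid in sleepable hooks, since ima_inode_hash() may sleep */
          bpf_ima_inode_hash(bprm->file->f_inode, digest, sizeof(digest));
          return 0;
  }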

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201124151210.1081188-3-kpsingh@chromium.org
2020-12-04 20:04:52 -08:00
Li RongQing
dd2369d2a8 libbpf: Add support for canceling cached_cons advance
Add a new function for returning descriptors the user received
after an xsk_ring_cons__peek call. After the application has
gotten a number of descriptors from a ring, it might not be able
to or want to process them all for various reasons. Therefore,
it would be useful to have an interface for returning or
cancelling a number of them so that they are returned to the ring.

This patch adds a new function called xsk_ring_cons__cancel that
performs this operation on nb descriptors counted from the end of
the batch of descriptors that was received through the peek call.
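
A sketch of the intended usage (batch size and the process_descs() bookkeeping
are hypothetical application logic):

  __u32 idx;
  /* rx is the struct xsk_ring_cons set up at socket creation time */
  __u32 rcvd = xsk_ring_cons__peek(&rx, 64, &idx);
  __u32 done = process_descs(&rx, idx, rcvd);  /* hypothetical application logic */

  if (done < rcvd)
          /* hand the unprocessed tail of the batch back to the ring */
          xsk_ring_cons__cancel(&rx, rcvd - done);
  xsk_ring_cons__release(&rx, done);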

Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ Magnus Karlsson: rewrote changelog ]
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/1606202474-8119-1-git-send-email-lirongqing@baidu.com
2020-12-04 20:04:52 -08:00
Dmitrii Banshchikov
39f5b2e75e bpf: Add bpf_ktime_get_coarse_ns helper
The helper uses CLOCK_MONOTONIC_COARSE source of time that is less
accurate but more performant.

We have a BPF CGROUP_SKB firewall that supports event logging through
bpf_perf_event_output(). Each event has a timestamp and currently we use
bpf_ktime_get_ns() for it. Use of bpf_ktime_get_coarse_ns() saves ~15-20
ns in time required for event logging.

bpf_ktime_get_ns():
EgressLogByRemoteEndpoint                              113.82ns    8.79M

bpf_ktime_get_coarse_ns():
EgressLogByRemoteEndpoint                               95.40ns   10.48M

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201117184549.257280-1-me@ubique.spb.ru
2020-12-04 20:04:52 -08:00
KP Singh
6969a44914 bpf: Add bpf_bprm_opts_set helper
The helper allows modification of certain bits on the linux_binprm
struct starting with the secureexec bit which can be updated using the
BPF_F_BPRM_SECUREEXEC flag.

secureexec can be set by the LSM for privilege gaining executions to set
the AT_SECURE auxv for glibc.  When set, the dynamic linker disables the
use of certain environment variables (like LD_PRELOAD).
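
A minimal sketch of the helper from a BPF LSM program (the hook choice and the
should_force_secure() policy check are hypothetical):

  SEC("lsm/bprm_creds_for_exec")
  int BPF_PROG(secure_exec, struct linux_binprm *bprm)
  {
          if (should_force_secure(bprm))  /* hypothetical policy check */
                  bpf_bprm_opts_set(bprm, BPF_F_BPRM_SECUREEXEC);
          return 0;
  }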

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201117232929.2156341-1-kpsingh@chromium.org
2020-12-04 20:04:52 -08:00
Alan Maguire
de2edae80d libbpf: bpf__find_by_name[_kind] should use btf__get_nr_types()
When operating on split BTF, btf__find_by_name[_kind] will not
iterate over all types since they use btf->nr_types to show
the number of types to iterate over. For split BTF this is
the number of types _on top of base BTF_, so it will
underestimate the number of types to iterate over, especially
for vmlinux + module BTF, where the latter is much smaller.

Use btf__get_nr_types() instead.

Fixes: ba451366bf44 ("libbpf: Implement basic split BTF support")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1605437195-2175-1-git-send-email-alan.maguire@oracle.com
2020-12-04 20:04:52 -08:00
Thomas Karlsson
2ea4ba9c96 macvlan: Support for high multicast packet rate
Background:
Broadcast and multicast packets are enqueued for later processing.
This queue was previously hardcoded to 1000.

This proved insufficient for handling very high packet rates,
resulting in packet drops for multicast while unicast worked
fine at the same time.

The change:
This patch makes the queue length adjustable to accommodate
environments with very high multicast packet rates, but still
keeps the default value of 1000 unless specified.

The queue length is specified as a request per macvlan
using the IFLA_MACVLAN_BC_QUEUE_LEN parameter.

The actual used queue length will then be the maximum of
any macvlan connected to the same port. The actual used
queue length for the port can be retrieved (read only)
by the IFLA_MACVLAN_BC_QUEUE_LEN_USED parameter for verification.

This will be followed up by a patch to iproute2
in order to adjust the parameter from userspace.

Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
Link: https://lore.kernel.org/r/dd4673b2-7eab-edda-6815-85c67ce87f63@paneda.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
2dd5965052 libbpf: Don't attempt to load unused subprog as an entry-point BPF program
If BPF code contains an unused BPF subprogram and there are no other subprogram
calls (which can realistically happen in real-world applications given
sufficiently smart Clang code optimizations), libbpf will erroneously assume
that subprograms are entry-point programs and will attempt to load them with
UNSPEC program type.

Fix by not relying on subcall instructions and rather detect it based on the
structure of BPF object's sections.

Fixes: 9a94f277c4fb ("tools: libbpf: restore the ability to load programs from .text section")
Reported-by: Dmitrii Banshchikov <dbanschikov@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201107000251.256821-1-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
ef8820fea8 bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO
Allocate ID for vmlinux BTF. This makes it visible when iterating over all BTF
objects in the system. To allow distinguishing vmlinux BTF (and later kernel
module BTF) from user-provided BTFs, expose extra kernel_btf flag, as well as
BTF name ("vmlinux" for vmlinux BTF, will equal to module's name for module
BTF).  We might want to later allow specifying BTF name for user-provided BTFs
as well, if that makes sense. But currently this is reserved only for
in-kernel BTFs.

Exposing IDs for in-kernel BTFs will allow extending BPF APIs that require an
in-kernel BTF type with the ability to specify BTF types from kernel modules,
not just vmlinux BTF. This will be implemented in a follow-up patch set for
fentry/fexit/fmod_ret/lsm/etc.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201110011932.3201430-3-andrii@kernel.org
2020-12-04 20:04:52 -08:00
KP Singh
eae38a781c bpf: Implement get_current_task_btf and RET_PTR_TO_BTF_ID
The currently available bpf_get_current_task returns an unsigned integer
which can be used along with BPF_CORE_READ to read data from
the task_struct but still cannot be used as an input argument to a
helper that accepts an ARG_PTR_TO_BTF_ID of type task_struct.

In order to implement this helper a new return type, RET_PTR_TO_BTF_ID,
is added. This is similar to RET_PTR_TO_BTF_ID_OR_NULL but does not
require checking the nullness of returned pointer.
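
In effect (the field use is illustrative):

  /* returns a trusted BTF pointer; fields can be dereferenced directly and the
   * pointer can be passed to helpers expecting ARG_PTR_TO_BTF_ID of task_struct */
  struct task_struct *task = bpf_get_current_task_btf();
  int pid = task->pid;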

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201106103747.2780972-6-kpsingh@chromium.org
2020-12-04 20:04:52 -08:00
KP Singh
83c2c20acb libbpf: Add support for task local storage
Updates the bpf_probe_map_type API to also support
BPF_MAP_TYPE_TASK_STORAGE similar to other local storage maps.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201106103747.2780972-4-kpsingh@chromium.org
2020-12-04 20:04:52 -08:00
KP Singh
00ae5bac8f bpf: Implement task local storage
Similar to bpf_local_storage for sockets and inodes add local storage
for task_struct.

The life-cycle of storage is managed with the life-cycle of the
task_struct.  i.e. the storage is destroyed along with the owning task
with a callback to the bpf_task_storage_free from the task_free LSM
hook.

The BPF LSM allocates an __rcu pointer to the bpf_local_storage in
the security blob which are now stackable and can co-exist with other
LSMs.

The userspace map operations can be done by using a pid fd as a key
passed to the lookup, update and delete operations.
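
A minimal BPF-side sketch (map name and value type are hypothetical):

  struct {
          __uint(type, BPF_MAP_TYPE_TASK_STORAGE);
          __uint(map_flags, BPF_F_NO_PREALLOC);
          __type(key, int);
          __type(value, __u64);
  } exec_count SEC(".maps");

  /* inside a tracing/LSM program */
  struct task_struct *task = bpf_get_current_task_btf();
  __u64 *cnt = bpf_task_storage_get(&exec_count, task, 0,
                                    BPF_LOCAL_STORAGE_GET_F_CREATE);
  if (cnt)
          (*cnt)++;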

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201106103747.2780972-3-kpsingh@chromium.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
f99c252cbc vmtest: update Kconfig to accommodate IMA test config
test_progs's IMA selftests require extra Kconfig values, so update
latest.config to accommodate those.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-03 12:27:49 -08:00
Andrii Nakryiko
5ae2a2621c readme: move gory sync details down and add libbpf-bootstrap references
Move gory details about libbpf mirror and sync into a
separate section at the bottom of README.

Also add references to libbpf-bootstrap and blog about it,
as well as libbpf-tools reference.
2020-11-29 13:34:03 -08:00
Andrii Nakryiko
5af3d86b5a vmtests: blacklist two more tests on 5.5
tcpbpf_user uses cgroup bpf_link, not available in 5.5. hash_large_key is
testing a more permissive verifier check, implemented in 5.11. So blacklist
both.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
c55abf0752 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   3cb12d27ff655e57e8efe3486dca2a22f4e30578
Checkpoint bpf-next commit: c6bde958a62b8ca5ee8d2c1fe429aec4ad54efad
Baseline bpf commit:        c66dca98a24cb5f3493dd08d40bcfa94a220fa92
Checkpoint bpf commit:      d3bec0138bfbe58606fc1d6f57a4cdc1a20218db

Andrii Nakryiko (6):
  libbpf: Factor out common operations in BTF writing APIs
  libbpf: Unify and speed up BTF string deduplication
  libbpf: Implement basic split BTF support
  libbpf: Fix BTF data layout checks and allow empty BTF
  libbpf: Support BTF dedup of split BTFs
  libbpf: Accomodate DWARF/compiler bug with duplicated identical arrays

Ian Rogers (1):
  libbpf, hashmap: Fix undefined behavior in hash_bits

Magnus Karlsson (2):
  libbpf: Fix null dereference in xsk_socket__delete
  libbpf: Fix possible use after free in xsk_socket__delete

 src/btf.c      | 807 +++++++++++++++++++++++++++++--------------------
 src/btf.h      |   8 +
 src/hashmap.h  |  15 +-
 src/libbpf.map |   9 +
 src/xsk.c      |   9 +-
 5 files changed, 504 insertions(+), 344 deletions(-)

--
2.24.1
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
e30f758aab sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-11-05 21:20:45 -08:00
Magnus Karlsson
8caff995c7 libbpf: Fix possible use after free in xsk_socket__delete
Fix a possible use after free in xsk_socket__delete that will happen
if xsk_put_ctx() frees the ctx. To fix, save the umem reference taken
from the context and just use that instead.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1604396490-12129-3-git-send-email-magnus.karlsson@gmail.com
2020-11-05 21:20:45 -08:00
Magnus Karlsson
539aa6bea5 libbpf: Fix null dereference in xsk_socket__delete
Fix a possible null pointer dereference in xsk_socket__delete that
will occur if a null pointer is fed into the function.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1604396490-12129-2-git-send-email-magnus.karlsson@gmail.com
2020-11-05 21:20:45 -08:00
Ian Rogers
224db2db07 libbpf, hashmap: Fix undefined behavior in hash_bits
If bits is 0 (the case when the map is empty), then the >> is by the size of
the register, which is undefined behavior - on x86 it is the same as a
shift by 0.

Fix by handling the 0 case explicitly and guarding calls to hash_bits for
empty maps in hashmap__for_each_key_entry and hashmap__for_each_entry_safe.

Fixes: e3b924224028 ("libbpf: add resizable non-thread safe internal hashmap")
Suggested-by: Andrii Nakryiko <andriin@fb.com>,
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201029223707.494059-1-irogers@google.com
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
e6725d2467 libbpf: Accomodate DWARF/compiler bug with duplicated identical arrays
In some cases compiler seems to generate distinct DWARF types for identical
arrays within the same CU. That seems like a bug, but it's already out there
and breaks type graph equivalence checks, so accommodate it anyway by checking
for identical arrays, regardless of their type ID.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-10-andrii@kernel.org
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
658ac1ec19 libbpf: Support BTF dedup of split BTFs
Add support for deduplicating split BTFs. When deduplicating split BTF, base
BTF is considered to be immutable and can't be modified or adjusted. 99% of
BTF deduplication logic is left intact (modulo some type numbering adjustments).
There are only two differences.

First, each type in base BTF gets hashed (except VAR and DATASEC, of course;
those are always considered to be self-canonical instances) and added into
a table of canonical candidates. Hashing is a shallow, fast operation,
so it mostly eliminates the overhead of having the entire base BTF be a part of
BTF dedup.

The second difference is critical and subtle. While deduplicating split BTF
types, it is possible to discover that one of the immutable base BTF BTF_KIND_FWD
types can and should be resolved to a full STRUCT/UNION type from the split
BTF part. This, obviously, can't happen because we can't modify the base
BTF types anymore. Because of that, any type in split BTF that directly or
indirectly references that newly-to-be-resolved FWD type can't be considered
equivalent to the corresponding canonical types in base BTF, because
that would result in a loss of type resolution information. So in such a case,
split BTF types will be deduplicated separately and will cause some
duplication of type information, which is unavoidable.

With those two changes, the rest of the algorithm manages to deduplicate split
BTF correctly, pointing all the duplicates to their canonical counter-parts in
base BTF, but also is deduplicating whatever unique types are present in split
BTF on their own.

Also, theoretically, split BTF after deduplication could end up with either
empty type section or empty string section. This is handled by libbpf
correctly in one of previous patches in the series.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-9-andrii@kernel.org
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
dd36215834 libbpf: Fix BTF data layout checks and allow empty BTF
Make data section layout checks stricter, disallowing overlap of types and
strings data.

Additionally, allow BTFs with no type data. There is nothing inherently wrong
with having BTF with no types (but potentially with some strings). This could
be a situation with kernel module BTFs, if a module doesn't introduce any new
type information.

Also fix invalid offset alignment check for btf->hdr->type_off.

Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-8-andrii@kernel.org
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
2811d54f8b libbpf: Implement basic split BTF support
Support split BTF operation, in which one BTF (base BTF) provides basic set of
types and strings, while another one (split BTF) builds on top of base's types
and strings and adds its own new types and strings. From API standpoint, the
fact that the split BTF is built on top of the base BTF is transparent.

Type numeration is transparent. If the base BTF had last type ID #N, then all
types in the split BTF start at type ID N+1. Any type in split BTF can
reference base BTF types, but not vice versa. Programmatic construction of
a split BTF on top of a base BTF is supported: one can create an empty split
BTF with btf__new_empty_split() and pass base BTF as an input, or pass raw
binary data to btf__new_split(), or use btf__parse_xxx_split() variants to get
initial set of split types/strings from the ELF file with .BTF section.
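
For example, the new API surface can be exercised roughly like this (the
"module.btf" path is hypothetical):

  /* base BTF, e.g. the vmlinux BTF exposed by the kernel */
  struct btf *base = btf__parse("/sys/kernel/btf/vmlinux", NULL);

  /* split BTF whose types and strings build on top of the base */
  struct btf *split = btf__parse_split("module.btf", base);

  /* or start from an empty split BTF and add types programmatically */
  struct btf *empty_split = btf__new_empty_split(base);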

String offsets are similarly transparent and are a logical continuation of
base BTF's strings. When building BTF programmatically and adding a new string
(explicitly with btf__add_str() or implicitly through appending new
types/members), string-to-be-added would first be looked up from the base
BTF's string section and re-used if it's there. If not, it will be looked up
and/or added to the split BTF string section. Similarly to type IDs, types in
split BTF can refer to strings from base BTF absolutely transparently (but not
vice versa, of course, because base BTF doesn't "know" about existence of
split BTF).

Internal type index is slightly adjusted to be zero-indexed, ignoring a fake
[0] VOID type. This allows to handle split/base BTF type lookups transparently
by using btf->start_id type ID offset, which is always 1 for base/non-split
BTF and equals btf__get_nr_types(base_btf) + 1 for the split BTF.

BTF deduplication is not yet supported for split BTF and support for it will
be added in separate patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-5-andrii@kernel.org
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
be2dc73ee2 libbpf: Unify and speed up BTF string deduplication
Revamp BTF dedup's string deduplication to match the approach of writable BTF
string management. This allows transferring the deduplicated string index back to
the BTF object after deduplication without expensive extra memory copying and hash
map re-construction. It also simplifies the code and speeds it up, because
hashmap-based string deduplication is faster than sort + unique approach.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-4-andrii@kernel.org
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
4953827790 libbpf: Factor out common operations in BTF writing APIs
Factor out committing of appended type data. Also extract fetching the very
last type in the BTF (to append members to). These two operations are common
across many APIs and will be easier to refactor with split BTF, if they are
extracted into a single place.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-2-andrii@kernel.org
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
d1fd50d475 helpers: add struct bpf_redir_neigh forward declaration
This avoids a compilation warning if `struct bpf_redir_neigh` is not provided by
other kernel headers.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-10-28 09:59:37 -07:00
Andrii Nakryiko
f0c6b6bdfb sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   376dcfe3a4e5a5475a84e6b5f926066a8614f887
Checkpoint bpf-next commit: 3cb12d27ff655e57e8efe3486dca2a22f4e30578
Baseline bpf commit:        28802e7c0c9954218d1830f7507edc9d49b03a00
Checkpoint bpf commit:      c66dca98a24cb5f3493dd08d40bcfa94a220fa92

Daniel Borkmann (1):
  bpf, libbpf: Guard bpf inline asm from bpf_tail_call_static

Toke Høiland-Jørgensen (1):
  bpf: Fix bpf_redirect_neigh helper api to support supplying nexthop

 include/uapi/linux/bpf.h | 22 ++++++++++++++++++----
 src/bpf_helpers.h        |  2 ++
 2 files changed, 20 insertions(+), 4 deletions(-)

--
2.24.1
2020-10-28 09:08:35 -07:00
Andrii Nakryiko
475ee87969 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-10-28 09:08:35 -07:00
Daniel Borkmann
f754860e35 bpf, libbpf: Guard bpf inline asm from bpf_tail_call_static
Yaniv reported a compilation error after pulling latest libbpf:

  [...]
  ../libbpf/src/root/usr/include/bpf/bpf_helpers.h:99:10: error:
  unknown register name 'r0' in asm
                     : "r0", "r1", "r2", "r3", "r4", "r5");
  [...]

The issue got triggered given Yaniv was compiling tracing programs with native
target (e.g. x86) instead of BPF target, hence no BTF generated vmlinux.h nor
CO-RE used, and later llc with -march=bpf was invoked to compile from LLVM IR
to BPF object file. Given that clang was expecting x86 inline asm and not BPF
one the error complained that these regs don't exist on the former.

Guard bpf_tail_call_static() with defined(__bpf__) where BPF inline asm is valid
to use. BPF tracing programs on more modern kernels use BPF target anyway and
thus the bpf_tail_call_static() function will be available for them. BPF inline
asm is supported since clang 7 (clang <= 6 otherwise throws same above error),
and __bpf_unreachable() since clang 8, therefore include the latter condition
in order to prevent compilation errors for older clang versions. Given even an
old Ubuntu 18.04 LTS has official LLVM packages all the way up to llvm-10, I did
not bother to special case the __bpf_unreachable() inside bpf_tail_call_static()
further.

Also, undo the sockex3_kern's use of bpf_tail_call_static() sample given they
still have the old hacky way to even compile networking progs with native instead
of BPF target so bpf_tail_call_static() won't be defined there anymore.

Fixes: 0e9f6841f664 ("bpf, libbpf: Add bpf_tail_call_static helper for bpf programs")
Reported-by: Yaniv Agman <yanivagman@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Tested-by: Yaniv Agman <yanivagman@gmail.com>
Link: https://lore.kernel.org/bpf/CAMy7=ZUk08w5Gc2Z-EKi4JFtuUCaZYmE4yzhJjrExXpYKR4L8w@mail.gmail.com
Link: https://lore.kernel.org/bpf/20201021203257.26223-1-daniel@iogearbox.net
2020-10-28 09:08:35 -07:00
Toke Høiland-Jørgensen
78d61150e9 bpf: Fix bpf_redirect_neigh helper api to support supplying nexthop
Based on the discussion in [0], update the bpf_redirect_neigh() helper to
accept an optional parameter specifying the nexthop information. This makes
it possible to combine bpf_fib_lookup() and bpf_redirect_neigh() without
incurring a duplicate FIB lookup - since the FIB lookup helper will return
the nexthop information even if no neighbour is present, this can simply
be passed on to bpf_redirect_neigh() if bpf_fib_lookup() returns
BPF_FIB_LKUP_RET_NO_NEIGH. Thus fix & extend it before helper API is frozen.

  [0] https://lore.kernel.org/bpf/393e17fc-d187-3a8d-2f0d-a627c7c63fca@iogearbox.net/
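
Schematically, combining the two helpers in a tc BPF program might look like
this (IPv6 case shown; filling in the fib params from the packet is elided and
skb is the program's __sk_buff context):

  struct bpf_fib_lookup fib = {};
  /* ... fill fib.family, addresses, ifindex from the packet ... */
  int rc = bpf_fib_lookup(skb, &fib, sizeof(fib), 0);
  if (rc == BPF_FIB_LKUP_RET_NO_NEIGH) {
          struct bpf_redir_neigh nh = {};

          nh.nh_family = fib.family;
          __builtin_memcpy(nh.ipv6_nh, fib.ipv6_dst, sizeof(nh.ipv6_nh));
          /* pass the nexthop from the FIB lookup, avoiding a second lookup */
          return bpf_redirect_neigh(fib.ifindex, &nh, sizeof(nh), 0);
  }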

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/bpf/160322915615.32199.1187570224032024535.stgit@toke.dk
2020-10-28 09:08:35 -07:00
Andrii Nakryiko
49280406a2 readme: add Ubuntu mentions
Ubuntu 20.10 is now a good version to do BPF + CO-RE development.
2020-10-26 21:16:14 -07:00
Andrii Nakryiko
de58d0cccf sync: update 5.5.0 blacklist
Blacklist 2 new selftests, which depend on 5.10 kernel.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-10-12 14:27:04 -07:00
Andrii Nakryiko
6fa81d4dbe sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f4d385e4d51d035c7f0d68a3e9564c9453c13aa4
Checkpoint bpf-next commit: 376dcfe3a4e5a5475a84e6b5f926066a8614f887
Baseline bpf commit:        9cf51446e68607136e42a4e531a30c888c472463
Checkpoint bpf commit:      28802e7c0c9954218d1830f7507edc9d49b03a00

Andrii Nakryiko (3):
  libbpf: Skip CO-RE relocations for not loaded BPF programs
  libbpf: Support safe subset of load/store instruction resizing with
    CO-RE
  libbpf: Allow specifying both ELF and raw BTF for CO-RE BTF override

Daniel Borkmann (3):
  bpf: Improve bpf_redirect_neigh helper description
  bpf: Add redirect_peer helper
  bpf: Allow for map-in-map with dynamic inner array map entries

Hangbin Liu (2):
  libbpf: Close map fd if init map slots failed
  libbpf: Check if pin_path was set even map fd exist

Hao Luo (4):
  bpf: Introduce pseudo_btf_id
  bpf/libbpf: BTF support for typed ksyms
  bpf: Introduce bpf_per_cpu_ptr()
  bpf: Introducte bpf_this_cpu_ptr()

Jakub Wilk (1):
  bpf: Fix typo in uapi/linux/bpf.h

Luigi Rizzo (1):
  bpf, libbpf: Use valid btf in bpf_program__set_attach_target

Magnus Karlsson (1):
  libbpf: Fix compatibility problem in xsk_socket__create

Nikita V. Shirokov (1):
  bpf: Add tcp_notsent_lowat bpf setsockopt

Song Liu (1):
  bpf: Introduce BPF_F_PRESERVE_ELEMS for perf event array

 include/uapi/linux/bpf.h | 104 ++++++++++--
 src/libbpf.c             | 348 ++++++++++++++++++++++++++++++++-------
 src/xsk.c                |   7 +-
 3 files changed, 385 insertions(+), 74 deletions(-)

--
2.24.1
2020-10-12 14:27:04 -07:00
Andrii Nakryiko
bc94c2b82f sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-10-12 14:27:04 -07:00
Daniel Borkmann
d47094a2ce bpf: Allow for map-in-map with dynamic inner array map entries
Recent work in f4d05259213f ("bpf: Add map_meta_equal map ops") and 134fede4eecf
("bpf: Relax max_entries check for most of the inner map types") added support
for dynamic inner max elements for most map-in-map types. Exceptions were maps
like array or prog array where the map_gen_lookup() callback uses the maps'
max_entries field as a constant when emitting instructions.

We recently implemented Maglev consistent hashing into Cilium's load balancer
which uses map-in-map with an outer map being hash and inner being array holding
the Maglev backend table for each service. This has been designed this way in
order to reduce overall memory consumption given the outer hash map allows to
avoid preallocating a large, flat memory area for all services. Also, the
number of service mappings is not always known a-priori.

The use case for dynamic inner array map entries is to further reduce memory
overhead, for example, some services might just have a small number of back
ends while others could have a large number. Right now the Maglev backend table
for small and large number of backends would need to have the same inner array
map entries which adds a lot of unneeded overhead.

Dynamic inner array map entries can be realized by avoiding the inlined code
generation for their lookup. The lookup will still be efficient since it will
be calling into array_map_lookup_elem() directly and thus avoiding retpoline.
The patch adds a BPF_F_INNER_MAP flag to map creation which therefore skips
inline code generation and relaxes array_map_meta_equal() check to ignore both
maps' max_entries. This also still allows to have faster lookups for map-in-map
when BPF_F_INNER_MAP is not specified and hence dynamic max_entries not needed.

Example code generation where inner map is dynamic sized array:

  # bpftool p d x i 125
  int handle__sys_enter(void * ctx):
  ; int handle__sys_enter(void *ctx)
     0: (b4) w1 = 0
  ; int key = 0;
     1: (63) *(u32 *)(r10 -4) = r1
     2: (bf) r2 = r10
  ;
     3: (07) r2 += -4
  ; inner_map = bpf_map_lookup_elem(&outer_arr_dyn, &key);
     4: (18) r1 = map[id:468]
     6: (07) r1 += 272
     7: (61) r0 = *(u32 *)(r2 +0)
     8: (35) if r0 >= 0x3 goto pc+5
     9: (67) r0 <<= 3
    10: (0f) r0 += r1
    11: (79) r0 = *(u64 *)(r0 +0)
    12: (15) if r0 == 0x0 goto pc+1
    13: (05) goto pc+1
    14: (b7) r0 = 0
    15: (b4) w6 = -1
  ; if (!inner_map)
    16: (15) if r0 == 0x0 goto pc+6
    17: (bf) r2 = r10
  ;
    18: (07) r2 += -4
  ; val = bpf_map_lookup_elem(inner_map, &key);
    19: (bf) r1 = r0                               | No inlining but instead
    20: (85) call array_map_lookup_elem#149280     | call to array_map_lookup_elem()
  ; return val ? *val : -1;                        | for inner array lookup.
    21: (15) if r0 == 0x0 goto pc+1
  ; return val ? *val : -1;
    22: (61) r6 = *(u32 *)(r0 +0)
  ; }
    23: (bc) w0 = w6
    24: (95) exit

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201010234006.7075-4-daniel@iogearbox.net
2020-10-12 14:27:04 -07:00
Daniel Borkmann
4672fb6790 bpf: Add redirect_peer helper
Add an efficient ingress to ingress netns switch that can be used out of tc BPF
programs in order to redirect traffic from host ns ingress into a container
veth device ingress without having to go via CPU backlog queue [0]. For local
containers this can also be utilized, and the path via the CPU backlog queue only
needs to be taken once, not twice. On a high level this borrows from ipvlan, which does
similar switch in __netif_receive_skb_core() and then iterates via another_round.
This helps to reduce latency for mentioned use cases.

Pod to remote pod with redirect(), TCP_RR [1]:

  # percpu_netperf 10.217.1.33
          RT_LATENCY:         122.450         (per CPU:         122.666         122.401         122.333         122.401 )
        MEAN_LATENCY:         121.210         (per CPU:         121.100         121.260         121.320         121.160 )
      STDDEV_LATENCY:         120.040         (per CPU:         119.420         119.910         125.460         115.370 )
         MIN_LATENCY:          46.500         (per CPU:          47.000          47.000          47.000          45.000 )
         P50_LATENCY:         118.500         (per CPU:         118.000         119.000         118.000         119.000 )
         P90_LATENCY:         127.500         (per CPU:         127.000         128.000         127.000         128.000 )
         P99_LATENCY:         130.750         (per CPU:         131.000         131.000         129.000         132.000 )

    TRANSACTION_RATE:       32666.400         (per CPU:        8152.200        8169.842        8174.439        8169.897 )

Pod to remote pod with redirect_peer(), TCP_RR:

  # percpu_netperf 10.217.1.33
          RT_LATENCY:          44.449         (per CPU:          43.767          43.127          45.279          45.622 )
        MEAN_LATENCY:          45.065         (per CPU:          44.030          45.530          45.190          45.510 )
      STDDEV_LATENCY:          84.823         (per CPU:          66.770          97.290          84.380          90.850 )
         MIN_LATENCY:          33.500         (per CPU:          33.000          33.000          34.000          34.000 )
         P50_LATENCY:          43.250         (per CPU:          43.000          43.000          43.000          44.000 )
         P90_LATENCY:          46.750         (per CPU:          46.000          47.000          47.000          47.000 )
         P99_LATENCY:          52.750         (per CPU:          51.000          54.000          53.000          53.000 )

    TRANSACTION_RATE:       90039.500         (per CPU:       22848.186       23187.089       22085.077       21919.130 )

  [0] https://linuxplumbersconf.org/event/7/contributions/674/attachments/568/1002/plumbers_2020_cilium_load_balancer.pdf
  [1] https://github.com/borkmann/netperf_scripts/blob/master/percpu_netperf
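
At the call site, the new helper mirrors bpf_redirect() (target_ifindex is a
hypothetical, application-resolved value):

  /* tc ingress on the host-side veth: switch straight into the peer (container)
   * device's ingress, skipping the CPU backlog queue */
  return bpf_redirect_peer(target_ifindex, 0);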

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201010234006.7075-3-daniel@iogearbox.net
2020-10-12 14:27:04 -07:00
Daniel Borkmann
a8a505a36f bpf: Improve bpf_redirect_neigh helper description
Follow-up to address David's feedback that we should better describe internals
of the bpf_redirect_neigh() helper.

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Link: https://lore.kernel.org/bpf/20201010234006.7075-2-daniel@iogearbox.net
2020-10-12 14:27:04 -07:00
Nikita V. Shirokov
e3b9cf7aaa bpf: Add tcp_notsent_lowat bpf setsockopt
Adding support for TCP_NOTSENT_LOWAT sockoption (https://lwn.net/Articles/560082/)
in tcp bpf programs.
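
From a sockops program this is just another bpf_setsockopt() option (skops is
the program's struct bpf_sock_ops context; the value is illustrative):

  int lowat = 128 * 1024;  /* illustrative threshold */

  bpf_setsockopt(skops, SOL_TCP, TCP_NOTSENT_LOWAT, &lowat, sizeof(lowat));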

Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201009070325.226855-1-tehnerd@tehnerd.com
2020-10-12 14:27:04 -07:00
Andrii Nakryiko
76764b891b libbpf: Allow specifying both ELF and raw BTF for CO-RE BTF override
Use generalized BTF parsing logic, making it possible to parse BTF both from
ELF file, as well as a raw BTF dump. This makes it easier to write custom
tests with manually generated BTFs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201008001025.292064-4-andrii@kernel.org
2020-10-12 14:27:04 -07:00
Andrii Nakryiko
8ef6a6e709 libbpf: Support safe subset of load/store instruction resizing with CO-RE
Add support for patching instructions of the following form:
  - rX = *(T *)(rY + <off>);
  - *(T *)(rX + <off>) = rY;
  - *(T *)(rX + <off>) = <imm>, where T is one of {u8, u16, u32, u64}.

For such instructions, if the actual kernel field recorded in CO-RE relocation
has a different size than the one recorded locally (e.g., from vmlinux.h),
then libbpf will adjust T to an appropriate 1-, 2-, 4-, or 8-byte load.

In general, such transformation is not always correct and could lead to
invalid final value being loaded or stored. But two classes of cases are
always safe:
  - if both local and target (kernel) types are unsigned integers, but of
  different sizes, then it's OK to adjust load/store instruction according to
  the necessary memory size. Zero-extending nature of such instructions and
  unsignedness make sure that the final value is always correct;
  - pointer size mismatch between BPF target architecture (which is always
  64-bit) and 32-bit host kernel architecture can be similarly resolved
  automatically, because pointer is essentially an unsigned integer. Loading
  32-bit pointer into 64-bit BPF register with zero extension will leave
  correct pointer in the register.

Both cases are necessary to support CO-RE on 32-bit kernels, as `unsigned
long` in vmlinux.h generated from 32-bit kernel is 32-bit, but when compiled
with BPF program for BPF target it will be treated by compiler as 64-bit
integer. Similarly, pointers in vmlinux.h are 32-bit for kernel, but treated
as 64-bit values by compiler for BPF target. Both problems are now resolved by
libbpf for direct memory reads.

But similar transformations are useful in general when kernel fields are
"resized" from, e.g., unsigned int to unsigned long (or vice versa).

Now, similar transformations for signed integers are not safe to perform as
they will result in incorrect sign extension of the value. If such situation
is detected, libbpf will emit helpful message and will poison the instruction.
Not failing immediately means that it's possible to guard the instruction
based on kernel version (or other conditions) and make sure it's not
reachable.

If there is a need to read signed integers that change sizes between different
kernels, it's possible to use BPF_CORE_READ_BITFIELD() macro, which works both
with bitfields and non-bitfield integers of any signedness and handles
sign-extension properly. Also, bpf_core_read() with proper size and/or use of
bpf_core_field_size() relocation could allow dealing with such complicated
situations explicitly, if not as conveniently as direct memory reads.

Selftests added in a separate patch in progs/test_core_autosize.c demonstrate
both direct memory and probed use cases.

BPF_CORE_READ() is not changed and it won't deal with such situations as
automatically as direct memory reads due to the signedness integer
limitations, which are much harder to detect and control with compiler macro
magic. So it's encouraged to utilize direct memory reads as much as possible.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201008001025.292064-3-andrii@kernel.org
2020-10-12 14:27:04 -07:00
Andrii Nakryiko
44d5bc1709 libbpf: Skip CO-RE relocations for not loaded BPF programs
Bypass the CO-RE relocation step for BPF programs that are not going to be
loaded. This allows BPF programs to be compiled in but disabled dynamically
if the kernel is not expected to provide enough relocation information. In that
case, there won't be unnecessary warnings about failed relocations.

Fixes: d929758101fc ("libbpf: Support disabling auto-loading BPF programs")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201008001025.292064-2-andrii@kernel.org
2020-10-12 14:27:04 -07:00
Magnus Karlsson
95848b59b9 libbpf: Fix compatibility problem in xsk_socket__create
Fix a compatibility problem when the old XDP_SHARED_UMEM mode is used
together with the xsk_socket__create() call. In the old XDP_SHARED_UMEM
mode, only sharing of the same device and queue id was allowed, and
in this mode, the fill ring and completion ring were shared between
the AF_XDP sockets.

Therefore, it was perfectly fine to call the xsk_socket__create() API
for each socket and not use the new xsk_socket__create_shared() API.
This behavior was ruined by the commit introducing XDP_SHARED_UMEM
support between different devices and/or queue ids. This patch restores
the ability to use xsk_socket__create in these circumstances so that
backward compatibility is not broken.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1602070946-11154-1-git-send-email-magnus.karlsson@gmail.com
2020-10-12 14:27:04 -07:00
Jakub Wilk
1bc08143b5 bpf: Fix typo in uapi/linux/bpf.h
Reported-by: Samanta Navarro <ferivoz@riseup.net>
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201007055717.7319-1-jwilk@jwilk.net
2020-10-12 14:27:04 -07:00
Luigi Rizzo
b9682e291d bpf, libbpf: Use valid btf in bpf_program__set_attach_target
bpf_program__set_attach_target(prog, fd, ...) will always fail when
fd = 0 (attach to a kernel symbol) because obj->btf_vmlinux is NULL
and there is no way to set it (at the moment btf_vmlinux is meant
to be temporary storage for use in bpf_object__load_xattr()).

Fix this by using libbpf_find_vmlinux_btf_id().

At some point we may want to opportunistically cache btf_vmlinux
so it can be reused with multiple programs.

Signed-off-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Petar Penkov <ppenkov@google.com>
Link: https://lore.kernel.org/bpf/20201005224528.389097-1-lrizzo@google.com
2020-10-12 14:27:04 -07:00
Hangbin Liu
54fe2f1e26 libbpf: Check if pin_path was set even map fd exist
Say a user reuses a map fd after creating a map manually and sets the
pin_path, then loads the object via libbpf.

In libbpf's bpf_object__create_maps(), bpf_object__reuse_map() will
return 0 if there is no pinned map at map->pin_path. Then, after
checking whether the map fd exists, we should also check whether
pin_path was set and do bpf_map__pin() instead of continuing the loop.

Fix it by creating the map if the fd doesn't exist and continuing to
check pin_path after that.

Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201006021345.3817033-3-liuhangbin@gmail.com
2020-10-12 14:27:04 -07:00
Hangbin Liu
fd28e0130a libbpf: Close map fd if init map slots failed
Previously we forgot to close the map fd if bpf_map_update_elem()
failed during map slot init, which leaks the map fd.

Let's move map slot initialization to a new function init_map_slots() to
simplify the code, and close the map fd if slot init fails.

Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201006021345.3817033-2-liuhangbin@gmail.com
2020-10-12 14:27:04 -07:00
Hao Luo
f908087023 bpf: Introduce bpf_this_cpu_ptr()
Add bpf_this_cpu_ptr() to help access percpu vars on this cpu. This
helper always returns a valid pointer, so there is no need to check the
returned value for NULL. Also note that all programs run with
preemption disabled, which means that the returned pointer is stable
throughout the execution of the program.
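
For illustration, a minimal sketch; 'runqueues' is a real kernel percpu
variable used here purely as an example of a typed percpu ksym, and the
attach point is illustrative:

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_tracing.h>

    extern const struct rq runqueues __ksym;

    __u64 last_clock;

    SEC("fentry/update_rq_clock")
    int BPF_PROG(on_update)
    {
        struct rq *rq = bpf_this_cpu_ptr(&runqueues);

        last_clock = rq->clock;   /* rq is guaranteed non-NULL */
        return 0;
    }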

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200929235049.2533242-6-haoluo@google.com
2020-10-12 14:27:04 -07:00
Hao Luo
b3b297aa16 bpf: Introduce bpf_per_cpu_ptr()
Add bpf_per_cpu_ptr() to help bpf programs access percpu vars.
bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the kernel
except that it may return NULL. This happens when the cpu parameter is
out of range. So the caller must check the returned value.
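
For illustration, a minimal sketch; 'runqueues' is a real kernel percpu
variable used purely as an example, and the CPU bound of 4 is arbitrary:

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>

    extern const struct rq runqueues __ksym;

    __u64 total_clock;

    SEC("raw_tp/sys_enter")
    int sum_clocks(void *ctx)
    {
        int cpu;

        for (cpu = 0; cpu < 4; cpu++) {
            struct rq *rq = bpf_per_cpu_ptr(&runqueues, cpu);

            if (!rq)                /* NULL when cpu is out of range */
                break;
            total_clock += rq->clock;
        }
        return 0;
    }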

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200929235049.2533242-5-haoluo@google.com
2020-10-12 14:27:04 -07:00
Hao Luo
6d0fcc3bd5 bpf/libbpf: BTF support for typed ksyms
If a ksym is defined with a type, libbpf will try to find the ksym's btf
information from kernel btf. If a valid btf entry for the ksym is found,
libbpf can pass the found btf id to the verifier, which validates the
ksym's type and value.

Typeless ksyms (i.e. those defined as 'void') will not have such a btf_id,
but they have the symbol's address (read from kallsyms) and their value is
treated as a raw pointer.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200929235049.2533242-3-haoluo@google.com
2020-10-12 14:27:04 -07:00
Hao Luo
3706bf773b bpf: Introduce pseudo_btf_id
Pseudo_btf_id is a type of ld_imm insn that associates a btf_id with a
ksym so that further dereferences of the ksym can use the BTF info
to validate accesses. Internally, when seeing a pseudo_btf_id ld insn,
the verifier reads the btf_id stored in the insn[0]'s imm field and
marks the dst_reg as PTR_TO_BTF_ID. The btf_id points to a VAR_KIND,
which is encoded in btf_vmlinux by pahole. If the VAR is not of a struct
type, the dst reg will be marked as PTR_TO_MEM instead of PTR_TO_BTF_ID
and the mem_size is resolved to the size of the VAR's type.

From the VAR btf_id, the verifier can also read the address of the
ksym's corresponding kernel var from kallsyms and use that to fill
dst_reg.

Therefore, the proper functionality of pseudo_btf_id depends on (1)
kallsyms and (2) the encoding of kernel global VARs in pahole, which
should be available since pahole v1.18.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200929235049.2533242-2-haoluo@google.com
2020-10-12 14:27:04 -07:00
Song Liu
09718f4ecd bpf: Introduce BPF_F_PRESERVE_ELEMS for perf event array
Currently, a perf event in a perf event array is removed from the array when
the map fd used to add the event is closed. This behavior makes it
difficult to share perf events with a perf event array.

Introduce a new flag, BPF_F_PRESERVE_ELEMS, that keeps the perf events open.
With this flag set, perf events in the array are not
removed when the original map fd is closed. Instead, a perf event will
stay in the map until 1) it is explicitly removed from the array; or 2)
the array is freed.
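
For illustration, a BTF-defined perf event array carrying this flag might look
like this (assumes <linux/bpf.h> and <bpf/bpf_helpers.h>; name and size are
illustrative):

    struct {
        __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
        __uint(max_entries, 64);
        __uint(key_size, sizeof(int));
        __uint(value_size, sizeof(int));
        __uint(map_flags, BPF_F_PRESERVE_ELEMS);
    } events SEC(".maps");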

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200930224927.1936644-2-songliubraving@fb.com
2020-10-12 14:27:04 -07:00
Andrii Nakryiko
8205f37a56 sync: ignore libc_compat.h
Libbpf doesn't rely on libc_compat.h anymore, so ignore it for the purposes of
syncing libbpf sources into Github.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-10-12 12:18:53 -07:00
Andrii Nakryiko
ecbd504994 makefile: add quiet mode support
Add quiet-by-default mode to Makefile, similar to libbpf Makefile in Linux
repo.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-10-11 00:39:03 -07:00
Andrii Nakryiko
b6dd2f2b7d vmtests: un-blacklist fixed selftests
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-09-30 18:19:55 -07:00
Andrii Nakryiko
a132697261 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b0efc216f577997bf563d76d51673ed79c3d5f71
Checkpoint bpf-next commit: f4d385e4d51d035c7f0d68a3e9564c9453c13aa4
Baseline bpf commit:        9cf51446e68607136e42a4e531a30c888c472463
Checkpoint bpf commit:      9cf51446e68607136e42a4e531a30c888c472463

Andrii Nakryiko (1):
  libbpf: Make btf_dump work with modifiable BTF

Daniel Borkmann (3):
  bpf: Add classid helper only based on skb->sk
  bpf: Add redirect_neigh helper as redirect drop-in
  bpf, libbpf: Add bpf_tail_call_static helper for bpf programs

 include/uapi/linux/bpf.h | 24 ++++++++++++++
 src/bpf_helpers.h        | 46 +++++++++++++++++++++++++++
 src/btf.c                | 17 ++++++++++
 src/btf_dump.c           | 69 +++++++++++++++++++++++++++-------------
 src/libbpf_internal.h    |  1 +
 5 files changed, 135 insertions(+), 22 deletions(-)

--
2.24.1
2020-09-30 18:19:55 -07:00
Andrii Nakryiko
2d0aa12ea3 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-09-30 18:19:55 -07:00
Andrii Nakryiko
317ef1c295 libbpf: Make btf_dump work with modifiable BTF
Ensure that btf_dump can accommodate new BTF types being appended to a BTF
instance after the struct btf_dump was created. This came up during an attempt
to use btf_dump for raw type dumping in selftests, but given the changes are
not excessive, it's good to not have any gotchas in API usage, so I decided to
support such a use case in general.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200929232843.1249318-2-andriin@fb.com
2020-09-30 18:19:55 -07:00
Daniel Borkmann
80c7838600 bpf, libbpf: Add bpf_tail_call_static helper for bpf programs
Port of tail_call_static() helper function from Cilium's BPF code base [0]
to libbpf, so others can easily consume it as well. We've been using this
in production code for some time now. The main idea is that we guarantee
that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
JITed BPF insns with direct jumps instead of having to fall back to using
expensive retpolines. By using inline asm, we guarantee that the compiler
won't merge the call from different paths with potentially different
content of r2/r3.

We're also using Cilium's __throw_build_bug() macro (here as: __bpf_unreachable())
in different places as a neat trick to trigger compilation errors when the
compiler does not remove code at compilation time. This works for the BPF
back end as it does not implement __builtin_trap().

  [0] f5537c2602
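
For illustration, a minimal sketch of using the helper from a tc program; the
map and program names are illustrative:

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    struct {
        __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
        __uint(max_entries, 1);
        __uint(key_size, sizeof(__u32));
        __uint(value_size, sizeof(__u32));
    } jmp_table SEC(".maps");

    SEC("classifier")
    int entry(struct __sk_buff *skb)
    {
        /* the slot must be a compile-time constant */
        bpf_tail_call_static(skb, &jmp_table, 0);
        return 0;   /* only reached if the tail call fails */
    }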

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/1656a082e077552eb46642d513b4a6bde9a7dd01.1601477936.git.daniel@iogearbox.net
2020-09-30 18:19:55 -07:00
Daniel Borkmann
750801a0d5 bpf: Add redirect_neigh helper as redirect drop-in
Add a redirect_neigh() helper as redirect() drop-in replacement
for the xmit side. Main idea for the helper is to be very similar
in semantics to the latter just that the skb gets injected into
the neighboring subsystem in order to let the stack do the work
it knows best anyway to populate the L2 addresses of the packet
and then hand over to dev_queue_xmit() as redirect() does.

This solves two bigger items: i) skbs don't need to go up to the
stack on the host facing veth ingress side for traffic egressing
the container to achieve the same for populating L2 which also
has the huge advantage that ii) the skb->sk won't get orphaned in
ip_rcv_core() when entering the IP routing layer on the host stack.

Given that skb->sk also does not get orphaned when crossing the netns,
as per 9c4c325252c5 ("skbuff: preserve sock reference when scrubbing
the skb."), the helper can then push the skbs directly to the phys
device where the FQ scheduler can do its work and the TCP stack gets
proper backpressure, given we hold on to skb->sk as long as the skb is
still residing in queues.

With the helper used in BPF data path to then push the skb to the
phys device, I observed a stable/consistent TCP_STREAM improvement
on veth devices for traffic going container -> host -> host ->
container from ~10Gbps to ~15Gbps for a single stream in my test
environment.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/bpf/f207de81629e1724899b73b8112e0013be782d35.1601477936.git.daniel@iogearbox.net
2020-09-30 18:19:55 -07:00
Daniel Borkmann
b5fd4c774d bpf: Add classid helper only based on skb->sk
Similarly to 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid
from v2 hooks"), add a helper to retrieve cgroup v1 classid solely
based on the skb->sk, so it can be used as key as part of BPF map
lookups out of tc from host ns, in particular given the skb->sk is
retained these days when crossing net ns thanks to 9c4c325252c5
("skbuff: preserve sock reference when scrubbing the skb."). This
is similar to bpf_skb_cgroup_id() which implements the same for v2.
Kubernetes ecosystem is still operating on v1 however, hence net_cls
needs to be used there until this can be dropped in with the v2
helper of bpf_skb_cgroup_id().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/ed633cf27a1c620e901c5aa99ebdefb028dce600.1601477936.git.daniel@iogearbox.net
2020-09-30 18:19:55 -07:00
Vladimír Čunát
5a10cd2060 remove internal reallocarray()
... as it's covered by libbpf_reallocarray() since commit dc70da9c70.
2020-09-30 12:55:50 -07:00
Andrii Nakryiko
ff797cc905 vmtests: blacklist new tests for 5.5
Blacklist new tests that depend on features in the latest kernel. Also
temporarily blacklist the raw_tp_test_run test, until it is fixed upstream.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
21ea184818 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2f7de9865ba3cbfcf8b504f07154fdb6124176a4
Checkpoint bpf-next commit: b0efc216f577997bf563d76d51673ed79c3d5f71
Baseline bpf commit:        87f92ac4c12758c4da3bbe4393f1d884b610b8a6
Checkpoint bpf commit:      9cf51446e68607136e42a4e531a30c888c472463

Alan Maguire (2):
  bpf: Add bpf_snprintf_btf helper
  bpf: Add bpf_seq_printf_btf helper

Andrii Nakryiko (11):
  libbpf: Refactor internals of BTF type index
  libbpf: Remove assumption of single contiguous memory for BTF data
  libbpf: Generalize common logic for managing dynamically-sized arrays
  libbpf: Extract generic string hashing function for reuse
  libbpf: Allow modification of BTF and add btf__add_str API
  libbpf: Add btf__new_empty() to create an empty BTF object
  libbpf: Add BTF writing APIs
  libbpf: Add btf__str_by_offset() as a more generic variant of
    name_by_offset
  selftests/bpf: Test BTF writing APIs
  libbpf: Support BTF loading and raw data output in both endianness
  libbpf: Fix uninitialized variable in btf_parse_type_sec

Martin KaFai Lau (4):
  bpf: Change bpf_sk_release and bpf_sk_*cgroup_id to accept
    ARG_PTR_TO_BTF_ID_SOCK_COMMON
  bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
  bpf: Change bpf_tcp_*_syncookie to accept
    ARG_PTR_TO_BTF_ID_SOCK_COMMON
  bpf: Change bpf_sk_assign to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON

Song Liu (3):
  bpf: Fix comment for helper bpf_current_task_under_cgroup()
  bpf: Enable BPF_PROG_TEST_RUN for raw_tracepoint
  libbpf: Support test run of raw tracepoint programs

Toke Høiland-Jørgensen (2):
  bpf: Support attaching freplace programs to multiple attach points
  libbpf: Add support for freplace attachment in bpf_link_create

YiFei Zhu (2):
  bpf: Add BPF_PROG_BIND_MAP syscall
  libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .rodata section

Yonghong Song (1):
  libbpf: Fix a compilation error with xsk.c for ubuntu 16.04

 include/uapi/linux/bpf.h |  118 ++-
 src/bpf.c                |   67 +-
 src/bpf.h                |   39 +-
 src/btf.c                | 1851 ++++++++++++++++++++++++++++++++------
 src/btf.h                |   51 ++
 src/btf_dump.c           |    9 +-
 src/hashmap.h            |   12 +
 src/libbpf.c             |  113 ++-
 src/libbpf.h             |    3 +
 src/libbpf.map           |   28 +
 src/libbpf_internal.h    |    8 +
 src/xsk.c                |    1 +
 12 files changed, 1997 insertions(+), 303 deletions(-)

--
2.24.1
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
760f71ec87 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
91e666c94c libbpf: Fix uninitialized variable in btf_parse_type_sec
Fix an obvious uninitialized variable use that wasn't reported by the compiler.
libbpf Makefile changes to catch such errors are added separately.

Fixes: 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200929220604.833631-1-andriin@fb.com
2020-09-29 18:29:49 -07:00
Toke Høiland-Jørgensen
e40af4de0c libbpf: Add support for freplace attachment in bpf_link_create
This adds support for supplying a target btf ID for the bpf_link_create()
operation, and adds a new bpf_program__attach_freplace() high-level API for
attaching freplace functions with a target.
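
For illustration, a minimal sketch of the new high-level API; the program,
target fd, and "xdp_pass_func" name are illustrative assumptions:

    #include <bpf/libbpf.h>

    static int attach_to_second_target(struct bpf_program *freplace_prog,
                                       int target_fd)
    {
        struct bpf_link *link;

        link = bpf_program__attach_freplace(freplace_prog, target_fd,
                                            "xdp_pass_func");
        return libbpf_get_error(link);
    }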

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/160138355387.48470.18026176785351166890.stgit@toke.dk
2020-09-29 18:29:49 -07:00
Toke Høiland-Jørgensen
5e359219aa bpf: Support attaching freplace programs to multiple attach points
This enables support for attaching freplace programs to multiple attach
points. It does this by amending the UAPI for bpf_link_create with a target
btf ID that can be used to supply the new attachment point along with the
target program fd. The target must be compatible with the target that was
supplied at program load time.

The implementation reuses the checks that were factored out of
check_attach_btf_id() to ensure compatibility between the BTF types of the
old and new attachment. If these match, a new bpf_tracing_link will be
created for the new attach target, allowing multiple attachments to
co-exist simultaneously.

The code could theoretically support multiple-attach of other types of
tracing programs as well, but since I don't have a use case for any of
those, there is no API support for doing so.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/160138355169.48470.17165680973640685368.stgit@toke.dk
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
488110df60 libbpf: Support BTF loading and raw data output in both endianness
Teach BTF to recognize non-native endianness and transparently convert it
internally to host endianness. The original endianness of BTF will be preserved
and used during btf__get_raw_data() to convert the resulting raw data to the
same endianness as the source raw_data. This means that a little-endian host
can parse big-endian BTF with no issues: all the type data will be presented to
the client application in native endianness, but when it's time to emit BTF to
persist it in a file (e.g., after BTF deduplication), the original non-native
endianness will be preserved and stored.

It's possible to query the original endianness of BTF data with the new
btf__endianness() API. It's also possible to override the desired output
endianness with btf__set_endianness(), so that if an application needs to load,
say, big-endian BTF and store it as little-endian BTF, it's possible to
manually override this. If btf__set_endianness() was used to change
endianness, btf__endianness() will reflect the overridden endianness.

Given there are no known use cases for supporting cross-endianness for
.BTF.ext, loading .BTF.ext in non-native endianness is not supported.
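
For illustration, a minimal sketch of the new APIs; "vmlinux.btf.be" is an
illustrative path to raw big-endian BTF:

    #include <bpf/btf.h>
    #include <bpf/libbpf.h>

    static int convert_to_le(void)
    {
        struct btf *btf = btf__parse_raw("vmlinux.btf.be");
        const void *raw;
        __u32 size;

        if (libbpf_get_error(btf))
            return -1;

        /* reports BTF_BIG_ENDIAN on a little-endian host */
        (void)btf__endianness(btf);

        btf__set_endianness(btf, BTF_LITTLE_ENDIAN);
        raw = btf__get_raw_data(btf, &size);   /* emitted as little-endian */
        /* ... write 'raw' (size bytes) out here ... */

        btf__free(btf);
        return 0;
    }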

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200929043046.1324350-3-andriin@fb.com
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
f007a6bfdf selftests/bpf: Test BTF writing APIs
Add selftests for BTF writer APIs.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200929020533.711288-4-andriin@fb.com
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
6f90197ab0 libbpf: Add btf__str_by_offset() as a more generic variant of name_by_offset
BTF strings are used not just for names, they can be arbitrary strings used
for CO-RE relocations, line/func infos, etc. Thus "name_by_offset" terminology
is too specific and might be misleading. Instead, introduce
btf__str_by_offset() API which uses generic string terminology.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200929020533.711288-3-andriin@fb.com
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
a388fcb0f5 libbpf: Add BTF writing APIs
Add APIs for appending new BTF types at the end of BTF object.

Each BTF kind has one API of the form btf__add_<kind>(). For types
that have a variable amount of additional items (struct/union, enum, func_proto,
datasec), an additional API is provided to emit each such item. E.g., for
emitting a struct, one would use the following sequence of API calls:

btf__add_struct(btf, "s", 8);             /* new struct "s", 8 bytes; returns its type ID */
btf__add_field(btf, "x", int_id, 0, 0);   /* 'int_id' from an earlier btf__add_int() */
...
btf__add_field(btf, "y", int_id, 32, 0);  /* at bit offset 32, not a bitfield */

Each btf__add_field() will ensure that the last BTF type is of STRUCT or
UNION kind and will automatically increment that type's vlen field.

All the strings are provided as C strings (const char *), not as string offsets.
This significantly improves usability of the BTF writer APIs. All such strings
will be automatically appended to the string section, or an existing string will
be re-used if such a string was already added previously.

Each API attempts to do all the reasonable validations, like enforcing
non-empty names for entities with required names, proper value bounds, various
bit offset restrictions, etc.

Type ID validation is minimal because it's possible to emit a type that refers
to type that will be emitted later, so libbpf has no way to enforce such
cases. User must be careful to properly emit all the necessary types and
specify type IDs that will be valid in the finally generated BTF.

Each of btf__add_<kind>() APIs return new type ID on success or negative
value on error. APIs like btf__add_field() that emit additional items
return zero on success and negative value on error.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200929020533.711288-2-andriin@fb.com
2020-09-29 18:29:49 -07:00
Alan Maguire
2654268c79 bpf: Add bpf_seq_printf_btf helper
A helper is added to allow seq file writing of kernel data
structures using vmlinux BTF.  Its signature is

long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr,
                        u32 btf_ptr_size, u64 flags);

Flags and struct btf_ptr definitions/use are identical to the
bpf_snprintf_btf helper, and the helper returns 0 on success
or a negative error value.
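
For illustration, a minimal task-iterator sketch; it assumes a vmlinux.h
generated from a kernel that already has struct btf_ptr and this helper:

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_core_read.h>

    char LICENSE[] SEC("license") = "GPL";

    SEC("iter/task")
    int dump_task(struct bpf_iter__task *ctx)
    {
        struct seq_file *seq = ctx->meta->seq;
        struct task_struct *task = ctx->task;
        struct btf_ptr ptr = {};

        if (!task)
            return 0;

        ptr.ptr = task;
        ptr.type_id = bpf_core_type_id_kernel(struct task_struct);
        bpf_seq_printf_btf(seq, &ptr, sizeof(ptr), 0);
        return 0;
    }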

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-8-git-send-email-alan.maguire@oracle.com
2020-09-29 18:29:49 -07:00
Alan Maguire
e7647823a1 bpf: Add bpf_snprintf_btf helper
A helper is added to support tracing kernel type information in BPF
using the BPF Type Format (BTF).  Its signature is

long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr,
		      u32 btf_ptr_size, u64 flags);

struct btf_ptr * specifies

- a pointer to the data to be traced
- the BTF id of the type of data pointed to
- a flags field is provided for future use; these flags
  are not to be confused with the BTF_F_* flags
  below that control how the btf_ptr is displayed; the
  flags member of the struct btf_ptr may be used to
  disambiguate types in kernel versus module BTF, etc;
  the main distinction is the flags relate to the type
  and information needed in identifying it; not how it
  is displayed.

For example a BPF program with a struct sk_buff *skb
could do the following:

	static struct btf_ptr b = { };

	b.ptr = skb;
	b.type_id = __builtin_btf_type_id(struct sk_buff, 1);
	bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0);

Default output looks like this:

(struct sk_buff){
 .transport_header = (__u16)65535,
 .mac_header = (__u16)65535,
 .end = (sk_buff_data_t)192,
 .head = (unsigned char *)0x000000007524fd8b,
 .data = (unsigned char *)0x000000007524fd8b,
 .truesize = (unsigned int)768,
 .users = (refcount_t){
  .refs = (atomic_t){
   .counter = (int)1,
  },
 },
}

Flags modifying display are as follows:

- BTF_F_COMPACT:	no formatting around type information
- BTF_F_NONAME:		no struct/union member names/types
- BTF_F_PTR_RAW:	show raw (unobfuscated) pointer values;
			equivalent to %px.
- BTF_F_ZERO:		show zero-valued struct/union members;
			they are not displayed by default

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-4-git-send-email-alan.maguire@oracle.com
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
3cfff16611 libbpf: Add btf__new_empty() to create an empty BTF object
Add an ability to create an empty BTF object from scratch. This is going to be
used by pahole for BTF encoding. And also by selftest for convenient creation
of BTF objects.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200926011357.2366158-7-andriin@fb.com
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
7ac1547f32 libbpf: Allow modification of BTF and add btf__add_str API
Allow the internal BTF representation to switch from the default read-only mode,
in which raw BTF data is a single non-modifiable block of memory with BTF header,
types, and strings laid out sequentially and contiguously in memory, into
a writable representation with types and strings data split out into separate
memory regions that can be dynamically expanded.

Such a writable internal representation is transparent to users of libbpf APIs,
but allows appending new types and strings at the end of BTF, which is
a typical use case when generating BTF programmatically. All the basic
guarantees of BTF type and string layout are preserved, i.e., the user can get
a `struct btf_type *` pointer and read it directly. Such btf_type pointers might
be invalidated if BTF is modified, so some care is required in such mixed
read/write scenarios.

The switch from read-only to writable configuration happens automatically the
first time the user attempts to modify BTF by either adding a new type or a new
string. It is still possible to get raw BTF data, which is a single piece of
memory that can be persisted in an ELF section or into a file as raw BTF. Such
raw data memory is also still owned by BTF and will be freed either when the BTF
object is freed or when another modification to BTF happens, as any modification
invalidates the BTF raw representation.

This patch adds the first two BTF manipulation APIs: btf__add_str(), which
allows adding arbitrary strings to the BTF string section, and btf__find_str(),
which allows finding an existing string's offset, but not adding it if it's
missing. All the added strings are automatically deduplicated. This is achieved
by maintaining an additional string lookup index for all unique strings. Such an
index is built when BTF is switched to modifiable mode. If at that time the BTF
strings section contained duplicate strings, they are not de-duplicated. This
is done specifically to not modify the existing content of BTF (types, their
string offsets, etc), which can cause confusion and is an especially important
property if there is a struct btf_ext associated with the struct btf. By
following this "imperfect deduplication" process, btf_ext is kept consistent
and correct. If deduplication of strings is necessary, it can be forced by doing
BTF deduplication, at which point all the strings will be eagerly deduplicated
and all string offsets both in struct btf and struct btf_ext will be updated.
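
For illustration, a minimal sketch of the two new APIs; the BTF object is
assumed to come from btf__new_empty() or btf__parse():

    int off  = btf__add_str(btf, "press");   /* appends (or reuses) the string */
    int off2 = btf__find_str(btf, "press");  /* == off; negative if not present */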

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200926011357.2366158-6-andriin@fb.com
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
897a0e79bd libbpf: Extract generic string hashing function for reuse
Calculating a hash of a zero-terminated string is a common need when using
hashmap, so extract it for reuse.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200926011357.2366158-5-andriin@fb.com
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
063eed6105 libbpf: Generalize common logic for managing dynamically-sized arrays
Managing a dynamically-sized array is common, but non-trivial functionality,
which requires a significant amount of logic and code to implement properly. So
instead of re-implementing it all the time, extract it into a helper function
and reuse it.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200926011357.2366158-4-andriin@fb.com
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
71e8af71c5 libbpf: Remove assumption of single contiguous memory for BTF data
Refactor internals of struct btf to remove the assumption that the BTF header,
type data, and string data are laid out contiguously in memory in a single
memory allocation. Now we have three separate pointers pointing to the start
of each respective area: header, types, strings. In the next patches, these
pointers will be re-assigned to point to independently allocated memory areas,
if BTF needs to be modified.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200926011357.2366158-3-andriin@fb.com
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
4023fbd99e libbpf: Refactor internals of BTF type index
Refactor the implementation of the internal BTF type index to not use direct
pointers. Instead it uses an offset relative to the start of the types data
section. This allows types data to be reallocatable, enabling the
implementation of modifiable BTF.

As getting a type by ID now has an extra indirection step, convert all internal
type lookups to a new helper btf_type_id(), which returns a non-const pointer
to a type by its ID.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200926011357.2366158-2-andriin@fb.com
2020-09-29 18:29:49 -07:00
Song Liu
b2e50daea8 libbpf: Support test run of raw tracepoint programs
Add bpf_prog_test_run_opts() with support for the new fields in bpf_attr.test,
namely, flags and cpu. Also extend the _opts operations to support outputs via
opts.
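
For illustration, a minimal usage sketch; 'prog_fd' is an illustrative fd of
an already loaded raw_tracepoint program:

    #include <bpf/bpf.h>

    static int run_on_cpu1(int prog_fd)
    {
        DECLARE_LIBBPF_OPTS(bpf_test_run_opts, topts,
            .flags = BPF_F_TEST_RUN_ON_CPU,
            .cpu = 1,
        );
        int err = bpf_prog_test_run_opts(prog_fd, &topts);

        return err ?: topts.retval;   /* retval/duration are output fields */
    }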

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200925205432.1777-3-songliubraving@fb.com
2020-09-29 18:29:49 -07:00
Song Liu
b6f1385458 bpf: Enable BPF_PROG_TEST_RUN for raw_tracepoint
Add .test_run for raw_tracepoint. Also, introduce a new feature that runs
the target program on a specific CPU. This is achieved by a new flag in
bpf_attr.test, BPF_F_TEST_RUN_ON_CPU. When this flag is set, the program
is triggered on the cpu with id bpf_attr.test.cpu. This feature is needed for
BPF programs that handle perf_event and other percpu resources, as the
program can access these resources locally.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200925205432.1777-2-songliubraving@fb.com
2020-09-29 18:29:49 -07:00
Martin KaFai Lau
146bdd7535 bpf: Change bpf_sk_assign to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
This patch changes the bpf_sk_assign() to take
ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer
returned by the bpf_skc_to_*() helpers also.

The bpf_sk_lookup_assign() is taking ARG_PTR_TO_SOCKET_"OR_NULL".  Meaning
it specifically takes a literal NULL.  ARG_PTR_TO_BTF_ID_SOCK_COMMON
does not allow a literal NULL, so another ARG type is required
for this purpose and another follow-up patch can be used if
there is such need.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200925000415.3857374-1-kafai@fb.com
2020-09-29 18:29:49 -07:00
Martin KaFai Lau
76ee807ee3 bpf: Change bpf_tcp_*_syncookie to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
This patch changes the bpf_tcp_*_syncookie() to take
ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer
returned by the bpf_skc_to_*() helpers also.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Lorenz Bauer <lmb@cloudflare.com>
Link: https://lore.kernel.org/bpf/20200925000409.3856725-1-kafai@fb.com
2020-09-29 18:29:49 -07:00
Martin KaFai Lau
32e5add48f bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
This patch changes the bpf_sk_storage_*() to take
ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer
returned by the bpf_skc_to_*() helpers also.

A micro benchmark has been done on a "cgroup_skb/egress" bpf program
which does a bpf_sk_storage_get().  It was driven by netperf doing
a 4096 connected UDP_STREAM test with 64bytes packet.
The stats from "kernel.bpf_stats_enabled" shows no meaningful difference.

The sk_storage_get_btf_proto, sk_storage_delete_btf_proto,
btf_sk_storage_get_proto, and btf_sk_storage_delete_proto are
no longer needed, so they are removed.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Lorenz Bauer <lmb@cloudflare.com>
Link: https://lore.kernel.org/bpf/20200925000402.3856307-1-kafai@fb.com
2020-09-29 18:29:49 -07:00
Martin KaFai Lau
120e99ccd8 bpf: Change bpf_sk_release and bpf_sk_*cgroup_id to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
The previous patch allows the networking bpf prog to use the
bpf_skc_to_*() helpers to get a PTR_TO_BTF_ID socket pointer,
e.g. "struct tcp_sock *".  It allows the bpf prog to read all the
fields of the tcp_sock.

This patch changes the bpf_sk_release() and bpf_sk_*cgroup_id()
to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will
work with the pointer returned by the bpf_skc_to_*() helpers
also.  For example, the following will work:

	sk = bpf_skc_lookup_tcp(skb, tuple, tuplen, BPF_F_CURRENT_NETNS, 0);
	if (!sk)
		return;
	tp = bpf_skc_to_tcp_sock(sk);
	if (!tp) {
		bpf_sk_release(sk);
		return;
	}
	lsndtime = tp->lsndtime;
	/* Pass tp to bpf_sk_release() will also work */
	bpf_sk_release(tp);

Since PTR_TO_BTF_ID could be NULL, the helper taking
ARG_PTR_TO_BTF_ID_SOCK_COMMON has to check for NULL at runtime.

A btf_id of "struct sock" may not always mean a fullsock.  Regardless of
whether the helper's running context may get a non-fullsock or not,
considering the fullsock check/handling is pretty cheap, it is better to
keep the same verifier expectation that a helper taking ARG_PTR_TO_BTF_ID*
will be able to handle the minisock situation.  In the bpf_sk_*cgroup_id()
case, it will try to get a fullsock by using sk_to_full_sk(), as its
skb variant bpf_sk"b"_*cgroup_id() has already been doing.

bpf_sk_release() can already handle minisock, so nothing special has to
be done.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200925000356.3856047-1-kafai@fb.com
2020-09-29 18:29:49 -07:00
YiFei Zhu
3cf3c6cd26 libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .rodata section
The patch adds a simple wrapper, bpf_prog_bind_map(), around the syscall.
When libbpf tries to load a program, it will probe the kernel for
support of this syscall and unconditionally bind the .rodata section
to the program.
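
For illustration, a minimal sketch of the wrapper itself; 'prog_fd' and
'map_fd' are illustrative fds of an already loaded program and map:

    #include <bpf/bpf.h>

    static int bind_extra_map(int prog_fd, int map_fd)
    {
        DECLARE_LIBBPF_OPTS(bpf_prog_bind_opts, opts);

        return bpf_prog_bind_map(prog_fd, map_fd, &opts);
    }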

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: YiFei Zhu <zhuyifei1999@gmail.com>
Link: https://lore.kernel.org/bpf/20200915234543.3220146-4-sdf@google.com
2020-09-29 18:29:49 -07:00
YiFei Zhu
f38fccf3cc bpf: Add BPF_PROG_BIND_MAP syscall
This syscall binds a map to a program. Returns success if the map is
already bound to the program.

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Cc: YiFei Zhu <zhuyifei1999@gmail.com>
Link: https://lore.kernel.org/bpf/20200915234543.3220146-3-sdf@google.com
2020-09-29 18:29:49 -07:00
Yonghong Song
08dc84e54a libbpf: Fix a compilation error with xsk.c for ubuntu 16.04
When syncing latest libbpf repo to bcc, ubuntu 16.04 (4.4.0 LTS kernel)
failed compilation for xsk.c:
  In file included from /tmp/debuild.0jkauG/bcc/src/cc/libbpf/src/xsk.c:23:0:
  /tmp/debuild.0jkauG/bcc/src/cc/libbpf/src/xsk.c: In function ‘xsk_get_ctx’:
  /tmp/debuild.0jkauG/bcc/src/cc/libbpf/include/linux/list.h:81:9: warning: implicit
  declaration of function ‘container_of’ [-Wimplicit-function-declaration]
           container_of(ptr, type, member)
           ^
  /tmp/debuild.0jkauG/bcc/src/cc/libbpf/include/linux/list.h:83:9: note: in expansion
  of macro ‘list_entry’
           list_entry((ptr)->next, type, member)
  ...
  src/cc/CMakeFiles/bpf-static.dir/build.make:209: recipe for target
  'src/cc/CMakeFiles/bpf-static.dir/libbpf/src/xsk.c.o' failed

Commit 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
added include file <linux/list.h>, which uses macro "container_of".
xsk.c file also includes <linux/ethtool.h> before <linux/list.h>.

In a more recent distro kernel, <linux/ethtool.h> includes <linux/kernel.h>,
which contains the macro definition for "container_of", so compilation is all
fine. But in the ubuntu 16.04 kernel, <linux/ethtool.h> does not include
<linux/kernel.h>, which caused the above compilation error.

Let's explicitly add <linux/kernel.h> in xsk.c to avoid the compilation error
in old distros.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200914223210.1831262-1-yhs@fb.com
2020-09-29 18:29:49 -07:00
Song Liu
0102f65d72 bpf: Fix comment for helper bpf_current_task_under_cgroup()
This should be "current" not "skb".

Fixes: c6b5fb8690fa ("bpf: add documentation for eBPF helpers (42-50)")
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/bpf/20200910203314.70018-1-songliubraving@fb.com
2020-09-29 18:29:49 -07:00
Andrii Nakryiko
f700cf6667 vmtests: unblacklist few tests
They should be fixed by now.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-09-28 15:09:26 -07:00
Julia Kartseva
99921245f0 vmtest: update root fs, whitelist sk_{assign|lookup} test
1. Update mkrootfs.sh building root fs
- Remove /etc/fstab from root fs and mount each fs type separately in
S10-mount script.
- devtmpfs can be already mounted prior to S10-mount execution so make
it opt-out. This addresses [0].
- set -eux for scripts
2. Add iproute2 to root fs and whitelist sk_assign test. Addresses
[1][2]. Update INDEX file with 2020-09-27 version.

[0] https://github.com/libbpf/libbpf/pull/145#issuecomment-609673493
[1] https://github.com/libbpf/libbpf/pull/144
[2] https://github.com/libbpf/libbpf/pull/145
2020-09-28 13:09:06 -07:00
Andrii Nakryiko
37c5973bb7 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2f7de9865ba3cbfcf8b504f07154fdb6124176a4
Checkpoint bpf-next commit: 2f7de9865ba3cbfcf8b504f07154fdb6124176a4
Baseline bpf commit:        746f534a4809e07f427f7d13d10f3a6a9641e5c3
Checkpoint bpf commit:      87f92ac4c12758c4da3bbe4393f1d884b610b8a6

Andrii Nakryiko (1):
  libbpf: Fix XDP program load regression for old kernels

Tony Ambardar (1):
  libbpf: Fix native endian assumption when parsing BTF

 src/btf.c    | 6 ++++++
 src/libbpf.c | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

--
2.24.1
2020-09-24 10:56:51 -07:00
Andrii Nakryiko
2200fefd87 libbpf: Fix XDP program load regression for old kernels
Fix a regression in libbpf, introduced by the XDP link change, which causes XDP
programs to fail to be loaded into the kernel due to the specified BPF_XDP
expected_attach_type. While the kernel doesn't enforce expected_attach_type for
BPF_PROG_TYPE_XDP, some old kernels already support XDP programs but don't yet
recognize the expected_attach_type field in bpf_attr, so setting it to a
non-zero value causes program load to fail.
Luckily, libbpf already has a mechanism to deal with such cases, so just make
expected_attach_type optional for XDP programs.

Fixes: dc8698cac7aa ("libbpf: Add support for BPF XDP link")
Reported-by: Nikita Shirokov <tehnerd@tehnerd.com>
Reported-by: Udip Pant <udippant@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200924171705.3803628-1-andriin@fb.com
2020-09-24 10:56:51 -07:00
Tony Ambardar
5f50b4b8c9 libbpf: Fix native endian assumption when parsing BTF
Code in btf__parse_raw() fails to detect raw BTF of non-native endianness
and assumes it must be ELF data, which then fails to parse as ELF and
yields a misleading error message:

  root:/# bpftool btf dump file /sys/kernel/btf/vmlinux
  libbpf: failed to get EHDR from /sys/kernel/btf/vmlinux

For example, this could occur after cross-compiling a BTF-enabled kernel
for a target with non-native endianness, which is currently unsupported.

Check for correct endianness and emit a clearer error message:

  root:/# bpftool btf dump file /sys/kernel/btf/vmlinux
  libbpf: non-native BTF endianness is not supported

Fixes: 94a1fedd63ed ("libbpf: Add btf__parse_raw() and generic btf__parse() APIs")
Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/90f81508ecc57bc0da318e0fe0f45cfe49b17ea7.1600417359.git.Tony.Ambardar@gmail.com
2020-09-24 10:56:51 -07:00
Andrii Nakryiko
787abf721e vmtests: ensure rst2man is installed, needed for bpftool selftests
Ensure rst2man package is installed. This is now a dependency for
selftests/bpf.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-09-11 10:09:12 -07:00
Andrii Nakryiko
820813bd1b sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f9bec5d756b30d5b21aa5ff9b7d5d115741517c1
Checkpoint bpf-next commit: 2f7de9865ba3cbfcf8b504f07154fdb6124176a4
Baseline bpf commit:        e6135df45e21f1815a5948f452593124b1544a3e
Checkpoint bpf commit:      746f534a4809e07f427f7d13d10f3a6a9641e5c3

Quentin Monnet (1):
  tools, bpf: Synchronise BPF UAPI header with tools

 include/uapi/linux/bpf.h | 87 +++++++++++++++++++++-------------------
 1 file changed, 45 insertions(+), 42 deletions(-)

--
2.24.1
2020-09-11 10:09:12 -07:00
Andrii Nakryiko
8333e57e91 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-09-11 10:09:12 -07:00
Quentin Monnet
8052936468 tools, bpf: Synchronise BPF UAPI header with tools
Synchronise the bpf.h header under tools, to report the fixes recently
brought to the documentation for the BPF helpers.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200904161454.31135-4-quentin@isovalent.com
2020-09-11 10:09:12 -07:00
Vladimír Čunát
8b14cb43ff Makefile: link against zlib
Without this we would be missing symbols, as shown e.g. by
ldd -r libbpf.so
2020-09-09 00:03:51 -07:00
Andrii Nakryiko
011700e68d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   95cec14b0308085c028c4d4fb3d09fad3902b4c3
Checkpoint bpf-next commit: f9bec5d756b30d5b21aa5ff9b7d5d115741517c1
Baseline bpf commit:        e6135df45e21f1815a5948f452593124b1544a3e
Checkpoint bpf commit:      e6135df45e21f1815a5948f452593124b1544a3e

Andrii Nakryiko (2):
  libbpf: Fix another __u64 cast in printf
  libbpf: Fix potential multiplication overflow

 src/libbpf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--
2.24.1
2020-09-04 14:35:25 -07:00
Andrii Nakryiko
106e7dcf58 libbpf: Fix potential multiplication overflow
Detected by LGTM static analysis in the Github repo; fix a potential
multiplication overflow before the result is cast to size_t.

Fixes: 8505e8709b5e ("libbpf: Implement generalized .BTF.ext func/line info adjustment")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200904041611.1695163-2-andriin@fb.com
2020-09-04 14:35:25 -07:00
Andrii Nakryiko
3a2ebfc21e libbpf: Fix another __u64 cast in printf
Another issue of __u64 needing either %lu or %llu, depending on the
architecture. Fix with cast to `unsigned long long`.

Fixes: 7e06aad52929 ("libbpf: Add multi-prog section support for struct_ops")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200904041611.1695163-1-andriin@fb.com
2020-09-04 14:35:25 -07:00
Andrii Nakryiko
91001a9923 include: implement list_empty() and list_for_each_entry()
Implement list_empty() function and list_for_each_entry() macro, newly used by
xsk.c in 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
(Linux commit sha).

Fixes: 5f630710f52e ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-09-03 21:21:34 -07:00
Andrii Nakryiko
6384ee1968 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2e80be60c465a4f8559327340eaf40845dd7797a
Checkpoint bpf-next commit: 95cec14b0308085c028c4d4fb3d09fad3902b4c3
Baseline bpf commit:        7787b6fc938e16aa418613c4a765c1dbb268ed9f
Checkpoint bpf commit:      e6135df45e21f1815a5948f452593124b1544a3e

Alexei Starovoitov (3):
  bpf: Introduce sleepable BPF programs
  bpf: Add bpf_copy_from_user() helper.
  libbpf: Support sleepable progs

Andrii Nakryiko (7):
  libbpf: Ensure ELF symbols table is found before further ELF
    processing
  libbpf: Parse multi-function sections into multiple BPF programs
  libbpf: Support CO-RE relocations for multi-prog sections
  libbpf: Make RELO_CALL work for multi-prog sections and sub-program
    calls
  libbpf: Implement generalized .BTF.ext func/line info adjustment
  libbpf: Add multi-prog section support for struct_ops
  libbpf: Deprecate notion of BPF program "title" in favor of "section
    name"

Magnus Karlsson (1):
  libbpf: Support shared umems between queues and devices

Tony Ambardar (1):
  libbpf: Fix build failure from uninitialized variable warning

Yonghong Song (1):
  bpf: Make bpf_link_info.iter similar to bpf_iter_link_info

 include/uapi/linux/bpf.h |   22 +-
 src/btf.h                |   18 +-
 src/libbpf.c             | 1314 +++++++++++++++++++++++++-------------
 src/libbpf.h             |    5 +-
 src/libbpf.map           |    2 +
 src/libbpf_common.h      |    2 +
 src/xsk.c                |  376 +++++++----
 src/xsk.h                |    9 +
 8 files changed, 1156 insertions(+), 592 deletions(-)

--
2.24.1
2020-09-03 21:21:34 -07:00
Andrii Nakryiko
3f9447bf92 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-09-03 21:21:34 -07:00
Tony Ambardar
3b80b6c77e libbpf: Fix build failure from uninitialized variable warning
While compiling libbpf, some GCC versions (at least 8.4.0) have difficulty
determining control flow and emit a warning for potentially uninitialized
usage of 'map', which results in a build error if using "-Werror":

In file included from libbpf.c:56:
libbpf.c: In function '__bpf_object__open':
libbpf_internal.h:59:2: warning: 'map' may be used uninitialized in this function [-Wmaybe-uninitialized]
  libbpf_print(level, "libbpf: " fmt, ##__VA_ARGS__); \
  ^~~~~~~~~~~~
libbpf.c:5032:18: note: 'map' was declared here
  struct bpf_map *map, *targ_map;
                  ^~~

The warning/error is false based on code inspection, so silence it with a
NULL initialization.

Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support")
Reference: 063e68813391 ("libbpf: Fix false uninitialized variable warning")
Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200831000304.1696435-1-Tony.Ambardar@gmail.com
2020-09-03 21:21:34 -07:00
Andrii Nakryiko
78cdb58bdf libbpf: Deprecate notion of BPF program "title" in favor of "section name"
BPF program title is an ambiguous and misleading term. It is the ELF section
name, so let's just call it that and deprecate the bpf_program__title() API in
favor of bpf_program__section_name().

Additionally, using bpf_object__find_program_by_title() is now inherently
dangerous and ambiguous, as multiple BPF programs can have the same section
name. So deprecate this API as well and recommend switching to the
non-ambiguous bpf_object__find_program_by_name().

Internally, clean up usage and mis-usage of BPF program section name for
denoting BPF program name. Shorten the field name to prog->sec_name to be
consistent with all other prog->sec_* variables.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-11-andriin@fb.com
2020-09-03 21:21:34 -07:00
Andrii Nakryiko
4b60f82516 libbpf: Add multi-prog section support for struct_ops
Adjust struct_ops handling code to work with multi-program ELF sections
properly.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-7-andriin@fb.com
2020-09-03 21:21:34 -07:00
Andrii Nakryiko
2b28b4fa4d libbpf: Implement generalized .BTF.ext func/line info adjustment
Complete multi-prog sections and multi sub-prog support in libbpf by properly
adjusting .BTF.ext's line and function information. Mark the exposed
btf_ext__reloc_func_info() and btf_ext__reloc_line_info() APIs as deprecated.
These APIs have the simplistic assumption that all sub-programs are going to be
appended to all main BPF programs, which doesn't hold in real life. It's
unlikely there are any users of these APIs, as they are very libbpf
internals-specific.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-6-andriin@fb.com
2020-09-03 21:21:34 -07:00
Andrii Nakryiko
448789ba27 libbpf: Make RELO_CALL work for multi-prog sections and sub-program calls
This patch implements general and correct logic for bpf-to-bpf sub-program
calls. Only sub-programs used (called into) from entry-point (main) BPF
program are going to be appended at the end of main BPF program. This ensures
that BPF verifier won't encounter any dead code due to copying unreferenced
sub-program. This change means that each entry-point (main) BPF program might
have a different set of sub-programs appended to it and potentially in
different order. This has implications on how sub-program call relocations
need to be handled, described below.

All relocations are now split into two categories: data references (maps and
global variables) and code references (sub-program calls). This distinction is
important because data references need to be relocated just once per each BPF
program and sub-program. These relocations are agnostic to instruction
locations, because they are not code-relative and they are relocating against
static targets (maps, variables with fixed offsets, etc).

Sub-program RELO_CALL relocations, on the other hand, are highly dependent on
code position, because they are recorded as instruction-relative offsets. So
BPF sub-programs (those that do calls into other sub-programs) can't be
relocated once; they need to be relocated each time such a sub-program is
appended at the end of the main entry-point BPF program. As mentioned above,
each main BPF program might have a different subset and different order of
sub-programs, so call relocations can't be done just once. Splitting data
reference and call relocations as described above allows doing this
efficiently and cleanly.

bpf_object__find_program_by_name() will now ignore non-entry BPF programs.
Previously one could have looked up '.text' fake BPF program, but the
existence of such BPF program was always an implementation detail and you
can't do much useful with it. Now, though, all non-entry sub-programs get
their own BPF program with name corresponding to a function name, so there is
no more '.text' name for BPF program. This means there is no regression,
effectively, w.r.t. API behavior. But this is an important aspect to highlight,
because it's going to be critical once libbpf implements static linking of BPF
programs. Non-entry static BPF programs will be allowed to have conflicting
names, but global and main-entry BPF program names should be unique. Just like
with normal user-space linking process. So it's important to restrict this
aspect right now, keep static and non-entry functions as internal
implementation details, and not have to deal with regressions in behavior
later.

This patch leaves .BTF.ext adjustment as is until next patch.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-5-andriin@fb.com
2020-09-03 21:21:34 -07:00
Andrii Nakryiko
a3abae5122 libbpf: Support CO-RE relocations for multi-prog sections
Fix up CO-RE relocation code to handle relocations against ELF sections
containing multiple BPF programs. This requires lookup of a BPF program by its
section name and instruction index it contains. While it could have been done
as a simple loop, it could run into performance issues pretty quickly, as
number of CO-RE relocations can be quite large in real-world applications, and
each CO-RE relocation incurs BPF program look up now. So instead of simple
loop, implement a binary search by section name + insn offset.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200903203542.15944-4-andriin@fb.com
2020-09-03 21:21:34 -07:00
Andrii Nakryiko
bb5e70706a libbpf: Parse multi-function sections into multiple BPF programs
Teach libbpf how to parse code sections into potentially multiple bpf_program
instances, based on ELF FUNC symbols. Each BPF program will keep track of its
position within the containing ELF section for translating section instruction
offsets into program instruction offsets: regardless of a BPF program's location
in the ELF section, its first instruction is always at local instruction offset
0, so when libbpf is working with relocations (which use section-based
instruction offsets) this is critical for making proper translations.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200903203542.15944-3-andriin@fb.com
2020-09-03 21:21:34 -07:00
Andrii Nakryiko
994aae7fc8 libbpf: Ensure ELF symbols table is found before further ELF processing
libbpf ELF parsing logic might need symbols available before ELF parsing is
completed, so we need to make sure that the symbol table section is found in
a separate pass before all the subsequent sections are processed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200903203542.15944-2-andriin@fb.com
2020-09-03 21:21:34 -07:00
Magnus Karlsson
a6e9cf1532 libbpf: Support shared umems between queues and devices
Add support for shared umems between hardware queues and devices to
the AF_XDP part of libbpf. This is so that zero-copy can be achieved in
applications that want to send and receive packets between HW queues
on one device or between different devices/netdevs.

In order to create sockets that share a umem between hardware queues
and devices, a new function called xsk_socket__create_shared() has been
added. It takes the same arguments as xsk_socket__create() plus references
to a fill ring and a completion ring. So for every socket that shares a
umem, you need one more set of fill and completion rings. This is in order
to maintain the single-producer single-consumer semantics of the rings.

You can create all the sockets via the new xsk_socket__create_shared()
call, or create the first one with xsk_socket__create() and the rest
with xsk_socket__create_shared(). Both methods work.
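
As a rough sketch of the intended usage (the interface name, queue ids and the
already-initialized umem/rings/config below are assumptions, not from the patch):

  #include <bpf/xsk.h>

  int setup_shared_sockets(struct xsk_umem *umem,
                           struct xsk_ring_cons *rx0, struct xsk_ring_prod *tx0,
                           struct xsk_ring_cons *rx1, struct xsk_ring_prod *tx1,
                           struct xsk_ring_prod *fill1, struct xsk_ring_cons *comp1,
                           const struct xsk_socket_config *cfg)
  {
      struct xsk_socket *xsk0, *xsk1;
      int err;

      /* the first socket reuses the fill/completion rings registered
       * with the umem itself
       */
      err = xsk_socket__create(&xsk0, "eth0", 0, umem, rx0, tx0, cfg);
      if (err)
          return err;

      /* every additional socket sharing the umem brings its own
       * fill and completion rings
       */
      return xsk_socket__create_shared(&xsk1, "eth0", 1, umem, rx1, tx1,
                                       fill1, comp1, cfg);
  }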

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/1598603189-32145-14-git-send-email-magnus.karlsson@intel.com
2020-09-03 21:21:34 -07:00
Alexei Starovoitov
06ae1b0e38 libbpf: Support sleepable progs
Pass the request to load a program as sleepable via the ".s" suffix in the
section name. If it happens in the future that all map types and helpers are
allowed with the BPF_F_SLEEPABLE flag, "fmod_ret/" and "lsm/" can be aliased to
"fmod_ret.s/" and "lsm.s/" to make all lsm and fmod_ret programs sleepable by
default. The fentry and fexit programs would always need to have the sleepable
vs non-sleepable distinction, since not all fentry/fexit progs will be attached
to sleepable kernel functions.
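
A minimal BPF-side sketch of the naming convention (the LSM hook and the user
pointer read below are illustrative assumptions, not part of this patch):

  // SPDX-License-Identifier: GPL-2.0
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_tracing.h>

  char LICENSE[] SEC("license") = "GPL";

  /* the ".s" suffix in the section name marks the program as sleepable */
  SEC("lsm.s/bprm_committed_creds")
  int BPF_PROG(on_exec, struct linux_binprm *bprm)
  {
      char buf[64] = {};

      /* sleepable programs may read user memory with bpf_copy_from_user() */
      bpf_copy_from_user(buf, sizeof(buf), (const void *)bprm->p);
      return 0;
  }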

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: KP Singh <kpsingh@google.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-5-alexei.starovoitov@gmail.com
2020-09-03 21:21:34 -07:00
Alexei Starovoitov
b228eb84f1 bpf: Add bpf_copy_from_user() helper.
Sleepable BPF programs can now use copy_from_user() to access user memory.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-4-alexei.starovoitov@gmail.com
2020-09-03 21:21:34 -07:00
Alexei Starovoitov
5bd7cae11d bpf: Introduce sleepable BPF programs
Introduce sleepable BPF programs that can request such property for themselves
via BPF_F_SLEEPABLE flag at program load time. In such case they will be able
to use helpers like bpf_copy_from_user() that might sleep. At present only
fentry/fexit/fmod_ret and lsm programs can request to be sleepable and only
when they are attached to kernel functions that are known to allow sleeping.

The non-sleepable programs are relying on implicit rcu_read_lock() and
migrate_disable() to protect the lifetime of programs, maps that they use and
per-cpu kernel structures used to pass info between bpf programs and the
kernel. The sleepable programs cannot be enclosed into rcu_read_lock().
migrate_disable() maps to preempt_disable() in non-RT kernels, so the progs
should not be enclosed in migrate_disable() either. Therefore
rcu_read_lock_trace is used to protect the lifetime of sleepable progs.

There are many networking and tracing program types. In many cases the
'struct bpf_prog *' pointer itself is rcu protected within some other kernel
data structure and the kernel code is using rcu_dereference() to load that
program pointer and call BPF_PROG_RUN() on it. All these cases are not touched.
Instead, sleepable bpf programs are allowed with bpf trampoline only. The
program pointers are hard-coded into the generated assembly of the bpf trampoline
and synchronize_rcu_tasks_trace() is used to protect the lifetime of the program.
The same trampoline can hold both sleepable and non-sleepable progs.

When rcu_read_lock_trace is held it means that some sleepable bpf program is
running from a bpf trampoline. Those programs can use bpf arrays and preallocated
hash/lru maps. These map types wait on programs to complete via
synchronize_rcu_tasks_trace().

Updates to the trampoline now have to do synchronize_rcu_tasks_trace() and
synchronize_rcu_tasks() to wait for sleepable progs to finish and for
the trampoline assembly to finish.

This is the first step of introducing sleepable progs. Eventually dynamically
allocated hash maps can be allowed and networking program types can become
sleepable too.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-3-alexei.starovoitov@gmail.com
2020-09-03 21:21:34 -07:00
Yonghong Song
a454a08f53 bpf: Make bpf_link_info.iter similar to bpf_iter_link_info
bpf_link_info.iter is used by link_query to return bpf_iter_link_info
to user space. Fields may be different, e.g., map_fd vs. map_id, so
we cannot reuse the exact structure. But make them similar, e.g.,

  struct bpf_link_info {
     /* common fields */
     union {
	struct { ... } raw_tracepoint;
	struct { ... } tracing;
	...
	struct {
	    /* common fields for iter */
	    union {
		struct {
		    __u32 map_id;
		} map;
		/* other structs for other targets */
	    };
	};
    };
 };

so the structure is extensible the same way as bpf_iter_link_info.

Fixes: 6b0a249a301e ("bpf: Implement link_query for bpf iterators")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200828051922.758950-1-yhs@fb.com
2020-09-03 21:21:34 -07:00
Andrii Nakryiko
829e50fc15 sync: improve sync script to handle common issues
A few recurring issues are fixed.
1. When there are patches in the bpf tree that haven't been synced yet, but bpf
   was already merged into bpf-next, merged patches would be applied twice,
   causing failures and requiring manual resolution. Now this is handled more
   smartly and shouldn't happen.
2. When the synced libbpf repo contains fixes from bpf that weren't yet merged
   into bpf-next, those bpf tree changes would cause inconsistency against the
   bpf-next tree state. That's expected and usually pretty easy for a human
   to discard during the consistency check, but is hard for automation. So
   instead of failing at the very end, ask the human whether the discrepancies
   look good.
3. If the sync script detected no new patches needing syncing, it previously
   didn't restore the linux repo state back. Fixed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-09-03 20:14:51 -07:00
Andrii Nakryiko
66780a46cb README.md: update Travis CI badge link
Update Travis CI status badge to point to travis-ci.com,
now that libbpf was migrated there.
2020-08-27 10:15:29 -07:00
Andrii Nakryiko
7bc52e6602 vmtests: blacklist 2 new feature tests and (temporarily) 3 existing selftest
Permanently blacklist 2 new selftests on 5.5 and temporarily blacklist
3 existing selftests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-26 23:30:55 -07:00
Andrii Nakryiko
7267270f5f sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   0fcdfffe80346d015b920228203d0269284d8b13
Checkpoint bpf-next commit: 2e80be60c465a4f8559327340eaf40845dd7797a
Baseline bpf commit:        7787b6fc938e16aa418613c4a765c1dbb268ed9f
Checkpoint bpf commit:      7787b6fc938e16aa418613c4a765c1dbb268ed9f

Alex Gartrell (1):
  libbpf: Fix unintentional success return code in bpf_object__load

Andrii Nakryiko (1):
  libbpf: Fix compilation warnings for 64-bit printf args

Jiri Olsa (1):
  bpf: Add d_path helper

KP Singh (3):
  bpf: Generalize bpf_sk_storage
  bpf: Implement bpf_local_storage for inodes
  bpf: Allow local storage to be used from LSM programs

 include/uapi/linux/bpf.h | 69 +++++++++++++++++++++++++++++++++++++---
 src/libbpf.c             | 10 +++---
 src/libbpf_probes.c      |  5 +--
 3 files changed, 73 insertions(+), 11 deletions(-)

--
2.24.1
2020-08-26 23:30:55 -07:00
Andrii Nakryiko
b16bc44bd3 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-08-26 23:30:55 -07:00
Andrii Nakryiko
4cdad1b34b libbpf: Fix compilation warnings for 64-bit printf args
Fix compilation warnings due to __u64 being defined differently as `unsigned long`
or `unsigned long long` on different architectures (e.g., ppc64le differs from
x86-64). Also cast one argument to size_t to fix a printf warning of a similar
nature.

Fixes: eacaaed784e2 ("libbpf: Implement enum value-based CO-RE relocations")
Fixes: 50e09460d9f8 ("libbpf: Skip well-known ELF sections when iterating ELF")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200827041109.3613090-1-andriin@fb.com
2020-08-26 23:30:55 -07:00
Alex Gartrell
f557d9e1fc libbpf: Fix unintentional success return code in bpf_object__load
There are code paths where EINVAL is returned directly without setting
errno. In that case, errno could be 0, which would mask the
failure. For example, if a careless programmer set log_level to 10000
out of laziness, they would have to spend a long time trying to figure
out why.

Fixes: 4f33ddb4e3e2 ("libbpf: Propagate EPERM to caller on program load")
Signed-off-by: Alex Gartrell <alexgartrell@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200826075549.1858580-1-alexgartrell@gmail.com
2020-08-26 23:30:55 -07:00
Jiri Olsa
e82da07e2d bpf: Add d_path helper
Adding a d_path helper function that returns the full path for a given
'struct path' object, which needs to be the kernel BTF 'path' object.
The path is returned in the provided buffer 'buf' of size 'sz' and is
zero terminated.

  bpf_d_path(&file->f_path, buf, size);

The helper calls the d_path function directly, so there's only a
limited set of functions it can be called from. Adding just a
very modest set for the start.

Updating also bpf.h tools uapi header and adding 'path' to
bpf_helpers_doc.py script.
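
For illustration, a minimal BPF-side sketch (assuming filp_close is on the
allowed-functions list mentioned above; the buffer size is arbitrary):

  // SPDX-License-Identifier: GPL-2.0
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_tracing.h>

  char LICENSE[] SEC("license") = "GPL";

  SEC("fentry/filp_close")
  int BPF_PROG(trace_close, struct file *file)
  {
      char path[256];

      /* resolve the file's full path into 'path', zero terminated */
      if (bpf_d_path(&file->f_path, path, sizeof(path)) > 0)
          bpf_printk("closing %s", path);
      return 0;
  }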

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-11-jolsa@kernel.org
2020-08-26 23:30:55 -07:00
KP Singh
c42c140954 bpf: Allow local storage to be used from LSM programs
Adds support for both bpf_{sk, inode}_storage_{get, delete} to be used
in LSM programs. These helpers are not used for tracing programs
(currently) as their usage is tied to the life-cycle of the object and
should only be used where the owning object won't be freed (when the
owning object is passed as an argument to the LSM hook). Thus, they
are safer to use in LSM hooks than in tracing. Usage of local storage in
tracing programs will probably follow a per-function-based whitelist
approach.

Since the UAPI helper signature for bpf_sk_storage expects a bpf_sock,
which leads to a compilation warning for LSM programs, it's also updated
to accept a void * pointer instead.
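
A minimal sketch of inode local storage used from an LSM hook (the map and hook
names are illustrative; BPF_LOCAL_STORAGE_GET_F_CREATE comes from this series):

  // SPDX-License-Identifier: GPL-2.0
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_tracing.h>

  char LICENSE[] SEC("license") = "GPL";

  struct inode_stats {
      __u64 opens;
  };

  struct {
      __uint(type, BPF_MAP_TYPE_INODE_STORAGE);
      __uint(map_flags, BPF_F_NO_PREALLOC);
      __type(key, int);
      __type(value, struct inode_stats);
  } inode_storage SEC(".maps");

  SEC("lsm/file_open")
  int BPF_PROG(count_opens, struct file *file)
  {
      struct inode_stats *st;

      /* the owning inode is reachable from the LSM hook argument */
      st = bpf_inode_storage_get(&inode_storage, file->f_inode, 0,
                                 BPF_LOCAL_STORAGE_GET_F_CREATE);
      if (st)
          __sync_fetch_and_add(&st->opens, 1);
      return 0;
  }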

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-7-kpsingh@chromium.org
2020-08-26 23:30:55 -07:00
KP Singh
e565f2bfe9 bpf: Implement bpf_local_storage for inodes
Similar to bpf_local_storage for sockets, add local storage for inodes.
The life-cycle of the storage is managed with the life-cycle of the inode,
i.e. the storage is destroyed along with the owning inode.

The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the
security blob, which is now stackable and can co-exist with other LSMs.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-6-kpsingh@chromium.org
2020-08-26 23:30:55 -07:00
KP Singh
2bd0d158d4 bpf: Generalize bpf_sk_storage
Refactor the functionality in bpf_sk_storage.c so that the concept of
storage linked to kernel objects can be extended to other objects like
inode, task_struct etc.

Each new local storage will still be a separate map and provide its own
set of helpers. This allows for future object-specific extensions while
still sharing a lot of the underlying implementation.

This includes the changes suggested by Martin in:

  https://lore.kernel.org/bpf/20200725013047.4006241-1-kafai@fb.com/

adding new map operations to support bpf_local_storage maps:

* storages for different kernel objects to optionally have different
  memory charging strategy (map_local_storage_charge,
  map_local_storage_uncharge)
* Functionality to extract the storage pointer from a pointer to the
  owning object (map_owner_storage_ptr)

Co-developed-by: Martin KaFai Lau <kafai@fb.com>

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-4-kpsingh@chromium.org
2020-08-26 23:30:55 -07:00
Andrii Nakryiko
bbe442da7a sync: allow 3-way merge for patching to simplify manual conflict resolution
Allowing --3way leaves conflicts in the local files, which makes manual
conflict resolution so much easier.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-25 00:53:18 -07:00
Andrii Nakryiko
3f7b5b32b8 vmtests: blacklist tcp_hdr_options selftest for 5.5
Blacklist selftests for a new feature, not supported by 5.5 kernel.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-25 00:53:18 -07:00
Andrii Nakryiko
5a913e9401 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   dca5612f8eb9d0cf1dc254eb2adff1f16a588a7d
Checkpoint bpf-next commit: 0fcdfffe80346d015b920228203d0269284d8b13
Baseline bpf commit:        4af7b32f84aa4cd60e39b355bc8a1eab6cd8d8a4
Checkpoint bpf commit:      7787b6fc938e16aa418613c4a765c1dbb268ed9f

Andrii Nakryiko (6):
  libbpf: Factor out common ELF operations and improve logging
  libbpf: Add __noinline macro to bpf_helpers.h
  libbpf: Skip well-known ELF sections when iterating ELF
  libbpf: Normalize and improve logging across few functions
  libbpf: Avoid false unuinitialized variable warning in
    bpf_core_apply_relo
  libbpf: Fix type compatibility check copy-paste error

Martin KaFai Lau (6):
  tcp: bpf: Add TCP_BPF_DELACK_MAX setsockopt
  tcp: bpf: Add TCP_BPF_RTO_MIN for bpf_setsockopt
  bpf: tcp: Add bpf_skops_parse_hdr()
  bpf: tcp: Add bpf_skops_hdr_opt_len() and bpf_skops_write_hdr_opt()
  bpf: tcp: Allow bpf prog to write and parse TCP header option
  tcp: bpf: Optionally store mac header in TCP_SAVE_SYN

 include/uapi/linux/bpf.h | 306 ++++++++++++++++++++++-
 src/bpf_helpers.h        |   3 +
 src/libbpf.c             | 525 +++++++++++++++++++++++----------------
 3 files changed, 623 insertions(+), 211 deletions(-)

--
2.24.1
2020-08-25 00:53:18 -07:00
Andrii Nakryiko
cead23ac75 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-08-25 00:53:18 -07:00
Andrii Nakryiko
66091d267c libbpf: Fix type compatibility check copy-paste error
Fix a copy-paste error in the types compatibility check. The local type is
accidentally used instead of the target type for the very first type strictness
check. This can result in a potentially less strict candidate comparison. Fix
the error.

Fixes: 3fc32f40c402 ("libbpf: Implement type-based CO-RE relocations support")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200821225653.2180782-1-andriin@fb.com
2020-08-25 00:53:18 -07:00
Andrii Nakryiko
2819b00b74 libbpf: Avoid false unuinitialized variable warning in bpf_core_apply_relo
Some versions of GCC report uninitialized targ_spec usage. GCC is wrong, but
let's avoid unnecessary warnings.

Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200821225556.2178419-1-andriin@fb.com
2020-08-25 00:53:18 -07:00
Martin KaFai Lau
cb4d6d6f1a tcp: bpf: Optionally store mac header in TCP_SAVE_SYN
This patch is adapted from Eric's patch in an earlier discussion [1].

The TCP_SAVE_SYN currently only stores the network header and
tcp header.  This patch allows it to optionally store
the mac header also if the setsockopt's optval is 2.

It requires one more bit for the "save_syn" bit field in tcp_sock.
This patch achieves this by moving the syn_smc bit next to the is_mptcp.
The syn_smc is currently used with the TCP experimental option.  Since
syn_smc is only used when CONFIG_SMC is enabled, this patch also puts
the "IS_ENABLED(CONFIG_SMC)" around it like the is_mptcp did
with "IS_ENABLED(CONFIG_MPTCP)".

The mac_hdrlen is also stored in the "struct saved_syn"
to allow a quick offset from the bpf prog if it chooses to start
getting from the network header or the tcp header.

[1]: https://lore.kernel.org/netdev/CANn89iLJNWh6bkH7DNhy_kmcAexuUCccqERqe7z2QsvPhGrYPQ@mail.gmail.com/

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/bpf/20200820190123.2886935-1-kafai@fb.com
2020-08-25 00:53:18 -07:00
Martin KaFai Lau
4f160ed607 bpf: tcp: Allow bpf prog to write and parse TCP header option
[ Note: The TCP changes here is mainly to implement the bpf
  pieces into the bpf_skops_*() functions introduced
  in the earlier patches. ]

The earlier effort in BPF-TCP-CC allows the TCP Congestion Control
algorithm to be written in BPF.  It opens up opportunities to allow
a faster turnaround time in testing/releasing new congestion control
ideas to production environment.

The same flexibility can be extended to writing TCP header options.
It is not uncommon that people want to test a new TCP header option
to improve the TCP performance.  Another use case is for data-centers
that have a more controlled environment and more flexibility in
putting header options for internal-only use.

For example, we want to test the idea in putting maximum delay
ACK in TCP header option which is similar to a draft RFC proposal [1].

This patch introduces the necessary BPF API and use them in the
TCP stack to allow BPF_PROG_TYPE_SOCK_OPS program to parse
and write TCP header options.  It currently supports most of
the TCP packet except RST.

Supported TCP header option:
───────────────────────────
This patch allows the bpf-prog to write any option kind.
Different bpf-progs can write their own options by calling the new helper
bpf_store_hdr_opt().  The helper will ensure there is no duplicated
option in the header.

By allowing the bpf-prog to write any option kind, this gives a lot of
flexibility to the bpf-prog.  Each bpf-prog can write its
own option kind.  It could also allow the bpf-prog to support a
recently standardized option on an older kernel.

Sockops Callback Flags:
──────────────────────
The bpf program will only be called to parse/write tcp header option
if the following newly added callback flags are enabled
in tp->bpf_sock_ops_cb_flags:
BPF_SOCK_OPS_PARSE_UNKNOWN_HDR_OPT_CB_FLAG
BPF_SOCK_OPS_PARSE_ALL_HDR_OPT_CB_FLAG
BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG

A few words on the PARSE CB flags.  When the above PARSE CB flags are
turned on, the bpf-prog will be called on packets received
at a sk that has at least reached the ESTABLISHED state.
The parsing of the SYN-SYNACK-ACK will be discussed in the
"3 Way HandShake" section.

The default is off for all of the above new CB flags, i.e. the bpf prog
will not be called to parse or write bpf hdr options.  There are
detailed comments on these new cb flags in the UAPI bpf.h.

sock_ops->skb_data and bpf_load_hdr_opt()
─────────────────────────────────────────
sock_ops->skb_data and sock_ops->skb_data_end covers the whole
TCP header and its options.  They are read only.

The new bpf_load_hdr_opt() helps to read a particular option "kind"
from the skb_data.

Please refer to the comment in UAPI bpf.h.  It has details
on what skb_data contains under different sock_ops->op.

3 Way HandShake
───────────────
The bpf-prog can learn if it is sending SYN or SYNACK by reading the
sock_ops->skb_tcp_flags.

* Passive side

When writing SYNACK (i.e. sock_ops->op == BPF_SOCK_OPS_WRITE_HDR_OPT_CB),
the received SYN skb will be available to the bpf prog.  The bpf prog can
use the SYN skb (which may carry the header option sent from the remote bpf
prog) to decide what bpf header option should be written to the outgoing
SYNACK skb.  The SYN packet can be obtained by getsockopt(TCP_BPF_SYN*).
More on this later.  Also, the bpf prog can learn if it is in syncookie
mode (by checking sock_ops->args[0] == BPF_WRITE_HDR_TCP_SYNACK_COOKIE).

The bpf prog can store the received SYN pkt by using the existing
bpf_setsockopt(TCP_SAVE_SYN).  The example in a later patch does it.
[ Note that the fullsock here is a listen sk, bpf_sk_storage
  is not very useful here since the listen sk will be shared
  by many concurrent connection requests.

  Extending bpf_sk_storage support to request_sock will add weight
  to the minisock and it is not necessarily better than storing the
  whole ~100-byte SYN pkt. ]

When the connection is established, the bpf prog will be called
in the existing PASSIVE_ESTABLISHED_CB callback.  At that time,
the bpf prog can get the header option from the saved syn and
then apply the needed operation to the newly established socket.
The later patch will use the max delay ack specified in the SYN
header and set the RTO of this newly established connection
as an example.

The received ACK (that concludes the 3WHS) will also be available to
the bpf prog during PASSIVE_ESTABLISHED_CB through the sock_ops->skb_data.
It could be useful in syncookie scenario.  More on this later.

There is an existing getsockopt "TCP_SAVED_SYN" to return the whole
saved syn pkt, which includes the IP[46] header and the TCP header.
A few "TCP_BPF_SYN*" getsockopts have been added to allow specifying where to
start getting from, e.g. starting from the TCP header, or from the IP[46] header.

The new getsockopt(TCP_BPF_SYN*) will also know where it can get
the SYN's packet from:
  - (a) the just received syn (available when the bpf prog is writing SYNACK)
        and it is the only way to get SYN during syncookie mode.
  or
  - (b) the saved syn (available in PASSIVE_ESTABLISHED_CB and also other
        existing CB).

The bpf prog does not need to know where the SYN pkt is coming from.
The getsockopt(TCP_BPF_SYN*) will hide these details.

Similarly, a flags "BPF_LOAD_HDR_OPT_TCP_SYN" is also added to
bpf_load_hdr_opt() to read a particular header option from the SYN packet.

* Fastopen

Fastopen should work the same as the regular non-fastopen case.
There is a test for this in a later patch.

* Syncookie

For syncookie, the later example patch asks the active
side's bpf prog to resend the header options in ACK.  The server
can use bpf_load_hdr_opt() to look at the options in this
received ACK during PASSIVE_ESTABLISHED_CB.

* Active side

The bpf prog will get a chance to write the bpf header option
in the SYN packet during WRITE_HDR_OPT_CB.  The received SYNACK
pkt will also be available to the bpf prog during the existing
ACTIVE_ESTABLISHED_CB callback through the sock_ops->skb_data
and bpf_load_hdr_opt().

* Turn off header CB flags after 3WHS

If the bpf prog does not need to write/parse header options
beyond the 3WHS, the bpf prog can clear the bpf_sock_ops_cb_flags
to avoid being called for header options.
Or the bpf-prog can select to leave the UNKNOWN_HDR_OPT_CB_FLAG on
so that the kernel will only call it when there is an option that
the kernel cannot handle.

[1]: draft-wang-tcpm-low-latency-opt-00
     https://tools.ietf.org/html/draft-wang-tcpm-low-latency-opt-00
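
Putting the pieces above together, a rough BPF-side sketch (the option kind 253
and its payload are made up; bpf_reserve_hdr_opt() is the companion reservation
helper from this series):

  // SPDX-License-Identifier: GPL-2.0
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  char LICENSE[] SEC("license") = "GPL";

  SEC("sockops")
  int hdr_opt_demo(struct bpf_sock_ops *skops)
  {
      /* TCP option format: kind, total length, payload */
      __u8 opt[4] = { 253, 4, 0xab, 0xcd };

      switch (skops->op) {
      case BPF_SOCK_OPS_TCP_CONNECT_CB:
      case BPF_SOCK_OPS_TCP_LISTEN_CB:
          /* opt in to the new header-option callbacks */
          bpf_sock_ops_cb_flags_set(skops,
                                    skops->bpf_sock_ops_cb_flags |
                                    BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG |
                                    BPF_SOCK_OPS_PARSE_UNKNOWN_HDR_OPT_CB_FLAG);
          break;
      case BPF_SOCK_OPS_HDR_OPT_LEN_CB:
          /* reserve space for the option before the header is built */
          bpf_reserve_hdr_opt(skops, sizeof(opt), 0);
          break;
      case BPF_SOCK_OPS_WRITE_HDR_OPT_CB:
          bpf_store_hdr_opt(skops, opt, sizeof(opt), 0);
          break;
      case BPF_SOCK_OPS_PARSE_HDR_OPT_CB:
          /* the first byte selects the option kind to look up */
          opt[0] = 253;
          bpf_load_hdr_opt(skops, opt, sizeof(opt), 0);
          break;
      }
      return 1;
  }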

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200820190104.2885895-1-kafai@fb.com
2020-08-25 00:53:18 -07:00
Martin KaFai Lau
647df00570 bpf: tcp: Add bpf_skops_hdr_opt_len() and bpf_skops_write_hdr_opt()
The bpf prog needs to parse the SYN header to learn what options have
been sent by the peer's bpf-prog before writing its options into SYNACK.
This patch adds a "syn_skb" arg to tcp_make_synack() and send_synack().
This syn_skb will eventually be made available (as read-only) to the
bpf prog.  This will be the only SYN packet available to the bpf
prog during syncookie.  For other regular cases, the bpf prog can
also use the saved_syn.

When writing options, the bpf prog will first be called to tell the
kernel its required number of bytes.  It is done by the new
bpf_skops_hdr_opt_len().  The bpf prog will only be called when the new
BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG is set in tp->bpf_sock_ops_cb_flags.
When the bpf prog returns, the kernel will know how many bytes are needed
and then update the "*remaining" arg accordingly.  4 byte alignment will
be included in the "*remaining" before this function returns.  The 4 byte
aligned number of bytes will also be stored into the opts->bpf_opt_len.
"bpf_opt_len" is a newly added member to the struct tcp_out_options.

Then the new bpf_skops_write_hdr_opt() will call the bpf prog to write the
header options.  The bpf prog is only called if it has reserved spaces
before (opts->bpf_opt_len > 0).

The bpf prog is the last one getting a chance to reserve header space
and write the header option.

These two functions are half implemented to highlight the changes in
TCP stack.  The actual codes preparing the bpf running context and
invoking the bpf prog will be added in the later patch with other
necessary bpf pieces.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/bpf/20200820190052.2885316-1-kafai@fb.com
2020-08-25 00:53:18 -07:00
Martin KaFai Lau
44fdfd8e6e bpf: tcp: Add bpf_skops_parse_hdr()
The patch adds a function bpf_skops_parse_hdr().
It will call the bpf prog to parse the TCP header received at
a tcp_sock that has at least reached the ESTABLISHED state.

For the packets received during the 3WHS (SYN, SYNACK and ACK),
the received skb will be available to the bpf prog during the callback
in bpf_skops_established() introduced in the previous patch and
in the bpf_skops_write_hdr_opt() that will be added in the
next patch.

Calling bpf prog to parse header is controlled by two new flags in
tp->bpf_sock_ops_cb_flags:
BPF_SOCK_OPS_PARSE_UNKNOWN_HDR_OPT_CB_FLAG and
BPF_SOCK_OPS_PARSE_ALL_HDR_OPT_CB_FLAG.

When BPF_SOCK_OPS_PARSE_UNKNOWN_HDR_OPT_CB_FLAG is set,
the bpf prog will only be called when there is unknown
option in the TCP header.

When BPF_SOCK_OPS_PARSE_ALL_HDR_OPT_CB_FLAG is set,
the bpf prog will be called on all received TCP headers.

This function is half implemented to highlight the changes in
TCP stack.  The actual codes preparing the bpf running context and
invoking the bpf prog will be added in the later patch with other
necessary bpf pieces.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/bpf/20200820190046.2885054-1-kafai@fb.com
2020-08-25 00:53:18 -07:00
Martin KaFai Lau
75d2adfe84 tcp: bpf: Add TCP_BPF_RTO_MIN for bpf_setsockopt
This patch adds bpf_setsockopt(TCP_BPF_RTO_MIN) to allow bpf prog
to set the min rto of a connection.  It could be used together
with the earlier patch which has added bpf_setsockopt(TCP_BPF_DELACK_MAX).

A later selftest patch will communicate the max delay ack in a
bpf tcp header option and then the receiving side can use
bpf_setsockopt(TCP_BPF_RTO_MIN) to set a shorter rto.
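
For illustration, a minimal sockops sketch combining this with the
TCP_BPF_DELACK_MAX setsockopt mentioned above (the 30ms/25ms values are made
up; both options take microseconds, matching how the selftests use them):

  // SPDX-License-Identifier: GPL-2.0
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  char LICENSE[] SEC("license") = "GPL";

  #ifndef SOL_TCP
  #define SOL_TCP 6
  #endif

  SEC("sockops")
  int tune_timers(struct bpf_sock_ops *skops)
  {
      int delack_max_us = 30 * 1000;
      int rto_min_us = 25 * 1000;

      if (skops->op == BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB ||
          skops->op == BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) {
          bpf_setsockopt(skops, SOL_TCP, TCP_BPF_DELACK_MAX,
                         &delack_max_us, sizeof(delack_max_us));
          bpf_setsockopt(skops, SOL_TCP, TCP_BPF_RTO_MIN,
                         &rto_min_us, sizeof(rto_min_us));
      }
      return 1;
  }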

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200820190027.2884170-1-kafai@fb.com
2020-08-25 00:53:18 -07:00
Martin KaFai Lau
f0f75f36a7 tcp: bpf: Add TCP_BPF_DELACK_MAX setsockopt
This change is mostly from an internal patch and adapts it from sysctl
config to the bpf_setsockopt setup.

The bpf_prog can set the max delay ack by using
bpf_setsockopt(TCP_BPF_DELACK_MAX).  This max delay ack can be communicated
to its peer through bpf header option.  The receiving peer can then use
this max delay ack and set a potentially lower rto by using
bpf_setsockopt(TCP_BPF_RTO_MIN) which will be introduced
in the next patch.

Another later selftest patch will also use it like the above to show
how to write and parse bpf tcp header option.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200820190021.2884000-1-kafai@fb.com
2020-08-25 00:53:18 -07:00
Andrii Nakryiko
a8fa8b6eea libbpf: Normalize and improve logging across few functions
Make libbpf logs follow a similar pattern and provide more context, like the
section name or program name, where appropriate. Also, add a BPF_INSN_SZ
constant and use it throughout to clean up the code a little bit. This commit
doesn't have any functional changes and just gets some code changes out of the
way before a bigger refactoring of libbpf internals.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200820231250.1293069-6-andriin@fb.com
2020-08-25 00:53:18 -07:00
Andrii Nakryiko
8a1acb7dfe libbpf: Skip well-known ELF sections when iterating ELF
Skip and don't log ELF sections that libbpf knows about and ignores during ELF
processing. This avoids unnecessarily logging details about those ELF
sections and cleans up the libbpf debug log. Ignored sections include DWARF
data, the string table, an empty .text section and a few special (e.g.,
.llvm_addrsig) useless sections.

With such ELF sections out of the way, log unrecognized ELF sections at
pr_info level to increase visibility.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200820231250.1293069-5-andriin@fb.com
2020-08-25 00:53:18 -07:00
Andrii Nakryiko
b6e179e67c libbpf: Add __noinline macro to bpf_helpers.h
__noinline is pretty frequently used, especially with BPF subprograms, so add
it alongside __always_inline, for user convenience and completeness.
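
For example, a small sketch of a non-inlined BPF sub-program (the tracepoint
and function names are made up):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  char LICENSE[] SEC("license") = "GPL";

  /* __noinline keeps this as a real BPF-to-BPF call target */
  static __noinline int account(int slot)
  {
      bpf_printk("slot %d", slot);
      return 0;
  }

  SEC("tp/syscalls/sys_enter_execve")
  int handle_execve(void *ctx)
  {
      return account(1);
  }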

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200820231250.1293069-4-andriin@fb.com
2020-08-25 00:53:18 -07:00
Andrii Nakryiko
d81d872279 libbpf: Factor out common ELF operations and improve logging
Factor out common ELF operations done throughout libbpf. This simplifies
usage across multiple places in libbpf, as well as hides error reporting from
higher-level functions and makes error logging more consistent.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200820231250.1293069-3-andriin@fb.com
2020-08-25 00:53:18 -07:00
Andrii Nakryiko
4001a658e0 vmtests: add log folding
Sprinkle log folds around, including timing.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-22 00:57:32 -07:00
Andrii Nakryiko
dc1cd8503f vmtests: use built-in BPF_PRELOAD_UMD=y config
Modules might not be picked up properly in our qemu setup.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-21 19:06:11 -07:00
Andrii Nakryiko
9a3a42608d vmtests: update latest.config
Re-generate latest.config.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
63c78982c7 vmtests: harden fetching kernel sources
Ensure that corrupted tar archive won't screw up build.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
28e26bdc3e sync: add BPF_RAW_INSN macro
Add the BPF_RAW_INSN macro used by libbpf.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
7297e38474 vmtests: add CONFIG_BPF_PRELOAD=y and CONFIG_BPF_PRELOAD_UMD=m
Add new Kconfig values needed for selftests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
a44116bb1f sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   06a4ec1d9dc652e17ee3ac2ceb6c7cf6c2b75cdd
Checkpoint bpf-next commit: dca5612f8eb9d0cf1dc254eb2adff1f16a588a7d
Baseline bpf commit:        3fb1a96a91120877488071a167d26d76be4be977
Checkpoint bpf commit:      4af7b32f84aa4cd60e39b355bc8a1eab6cd8d8a4

Andrii Nakryiko (17):
  libbpf: Make kernel feature probing lazy
  libbpf: Factor out common logic of testing and closing FD
  libbpf: Sanitize BPF program code for bpf_probe_read_{kernel,
    user}[_str]
  libbpf: Switch tracing and CO-RE helper macros to
    bpf_probe_read_kernel()
  libbpf: Detect minimal BTF support and skip BTF loading, if missing
  libbpf: Improve error logging for mismatched BTF kind cases
  libbpf: Clean up and improve CO-RE reloc logging
  libbpf: Improve relocation ambiguity detection
  libbpf: Remove any use of reallocarray() in libbpf
  tools/bpftool: Remove libbpf_internal.h usage in bpftool
  libbpf: Centralize poisoning and poison reallocarray()
  tools: Remove feature-libelf-mmap feature detection
  libbpf: Implement type-based CO-RE relocations support
  libbpf: Implement enum value-based CO-RE relocations
  libbpf: Fix detection of BPF helper call instruction
  libbpf: Fix libbpf build on compilers missing __builtin_mul_overflow
  libbpf: Add perf_buffer APIs for better integration with outside epoll
    loop

Tobias Klauser (1):
  bpf: Fix two typos in uapi/linux/bpf.h

Toke Høiland-Jørgensen (1):
  libbpf: Fix map index used in error message

Xu Wang (2):
  libbpf: Convert comma to semicolon
  libbpf: Simplify the return expression of build_map_pin_path()

Yonghong Song (1):
  bpf: Implement link_query for bpf iterators

 include/uapi/linux/bpf.h |   17 +-
 src/bpf.c                |    3 -
 src/bpf_core_read.h      |  120 +++-
 src/bpf_prog_linfo.c     |    3 -
 src/bpf_tracing.h        |    4 +-
 src/btf.c                |   31 +-
 src/btf.h                |   38 --
 src/btf_dump.c           |    9 +-
 src/hashmap.c            |    3 +
 src/libbpf.c             | 1177 ++++++++++++++++++++++++++++----------
 src/libbpf.h             |    4 +
 src/libbpf.map           |    8 +
 src/libbpf_internal.h    |  138 ++++-
 src/libbpf_probes.c      |    3 -
 src/netlink.c            |  128 +----
 src/nlattr.c             |    9 +-
 src/ringbuf.c            |    8 +-
 src/xsk.c                |    3 -
 18 files changed, 1149 insertions(+), 557 deletions(-)

--
2.24.1
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
4069acb787 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-08-21 18:22:15 -07:00
Tobias Klauser
c7d2b1f31b bpf: Fix two typos in uapi/linux/bpf.h
Also remove trailing whitespaces in bpf_skb_get_tunnel_key example code.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200821133642.18870-1-tklauser@distanz.ch
2020-08-21 18:22:15 -07:00
Toke Høiland-Jørgensen
b06fb2312c libbpf: Fix map index used in error message
The error message emitted by bpf_object__init_user_btf_maps() was using the
wrong section ID.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200819110534.9058-1-toke@redhat.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
1e2c7823f5 libbpf: Add perf_buffer APIs for better integration with outside epoll loop
Add a set of APIs to the perf_buffer manager to allow applications to integrate
perf buffer polling into existing epoll-based infrastructure. One example is
applications using libevent already and wanting to plug in perf_buffer polling,
instead of relying on perf_buffer__poll() and wasting an extra thread to do it.
But perf_buffer is still extremely useful to set up and consume perf buffer
rings even for such use cases.

So to accommodate such new use cases, add three new APIs:
  - perf_buffer__buffer_cnt() returns number of per-CPU buffers maintained by
    given instance of perf_buffer manager;
  - perf_buffer__buffer_fd() returns FD of perf_event corresponding to
    a specified per-CPU buffer; this FD is then polled independently;
  - perf_buffer__consume_buffer() consumes data from single per-CPU buffer,
    identified by its slot index.

To support a simpler, but less efficient, way to integrate perf_buffer into
external polling logic, also expose the underlying epoll FD through the
perf_buffer__epoll_fd() API. It will need to be followed by
perf_buffer__poll(), wasting an extra syscall, or perf_buffer__consume(),
wasting CPU to iterate buffers with no data. But it could be simpler and more
convenient for some cases.

These APIs allow for great flexibility, but do not sacrifice general usability
of perf_buffer.

Also exercise and check new APIs in perf_buffer selftest.
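
A rough sketch of the epoll integration described above ('pb' and 'epfd' are
assumed to be created elsewhere):

  #include <sys/epoll.h>
  #include <bpf/libbpf.h>

  int register_perf_buffers(struct perf_buffer *pb, int epfd)
  {
      size_t i, n = perf_buffer__buffer_cnt(pb);

      for (i = 0; i < n; i++) {
          int fd = perf_buffer__buffer_fd(pb, i);
          struct epoll_event ev = {
              .events = EPOLLIN,
              .data.u32 = (unsigned int)i, /* remember per-CPU buffer slot */
          };

          if (fd < 0 || epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
              return -1;
      }
      return 0;
  }

  /* when epoll later reports slot 'i' as readable: */
  static int drain_slot(struct perf_buffer *pb, size_t i)
  {
      return perf_buffer__consume_buffer(pb, i);
  }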

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20200821165927.849538-1-andriin@fb.com
2020-08-21 18:22:15 -07:00
Yonghong Song
160917756a bpf: Implement link_query for bpf iterators
This patch implements the bpf_link callback functions
show_fdinfo and fill_link_info to support the link_query
interface.

The general interface for show_fdinfo and fill_link_info
will print/fill the target_name. Each target can
register show_fdinfo and fill_link_info callbacks
to print/fill more target-specific information.

For example, the below is a fdinfo result for a bpf
task iterator.
  $ cat /proc/1749/fdinfo/7
  pos:    0
  flags:  02000000
  mnt_id: 14
  link_type:      iter
  link_id:        11
  prog_tag:       990e1f8152f7e54f
  prog_id:        59
  target_name:    task

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200821184418.574122-1-yhs@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
4a2f7ac55f libbpf: Fix libbpf build on compilers missing __builtin_mul_overflow
GCC compilers older than version 5 don't support __builtin_mul_overflow yet.
Given GCC 4.9 is the minimal supported compiler for building the kernel and the
fact that libbpf is a dependency of resolve_btfids, which is a dependency of
CONFIG_DEBUG_INFO_BTF=y, this needs to be handled. This patch fixes the issue
by falling back to slower detection of integer overflow in such cases.

Fixes: 029258d7b228 ("libbpf: Remove any use of reallocarray() in libbpf")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200820061411.1755905-2-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
a8a3089b5e libbpf: Fix detection of BPF helper call instruction
BPF_CALL | BPF_JMP32 is explicitly not allowed by verifier for BPF helper
calls, so don't detect it as a valid call. Also drop the check on func_id
pointer, as it's currently always non-null.

Fixes: 109cea5a594f ("libbpf: Sanitize BPF program code for bpf_probe_read_{kernel, user}[_str]")
Reported-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200820061411.1755905-1-andriin@fb.com
2020-08-21 18:22:15 -07:00
Xu Wang
475843fbf4 libbpf: Simplify the return expression of build_map_pin_path()
Simplify the return expression.

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200819025324.14680-1-vulab@iscas.ac.cn
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
f89dab0903 libbpf: Implement enum value-based CO-RE relocations
Implement two relocations of a new enumerator value-based CO-RE relocation
kind: ENUMVAL_EXISTS and ENUMVAL_VALUE.

First, ENUMVAL_EXISTS allows detecting the presence of a named enumerator
value in the target (kernel) BTF. This is useful to do BPF helper/map/program
type support detection from the BPF program side. The bpf_core_enum_value_exists()
macro helper is provided to simplify built-in usage.

Second, ENUMVAL_VALUE allows capturing an enumerator's integer value and relocating
it according to the target BTF, if it changes. This is useful to have
a guarantee against intentional or accidental re-ordering/re-numbering of some
of the internal (non-UAPI) enumerations, where kernel developers don't care
about UAPI backwards compatibility concerns. bpf_core_enum_value() allows
capturing this succinctly and using correct enum values in code.

LLVM uses ldimm64 instruction to capture enumerator value-based relocations,
so add support for ldimm64 instruction patching as well.
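
For illustration, a BPF-side sketch using the two macro helpers (the checked
enumerator is just an example):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_core_read.h>

  char LICENSE[] SEC("license") = "GPL";

  SEC("tp/syscalls/sys_enter_nanosleep")
  int probe_enum(void *ctx)
  {
      if (bpf_core_enum_value_exists(enum bpf_func_id, BPF_FUNC_ringbuf_output)) {
          long id = bpf_core_enum_value(enum bpf_func_id, BPF_FUNC_ringbuf_output);

          bpf_printk("ringbuf_output helper id: %ld", id);
      }
      return 0;
  }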

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200819194519.3375898-5-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
cdb21b05e5 libbpf: Implement type-based CO-RE relocations support
Implement support for TYPE_EXISTS/TYPE_SIZE/TYPE_ID_LOCAL/TYPE_ID_REMOTE
relocations. These are examples of type-based relocations, as opposed to
field-based relocations supported already. The difference is that they are
calculating relocation values based on the type itself, not a field within
a struct/union.

Type-based relos have slightly different semantics when matching local types
to kernel target types, see comments in bpf_core_types_are_compat() for
details. Their behavior on failure to find the target type in kernel BTF also
differs. Instead of "poisoning" the relocatable instruction and failing the load
subsequently in the kernel, they return 0 (which is rarely a valid return result,
so user BPF code can use that to detect success/failure of the relocation and
deal with it without extra "guarding" relocations). Also, it's always possible
to check existence of the type in the target kernel with a TYPE_EXISTS relocation,
similarly to the field-based FIELD_EXISTS.

TYPE_ID_LOCAL relocation is a bit special in that it always succeeds (barring
any libbpf/Clang bugs) and is resolved to a BTF ID using **local** BTF info of
the BPF program itself. Tests in subsequent patches demonstrate the usage and
semantics of new relocations.
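
A small BPF-side sketch of the corresponding bpf_core_type_*() macro helpers
(the inspected types are just examples):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_core_read.h>

  char LICENSE[] SEC("license") = "GPL";

  SEC("tp/syscalls/sys_enter_nanosleep")
  int probe_types(void *ctx)
  {
      /* 0 means the type wasn't found in kernel BTF, as described above */
      if (bpf_core_type_exists(struct bpf_ringbuf))
          bpf_printk("ringbuf support present");

      bpf_printk("task_struct: size %d, local/kernel BTF id %d/%d",
                 bpf_core_type_size(struct task_struct),
                 bpf_core_type_id_local(struct task_struct),
                 bpf_core_type_id_kernel(struct task_struct));
      return 0;
  }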

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200819194519.3375898-2-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
a734ef0803 tools: Remove feature-libelf-mmap feature detection
It's trivial to handle missing ELF_C_MMAP_READ support in libelf the way that
objtool has solved it in
("774bec3fddcc objtool: Add fallback from ELF_C_READ_MMAP to ELF_C_READ").

So instead of having an entire feature detector for that, just do what objtool
does for perf and libbpf. And keep their Makefiles a bit simpler.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200819013607.3607269-5-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
3c4954d5a6 libbpf: Centralize poisoning and poison reallocarray()
Most libbpf source files already include libbpf_internal.h, so it's a good
place to centralize identifier poisoning. So move kernel integer type
poisoning there. And also add reallocarray to the poison list to prevent
accidental use of it. libbpf_reallocarray() should be used universally
instead.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200819013607.3607269-4-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
c3b1c66810 tools/bpftool: Remove libbpf_internal.h usage in bpftool
Most netlink-related functions were unique to bpftool usage, so I moved them
into net.c. A few functions are still used by both bpftool and libbpf itself
internally, so I've copy-pasted them (libbpf_nl_get_link,
libbpf_netlink_open). It's a bit of code duplication, but it gives better
separation between libbpf as a library with a public API and bpftool, which
was relying on unexposed functions in libbpf.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200819013607.3607269-3-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
dc70da9c70 libbpf: Remove any use of reallocarray() in libbpf
Re-implement glibc's reallocarray() for libbpf internal-only use.
reallocarray(), unfortunately, is not available in all versions of glibc, so
it requires extra feature detection and using a reallocarray() stub from
<tools/libc_compat.h> and COMPAT_NEED_REALLOCARRAY. All this complicates the
build of libbpf unnecessarily and is just a maintenance burden. Instead, it's
trivial to implement a libbpf-specific internal version and use it throughout
libbpf.

Which is what this patch does, along with converting some realloc() uses that
should really have been reallocarray() in the first place.
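
The internal replacement can be sketched along these lines (mirroring the
overflow-checking fallback mentioned in the __builtin_mul_overflow fix above):

  #include <stdlib.h>

  #ifndef __has_builtin
  #define __has_builtin(x) 0
  #endif

  static inline void *libbpf_reallocarray(void *ptr, size_t nmemb, size_t size)
  {
      size_t total;

  #if __has_builtin(__builtin_mul_overflow)
      if (__builtin_mul_overflow(nmemb, size, &total))
          return NULL; /* nmemb * size would overflow */
  #else
      if (size && nmemb > (size_t)-1 / size)
          return NULL;
      total = nmemb * size;
  #endif
      return realloc(ptr, total);
  }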

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200819013607.3607269-2-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
9106c3028b libbpf: Improve relocation ambiguity detection
Split the instruction patching logic into relocation value calculation and
application of the relocation to the instruction. Using this, evaluate the
relocation against each matching candidate and validate that all candidates
agree on the relocated value. If not, report ambiguity and fail the load.

This logic is necessary to avoid a dangerous (however unlikely) accidental match
against two incompatible candidate types. Without this change, libbpf would
pick a random type as *the* candidate and apply a potentially invalid
relocation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200818223921.2911963-4-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
3bde6ca8e8 libbpf: Clean up and improve CO-RE reloc logging
Add logging of local/target type kind (struct/union/typedef/etc). Preserve
unresolved root type ID (for cases of typedef). Improve the format of CO-RE
reloc spec output format to contain only relevant and succinct info.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200818223921.2911963-3-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
d13e96ee32 libbpf: Improve error logging for mismatched BTF kind cases
Instead of printing out the integer value of a BTF kind, print out a string
representation of the kind.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200818223921.2911963-2-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
dc1b3e2a45 libbpf: Detect minimal BTF support and skip BTF loading, if missing
Detect whether a kernel supports any BTF at all, and if not, don't even
attempt loading BTF to avoid unnecessary log messages like:

  libbpf: Error loading BTF: Invalid argument(22)
  libbpf: Error loading .BTF into kernel: -22. BTF is optional, ignoring.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200818213356.2629020-8-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
30c61391bf libbpf: Switch tracing and CO-RE helper macros to bpf_probe_read_kernel()
Now that libbpf can automatically fall back to bpf_probe_read() on old kernels
not yet supporting bpf_probe_read_kernel(), switch libbpf BPF-side helper
macros to use the appropriate BPF helper for reading kernel data.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/20200818213356.2629020-7-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
e6f118dddd libbpf: Sanitize BPF program code for bpf_probe_read_{kernel, user}[_str]
Add BPF program code sanitization pass, replacing calls to BPF
bpf_probe_read_{kernel,user}[_str]() helpers with bpf_probe_read[_str](), if
libbpf detects that kernel doesn't support new variants.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200818213356.2629020-5-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
5d4075553b libbpf: Factor out common logic of testing and closing FD
Factor out common piece of logic that detects support for a feature based on
successfully created FD. Also take care of closing FD, if it was created.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200818213356.2629020-4-andriin@fb.com
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
5205159359 libbpf: Make kernel feature probing lazy
Turn libbpf's kernel feature probing into lazily-performed checks. This allows
skipping unnecessary feature checks if a given BPF application doesn't rely on
a particular kernel feature. As the number of feature probes grows, libbpf will
perform fewer unnecessary syscalls and scale better with the number of feature
probes long-term.

By decoupling feature checks from bpf_object, it's also possible to perform
feature probing from libbpf static helpers and low-level APIs, if necessary.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200818213356.2629020-3-andriin@fb.com
2020-08-21 18:22:15 -07:00
Xu Wang
87d7f1a32b libbpf: Convert comma to semicolon
Replace a comma between expression statements by a semicolon.

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200818071611.21923-1-vulab@iscas.ac.cn
2020-08-21 18:22:15 -07:00
Andrii Nakryiko
e8547bd4f7 vmtests: fix selftests checkout script
Fix the script.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-18 11:37:43 -07:00
Andrii Nakryiko
93959e4e43 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   bfdd5aaa54b0a44d9df550fe4c9db7e1470a11b8
Checkpoint bpf-next commit: 06a4ec1d9dc652e17ee3ac2ceb6c7cf6c2b75cdd
Baseline bpf commit:        929e54a989680c6f134b02293732030b897475dc
Checkpoint bpf commit:      3fb1a96a91120877488071a167d26d76be4be977

Andrii Nakryiko (4):
  libbpf: Fix BTF-defined map-in-map initialization on 32-bit host
    arches
  libbpf: Handle BTF pointer sizes more carefully
  libbpf: Enforce 64-bitness of BTF for BPF object files
  libbpf: Fix build on ppc64le architecture

Jean-Philippe Brucker (1):
  libbpf: Handle GCC built-in types for Arm NEON

Toke Høiland-Jørgensen (1):
  libbpf: Prevent overriding errno when logging errors

Yonghong Song (1):
  libbpf: Do not use __builtin_offsetof for offsetof

 src/bpf_helpers.h |  2 +-
 src/btf.c         | 83 +++++++++++++++++++++++++++++++++++++++++++++--
 src/btf.h         |  2 ++
 src/btf_dump.c    | 39 ++++++++++++++++++++--
 src/libbpf.c      | 32 +++++++++++-------
 src/libbpf.map    |  2 ++
 6 files changed, 143 insertions(+), 17 deletions(-)

--
2.24.1
2020-08-18 11:37:43 -07:00
Andrii Nakryiko
7ee1f12f94 libbpf: Fix build on ppc64le architecture
On ppc64le we get the following warning:

  In file included from btf_dump.c:16:0:
  btf_dump.c: In function ‘btf_dump_emit_struct_def’:
  ../include/linux/kernel.h:20:17: error: comparison of distinct pointer types lacks a cast [-Werror]
    (void) (&_max1 == &_max2);  \
                   ^
  btf_dump.c:882:11: note: in expansion of macro ‘max’
      m_sz = max(0LL, btf__resolve_size(d->btf, m->type));
             ^~~

Fix by explicitly casting to __s64, which is a return type from
btf__resolve_size().

Fixes: 702eddc77a90 ("libbpf: Handle GCC built-in types for Arm NEON")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200818164456.1181661-1-andriin@fb.com
2020-08-18 11:37:43 -07:00
Andrii Nakryiko
ff09ad9dac libbpf: Enforce 64-bitness of BTF for BPF object files
BPF object files are always targeting 64-bit BPF target architecture, so
enforce that at BTF level as well.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200813204945.1020225-7-andriin@fb.com
2020-08-18 11:37:43 -07:00
Andrii Nakryiko
025fcdc306 libbpf: Handle BTF pointer sizes more carefully
With libbpf and BTF it is pretty common to have libbpf built for one
architecture, while BTF information was generated for a different architecture
(typically, but not always, BPF). In such a case, the size of a pointer might
differ between architectures. libbpf previously always made the
assumption that the pointer size for BTF is the same as the native architecture
pointer size, but that breaks for cases where libbpf is built as a 32-bit
library, while BTF is for a 64-bit architecture.

To solve this, add a heuristic to determine pointer size by searching for a `long`
or `unsigned long` integer type and using its size as the pointer size. Also,
allow overriding the pointer size with a new API, btf__set_pointer_size(), for
cases where the application knows which pointer size should be used. A user
application can check what libbpf "guessed" by looking at the result of
btf__pointer_size(). If it's not 0, then libbpf successfully determined the
pointer size; otherwise the native arch pointer size will be used.

For cases where BTF is parsed from ELF file, use ELF's class (32-bit or
64-bit) to determine pointer size.
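
A short user-space sketch of the new APIs (the object path is a placeholder):

  #include <stdio.h>
  #include <bpf/btf.h>
  #include <bpf/libbpf.h>

  int check_ptr_size(void)
  {
      struct btf *btf = btf__parse_elf("prog.bpf.o", NULL);

      if (libbpf_get_error(btf))
          return -1;

      /* 0 means libbpf couldn't guess; BPF object files are 64-bit */
      if (btf__pointer_size(btf) == 0)
          btf__set_pointer_size(btf, 8);

      printf("pointer size: %zu\n", btf__pointer_size(btf));
      btf__free(btf);
      return 0;
  }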

Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf")
Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200813204945.1020225-5-andriin@fb.com
2020-08-18 11:37:43 -07:00
Andrii Nakryiko
b3405fcb08 libbpf: Fix BTF-defined map-in-map initialization on 32-bit host arches
Libbpf built in 32-bit mode should be careful about not conflating 64-bit BPF
pointers in the BPF ELF file with host architecture pointers. This patch fixes
an issue of incorrect initialization of map-in-map inner map slots due to this
difference.

Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200813204945.1020225-4-andriin@fb.com
2020-08-18 11:37:43 -07:00
Toke Høiland-Jørgensen
1194953749 libbpf: Prevent overriding errno when logging errors
Turns out there were a few more instances where libbpf didn't save the
errno before writing an error message, causing errno to be overridden by
the printf() return value and the error to disappear if logging is enabled.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200813142905.160381-1-toke@redhat.com
2020-08-18 11:37:43 -07:00
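
The pattern being enforced looks roughly like this sketch (pr_warn() is libbpf's internal logging macro; the message is illustrative):

    err = -errno;                            /* capture errno first...          */
    pr_warn("failed to do X: %d\n", err);    /* ...logging may clobber errno    */
    return err;
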
Jean-Philippe Brucker
1d76180057 libbpf: Handle GCC built-in types for Arm NEON
When building Arm NEON (SIMD) code from lib/raid6/neon.uc, GCC emits
DWARF information using a base type "__Poly8_t", which is internal to
GCC and not recognized by Clang. This causes build failures when
building with Clang a vmlinux.h generated from an arm64 kernel that was
built with GCC.

	vmlinux.h:47284:9: error: unknown type name '__Poly8_t'
	typedef __Poly8_t poly8x16_t[16];
	        ^~~~~~~~~

The polyX_t types are defined as unsigned integers in the "Arm C
Language Extension" document (101028_Q220_00_en). Emit typedefs based on
standard integer types for the GCC internal types, similar to those
emitted by Clang.

Including linux/kernel.h to use ARRAY_SIZE() incidentally redefined
max(), causing a build bug due to different types, hence the seemingly
unrelated change.

Reported-by: Jakov Petrina <jakov.petrina@sartura.hr>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200812143909.3293280-1-jean-philippe@linaro.org
2020-08-18 11:37:43 -07:00
Yonghong Song
048bf21dac libbpf: Do not use __builtin_offsetof for offsetof
Commit 5fbc220862fc ("tools/libpf: Add offsetof/container_of macro
in bpf_helpers.h") added a macro offsetof() to get the offset of a
structure member:

   #define offsetof(TYPE, MEMBER)  ((size_t)&((TYPE *)0)->MEMBER)

In certain use cases, size_t type may not be available so
Commit da7a35062bcc ("libbpf bpf_helpers: Use __builtin_offsetof
for offsetof") changed to use __builtin_offsetof which removed
the dependency on type size_t, which I suggested.

But using __builtin_offsetof will prevent CO-RE relocation
generation in case that, e.g., TYPE is annotated with "preserve_access_info"
where a relocation is desirable in case the member offset is changed
in a different kernel version. So this patch reverts to
the original macro, but using "unsigned long" instead of "size_t".

Fixes: da7a35062bcc ("libbpf bpf_helpers: Use __builtin_offsetof for offsetof")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/bpf/20200811030852.3396929-1-yhs@fb.com
2020-08-18 11:37:43 -07:00
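
Per the description above, the resulting definition is roughly the following sketch:

    /* original macro shape, with unsigned long instead of size_t */
    #define offsetof(TYPE, MEMBER)  ((unsigned long)&((TYPE *)0)->MEMBER)
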
Andrii Nakryiko
e954437a76 travis-ci: flatten build stages to gain more speed ups
Do both builds and selftest runs as part of a single build step. This would
allow to complete CI testing faster, as builds will happen in parallel with
"Kernel LATEST + selftests" run.

Also re-enable s390x build.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-10 22:38:26 -07:00
Andrii Nakryiko
c57be0b4d6 vmtests: speed up fetching of bpf-next sources
Attempt to first fetch bpf-next tree from a snapshot, falling back to shallow
clone, and if that is not enough, doing a full bpf-next clone. This should
both improve a speed and (because of full clone fallback) improve test
reliability if libbpf wasn't synced in a while.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-10 22:31:52 -07:00
Andrii Nakryiko
bf3ab4b0d8 travis-ci: remove s390x build as it fails to be queued by Travis CI
It's been failing for few days. Comment it out for now.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-09 13:19:59 -07:00
Andrii Nakryiko
663f66decf vmtests: blacklist problematic tests
Blacklist btf_map_in_map permanently for 5.5. bpf_verif_scale is broken due to
Clang issues on latest. Do not run ALU32 flavor for test_progs on 4.9.0, which
doesn't support ALU32 yet.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-07 16:37:09 -07:00
Andrii Nakryiko
ed187d0400 vmtest: bump LLVM_VER to 12
Bump LLVM_VER variable used in selftest build to 12.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-07 16:37:09 -07:00
Andrii Nakryiko
80453d4b2d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   3c4f850e8441ac8b3b6dbaa6107604c4199ef01f
Checkpoint bpf-next commit: bfdd5aaa54b0a44d9df550fe4c9db7e1470a11b8
Baseline bpf commit:        5b801dfb7feb2738975d80223efc2fc193e55573
Checkpoint bpf commit:      929e54a989680c6f134b02293732030b897475dc

Andrii Nakryiko (3):
  libbpf: Make destructors more robust by handling ERR_PTR(err) cases
  libbpf: Add bpf_link detach APIs
  libbpf: Add btf__parse_raw() and generic btf__parse() APIs

Daniel T. Lee (1):
  libbpf: Fix uninitialized pointer at btf__parse_raw()

Jerry Crunchtime (1):
  libbpf: Fix register in PT_REGS MIPS macros

Yonghong Song (1):
  tools/bpf: Support new uapi for map element bpf iterator

 include/uapi/linux/bpf.h |  20 ++++---
 src/bpf.c                |  13 +++++
 src/bpf.h                |   7 ++-
 src/bpf_tracing.h        |   4 +-
 src/btf.c                | 118 ++++++++++++++++++++++++++-------------
 src/btf.h                |   5 +-
 src/btf_dump.c           |   2 +-
 src/libbpf.c             |  20 ++++---
 src/libbpf.h             |   6 +-
 src/libbpf.map           |   4 ++
 10 files changed, 137 insertions(+), 62 deletions(-)

--
2.24.1
2020-08-07 16:37:09 -07:00
Daniel T. Lee
7f96c4b1d2 libbpf: Fix uninitialized pointer at btf__parse_raw()
Recently, from commit 94a1fedd63ed ("libbpf: Add btf__parse_raw() and
generic btf__parse() APIs"), new API has been added to libbpf that
allows to parse BTF from raw data file (btf__parse_raw()).

That commit causes a build failure in samples/bpf due to improper access
of an uninitialized pointer in btf__parse_raw().

    btf.c: In function btf__parse_raw:
    btf.c:625:28: error: btf may be used uninitialized in this function
      625 |  return err ? ERR_PTR(err) : btf;
          |         ~~~~~~~~~~~~~~~~~~~^~~~~

This commit fixes the build failure of samples/bpf by adding code of
initializing btf pointer as NULL.

Fixes: 94a1fedd63ed ("libbpf: Add btf__parse_raw() and generic btf__parse() APIs")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200805223359.32109-1-danieltimlee@gmail.com
2020-08-07 16:37:09 -07:00
Yonghong Song
2be293cb4a tools/bpf: Support new uapi for map element bpf iterator
Previous commit adjusted kernel uapi for map
element bpf iterator. This patch adjusted libbpf API
due to uapi change. bpftool and bpf_iter selftests
are also changed accordingly.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200805055058.1457623-1-yhs@fb.com
2020-08-07 16:37:09 -07:00
Andrii Nakryiko
a0334e97aa libbpf: Add btf__parse_raw() and generic btf__parse() APIs
Add public APIs to parse BTF from raw data file (e.g.,
/sys/kernel/btf/vmlinux), as well as generic btf__parse(), which will try to
determine correct format, currently either raw or ELF.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200802013219.864880-2-andriin@fb.com
2020-08-07 16:37:09 -07:00
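
A minimal sketch of the two entry points (error handling omitted; the object path is hypothetical):

    #include <bpf/btf.h>

    /* raw BTF blob, e.g. the one exported by the kernel */
    struct btf *kern_btf = btf__parse_raw("/sys/kernel/btf/vmlinux");

    /* format auto-detected: raw or ELF */
    struct btf *obj_btf = btf__parse("prog.bpf.o", NULL);
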
Andrii Nakryiko
2d97d4097f libbpf: Add bpf_link detach APIs
Add low-level bpf_link_detach() API. Also add higher-level bpf_link__detach()
one.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200731182830.286260-3-andriin@fb.com
2020-08-07 16:37:09 -07:00
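
A sketch of both flavors ('link' and 'link_fd' are assumed to come from an earlier attach):

    /* high-level: operates on a struct bpf_link */
    err = bpf_link__detach(link);

    /* low-level: operates on a raw link FD */
    err = bpf_link_detach(link_fd);
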
Jerry Crunchtime
80a52e3252 libbpf: Fix register in PT_REGS MIPS macros
The o32, n32 and n64 calling conventions require the return
value to be stored in $v0 which maps to $2 register, i.e.,
the register 2.

Fixes: c1932cd ("bpf: Add MIPS support to samples/bpf.")
Signed-off-by: Jerry Crunchtime <jerry.c.t@web.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/43707d31-0210-e8f0-9226-1af140907641@web.de
2020-08-07 16:37:09 -07:00
Andrii Nakryiko
2dc7cbd893 libbpf: Make destructors more robust by handling ERR_PTR(err) cases
Most of libbpf "constructors" on failure return ERR_PTR(err) result encoded as
a pointer. It's a common mistake to eventually pass such malformed pointers
into xxx__destroy()/xxx__free() "destructors". So instead of fixing up
cleanup code in selftests and user programs, handle such error pointers in
destructors themselves. This already works beautifully for NULL pointers passed
to destructors, so it might as well also work for error pointers.

Suggested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200729232148.896125-1-andriin@fb.com
2020-08-07 16:37:09 -07:00
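
The defensive check amounts to something like this sketch (the destructor name is illustrative):

    void some_type__free(struct some_type *obj)
    {
            /* tolerate both NULL and ERR_PTR-encoded error pointers */
            if (IS_ERR_OR_NULL(obj))
                    return;
            /* ... actual cleanup ... */
    }
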
Thomas Hebb
0466b9833b README: Add Arch to list of downstream distros
Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
2020-08-06 21:21:02 -07:00
Andrii Nakryiko
ba8d45968b vmtests: specify v12 of clang/llvm for now
Whatever happened, clang-11 and llvm-11, to which clang/llvm packages resolve,
respectively, are not there anymore. Seems like clang-12/llvm-12 are the
latest now, but for whatever reason clang/llvm don't resolve to them yet.
Hard-code version 12 for now.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-06 17:32:33 -07:00
Thomas Hebb
734b3f0afe check-reallocarray.sh: Use the same compiler Make does
Currently we hardcode "gcc", which means we get a bogus result any time
a non-default CC is passed to Make. In fact, it's bogus even when CC is
not explicitly set, since Make's default is "cc", which isn't
necessarily the same as "gcc".

Fix the issue by passing the compiler to use to check-reallocarray.sh.

Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
2020-07-28 14:05:35 -07:00
Andrii Nakryiko
f56874ba8a vmtests: blacklist sk_lookup on LATEST and cg_storage_multi on 5.5
Blacklist two failing tests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
3f26bf1adf sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9a97c9d2af5ca798377342debf7f0f44281d050e
Checkpoint bpf-next commit: 3c4f850e8441ac8b3b6dbaa6107604c4199ef01f
Baseline bpf commit:        5b801dfb7feb2738975d80223efc2fc193e55573
Checkpoint bpf commit:      5b801dfb7feb2738975d80223efc2fc193e55573

Andrii Nakryiko (1):
  bpf: Fix bpf_ringbuf_output() signature to return long

 include/uapi/linux/bpf.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
2.24.1
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
ab01213b35 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   5c3320d7fece4612d4a413aa3c8e82cdb5b49fcb
Checkpoint bpf-next commit: 9a97c9d2af5ca798377342debf7f0f44281d050e
Baseline bpf commit:        b2f9f1535bb93ee5fa2ea30ac1c26fa0d676154c
Checkpoint bpf commit:      5b801dfb7feb2738975d80223efc2fc193e55573

Andrii Nakryiko (3):
  libbpf: Support stripping modifiers for btf_dump
  tools/bpftool: Strip away modifiers from global variables
  libbpf: Add support for BPF XDP link

Ciara Loftus (1):
  xsk: Add new statistics

Horatiu Vultur (1):
  net: bridge: Add port attribute IFLA_BRPORT_MRP_IN_OPEN

Ian Rogers (1):
  libbpf bpf_helpers: Use __builtin_offsetof for offsetof

Jakub Sitnicki (2):
  bpf: Sync linux/bpf.h to tools/
  libbpf: Add support for SK_LOOKUP program type

Lorenzo Bianconi (3):
  cpumap: Formalize map value as a named struct
  bpf: cpumap: Add the possibility to attach an eBPF program to cpumap
  libbpf: Add SEC name for xdp programs attached to CPUMAP

Quentin Monnet (1):
  bpf: Fix formatting in documentation for BPF helpers

Randy Dunlap (1):
  bpf: Drop duplicated words in uapi helper comments

Song Liu (1):
  libbpf: Print hint when PERF_EVENT_IOC_SET_BPF returns -EPROTO

Yonghong Song (2):
  bpf: Implement bpf iterator for map elements
  tools/libbpf: Add support for bpf map element iterator

 include/uapi/linux/bpf.h     | 155 +++++++++++++++++++++++++++++------
 include/uapi/linux/if_link.h |   1 +
 include/uapi/linux/if_xdp.h  |   5 +-
 src/bpf.c                    |   1 +
 src/bpf.h                    |   3 +-
 src/bpf_helpers.h            |   2 +-
 src/btf.h                    |   4 +-
 src/btf_dump.c               |  10 ++-
 src/libbpf.c                 |  27 +++++-
 src/libbpf.h                 |   7 +-
 src/libbpf.map               |   3 +
 src/libbpf_probes.c          |   3 +
 12 files changed, 188 insertions(+), 33 deletions(-)

--
2.24.1
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
8af35e73a2 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
a290d45322 libbpf: Add support for BPF XDP link
Sync UAPI header and add support for using bpf_link-based XDP attachment.
Make xdp/ prog type set expected attach type. Kernel didn't enforce
attach_type for XDP programs before, so there are no backwards compatibility
issues there.

Also fix section_names selftest to recognize that xdp prog types now have
expected attach type.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200722064603.3350758-8-andriin@fb.com
2020-07-28 14:03:17 -07:00
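
A sketch of bpf_link-based XDP attachment (skeleton and program names are hypothetical, and 'ifindex' is assumed to identify the target device; error handling omitted):

    struct bpf_link *link;

    link = bpf_program__attach_xdp(skel->progs.xdp_prog, ifindex);
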
Song Liu
3f6b428909 libbpf: Print hint when PERF_EVENT_IOC_SET_BPF returns -EPROTO
The kernel prevents potential unwinder warnings and crashes by blocking
BPF program with bpf_get_[stack|stackid] on perf_event without
PERF_SAMPLE_CALLCHAIN, or with exclude_callchain_[kernel|user]. Print a
hint message in libbpf to help the user debug such issues.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200723180648.1429892-4-songliubraving@fb.com
2020-07-28 14:03:17 -07:00
Yonghong Song
5efd8395ef tools/libbpf: Add support for bpf map element iterator
Add map_fd to bpf_iter_attach_opts and flags to
bpf_link_create_opts. Later on, bpftool or selftest
will be able to create a bpf map element iterator
by passing map_fd to the kernel during link
creation time.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200723184117.590673-1-yhs@fb.com
2020-07-28 14:03:17 -07:00
Yonghong Song
b1720407ff bpf: Implement bpf iterator for map elements
The bpf iterator for map elements is implemented.
The bpf program will receive four parameters:
  bpf_iter_meta *meta: the meta data
  bpf_map *map:        the bpf_map whose elements are traversed
  void *key:           the key of one element
  void *value:         the value of the same element

Here, meta and map pointers are always valid, and
key has register type PTR_TO_RDONLY_BUF_OR_NULL and
value has register type PTR_TO_RDWR_BUF_OR_NULL.
The kernel will track the access range of key and value
during verification time. Later, these values will be compared
against the values in the actual map to ensure all accesses
are within range.

A new field iter_seq_info is added to bpf_map_ops which
is used to add map type specific information, i.e., seq_ops,
init/fini seq_file func and seq_file private data size.
Subsequent patches will have actual implementation
for bpf_map_ops->iter_seq_info.

In user space, BPF_ITER_LINK_MAP_FD needs to be
specified in prog attr->link_create.flags, which indicates
that attr->link_create.target_fd is a map_fd.
The reason for such an explicit flag is for possible
future cases where one bpf iterator may allow more than
one possible customization, e.g., pid and cgroup id for
task_file.

Current kernel internal implementation only allows
the target to register at most one required bpf_iter_link_info.
To support the above case, optional bpf_iter_link_info's
are needed, the target can be extended to register such link
infos, and user provided link_info needs to match one of
target supported ones.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200723184112.590360-1-yhs@fb.com
2020-07-28 14:03:17 -07:00
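
A sketch of the BPF-program side, assuming vmlinux.h and bpf_helpers.h are included and a hypothetical map with u32 keys and u64 values (names are illustrative):

    SEC("iter/bpf_map_elem")
    int dump_elem(struct bpf_iter__bpf_map_elem *ctx)
    {
            __u32 *key = ctx->key;
            __u64 *val = ctx->value;

            /* key/value may be NULL, as described above */
            if (!key || !val)
                    return 0;
            /* read or update the element here */
            return 0;
    }
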
Ian Rogers
698820a9d9 libbpf bpf_helpers: Use __builtin_offsetof for offsetof
The non-builtin route for offsetof has a dependency on size_t from
stdlib.h/stdint.h that is undeclared and may break targets.
The offsetof macro in bpf_helpers may disable the same macro in other
headers that have a #ifdef offsetof guard. Rather than add additional
dependencies improve the offsetof macro declared here to use the
builtin that is available since llvm 3.7 (the first with a BPF backend).

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200720061741.1514673-1-irogers@google.com
2020-07-28 14:03:17 -07:00
Jakub Sitnicki
6d92249be0 libbpf: Add support for SK_LOOKUP program type
Make libbpf aware of the newly added program type, and assign it a
section name.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-13-jakub@cloudflare.com
2020-07-28 14:03:17 -07:00
Jakub Sitnicki
1736996279 bpf: Sync linux/bpf.h to tools/
Newly added program, context type and helper is used by tests in a
subsequent patch. Synchronize the header file.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-12-jakub@cloudflare.com
2020-07-28 14:03:17 -07:00
Randy Dunlap
f9f5f054d2 bpf: Drop duplicated words in uapi helper comments
Drop doubled words "will" and "attach".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/6b9f71ae-4f8e-0259-2c5d-187ddaefe6eb@infradead.org
2020-07-28 14:03:17 -07:00
Lorenzo Bianconi
4a5aecf034 libbpf: Add SEC name for xdp programs attached to CPUMAP
As for DEVMAP, support SEC("xdp_cpumap/") as a short cut for loading
the program with type BPF_PROG_TYPE_XDP and expected attach type
BPF_XDP_CPUMAP.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/33174c41993a6d860d9c7c1f280a2477ee39ed11.1594734381.git.lorenzo@kernel.org
2020-07-28 14:03:17 -07:00
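
A sketch of a program picked up by the new SEC prefix (program and section suffix names are illustrative):

    SEC("xdp_cpumap/pass")
    int xdp_cpumap_pass(struct xdp_md *ctx)
    {
            /* loaded as BPF_PROG_TYPE_XDP with expected attach type BPF_XDP_CPUMAP */
            return XDP_PASS;
    }
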
Lorenzo Bianconi
77f11b3674 bpf: cpumap: Add the possibility to attach an eBPF program to cpumap
Introduce the capability to attach an eBPF program to cpumap entries.
The idea behind this feature is to add the possibility to define on
which CPU to run the eBPF program if the underlying hw does not support
RSS. Currently supported verdicts are XDP_DROP and XDP_PASS.

This patch has been tested on Marvell ESPRESSObin using xdp_redirect_cpu
sample available in the kernel tree to identify possible performance
regressions. Results show there are no observable differences in
packets per second:

$./xdp_redirect_cpu --progname xdp_cpu_map0 --dev eth0 --cpu 1
rx: 354.8 Kpps
rx: 356.0 Kpps
rx: 356.8 Kpps
rx: 356.3 Kpps
rx: 356.6 Kpps
rx: 356.6 Kpps
rx: 356.7 Kpps
rx: 355.8 Kpps
rx: 356.8 Kpps
rx: 356.8 Kpps

Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/5c9febdf903d810b3415732e5cd98491d7d9067a.1594734381.git.lorenzo@kernel.org
2020-07-28 14:03:17 -07:00
Lorenzo Bianconi
cd46c9d67e cpumap: Formalize map value as a named struct
As it has been already done for devmap, introduce 'struct bpf_cpumap_val'
to formalize the expected values that can be passed in for a CPUMAP.
Update cpumap code to use the struct.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/754f950674665dae6139c061d28c1d982aaf4170.1594734381.git.lorenzo@kernel.org
2020-07-28 14:03:17 -07:00
Horatiu Vultur
41054a32df net: bridge: Add port attribute IFLA_BRPORT_MRP_IN_OPEN
This patch adds a new port attribute, IFLA_BRPORT_MRP_IN_OPEN, which
allows notifying userspace when the node has lost the continuity of
MRP_InTest frames.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
852b4c8e73 tools/bpftool: Strip away modifiers from global variables
Reliably remove all the type modifiers from read-only (.rodata) global
variable definitions, including cases of inner field const modifiers and
arrays of const values.

Also modify one of selftests to ensure that const volatile struct doesn't
prevent user-space from modifying .rodata variable.

Fixes: 985ead416df3 ("bpftool: Add skeleton codegen command")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200713232409.3062144-3-andriin@fb.com
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
de60a31eba libbpf: Support stripping modifiers for btf_dump
One important use case when emitting const/volatile/restrict is undesirable is
BPF skeleton generation of DATASEC layout. These are further memory-mapped and
can be written/read from user-space directly.

For important case of .rodata variables, bpftool strips away first-level
modifiers, to make their use on user-space side simple and not requiring extra
type casts to override compiler complaining about writing to const variables.

This logic works mostly fine, but breaks in some more complicated cases. E.g.:

    const volatile int params[10];

Because in BTF it's a chain of ARRAY -> CONST -> VOLATILE -> INT, bpftool
stops at ARRAY and doesn't strip CONST and VOLATILE. In skeleton this variable
will be emitted as is. So when used from user-space, compiler will complain
about writing to const array. This is problematic, as also mentioned in [0].

To solve this for arrays and other non-trivial cases (e.g., inner
const/volatile fields inside the struct), teach btf_dump to strip away any
modifier, when requested. This is done as an extra option on
btf_dump__emit_type_decl() API.

Reported-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200713232409.3062144-2-andriin@fb.com
2020-07-28 14:03:17 -07:00
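
A sketch of the extra option (variable names are illustrative):

    DECLARE_LIBBPF_OPTS(btf_dump_emit_type_decl_opts, opts,
            .field_name = "params",
            .strip_mods = true,     /* drop const/volatile/restrict */
    );

    btf_dump__emit_type_decl(d, type_id, &opts);
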
Ciara Loftus
8ec7d86efe xsk: Add new statistics
It can be useful for the user to know the reason behind a dropped packet.
Introduce new counters which track drops on the receive path caused by:
1. rx ring being full
2. fill ring being empty

Also, on the tx path introduce a counter which tracks the number of times
we attempt pull from the tx ring when it is empty.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200708072835.4427-2-ciara.loftus@intel.com
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
c3984343bc sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2977282b63c3b6f112145ecf0bcefff0c65bd3ac
Checkpoint bpf-next commit: 5c3320d7fece4612d4a413aa3c8e82cdb5b49fcb
Baseline bpf commit:        b2f9f1535bb93ee5fa2ea30ac1c26fa0d676154c
Checkpoint bpf commit:      b2f9f1535bb93ee5fa2ea30ac1c26fa0d676154c

Andrii Nakryiko (1):
  libbpf: Fix memory leak and optimize BTF sanitization

 src/btf.c    |  2 +-
 src/btf.h    |  2 +-
 src/libbpf.c | 11 +++--------
 3 files changed, 5 insertions(+), 10 deletions(-)

--
2.24.1
2020-07-10 09:11:41 -07:00
Andrii Nakryiko
5255eb2799 libbpf: Fix memory leak and optimize BTF sanitization
Coverity's static analysis helpfully reported a memory leak introduced by
0f0e55d8247c ("libbpf: Improve BTF sanitization handling"). While fixing it,
I realized that btf__new() already creates a memory copy, so there is no need
to do this. So this patch also fixes misleading btf__new() signature to make
data into a `const void *` input parameter. And it avoids unnecessary memory
allocation and copy in BTF sanitization code altogether.

Fixes: 0f0e55d8247c ("libbpf: Improve BTF sanitization handling")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200710011023.1655008-1-andriin@fb.com
2020-07-10 09:11:41 -07:00
Andrii Nakryiko
8b5e81a17a sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2977282b63c3b6f112145ecf0bcefff0c65bd3ac
Checkpoint bpf-next commit: 2977282b63c3b6f112145ecf0bcefff0c65bd3ac
Baseline bpf commit:        0f57a1e522f413e87852e632f55de4723e511939
Checkpoint bpf commit:      b2f9f1535bb93ee5fa2ea30ac1c26fa0d676154c

Jakub Bogusz (1):
  libbpf: Fix libbpf hashmap on (I)LP32 architectures

 src/hashmap.h | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

--
2.24.1
2020-07-09 22:00:15 -07:00
Jakub Bogusz
cd016d93f7 libbpf: Fix libbpf hashmap on (I)LP32 architectures
On ILP32, the 64-bit result was shifted by a value calculated for the 32-bit long
type, and the returned value was far outside the hashmap capacity.
As advised by Andrii Nakryiko, this patch uses different hashing variant for
architectures with size_t shorter than long long.

Fixes: e3b924224028 ("libbpf: add resizable non-thread safe internal hashmap")
Signed-off-by: Jakub Bogusz <qboosh@pld-linux.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200709225723.1069937-1-andriin@fb.com
2020-07-09 22:00:15 -07:00
Andrii Nakryiko
deaee9541d vmtests: update blacklist for 5.5
Add two tests (sockopt_sk and udp_limit) to blacklist of 5.5 kernel.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
daa2c7f851 ci: re-arrange tests to prioritize higher-signal tests
Put selftests in first stage. Put long-running LATEST build & test case first,
so that it can be better parallelized with 4.9 and 5.5.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
006904d416 vmtests: whitelist core_retro for 4.9 tests
Add core_retro to whitelist for 4.9, as it is supposed to work on old kernels.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
e47ebc895d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   6b207d66aa9fad0deed13d5f824e1ea193b0a777
Checkpoint bpf-next commit: 2977282b63c3b6f112145ecf0bcefff0c65bd3ac
Baseline bpf commit:        e708e2bd55c921f5bb554fa5837d132a878951cf
Checkpoint bpf commit:      0f57a1e522f413e87852e632f55de4723e511939

Andrii Nakryiko (4):
  libbpf: Make BTF finalization strict
  libbpf: Add btf__set_fd() for more control over loaded BTF FD
  libbpf: Improve BTF sanitization handling
  libbpf: Handle missing BPF_OBJ_GET_INFO_BY_FD gracefully in
    perf_buffer

Stanislav Fomichev (1):
  libbpf: Add support for BPF_CGROUP_INET_SOCK_RELEASE

 include/uapi/linux/bpf.h |   1 +
 src/btf.c                |   7 +-
 src/btf.h                |   1 +
 src/libbpf.c             | 154 ++++++++++++++++++++++-----------------
 src/libbpf.map           |   1 +
 5 files changed, 95 insertions(+), 69 deletions(-)

--
2.24.1
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
3b2837e296 libbpf: Handle missing BPF_OBJ_GET_INFO_BY_FD gracefully in perf_buffer
perf_buffer__new() relies on BPF_OBJ_GET_INFO_BY_FD availability for a few
sanity checks. OBJ_GET_INFO for maps is actually a much more recent feature than
perf_buffer support itself, so this causes unnecessary problems on old kernels
before BPF_OBJ_GET_INFO_BY_FD was added.

This patch makes those sanity checks optional and just assumes best if command
is not supported. If user specified something incorrectly (e.g., wrong map
type), kernel will reject it later anyway, except user won't get a nice
explanation as to why it failed. This seems like a good trade off for
supporting perf_buffer on old kernels.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200708015318.3827358-6-andriin@fb.com
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
90716e9e14 libbpf: Improve BTF sanitization handling
Change sanitization process to preserve original BTF, which might be used by
libbpf itself for Kconfig externs, CO-RE relocs, etc, even if kernel is old
and doesn't support BTF. To achieve that, if libbpf detects the need for BTF
sanitization, it would clone original BTF, sanitize it in-place, attempt to
load it into kernel, and if successful, will preserve loaded BTF FD in
original `struct btf`, while freeing sanitized local copy.

If kernel doesn't support any BTF, original btf and btf_ext will still be
preserved to be used later for CO-RE relocation and other BTF-dependent libbpf
features, which don't depend on kernel BTF support.

Patch takes care to not specify BTF and BTF.ext features when loading BPF
programs and/or maps, if it was detected that kernel doesn't support BTF
features.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200708015318.3827358-4-andriin@fb.com
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
d5a36e2070 libbpf: Add btf__set_fd() for more control over loaded BTF FD
Add setter for BTF FD to allow application more fine-grained control in more
advanced scenarios. Storing BTF FD inside `struct btf` provides little benefit
and probably would be better done differently (e.g., btf__load() could just
return FD on success), but we are stuck with this due to backwards
compatibility. The main problem is that it's impossible to load BTF and then
free user-space memory, but keep FD intact, because `struct btf` assumes
ownership of that FD upon successful load and will attempt to close it during
btf__free(). To allow callers (e.g., libbpf itself for BTF sanitization) to
have more control over this, add btf__set_fd() to allow to reset FD
arbitrarily, if necessary.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200708015318.3827358-3-andriin@fb.com
2020-07-08 17:12:53 -07:00
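
A sketch of the kind of control this enables (error handling omitted):

    /* after btf__load(), 'btf' owns the kernel FD and would close it on free */
    int fd = btf__fd(btf);

    btf__set_fd(btf, -1);   /* drop ownership so btf__free() keeps the FD open */
    btf__free(btf);
    /* 'fd' remains valid for the caller to use and eventually close */
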
Andrii Nakryiko
133543c202 libbpf: Make BTF finalization strict
With valid ELF and valid BTF, there is no reason (apart from bugs) why BTF
finalization should fail. So make it strict and return error if it fails. This
makes CO-RE relocation more reliable, as they are not going to be just
silently skipped, if BTF finalization failed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200708015318.3827358-2-andriin@fb.com
2020-07-08 17:12:53 -07:00
Stanislav Fomichev
abb82202da libbpf: Add support for BPF_CGROUP_INET_SOCK_RELEASE
Add auto-detection for the cgroup/sock_release programs.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200706230128.4073544-3-sdf@google.com
2020-07-08 17:12:53 -07:00
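
A sketch of a program that now gets auto-detected via its section name (the program name is illustrative):

    SEC("cgroup/sock_release")
    int sock_release_prog(struct bpf_sock *ctx)
    {
            return 1;       /* allow */
    }
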
Andrii Nakryiko
5020fdf8fc vmtests: fix 4.9 build
Drop blacklist and instead use a small whitelist of tests that are still
supposed to work on old 4.9 kernel.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-07 11:10:16 -07:00
Andrii Nakryiko
a846caca79 vmtests: test no-alu32 variant of test_progs
Add testing of no-alu32 flavor of test_progs.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-07 10:41:57 -07:00
Julia Kartseva
1b42b15b5e travis_ci: run tests for 4.9 kernel
Make sure that libbpf sanitizes BTF properly for older kernels.
Add a stage for 4.9.0 kernel in TravisCI.
For now make test failures non-blocking by adding 4.9.0 to `allow_failures`
section.
Blacklist is copy-pasted 5.5.0 kernel blacklist.
2020-07-01 15:38:31 -07:00
Andrii Nakryiko
a2b27a1b62 vmtests: remove custom 5.5 selftest preparation actions
Now that pre-generated vmlinux.h is used for compilation of non-latest tests,
we don't need custom adjustments for 5.5 kernel selftests. Adjust blacklist
now that those new self-tests are built into test_progs.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-01 15:19:18 -07:00
Andrii Nakryiko
7b9d71b21d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   ca4db6389d611eee2eb7c1dfe710b62d8ea06772
Checkpoint bpf-next commit: 6b207d66aa9fad0deed13d5f824e1ea193b0a777
Baseline bpf commit:        2bdeb3ed547d8822b2566797afa6c2584abdb119
Checkpoint bpf commit:      e708e2bd55c921f5bb554fa5837d132a878951cf

Andrii Nakryiko (1):
  libbpf: Make bpf_endian co-exist with vmlinux.h

Song Liu (1):
  bpf: Introduce helper bpf_get_task_stack()

 include/uapi/linux/bpf.h | 37 +++++++++++++++++++++++++++++++++-
 src/bpf_endian.h         | 43 ++++++++++++++++++++++++++++++++--------
 2 files changed, 71 insertions(+), 9 deletions(-)

--
2.24.1
2020-07-01 14:36:55 -07:00
Andrii Nakryiko
89f7f0796a sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-07-01 14:36:55 -07:00
Song Liu
c054d91247 bpf: Introduce helper bpf_get_task_stack()
Introduce helper bpf_get_task_stack(), which dumps stack trace of given
task. This is different to bpf_get_stack(), which gets stack track of
current task. One potential use case of bpf_get_task_stack() is to call
it from bpf_iter__task and dump all /proc/<pid>/stack to a seq_file.

bpf_get_task_stack() uses stack_trace_save_tsk() instead of
get_perf_callchain() for kernel stack. The benefit of this choice is that
stack_trace_save_tsk() doesn't require changes in arch/. The downside of
using stack_trace_save_tsk() is that stack_trace_save_tsk() dumps the
stack trace to unsigned long array. For 32-bit systems, we need to
translate it to u64 array.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-3-songliubraving@fb.com
2020-07-01 14:36:55 -07:00
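
A sketch of the task-iterator use case mentioned above, assuming vmlinux.h and bpf_helpers.h are included (buffer size and names are illustrative):

    __u64 stack[64];

    SEC("iter/task")
    int dump_task_stack(struct bpf_iter__task *ctx)
    {
            struct task_struct *task = ctx->task;
            long len;

            if (!task)
                    return 0;
            len = bpf_get_task_stack(task, stack, sizeof(stack), 0);
            /* on success, 'len' bytes of stack data are now in 'stack' */
            return 0;
    }
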
Andrii Nakryiko
9c104b1637 libbpf: Make bpf_endian co-exist with vmlinux.h
Make bpf_endian.h compatible with vmlinux.h. It is a frequent request from
users wanting to use bpf_endian.h in their BPF applications using CO-RE and
vmlinux.h.

To achieve that, re-implement byte swap macros and drop all the header
includes. This way it can be used either with linux header includes or with
vmlinux.h.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630152125.3631920-2-andriin@fb.com
2020-07-01 14:36:55 -07:00
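
A sketch of the now-supported combination (program name is illustrative):

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_endian.h>     /* no linux/ header includes required */

    SEC("xdp")
    int use_endian(struct xdp_md *ctx)
    {
            __u16 http_port = bpf_htons(80);    /* folded at compile time */

            return http_port ? XDP_PASS : XDP_DROP;
    }
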
Andrii Nakryiko
d08d57cd91 vmtests: check in vmlinux.h and use it for non-latest builds
Manually generate vmlinux.h based on latest.config to be used for non-latest
selftest build. This will keep bpftool and newest selftests builds succeeding,
while at runtime blacklist will skip them.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-30 18:09:33 -07:00
Andrii Nakryiko
803243cc33 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b3eece09e2e69f528a1ab6104861550dec149083
Checkpoint bpf-next commit: afa12644c877d3f627281bb6493d7ca8f9976e3d
Baseline bpf commit:        4e15507fea70c0c312d79610efa46b6853ccf8e0
Checkpoint bpf commit:      2bdeb3ed547d8822b2566797afa6c2584abdb119

Andrii Nakryiko (4):
  bpf: Switch most helper return values from 32-bit int to 64-bit long
  libbpf: Prevent loading vmlinux BTF twice
  libbpf: Support disabling auto-loading BPF programs
  libbpf: Fix CO-RE relocs against .text section

Colin Ian King (1):
  libbpf: Fix spelling mistake "kallasyms" -> "kallsyms"

Dmitry Yakunin (1):
  bpf: Add SO_KEEPALIVE and related options to bpf_setsockopt

Jesper Dangaard Brouer (1):
  libbpf: Adjust SEC short cut for expected attach type BPF_XDP_DEVMAP

Quentin Monnet (1):
  bpf: Fix formatting in documentation for BPF helpers

Yonghong Song (3):
  bpf: Add bpf_skc_to_tcp6_sock() helper
  bpf: Add bpf_skc_to_{tcp, tcp_timewait, tcp_request}_sock() helpers
  bpf: Add bpf_skc_to_udp6_sock() helper

 include/uapi/linux/bpf.h | 277 ++++++++++++++++++++++-----------------
 src/libbpf.c             |  93 +++++++++----
 src/libbpf.h             |   2 +
 src/libbpf.map           |   2 +
 4 files changed, 233 insertions(+), 141 deletions(-)

--
2.24.1
2020-06-29 13:46:33 -07:00
Andrii Nakryiko
d707f8027b sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-06-29 13:46:33 -07:00
Jesper Dangaard Brouer
652f2c0a40 libbpf: Adjust SEC short cut for expected attach type BPF_XDP_DEVMAP
Adjust the SEC("xdp_devmap/") prog type prefix to contain a
slash "/" for expected attach type BPF_XDP_DEVMAP.  This is consistent
with other prog types like tracing.

Fixes: 2778797037a6 ("libbpf: Add SEC name for xdp programs attached to device map")
Suggested-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159309521882.821855.6873145686353617509.stgit@firesoul
2020-06-29 13:46:33 -07:00
Quentin Monnet
2fcd394505 bpf: Fix formatting in documentation for BPF helpers
When producing the bpf-helpers.7 man page from the documentation from
the BPF user space header file, rst2man complains:

    <stdin>:2636: (ERROR/3) Unexpected indentation.
    <stdin>:2640: (WARNING/2) Block quote ends without a blank line; unexpected unindent.

Let's fix formatting for the relevant chunk (item list in
bpf_ringbuf_query()'s description), and for a couple other functions.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200623153935.6215-1-quentin@isovalent.com
2020-06-29 13:46:33 -07:00
Andrii Nakryiko
af3c9f9fc4 libbpf: Fix CO-RE relocs against .text section
bpf_object__find_program_by_title(), used by CO-RE relocation code, doesn't
return .text "BPF program", if it is a function storage for sub-programs.
Because of that, any CO-RE relocation in helper non-inlined functions will
fail. Fix this by searching for .text-corresponding BPF program manually.

Adjust one of bpf_iter selftest to exhibit this pattern.

Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Reported-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200619230423.691274-1-andriin@fb.com
2020-06-29 13:46:33 -07:00
Andrii Nakryiko
a62b08dd0c libbpf: Support disabling auto-loading BPF programs
Currently, bpf_object__load() (and, by extension, the skeleton's load) will always
attempt to prepare, relocate, and load into kernel every single BPF program
found inside the BPF object file. This is often convenient and the right thing
to do and what users expect.

But there are plenty of cases (especially with BPF development constantly
picking up the pace) where a BPF application is intended to work with old
kernels, with a potentially reduced set of features. On kernels supporting
extra features, it would like to take full advantage of them by employing
extra BPF programs. This could be a choice of using fentry/fexit over
kprobe/kretprobe, if kernel is recent enough and is built with BTF. Or BPF
program might be providing optimized bpf_iter-based solution that user-space
might want to use, whenever available. And so on.

With libbpf and BPF CO-RE in particular, it's advantageous to not have to
maintain two separate BPF object files to achieve this. So to enable such use
cases, this patch adds ability to request not auto-loading chosen BPF
programs. In such case, libbpf won't attempt to perform relocations (which
might fail due to old kernel), won't try to resolve BTF types for
BTF-aware (tp_btf/fentry/fexit/etc) program types, because BTF might not be
present, and so on. Skeleton will also automatically skip auto-attachment step
for such not loaded BPF programs.

Overall, this feature simplifies development and deployment of
real-world BPF applications with complicated compatibility requirements.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200625232629.3444003-2-andriin@fb.com
2020-06-29 13:46:33 -07:00
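
A sketch of the intended usage with a skeleton (the skeleton, program, and feature-probe names are hypothetical):

    struct my_bpf *skel = my_bpf__open();

    /* on older kernels, skip the fentry-based program and rely on a
     * kprobe-based fallback program instead */
    if (!kernel_supports_fentry())
            bpf_program__set_autoload(skel->progs.handle_exec_fentry, false);

    err = my_bpf__load(skel);
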
Yonghong Song
318ed9d544 bpf: Add bpf_skc_to_udp6_sock() helper
The helper is used in tracing programs to cast a socket
pointer to a udp6_sock pointer.
The return value could be NULL if the casting is illegal.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/bpf/20200623230815.3988481-1-yhs@fb.com
2020-06-29 13:46:33 -07:00
Yonghong Song
47370741be bpf: Add bpf_skc_to_{tcp, tcp_timewait, tcp_request}_sock() helpers
Three more helpers are added to cast a sock_common pointer to
an tcp_sock, tcp_timewait_sock or a tcp_request_sock for
tracing programs.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230811.3988277-1-yhs@fb.com
2020-06-29 13:46:33 -07:00
Yonghong Song
26e5e7dcb0 bpf: Add bpf_skc_to_tcp6_sock() helper
The helper is used in tracing programs to cast a socket
pointer to a tcp6_sock pointer.
The return value could be NULL if the casting is illegal.

A new helper return type RET_PTR_TO_BTF_ID_OR_NULL is added
so the verifier is able to deduce proper return types for the helper.

Different from the previous BTF_ID based helpers,
the bpf_skc_to_tcp6_sock() argument can be several possible
btf_ids. More specifically, all possible socket data structures
with sock_common appearing first in their memory layout.
This patch only added socket types related to tcp and udp.

All possible argument btf_id and return value btf_id
for helper bpf_skc_to_tcp6_sock() are pre-calculated and
cached. In the future, it is even possible to precompute
these btf_id's at kernel build time.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230809.3988195-1-yhs@fb.com
2020-06-29 13:46:33 -07:00
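
A sketch of using one of these casting helpers from a tracing program ('sk' is assumed to be a socket pointer obtained earlier):

    struct tcp6_sock *tp6 = bpf_skc_to_tcp6_sock(sk);

    if (tp6) {
            /* cast succeeded; read fields, e.g. via BPF_CORE_READ() */
    }
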
Dmitry Yakunin
cd469e21e8 bpf: Add SO_KEEPALIVE and related options to bpf_setsockopt
This patch adds support of SO_KEEPALIVE flag and TCP related options
to bpf_setsockopt() routine. This is helpful if we want to enable or tune
TCP keepalive for applications which don't do it in the userspace code.

v3:
  - update kernel-doc in uapi (Nikita Vetoshkin <nekto0n@yandex-team.ru>)

v4:
  - update kernel-doc in tools too (Alexei Starovoitov)
  - add test to selftests (Alexei Starovoitov)

Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200620153052.9439-3-zeil@yandex-team.ru
2020-06-29 13:46:33 -07:00
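
A sketch of enabling keepalive from a sockops program (assumes the usual socket option constants are available via included headers; names are illustrative):

    SEC("sockops")
    int enable_keepalive(struct bpf_sock_ops *skops)
    {
            int one = 1;

            if (skops->op == BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB)
                    bpf_setsockopt(skops, SOL_SOCKET, SO_KEEPALIVE,
                                   &one, sizeof(one));
            return 1;
    }
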
Andrii Nakryiko
18bfe12dc1 libbpf: Prevent loading vmlinux BTF twice
Prevent loading/parsing vmlinux BTF twice in some cases: for CO-RE relocations
and for BTF-aware hooks (tp_btf, fentry/fexit, etc).

Fixes: a6ed02cac690 ("libbpf: Load btf_vmlinux only once per object.")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200624043805.1794620-1-andriin@fb.com
2020-06-29 13:46:33 -07:00
Colin Ian King
fef856084a libbpf: Fix spelling mistake "kallasyms" -> "kallsyms"
There is a spelling mistake in a pr_warn message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200623084207.149253-1-colin.king@canonical.com
2020-06-29 13:46:33 -07:00
Andrii Nakryiko
6f8e021c3c bpf: Switch most helper return values from 32-bit int to 64-bit long
Switch most of BPF helper definitions from returning int to long. These
definitions are coming from comments in BPF UAPI header and are used to
generate bpf_helper_defs.h (under libbpf) to be later included and used from
BPF programs.

In actual in-kernel implementation, all the helpers are defined as returning
u64, but due to some historical reasons, most of them are actually defined as
returning int in UAPI (usually, to return 0 on success, and negative value on
error).

This actually causes Clang to quite often generate sub-optimal code, because
compiler believes that return value is 32-bit, and in a lot of cases has to be
up-converted (usually with a pair of 32-bit bit shifts) to 64-bit values,
before they can be used further in BPF code.

Besides just "polluting" the code, these 32-bit shifts quite often cause
problems for cases in which return value matters. This is especially the case
for the family of bpf_probe_read_str() functions. There are few other similar
helpers (e.g., bpf_read_branch_records()), in which return value is used by
BPF program logic to record variable-length data and process it. For such
cases, BPF program logic carefully manages offsets within some array or map to
read variable-length data. For such uses, it's crucial for BPF verifier to
track possible range of register values to prove that all the accesses happen
within given memory bounds. Those extraneous zero-extending bit shifts,
inserted by Clang (and quite often interleaved with other code, which makes
the issues even more challenging and sometimes requires employing extra
per-variable compiler barriers), throws off verifier logic and makes it mark
registers as having unknown variable offset. We'll study this pattern a bit
later below.

Another common pattern is to check return of BPF helper for non-zero state to
detect error conditions and attempt alternative actions in such case. Even in
this simple and straightforward case, this 32-bit vs BPF's native 64-bit mode
quite often leads to sub-optimal and unnecessary extra code. We'll look at
this pattern as well.

Clang's BPF target supports two modes of code generation: ALU32, in which it
is capable of using lower 32-bit parts of registers, and no-ALU32, in which
only full 64-bit registers are being used. ALU32 mode somewhat mitigates the
above described problems, but not in all cases.

This patch switches all the cases in which BPF helpers return 0 or negative
error from returning int to returning long. It is shown below that such change
in definition leads to equivalent or better code. No-ALU32 mode benefits more,
but ALU32 mode doesn't degrade or still gets improved code generation.

Another class of cases switched from int to long are bpf_probe_read_str()-like
helpers, which encode successful case as non-negative values, while still
returning negative value for errors.

In all of such cases, correctness is preserved due to two's complement
encoding of negative values and the fact that all helpers return values with
32-bit absolute value. Two's complement ensures that for negative values
higher 32 bits are all ones and when truncated, leave valid negative 32-bit
value with the same value. Non-negative values have upper 32 bits set to zero
and similarly preserve value when high 32 bits are truncated. This means that
just casting to int/u32 is correct and efficient (and in ALU32 mode doesn't
require any extra shifts).

To minimize the chances of regressions, two code patterns were investigated,
as mentioned above. For both patterns, BPF assembly was analyzed in
ALU32/NO-ALU32 compiler modes, both with current 32-bit int return type and
new 64-bit long return type.

Case 1. Variable-length data reading and concatenation. This is quite
ubiquitous pattern in tracing/monitoring applications, reading data like
process's environment variables, file path, etc. In such case, many pieces of
string-like variable-length data are read into a single big buffer, and at the
end of the process, only a part of array containing actual data is sent to
user-space for further processing. This case is tested in test_varlen.c
selftest (in the next patch). Code flow is roughly as follows:

  void *payload = &sample->payload;
  u64 len;

  len = bpf_probe_read_kernel_str(payload, MAX_SZ1, &source_data1);
  if (len <= MAX_SZ1) {
      payload += len;
      sample->len1 = len;
  }
  len = bpf_probe_read_kernel_str(payload, MAX_SZ2, &source_data2);
  if (len <= MAX_SZ2) {
      payload += len;
      sample->len2 = len;
  }
  /* and so on */
  sample->total_len = payload - &sample->payload;
  /* send over, e.g., perf buffer */

There could be two variations with slightly different code generated: when len
is 64-bit integer and when it is 32-bit integer. Both variations were analysed.
BPF assembly instructions between two successive invocations of
bpf_probe_read_kernel_str() were used to check code regressions. Results are
below, followed by short analysis. Left side is using helpers with int return
type, the right one is after the switch to long.

ALU32 + INT                                ALU32 + LONG
===========                                ============

64-BIT (13 insns):                         64-BIT (10 insns):
------------------------------------       ------------------------------------
  17:   call 115                             17:   call 115
  18:   if w0 > 256 goto +9 <LBB0_4>         18:   if r0 > 256 goto +6 <LBB0_4>
  19:   w1 = w0                              19:   r1 = 0 ll
  20:   r1 <<= 32                            21:   *(u64 *)(r1 + 0) = r0
  21:   r1 s>>= 32                           22:   r6 = 0 ll
  22:   r2 = 0 ll                            24:   r6 += r0
  24:   *(u64 *)(r2 + 0) = r1              00000000000000c8 <LBB0_4>:
  25:   r6 = 0 ll                            25:   r1 = r6
  27:   r6 += r1                             26:   w2 = 256
00000000000000e0 <LBB0_4>:                   27:   r3 = 0 ll
  28:   r1 = r6                              29:   call 115
  29:   w2 = 256
  30:   r3 = 0 ll
  32:   call 115

32-BIT (11 insns):                         32-BIT (12 insns):
------------------------------------       ------------------------------------
  17:   call 115                             17:   call 115
  18:   if w0 > 256 goto +7 <LBB1_4>         18:   if w0 > 256 goto +8 <LBB1_4>
  19:   r1 = 0 ll                            19:   r1 = 0 ll
  21:   *(u32 *)(r1 + 0) = r0                21:   *(u32 *)(r1 + 0) = r0
  22:   w1 = w0                              22:   r0 <<= 32
  23:   r6 = 0 ll                            23:   r0 >>= 32
  25:   r6 += r1                             24:   r6 = 0 ll
00000000000000d0 <LBB1_4>:                   26:   r6 += r0
  26:   r1 = r6                            00000000000000d8 <LBB1_4>:
  27:   w2 = 256                             27:   r1 = r6
  28:   r3 = 0 ll                            28:   w2 = 256
  30:   call 115                             29:   r3 = 0 ll
                                             31:   call 115

In ALU32 mode, the variant using 64-bit length variable clearly wins and
avoids unnecessary zero-extension bit shifts. In practice, this is even more
important and good, because BPF code won't need to do extra checks to "prove"
that payload/len are within good bounds.

32-bit len is one instruction longer. Clang decided to do 64-to-32 casting
with two bit shifts, instead of equivalent `w1 = w0` assignment. The former
uses extra register. The latter might potentially lose some range information,
but not for 32-bit value. So in this case, verifier infers that r0 is [0, 256]
after check at 18:, and shifting 32 bits left/right keeps that range intact.
We should probably look into Clang's logic and see why it chooses bitshifts
over sub-register assignments for this.

NO-ALU32 + INT                             NO-ALU32 + LONG
==============                             ===============

64-BIT (14 insns):                         64-BIT (10 insns):
------------------------------------       ------------------------------------
  17:   call 115                             17:   call 115
  18:   r0 <<= 32                            18:   if r0 > 256 goto +6 <LBB0_4>
  19:   r1 = r0                              19:   r1 = 0 ll
  20:   r1 >>= 32                            21:   *(u64 *)(r1 + 0) = r0
  21:   if r1 > 256 goto +7 <LBB0_4>         22:   r6 = 0 ll
  22:   r0 s>>= 32                           24:   r6 += r0
  23:   r1 = 0 ll                          00000000000000c8 <LBB0_4>:
  25:   *(u64 *)(r1 + 0) = r0                25:   r1 = r6
  26:   r6 = 0 ll                            26:   r2 = 256
  28:   r6 += r0                             27:   r3 = 0 ll
00000000000000e8 <LBB0_4>:                   29:   call 115
  29:   r1 = r6
  30:   r2 = 256
  31:   r3 = 0 ll
  33:   call 115

32-BIT (13 insns):                         32-BIT (13 insns):
------------------------------------       ------------------------------------
  17:   call 115                             17:   call 115
  18:   r1 = r0                              18:   r1 = r0
  19:   r1 <<= 32                            19:   r1 <<= 32
  20:   r1 >>= 32                            20:   r1 >>= 32
  21:   if r1 > 256 goto +6 <LBB1_4>         21:   if r1 > 256 goto +6 <LBB1_4>
  22:   r2 = 0 ll                            22:   r2 = 0 ll
  24:   *(u32 *)(r2 + 0) = r0                24:   *(u32 *)(r2 + 0) = r0
  25:   r6 = 0 ll                            25:   r6 = 0 ll
  27:   r6 += r1                             27:   r6 += r1
00000000000000e0 <LBB1_4>:                 00000000000000e0 <LBB1_4>:
  28:   r1 = r6                              28:   r1 = r6
  29:   r2 = 256                             29:   r2 = 256
  30:   r3 = 0 ll                            30:   r3 = 0 ll
  32:   call 115                             32:   call 115

In NO-ALU32 mode, for the case of 64-bit len variable, Clang generates much
superior code, as expected, eliminating unnecessary bit shifts. For 32-bit
len, code is identical.

So overall, only ALU-32 32-bit len case is more-or-less equivalent and the
difference stems from internal Clang decision, rather than compiler lacking
enough information about types.

Case 2. Let's look at the simpler case of checking return result of BPF helper
for errors. The code is very simple:

  long bla;
  if (bpf_probe_read_kernel(&bla, sizeof(bla), 0))
      return 1;
  else
      return 0;

ALU32 + CHECK (9 insns)                    ALU32 + CHECK (9 insns)
====================================       ====================================
  0:    r1 = r10                             0:    r1 = r10
  1:    r1 += -8                             1:    r1 += -8
  2:    w2 = 8                               2:    w2 = 8
  3:    r3 = 0                               3:    r3 = 0
  4:    call 113                             4:    call 113
  5:    w1 = w0                              5:    r1 = r0
  6:    w0 = 1                               6:    w0 = 1
  7:    if w1 != 0 goto +1 <LBB2_2>          7:    if r1 != 0 goto +1 <LBB2_2>
  8:    w0 = 0                               8:    w0 = 0
0000000000000048 <LBB2_2>:                 0000000000000048 <LBB2_2>:
  9:    exit                                 9:    exit

Almost identical code, the only difference is the use of full register
assignment (r1 = r0) vs half-registers (w1 = w0) in instruction #5. On 32-bit
architectures, new BPF assembly might be slightly less optimal, in theory. But
one can argue that's not a big issue, given that use of full registers is
still prevalent (e.g., for parameter passing).

NO-ALU32 + CHECK (11 insns)                NO-ALU32 + CHECK (9 insns)
====================================       ====================================
  0:    r1 = r10                             0:    r1 = r10
  1:    r1 += -8                             1:    r1 += -8
  2:    r2 = 8                               2:    r2 = 8
  3:    r3 = 0                               3:    r3 = 0
  4:    call 113                             4:    call 113
  5:    r1 = r0                              5:    r1 = r0
  6:    r1 <<= 32                            6:    r0 = 1
  7:    r1 >>= 32                            7:    if r1 != 0 goto +1 <LBB2_2>
  8:    r0 = 1                               8:    r0 = 0
  9:    if r1 != 0 goto +1 <LBB2_2>        0000000000000048 <LBB2_2>:
 10:    r0 = 0                               9:    exit
0000000000000058 <LBB2_2>:
 11:    exit

NO-ALU32 is a clear improvement, getting rid of unnecessary zero-extension bit
shifts.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200623032224.4020118-1-andriin@fb.com
2020-06-29 13:46:33 -07:00
Andrii Nakryiko
143213eb82 README: info on routing general BPF/libbpf questions
We keep getting more and more questions about BPF/libbpf usage.
This repo is not the right place to ask them, as not that many people
monitor it. Re-route folks to bpf@vger.kernel.org

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-22 20:52:42 -07:00
Andrii Nakryiko
ac74ee188d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   1bdb6c9a1c43fdf9b83b2331dfc6229bd2e71d9b
Checkpoint bpf-next commit: b3eece09e2e69f528a1ab6104861550dec149083
Baseline bpf commit:        4e15507fea70c0c312d79610efa46b6853ccf8e0
Checkpoint bpf commit:      4e15507fea70c0c312d79610efa46b6853ccf8e0

Andrii Nakryiko (3):
  libbpf: Generalize libbpf externs support
  libbpf: Add support for extracting kernel symbol addresses
  libbpf: Wrap source argument of BPF_CORE_READ macro in parentheses

 src/bpf_core_read.h |   8 +-
 src/bpf_helpers.h   |   1 +
 src/btf.h           |   5 +
 src/libbpf.c        | 482 +++++++++++++++++++++++++++++++-------------
 4 files changed, 350 insertions(+), 146 deletions(-)

--
2.24.1
2020-06-22 20:31:52 -07:00
Andrii Nakryiko
15943906dc libbpf: Wrap source argument of BPF_CORE_READ macro in parentheses
Wrap source argument of BPF_CORE_READ family of macros into parentheses to
allow uses like this:

BPF_CORE_READ((struct cast_struct *)src, a, b, c);

Fixes: 7db3822ab991 ("libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200619231703.738941-8-andriin@fb.com
2020-06-22 20:31:52 -07:00
Andrii Nakryiko
85749135a6 libbpf: Add support for extracting kernel symbol addresses
Add support for another (in addition to existing Kconfig) special kind of
externs in BPF code, kernel symbol externs. Such externs allow BPF code to
"know" kernel symbol address and either use it for comparisons with kernel
data structures (e.g., struct file's f_op pointer, to distinguish different
kinds of file), or, with the help of bpf_probe_read_kernel(), to follow
pointers and read data from global variables. Kernel symbol addresses are
found through /proc/kallsyms, which should be present in the system.

Currently, such kernel symbol variables are typeless: they have to be defined
as `extern const void <symbol>` and the only operation you can do (in C code)
with them is to take their address. Such an extern should reside in a special
section '.ksyms'. The bpf_helpers.h header provides the __ksym macro for this.
Strong vs weak semantics stays the same as with Kconfig externs. If a symbol is
not found in /proc/kallsyms, this is a failure for a strong (non-weak) extern,
but weak externs default to 0.
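
As a rough usage sketch (not part of this patch; the kprobe target and the
symbol name socket_file_ops are picked purely for illustration, and the usual
vmlinux.h, bpf_helpers.h, bpf_tracing.h and bpf_core_read.h includes are
assumed):

  extern const void socket_file_ops __ksym;

  SEC("kprobe/__fput")
  int BPF_KPROBE(handle_fput, struct file *file)
  {
      /* taking the extern's address is the only valid operation; compare it
       * against a pointer read from kernel memory via CO-RE */
      if ((const void *)BPF_CORE_READ(file, f_op) == &socket_file_ops)
          bpf_printk("socket file released");
      return 0;
  }

  char LICENSE[] SEC("license") = "GPL";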

If the same symbol is defined multiple times in /proc/kallsyms, then it will
be an error if any of the associated addresses differ. In that case, the
address is ambiguous, so libbpf errs on the side of caution rather than
confusing the user with a randomly chosen address.

In the future, once kernel is extended with variables BTF information, such
ksym externs will be supported in a typed version, which will allow BPF
program to read variable's contents directly, similarly to how it's done for
fentry/fexit input arguments.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20200619231703.738941-3-andriin@fb.com
2020-06-22 20:31:52 -07:00
Andrii Nakryiko
3b320677cd libbpf: Generalize libbpf externs support
Switch existing Kconfig externs to be just one of few possible kinds of more
generic externs. This refactoring is in preparation for ksymbol extern
support, added in the follow up patch. There are no functional changes
intended.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20200619231703.738941-2-andriin@fb.com
2020-06-22 20:31:52 -07:00
Andrii Nakryiko
15fee53503 vmtests: blacklist test using RINGBUF
The test was updated to use BPF_MAP_TYPE_RINGBUF, which is only available
starting from kernel version 5.8.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-22 17:11:04 -07:00
Andrii Nakryiko
169d35c746 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   69119673bd50b176ded34032fadd41530fb5af21
Checkpoint bpf-next commit: 1bdb6c9a1c43fdf9b83b2331dfc6229bd2e71d9b
Baseline bpf commit:        4e15507fea70c0c312d79610efa46b6853ccf8e0
Checkpoint bpf commit:      4e15507fea70c0c312d79610efa46b6853ccf8e0

Andrii Nakryiko (2):
  libbpf: Bump version to 0.1.0
  libbpf: Add a bunch of attribute getters/setters for map definitions

 src/libbpf.c   | 100 +++++++++++++++++++++++++++++++++++++++++++++----
 src/libbpf.h   |  30 +++++++++++++--
 src/libbpf.map |  17 +++++++++
 3 files changed, 137 insertions(+), 10 deletions(-)

--
2.24.1
2020-06-22 17:11:04 -07:00
Andrii Nakryiko
d8d4713476 libbpf: Add a bunch of attribute getters/setters for map definitions
Add a bunch of getters for various aspects of a BPF map. Some of these
attributes (e.g., key_size, value_size, type, etc.) are available right now in
struct bpf_map_def, but this patch adds getters allowing them to be fetched
individually. The bpf_map_def approach isn't very scalable once ABI stability
requirements are taken into account. It's much easier to extend libbpf and add
support for new features when each aspect of a BPF map has a separate
getter/setter.

Getters follow the common naming convention of not explicitly having "get" in
their name: bpf_map__type() returns the map type, bpf_map__key_size() returns
the key_size. Setters, though, explicitly have "set" in their name:
bpf_map__set_type(), bpf_map__set_key_size().

This patch ensures we now have a getter and a setter for the following
map attributes:
  - type;
  - max_entries;
  - map_flags;
  - numa_node;
  - key_size;
  - value_size;
  - ifindex.

bpf_map__resize() enforces unnecessary restriction of max_entries > 0. It is
unnecessary, because libbpf actually supports zero max_entries for some cases
(e.g., for PERF_EVENT_ARRAY map) and treats it specially during map creation
time. To allow setting max_entries=0, new bpf_map__set_max_entries() setter is
added. bpf_map__resize()'s behavior is preserved for backwards compatibility
reasons.

A map ifindex getter is added as well. There is a setter already, but no
corresponding getter. Fix this asymmetry as well. bpf_map__set_ifindex()
itself is converted from a void function into an error-returning one, similar
to other setters. The only error returned right now is -EBUSY, if the BPF map
is already loaded and has a corresponding FD.

One attribute that lacked any way to get/set it, or even to specify it
declaratively, is numa_node. This patch fixes this gap by adding both
a programmatic getter/setter and support for the numa_node field in
BTF-defined maps.
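
A rough sketch of the intended usage (the function and the map name "events"
below are made up for illustration):

  #include <bpf/libbpf.h>

  static int tune_map(struct bpf_object *obj)
  {
      struct bpf_map *map = bpf_object__find_map_by_name(obj, "events");

      if (!map)
          return -1;

      /* getters: no "get" in the name */
      if (bpf_map__type(map) != BPF_MAP_TYPE_PERF_EVENT_ARRAY)
          return -1;

      /* setters have to be called before bpf_object__load(); they return
       * -EBUSY once the map already has an FD */
      if (bpf_map__set_max_entries(map, 0))  /* 0 is now allowed */
          return -1;
      return bpf_map__set_numa_node(map, 0);
  }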

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200621062112.3006313-1-andriin@fb.com
2020-06-22 17:11:04 -07:00
Andrii Nakryiko
ef26b4c37f libbpf: Bump version to 0.1.0
Bump libbpf version to 0.1.0, as new development cycle starts.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200617183132.1970836-1-andriin@fb.com
2020-06-22 17:11:04 -07:00
Andrii Nakryiko
d7b2934cf9 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   69119673bd50b176ded34032fadd41530fb5af21
Checkpoint bpf-next commit: 4e15507fea70c0c312d79610efa46b6853ccf8e0
Baseline bpf commit:        6903cdae9f9f08d61e49c16cbef11c293e33a615
Checkpoint bpf commit:      4e15507fea70c0c312d79610efa46b6853ccf8e0

Andrii Nakryiko (1):
  libbpf: Forward-declare bpf_stats_type for systems with outdated UAPI
    headers

 src/bpf.h | 2 ++
 1 file changed, 2 insertions(+)

--
2.24.1
2020-06-22 15:43:44 -07:00
Andrii Nakryiko
c83d2166e8 libbpf: Forward-declare bpf_stats_type for systems with outdated UAPI headers
On systems that don't yet have the very latest linux/bpf.h header, enum
bpf_stats_type will be undefined, causing compilation warnings. Prevent this
by forward-declaring the enum.

Fixes: 0bee106716cf ("libbpf: Add support for command BPF_ENABLE_STATS")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200621031159.2279101-1-andriin@fb.com
2020-06-22 15:43:44 -07:00
Andrii Nakryiko
fb27968bf1 vmtests: blacklist 5.5 test and temporary blacklist core_reloc test
Permanently blacklist load_bytes_relative test on 5.5 due to missing
functionality.

Also temporarily blacklist core_reloc test due to failure on latest kernel.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-17 11:48:22 -07:00
Andrii Nakryiko
d6ae406429 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   cb8e59cc87201af93dfbb6c3dccc8fcad72a09c2
Checkpoint bpf-next commit: 69119673bd50b176ded34032fadd41530fb5af21
Baseline bpf commit:        47f6bc4ce1ff70d7ba0924c2f1c218c96cd585fb
Checkpoint bpf commit:      6903cdae9f9f08d61e49c16cbef11c293e33a615

Andrii Nakryiko (2):
  libbpf: Support pre-initializing .bss global variables
  bpf: Fix definition of bpf_ringbuf_output() helper in UAPI comments

 include/uapi/linux/bpf.h | 2 +-
 src/libbpf.c             | 4 ----
 2 files changed, 1 insertion(+), 5 deletions(-)

--
2.24.1
2020-06-17 11:48:22 -07:00
Andrii Nakryiko
cb174c5b8d sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-06-17 11:48:22 -07:00
Andrii Nakryiko
17f747ed38 bpf: Fix definition of bpf_ringbuf_output() helper in UAPI comments
Fix definition of bpf_ringbuf_output() in UAPI header comments, which is used
to generate libbpf's bpf_helper_defs.h header. Return value is a number (error
code), not a pointer.

Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200615214926.3638836-1-andriin@fb.com
2020-06-17 11:48:22 -07:00
Andrii Nakryiko
bf34234885 libbpf: Support pre-initializing .bss global variables
Remove invalid assumption in libbpf that .bss map doesn't have to be updated
in kernel. With addition of skeleton and memory-mapped initialization image,
.bss doesn't have to be all zeroes when BPF map is created, because user-code
might have initialized those variables from user-space.

Fixes: eba9c5f498a1 ("libbpf: Refactor global data map initialization")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200612194504.557844-1-andriin@fb.com
2020-06-17 11:48:22 -07:00
Andrii Nakryiko
46c272f9b4 sync: don't check and warn about non-empty merges anymore
Initial versions of sync script couldn't handle non-empty merges. But since
then, script became smarter, more interactive and thus more powerful and can
handle some complicated situations easily on its own, while falling back to
human intervention for even more complicated situations. This non-empty merge
check has outlived its purpose and is just an annoying bump in sync process.
Drop it.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-10 13:59:07 -07:00
Andrii Nakryiko
40e69c9538 vmtests: un-blacklist ringbuf and cls_redirect selftests
Both tests should be fixed now.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-10 13:58:45 -07:00
Andrii Nakryiko
a975d8ea28 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9bc499befeef07a4d79f4924bfca05634ad8fc97
Checkpoint bpf-next commit: cb8e59cc87201af93dfbb6c3dccc8fcad72a09c2
Baseline bpf commit:        bdc48fa11e46f867ea4d75fa59ee87a7f48be144
Checkpoint bpf commit:      47f6bc4ce1ff70d7ba0924c2f1c218c96cd585fb

Andrii Nakryiko (1):
  libbpf: Handle GCC noreturn-turned-volatile quirk

Arnaldo Carvalho de Melo (1):
  libbpf: Define __WORDSIZE if not available

Jesper Dangaard Brouer (1):
  bpf: Selftests and tools use struct bpf_devmap_val from uapi

 include/uapi/linux/bpf.h | 13 +++++++++++++
 src/btf_dump.c           | 33 ++++++++++++++++++++++++---------
 src/hashmap.h            |  7 +++----
 3 files changed, 40 insertions(+), 13 deletions(-)

--
2.24.1
2020-06-10 13:58:45 -07:00
Andrii Nakryiko
45f7113925 libbpf: Handle GCC noreturn-turned-volatile quirk
Handle a GCC quirk of emitting extra volatile modifier in DWARF (and
subsequently preserved in BTF by pahole) for function pointers marked as
__attribute__((noreturn)). This was the way to mark such functions before GCC
2.5 added noreturn attribute. Drop such func_proto modifiers, similarly to how
it's done for array (also to handle GCC quirk/bug).

Such a volatile attribute is emitted by GCC only, so existing selftests can't
express such a test. A simple repro is like this (compiled with GCC + BTF
generated by pahole):

  struct my_struct {
      void __attribute__((noreturn)) (*fn)(int);
  };
  struct my_struct a;

Without this fix, output will be:

struct my_struct {
    voidvolatile  (*fn)(int);
};

With the fix:

struct my_struct {
    void (*fn)(int);
};

Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Reported-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/bpf/20200610052335.2862559-1-andriin@fb.com
2020-06-10 13:58:45 -07:00
Arnaldo Carvalho de Melo
6816734203 libbpf: Define __WORDSIZE if not available
Some systems, such as Android, don't have a define for __WORDSIZE; define it
in terms of __SIZEOF_LONG__, as done in perf since 2012:

   http://git.kernel.org/torvalds/c/3f34f6c0233ae055b5
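
The fallback boils down to something like this (a sketch, assuming only
__SIZEOF_LONG__ is provided by the compiler):

  #ifndef __WORDSIZE
  #define __WORDSIZE (__SIZEOF_LONG__ * 8)
  #endif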

For reference: https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html

I build tested it here and Andrii did some Travis CI build tests too.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200608161150.GA3073@kernel.org
2020-06-10 13:58:45 -07:00
Jesper Dangaard Brouer
11d2a59689 bpf: Selftests and tools use struct bpf_devmap_val from uapi
Sync tools uapi bpf.h header file and update selftests that use
struct bpf_devmap_val.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/159170951195.2102545.1833108712124273987.stgit@firesoul
2020-06-10 13:58:45 -07:00
Andrii Nakryiko
8c7527ea88 travis-ci: fix travis_terminate invocation
travis_terminate expects integer argument for exit code. Add it.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-10 12:12:01 -07:00
Toke Høiland-Jørgensen
c569e03985 README: Add BTF and Clang information for Arch Linux
Arch recently added BTF to their distribution kernels - see
https://bugs.archlinux.org/task/66260

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
2020-06-08 09:33:59 -07:00
Andrii Nakryiko
1862741fb0 vmtest: disable ringbuf test on latest for now
ringbuf selftest is flaky, disable it for now.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-04 10:48:08 -07:00
Andrii Nakryiko
6a269cf458 README: add OpenSUSE BTF availability info
Add note about OpenSUSE Tumbleweed and BTF.
2020-06-04 10:42:40 -07:00
Andrii Nakryiko
6e15a022db README: add BTF and CO-RE info
Add list of Linux distributions with kernel BTF built-in.
Give few useful links to BPF CO-RE-related material to help users get started.
2020-06-03 11:26:00 -07:00
Andrii Nakryiko
20d9816471 vmtest: temporary blacklist changes to make CI green
Coarse-grained blacklisting until test_progs blacklisting w/ subtests works
better.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-02 18:09:36 -07:00
Andrii Nakryiko
538b3f4ce7 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9a25c1df24a6fea9dc79eec950453c4e00f707fd
Checkpoint bpf-next commit: 9bc499befeef07a4d79f4924bfca05634ad8fc97
Baseline bpf commit:        bdc48fa11e46f867ea4d75fa59ee87a7f48be144
Checkpoint bpf commit:      bdc48fa11e46f867ea4d75fa59ee87a7f48be144

Daniel Borkmann (2):
  bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
  bpf: Add csum_level helper for fixing up csum levels

 include/uapi/linux/bpf.h | 51 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

--
2.24.1
2020-06-02 18:09:36 -07:00
Andrii Nakryiko
f2610ca9cf sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-06-02 18:09:36 -07:00
Daniel Borkmann
adb5dd203c bpf: Add csum_level helper for fixing up csum levels
Add a bpf_csum_level() helper which BPF programs can use in combination
with bpf_skb_adjust_room() when they pass in BPF_F_ADJ_ROOM_NO_CSUM_RESET
flag to the latter to avoid falling back to CHECKSUM_NONE.

bpf_csum_level() allows adjusting the CHECKSUM_UNNECESSARY skb->csum_level via
BPF_CSUM_LEVEL_{INC,DEC}, which calls __skb_{incr,decr}_checksum_unnecessary()
on the skb. The helper also allows BPF_CSUM_LEVEL_RESET, which sets the skb's
csum to CHECKSUM_NONE, as well as BPF_CSUM_LEVEL_QUERY to just return the
current level. Without this helper, there is no way to otherwise adjust the
skb->csum_level. I did not add an extra dummy flags argument, as there is
plenty of free bit space in the level argument itself if ever needed in the
future.
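
A rough sketch of combining this with bpf_skb_adjust_room() (the section name,
encapsulation length and decrement are illustrative only and assume the NIC
also validated the inner checksum; usual vmlinux.h/bpf_helpers.h and
linux/pkt_cls.h includes assumed):

  SEC("tc")
  int decap(struct __sk_buff *skb)
  {
      int encap_len = 28;  /* hypothetical outer IP + UDP + custom header */

      if (bpf_skb_adjust_room(skb, -encap_len, BPF_ADJ_ROOM_MAC,
                              BPF_F_ADJ_ROOM_FIXED_GSO |
                              BPF_F_ADJ_ROOM_NO_CSUM_RESET))
          return TC_ACT_SHOT;

      /* one checksum level was peeled off together with the outer headers */
      bpf_csum_level(skb, BPF_CSUM_LEVEL_DEC);
      return bpf_redirect(skb->ifindex, BPF_F_INGRESS);
  }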

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Lorenz Bauer <lmb@cloudflare.com>
Link: https://lore.kernel.org/bpf/279ae3717cb3d03c0ffeb511493c93c450a01e1a.1591108731.git.daniel@iogearbox.net
2020-06-02 18:09:36 -07:00
Daniel Borkmann
3aadd91e97 bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
Lorenz recently reported:

  In our TC classifier cls_redirect [0], we use the following sequence of
  helper calls to decapsulate a GUE (basically IP + UDP + custom header)
  encapsulated packet:

    bpf_skb_adjust_room(skb, -encap_len, BPF_ADJ_ROOM_MAC, BPF_F_ADJ_ROOM_FIXED_GSO)
    bpf_redirect(skb->ifindex, BPF_F_INGRESS)

  It seems like some checksums of the inner headers are not validated in
  this case. For example, a TCP SYN packet with invalid TCP checksum is
  still accepted by the network stack and elicits a SYN ACK. [...]

  That is, we receive the following packet from the driver:

    | ETH | IP | UDP | GUE | IP | TCP |
    skb->ip_summed == CHECKSUM_UNNECESSARY

  ip_summed is CHECKSUM_UNNECESSARY because our NICs do rx checksum offloading.
  On this packet we run skb_adjust_room_mac(-encap_len), and get the following:

    | ETH | IP | TCP |
    skb->ip_summed == CHECKSUM_UNNECESSARY

  Note that ip_summed is still CHECKSUM_UNNECESSARY. After bpf_redirect()'ing
  into the ingress, we end up in tcp_v4_rcv(). There, skb_checksum_init() is
  turned into a no-op due to CHECKSUM_UNNECESSARY.

The bpf_skb_adjust_room() helper is not aware of protocol specifics. Internally,
it handles the CHECKSUM_COMPLETE case via skb_postpull_rcsum(), but that does
not cover CHECKSUM_UNNECESSARY. In this case skb->csum_level of the original
skb prior to bpf_skb_adjust_room() call was 0, that is, covering UDP. Right now
there is no way to adjust the skb->csum_level. NICs that have checksum offload
disabled (CHECKSUM_NONE) or that support CHECKSUM_COMPLETE are not affected.

Use a safe default for CHECKSUM_UNNECESSARY by resetting to CHECKSUM_NONE and
add a flag to the helper called BPF_F_ADJ_ROOM_NO_CSUM_RESET that allows users
to opt out. Opting out is useful for the case where we don't remove/add full
protocol headers, or for the case where a user wants to adjust the csum level
manually, e.g. through the bpf_csum_level() helper that is added in a
subsequent patch.

The bpf_skb_proto_{4_to_6,6_to_4}() helpers for NAT64/46 translation behind the
BPF bpf_skb_change_proto() helper use the bpf_skb_net_hdr_{push,pop}() pair
internally as well, but they don't change layers, only transition between v4
and v6 and vice versa, therefore no adaptation is required there.

  [0] https://lore.kernel.org/bpf/20200424185556.7358-1-lmb@cloudflare.com/

Fixes: 2be7e212d541 ("bpf: add bpf_skb_adjust_room helper")
Reported-by: Lorenz Bauer <lmb@cloudflare.com>
Reported-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/CACAyw9-uU_52esMd1JjuA80fRPHJv5vsSg8GnfW3t_qDU4aVKQ@mail.gmail.com/
Link: https://lore.kernel.org/bpf/11a90472e7cce83e76ddbfce81fdfce7bfc68808.1591108731.git.daniel@iogearbox.net
2020-06-02 18:09:36 -07:00
Andrii Nakryiko
1206ab0e75 vmtest: optionally adjust selftest files depending on kernel version
Some selftests can't be compiled on older kernels. This allows fixing these
problems, if necessary.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
70eac9941d Makefile: add ringbuf.o to the list of object files
Add newly added ringbuf.o to the list of OBJS.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
2fdbf42f98 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   dda18a5c0b75461d1ed228f80b59c67434b8d601
Checkpoint bpf-next commit: 9a25c1df24a6fea9dc79eec950453c4e00f707fd
Baseline bpf commit:        f85c1598ddfe83f61d0656bd1d2025fa3b148b99
Checkpoint bpf commit:      bdc48fa11e46f867ea4d75fa59ee87a7f48be144

Alexei Starovoitov (1):
  tools/bpf: sync bpf.h

Andrii Nakryiko (3):
  bpf: Implement BPF ring buffer and verifier support for it
  libbpf: Add BPF ring buffer support
  libbpf: Add _GNU_SOURCE for reallocarray to ringbuf.c

David Ahern (3):
  bpf: Add support to attach bpf program to a devmap entry
  xdp: Add xdp_txq_info to xdp_buff
  libbpf: Add SEC name for xdp programs attached to device map

Eelco Chaudron (2):
  libbpf: Add API to consume the perf ring buffer content
  libbpf: Fix perf_buffer__free() API for sparse allocs

Jakub Sitnicki (2):
  bpf: Add link-based BPF program attachment to network namespace
  libbpf: Add support for bpf_link-based netns attachment

John Fastabend (1):
  bpf, sk_msg: Add get socket storage helpers

 include/uapi/linux/bpf.h |  95 ++++++++++++-
 src/libbpf.c             |  49 ++++++-
 src/libbpf.h             |  24 ++++
 src/libbpf.map           |   7 +
 src/libbpf_probes.c      |   5 +
 src/ringbuf.c            | 288 +++++++++++++++++++++++++++++++++++++++
 6 files changed, 461 insertions(+), 7 deletions(-)
 create mode 100644 src/ringbuf.c

--
2.24.1
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
365e4805a1 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-06-01 22:22:32 -07:00
Jakub Sitnicki
890f25520a libbpf: Add support for bpf_link-based netns attachment
Add bpf_program__attach_netns(), which uses the LINK_CREATE subcommand to
create an FD-based kernel bpf_link, for attach types tied to a network
namespace, that is BPF_FLOW_DISSECTOR for the moment.
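
A rough usage sketch (the helper function is made up; prog is assumed to be a
loaded flow dissector program):

  #include <fcntl.h>
  #include <unistd.h>
  #include <bpf/libbpf.h>

  static struct bpf_link *attach_flow_dissector(struct bpf_program *prog)
  {
      int netns_fd = open("/proc/self/ns/net", O_RDONLY);
      struct bpf_link *link;

      if (netns_fd < 0)
          return NULL;
      link = bpf_program__attach_netns(prog, netns_fd);
      close(netns_fd);  /* fd is only needed at attach time */
      return libbpf_get_error(link) ? NULL : link;
  }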

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200531082846.2117903-7-jakub@cloudflare.com
2020-06-01 22:22:32 -07:00
Jakub Sitnicki
fbdee96fa1 bpf: Add link-based BPF program attachment to network namespace
Extend bpf() syscall subcommands that operate on bpf_link, that is
LINK_CREATE, LINK_UPDATE, OBJ_GET_INFO, to accept attach types tied to
network namespaces (only flow dissector at the moment).

Link-based and prog-based attachment can be used interchangeably, but only
one can exist at a time. Attempts to attach a link when a prog is already
attached directly, and the other way around, will be met with -EEXIST.
Attempts to detach a program when link exists result in -EINVAL.

Attachment of multiple links of same attach type to one netns is not
supported with the intention to lift the restriction when a use-case
presents itself. Because of that link create returns -E2BIG when trying to
create another netns link, when one already exists.

Link-based attachments to netns don't keep a netns alive by holding a ref
to it. Instead links get auto-detached from netns when the latter is being
destroyed, using a pernet pre_exit callback.

When auto-detached, link lives in defunct state as long there are open FDs
for it. -ENOLINK is returned if a user tries to update a defunct link.

Because bpf_link to netns doesn't hold a ref to struct net, special care is
taken when releasing, updating, or filling link info. The netns might be
getting torn down when any of these link operations are in progress. That
is why auto-detach and update/release/fill_info are synchronized by the
same mutex. Also, link ops have to always check if auto-detach has not
happened yet and if netns is still alive (refcnt > 0).

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200531082846.2117903-5-jakub@cloudflare.com
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
f54c56be0d libbpf: Add _GNU_SOURCE for reallocarray to ringbuf.c
On systems with recent enough glibc, reallocarray compat won't kick in, so
reallocarray() itself has to come from stdlib.h include. But _GNU_SOURCE is
necessary to enable it. So add it.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200601202601.2139477-1-andriin@fb.com
2020-06-01 22:22:32 -07:00
Alexei Starovoitov
8dc4b38871 tools/bpf: sync bpf.h
Sync bpf.h into tool/include/uapi/

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
David Ahern
ed023acd35 libbpf: Add SEC name for xdp programs attached to device map
Support SEC("xdp_devmap*") as a short cut for loading the program with
type BPF_PROG_TYPE_XDP and expected attach type BPF_XDP_DEVMAP.
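
For illustration, a minimal program using the new section prefix might look
like this (the program body is made up; vmlinux.h/bpf_helpers.h assumed):

  SEC("xdp_devmap")
  int xdp_devmap_prog(struct xdp_md *ctx)
  {
      /* runs after XDP_REDIRECT into the devmap entry, before transmit */
      return XDP_PASS;
  }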

Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200529220716.75383-5-dsahern@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
David Ahern
ff3116bfcb xdp: Add xdp_txq_info to xdp_buff
Add xdp_txq_info as the Tx counterpart to xdp_rxq_info. At the
moment only the device is added. Other fields (queue_index)
can be added as use cases arise.

From a UAPI perspective, add egress_ifindex to xdp context for
bpf programs to see the Tx device.

Update the verifier to only allow accesses to egress_ifindex by
XDP programs with BPF_XDP_DEVMAP expected attach type.

Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200529220716.75383-4-dsahern@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
David Ahern
65f4b3ba4c bpf: Add support to attach bpf program to a devmap entry
Add BPF_XDP_DEVMAP attach type for use with programs associated with a
DEVMAP entry.

Allow DEVMAPs to associate a program with a device entry by adding
a bpf_prog.fd to 'struct bpf_devmap_val'. Values read show the program
id, so the fd and id are a union. bpf programs can get access to the
struct via vmlinux.h.

The program associated with the fd must have type XDP with expected
attach type BPF_XDP_DEVMAP. When a program is associated with a device
index, the program is run on an XDP_REDIRECT and before the buffer is
added to the per-cpu queue. At this point rxq data is still valid; the
next patch adds tx device information allowing the program to see both
ingress and egress device indices.

XDP generic is skb based and XDP programs do not work with skb's. Block
the use case by walking maps used by a program that is to be attached
via xdpgeneric and fail if any of them are DEVMAP / DEVMAP_HASH.

Block attach of BPF_XDP_DEVMAP programs to devices.

Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200529220716.75383-3-dsahern@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
e1bf7a787e libbpf: Add BPF ring buffer support
Declaring and instantiating a BPF ring buffer doesn't require any changes to
libbpf, as it's just another type of map. So using the existing BTF-defined
map syntax with __uint(type, BPF_MAP_TYPE_RINGBUF) and __uint(max_entries,
<size-of-ring-buf>) is all that's necessary to create and use a BPF ring buffer.

This patch adds BPF ring buffer consumer to libbpf. It is very similar to
perf_buffer implementation in terms of API, but also attempts to fix some
minor problems and inconveniences with existing perf_buffer API.

ring_buffer supports both the single ring buffer use case (just using
ring_buffer__new()), and also allows adding more ring buffers, each with its
own callback and context. This allows efficiently polling and consuming
multiple, potentially completely independent, ring buffers using a single
epoll instance.

The latter is actually a problem in practice for applications
that are using multiple sets of perf buffers. They have to create multiple
instances for struct perf_buffer and poll them independently or in a loop,
each approach having its own problems (e.g., inability to use a common poll
timeout). struct ring_buffer eliminates this problem by aggregating many
independent ring buffer instances under the single "ring buffer manager".

Second, perf_buffer's callback can't return an error, so applications that
need to stop polling due to an error in data, or data signalling the end, have
to use extra mechanisms to signal that polling has to stop. ring_buffer's
callback can return an error, which will be passed back to user code and can
be acted upon appropriately.

Two APIs allow consuming ring buffer data:
  - ring_buffer__poll(), which will wait for a data availability notification
    and will consume data only from the reported ring buffer(s); this API
    allows using resources efficiently by reading data only when it becomes
    available;
  - ring_buffer__consume(), which will attempt to read new records regardless
    of the data availability notification sub-system. This API is useful for
    cases when the lowest latency is required, at the expense of burning CPU
    resources.
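
A rough end-to-end sketch (the map name "events", the event struct and the
handler are made up; the BPF side assumes vmlinux.h and bpf_helpers.h):

  /* BPF side */
  struct {
      __uint(type, BPF_MAP_TYPE_RINGBUF);
      __uint(max_entries, 256 * 1024);  /* size in bytes, power of 2 */
  } events SEC(".maps");

  /* user-space side */
  static int handle_event(void *ctx, void *data, size_t size)
  {
      return 0;  /* non-zero stops polling and is returned to the caller */
  }

  static int consume(struct bpf_object *obj)
  {
      int map_fd = bpf_object__find_map_fd_by_name(obj, "events");
      struct ring_buffer *rb = ring_buffer__new(map_fd, handle_event,
                                                NULL, NULL);

      if (libbpf_get_error(rb))
          return -1;
      while (ring_buffer__poll(rb, 100 /* ms */) >= 0)
          ;  /* poll until error or interruption */
      ring_buffer__free(rb);
      return 0;
  }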

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200529075424.3139988-3-andriin@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
17a6d61898 bpf: Implement BPF ring buffer and verifier support for it
This commit adds a new MPSC ring buffer implementation into BPF ecosystem,
which allows multiple CPUs to submit data to a single shared ring buffer. On
the consumption side, only single consumer is assumed.

Motivation
----------
There are two distinctive motivators for this work, which are not satisfied by
existing perf buffer, which prompted creation of a new ring buffer
implementation.
  - more efficient memory utilization by sharing ring buffer across CPUs;
  - preserving ordering of events that happen sequentially in time, even
  across multiple CPUs (e.g., fork/exec/exit events for a task).

These two problems are independent, but perf buffer fails to satisfy both.
Both are a result of a choice to have per-CPU perf ring buffer.  Both can be
also solved by having an MPSC implementation of ring buffer. The ordering
problem could technically be solved for perf buffer with some in-kernel
counting, but given the first one requires an MPSC buffer, the same solution
would solve the second problem automatically.

Semantics and APIs
------------------
A single ring buffer is presented to BPF programs as an instance of a BPF map
of type BPF_MAP_TYPE_RINGBUF. Two other alternatives were considered, but
ultimately rejected.

One way would be to make BPF_MAP_TYPE_RINGBUF represent an array of ring
buffers, similar to BPF_MAP_TYPE_PERF_EVENT_ARRAY, but without enforcing the
"same CPU only" rule. This would be a more familiar interface, compatible with
existing perf buffer use in BPF, but it would fail if an application needed
more advanced logic to look up a ring buffer by an arbitrary key. With the
current approach, HASH_OF_MAPS addresses this. Additionally, given the
performance of BPF ringbuf, many use cases would just opt into a simple single
ring buffer shared among all CPUs, for which the current approach would be
overkill.

Another approach could introduce a new concept, alongside BPF map, to
represent generic "container" object, which doesn't necessarily have key/value
interface with lookup/update/delete operations. This approach would add a lot
of extra infrastructure that has to be built for observability and verifier
support. It would also add another concept that BPF developers would have to
familiarize themselves with, new syntax in libbpf, etc., but would really
provide no additional benefits over the approach of using a map.
BPF_MAP_TYPE_RINGBUF doesn't support lookup/update/delete operations, but
neither do a few other map types (e.g., queue and stack; the array map doesn't
support delete, etc.).

The approach chosen has an advantage of re-using existing BPF map
infrastructure (introspection APIs in kernel, libbpf support, etc), being
familiar concept (no need to teach users a new type of object in BPF program),
and utilizing existing tooling (bpftool). For common scenario of using
a single ring buffer for all CPUs, it's as simple and straightforward, as
would be with a dedicated "container" object. On the other hand, by being
a map, it can be combined with ARRAY_OF_MAPS and HASH_OF_MAPS map-in-maps to
implement a wide variety of topologies, from one ring buffer for each CPU
(e.g., as a replacement for perf buffer use cases), to a complicated
application hashing/sharding of ring buffers (e.g., having a small pool of
ring buffers with hashed task's tgid being a look up key to preserve order,
but reduce contention).

Key and value sizes are enforced to be zero. max_entries is used to specify
the size of ring buffer and has to be a power of 2 value.

There are a bunch of similarities between perf buffer
(BPF_MAP_TYPE_PERF_EVENT_ARRAY) and new BPF ring buffer semantics:
  - variable-length records;
  - if there is no more space left in ring buffer, reservation fails, no
    blocking;
  - memory-mappable data area for user-space applications for ease of
    consumption and high performance;
  - epoll notifications for new incoming data;
  - but still the ability to do busy polling for new data to achieve the
    lowest latency, if necessary.

BPF ringbuf provides two sets of APIs to BPF programs:
  - bpf_ringbuf_output() allows to *copy* data from one place to a ring
    buffer, similarly to bpf_perf_event_output();
  - bpf_ringbuf_reserve()/bpf_ringbuf_commit()/bpf_ringbuf_discard() APIs
    split the whole process into two steps. First, a fixed amount of space is
    reserved. If successful, a pointer to a data inside ring buffer data area
    is returned, which BPF programs can use similarly to a data inside
    array/hash maps. Once ready, this piece of memory is either committed or
    discarded. Discard is similar to commit, but makes consumer ignore the
    record.

bpf_ringbuf_output() has disadvantage of incurring extra memory copy, because
record has to be prepared in some other place first. But it allows to submit
records of the length that's not known to verifier beforehand. It also closely
matches bpf_perf_event_output(), so will simplify migration significantly.

bpf_ringbuf_reserve() avoids the extra copy of memory by providing a memory
pointer directly to ring buffer memory. In a lot of cases records are larger
than BPF stack space allows, so many programs have to use an extra per-CPU
array as a temporary heap for preparing a sample. bpf_ringbuf_reserve() avoids
this need completely. But in exchange, it only allows a known constant size of memory to
be reserved, such that verifier can verify that BPF program can't access
memory outside its reserved record space. bpf_ringbuf_output(), while slightly
slower due to extra memory copy, covers some use cases that are not suitable
for bpf_ringbuf_reserve().
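
As a rough BPF-side sketch of the reserve/submit flow (the map, event struct
and tracepoint are made up; the "commit" step corresponds to the
bpf_ringbuf_submit() helper; vmlinux.h and bpf_helpers.h assumed):

  struct event { int pid; };

  struct {
      __uint(type, BPF_MAP_TYPE_RINGBUF);
      __uint(max_entries, 4096);  /* power of 2 */
  } rb SEC(".maps");

  SEC("tracepoint/sched/sched_process_exit")
  int handle_exit(void *ctx)
  {
      struct event *e = bpf_ringbuf_reserve(&rb, sizeof(*e), 0);

      if (!e)
          return 0;  /* ring buffer full (or spinlock contention in NMI) */
      e->pid = bpf_get_current_pid_tgid() >> 32;
      bpf_ringbuf_submit(e, 0);  /* or bpf_ringbuf_discard(e, 0) */
      return 0;
  }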

The difference between commit and discard is very small. Discard just marks
a record as discarded, and such records are supposed to be ignored by consumer
code. Discard is useful for some advanced use-cases, such as ensuring
all-or-nothing multi-record submission, or emulating temporary malloc()/free()
within single BPF program invocation.

Each reserved record is tracked by verifier through existing
reference-tracking logic, similar to socket ref-tracking. It is thus
impossible to reserve a record, but forget to submit (or discard) it.

bpf_ringbuf_query() helper allows to query various properties of ring buffer.
Currently 4 are supported:
  - BPF_RB_AVAIL_DATA returns amount of unconsumed data in ring buffer;
  - BPF_RB_RING_SIZE returns the size of ring buffer;
  - BPF_RB_CONS_POS/BPF_RB_PROD_POS return the current logical position of
    consumer/producer, respectively.
Returned values are momentarily snapshots of ring buffer state and could be
off by the time helper returns, so this should be used only for
debugging/reporting reasons or for implementing various heuristics, that take
into account highly-changeable nature of some of those characteristics.

One such heuristic might involve more fine-grained control over poll/epoll
notifications about new data availability in ring buffer. Together with
BPF_RB_NO_WAKEUP/BPF_RB_FORCE_WAKEUP flags for output/commit/discard helpers,
it allows the BPF program a high degree of control and, e.g., more efficient
batched notifications. The default self-balancing strategy, though, should be
adequate for most applications and will work reliably and efficiently already.

Design and implementation
-------------------------
This reserve/commit schema allows a natural way for multiple producers, either
on different CPUs or even on the same CPU/in the same BPF program, to reserve
independent records and work with them without blocking other producers. This
means that if a BPF program was interrupted by another BPF program sharing the
same ring buffer, they will both get a record reserved (provided there is
enough space left) and can work with it and submit it independently. This
applies to NMI context as well, except that due to using a spinlock during
reservation, in NMI context, bpf_ringbuf_reserve() might fail to get a lock,
in which case reservation will fail even if ring buffer is not full.

The ring buffer itself internally is implemented as a power-of-2 sized
circular buffer, with two logical and ever-increasing counters (which might
wrap around on 32-bit architectures, that's not a problem):
  - consumer counter shows up to which logical position consumer consumed the
    data;
  - producer counter denotes amount of data reserved by all producers.

Each time a record is reserved, producer that "owns" the record will
successfully advance producer counter. At that point, data is still not yet
ready to be consumed, though. Each record has 8 byte header, which contains
the length of reserved record, as well as two extra bits: busy bit to denote
that record is still being worked on, and discard bit, which might be set at
commit time if record is discarded. In the latter case, consumer is supposed
to skip the record and move on to the next one. Record header also encodes
record's relative offset from the beginning of ring buffer data area (in
pages). This allows bpf_ringbuf_commit()/bpf_ringbuf_discard() to accept only
the pointer to the record itself, without requiring also the pointer to ring
buffer itself. Ring buffer memory location will be restored from record
metadata header. This significantly simplifies verifier, as well as improving
API usability.

Producer counter increments are serialized under spinlock, so there is
a strict ordering between reservations. Commits, on the other hand, are
completely lockless and independent. All records become available to the
consumer in the order of reservations, but only after all previous records
have already been committed. It is thus possible for a slow producer to
temporarily hold back records that were reserved (and even committed) after
its own reservation.

Reservation/commit/consumer protocol is verified by litmus tests in
Documentation/litmus-test/bpf-rb.

One interesting implementation bit, that significantly simplifies (and thus
speeds up as well) implementation of both producers and consumers is how data
area is mapped twice contiguously back-to-back in the virtual memory. This
allows to not take any special measures for samples that have to wrap around
at the end of the circular buffer data area, because the next page after the
last data page would be first data page again, and thus the sample will still
appear completely contiguous in virtual memory. See comment and a simple ASCII
diagram showing this visually in bpf_ringbuf_area_alloc().

Another feature that distinguishes BPF ringbuf from the perf ring buffer is
its self-pacing notification of new data availability.
bpf_ringbuf_commit() implementation will send a notification of new record
being available after commit only if consumer has already caught up right up
to the record being committed. If not, consumer still has to catch up and thus
will see new data anyways without needing an extra poll notification.
Benchmarks (see tools/testing/selftests/bpf/benchs/bench_ringbuf.c) show that
this allows to achieve a very high throughput without having to resort to
tricks like "notify only every Nth sample", which are necessary with perf
buffer. For extreme cases, when BPF program wants more manual control of
notifications, commit/discard/output helpers accept BPF_RB_NO_WAKEUP and
BPF_RB_FORCE_WAKEUP flags, which give full control over notifications of data
availability, but require extra caution and diligence in using this API.

Comparison to alternatives
--------------------------
Before considering implementing the BPF ring buffer from scratch, existing
alternatives in the kernel were evaluated, but they didn't seem to meet the
needs. They largely fell into a few categories:
  - per-CPU buffers (perf, ftrace, etc), which don't satisfy two motivations
    outlined above (ordering and memory consumption);
  - linked list-based implementations; while some were multi-producer designs,
    consuming these from user-space would be very complicated and most
    probably not performant; memory-mapping contiguous piece of memory is
    simpler and more performant for user-space consumers;
  - io_uring is SPSC, but also requires fixed-sized elements. Naively turning
    SPSC queue into MPSC w/ lock would have subpar performance compared to
    locked reserve + lockless commit, as with BPF ring buffer. Fixed sized
    elements would be too limiting for BPF programs, given existing BPF
    programs heavily rely on variable-sized perf buffer already;
  - specialized implementations (like a new printk ring buffer, [0]) with lots
    of printk-specific limitations and implications, that didn't seem to fit
    well for intended use with BPF programs.

  [0] https://lwn.net/Articles/779550/

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200529075424.3139988-2-andriin@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
Eelco Chaudron
ff2322b879 libbpf: Fix perf_buffer__free() API for sparse allocs
In case the cpu_bufs are sparsely allocated, they are not all
freed. These changes fix this.

Fixes: fb84b8224655 ("libbpf: add perf buffer API")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159056888305.330763.9684536967379110349.stgit@ebuild
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
John Fastabend
ab1b4f3844 bpf, sk_msg: Add get socket storage helpers
Add helpers to use local socket storage.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/159033907577.12355.14740125020572756560.stgit@john-Precision-5820-Tower
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
Eelco Chaudron
df9a526f99 libbpf: Add API to consume the perf ring buffer content
This new API, perf_buffer__consume, can be used as follows:

- When you have a perf ring where wakeup_events is higher than 1,
  and you have remaining data in the rings you would like to pull
  out on exit (or maybe based on a timeout).

- For low latency cases where you burn a CPU that constantly polls
  the queues.
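
A rough usage sketch for the exit-drain case (the function and the exiting
flag are made up; pb is an already-created struct perf_buffer *):

  #include <stdbool.h>
  #include <bpf/libbpf.h>

  static volatile bool exiting;

  static void poll_then_drain(struct perf_buffer *pb)
  {
      while (!exiting) {
          int err = perf_buffer__poll(pb, 100 /* ms */);

          if (err < 0 && err != -EINTR)
              break;
      }
      /* pull out whatever is still sitting in the rings, regardless of
       * wakeup_events */
      perf_buffer__consume(pb);
  }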

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159048487929.89441.7465713173442594608.stgit@ebuild
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
3b23942542 ci: blacklist bpf_iter tests
Disable a bunch of new kernel selftests that can't succeed on 5.5 kernel.
Flatten Travis tests into a single stage to parallelize and speed them up.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-05-20 01:00:06 -07:00
Andrii Nakryiko
90941cde5f sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   c321022244708aec4675de4f032ef1ba9ff0c640
Checkpoint bpf-next commit: dda18a5c0b75461d1ed228f80b59c67434b8d601
Baseline bpf commit:        7f645462ca01d01abb94d75e6768c8b3ed3a188b
Checkpoint bpf commit:      f85c1598ddfe83f61d0656bd1d2025fa3b148b99

Alexei Starovoitov (1):
  tools/bpf: sync bpf.h

Andrey Ignatov (2):
  bpf: Support narrow loads from bpf_sock_addr.user_port
  bpf: Introduce bpf_sk_{, ancestor_}cgroup_id helpers

Daniel Borkmann (2):
  bpf: Add get{peer, sock}name attach types for sock_addr
  bpf, libbpf: Enable get{peer, sock}name attach types

Eelco Chaudron (1):
  libbpf: Fix probe code to return EPERM if encountered

Gustavo A. R. Silva (1):
  bpf, libbpf: Replace zero-length array with flexible-array

Horatiu Vultur (1):
  net: bridge: Add port attribute IFLA_BRPORT_MRP_RING_OPEN

Ian Rogers (2):
  libbpf, hashmap: Remove unused #include
  libbpf, hashmap: Fix signedness warnings

Quentin Monnet (1):
  tools, bpf: Synchronise BPF UAPI header with tools

Song Liu (2):
  bpf: Sharing bpf runtime stats with BPF_ENABLE_STATS
  libbpf: Add support for command BPF_ENABLE_STATS

Stanislav Fomichev (2):
  bpf: Bpf_{g,s}etsockopt for struct bpf_sock_addr
  bpf: Allow any port in bpf_bind helper

Sumanth Korikkar (1):
  libbpf: Fix register naming in PT_REGS s390 macros

Yonghong Song (7):
  bpf: Allow loading of a bpf_iter program
  bpf: Support bpf tracing/iter programs for BPF_LINK_CREATE
  bpf: Create anonymous bpf iterator
  bpf: Add bpf_seq_printf and bpf_seq_write helpers
  tools/libbpf: Add bpf_iter support
  tools/libpf: Add offsetof/container_of macro in bpf_helpers.h
  bpf: Change btf_iter func proto prefix to "bpf_iter_"

 include/uapi/linux/bpf.h     | 208 +++++++++++++++++++++++++++--------
 include/uapi/linux/if_link.h |   1 +
 src/bpf.c                    |  20 ++++
 src/bpf.h                    |   3 +
 src/bpf_helpers.h            |  14 +++
 src/bpf_tracing.h            |  20 +++-
 src/hashmap.c                |   5 +-
 src/hashmap.h                |   1 -
 src/libbpf.c                 |  98 +++++++++++++++--
 src/libbpf.h                 |   9 ++
 src/libbpf.map               |   3 +
 src/libbpf_internal.h        |   2 +-
 12 files changed, 322 insertions(+), 62 deletions(-)

--
2.24.1
2020-05-20 01:00:06 -07:00
Andrii Nakryiko
97a0d1e7b5 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-05-20 01:00:06 -07:00
Alexei Starovoitov
d650751a9b tools/bpf: sync bpf.h
Sync tools/include/uapi/linux/bpf.h from include/uapi.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-05-20 01:00:06 -07:00
Daniel Borkmann
dcb0c5ac44 bpf, libbpf: Enable get{peer, sock}name attach types
Trivial patch to add the new get{peer,sock}name attach types to the section
definitions in order to hook them up to sock_addr cgroup program type.
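
For illustration, a hooked-up program might look like this (the program body
is made up; per the companion kernel patch, the verifier expects a return
value of 1; usual includes assumed):

  SEC("cgroup/getpeername4")
  int rev_xlate_peer(struct bpf_sock_addr *ctx)
  {
      /* rewrite ctx->user_ip4 / ctx->user_port back to the service tuple */
      return 1;
  }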

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/7fcd4b1e41a8ebb364754a5975c75a7795051bd2.1589841594.git.daniel@iogearbox.net
2020-05-20 01:00:06 -07:00
Daniel Borkmann
2c892f1aa1 bpf: Add get{peer, sock}name attach types for sock_addr
As stated in 983695fa6765 ("bpf: fix unconnected udp hooks"), the objective
for the existing cgroup connect/sendmsg/recvmsg/bind BPF hooks is to be
transparent to applications. In Cilium we make use of these hooks [0] in
order to enable E-W load balancing for existing Kubernetes service types
for all Cilium managed nodes in the cluster. Those backends can be local
or remote. The main advantage of this approach is that it operates as close
as possible to the socket, and therefore allows to avoid packet-based NAT
given in connect/sendmsg/recvmsg hooks we only need to xlate sock addresses.

This also allows to expose NodePort services on loopback addresses in the
host namespace, for example. As another advantage, this also efficiently
blocks bind requests for applications in the host namespace for exposed
ports. However, one missing item is that we also need to perform reverse
xlation for inet{,6}_getname() hooks such that we can return the service
IP/port tuple back to the application instead of the remote peer address.

The vast majority of applications do not bother about getpeername(), but on
a few occasions we've seen breakage when validating the peer's address, since
it unexpectedly returns the backend tuple instead of the service one.
Therefore, this trivial patch allows customising this behaviour and adds
getpeername() as well as getsockname() BPF cgroup hooks for both IPv4 and IPv6
in order to address this situation.

Simple example:

  # ./cilium/cilium service list
  ID   Frontend     Service Type   Backend
  1    1.2.3.4:80   ClusterIP      1 => 10.0.0.10:80

Before; curl's verbose output example, no getpeername() reverse xlation:

  # curl --verbose 1.2.3.4
  * Rebuilt URL to: 1.2.3.4/
  *   Trying 1.2.3.4...
  * TCP_NODELAY set
  * Connected to 1.2.3.4 (10.0.0.10) port 80 (#0)
  > GET / HTTP/1.1
  > Host: 1.2.3.4
  > User-Agent: curl/7.58.0
  > Accept: */*
  [...]

After; with getpeername() reverse xlation:

  # curl --verbose 1.2.3.4
  * Rebuilt URL to: 1.2.3.4/
  *   Trying 1.2.3.4...
  * TCP_NODELAY set
  * Connected to 1.2.3.4 (1.2.3.4) port 80 (#0)
  > GET / HTTP/1.1
  >  Host: 1.2.3.4
  > User-Agent: curl/7.58.0
  > Accept: */*
  [...]

Originally, I had both under a BPF_CGROUP_INET{4,6}_GETNAME type and exposed
peer to the context similar as in inet{,6}_getname() fashion, but API-wise
this is suboptimal as it always enforces programs having to test for ctx->peer
which can easily be missed, hence BPF_CGROUP_INET{4,6}_GET{PEER,SOCK}NAME split.
Similarly, the checked return code is on tnum_range(1, 1), but if a use case
comes up in future, it can easily be changed to return an error code instead.
Helper and ctx member access is the same as with connect/sendmsg/etc hooks.

  [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/61a479d759b2482ae3efb45546490bacd796a220.1589841594.git.daniel@iogearbox.net
2020-05-20 01:00:06 -07:00
Ian Rogers
46407182c7 libbpf, hashmap: Fix signedness warnings
Fixes the following warnings:

  hashmap.c: In function ‘hashmap__clear’:
  hashmap.h:150:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
    150 |  for (bkt = 0; bkt < map->cap; bkt++)        \

  hashmap.c: In function ‘hashmap_grow’:
  hashmap.h:150:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
    150 |  for (bkt = 0; bkt < map->cap; bkt++)        \

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200515165007.217120-4-irogers@google.com
2020-05-20 01:00:06 -07:00
Ian Rogers
a00d463bb9 libbpf, hashmap: Remove unused #include
Remove #include of libbpf_internal.h that is unused.

Discussed in this thread:
https://lore.kernel.org/lkml/CAEf4BzZRmiEds_8R8g4vaAeWvJzPb4xYLnpF0X2VNY8oTzkphQ@mail.gmail.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200515165007.217120-3-irogers@google.com
2020-05-20 01:00:06 -07:00
Sumanth Korikkar
d8fdd1e848 libbpf: Fix register naming in PT_REGS s390 macros
Fix register naming in PT_REGS s390 macros

Fixes: b8ebce86ffe6 ("libbpf: Provide CO-RE variants of PT_REGS macros")
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200513154414.29972-1-sumanthk@linux.ibm.com
2020-05-20 01:00:06 -07:00
Andrey Ignatov
b8482d74a1 bpf: Introduce bpf_sk_{, ancestor_}cgroup_id helpers
With having ability to lookup sockets in cgroup skb programs it becomes
useful to access cgroup id of retrieved sockets so that policies can be
implemented based on origin cgroup of such socket.

For example, a container running in a cgroup can have cgroup skb ingress
program that can lookup peer socket that is sending packets to a process
inside the container and decide whether those packets should be allowed
or denied based on cgroup id of the peer.

More specifically such ingress program can implement intra-host policy
"allow incoming packets only from this same container and not from any
other container on same host" w/o relying on source IP addresses since
quite often it can be the case that containers share same IP address on
the host.

Introduce two new helpers for this use-case: bpf_sk_cgroup_id() and
bpf_sk_ancestor_cgroup_id().

These helpers are similar to the existing bpf_skb_{,ancestor_}cgroup_id
helpers, with the only difference being that sk is used to get the cgroup id
instead of skb; they also share code with the existing helpers.

See documentation in UAPI for more details.
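
A rough sketch of the intra-host policy described above (tuple construction
from the skb headers is elided, and MY_CGROUP_ID is a placeholder for the
container's own cgroup id):

  SEC("cgroup_skb/ingress")
  int allow_same_container(struct __sk_buff *skb)
  {
          struct bpf_sock_tuple tuple = {};        /* fill from skb headers */
          struct bpf_sock *sk;
          __u64 cgid;

          sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple.ipv4),
                                 BPF_F_CURRENT_NETNS, 0);
          if (!sk)
                  return 1;                        /* no local peer socket, allow */

          cgid = bpf_sk_cgroup_id(sk);
          bpf_sk_release(sk);

          return cgid == MY_CGROUP_ID ? 1 : 0;     /* 1 = allow, 0 = drop */
  }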

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/f5884981249ce911f63e9b57ecd5d7d19154ff39.1589486450.git.rdna@fb.com
2020-05-20 01:00:06 -07:00
Andrey Ignatov
3cd9cac8fb bpf: Support narrow loads from bpf_sock_addr.user_port
bpf_sock_addr.user_port supports only 4-byte load and it leads to ugly
code in BPF programs, like:

	volatile __u32 user_port = ctx->user_port;
	__u16 port = bpf_ntohs(user_port);

Since otherwise clang may optimize the load to be 2-byte and it's
rejected by verifier.

Add support for 1- and 2-byte loads same way as it's supported for other
fields in bpf_sock_addr like user_ip4, msg_src_ip4, etc.
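
With 1- and 2-byte loads accepted by the verifier, the snippet above can
presumably collapse to:

  __u16 port = bpf_ntohs(ctx->user_port);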

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/c1e983f4c17573032601d0b2b1f9d1274f24bc16.1589420814.git.rdna@fb.com
2020-05-20 01:00:06 -07:00
Yonghong Song
70e6075d1d bpf: Change btf_iter func proto prefix to "bpf_iter_"
This is to be consistent with tracing and lsm programs
which have prefix "bpf_trace_" and "bpf_lsm_" respectively.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200513180216.2949387-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Eelco Chaudron
d71e9baa8b libbpf: Fix probe code to return EPERM if encountered
When the probe code failed for any reason, ENOTSUP was returned, even
if the failure was due to not having enough locked memory. This patch fixes
this by returning EPERM to the user application, so it can respond and
increase the RLIMIT_MEMLOCK size.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/158927424896.2342.10402475603585742943.stgit@ebuild
2020-05-20 01:00:06 -07:00
Quentin Monnet
b41c6d34a4 tools, bpf: Synchronise BPF UAPI header with tools
Synchronise the bpf.h header under tools, to report the fixes recently
brought to the documentation for the BPF helpers.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200511161536.29853-5-quentin@isovalent.com
2020-05-20 01:00:06 -07:00
Gustavo A. R. Silva
9029d18d9b bpf, libbpf: Replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200507185057.GA13981@embeddedor
2020-05-20 01:00:06 -07:00
Yonghong Song
f81f504e12 tools/libpf: Add offsetof/container_of macro in bpf_helpers.h
These two helpers will be used later in bpf_iter bpf program
bpf_iter_netlink.c. Put them in bpf_helpers.h since they could
be useful in other cases.
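
The definitions presumably follow the usual kernel pattern, roughly:

  #ifndef offsetof
  #define offsetof(TYPE, MEMBER)  ((unsigned long)&((TYPE *)0)->MEMBER)
  #endif

  #ifndef container_of
  #define container_of(ptr, type, member)                         \
          ({                                                      \
                  void *__mptr = (void *)(ptr);                   \
                  ((type *)(__mptr - offsetof(type, member)));    \
          })
  #endif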

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175919.2477104-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Yonghong Song
021e35fba2 tools/libbpf: Add bpf_iter support
Two new libbpf APIs are added to support bpf_iter:
  - bpf_program__attach_iter
    Given a bpf program and additional parameters (none for now),
    returns a bpf_link.
  - bpf_iter_create
    syscall level API to create a bpf iterator.

The macro BPF_SEQ_PRINTF is also introduced. The format
looks like:
  BPF_SEQ_PRINTF(seq, "task id %d\n", pid);

This macro can help bpf program writers with
nicer bpf_seq_printf syntax similar to the kernel one.
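
A hedged usage sketch of the two new APIs (error handling trimmed; prog is
assumed to be an already-loaded iterator program):

  struct bpf_link *link;
  char buf[4096];
  int iter_fd, len;

  link = bpf_program__attach_iter(prog, NULL);   /* no extra parameters yet */
  iter_fd = bpf_iter_create(bpf_link__fd(link));

  while ((len = read(iter_fd, buf, sizeof(buf))) > 0)
          write(STDOUT_FILENO, buf, len);        /* text produced by BPF_SEQ_PRINTF */

  close(iter_fd);
  bpf_link__destroy(link);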

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175917.2476936-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Yonghong Song
7112841ade bpf: Add bpf_seq_printf and bpf_seq_write helpers
Two helpers bpf_seq_printf and bpf_seq_write, are added for
writing data to the seq_file buffer.

bpf_seq_printf supports common format string flag/width/type
fields so at least I can get identical results for
netlink and ipv6_route targets.

For bpf_seq_printf and bpf_seq_write, the return value -EOVERFLOW
specifically indicates a write failure due to overflow, which
means the object will be repeated in the next bpf invocation
if the object collection stays the same. Note that if the object
collection is changed, then depending on how collection traversal is
done, the object may not be visited even if it is still in the
collection.

For bpf_seq_printf, format %s, %p{i,I}{4,6} needs to
read kernel memory. Reading kernel memory may fail in
the following two cases:
  - invalid kernel address, or
  - valid kernel address but requiring a major fault
If reading kernel memory failed, the %s string will be
an empty string and %p{i,I}{4,6} will be all 0.
Not returning error to bpf program is consistent with
what bpf_trace_printk() does for now.

bpf_seq_printf may return -EBUSY meaning that internal percpu
buffer for memory copy of strings or other pointees is
not available. Bpf program can return 1 to indicate it
wants the same object to be repeated. Right now, this should not
happen on no-RT kernels since migrate_disable(), which guards
bpf prog call, calls preempt_disable().
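
A sketch of the retry handling described above, using the raw helper
signature (seq comes from the iterator context and pid is an illustrative
argument; BPF_SEQ_PRINTF hides this fmt/data packing):

  static const char fmt[] = "pid %d\n";
  __u64 data[] = { pid };
  long ret;

  ret = bpf_seq_printf(seq, fmt, sizeof(fmt), data, sizeof(data));
  if (ret == -EBUSY)
          return 1;       /* ask for the same object to be repeated */
  return 0;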

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175914.2476661-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Yonghong Song
940f4df57b bpf: Create anonymous bpf iterator
A new bpf command BPF_ITER_CREATE is added.

The anonymous bpf iterator is seq_file based.
The seq_file private data are referenced by targets.
The bpf_iter infrastructure allocated additional space
at seq_file->private before the space used by targets
to store some meta data, e.g.,
  prog:       prog to run
  session_id: a unique id for each opened seq_file
  seq_num:    how many times bpf programs are queried in this session
  done_stop:  an internal state to decide whether bpf program
              should be called in seq_ops->stop() or not

The seq_num will start from 0 for valid objects.
The bpf program may see the same seq_num more than once if
 - seq_file buffer overflow happens and the same object
   is retried by bpf_seq_read(), or
 - the bpf program explicitly requests a retry of the
   same object

Since modules are not supported for bpf_iter, all target
registration happens at __init time, so there is no
need to change bpf_iter_unreg_target() as it is used
mostly in error path of the init function at which time
no bpf iterators have been created yet.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175905.2475770-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Yonghong Song
46c906b6d1 bpf: Support bpf tracing/iter programs for BPF_LINK_CREATE
Given a bpf program, the step to create an anonymous bpf iterator is:
  - create a bpf_iter_link, which combines bpf program and the target.
    In the future, there could be more information recorded in the link.
    A link_fd will be returned to the user space.
  - create an anonymous bpf iterator with the given link_fd.

The bpf_iter_link can be pinned to bpffs mount file system to
create a file based bpf iterator as well.

The benefits of using bpf_iter_link:
  - using bpf link simplifies design and implementation as bpf link
    is used for other tracing bpf programs.
  - for file based bpf iterator, bpf_iter_link provides a standard
    way to replace underlying bpf programs.
  - for both anonymous and file based iterators, bpf link query
    capability can be leveraged.

The patch added support of tracing/iter programs for BPF_LINK_CREATE.
A new link type BPF_LINK_TYPE_ITER is added to facilitate link
querying. Currently, only prog_id is needed, so no additional
in-kernel show_fdinfo() and fill_link_info() hooks
are needed for the BPF_LINK_TYPE_ITER link.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175901.2475084-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Yonghong Song
9dc3736a7f bpf: Allow loading of a bpf_iter program
A bpf_iter program is a tracing program with attach type
BPF_TRACE_ITER. The load attribute
  attach_btf_id
is used by the verifier against a particular kernel function,
which represents a target, e.g., __bpf_iter__bpf_map
for target bpf_map which is implemented later.

The program return value must be 0 or 1 for now.
  0 : successful, except potential seq_file buffer overflow
      which is handled by seq_file reader.
  1 : request to restart the same object

In the future, other return values may be used for filtering or
terminating the iterator.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175900.2474947-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Stanislav Fomichev
8b3cbf12a2 bpf: Allow any port in bpf_bind helper
We want to have a tighter control on what ports we bind to in
the BPF_CGROUP_INET{4,6}_CONNECT hooks even if it means
connect() becomes slightly more expensive. The expensive part
comes from the fact that we now need to call inet_csk_get_port()
that verifies that the port is not used and allocates an entry
in the hash table for it.

Since we can't rely on "snum || !bind_address_no_port" to prevent
us from calling POST_BIND hook anymore, let's add another bind flag
to indicate that the call site is BPF program.

v5:
* fix wrong AF_INET (should be AF_INET6) in the bpf program for v6

v3:
* More bpf_bind documentation refinements (Martin KaFai Lau)
* Add UDP tests as well (Martin KaFai Lau)
* Don't start the thread, just do socket+bind+listen (Martin KaFai Lau)

v2:
* Update documentation (Andrey Ignatov)
* Pass BIND_FORCE_ADDRESS_NO_PORT conditionally (Andrey Ignatov)
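
A hedged sketch of using bpf_bind() with a specific source port from a
connect hook (struct sockaddr_in and the AF_INET constant are assumed to come
from vmlinux.h or the usual headers; address and port are illustrative):

  SEC("cgroup/connect4")
  int bind_src(struct bpf_sock_addr *ctx)
  {
          struct sockaddr_in sa = {
                  .sin_family      = AF_INET,
                  .sin_port        = bpf_htons(50000),        /* any port, not just 0 */
                  .sin_addr.s_addr = bpf_htonl(0x7f000001),   /* 127.0.0.1 */
          };

          if (bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa)))
                  return 0;       /* reject connect() on bind failure */
          return 1;
  }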

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200508174611.228805-5-sdf@google.com
2020-05-20 01:00:06 -07:00
Stanislav Fomichev
dfa07417ff bpf: Bpf_{g,s}etsockopt for struct bpf_sock_addr
Currently, bpf_getsockopt and bpf_setsockopt helpers operate on the
'struct bpf_sock_ops' context in BPF_PROG_TYPE_SOCK_OPS program.
Let's generalize them and make them available for 'struct bpf_sock_addr'.
That way, in the future, we can allow those helpers in more places.

As an example, let's expose those 'struct bpf_sock_addr' based helpers to
BPF_CGROUP_INET{4,6}_CONNECT hooks. That way we can override CC before the
connection is made.
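
For instance, a hedged sketch of overriding the congestion control algorithm
from a connect hook (SOL_TCP and TCP_CONGESTION are assumed to be available
from the usual headers):

  SEC("cgroup/connect4")
  int set_cc(struct bpf_sock_addr *ctx)
  {
          char cc[] = "reno";

          /* override CC before the connection is established */
          bpf_setsockopt(ctx, SOL_TCP, TCP_CONGESTION, cc, sizeof(cc));
          return 1;
  }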

v3:
* Expose custom helpers for bpf_sock_addr context instead of doing
  generic bpf_sock argument (as suggested by Daniel). Even with
  try_socket_lock that doesn't sleep we have a problem where context sk
  is already locked and socket lock is non-nestable.

v2:
* s/BPF_PROG_TYPE_CGROUP_SOCKOPT/BPF_PROG_TYPE_SOCK_OPS/

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200430233152.199403-1-sdf@google.com
2020-05-20 01:00:06 -07:00
Song Liu
5c1c96c579 libbpf: Add support for command BPF_ENABLE_STATS
bpf_enable_stats() is added to enable given stats.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200430071506.1408910-3-songliubraving@fb.com
2020-05-20 01:00:06 -07:00
Song Liu
83f269b088 bpf: Sharing bpf runtime stats with BPF_ENABLE_STATS
Currently, sysctl kernel.bpf_stats_enabled controls BPF runtime stats.
Typical userspace tools use kernel.bpf_stats_enabled as follows:

  1. Enable kernel.bpf_stats_enabled;
  2. Check program run_time_ns;
  3. Sleep for the monitoring period;
  4. Check program run_time_ns again, calculate the difference;
  5. Disable kernel.bpf_stats_enabled.

The problem with this approach is that only one userspace tool can toggle
this sysctl. If multiple tools toggle the sysctl at the same time, the
measurement may be inaccurate.

To fix this problem while keeping backward compatibility, introduce a new
bpf command BPF_ENABLE_STATS. On success, this command enables stats and
returns a valid fd. BPF_ENABLE_STATS takes argument "type". Currently,
only one type, BPF_STATS_RUN_TIME, is supported. We can extend the
command to support other types of stats in the future.

With BPF_ENABLE_STATS, user space tool would have the following flow:

  1. Get a fd with BPF_ENABLE_STATS, and make sure it is valid;
  2. Check program run_time_ns;
  3. Sleep for the monitoring period;
  4. Check program run_time_ns again, calculate the difference;
  5. Close the fd.
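
A hedged sketch of that flow using the libbpf wrapper from the commit above
(run_time_ns would be read via bpf_obj_get_info_by_fd() on the program fd):

  int stats_fd = bpf_enable_stats(BPF_STATS_RUN_TIME);
  if (stats_fd < 0)
          return -1;

  /* sample run_time_ns, sleep for the monitoring period, sample again */

  close(stats_fd);        /* stats are disabled once the last holder closes */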

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200430071506.1408910-2-songliubraving@fb.com
2020-05-20 01:00:06 -07:00
Horatiu Vultur
597d350e4a net: bridge: Add port attribute IFLA_BRPORT_MRP_RING_OPEN
This patch adds a new port attribute, IFLA_BRPORT_MRP_RING_OPEN, which allows
notifying userspace when the port loses continuity of MRP frames.

This attribute is set by the kernel whenever the SW or HW detects that the ring is
being opened or closed.

Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-20 01:00:06 -07:00
Andrii Nakryiko
7fc4d5025b vmtest: add bpf_obj_id to 5.5.0 blacklist
bpf_obj_id selftest added testing of bpf_link related operations, which are
not implemented in 5.5.0. Blacklist it.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
bd9e2feb2a sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2fcd80144b93ff90836a44f2054b4d82133d3a85
Checkpoint bpf-next commit: c321022244708aec4675de4f032ef1ba9ff0c640
Baseline bpf commit:        edadedf1c5b4e4404192a0a4c3c0c05e3b7672ab
Checkpoint bpf commit:      7f645462ca01d01abb94d75e6768c8b3ed3a188b

Andrii Nakryiko (8):
  bpf: Add support for BPF_OBJ_GET_INFO_BY_FD for bpf_link
  libbpf: Add low-level APIs for new bpf_link commands
  libbpf: Refactor BTF-defined map definition parsing logic
  libbpf: Refactor map creation logic and fix cleanup leak
  libbpf: Add BTF-defined map-in-map support
  libbpf: Fix memory leak and possible double-free in hashmap__clear
  libbpf: Fix huge memory leak in libbpf_find_vmlinux_btf_id()
  libbpf: Fix false uninitialized variable warning

David Ahern (1):
  libbpf: Only check mode flags in get_xdp_id

Jakub Wilk (1):
  bpf: Fix reStructuredText markup

Maciej Żenczykowski (1):
  bpf: add bpf_ktime_get_boot_ns()

Mao Wenan (1):
  libbpf: Return err if bpf_object__load failed

Yoshiki Komachi (1):
  bpf_helpers.h: Add note for building with vmlinux.h or linux/types.h

Zou Wei (1):
  libbpf: Remove unneeded semicolon in btf_dump_emit_type

 include/uapi/linux/bpf.h |  46 ++-
 src/bpf.c                |  19 +-
 src/bpf.h                |   4 +-
 src/bpf_helpers.h        |   7 +
 src/btf_dump.c           |   2 +-
 src/hashmap.c            |   7 +
 src/libbpf.c             | 705 +++++++++++++++++++++++++++------------
 src/libbpf.map           |   6 +
 src/netlink.c            |   2 +
 9 files changed, 572 insertions(+), 226 deletions(-)

--
2.24.1
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
814ed5011f sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
f8faf2b33d libbpf: Fix false uninitialized variable warning
Some versions of GCC falsely detect that vi might not be initialized. That's
not true, but let's silence it with NULL initialization.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200430021436.1522502-1-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
3cb0b3fd52 libbpf: Fix huge memory leak in libbpf_find_vmlinux_btf_id()
BTF object wasn't freed.

Fixes: a6ed02cac690 ("libbpf: Load btf_vmlinux only once per object.")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200429012111.277390-9-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
edb1aaa8dc libbpf: Fix memory leak and possible double-free in hashmap__clear
Fix memory leak in hashmap_clear() not freeing hashmap_entry structs for each
of the remaining entries. Also NULL-out bucket list to prevent possible
double-free between hashmap__clear() and hashmap__free().

Running test_progs-asan flavor clearly showed this problem.

Reported-by: Alston Tang <alston64@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200429012111.277390-5-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
f3271942dd libbpf: Add BTF-defined map-in-map support
As discussed at LPC 2019 ([0]), this patch brings (a quite belated) support
for declarative BTF-defined map-in-map support in libbpf. It allows defining
ARRAY_OF_MAPS and HASH_OF_MAPS BPF maps without any user-space initialization
code involved.

Additionally, it allows initializing the outer map's slots with references to
respective inner maps at load time, also completely declaratively.

Despite C's weak type system, the way BTF-defined map-in-map definitions
work makes it actually quite hard to accidentally initialize an outer map with
incompatible inner maps. This being C, of course, it's still possible, but
even that would be caught at load time and an error returned, with a helpful
debug log pointing exactly to the slot that failed to be initialized.

As an example, here's a rather advanced HASH_OF_MAPS declaration and
initialization example, filling slots #0 and #4 with two inner maps:

  #include <bpf/bpf_helpers.h>

  struct inner_map {
          __uint(type, BPF_MAP_TYPE_ARRAY);
          __uint(max_entries, 1);
          __type(key, int);
          __type(value, int);
  } inner_map1 SEC(".maps"),
    inner_map2 SEC(".maps");

  struct outer_hash {
          __uint(type, BPF_MAP_TYPE_HASH_OF_MAPS);
          __uint(max_entries, 5);
          __uint(key_size, sizeof(int));
          __array(values, struct inner_map);
  } outer_hash SEC(".maps") = {
          .values = {
                  [0] = &inner_map2,
                  [4] = &inner_map1,
          },
  };

Here's the relevant part of the libbpf debug log showing pretty clearly what's
going on with map-in-map initialization:

  libbpf: .maps relo #0: for 6 value 0 rel.r_offset 96 name 260 ('inner_map1')
  libbpf: .maps relo #0: map 'outer_arr' slot [0] points to map 'inner_map1'
  libbpf: .maps relo #1: for 7 value 32 rel.r_offset 112 name 249 ('inner_map2')
  libbpf: .maps relo #1: map 'outer_arr' slot [2] points to map 'inner_map2'
  libbpf: .maps relo #2: for 7 value 32 rel.r_offset 144 name 249 ('inner_map2')
  libbpf: .maps relo #2: map 'outer_hash' slot [0] points to map 'inner_map2'
  libbpf: .maps relo #3: for 6 value 0 rel.r_offset 176 name 260 ('inner_map1')
  libbpf: .maps relo #3: map 'outer_hash' slot [4] points to map 'inner_map1'
  libbpf: map 'inner_map1': created successfully, fd=4
  libbpf: map 'inner_map2': created successfully, fd=5
  libbpf: map 'outer_hash': created successfully, fd=7
  libbpf: map 'outer_hash': slot [0] set to map 'inner_map2' fd=5
  libbpf: map 'outer_hash': slot [4] set to map 'inner_map1' fd=4

Notice from the log above that fd=6 (not logged explicitly) is used for inner
"prototype" map, necessary for creation of outer map. It is destroyed
immediately after outer map is created.

See also included selftest with some extra comments explaining extra details
of usage. Additionally, similar initialization syntax and libbpf functionality
can be used to do initialization of BPF_PROG_ARRAY with references to BPF
sub-programs. This can be done in follow up patches, if there will be a demand
for this.

  [0] https://linuxplumbersconf.org/event/4/contributions/448/

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200429002739.48006-4-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
040f73a7c7 libbpf: Refactor map creation logic and fix cleanup leak
Factor out map creation and destruction logic to simplify code and especially
error handling. Also fix map FD leak in case of partially successful map
creation during bpf_object load operation.

Fixes: 57a00f41644f ("libbpf: Add auto-pinning of maps when loading BPF objects")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200429002739.48006-3-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
35283f89c6 libbpf: Refactor BTF-defined map definition parsing logic
Factor out BTF map definition logic into stand-alone routine for easier reuse
for map-in-map case.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200429002739.48006-2-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
1c4c845e79 libbpf: Add low-level APIs for new bpf_link commands
Add low-level API calls for bpf_link_get_next_id() and
bpf_link_get_fd_by_id().
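
A hedged sketch of iterating all bpf_links with the new calls:

  __u32 id = 0;
  int fd;

  while (!bpf_link_get_next_id(id, &id)) {
          fd = bpf_link_get_fd_by_id(id);
          if (fd < 0)
                  continue;
          /* query the link, e.g. via BPF_OBJ_GET_INFO_BY_FD */
          close(fd);
  }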

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200429001614.1544-6-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
2a374b5df0 bpf: Add support for BPF_OBJ_GET_INFO_BY_FD for bpf_link
Add ability to fetch bpf_link details through BPF_OBJ_GET_INFO_BY_FD command.
Also enhance show_fdinfo to potentially include bpf_link type-specific
information (similarly to obj_info).

Also introduce enum bpf_link_type stored in bpf_link itself and expose it in
UAPI. bpf_link_tracing also now will store and return bpf_attach_type.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200429001614.1544-5-andriin@fb.com
2020-05-01 18:58:47 -07:00
Zou Wei
7878754030 libbpf: Remove unneeded semicolon in btf_dump_emit_type
Fixes the following coccicheck warning:

 tools/lib/bpf/btf_dump.c:661:4-5: Unneeded semicolon

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1588064829-70613-1-git-send-email-zou_wei@huawei.com
2020-05-01 18:58:47 -07:00
Mao Wenan
da5aa114e2 libbpf: Return err if bpf_object__load failed
bpf_object__load() has various return codes; when it fails to load an
object, it must return err instead of -EINVAL.

Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200426063635.130680-3-maowenan@huawei.com
2020-05-01 18:58:47 -07:00
Maciej Żenczykowski
625f64a126 bpf: add bpf_ktime_get_boot_ns()
On a device like a cellphone which is constantly suspending
and resuming CLOCK_MONOTONIC is not particularly useful for
keeping track of or reacting to external network events.
Instead you want to use CLOCK_BOOTTIME.

Hence add bpf_ktime_get_boot_ns() as a mirror of bpf_ktime_get_ns()
based around CLOCK_BOOTTIME instead of CLOCK_MONOTONIC.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-05-01 18:58:47 -07:00
Yoshiki Komachi
ba344d9494 bpf_helpers.h: Add note for building with vmlinux.h or linux/types.h
The following error was shown when a bpf program was compiled without
vmlinux.h auto-generated from BTF:

 # clang -I./linux/tools/lib/ -I/lib/modules/$(uname -r)/build/include/ \
   -O2 -Wall -target bpf -emit-llvm -c bpf_prog.c -o bpf_prog.bc
 ...
 In file included from linux/tools/lib/bpf/bpf_helpers.h:5:
 linux/tools/lib/bpf/bpf_helper_defs.h:56:82: error: unknown type name '__u64'
 ...

It seems that bpf programs are intended for being built together with
the vmlinux.h (which will have all the __u64 and other typedefs). But
users may mistakenly think "include <linux/types.h>" is missing
because the vmlinux.h is not common for non-bpf developers. IMO, an
explicit comment therefore should be added to bpf_helpers.h as this
patch shows.
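
For reference, the intended include order looks roughly like this, with
vmlinux.h generated via bpftool:

  /* bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h */
  #include "vmlinux.h"            /* or <linux/types.h> for __u64 and friends */
  #include <bpf/bpf_helpers.h>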

Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/1587427527-29399-1-git-send-email-komachi.yoshiki@gmail.com
2020-05-01 18:58:47 -07:00
Jakub Wilk
976e29343d bpf: Fix reStructuredText markup
The patch fixes:
$ scripts/bpf_helpers_doc.py > bpf-helpers.rst
$ rst2man bpf-helpers.rst > bpf-helpers.7
bpf-helpers.rst:1105: (WARNING/2) Inline strong start-string without end-string.

Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200422082324.2030-1-jwilk@jwilk.net
2020-05-01 18:58:47 -07:00
David Ahern
b3da63d59d libbpf: Only check mode flags in get_xdp_id
The commit in the Fixes tag changed get_xdp_id to only return prog_id
if flags is 0, but there are other XDP flags than the modes - e.g.,
XDP_FLAGS_UPDATE_IF_NOEXIST. Since the intention was only to look at
MODE flags, clear other ones before checking if flags is 0.

Fixes: f07cbad29741 ("libbpf: Fix bpf_get_link_xdp_id flags handling")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrey Ignatov <rdna@fb.com>
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
902ba3fd33 README: add Debian libbpf package link
Debian is now packaging libbpf from this repo. Add link to the package to README.
2020-05-01 18:20:43 -07:00
Andrii Nakryiko
cf3fc46ea8 sync: squelch annoying warning from filter-branch git command
Newer git started emitting warning about dangerousness of filter-branch.
Squelch it with FILTER_BRANCH_SQUELCH_WARNING=1 envvar.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-04-29 23:01:56 -07:00
Andrii Nakryiko
6a1615c263 vmtests: blacklist mmap test on 5.5
The 5.5 kernel has a bug allowing read-only access to an mmap()-ed map to be
violated. Disable the selftest that is now failing.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-04-17 15:31:03 -07:00
Andrii Nakryiko
e66d297441 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   1a323ea5356edbb3073dc59d51b9e6b86908857d
Checkpoint bpf-next commit: 2fcd80144b93ff90836a44f2054b4d82133d3a85
Baseline bpf commit:        94b18a87efdd1626a1e6aef87271af4a7c616d36
Checkpoint bpf commit:      edadedf1c5b4e4404192a0a4c3c0c05e3b7672ab

Andrey Ignatov (1):
  libbpf: Fix bpf_get_link_xdp_id flags handling

Andrii Nakryiko (1):
  libbpf: Always specify expected_attach_type on program load if
    supported

Jeremy Cline (1):
  libbpf: Initialize *nl_pid so gcc 10 is happy

Toke Høiland-Jørgensen (1):
  libbpf: Fix type of old_fd in bpf_xdp_set_link_opts

 src/libbpf.c  | 126 ++++++++++++++++++++++++++++++++------------------
 src/libbpf.h  |   2 +-
 src/netlink.c |   6 +--
 3 files changed, 86 insertions(+), 48 deletions(-)

--
2.24.1
2020-04-17 15:31:03 -07:00
Toke Høiland-Jørgensen
632afdff45 libbpf: Fix type of old_fd in bpf_xdp_set_link_opts
The 'old_fd' parameter used for atomic replacement of XDP programs is
supposed to be an FD, but was left as a u32 from an earlier iteration of
the patch that added it. It was converted to an int when read, so things
worked correctly even with negative values, but better change the
definition to correctly reflect the intention.

Fixes: bd5ca3ef93cd ("libbpf: Add function to set link XDP fd while specifying old program")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200414145025.182163-1-toke@redhat.com
2020-04-17 15:31:03 -07:00
Andrii Nakryiko
6e706b38bd libbpf: Always specify expected_attach_type on program load if supported
For some types of BPF programs that utilize expected_attach_type, libbpf won't
set load_attr.expected_attach_type, even if expected_attach_type is known from
section definition. This was done to preserve backwards compatibility with old
kernels that didn't recognize expected_attach_type attribute yet (which was
added in 5e43f899b03a ("bpf: Check attach type at prog load time"). But this
is problematic for some BPF programs that utilize newer features that require
kernel to know specific expected_attach_type (e.g., extended set of return
codes for cgroup_skb/egress programs).

This patch makes libbpf specify expected_attach_type by default, but also
detect kernel support for this field and omit it during program load when it
is not supported. This allows having good metadata for bpf_program
(e.g., bpf_program__get_expected_attach_type()), but still works with old
kernels (for cases where it can work at all).

Additionally, due to expected_attach_type being always set for recognized
program types, bpf_program__attach_cgroup doesn't have to do extra checks to
determine correct attach type, so remove that additional logic.

Also adjust section_names selftest to account for this change.

More detailed discussion can be found in [0].

  [0] https://lore.kernel.org/bpf/20200412003604.GA15986@rdna-mbp.dhcp.thefacebook.com/

Fixes: 5cf1e9145630 ("bpf: cgroup inet skb programs can return 0 to 3")
Fixes: 5e43f899b03a ("bpf: Check attach type at prog load time")
Reported-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/20200414182645.1368174-1-andriin@fb.com
2020-04-17 15:31:03 -07:00
Andrey Ignatov
850293ba1c libbpf: Fix bpf_get_link_xdp_id flags handling
Currently if one of XDP_FLAGS_{DRV,HW,SKB}_MODE flags is passed to
bpf_get_link_xdp_id() and there is a single XDP program attached to
ifindex, that program's id will be returned by bpf_get_link_xdp_id() in
prog_id argument no matter what mode the program is attached in, i.e.
flags argument is not taken into account.

For example, if there is a single program attached with
XDP_FLAGS_SKB_MODE but user calls bpf_get_link_xdp_id() with flags =
XDP_FLAGS_DRV_MODE, that skb program will be returned.

Fix it by returning info->prog_id only if user didn't specify flags. If
flags is specified then return corresponding mode-specific-field from
struct xdp_link_info.

The initial error was introduced in commit 50db9f073188 ("libbpf: Add a
support for getting xdp prog id on ifindex") and then refactored in
473f4e133a12 so 473f4e133a12 is used in the Fixes tag.

Fixes: 473f4e133a12 ("libbpf: Add bpf_get_link_xdp_info() function to get more XDP information")
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/0e9e30490b44b447bb2bebc69c7135e7fe7e4e40.1586236080.git.rdna@fb.com
2020-04-17 15:31:03 -07:00
Jeremy Cline
fb528063b2 libbpf: Initialize *nl_pid so gcc 10 is happy
Builds of Fedora's kernel-tools package started to fail with "may be
used uninitialized" warnings for nl_pid in bpf_set_link_xdp_fd() and
bpf_get_link_xdp_info() on the s390 architecture.

Although libbpf_netlink_open() always returns a negative number when it
does not set *nl_pid, the compiler does not determine this and thus
believes the variable might be used uninitialized. Assuage gcc's fears
by explicitly initializing nl_pid.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1807781

Signed-off-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200404051430.698058-1-jcline@redhat.com
2020-04-17 15:31:03 -07:00
Andrii Nakryiko
97ada10bd8 ci: update blacklists and Kconfig
Disable some of newest selftests on 5.5.0, turn on BPF_LSM on latest.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
9a35753b42 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   483d7a30f538e2f8addd32aa9a3d2e94ae55fa65
Checkpoint bpf-next commit: 1a323ea5356edbb3073dc59d51b9e6b86908857d
Baseline bpf commit:        94b18a87efdd1626a1e6aef87271af4a7c616d36
Checkpoint bpf commit:      94b18a87efdd1626a1e6aef87271af4a7c616d36

Andrii Nakryiko (2):
  bpf: Implement bpf_link-based cgroup BPF program attachment
  libbpf: Add support for bpf_link-based cgroup attachment

Antoine Tenart (1):
  net: macsec: add support for offloading to the MAC

Daniel Borkmann (2):
  bpf: Add netns cookie and enable it for bpf cgroup hooks
  bpf: Enable bpf cgroup hooks to retrieve cgroup v2 and ancestor id

Fletcher Dunn (1):
  libbpf, xsk: Init all ring members in xsk_umem__create and
    xsk_socket__create

Joe Stringer (1):
  bpf: Add socket assign support

KP Singh (2):
  bpf: Introduce BPF_PROG_TYPE_LSM
  tools/libbpf: Add support for BPF_PROG_TYPE_LSM

Mark Starovoytov (1):
  net: macsec: add support for specifying offload upon link creation

Stanislav Fomichev (1):
  libbpf: Don't allocate 16M for log buffer by default

Tobias Klauser (1):
  libbpf: Remove unused parameter `def` to get_map_field_int

Toke Høiland-Jørgensen (3):
  tools: Add EXPECTED_FD-related definitions in if_link.h
  libbpf: Add function to set link XDP fd while specifying old program
  libbpf: Add setter for initial value for internal maps

 include/uapi/linux/bpf.h     |  82 ++++++++++++++++++++-
 include/uapi/linux/if_link.h |   6 +-
 src/bpf.c                    |  37 +++++++++-
 src/bpf.h                    |  19 +++++
 src/btf.c                    |  20 ++++--
 src/libbpf.c                 | 134 +++++++++++++++++++++++++++++------
 src/libbpf.h                 |  22 +++++-
 src/libbpf.map               |   9 +++
 src/libbpf_probes.c          |   1 +
 src/netlink.c                |  34 ++++++++-
 src/xsk.c                    |  16 ++++-
 11 files changed, 345 insertions(+), 35 deletions(-)

--
2.24.1
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
c4af2093cc sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
1543a19f36 libbpf: Add support for bpf_link-based cgroup attachment
Add bpf_program__attach_cgroup(), which uses BPF_LINK_CREATE subcommand to
create an FD-based kernel bpf_link. Also add low-level bpf_link_create() API.

If expected_attach_type is not specified explicitly with
bpf_program__set_expected_attach_type(), libbpf will try to determine proper
attach type from BPF program's section definition.

Also add support for bpf_link's underlying BPF program replacement:
  - unconditional through high-level bpf_link__update_program() API;
  - cmpxchg-like with specifying expected current BPF program through
    low-level bpf_link_update() API.
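
A hedged usage sketch (error handling omitted):

  struct bpf_link *link;

  link = bpf_program__attach_cgroup(prog, cgroup_fd);

  /* later: atomically swap in a new program on the same link */
  bpf_link__update_program(link, new_prog);

  bpf_link__destroy(link);        /* detaches the cgroup program */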

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200330030001.2312810-4-andriin@fb.com
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
8b41602694 bpf: Implement bpf_link-based cgroup BPF program attachment
Implement new sub-command to attach cgroup BPF programs and return FD-based
bpf_link back on success. bpf_link, once attached to cgroup, cannot be
replaced, except by owner having its FD. Cgroup bpf_link supports only
BPF_F_ALLOW_MULTI semantics. Both link-based and prog-based BPF_F_ALLOW_MULTI
attachments can be freely intermixed.

To prevent bpf_cgroup_link from keeping cgroup alive past the point when no
BPF program can be executed, implement auto-detachment of link. When
cgroup_bpf_release() is called, all attached bpf_links are forced to release
cgroup refcounts, but they leave bpf_link otherwise active and allocated, as
well as still owning underlying bpf_prog. This is because user-space might
still have FDs open and active, so bpf_link as a user-referenced object can't
be freed yet. Once last active FD is closed, bpf_link will be freed and
underlying bpf_prog refcount will be dropped. But cgroup refcount won't be
touched, because cgroup is released already.

The inherent race between bpf_cgroup_link release (from closing last FD) and
cgroup_bpf_release() is resolved by both operations taking cgroup_mutex. So
the only additional check required is when bpf_cgroup_link attempts to detach
itself from cgroup. At that time we need to check whether there is still
cgroup associated with that link. And if not, exit with success, because
bpf_cgroup_link was already successfully detached.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Link: https://lore.kernel.org/bpf/20200330030001.2312810-2-andriin@fb.com
2020-04-02 00:02:25 -07:00
Joe Stringer
cecb299ac4 bpf: Add socket assign support
Add support for TPROXY via a new bpf helper, bpf_sk_assign().

This helper requires the BPF program to discover the socket via a call
to bpf_sk*_lookup_*(), then pass this socket to the new helper. The
helper takes its own reference to the socket in addition to any existing
reference that may or may not currently be obtained for the duration of
BPF processing. For the destination socket to receive the traffic, the
traffic must be routed towards that socket via local route. The
simplest example route is below, but in practice you may want to route
traffic more narrowly (eg by CIDR):

  $ ip route add local default dev lo

This patch avoids trying to introduce an extra bit into the skb->sk, as
that would require more invasive changes to all code interacting with
the socket to ensure that the bit is handled correctly, such as all
error-handling cases along the path from the helper in BPF through to
the orphan path in the input. Instead, we opt to use the destructor
variable to switch on the prefetch of the socket.
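
A rough sketch of the BPF side of this flow from a TC ingress program (tuple
construction and section naming are simplified; TC_ACT_* come from
linux/pkt_cls.h):

  SEC("tc")
  int tproxy_assign(struct __sk_buff *skb)
  {
          struct bpf_sock_tuple tuple = {};    /* fill from packet headers */
          struct bpf_sock *sk;
          int ret = TC_ACT_OK;

          sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple.ipv4),
                                 BPF_F_CURRENT_NETNS, 0);
          if (!sk)
                  return TC_ACT_OK;

          if (bpf_sk_assign(skb, sk, 0))
                  ret = TC_ACT_SHOT;           /* assignment failed */

          bpf_sk_release(sk);
          return ret;
  }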

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200329225342.16317-2-joe@wand.net.nz
2020-04-02 00:02:25 -07:00
KP Singh
90e89264b9 tools/libbpf: Add support for BPF_PROG_TYPE_LSM
Since BPF_PROG_TYPE_LSM uses the same attaching mechanism as
BPF_PROG_TYPE_TRACING, the common logic is refactored into a static
function bpf_program__attach_btf_id.

A new API call bpf_program__attach_lsm is still added to avoid userspace
conflicts if this ever changes in the future.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Florent Revest <revest@google.com>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200329004356.27286-7-kpsingh@chromium.org
2020-04-02 00:02:25 -07:00
KP Singh
f69cc97272 bpf: Introduce BPF_PROG_TYPE_LSM
Introduce types and configs for bpf programs that can be attached to
LSM hooks. The programs can be enabled by the config option
CONFIG_BPF_LSM.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Florent Revest <revest@google.com>
Reviewed-by: Thomas Garnier <thgarnie@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Link: https://lore.kernel.org/bpf/20200329004356.27286-2-kpsingh@chromium.org
2020-04-02 00:02:25 -07:00
Toke Høiland-Jørgensen
a6e9750c8a libbpf: Add setter for initial value for internal maps
For internal maps (most notably the maps backing global variables), libbpf
uses an internal mmaped area to store the data after opening the object.
This data is subsequently copied into the kernel map when the object is
loaded.

This adds a function to set a new value for that data, which can be used to
override it before the object is loaded into the kernel. This is especially
relevant for RODATA
maps, since those are frozen on load.
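
A hedged usage sketch (the rodata map name and the cfg struct are
illustrative placeholders):

  struct bpf_map *rodata = bpf_object__find_map_by_name(obj, "prog.rodata");

  err = bpf_map__set_initial_value(rodata, &cfg, sizeof(cfg));
  /* call after bpf_object__open() and before bpf_object__load() */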

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200329132253.232541-1-toke@redhat.com
2020-04-02 00:02:25 -07:00
Toke Høiland-Jørgensen
60bade6674 libbpf: Add function to set link XDP fd while specifying old program
This adds a new function to set the XDP fd while specifying the FD of the
program to replace, using the newly added IFLA_XDP_EXPECTED_FD netlink
parameter. The new function uses the opts struct mechanism to be extendable
in the future.
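
A hedged sketch of the atomic replace using the opts struct:

  DECLARE_LIBBPF_OPTS(bpf_xdp_set_link_opts, opts,
                      .old_fd = old_prog_fd);

  err = bpf_set_link_xdp_fd_opts(ifindex, new_prog_fd,
                                 XDP_FLAGS_REPLACE, &opts);
  /* fails if the currently attached program does not match old_prog_fd */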

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/158515700857.92963.7052131201257841700.stgit@toke.dk
2020-04-02 00:02:25 -07:00
Toke Høiland-Jørgensen
e13c1b7b85 tools: Add EXPECTED_FD-related definitions in if_link.h
This adds the IFLA_XDP_EXPECTED_FD netlink attribute definition and the
XDP_FLAGS_REPLACE flag to if_link.h in tools/include.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/158515700747.92963.8615391897417388586.stgit@toke.dk
2020-04-02 00:02:25 -07:00
Fletcher Dunn
1d8451ccaf libbpf, xsk: Init all ring members in xsk_umem__create and xsk_socket__create
Fix a sharp edge in xsk_umem__create and xsk_socket__create.  Almost all of
the members of the ring buffer structs are initialized, but the "cached_xxx"
variables are not all initialized.  The caller is required to zero them.
This is needlessly dangerous.  The results if you don't do it can be very bad.
For example, they can cause xsk_prod_nb_free and xsk_cons_nb_avail to return
values greater than the size of the queue.  xsk_ring_cons__peek can return an
index that does not refer to an item that has been queued.

I have confirmed that without this change, my program misbehaves unless I
memset the ring buffers to zero before calling the function.  Afterwards,
my program works without (or with) the memset.

Signed-off-by: Fletcher Dunn <fletcherd@valvesoftware.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/85f12913cde94b19bfcb598344701c38@valvesoftware.com
2020-04-02 00:02:25 -07:00
Daniel Borkmann
fad6e249ea bpf: Enable bpf cgroup hooks to retrieve cgroup v2 and ancestor id
Enable the bpf_get_current_cgroup_id() helper for connect(), sendmsg(),
recvmsg() and bind-related hooks in order to retrieve the cgroup v2
context which can then be used as part of the key for BPF map lookups,
for example. Given these hooks operate in process context 'current' is
always valid and pointing to the app that is performing mentioned
syscalls if it's subject to a v2 cgroup. Also with same motivation of
commit 7723628101aa ("bpf: Introduce bpf_skb_ancestor_cgroup_id helper")
enable retrieval of ancestor from current so the cgroup id can be used
for policy lookups which can then forbid connect() / bind(), for example.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/d2a7ef42530ad299e3cbb245e6c12374b72145ef.1585323121.git.daniel@iogearbox.net
2020-04-02 00:02:25 -07:00
Daniel Borkmann
64f7fa917c bpf: Add netns cookie and enable it for bpf cgroup hooks
In Cilium we're mainly using BPF cgroup hooks today in order to implement
kube-proxy free Kubernetes service translation for ClusterIP, NodePort (*),
ExternalIP, and LoadBalancer as well as HostPort mapping [0] for all traffic
between Cilium managed nodes. While this works in its current shape and avoids
packet-level NAT for inter Cilium managed node traffic, there is one major
limitation we're facing today, that is, lack of netns awareness.

In Kubernetes, the concept of Pods (which hold one or multiple containers)
has been built around network namespaces, so while we can use the global scope
of attaching to root BPF cgroup hooks also to our advantage (e.g. for exposing
NodePort ports on loopback addresses), we also need to differentiate
between the initial network namespace and non-initial ones. For example, ExternalIP
services mandate that non-local service IPs are not to be translated from the
host (initial) network namespace. Right now, we have an ugly
work-around in place where non-local service IPs for ExternalIP services are
not xlated from connect() and friends BPF hooks but instead via less efficient
packet-level NAT on the veth tc ingress hook for Pod traffic.

On top of determining whether we're in initial or non-initial network namespace
we also have a need for a socket-cookie like mechanism for network namespaces
scope. Socket cookies have the nice property that they can be combined as part
of the key structure e.g. for BPF LRU maps without having to worry that the
cookie could be recycled. We are planning to use this for our sessionAffinity
implementation for services. Therefore, add a new bpf_get_netns_cookie() helper
which would resolve both use cases at once: bpf_get_netns_cookie(NULL) would
provide the cookie for the initial network namespace while passing the context
instead of NULL would provide the cookie from the application's network namespace.
We're using a hole, so no size increase; the assignment happens only once.
Therefore this allows for a comparison on initial namespace as well as regular
cookie usage as we have today with socket cookies. We could later on enable
this helper for other program types as well as we would see need.
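
A hedged sketch of the two usages described above from a connect hook:

  SEC("cgroup/connect4")
  int svc_xlate(struct bpf_sock_addr *ctx)
  {
          __u64 init_ns = bpf_get_netns_cookie(NULL);   /* initial netns cookie */
          __u64 this_ns = bpf_get_netns_cookie(ctx);    /* caller's netns cookie */

          if (this_ns != init_ns) {
                  /* non-initial (pod) namespace: do service translation here */
          }
          return 1;
  }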

  (*) Both externalTrafficPolicy={Local|Cluster} types
  [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/c47d2346982693a9cf9da0e12690453aded4c788.1585323121.git.daniel@iogearbox.net
2020-04-02 00:02:25 -07:00
Stanislav Fomichev
240b8fa098 libbpf: Don't allocate 16M for log buffer by default
For each prog/btf load we allocate and free 16 megs of verifier buffer.
On production systems it doesn't really make sense because the
programs/btf have gone through extensive testing and are (mostly) guaranteed
to load successfully.

Let's assume successful case by default and skip buffer allocation
on the first try. If there is an error, start with BPF_LOG_BUF_SIZE
and double it on each ENOSPC iteration.

v3:
* Return -ENOMEM when can't allocate log buffer (Andrii Nakryiko)

v2:
* Don't allocate the buffer at all on the first try (Andrii Nakryiko)

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200325195521.112210-1-sdf@google.com
2020-04-02 00:02:25 -07:00
Tobias Klauser
3756d20499 libbpf: Remove unused parameter def to get_map_field_int
Has been unused since commit ef99b02b23ef ("libbpf: capture value in BTF
type info for BTF-defined map defs").

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200325113655.19341-1-tklauser@distanz.ch
2020-04-02 00:02:25 -07:00
Mark Starovoytov
9e8b23289f net: macsec: add support for specifying offload upon link creation
This patch adds a new netlink attribute to allow a user to (optionally)
specify the desired offload mode immediately upon MACSec link creation.

Separate iproute patch will be required to support this from user space.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-02 00:02:25 -07:00
Antoine Tenart
902eca48e5 net: macsec: add support for offloading to the MAC
This patch adds a new MACsec offloading option, MACSEC_OFFLOAD_MAC,
allowing a user to select a MAC as a provider for MACsec offloading
operations.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
9f0d55c24a vmtests: organize blacklists, enable sockmap_listen tests
Enable now-fixed sockmap_listen tests. Disable the vmlinux test on 5.5, on which
the hrtimer_nanosleep() signature is incompatible. Fill out the reasons for the
remaining permanently disabled tests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
e53dd1c436 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9b79c0be350d3825ef26ed9eebac6ae50df506bc
Checkpoint bpf-next commit: 483d7a30f538e2f8addd32aa9a3d2e94ae55fa65
Baseline bpf commit:        90db6d772f749e38171d04619a5e3cd8804a6d02
Checkpoint bpf commit:      94b18a87efdd1626a1e6aef87271af4a7c616d36

Andrii Nakryiko (2):
  libbpf: Ignore incompatible types with matching name during CO-RE
    relocation
  libbpf: Provide CO-RE variants of PT_REGS macros

Wenbo Zhang (1):
  bpf, libbpf: Fix ___bpf_kretprobe_args1(x) macro definition

 src/bpf_tracing.h | 105 +++++++++++++++++++++++++++++++++++++++++++++-
 src/libbpf.c      |   4 ++
 2 files changed, 108 insertions(+), 1 deletion(-)

--
2.17.1
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
da790d6014 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-03-17 14:56:36 -07:00
Wenbo Zhang
3d81b13b36 bpf, libbpf: Fix ___bpf_kretprobe_args1(x) macro definition
Use PT_REGS_RC instead of PT_REGS_RET to get ret correctly.

Fixes: df8ff35311c8 ("libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's bpf_tracing.h")
Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200315083252.22274-1-ethercflow@gmail.com
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
64bd9e074b libbpf: Provide CO-RE variants of PT_REGS macros
Syscall raw tracepoints have struct pt_regs pointer as tracepoint's first
argument. After that, reading any of pt_regs fields requires bpf_probe_read(),
even for tp_btf programs. Due to that, PT_REGS_PARMx macros are not usable as
is. This patch adds CO-RE variants of those macros that use BPF_CORE_READ() to
read necessary fields. This provides relocatable architecture-agnostic pt_regs
field accesses.
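
A hedged sketch with a tp_btf program (BPF_PROG comes from bpf_tracing.h; the
sys_enter raw tracepoint passes pt_regs as its first argument):

  SEC("tp_btf/sys_enter")
  int BPF_PROG(handle_sys_enter, struct pt_regs *regs, long id)
  {
          /* relocatable, arch-agnostic read of the first syscall argument */
          unsigned long arg1 = PT_REGS_PARM1_CORE(regs);

          bpf_printk("syscall %ld arg1 %lu", id, arg1);
          return 0;
  }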

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200313172336.1879637-4-andriin@fb.com
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
53d473dd8e libbpf: Ignore incompatible types with matching name during CO-RE relocation
When finding target type candidates, ignore forward declarations, functions,
and other named types of incompatible kind. Not doing this can cause false
errors.  See [0] for one such case (due to struct pt_regs forward
declaration).

  [0] https://github.com/iovisor/bcc/pull/2806#issuecomment-598543645

Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Reported-by: Wenbo Zhang <ethercflow@gmail.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200313172336.1879637-3-andriin@fb.com
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
6d64d927a2 vmtests: enable previously failing kprobe selftests
With fixes in selftests, these tests should now pass.
Also add the ability to add comments to the blacklist/whitelist to explain why
a certain test is disabled.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
cd87f1568e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   abbc61a5f26d52a5d3abbbe552b275360b2c6631
Checkpoint bpf-next commit: 9b79c0be350d3825ef26ed9eebac6ae50df506bc
Baseline bpf commit:        542bf38f11d11bf98c69b2f83f3519ada8a76e95
Checkpoint bpf commit:      90db6d772f749e38171d04619a5e3cd8804a6d02

Andrii Nakryiko (4):
  libbpf: Fix handling of optional field_name in
    btf_dump__emit_type_decl
  bpf: Switch BPF UAPI #define constants used from BPF program side to
    enums
  libbpf: Assume unsigned values for BTF_KIND_ENUM
  libbpf: Split BTF presence checks into libbpf- and kernel-specific
    parts

Carlos Neira (1):
  bpf: Added new helper bpf_get_ns_current_pid_tgid

Eelco Chaudron (1):
  bpf: Add bpf_xdp_output() helper

KP Singh (2):
  bpf: Introduce BPF_MODIFY_RETURN
  tools/libbpf: Add support for BPF_MODIFY_RETURN

Willem de Bruijn (1):
  bpf: Sync uapi bpf.h to tools/

 include/uapi/linux/bpf.h | 223 +++++++++++++++++++++++++++------------
 src/btf_dump.c           |  10 +-
 src/libbpf.c             |  21 +++-
 3 files changed, 176 insertions(+), 78 deletions(-)

--
2.17.1
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
c417a4cb6f sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-03-12 22:57:51 -07:00
Eelco Chaudron
fa21d33fff bpf: Add bpf_xdp_output() helper
Introduce new helper that reuses existing xdp perf_event output
implementation, but can be called from raw_tracepoint programs
that receive 'struct xdp_buff *' as a tracepoint argument.
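
A rough usage sketch (hedged; the traced function, map and metadata layout
below are illustrative, not part of this patch):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

struct {
    __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
    __uint(key_size, sizeof(int));
    __uint(value_size, sizeof(int));
} perf_buf_map SEC(".maps");

struct xdp_meta {
    __u32 ifindex;
    __u32 pkt_len;
};

SEC("fentry/xdp_prog_func") /* illustrative attach target */
int BPF_PROG(trace_xdp, struct xdp_buff *xdp)
{
    struct xdp_meta meta = {};
    long data = (long)BPF_CORE_READ(xdp, data);
    long data_end = (long)BPF_CORE_READ(xdp, data_end);

    meta.ifindex = BPF_CORE_READ(xdp, rxq, dev, ifindex);
    meta.pkt_len = data_end - data;

    /* ctx must be the xdp_buff pointer for this helper */
    bpf_xdp_output(xdp, &perf_buf_map, BPF_F_CURRENT_CPU,
                   &meta, sizeof(meta));
    return 0;
}

char LICENSE[] SEC("license") = "GPL";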

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158348514556.2239.11050972434793741444.stgit@xdp-tutorial
2020-03-12 22:57:51 -07:00
Carlos Neira
84cf76de9c bpf: Added new helper bpf_get_ns_current_pid_tgid
Add a new bpf helper, bpf_get_ns_current_pid_tgid.
This helper returns the pid and tgid of the current task
whose namespace matches the dev_t and inode number provided,
which allows us to instrument a process inside a container.
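
For illustration, a hedged sketch of calling the helper from a BPF program;
the namespace dev/inode values would come from user space (e.g. stat(2) on
/proc/<pid>/ns/pid) and the globals below are illustrative:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

const volatile __u64 pidns_dev = 0; /* filled in by the loader (illustrative) */
const volatile __u64 pidns_ino = 0; /* filled in by the loader (illustrative) */

SEC("tracepoint/syscalls/sys_enter_write")
int trace_write(void *ctx)
{
    struct bpf_pidns_info ns = {};

    /* returns 0 only if the current task is in the given pid namespace */
    if (bpf_get_ns_current_pid_tgid(pidns_dev, pidns_ino, &ns, sizeof(ns)))
        return 0;

    bpf_printk("in-namespace pid=%u tgid=%u", ns.pid, ns.tgid);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";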

Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200304204157.58695-3-cneirabustos@gmail.com
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
2ef4fdac6c libbpf: Split BTF presence checks into libbpf- and kernel-specific parts
The need for application BTF to be present differs between user-space libbpf
and the kernel. Currently, BTF is mandatory in the kernel only when a BPF
application is using STRUCT_OPS, while libbpf itself relies more heavily on the
presence of BTF:
  - for BTF-defined maps;
  - for Kconfig externs;
  - for STRUCT_OPS as well.

Thus, checks for presence and validity of a bpf_object's BTF need to be
performed separately, which is what this patch does.

Fixes: 5327644614a1 ("libbpf: Relax check whether BTF is mandatory")
Reported-by: Michal Rostecki <mrostecki@opensuse.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200312185033.736911-1-andriin@fb.com
2020-03-12 22:57:51 -07:00
KP Singh
1d72c9c382 tools/libbpf: Add support for BPF_MODIFY_RETURN
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200304191853.1529-6-kpsingh@chromium.org
2020-03-12 22:57:51 -07:00
KP Singh
7930230b43 bpf: Introduce BPF_MODIFY_RETURN
When multiple programs are attached, each program receives the return
value from the previous program on the stack and the last program
provides the return value to the attached function.

The fmod_ret bpf programs are run after the fentry programs and before
the fexit programs. The original function is only called if all the
fmod_ret programs return 0 to avoid any unintended side-effects. The
success value, i.e. 0 is not currently configurable but can be made so
where user-space can specify it at load time.

For example:

int func_to_be_attached(int a, int b)
{  <--- do_fentry

do_fmod_ret:
   <update ret by calling fmod_ret>
   if (ret != 0)
        goto do_fexit;

original_function:

    <side_effects_happen_here>

}  <--- do_fexit

The fmod_ret program attached to this function can be defined as:

SEC("fmod_ret/func_to_be_attached")
int BPF_PROG(func_name, int a, int b, int ret)
{
        // This will skip the original function logic.
        return 1;
}

The first fmod_ret program is passed 0 in its return argument.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200304191853.1529-4-kpsingh@chromium.org
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
483a8c238f libbpf: Assume unsigned values for BTF_KIND_ENUM
Currently, BTF_KIND_ENUM type doesn't record whether enum values should be
interpreted as signed or unsigned. In Linux, most enums are unsigned, though,
so interpreting them as unsigned matches the real world better.

Change btf_dump test case to test maximum 32-bit value, instead of negative
value.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200303003233.3496043-3-andriin@fb.com
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
26cbe2384c bpf: Switch BPF UAPI #define constants used from BPF program side to enums
Switch BPF UAPI constants, previously defined as #define macros, to anonymous
enum values. This preserves the constants' values and behavior in expressions,
but has the added advantage of being captured as part of DWARF and,
subsequently, BTF type info. That, in turn, greatly improves the usefulness of
the generated vmlinux.h for BPF applications, as it will not require BPF users
to copy/paste various flags and constants that are frequently used with BPF
helpers. Only those constants that are used/useful from the BPF program side
are converted.
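
Roughly, the conversion looks like this (illustrative snippet using existing
UAPI values, not the full patch):

/* Before: plain #defines, invisible to DWARF/BTF and thus to vmlinux.h:
 *
 *   #define BPF_ANY     0
 *   #define BPF_NOEXIST 1
 *   #define BPF_EXIST   2
 *
 * After: an anonymous enum with the same values, so the constants are
 * captured in BTF and show up in the generated vmlinux.h automatically. */
enum {
    BPF_ANY     = 0,
    BPF_NOEXIST = 1,
    BPF_EXIST   = 2,
};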

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200303003233.3496043-2-andriin@fb.com
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
cb4a430c8a libbpf: Fix handling of optional field_name in btf_dump__emit_type_decl
Internal functions, used by btf_dump__emit_type_decl(), assume field_name is
never going to be NULL. Ensure it's always the case.

Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200303180800.3303471-1-andriin@fb.com
2020-03-12 22:57:51 -07:00
Willem de Bruijn
f67d535cdb bpf: Sync uapi bpf.h to tools/
sync tools/include/uapi/linux/bpf.h to match include/uapi/linux/bpf.h

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200303200503.226217-3-willemdebruijn.kernel@gmail.com
2020-03-12 22:57:51 -07:00
Julia Kartseva
ef4785f065 vmtest: libbpf#137 follow-ups
- Run test_{maps|verifier} only with the latest kernel
- Mount run control script
- Style

Signed-off-by: Julia Kartseva <hex@fb.com>
2020-03-12 21:36:30 -07:00
Andrii Nakryiko
9a424bea42 vmtests: add few missing Kconfig settings
Add a few missing Kconfig settings that selftests might rely on.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-03-11 14:44:28 -07:00
Julia Kartseva
10e4311ad7 vmtest: add mkrootfs.sh to build Arch Linux disk image
Generate a disk image for libbpf testing in compressed *.zst format

The mkrootfs.sh script has the following stages:
- run pacstrap to install libbpf and selftests dependencies
- create /etc/fstab with bpffs and debugfs filesystems
- create /etc/init.d/rcS to mount them at boot time
- create /etc/inittab to invoke /etc/init.d/rcS
- compress the image

In addition, ./travis-ci/vmtest/run.sh sets up an ext4 fs and mounts
it as a loop device:
mkfs.ext4 -q "$tmp"
mount -o loop "$tmp" "$mnt"

Signed-off-by: Julia Kartseva <hex@fb.com>
2020-03-11 08:31:13 -07:00
Julia Kartseva
50febacba1 vmtest: disk image update; run test_{maps|verifier}; blacklist update
The disk image is updated to 2020-03-11.

blacklist for LATEST kernel:
attach_probe (needs root cause)
perf_buffer (needs root cause)
send_signal (flaky)
sockmap_listen (flaky)

Run test_maps and test_verifier.
test_maps is not expected to pass for kernels other than LATEST.

Signed-off-by: Julia Kartseva <hex@fb.com>
2020-03-11 08:31:13 -07:00
Andrii Nakryiko
ef7d57fcec vmtest: blacklist link_pinning selftest on 5.5.0
Link pinning is not supported by 5.5.0 and older kernels.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-03-03 00:05:56 -08:00
Andrii Nakryiko
7e7a15321e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   503d539a6e417b018616bf3060e0b5814fafce47
Checkpoint bpf-next commit: abbc61a5f26d52a5d3abbbe552b275360b2c6631
Baseline bpf commit:        41f57cfde186dba6e357f9db25eafbed017e4487
Checkpoint bpf commit:      542bf38f11d11bf98c69b2f83f3519ada8a76e95

Andrii Nakryiko (3):
  libbpf: Fix use of PT_REGS_PARM macros with vmlinux.h
  libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's
    bpf_tracing.h
  libbpf: Add bpf_link pinning/unpinning

 src/bpf_tracing.h | 120 +++++++++++++++++++++++++++++++++++++++++-
 src/libbpf.c      | 131 ++++++++++++++++++++++++++++++++++++----------
 src/libbpf.h      |   5 ++
 src/libbpf.map    |   5 ++
 4 files changed, 233 insertions(+), 28 deletions(-)

--
2.17.1
2020-03-03 00:05:56 -08:00
Andrii Nakryiko
77ac09c3eb libbpf: Add bpf_link pinning/unpinning
With the bpf_link abstraction supported by the kernel explicitly, add a
pinning/unpinning API for links. Also allow creating (opening) a bpf_link from a
BPF FS file.

This API allows "ephemeral" FD-based BPF links (like raw tracepoint
or fexit/freplace attachments) to survive user process exit, by pinning them in
a BPF FS, which is an important use case for long-running BPF programs.

As part of this, expose the underlying FD for bpf_link. While legacy bpf_links
might not have an FD associated with them (which will be expressed as
a bpf_link with fd=-1), the kernel's abstraction is based around FD-based usage,
so match it closely. This, subsequently, allows a generic
pinning/unpinning API for the generalized bpf_link. For some types of bpf_links
the kernel might not support pinning, in which case bpf_link__pin() will return
an error.

With FD being part of generic bpf_link, also get rid of bpf_link_fd in favor
of using vanilla bpf_link.
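
A hedged user-space sketch of the new API (pin path and error handling are
illustrative):

#include <bpf/libbpf.h>

/* pin an already-attached link so the attachment survives process exit */
int pin_link(struct bpf_link *link)
{
    return bpf_link__pin(link, "/sys/fs/bpf/my_raw_tp_link");
}

/* later, possibly from another process: re-open the pinned link and detach */
int unpin_and_detach(void)
{
    struct bpf_link *link = bpf_link__open("/sys/fs/bpf/my_raw_tp_link");

    if (libbpf_get_error(link))
        return -1;

    bpf_link__unpin(link);          /* removes the BPF FS entry */
    return bpf_link__destroy(link); /* drops the FD, detaching the link */
}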

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200303043159.323675-3-andriin@fb.com
2020-03-03 00:05:56 -08:00
Andrii Nakryiko
40a08ef216 libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's bpf_tracing.h
Move BPF_PROG, BPF_KPROBE, and BPF_KRETPROBE macro into libbpf's bpf_tracing.h
header to make it available for non-selftests users.
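
For illustration, a minimal kprobe/kretprobe sketch using the moved macros
(traced function and argument types are illustrative):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("kprobe/do_sys_open")
int BPF_KPROBE(handle_open, int dfd, const char *filename)
{
    /* BPF_KPROBE names typed arguments instead of raw pt_regs accesses */
    bpf_printk("do_sys_open: dfd=%d", dfd);
    return 0;
}

SEC("kretprobe/do_sys_open")
int BPF_KRETPROBE(handle_open_ret, long ret)
{
    bpf_printk("do_sys_open returned %ld", ret);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";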

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200229231112.1240137-5-andriin@fb.com
2020-03-03 00:05:56 -08:00
Andrii Nakryiko
b6683d1aeb libbpf: Fix use of PT_REGS_PARM macros with vmlinux.h
Add detection of vmlinux.h to bpf_tracing.h header for PT_REGS macro.
Currently, BPF applications have to define __KERNEL__ symbol to use correct
definition of struct pt_regs on x86 arch. This is due to different field names
under internal kernel vs UAPI conditions. To make this more transparent for
users, detect vmlinux.h by checking __VMLINUX_H__ symbol.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200229231112.1240137-3-andriin@fb.com
2020-03-03 00:05:56 -08:00
Julia Kartseva
5247b0b0dc vmtest: enable more networking kernel selftests
Set up loopback to enable more tests:
- bpf_tcp_ca
- cgroup_attach_autodetach
- cgroup_attach_multi
- cgroup_attach_override
- select_reuseport
- sockmap_ktls

Signed-off-by: Julia Kartseva <hex@fb.com>
2020-02-26 14:02:34 -08:00
Andrii Nakryiko
c2b01ad4f3 vmtest: trim down kernel config to minimize build time
Remove unnecessary drivers and features to speed up kernel compilation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-26 12:08:31 -08:00
Andrii Nakryiko
c4468dec74 sync: bump kernel commit to latest to pull in latest selftests
Manually bump the sync commit from the kernel repo. There are no libbpf changes, but
we need the latest selftest patches to help debug more of the crashing selftests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-25 20:01:06 -08:00
Andrii Nakryiko
40229b3ffd ci: enable more test_progs tests
Trim tests blacklist.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-22 09:20:41 -08:00
Andrii Nakryiko
7f2d538c27 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   5327644614a18f5d0ff845844a4e9976210b3d8d
Checkpoint bpf-next commit: 8eece07c011f88da0ccf4127fca9a4e4faaf58ae
Baseline bpf commit:        41f57cfde186dba6e357f9db25eafbed017e4487
Checkpoint bpf commit:      41f57cfde186dba6e357f9db25eafbed017e4487

Eelco Chaudron (2):
  libbpf: Bump libpf current version to v0.0.8
  libbpf: Add support for dynamic program attach target

 src/libbpf.c   | 34 ++++++++++++++++++++++++++++++----
 src/libbpf.h   |  4 ++++
 src/libbpf.map |  5 +++++
 3 files changed, 39 insertions(+), 4 deletions(-)

--
2.17.1
2020-02-22 09:20:41 -08:00
Eelco Chaudron
b7c162a433 libbpf: Add support for dynamic program attach target
Currently when you want to attach a trace program to a bpf program
the section name needs to match the tracepoint/function semantics.

However the addition of the bpf_program__set_attach_target() API
allows you to specify the tracepoint/function dynamically.

The call flow would look something like this:

  xdp_fd = bpf_prog_get_fd_by_id(id);
  trace_obj = bpf_object__open_file("func.o", NULL);
  prog = bpf_object__find_program_by_title(trace_obj,
                                           "fentry/myfunc");
  bpf_program__set_expected_attach_type(prog, BPF_TRACE_FENTRY);
  bpf_program__set_attach_target(prog, xdp_fd,
                                 "xdpfilt_blk_all");
  bpf_object__load(trace_obj)

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158220519486.127661.7964708960649051384.stgit@xdp-tutorial
2020-02-22 09:20:41 -08:00
Eelco Chaudron
36c26f12f1 libbpf: Bump libpf current version to v0.0.8
New development cycles starts, bump to v0.0.8.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158220518424.127661.8278643006567775528.stgit@xdp-tutorial
2020-02-22 09:20:41 -08:00
Andrii Nakryiko
22d5d40493 ci: fetch and build latest pahole
Build the latest pahole from sources instead of relying on the hacky Ubuntu
repository approach.
Also enable tests for the latest kernel that rely on pahole 1.16.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-21 20:39:35 -08:00
Andrii Nakryiko
17c26b7da6 ci: clean up .travis.yaml
Clean up Travis CI config, extract multi-step initializations into scripts.
Also, move kernel-building tests to happen last to not block lightweight
Debian and Ubuntu tests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-21 20:39:35 -08:00
Andrii Nakryiko
e287979374 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   35b9211c0a2427e8f39e534f442f43804fc8d5ca
Checkpoint bpf-next commit: 5327644614a18f5d0ff845844a4e9976210b3d8d
Baseline bpf commit:        08dc225d8868d5094ada62f471ebdfcce9dbc298
Checkpoint bpf commit:      41f57cfde186dba6e357f9db25eafbed017e4487

Andrii Nakryiko (1):
  libbpf: Relax check whether BTF is mandatory

Daniel Xu (1):
  selftests/bpf: Add bpf_read_branch_records() selftest

Toke Høiland-Jørgensen (2):
  bpf, uapi: Remove text about bpf_redirect_map() giving higher
    performance
  libbpf: Sanitise internal map names so they are not rejected by the
    kernel

 include/uapi/linux/bpf.h | 41 ++++++++++++++++++++++++++++++----------
 src/libbpf.c             | 12 ++++++++----
 2 files changed, 39 insertions(+), 14 deletions(-)

--
2.17.1
2020-02-20 17:56:42 -08:00
Andrii Nakryiko
552af3d963 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-02-20 17:56:42 -08:00
Toke Høiland-Jørgensen
c772c9cbde libbpf: Sanitise internal map names so they are not rejected by the kernel
The kernel only accepts map names containing alphanumeric characters, underscores
and periods. However, the auto-generated internal map names
used by libbpf take their prefix from the user-supplied BPF object name,
which has no such restriction. This can lead to "Invalid argument" errors
when trying to load a BPF program using global variables.

Fix this by sanitising the map names, replacing any non-allowed characters
with underscores.

Fixes: d859900c4c56 ("bpf, libbpf: support global data/bss/rodata sections")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200217171701.215215-1-toke@redhat.com
2020-02-20 17:56:42 -08:00
Toke Høiland-Jørgensen
031a38cceb bpf, uapi: Remove text about bpf_redirect_map() giving higher performance
The performance of bpf_redirect() is now roughly the same as that of
bpf_redirect_map(). However, David Ahern pointed out that the header file
has not been updated to reflect this, and still says that a significant
performance increase is possible when using bpf_redirect_map(). Remove this
text from the bpf_redirect_map() description, and reword the description in
bpf_redirect() slightly. Also fix the 'Return' section of the
bpf_redirect_map() documentation.

Fixes: 1d233886dd90 ("xdp: Use bulking for non-map XDP_REDIRECT and consolidate code paths")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200218130334.29889-1-toke@redhat.com
2020-02-20 17:56:42 -08:00
Andrii Nakryiko
6ff5062480 libbpf: Relax check whether BTF is mandatory
If a BPF program is using BTF-defined maps, BTF is required only for
libbpf itself to process the map definitions. If after that BTF fails to
be loaded into the kernel (e.g., if the kernel doesn't support BTF at all),
this shouldn't prevent a valid BPF program from loading. The existing
retry-without-BTF logic for creating maps will succeed in creating such
maps without any problems. So, the presence of a .maps section shouldn't make
BTF required for the kernel. Update the check accordingly.

Validated by ensuring that a simple BPF program with BTF-defined maps still
loads on an old kernel without BTF support and that the map is correctly parsed
and created.

Fixes: abd29c931459 ("libbpf: allow specifying map definitions using BTF")
Reported-by: Julia Kartseva <hex@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200220062635.1497872-1-andriin@fb.com
2020-02-20 17:56:42 -08:00
Daniel Xu
fdff85e63e selftests/bpf: Add bpf_read_branch_records() selftest
Add a selftest to test:

* default bpf_read_branch_records() behavior
* BPF_F_GET_BRANCH_RECORDS_SIZE flag behavior
* error path on non branch record perf events
* using helper to write to stack
* using helper to write to global

On host with hardware counter support:

    # ./test_progs -t perf_branches
    #27/1 perf_branches_hw:OK
    #27/2 perf_branches_no_hw:OK
    #27 perf_branches:OK
    Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED

On host without hardware counter support (VM):

    # ./test_progs -t perf_branches
    #27/1 perf_branches_hw:OK
    #27/2 perf_branches_no_hw:OK
    #27 perf_branches:OK
    Summary: 1/2 PASSED, 1 SKIPPED, 0 FAILED

Also sync tools/include/uapi/linux/bpf.h.
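
A hedged sketch of the helper usage being exercised (buffer size and output
handling are illustrative, not the selftest itself):

#include <linux/bpf.h>
#include <linux/perf_event.h>
#include <linux/bpf_perf_event.h>
#include <bpf/bpf_helpers.h>

#define MAX_ENTRIES 16

SEC("perf_event")
int perf_branches(struct bpf_perf_event_data *ctx)
{
    struct perf_branch_entry entries[MAX_ENTRIES] = {};
    long written, needed;

    /* with BPF_F_GET_BRANCH_RECORDS_SIZE, only report the required size */
    needed = bpf_read_branch_records(ctx, NULL, 0,
                                     BPF_F_GET_BRANCH_RECORDS_SIZE);

    /* without the flag, copy the records into the buffer */
    written = bpf_read_branch_records(ctx, entries, sizeof(entries), 0);
    if (written < 0)
        return 0; /* e.g. not a branch-record perf event */

    bpf_printk("branch records: %ld bytes (need %ld)", written, needed);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";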

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200218030432.4600-3-dxu@dxuuu.xyz
2020-02-20 17:56:42 -08:00
Andrii Nakryiko
5c7661fd5e vmtest: update and sort blacklists
Update blacklists to omit some of the newest selftests. Also ensure that
blacklist is sorted alphabetically.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-20 17:56:42 -08:00
Andrii Nakryiko
1feb21b081 vmtest: remove temporary runqslower fix
It's now in bpf-next and this workaround is not needed anymore.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-20 11:48:22 -08:00
Andrii Nakryiko
fa8cb316fb sync: fix commit signature determination in sync script
The commit signature, used to determine already synced commits, includes short
stats for each relevant file. Fix the script to include only files that are
actually synced (i.e., exclude Makefile, Build file, etc).

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-20 11:12:59 -08:00
Julia Kartseva
f72fe00e70 vmtest: #121 follow-ups. Loop increase bpf-next git fetch depth
- The previously introduced git fetch depth of the bpf-next tree is not sufficient
when bpf-next is far ahead of the libbpf checkpoint commit, so increase the
depth up to a max of 128. Since 128 may be overkill for the general case, increase it
exponentially in a loop until the max is reached.

- Do not fetch bpf-next twice
- Remove setup_example.sh
2020-02-19 15:01:47 -08:00
Julia Kartseva
583bddce6b vmtest: build and run bpf kernel selftests against various kernels
Run kernel selftests in vmtest with the goal of testing libbpf backward
compatibility with older kernels.
The list of kernels should be specified in the .travis.yml config in the
`jobs` section, e.g. KERNEL=5.5.0.
Enlisted kernel releases:
- 5.5.0 # built from main
- 5.5.0-rc6 # built from bpf-next
- LATEST

The kernel specified as 'LATEST' in .travis.yml is built from the bpf-next kernel
tree; the rest of the kernels are downloaded from the locations specified in the
INDEX file.
The kernel sources from bpf-next are manually patched with [1] from the bpf tree to
fix the runqslower build. This workaround should be removed after the patch is merged
from bpf to bpf-next.
Due to the kernel sources being checked out, the duration of the LATEST kernel test
is ~30m.

bpf selftests are built from tools/testing/selftests/bpf/ of bpf-next tree with
HEAD revision set to CHECKPOINT-COMMIT specified in libbpf so selftests and
libbpf are in sync.
Currently only programs are tested with the test_progs binary; test_maps and
test_verifier should follow.
test_progs is run with a blacklist, required due to:
- some features, e.g. fentry/fexit, not being supported in older kernels
- environment limitations, e.g. the absence of a recent pahole in Debian
- an incomplete disk image

The blacklist is passed to test_progs with -b option as specified in [2]
patch set.

Most of the preceding tests are disabled due to the incomplete disk image currently
lacking proper networking settings.
For the LATEST kernel some fentry/fexit tests are disabled because pahole v1.16
is not available in Debian yet.

Next steps are resolving issues with blacklisted tests, enabling maps and
verifier testing, expanding the list of tested kernels.

[1] https://lore.kernel.org/bpf/908498f794661c44dca54da9e09dc0c382df6fcb.1580425879.git.hex@fb.com/t.mbox.gz
[2] https://www.spinics.net/lists/netdev/msg625192.html
2020-02-17 22:12:17 -08:00
Julia Kartseva
a52fb86a96 vmtest: add configs for bpf kernel selftests
vmtest is run as a TravisCI job in order to test libbpf backward compatibility
with older kernels

Add config files required to build and run bpf kernel selftests in vmtest:
- latest.config: latest kernel config
- INDEX: links to binaries (kernels, disk image) to download
- blacklist/BLACKLIST-${kernel}: blacklisted bpf program tests for ${kernel}
2020-02-17 22:12:17 -08:00
Andrii Nakryiko
e5dbc1a96f sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   a6ed02cac690b635dbb938690e795812ce1e14ca
Checkpoint bpf-next commit: 35b9211c0a2427e8f39e534f442f43804fc8d5ca
Baseline bpf commit:        1712b2fff8c682d145c7889d2290696647d82dab
Checkpoint bpf commit:      08dc225d8868d5094ada62f471ebdfcce9dbc298

Alexei Starovoitov (1):
  libbpf: Add support for program extensions

Andrii Nakryiko (2):
  libbpf: Improve handling of failed CO-RE relocations
  libbpf: Fix realloc usage in bpf_core_find_cands

Antoine Tenart (1):
  net: macsec: introduce the macsec_context structure

Martin KaFai Lau (1):
  bpf: Sync uapi bpf.h to tools/

 include/uapi/linux/bpf.h     |  10 +++-
 include/uapi/linux/if_link.h |   7 +++
 src/bpf.c                    |   3 +-
 src/libbpf.c                 | 112 +++++++++++++++++++++--------------
 src/libbpf.h                 |   8 ++-
 src/libbpf.map               |   2 +
 src/libbpf_probes.c          |   1 +
 7 files changed, 97 insertions(+), 46 deletions(-)

--
2.17.1
2020-01-24 14:08:27 -08:00
Andrii Nakryiko
96333403ca sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-01-24 14:08:27 -08:00
Andrii Nakryiko
928f2fc146 libbpf: Fix realloc usage in bpf_core_find_cands
Fix a bug requesting an invalid size for the reallocated array when constructing the
CO-RE relocation candidate list. This can cause problems if there are many potential
candidates and very fine-grained memory allocator bucket sizes are used.

Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Reported-by: William Smith <williampsmith@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200124201847.212528-1-andriin@fb.com
2020-01-24 14:08:27 -08:00
Andrii Nakryiko
8fd8b5bb46 libbpf: Improve handling of failed CO-RE relocations
Previously, if libbpf failed to resolve a CO-RE relocation for some
instructions, it would either return an error immediately, or, if the
.relaxed_core_relocs option was set, would replace the relocatable offset/imm part
of an instruction with a bogus value (-1). Neither approach is good, because
there are many possible scenarios where relocation is expected to fail (e.g.,
when some field is known to possibly be missing on specific kernel versions). On the
other hand, replacing the offset with an invalid one can hide programmer errors if
this relocation failure wasn't anticipated.

This patch deprecates the .relaxed_core_relocs option and changes the approach to
always replacing the instruction for which relocation failed with an invalid BPF
helper call instruction. For cases where this is expected, the BPF program should
already ensure that that instruction is unreachable, in which case this
invalid instruction is going to be silently ignored. But if the instruction wasn't
guarded, the BPF program will be rejected at the verification step, with the
verifier log pointing precisely to the place in the assembly where the problem is.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200124053837.2434679-1-andriin@fb.com
2020-01-24 14:08:27 -08:00
Martin KaFai Lau
b999e8f2c1 bpf: Sync uapi bpf.h to tools/
This patch syncs uapi bpf.h to tools/.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200122233652.903348-1-kafai@fb.com
2020-01-24 14:08:27 -08:00
Alexei Starovoitov
c6c86a53f2 libbpf: Add support for program extensions
Add minimal support for program extensions. bpf_object_open_opts needs to be
set up with attach_prog_fd = target_prog_fd, and the BPF program extension needs to
have a section definition in its .c file like SEC("freplace/func_to_be_replaced").
libbpf will search for "func_to_be_replaced" in the target_prog_fd's BTF and
will pass it in attach_btf_id to the kernel. This approach works for tests, but
more complex use cases may need to request the function name (and the attach_btf_id
that the kernel sees) more dynamically. Such an API will be added in future patches.
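
A hedged user-space sketch of the flow described above (object file name and
error handling are illustrative):

#include <bpf/libbpf.h>

int load_extension(int target_prog_fd)
{
    /* ext.o contains: SEC("freplace/func_to_be_replaced") int new_impl(...) */
    DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
        .attach_prog_fd = target_prog_fd);
    struct bpf_object *obj;

    obj = bpf_object__open_file("ext.o", &opts);
    if (libbpf_get_error(obj))
        return -1;

    /* libbpf resolves attach_btf_id from the target program's BTF on load */
    return bpf_object__load(obj);
}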

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200121005348.2769920-3-ast@kernel.org
2020-01-24 14:08:27 -08:00
Antoine Tenart
c69f0d12f3 net: macsec: introduce the macsec_context structure
This patch introduces the macsec_context structure. It will be used
in the kernel to exchange information between the common MACsec
implementation (macsec.c) and the MACsec hardware offloading
implementations. This structure contains pointers to MACsec specific
structures which contain the actual MACsec configuration, and to the
underlying device (phydev for now).

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24 14:08:27 -08:00
Andrii Nakryiko
033ad7ee78 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   20f21d98cf12b8ecd69e8defc93fae9e3b353b13
Checkpoint bpf-next commit: a6ed02cac690b635dbb938690e795812ce1e14ca
Baseline bpf commit:        1712b2fff8c682d145c7889d2290696647d82dab
Checkpoint bpf commit:      1712b2fff8c682d145c7889d2290696647d82dab

Andrii Nakryiko (3):
  libbpf: Fix error handling bug in btf_dump__new
  libbpf: Simplify BTF initialization logic
  libbpf: Fix potential multiplication overflow in mmap() size
    calculation

KP Singh (1):
  libbpf: Load btf_vmlinux only once per object.

 src/btf_dump.c |   1 +
 src/libbpf.c   | 174 ++++++++++++++++++++++++++++++-------------------
 2 files changed, 109 insertions(+), 66 deletions(-)

--
2.17.1
2020-01-17 20:33:22 -08:00
KP Singh
397db2175d libbpf: Load btf_vmlinux only once per object.
As more programs (TRACING, STRUCT_OPS, and upcoming LSM) use vmlinux
BTF information, loading the BTF vmlinux information for every program
in an object is sub-optimal. The fix was originally proposed in:

   https://lore.kernel.org/bpf/CAEf4BzZodr3LKJuM7QwD38BiEH02Cc1UbtnGpVkCJ00Mf+V_Qg@mail.gmail.com/

The btf_vmlinux is populated in the object if any of the programs in
the object requires it just before the programs are loaded and freed
after the programs finish loading.

Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Brendan Jackman <jackmanb@chromium.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200117212825.11755-1-kpsingh@chromium.org
2020-01-17 20:33:22 -08:00
Andrii Nakryiko
6756bdc96e libbpf: Fix potential multiplication overflow in mmap() size calculation
Prevent potential overflow performed in 32-bit integers, before assigning
result to size_t. Reported by LGTM static analysis.

Fixes: eba9c5f498a1 ("libbpf: Refactor global data map initialization")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200117060801.1311525-4-andriin@fb.com
2020-01-17 20:33:22 -08:00
Andrii Nakryiko
9c1ae55dbd libbpf: Simplify BTF initialization logic
Current implementation of bpf_object's BTF initialization is very convoluted
and thus prone to errors. It doesn't have to be like that. This patch
simplifies it significantly.

This code also triggered static analysis issues over logically dead code due
to redundant error checks. This simplification should fix that as well.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200117060801.1311525-3-andriin@fb.com
2020-01-17 20:33:22 -08:00
Andrii Nakryiko
091f073ff0 libbpf: Fix error handling bug in btf_dump__new
Fix missing jump to error handling in btf_dump__new, found by Coverity static
code analysis.

Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200117060801.1311525-2-andriin@fb.com
2020-01-17 20:33:22 -08:00
Andrii Nakryiko
ad51a528dc travis-ci: make sure before_script override is non-empty
Travis CI seems to be ignoring empty before_script override. Let's make sure
it's a non-empty no-op.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-01-17 20:00:47 -08:00
Andrii Nakryiko
fa29cc01ff sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   858e284f0ec18bff2620d9a6afe764dc683f8ba1
Checkpoint bpf-next commit: 20f21d98cf12b8ecd69e8defc93fae9e3b353b13
Baseline bpf commit:        1712b2fff8c682d145c7889d2290696647d82dab
Checkpoint bpf commit:      1712b2fff8c682d145c7889d2290696647d82dab

Andrii Nakryiko (1):
  libbpf: Revert bpf_helper_defs.h inclusion regression

 src/bpf_helpers.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
2.17.1
2020-01-16 20:06:41 -08:00
Andrii Nakryiko
a2bec08412 libbpf: Revert bpf_helper_defs.h inclusion regression
Revert bpf_helpers.h's change to include auto-generated bpf_helper_defs.h
through <> instead of "", which causes it to be searched in include path. This
can break existing applications that don't have their include path pointing
directly to where libbpf installs its headers.

There is ongoing work to make all (not just bpf_helper_defs.h) includes more
consistent across libbpf and its consumers, but this unbreaks user code as is
right now without any regressions. Selftests still behave sub-optimally
(taking bpf_helper_defs.h from libbpf's source directory, if it's present
there), which will be fixed in subsequent patches.

Fixes: 6910d7d3867a ("selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir")
Reported-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200117004103.148068-1-andriin@fb.com
2020-01-16 20:06:41 -08:00
Andrii Nakryiko
080fd68e9c sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   1d1a3bcffe360a56fd8cc287ed74d4c3066daf42
Checkpoint bpf-next commit: 858e284f0ec18bff2620d9a6afe764dc683f8ba1
Baseline bpf commit:        e7a5f1f1cd0008e5ad379270a8657e121eedb669
Checkpoint bpf commit:      1712b2fff8c682d145c7889d2290696647d82dab

Andrii Nakryiko (2):
  tools: Sync uapi/linux/if_link.h
  libbpf: Support .text sub-calls relocations

Brian Vazquez (1):
  libbpf: Fix unneeded extra initialization in bpf_map_batch_common

Martin KaFai Lau (1):
  libbpf: Expose bpf_find_kernel_btf as a LIBBPF_API

Yonghong Song (3):
  bpf: Add bpf_send_signal_thread() helper
  tools/bpf: Sync uapi header bpf.h
  libbpf: Add libbpf support to batch ops

 include/uapi/linux/bpf.h     |  40 +++++++++++-
 include/uapi/linux/if_link.h |   1 +
 src/bpf.c                    |  58 +++++++++++++++++
 src/bpf.h                    |  22 +++++++
 src/btf.c                    | 102 +++++++++++++++++++++++++++--
 src/btf.h                    |   2 +
 src/libbpf.c                 | 122 +++++++----------------------------
 src/libbpf.map               |   5 ++
 8 files changed, 247 insertions(+), 105 deletions(-)

--
2.17.1
2020-01-16 10:38:48 -08:00
Andrii Nakryiko
437f57042c sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-01-16 10:38:48 -08:00
Brian Vazquez
f4f271b068 libbpf: Fix unneeded extra initialization in bpf_map_batch_common
bpf_attr doesn't need to be declared with '= {}' as memset is used
in the code.

Fixes: 2ab3d86ea1859 ("libbpf: Add libbpf support to batch ops")
Reported-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200116045918.75597-1-brianvv@google.com
2020-01-16 10:38:48 -08:00
Martin KaFai Lau
bd35a43bb3 libbpf: Expose bpf_find_kernel_btf as a LIBBPF_API
This patch exposes bpf_find_kernel_btf() as a LIBBPF_API.
It will be used in 'bpftool map dump' in a following patch
to dump a map with btf_vmlinux_value_type_id set.

bpf_find_kernel_btf() is renamed to libbpf_find_kernel_btf()
and moved to btf.c.  As <linux/kernel.h> is included,
some of the max/min type casting needs to be fixed.
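
A hedged usage sketch of the newly exposed API (output is illustrative):

#include <stdio.h>
#include <bpf/btf.h>
#include <bpf/libbpf.h>

int print_vmlinux_type_count(void)
{
    struct btf *vmlinux_btf = libbpf_find_kernel_btf();

    if (libbpf_get_error(vmlinux_btf))
        return -1;

    printf("vmlinux BTF has %u types\n", btf__get_nr_types(vmlinux_btf));
    btf__free(vmlinux_btf);
    return 0;
}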

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200115230031.1102305-1-kafai@fb.com
2020-01-16 10:38:48 -08:00
Yonghong Song
d91f681d3b libbpf: Add libbpf support to batch ops
Added four libbpf API functions to support map batch operations:
  . int bpf_map_delete_batch( ... )
  . int bpf_map_lookup_batch( ... )
  . int bpf_map_lookup_and_delete_batch( ... )
  . int bpf_map_update_batch( ... )
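
A hedged user-space sketch of draining a map with the batched
lookup-and-delete variant (key/value types and batch size are illustrative):

#include <errno.h>
#include <bpf/bpf.h>

#define BATCH 64

int drain_map(int map_fd)
{
    __u32 keys[BATCH], out_batch, count;
    __u64 vals[BATCH];
    void *in_batch = NULL;
    DECLARE_LIBBPF_OPTS(bpf_map_batch_opts, opts);
    int err = 0;

    while (!err) {
        count = BATCH;
        err = bpf_map_lookup_and_delete_batch(map_fd, in_batch, &out_batch,
                                              keys, vals, &count, &opts);
        if (err && errno != ENOENT)
            return -errno;            /* real failure */
        /* process `count` entries from keys[]/vals[] here */
        in_batch = &out_batch;        /* resume from the returned cursor */
    }
    return 0;                         /* ENOENT: the map has been drained */
}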

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-8-brianvv@google.com
2020-01-16 10:38:48 -08:00
Yonghong Song
1e51491d05 tools/bpf: Sync uapi header bpf.h
sync uapi header include/uapi/linux/bpf.h to
tools/include/uapi/linux/bpf.h

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-7-brianvv@google.com
2020-01-16 10:38:48 -08:00
Andrii Nakryiko
37440e95d1 libbpf: Support .text sub-calls relocations
The LLVM patch https://reviews.llvm.org/D72197 makes LLVM emit function call
relocations within the same section. This includes a default .text section,
which contains any BPF sub-programs. This wasn't the case before and so libbpf
was able to get away with slightly simpler handling of subprogram call
relocations.

This patch adds support for .text section relocations. It needs to ensure
correct order of relocations, so does two passes:
- first, relocate .text instructions, if there are any relocations in it;
- then process all the other programs and copy over patched .text instructions
for all sub-program calls.

v1->v2:
- break early once .text program is processed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115190856.2391325-1-andriin@fb.com
2020-01-16 10:38:48 -08:00
Yonghong Song
681f2f9291 bpf: Add bpf_send_signal_thread() helper
Commit 8b401f9ed244 ("bpf: implement bpf_send_signal() helper")
added helper bpf_send_signal() which permits bpf program to
send a signal to the current process. The signal may be
delivered to any threads in the process.

We found a use case where sending the signal to the current
thread is preferable.
  - A bpf program will collect the stack trace and then
    send signal to the user application.
  - The user application will add some thread specific
    information to the just collected stack trace for
    later analysis.

If bpf_send_signal() is used, user application will need
to check whether the thread receiving the signal matches
the thread collecting the stack by checking thread id.
If not, it will need to send signal to another thread
through pthread_kill().

This patch proposes a new helper, bpf_send_signal_thread(),
which sends the signal to the thread corresponding to
the current kernel task. This way, user space is guaranteed that
bpf_program execution context and user space signal handling
context are the same thread.
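
A hedged BPF-side sketch (signal number and program section are illustrative):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("perf_event")
int on_sample(void *ctx)
{
    /* ... collect the stack trace here, e.g. via bpf_get_stackid() ... */

    /* deliver the signal to the current thread, not just "the process" */
    bpf_send_signal_thread(10 /* SIGUSR1 on x86 */);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";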

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115035002.602336-1-yhs@fb.com
2020-01-16 10:38:48 -08:00
Andrii Nakryiko
0e4638ec14 tools: Sync uapi/linux/if_link.h
Sync uapi/linux/if_link.h into tools to avoid out of sync warnings during
libbpf build.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200113073143.1779940-2-andriin@fb.com
2020-01-16 10:38:48 -08:00
hex
234a45a128 Update README with Distribution section
- List of current distros having libbpf packaged from GH
- Rationale of having libbpf packaged from GH
- List of package dependencies
2020-01-14 16:12:42 -08:00
Andrii Nakryiko
8687395198 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2bbc078f812d45b8decb55935dab21199bd21489
Checkpoint bpf-next commit: 1d1a3bcffe360a56fd8cc287ed74d4c3066daf42
Baseline bpf commit:        3c2f450e553ce47fcb0d6141807a8858e3213c9c
Checkpoint bpf commit:      e7a5f1f1cd0008e5ad379270a8657e121eedb669

Alexei Starovoitov (1):
  libbpf: Sanitize global functions

Andrey Ignatov (1):
  bpf: Document BPF_F_QUERY_EFFECTIVE flag

Andrii Nakryiko (3):
  libbpf: Make bpf_map order and indices stable
  selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir
  libbpf: Poison kernel-only integer types

Martin KaFai Lau (2):
  bpf: Synch uapi bpf.h to tools/
  bpf: libbpf: Add STRUCT_OPS support

Michal Rostecki (1):
  libbpf: Add probe for large INSN limit

 include/uapi/linux/bpf.h |  26 +-
 include/uapi/linux/btf.h |   6 +
 src/bpf.c                |  13 +-
 src/bpf.h                |   5 +-
 src/bpf_helpers.h        |   2 +-
 src/bpf_prog_linfo.c     |   3 +
 src/btf.c                |   3 +
 src/btf_dump.c           |   3 +
 src/hashmap.c            |   3 +
 src/libbpf.c             | 701 +++++++++++++++++++++++++++++++++++++--
 src/libbpf.h             |   6 +-
 src/libbpf.map           |   4 +
 src/libbpf_errno.c       |   3 +
 src/libbpf_probes.c      |  26 ++
 src/netlink.c            |   3 +
 src/nlattr.c             |   3 +
 src/str_error.c          |   3 +
 src/xsk.c                |   3 +
 18 files changed, 784 insertions(+), 32 deletions(-)

--
2.17.1
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
0cccc9ff28 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
b50eb28758 libbpf: Poison kernel-only integer types
It's been a recurring issue with types like u32 slipping into libbpf source
code accidentally. This is not detected during builds inside kernel source
tree, but becomes a compilation error in libbpf's Github repo. Libbpf is
supposed to use only __{s,u}{8,16,32,64} typedefs, so poison {s,u}{8,16,32,64}
explicitly in every .c file. Doing that in a more centralized way, e.g.,
inside libbpf_internal.h, breaks selftests, which use both the kernel u32 and
libbpf_internal.h.

This patch also fixes a new u32 occurrence in libbpf.c, added recently.
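
A hedged sketch of the approach, as it would appear near the top of a libbpf
.c file:

#include <linux/types.h>   /* __u8, __u16, __u32, __u64, __s8, ... */

/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64

static __u32 example_len;   /* OK: UAPI-style typedef */
/* static u32 bad_len; */   /* would now fail to compile */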

Fixes: 590a00888250 ("bpf: libbpf: Add STRUCT_OPS support")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200110181916.271446-1-andriin@fb.com
2020-01-10 11:15:12 -08:00
Alexei Starovoitov
8d936a1570 libbpf: Sanitize global functions
In case the kernel doesn't support BTF_FUNC_GLOBAL, sanitize the BTF produced by the
compiler for global functions.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200110064124.1760511-2-ast@kernel.org
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
2c8602eb54 selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir
Reorder includes search path to ensure $(OUTPUT) and $(CURDIR) go before
libbpf's directory. Also fix bpf_helpers.h to include bpf_helper_defs.h in
such a way as to leverage includes search path. This allows selftests to not
use libbpf's local and potentially stale bpf_helper_defs.h. It's important
because selftests/bpf's Makefile only re-generates bpf_helper_defs.h in
selftests' output directory, not the one in libbpf's directory.

Also force regeneration of bpf_helper_defs.h when libbpf.a is updated to
reduce staleness.

Fixes: fa633a0f8919 ("libbpf: Fix build on read-only filesystems")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200110051716.1591485-3-andriin@fb.com
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
1ef23426e7 libbpf: Make bpf_map order and indices stable
Currently, libbpf re-sorts bpf_map structs after all the maps are added and
initialized, which might change their relative order and invalidate any
bpf_map pointer or index taken before that. This is inconvenient and
error-prone. For instance, it can cause .kconfig map index to point to a wrong
map.

Furthermore, libbpf itself doesn't rely on any specific ordering of bpf_maps,
so it's just an unnecessary complication right now. This patch drops sorting
of maps and makes their relative positions fixed. If efficient index is ever
needed, it's better to have a separate array of pointers as a search index,
instead of reordering bpf_map struct in-place. This will be less error-prone
and will allow multiple independent orderings, if necessary (e.g., either by
section index or by name).

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Reported-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200110034247.1220142-1-andriin@fb.com
2020-01-10 11:15:12 -08:00
Andrey Ignatov
d2100072b9 bpf: Document BPF_F_QUERY_EFFECTIVE flag
Document the BPF_F_QUERY_EFFECTIVE flag, mostly to clarify how it affects
attach_flags, which may not be obvious and may lead to confusion.

Specifically, attach_flags is returned only for target_fd, but if programs
are inherited from an ancestor cgroup then the returned attach_flags for the
current cgroup may be confusing. For example, two effective programs of the
same attach_type can be returned but without BPF_F_ALLOW_MULTI in
attach_flags.

Simple repro:
  # bpftool c s /sys/fs/cgroup/path/to/task
  ID       AttachType      AttachFlags     Name
  # bpftool c s /sys/fs/cgroup/path/to/task effective
  ID       AttachType      AttachFlags     Name
  95043    ingress                         tw_ipt_ingress
  95048    ingress                         tw_ingress

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200108014006.938363-1-rdna@fb.com
2020-01-10 11:15:12 -08:00
Martin KaFai Lau
f3edca46e5 bpf: libbpf: Add STRUCT_OPS support
This patch adds BPF STRUCT_OPS support to libbpf.

The only sec_name convention is SEC(".struct_ops") to identify the
struct_ops implemented in BPF,
e.g. To implement a tcp_congestion_ops:

SEC(".struct_ops")
struct tcp_congestion_ops dctcp = {
	.init           = (void *)dctcp_init,  /* <-- a bpf_prog */
	/* ... some more func ptrs ... */
	.name           = "bpf_dctcp",
};

Each struct_ops is defined as a global variable under SEC(".struct_ops")
as above.  libbpf creates a map for each variable and the variable name
is the map's name.  Multiple struct_ops are supported under
SEC(".struct_ops").

In the bpf_object__open phase, libbpf will look for the SEC(".struct_ops")
section and find out what is the btf-type the struct_ops is
implementing.  Note that the btf-type here is referring to
a type in the bpf_prog.o's btf.  A "struct bpf_map" is added
by bpf_object__add_map() as other maps do.  It will then
collect (through SHT_REL) where are the bpf progs that the
func ptrs are referring to.  No btf_vmlinux is needed in
the open phase.

In the bpf_object__load phase, the map-fields, which depend
on the btf_vmlinux, are initialized (in bpf_map__init_kern_struct_ops()).
It will also set the prog->type, prog->attach_btf_id, and
prog->expected_attach_type.  Thus, the prog's properties do
not rely on its section name.
[ Currently, the bpf_prog's btf-type ==> btf_vmlinux's btf-type matching
  process is as simple as: member-name match + btf-kind match + size match.
  If these matching conditions fail, libbpf will reject.
  The current targeting support is "struct tcp_congestion_ops" which
  most of its members are function pointers.
  The member ordering of the bpf_prog's btf-type can be different from
  the btf_vmlinux's btf-type. ]

Then, all obj->maps are created as usual (in bpf_object__create_maps()).

Once the maps are created and prog's properties are all set,
the libbpf will proceed to load all the progs.

bpf_map__attach_struct_ops() is added to register a struct_ops
map to a kernel subsystem.
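
For illustration, a hedged user-space sketch of registering the struct_ops
map from the example above (assumes the object has already been opened and
loaded; names and error handling are illustrative):

#include <bpf/libbpf.h>

int register_dctcp(struct bpf_object *obj)
{
    /* one map per SEC(".struct_ops") variable, named after the variable */
    struct bpf_map *map = bpf_object__find_map_by_name(obj, "dctcp");
    struct bpf_link *link;

    if (!map)
        return -1;

    link = bpf_map__attach_struct_ops(map);
    if (libbpf_get_error(link))
        return -1;

    /* bpf_link__destroy(link) would unregister it again */
    return 0;
}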

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200109003514.3856730-1-kafai@fb.com
2020-01-10 11:15:12 -08:00
Martin KaFai Lau
cabb077325 bpf: Synch uapi bpf.h to tools/
This patch syncs uapi bpf.h to tools/.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200109003512.3856559-1-kafai@fb.com
2020-01-10 11:15:12 -08:00
Michal Rostecki
b95b281039 libbpf: Add probe for large INSN limit
Introduce a new probe which checks whether the kernel has the larger maximum
program size that was increased in the following commit:

c04c0d2b968a ("bpf: increase complexity limit and maximum program size")

Based on the similar check in Cilium[0], authored by Daniel Borkmann.

  [0] 657d0f585a

Co-authored-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Link: https://lore.kernel.org/bpf/20200108162428.25014-2-mrostecki@opensuse.org
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
5033d7177e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   7745ff9842617323adbe24e71c495d5ebd9aa352
Checkpoint bpf-next commit: 2bbc078f812d45b8decb55935dab21199bd21489
Baseline bpf commit:        1148f9adbe71415836a18a36c1b4ece999ab0973
Checkpoint bpf commit:      3c2f450e553ce47fcb0d6141807a8858e3213c9c

Andrey Ignatov (2):
  bpf: Support replacing cgroup-bpf program in MULTI mode
  libbpf: Introduce bpf_prog_attach_xattr

Andrii Nakryiko (1):
  libbpf: Support CO-RE relocations for LDX/ST/STX instructions

 include/uapi/linux/bpf.h | 10 ++++++++++
 src/bpf.c                | 17 ++++++++++++++++-
 src/bpf.h                | 11 +++++++++++
 src/libbpf.c             | 31 ++++++++++++++++++++++++++++---
 src/libbpf.map           |  1 +
 5 files changed, 66 insertions(+), 4 deletions(-)

--
2.17.1
2019-12-30 12:05:19 -08:00
Andrii Nakryiko
49058f8c6f libbpf: Support CO-RE relocations for LDX/ST/STX instructions
Clang patch [0] enables emitting relocatable generic ALU/ALU64 instructions
(i.e., shifts and arithmetic operations), as well as generic load/store
instructions. The former ones are already supported by libbpf as is. This
patch adds further support for load/store instructions. Relocatable field
offset is encoded in the BPF instruction's 16-bit offset section and is adjusted
by libbpf based on target kernel BTF.

These Clang changes and corresponding libbpf changes allow for more succinct
generated BPF code by encoding relocatable field reads as a single
ST/LDX/STX instruction. It also enables relocatable access to BPF context.
Previously, if context struct (e.g., __sk_buff) was accessed with CO-RE
relocations (e.g., due to preserve_access_index attribute), it would be
rejected by BPF verifier due to modified context pointer dereference. With
Clang patch, such context accesses are both relocatable and have a fixed
offset from the point of view of BPF verifier.

  [0] https://reviews.llvm.org/D71790

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191223180305.86417-1-andriin@fb.com
2019-12-30 12:05:19 -08:00
Andrey Ignatov
8b20ffa4b9 libbpf: Introduce bpf_prog_attach_xattr
Introduce a new bpf_prog_attach_xattr function that, in addition to
program fd, target fd and attach type, accepts an extendable struct
bpf_prog_attach_opts.

bpf_prog_attach_opts relies on DECLARE_LIBBPF_OPTS macro to maintain
backward and forward compatibility and has the following "optional"
attach attributes:

* existing attach_flags, since it's not required when attaching in NONE
  mode. Even though it's quite often used in MULTI and OVERRIDE mode it
  seems to be a good idea to reduce number of arguments to
  bpf_prog_attach_xattr;

* newly introduced attribute of BPF_PROG_ATTACH command: replace_prog_fd
  that is fd of previously attached cgroup-bpf program to replace if
  BPF_F_REPLACE flag is used.

The new function is named to be consistent with other xattr-functions
(bpf_prog_test_run_xattr, bpf_create_map_xattr, bpf_load_program_xattr).

The struct bpf_prog_attach_opts is supposed to be used with
DECLARE_LIBBPF_OPTS macro.
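
A hedged user-space sketch of replacing one program in a MULTI chain (FDs are
assumed valid; attach type is illustrative):

#include <bpf/bpf.h>

int replace_egress_prog(int cgroup_fd, int old_prog_fd, int new_prog_fd)
{
    DECLARE_LIBBPF_OPTS(bpf_prog_attach_opts, opts,
        .flags = BPF_F_ALLOW_MULTI | BPF_F_REPLACE,
        .replace_prog_fd = old_prog_fd);

    return bpf_prog_attach_xattr(new_prog_fd, cgroup_fd,
                                 BPF_CGROUP_INET_EGRESS, &opts);
}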

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/bd6e0732303eb14e4b79cb128268d9e9ad6db208.1576741281.git.rdna@fb.com
2019-12-30 12:05:19 -08:00
Andrey Ignatov
1b1e30679f bpf: Support replacing cgroup-bpf program in MULTI mode
The common use-case in production is to have multiple cgroup-bpf
programs per attach type that cover multiple use-cases. Such programs
are attached with BPF_F_ALLOW_MULTI and can be maintained by different
people.

Order of programs usually matters, for example imagine two egress
programs: the first one drops packets and the second one counts packets.
If they're swapped, the result of the counting program will be different.

It brings operational challenges with updating cgroup-bpf program(s)
attached with BPF_F_ALLOW_MULTI since there is no way to replace a
program:

* One way to update is to detach all programs first and then attach the
  new version(s) again in the right order. This introduces an
  interruption in the work a program is doing and may not be acceptable
  (e.g. if it's egress firewall);

* Another way is to attach the new version of a program first and only then
  detach the old version. This introduces a time interval when two
  versions of the same program are working, which may not be acceptable if a
  program is not idempotent. It also imposes an additional burden on
  program developers to make sure that two versions of their program can
  co-exist.

Solve the problem by introducing a "replace" mode in the BPF_PROG_ATTACH
command for cgroup-bpf programs attached with the BPF_F_ALLOW_MULTI
flag. This mode is enabled by the newly introduced BPF_F_REPLACE attach flag
and the bpf_attr.replace_bpf_fd attribute, which passes the fd of the old program
to replace.

That way the user can replace any program among those attached with the
BPF_F_ALLOW_MULTI flag without the problems described above.

Details of the new API:

* If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid
  descriptor of BPF program, BPF_PROG_ATTACH will return corresponding
  error (EINVAL or EBADF).

* If replace_bpf_fd has valid descriptor of BPF program but such a
  program is not attached to specified cgroup, BPF_PROG_ATTACH will
  return ENOENT.

BPF_F_REPLACE is introduced to make the user intent clear, since
replace_bpf_fd alone can't be used for this (its default value, 0, is a
valid fd). BPF_F_REPLACE also makes it possible to extend the API in the
future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed).

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com
2019-12-30 12:05:19 -08:00
Andrii Nakryiko
e7a82fc033 sync: add zlib dependency and libbpf_common.h to list of installed headers
zlib is now a direct dependency of libbpf (previously zlib was only a dependency
of libelf, on which libbpf depends as well). For the non-pkg-config case, specify
the `-lz` linker flag explicitly.

Recent sync also added another public header to libbpf. Include it in a list
of headers that are installed on target system.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
7c5583ab2d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   679152d3a32e305c213f83160c328c37566ae8bc
Checkpoint bpf-next commit: 7745ff9842617323adbe24e71c495d5ebd9aa352
Baseline bpf commit:        fe3300897cbfd76c6cb825776e5ac0ca50a91ca4
Checkpoint bpf commit:      1148f9adbe71415836a18a36c1b4ece999ab0973

Andrii Nakryiko (26):
  libbpf: Extract and generalize CPU mask parsing logic
  libbpf: Don't attach perf_buffer to offline/missing CPUs
  libbpf: Don't require root for bpf_object__open()
  libbpf: Add generic bpf_program__attach()
  libbpf: Move non-public APIs from libbpf.h to libbpf_internal.h
  libbpf: Add BPF_EMBED_OBJ macro for embedding BPF .o files
  libbpf: Extract common user-facing helpers
  libbpf: Expose btf__align_of() API
  libbpf: Expose BTF-to-C type declaration emitting API
  libbpf: Expose BPF program's function name
  libbpf: Refactor global data map initialization
  libbpf: Postpone BTF ID finding for TRACING programs to load phase
  libbpf: Reduce log level of supported section names dump
  libbpf: Add BPF object skeleton support
  libbpf: Extract internal map names into constants
  libbpf: Support libbpf-provided extern variables
  bpftool: Generate externs datasec in BPF skeleton
  libbpf: Support flexible arrays in CO-RE
  libbpf: Add zlib as a dependency in pkg-config template
  libbpf: Reduce log level for custom section names
  libbpf: Remove BPF_EMBED_OBJ macro from libbpf.h
  libbpf: Add bpf_link__disconnect() API to preserve underlying BPF
    resource
  libbpf: Put Kconfig externs into .kconfig section
  libbpf: Allow to augment system Kconfig through extra optional config
  libbpf: BTF is required when externs are present
  libbpf: Fix another __u64 printf warning

Jakub Sitnicki (1):
  libbpf: Recognize SK_REUSEPORT programs from section name

Prashant Bhole (1):
  libbpf: Fix build by renaming variables

Toke Høiland-Jørgensen (4):
  libbpf: Print hint about ulimit when getting permission denied error
  libbpf: Fix libbpf_common.h when installing libbpf through 'make
    install'
  libbpf: Add missing newline in opts validation macro
  libbpf: Fix printing of ulimit value

 include/uapi/linux/btf.h |    7 +-
 src/bpf.h                |    6 +-
 src/bpf_helpers.h        |   11 +
 src/btf.c                |   48 +-
 src/btf.h                |   29 +-
 src/btf_dump.c           |  115 ++-
 src/libbpf.c             | 1673 ++++++++++++++++++++++++++++++++------
 src/libbpf.h             |  107 +--
 src/libbpf.map           |   12 +
 src/libbpf.pc.template   |    2 +-
 src/libbpf_common.h      |   40 +
 src/libbpf_internal.h    |   21 +-
 12 files changed, 1678 insertions(+), 393 deletions(-)
 create mode 100644 src/libbpf_common.h

--
2.17.1
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
0d9d85e345 libbpf: Fix another __u64 printf warning
Fix yet another printf warning for %llu specifier on ppc64le. This time size_t
casting won't work, so cast to verbose `unsigned long long`.

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191219052103.3515-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen
c8c4edf4c9 libbpf: Fix printing of ulimit value
Naresh pointed out that libbpf builds fail on 32-bit architectures because
rlimit.rlim_cur is defined as 'unsigned long long' on those architectures.
Fix this by using %zu in printf and casting to size_t.

Fixes: dc3a2d254782 ("libbpf: Print hint about ulimit when getting permission denied error")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191219090236.905059-1-toke@redhat.com
2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen
23983fd75b libbpf: Add missing newline in opts validation macro
The error log output in the opts validation macro was missing a newline.

Fixes: 2ce8450ef5a3 ("libbpf: add bpf_object__open_{file, mem} w/ extensible opts")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191219120714.928380-1-toke@redhat.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
4bbdefdce1 libbpf: BTF is required when externs are present
BTF is required to get type information about extern variables.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191219002837.3074619-4-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
5bc09e54fa libbpf: Allow to augment system Kconfig through extra optional config
Instead of an all-or-nothing approach of overriding the Kconfig file location,
allow extending it with extra values and overriding a chosen subset of values
through an optional user-provided extra config, passed as a string through the
open options' .kconfig option. If the same config key is present in both the
user-supplied config and Kconfig, the user-supplied one wins. This allows
applications to more easily test various conditions regardless of the host
kernel's real configuration. If all of a BPF object's __kconfig externs are
satisfied from the user-supplied config, the system Kconfig won't be read at
all.

Simplify selftests by not needing to create temporary Kconfig files.
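
A userspace sketch of the new option (the object file name and config values
below are illustrative only, not taken from this patch):

DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
        /* entries here win over values read from the system Kconfig */
        .kconfig = "CONFIG_BPF_JIT_ALWAYS_ON=y\nCONFIG_HZ=1000",
);
struct bpf_object *obj = bpf_object__open_file("probe.bpf.o", &opts);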

Suggested-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191219002837.3074619-3-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
9f61b5b95c libbpf: Put Kconfig externs into .kconfig section
Move Kconfig-provided externs into custom .kconfig section. Add __kconfig into
bpf_helpers.h for user convenience. Update selftests accordingly.

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191219002837.3074619-2-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
baa3268b13 libbpf: Add bpf_link__disconnect() API to preserve underlying BPF resource
There are cases in which a BPF resource (program, map, etc.) has to outlive
the userspace program that "installed" it in the system in the first place.
When a BPF program is attached, libbpf returns a bpf_link object, which
is supposed to be destroyed when no longer necessary through the
bpf_link__destroy() API. Currently, bpf_link destruction causes both automatic
detachment and freeing of any resources allocated for the bpf_link in-memory
representation. This is inconvenient for the case described above because of
the coupling of detachment and resource freeing.

This patch introduces the bpf_link__disconnect() API call, which marks a
bpf_link as disconnected from its underlying BPF resource. This means that
when the bpf_link is destroyed later, all its memory resources will be freed,
but the BPF resource itself won't be detached.

This design follows a strict and resource-leak-free approach by default, while
giving user code an easy and straightforward way to opt into keeping a BPF
resource attached beyond the lifetime of a bpf_link. For some BPF programs
(e.g., FS-based tracepoints, kprobes, raw tracepoints, etc.), the user has to
make sure to pin the BPF program to prevent the kernel from automatically
detaching it on process exit. This is typically achieved by pinning the BPF
program (or map, in some cases) in BPF FS.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191218225039.2668205-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
f5599ef856 libbpf: Remove BPF_EMBED_OBJ macro from libbpf.h
Drop BPF_EMBED_OBJ and struct bpf_embed_data now that the skeleton
automatically embeds the contents of its source object file. While
BPF_EMBED_OBJ is useful independently of the skeleton, we currently don't have
any use cases utilizing it, so let's remove it until/if we need it.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191218052552.2915188-3-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
99c65fed78 libbpf: Reduce log level for custom section names
Libbpf tries to recognize the BPF program type based on its section name
during the bpf_object__open() phase. This is not strictly enforced, and user
code has the ability to specify/override the correct BPF program type after
open. But if a BPF program uses a custom section name, libbpf will still emit
warnings, which can be quite annoying to users. This patch reduces the log
level of informational messages emitted by libbpf if a section name is not
canonical. The user can still get a list of all supported section names as a
debug-level message.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191217234228.1739308-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen
e9adfa851f libbpf: Fix libbpf_common.h when installing libbpf through 'make install'
This fixes two issues with the newly introduced libbpf_common.h file:

- The header failed to include <string.h> for the definition of memset()
- The new file was not included in the install_headers rule in the Makefile

Both of these issues cause breakage when installing libbpf with 'make
install' and trying to use it in applications.

Fixes: 544402d4b493 ("libbpf: Extract common user-facing helpers")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191217112810.768078-1-toke@redhat.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
8363b8d4e6 libbpf: Add zlib as a dependency in pkg-config template
List zlib as another dependency of libbpf in the pkg-config template.
Verified it is correctly resolved to the proper -lz flag:

$ make DESTDIR=/tmp/libbpf-install install
$ pkg-config --libs /tmp/libbpf-install/usr/local/lib64/pkgconfig/libbpf.pc
-L/usr/local/lib64 -lbpf
$ pkg-config --libs --static /tmp/libbpf-install/usr/local/lib64/pkgconfig/libbpf.pc
-L/usr/local/lib64 -lbpf -lelf -lz

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Luca Boccassi <bluca@debian.org>
Link: https://lore.kernel.org/bpf/20191216183830.3972964-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen
f892b464d0 libbpf: Print hint about ulimit when getting permission denied error
Probably the single most common error newcomers to XDP are stumped by is
the 'permission denied' error they get when trying to load their program
and 'ulimit -l' is set too low. For examples, see [0], [1].

Since the error code is UAPI, we can't change that. Instead, this patch
adds a few heuristics in libbpf and outputs an additional hint if they are
met: If an EPERM is returned on map create or program load, and geteuid()
shows we are root, and the current RLIMIT_MEMLOCK is not infinity, we
output a hint about raising 'ulimit -l' as an additional log line.

[0] https://marc.info/?l=xdp-newbies&m=157043612505624&w=2
[1] https://github.com/xdp-project/xdp-tutorial/issues/86
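
For context, the typical application-side workaround (independent of this
patch) is to raise the locked-memory limit before loading; a minimal sketch:

#include <stdio.h>
#include <sys/resource.h>

struct rlimit rlim = { RLIM_INFINITY, RLIM_INFINITY };

/* raise the locked-memory limit so BPF maps/programs can be charged to it */
if (setrlimit(RLIMIT_MEMLOCK, &rlim))
        perror("setrlimit(RLIMIT_MEMLOCK)");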

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191216181204.724953-1-toke@redhat.com
2019-12-19 15:34:27 -08:00
Prashant Bhole
a4132d1590 libbpf: Fix build by renaming variables
In btf__align_of() variable name 't' is shadowed by inner block
declaration of another variable with same name. Patch renames
variables in order to fix it.

  CC       sharedobjs/btf.o
btf.c: In function ‘btf__align_of’:
btf.c:303:21: error: declaration of ‘t’ shadows a previous local [-Werror=shadow]
  303 |   int i, align = 1, t;
      |                     ^
btf.c:283:25: note: shadowed declaration is here
  283 |  const struct btf_type *t = btf__type_by_id(btf, id);
      |

Fixes: 3d208f4ca111 ("libbpf: Expose btf__align_of() API")
Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20191216082738.28421-1-prashantbhole.linux@gmail.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
303916a126 libbpf: Support flexible arrays in CO-RE
Some data structures in the kernel are defined with either a zero-sized array
or a flexible (dimensionless) array at the end of a struct. The actual data of
such an array follows in memory immediately after the end of that struct,
forming its variable-sized "body" of elements. Support such an access pattern
in CO-RE relocation handling.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191215070844.1014385-2-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
8eea7ed8e8 bpftool: Generate externs datasec in BPF skeleton
Add support for generation of an mmap()-ed read-only view of libbpf-provided
extern variables. As externs are not supposed to be provided by user code
(that's what .data, .bss, and .rodata are for), don't mmap() it initially. Only
after skeleton load is performed, map the .extern contents as read-only memory.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-4-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
fa030ffd20 libbpf: Support libbpf-provided extern variables
Add support for extern variables, provided to BPF program by libbpf. Currently
the following extern variables are supported:
  - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
    executing, follows KERNEL_VERSION() macro convention, can be 4- and 8-byte
    long;
  - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
    boolean, strings, and integer values are supported.

Set of possible values is determined by declared type of extern variable.
Supported types of variables are:
- Tristate values. Are represented as `enum libbpf_tristate`. Accepted values
  are **strictly** 'y', 'n', or 'm', which are represented as TRI_YES, TRI_NO,
  or TRI_MODULE, respectively.
- Boolean values. Are represented as bool (_Bool) types. Accepted values are
  'y' and 'n' only, turning into true/false values, respectively.
- Single-character values. Can be used both as a substitute for
  bool/tristate, or as a small-range integer:
  - 'y'/'n'/'m' are represented as is, as characters 'y', 'n', or 'm';
  - integers in a range [-128, 127] or [0, 255] (depending on signedness of
    char in target architecture) are recognized and represented with
    respective values of char type.
- Strings. String values are declared as fixed-length char arrays. A string of
  up to that length will be accepted and put in the first N bytes of the char
  array, with the rest of the bytes zeroed out. If a config string value is
  longer than the space allotted, it will be truncated and a warning message
  emitted. The char array is always zero-terminated. String literals in config
  have to be enclosed in double quotes, just like C-style string literals.
- Integers. 8-, 16-, 32-, and 64-bit integers are supported, both signed and
  unsigned variants. Libbpf enforces that the parsed config value is in the
  supported range of the corresponding integer type. Integer values in config
  can be:
  - decimal integers, with optional + and - signs;
  - hexadecimal integers, prefixed with 0x or 0X;
  - octal integers, starting with 0.

The config file itself is searched for at /boot/config-$(uname -r), with a
fallback to /proc/config.gz, unless a config path is specified explicitly
through bpf_object_open_opts' kernel_config_path option. Both gzipped and
plain text formats are supported. Libbpf adds an explicit dependency on zlib
because of this, but this shouldn't be a problem, given that libelf already
depends on zlib.

All detected extern variables are put into a separate .extern internal map.
It, similarly to the .rodata map, is marked as read-only from the BPF program
side, as well as frozen on load. This allows the BPF verifier to track extern
values as constants and perform enhanced branch prediction and dead code
elimination. This can be relied upon for doing kernel version/feature detection
and using potentially unsupported field relocations or BPF helpers in
a CO-RE-based BPF program, while still having a single version of the BPF
program running on old and new kernels. Selftests validate this explicitly for
a nonexistent BPF helper.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-3-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
ea06bc30fa libbpf: Extract internal map names into constants
Instead of duplicating string literals, keep them in one place and consistent.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-2-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
531ac0e65f libbpf: Add BPF object skeleton support
Add a new set of APIs, allowing one to open/load/attach a BPF object through
a BPF object skeleton, generated by bpftool for a specific BPF object file.
All the xxx_skeleton() APIs wrap the corresponding bpf_object_xxx() APIs, but
additionally also automate map/program lookups by name, global data
initialization and mmap()-ing, etc. All this greatly improves and simplifies
the usability of working with BPF programs from userspace. See follow-up
patches for examples.
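
A rough sketch of how a generated skeleton might be used, assuming a
hypothetical probe.bpf.o compiled into probe.skel.h via `bpftool gen skeleton`:

#include "probe.skel.h"

int main(void)
{
        struct probe *skel = probe__open_and_load();

        if (!skel)
                return 1;
        if (probe__attach(skel))
                goto cleanup;
        /* maps, programs and global data are reachable via skel->maps.*,
         * skel->progs.* and skel->bss/skel->rodata without name lookups */
cleanup:
        probe__destroy(skel);
        return 0;
}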

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-13-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
e35cb347ce libbpf: Reduce log level of supported section names dump
It's quite spammy. And now that bpf_object__open() is trying to determine
program type from its section name, we are getting these verbose messages all
the time. Reduce their log level to DEBUG.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-12-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
68fa3f0b57 libbpf: Postpone BTF ID finding for TRACING programs to load phase
Move BTF ID determination for BPF_PROG_TYPE_TRACING programs to the load
phase. Performing it at the open step is inconvenient, because it prevents BPF
skeleton generation on an older host kernel, which doesn't contain
BTF_KIND_FUNC information in vmlinux BTF. This is a common setup, though,
when, e.g., selftests are compiled on an older host kernel, but the test
program itself is executed in a qemu VM with a bleeding-edge kernel. Having
this BTF search performed at load time allows bpf_object__open() to be used
successfully for codegen and inspection of a BPF object file.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-11-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
1c145f0fda libbpf: Refactor global data map initialization
Refactor global data map initialization to use anonymous mmap()-ed memory
instead of malloc()-ed memory. This allows transparently re-mmap()-ing an
already existing memory address to point to the BPF map's memory after the
bpf_object__load() step (done in a follow-up patch). This choreographed setup
gives the user a nice and unsurprising way to pre-initialize read-only (and
r/w as well) maps and to keep working with the mmap()-ed contents of the map
after BPF map creation. All in a way that doesn't require user code to update
any pointers: the illusion of working with memory contents is preserved before
and after actual BPF map instantiation.

Selftests and runqslower example demonstrate this feature in follow up patches.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-10-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
418c07226a libbpf: Expose BPF program's function name
Add APIs to get a BPF program's function name, as opposed to
bpf_program__title(), which returns the BPF program's section name. A function
name has the benefit of being a valid C identifier and uniquely identifies
a specific BPF program, while a section name can be duplicated across multiple
independent BPF programs.

Also add bpf_object__find_program_by_name(), similar to
bpf_object__find_program_by_title(), to facilitate looking up BPF programs by
their C function names.

Convert one of the selftests to the new API for lookup.
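
A short lookup sketch ("handle_exec" is a hypothetical BPF C function name,
not one from this patch):

struct bpf_program *prog =
        bpf_object__find_program_by_name(obj, "handle_exec");

if (!prog)
        return -ENOENT; /* no program with that C function name */
/* bpf_program__name(prog) now returns "handle_exec" */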

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-9-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
5ec0ba6530 libbpf: Expose BTF-to-C type declaration emitting API
Expose an API that allows emitting a type declaration and field/variable
definition (if an optional field name is specified) in valid C syntax for any
provided BTF type. This is going to be used by bpftool when emitting the data
section layout as a struct. As part of making this API useful in a stand-alone
fashion, move initialization of some of the internal btf_dump state to an
earlier phase.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-8-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
600ba1c5e1 libbpf: Expose btf__align_of() API
Expose BTF API that calculates type alignment requirements.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-7-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
aa73e35dc3 libbpf: Extract common user-facing helpers
LIBBPF_API and DECLARE_LIBBPF_OPTS are needed in many public libbpf API
headers. Extract them into libbpf_common.h to avoid unnecessary
interdependency between btf.h, libbpf.h, and bpf.h or code duplication.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-6-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
dca6176410 libbpf: Add BPF_EMBED_OBJ macro for embedding BPF .o files
Add a convenience macro, BPF_EMBED_OBJ, which allows embedding other files
(typically used to embed BPF .o files) into a hosting userspace program. To
a C program it is exposed as struct bpf_embed_data, containing a pointer to
the raw data and its size in bytes.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-5-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
6f88f26945 libbpf: Move non-public APIs from libbpf.h to libbpf_internal.h
A few libbpf APIs are not public but are currently exposed through libbpf.h
to be used by bpftool. Move them to libbpf_internal.h, where the intent of
being non-stable and non-public is much more obvious.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-4-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
f7af143516 libbpf: Add generic bpf_program__attach()
Generalize BPF program attaching and allow libbpf to auto-detect type (and
extra parameters, where applicable) and attach supported BPF program types
based on program sections. Currently this is supported for:
- kprobe/kretprobe;
- tracepoint;
- raw tracepoint;
- tracing programs (typed raw TP/fentry/fexit).

Support for more types can be trivially added within this framework.
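
A sketch of the resulting user-side pattern (illustrative SEC() names; error
handling abbreviated):

struct bpf_program *prog;
struct bpf_link *link;

bpf_object__for_each_program(prog, obj) {
        /* type and attach target are derived from SEC() names such as
         * "kprobe/do_sys_open" or "tracepoint/syscalls/sys_enter_openat" */
        link = bpf_program__attach(prog);
        if (libbpf_get_error(link)) {
                /* handle error */
                break;
        }
}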

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-3-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
4f3c7b3e13 libbpf: Don't require root for bpf_object__open()
Reorganize the bpf_object__open and bpf_object__load steps such that
bpf_object__open doesn't need root access. Root access was previously needed
there for feature probing and BTF sanitization. This doesn't have to happen on
open, though, so move all those steps into the load phase.

This is important, because it makes it possible for tools like bpftool to
just open a BPF object file and inspect its contents: programs, maps, BTF,
etc. For such operations it is prohibitive to require root access. On the
other hand, there is a lot of custom libbpf logic in those steps, so it's best
to avoid having tools reimplement all of that on their own.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-2-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
b85e83f6cb libbpf: Don't attach perf_buffer to offline/missing CPUs
It's quite common on some systems to have more CPUs enlisted as "possible"
than there are (and could ever be) present/online CPUs. In such cases,
perf_buffer creation will fail due to the inability to create a perf event on
a missing CPU, with an error like this:

libbpf: failed to open perf buffer event on cpu #16: No such device

This patch fixes the logic of perf_buffer__new() to ignore CPUs that are
missing or currently offline. In the rare cases where a user explicitly listed
specific CPUs to connect to, the behavior is unchanged: libbpf will try to open
the perf event buffer on the specified CPU(s) anyway.

Fixes: fb84b8224655 ("libbpf: add perf buffer API")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212013609.1691168-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
33d1fbea57 libbpf: Extract and generalize CPU mask parsing logic
This logic is re-used for parsing a set of online CPUs. Having it as an
isolated piece of code working with an input string makes it convenient to
test this logic as well. While refactoring, also improve the robustness of the
original implementation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212013548.1690564-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Jakub Sitnicki
b234d12c97 libbpf: Recognize SK_REUSEPORT programs from section name
Allow loading BPF object files that contain SK_REUSEPORT programs without
having to manually set the program type before load if the section name
is set to "sk_reuseport".

This makes the user-space code needed to load an SK_REUSEPORT BPF program more
concise.
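
For reference, a minimal BPF-side sketch that would now be recognized
automatically (program body is illustrative, not from this patch):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("sk_reuseport")
int select_socket(struct sk_reuseport_md *reuse_md)
{
        return SK_PASS; /* fall back to the kernel's default selection */
}

char LICENSE[] SEC("license") = "GPL";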

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212102259.418536-2-jakub@cloudflare.com
2019-12-19 15:34:27 -08:00
hex
7a1d185108 libbpf: fix Coverity scan CI
A follow-up of [1].
Travis CI stages use default phases when no override is provided.
This leads to the Coverity scan stage failing due to executing the default
before_script: phase of VMTEST.
Fix this with an explicit override with an empty value.

[1] https://github.com/libbpf/libbpf/pull/108
2019-12-17 16:46:57 -08:00
hex
76d5bb6a13 libbpf: Add VMTEST to CI
Extend continuous integration tests by adding testing against various kernel
versions.
The code is based on vmtest CI scripts implemented by osandov@
for drgn [1] with the following modifications:
- The downloadables are stored in Amazon S3 cloud indexed in [2]
- `--setup-cmd` command line option is added to vmtest/run.sh so
  setup commands run on VM boot can be set in e.g. `.travis.yml`
- A Travis build matrix [3] is introduced for VM tests, so VM tests are
  followed by the existing CI tests. The matrix has `KERNEL` and
  `VMTEST_SETUPCMD` dimensions.
- Minor style fixes.

The vmtest extension code is located in travis-ci/vmtest and contains
`run.sh` and `setup_example.sh`:
- `run.sh` is responsible for the vmtest workflow: downloading vmlinux
  and rootfs image from the cloud, fs mounting, syncing libbpf sources
  to the image, setting up scripts run on VM boot, starting VM using
  QEMU.
  `run.sh` covers more use cases than a script for a job run in Travis CI,
  e.g. it can build a kernel with the `--build` option.

- `setup_example.sh` is an example of a script run in VM which can be
  modified to e.g. run actual libbpf tests. A setup script should have
  executable permission.

To set up a new kernel version for a test:
1) upload vmlinuz.* and vmlinux.*\.zst to the Amazon S3 store
located at [4];
2) modify the INDEX [2] file.

[1] https://github.com/osandov/drgn
[2] https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/INDEX
[3] https://docs.travis-ci.com/user/build-matrix
[4] https://libbpf-vmtest.s3-us-west-1.amazonaws.com/
2019-12-16 21:04:03 -08:00
Frantisek Sumsal
c42bfcbf0e travis: build on ppc64le as well 2019-12-13 01:04:46 -08:00
Andrii Nakryiko
c2fc7c15a3 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   e7096c131e5161fa3b8e52a650d7719d2857adfd
Checkpoint bpf-next commit: 679152d3a32e305c213f83160c328c37566ae8bc
Baseline bpf commit:        e42617b825f8073569da76dc4510bfa019b1c35a
Checkpoint bpf commit:      fe3300897cbfd76c6cb825776e5ac0ca50a91ca4

Andrii Nakryiko (2):
  libbpf: Bump libpf current version to v0.0.7
  libbpf: Fix printf compilation warnings on ppc64le arch

 src/libbpf.c   | 37 +++++++++++++++++++------------------
 src/libbpf.map |  3 +++
 2 files changed, 22 insertions(+), 18 deletions(-)

--
2.17.1
2019-12-12 14:40:26 -08:00
Andrii Nakryiko
4060a65222 libbpf: Fix printf compilation warnings on ppc64le arch
On ppc64le, __u64 and __s64 are defined as unsigned long int and long int,
respectively. This causes the compiler to emit warnings when %lld/%llu are
used to printf 64-bit numbers. Fix this by casting to size_t/ssize_t and using
the %zu and %zd format specifiers, respectively.

v1->v2:
- use size_t/ssize_t instead of custom typedefs (Martin).

Fixes: 1f8e2bcb2cd5 ("libbpf: Refactor relocation handling")
Fixes: abd29c931459 ("libbpf: allow specifying map definitions using BTF")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191212171918.638010-1-andriin@fb.com
2019-12-12 14:40:26 -08:00
Andrii Nakryiko
a26f6b1375 libbpf: Bump libpf current version to v0.0.7
A new development cycle starts; bump to v0.0.7 proactively.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191209224022.3544519-1-andriin@fb.com
2019-12-12 14:40:26 -08:00
Toke Høiland-Jørgensen
6e686c26fa Makefile: Add cscope and tags rules
These were added to the kernel repo, but not to the GitHub repo. However, they
are useful for browsing the source from GitHub while prototyping new features
and compiling them into userspace utilities.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
2019-12-11 10:38:48 -08:00
Andrii Nakryiko
ab067ed371 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b615e5a1e067dcb327482d1af7463268b89b1629
Checkpoint bpf-next commit: e7096c131e5161fa3b8e52a650d7719d2857adfd
Baseline bpf commit:        34e59836565e36fade1464e054a3551c1a0364be
Checkpoint bpf commit:      e42617b825f8073569da76dc4510bfa019b1c35a

Alexei Starovoitov (2):
  libbpf: Fix sym->st_value print on 32-bit arches
  selftests/bpf: Add test for BPF trampoline

Andrii Nakryiko (1):
  libbpf: Fix global variable relocation

Martin KaFai Lau (1):
  bpf: Introduce BPF_TRACE_x helper for the tracing tests

 src/libbpf.c | 45 ++++++++++++++++++++-------------------------
 1 file changed, 20 insertions(+), 25 deletions(-)

--
2.17.1
2019-12-09 09:44:20 -08:00
Martin KaFai Lau
9b69fbe4d1 bpf: Introduce BPF_TRACE_x helper for the tracing tests
For BPF_PROG_TYPE_TRACING, the bpf_prog's ctx is an array of u64.
This patch borrows the idea from BPF_CALL_x in filter.h to
convert a u64 to the arg type of the traced function.

The new BPF_TRACE_x has an arg to specify the return type of a bpf_prog.
It will be used in the future TCP-ops bpf_prog that may return "void".

The new macros are defined in the new header file "bpf_trace_helpers.h".
It is under selftests/bpf/ for now.  It could be moved to libbpf later
after seeing more upcoming non-tracing use cases.

The tests are changed to use these new macros also.  Hence,
the k[s]u8/16/32/64 are no longer needed and they are removed
from the bpf_helpers.h.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191123202504.1502696-1-kafai@fb.com
2019-12-09 09:44:20 -08:00
Alexei Starovoitov
04d8fc50ab selftests/bpf: Add test for BPF trampoline
Add sanity test for BPF trampoline that checks kernel functions
with up to 6 arguments of different sizes.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-10-ast@kernel.org
2019-12-09 09:44:20 -08:00
Alexei Starovoitov
ceff1e0363 libbpf: Fix sym->st_value print on 32-bit arches
The st_value field is a 64-bit value and causes this error on 32-bit arches:

In file included from libbpf.c:52:
libbpf.c: In function 'bpf_program__record_reloc':
libbpf_internal.h:59:22: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'Elf64_Addr' {aka 'const long long unsigned int'} [-Werror=format=]

Fix it with (__u64) cast.

Fixes: 1f8e2bcb2cd5 ("libbpf: Refactor relocation handling")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-12-09 09:44:20 -08:00
Andrii Nakryiko
d28acc595f libbpf: Fix global variable relocation
Similarly to a0d7da26ce86 ("libbpf: Fix call relocation offset calculation
bug"), relocations against global variables need to take into account the
referenced symbol's st_value, which holds the offset into the corresponding
data section (and, subsequently, the offset into the internal backing map).
For static variables this offset is always zero and the data offset is
completely described by the respective instruction's imm field.

Convert a bunch of selftests to global variables. Previously they were relying
on the `static volatile` trick to ensure Clang doesn't inline static
variables, which is no longer necessary with global variables.

Fixes: 393cdfbee809 ("libbpf: Support initialized global variables")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191127200651.1381348-1-andriin@fb.com
2019-12-09 09:44:20 -08:00
Andrii Nakryiko
9ef191ea7d license: add LICENSE with dual-license SPDX expression
Add LICENSE specifying dual-license expression.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-11-26 11:06:43 -08:00
Andrii Nakryiko
1add860402 license: add license note to README
Add mention of dual-licensing to README

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-11-26 11:02:19 -08:00
Andrii Nakryiko
c658f21738 libbpf: add BSD-2-Clause and LGPL-2.1 licenses
Libbpf is dual-licensed under BSD-2-Clause and LGPL-2.1 licenses. Include
their texts in the root of the repo.

Suggested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-11-26 09:54:43 -08:00
Andrii Nakryiko
9f519af7f4 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   e47a179997ceee6864fbae620eee09ea9c345a4d
Checkpoint bpf-next commit: b615e5a1e067dcb327482d1af7463268b89b1629
Baseline bpf commit:        d0fbb51dfaa612f960519b798387be436e8f83c5
Checkpoint bpf commit:      34e59836565e36fade1464e054a3551c1a0364be

Alexei Starovoitov (4):
  libbpf: Introduce btf__find_by_name_kind()
  libbpf: Add support to attach to fentry/fexit tracing progs
  selftests/bpf: Add test for BPF trampoline
  libbpf: Add support for attaching BPF programs to other BPF programs

Andrii Nakryiko (8):
  bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY
  libbpf: Make global data internal arrays mmap()-able, if possible
  libbpf: Fix call relocation offset calculation bug
  libbpf: Refactor relocation handling
  libbpf: Fix various errors and warning reported by checkpatch.pl
  libbpf: Support initialized global variables
  libbpf: Fix bpf_object name determination for bpf_object__open_file()
  libbpf: Fix usage of u32 in userspace code

Luigi Rizzo (1):
  net-af_xdp: Use correct number of channels from ethtool

Martin KaFai Lau (1):
  bpf: Introduce BPF_TRACE_x helper for the tracing tests

 include/uapi/linux/bpf.h |   6 +
 src/bpf.c                |   8 +-
 src/bpf.h                |   5 +-
 src/btf.c                |  22 ++
 src/btf.h                |   2 +
 src/libbpf.c             | 478 ++++++++++++++++++++++++++-------------
 src/libbpf.h             |   7 +-
 src/libbpf.map           |   3 +
 src/xsk.c                |  11 +-
 9 files changed, 371 insertions(+), 171 deletions(-)

--
2.17.1
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
b7bdc604ef libbpf: Fix usage of u32 in userspace code
u32 is not defined for libbpf when compiled outside of kernel sources (e.g.,
in the GitHub projection). Use __u32 instead.

Fixes: b8c54ea455dc ("libbpf: Add support to attach to fentry/fexit tracing progs")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191125212948.1163343-1-andriin@fb.com
2019-11-25 16:55:44 -08:00
Martin KaFai Lau
354dd9844e bpf: Introduce BPF_TRACE_x helper for the tracing tests
For BPF_PROG_TYPE_TRACING, the bpf_prog's ctx is an array of u64.
This patch borrows the idea from BPF_CALL_x in filter.h to
convert a u64 to the arg type of the traced function.

The new BPF_TRACE_x has an arg to specify the return type of a bpf_prog.
It will be used in the future TCP-ops bpf_prog that may return "void".

The new macros are defined in the new header file "bpf_trace_helpers.h".
It is under selftests/bpf/ for now.  It could be moved to libbpf later
after seeing more upcoming non-tracing use cases.

The tests are changed to use these new macros also.  Hence,
the k[s]u8/16/32/64 are no longer needed and they are removed
from the bpf_helpers.h.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191123202504.1502696-1-kafai@fb.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
9b91dce691 libbpf: Fix bpf_object name determination for bpf_object__open_file()
If bpf_object__open_file() gets a path like "some/dir/obj.o", it should derive
the BPF object's name as "obj" (unless overridden through opts->object_name).
Instead, due to using `path` as a fallback value for opts->object_name, the
path is used as-is for the object name, so for the above example the BPF
object's name will be verbatim "some/dir/obj", which leads to all sorts of
trouble, especially where internal maps are concerned (they use up to
8 characters of the object name). Fix that by ensuring object_name stays NULL
unless overridden.
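
An explicit override, for illustration (path and name are just examples):

DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
        .object_name = "obj",   /* name used for internal maps, etc. */
);
struct bpf_object *o = bpf_object__open_file("some/dir/obj.o", &opts);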

Fixes: 291ee02b5e40 ("libbpf: Refactor bpf_object__open APIs to use common opts")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191122003527.551556-1-andriin@fb.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
83535cb2bf libbpf: Support initialized global variables
Initialized global variables are no different in ELF from static variables,
and don't require any extra support from libbpf. But they match the semantics
of global data (backed by BPF maps) more closely, preventing LLVM/Clang from
aggressively inlining constant values and not requiring volatile incantations
to prevent that. This patch enables global variables. It still disallows
uninitialized variables, which would be put into a special COM (common) ELF
section, because BPF doesn't allow uninitialized data to be accessed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191121070743.1309473-5-andriin@fb.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
3f05b513d4 libbpf: Fix various errors and warning reported by checkpatch.pl
Fix a bunch of warnings and errors reported by checkpatch.pl, to make it
easier to spot new problems.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191121070743.1309473-4-andriin@fb.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
0d0d05de08 libbpf: Refactor relocation handling
The relocation handling code is convoluted and unnecessarily deeply nested.
Split out the per-relocation logic into a separate function. Also refactor the
logic to be more of a sequence of per-relocation type checks and processing
steps, making it simpler to follow the control flow. This makes it easier to
further extend it to new kinds of relocations (e.g., support for extern
variables).

This patch also makes relocation section verification more robust. Previously,
relocations against not-yet-supported externs were silently ignored because
obj->efile.text_shndx was zero when all BPF programs had custom section names
and there was no .text section. Also, invalid LDIMM64 relocations against
non-map sections were passed through if they were pointing to a .text section
(or 0, which is an invalid section). All these bugs are fixed within this
refactoring and checks are made more appropriate for each type of relocation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191121070743.1309473-3-andriin@fb.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
44409068f7 libbpf: Fix call relocation offset calculation bug
When relocating a subprogram call, libbpf doesn't take into account
relo->text_off, which comes from the symbol's value. This generally works fine
for subprograms implemented as static functions, but breaks for global
functions.
Taking a simplified test_pkt_access.c as an example:

__attribute__ ((noinline))
static int test_pkt_access_subprog1(volatile struct __sk_buff *skb)
{
        return skb->len * 2;
}

__attribute__ ((noinline))
static int test_pkt_access_subprog2(int val, volatile struct __sk_buff *skb)
{
        return skb->len + val;
}

SEC("classifier/test_pkt_access")
int test_pkt_access(struct __sk_buff *skb)
{
        if (test_pkt_access_subprog1(skb) != skb->len * 2)
                return TC_ACT_SHOT;
        if (test_pkt_access_subprog2(2, skb) != skb->len + 2)
                return TC_ACT_SHOT;
        return TC_ACT_UNSPEC;
}

When compiled, we get two relocations, pointing to '.text' symbol. .text has
st_value set to 0 (it points to the beginning of .text section):

0000000000000008  000000050000000a R_BPF_64_32            0000000000000000 .text
0000000000000040  000000050000000a R_BPF_64_32            0000000000000000 .text

test_pkt_access_subprog1 and test_pkt_access_subprog2 offsets (targets of two
calls) are encoded within call instruction's imm32 part as -1 and 2,
respectively:

0000000000000000 test_pkt_access_subprog1:
       0:       61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
       1:       64 00 00 00 01 00 00 00 w0 <<= 1
       2:       95 00 00 00 00 00 00 00 exit

0000000000000018 test_pkt_access_subprog2:
       3:       61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
       4:       04 00 00 00 02 00 00 00 w0 += 2
       5:       95 00 00 00 00 00 00 00 exit

0000000000000000 test_pkt_access:
       0:       bf 16 00 00 00 00 00 00 r6 = r1
===>   1:       85 10 00 00 ff ff ff ff call -1
       2:       bc 01 00 00 00 00 00 00 w1 = w0
       3:       b4 00 00 00 02 00 00 00 w0 = 2
       4:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
       5:       64 02 00 00 01 00 00 00 w2 <<= 1
       6:       5e 21 08 00 00 00 00 00 if w1 != w2 goto +8 <LBB0_3>
       7:       bf 61 00 00 00 00 00 00 r1 = r6
===>   8:       85 10 00 00 02 00 00 00 call 2
       9:       bc 01 00 00 00 00 00 00 w1 = w0
      10:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
      11:       04 02 00 00 02 00 00 00 w2 += 2
      12:       b4 00 00 00 ff ff ff ff w0 = -1
      13:       1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB0_3>
      14:       b4 00 00 00 02 00 00 00 w0 = 2
0000000000000078 LBB0_3:
      15:       95 00 00 00 00 00 00 00 exit

Now, if we compile example with global functions, the setup changes.
Relocations are now against specifically test_pkt_access_subprog1 and
test_pkt_access_subprog2 symbols, with test_pkt_access_subprog2 pointing 24
bytes into its respective section (.text), i.e., 3 instructions in:

0000000000000008  000000070000000a R_BPF_64_32            0000000000000000 test_pkt_access_subprog1
0000000000000048  000000080000000a R_BPF_64_32            0000000000000018 test_pkt_access_subprog2

Call instructions now encode offsets relative to function symbols and are both
set to -1:

0000000000000000 test_pkt_access_subprog1:
       0:       61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
       1:       64 00 00 00 01 00 00 00 w0 <<= 1
       2:       95 00 00 00 00 00 00 00 exit

0000000000000018 test_pkt_access_subprog2:
       3:       61 20 00 00 00 00 00 00 r0 = *(u32 *)(r2 + 0)
       4:       0c 10 00 00 00 00 00 00 w0 += w1
       5:       95 00 00 00 00 00 00 00 exit

0000000000000000 test_pkt_access:
       0:       bf 16 00 00 00 00 00 00 r6 = r1
===>   1:       85 10 00 00 ff ff ff ff call -1
       2:       bc 01 00 00 00 00 00 00 w1 = w0
       3:       b4 00 00 00 02 00 00 00 w0 = 2
       4:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
       5:       64 02 00 00 01 00 00 00 w2 <<= 1
       6:       5e 21 09 00 00 00 00 00 if w1 != w2 goto +9 <LBB2_3>
       7:       b4 01 00 00 02 00 00 00 w1 = 2
       8:       bf 62 00 00 00 00 00 00 r2 = r6
===>   9:       85 10 00 00 ff ff ff ff call -1
      10:       bc 01 00 00 00 00 00 00 w1 = w0
      11:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
      12:       04 02 00 00 02 00 00 00 w2 += 2
      13:       b4 00 00 00 ff ff ff ff w0 = -1
      14:       1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB2_3>
      15:       b4 00 00 00 02 00 00 00 w0 = 2
0000000000000080 LBB2_3:
      16:       95 00 00 00 00 00 00 00 exit

Thus the right formula to calculate the target call offset after relocation
should take into account the relocation's target symbol value (offset within
the section), the call instruction's imm32 offset, and (subtracting, to get the
relative instruction offset) the instruction index of the call instruction
itself. All of that is shifted by the number of instructions in the main
program, given that all sub-programs are copied over after the main program.

Convert a few selftests relying on bpf-to-bpf calls to use global functions
instead of static ones.

Fixes: 48cca7e44f9f ("libbpf: add support for bpf_call")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191119224447.3781271-1-andriin@fb.com
2019-11-25 16:55:44 -08:00
Luigi Rizzo
16ecc53e73 net-af_xdp: Use correct number of channels from ethtool
Drivers use different fields to report the number of channels, so take
the maximum of all data channels (rx, tx, combined) when determining the
size of the xsk map. The current code used only 'combined' which was set
to 0 in some drivers e.g. mlx4.

Tested: compiled and run xdpsock -q 3 -r -S on mlx4

Signed-off-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20191119001951.92930-1-lrizzo@google.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
38f66776db libbpf: Make global data internal arrays mmap()-able, if possible
Add detection of BPF_F_MMAPABLE flag support for arrays and add it as an extra
flag to internal global data maps, if supported by kernel. This allows users
to memory-map global data and use it without BPF map operations, greatly
simplifying user experience.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-5-andriin@fb.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
e9d33df74d bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY
Add the ability to memory-map the contents of a BPF array map. This is
extremely useful for working with BPF global data from userspace programs. It
allows avoiding the typical bpf_map_{lookup,update}_elem operations, improving
both performance and usability.

There had to be special considerations for map freezing, to avoid having
writable memory view into a frozen map. To solve this issue, map freezing and
mmap-ing is happening under mutex now:
  - if map is already frozen, no writable mapping is allowed;
  - if map has writable memory mappings active (accounted in map->writecnt),
    map freezing will keep failing with -EBUSY;
  - once number of writable memory mappings drops to zero, map freezing can be
    performed again.

Only non-per-CPU plain arrays are supported right now. Maps with spinlocks
can't be memory mapped either.
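
From the userspace side, a rough sketch of the intended usage (illustrative
only; "map" is assumed to be a plain array created with BPF_F_MMAPABLE and to
fit in a single page):

#include <sys/mman.h>
#include <unistd.h>
#include <bpf/libbpf.h>

int map_fd = bpf_map__fd(map);
size_t len = sysconf(_SC_PAGESIZE);
void *data = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, map_fd, 0);

if (data == MAP_FAILED)
        return -1;
/* loads and stores through data now hit the map contents directly,
 * with no bpf_map_lookup_elem()/bpf_map_update_elem() round trips */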

For a BPF_F_MMAPABLE array, memory allocation has to be done through vmalloc()
to be mmap()'able. We also need to make sure that the array data memory is
page-sized and page-aligned, so we over-allocate memory in such a way that
struct bpf_array is at the end of a single page of memory, with array->value
aligned with the start of the second page. On deallocation we need to
accommodate this memory arrangement to free the vmalloc()'ed memory correctly.

One important consideration concerns how the memory-mapping subsystem
functions. It provides a few optional callbacks, among them open() and
close(). close() is called for each memory region that is unmapped, so that
users can decrease their reference counters and free up resources, if
necessary. open() is *almost* symmetrical: it's called for each memory region
that is being mapped, **except** the very first one. So bpf_map_mmap does the
initial refcnt bump, while open() will do any extra ones after that. Thus the
number of close() calls is equal to the number of open() calls plus one more.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-4-andriin@fb.com
2019-11-25 16:55:44 -08:00
Alexei Starovoitov
05b515de7d libbpf: Add support for attaching BPF programs to other BPF programs
Extend the libbpf API to pass attach_prog_fd into bpf_object__open.
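
A sketch of the extended open call (the object file name and target fd are
hypothetical, not taken from this patch):

DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
        /* fd of an already loaded BPF program that the tracing programs
         * in this object should attach to */
        .attach_prog_fd = target_prog_fd,
);
struct bpf_object *obj = bpf_object__open_file("tracer.bpf.o", &opts);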

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-19-ast@kernel.org
2019-11-25 16:55:44 -08:00
Alexei Starovoitov
c2bbeaa900 selftests/bpf: Add test for BPF trampoline
Add sanity test for BPF trampoline that checks kernel functions
with up to 6 arguments of different sizes.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-10-ast@kernel.org
2019-11-25 16:55:44 -08:00
Alexei Starovoitov
799d153f41 libbpf: Add support to attach to fentry/fexit tracing progs
Teach libbpf to recognize tracing program types and attach them to
fentry/fexit.
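
For reference, a BPF-side sketch of a program libbpf would now recognize
(the traced kernel function and program body are just examples):

#include <bpf/bpf_helpers.h>

SEC("fexit/do_unlinkat")
int trace_unlink_exit(void *ctx)
{
        return 0;
}

char LICENSE[] SEC("license") = "GPL";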

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-7-ast@kernel.org
2019-11-25 16:55:44 -08:00
Alexei Starovoitov
69ff3960eb libbpf: Introduce btf__find_by_name_kind()
Introduce btf__find_by_name_kind() helper to search BTF by name and kind, since
name alone can be ambiguous.
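
Example lookup (the kernel function name is illustrative):

__s32 type_id = btf__find_by_name_kind(btf, "tcp_sendmsg", BTF_KIND_FUNC);

if (type_id < 0)
        return type_id; /* not found or error */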

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-6-ast@kernel.org
2019-11-25 16:55:44 -08:00
Frantisek Sumsal
b91f53ec5f travis: use travis_terminate instead of set {+,-}e combo
Apart from looking a bit nicer, it also acts as a workaround for
https://travis-ci.community/t/exit-0-cannot-exit-successfully-on-arm/5731/4
2019-11-14 13:49:21 -08:00
Frantisek Sumsal
dd8f1bdd45 travis: bump the Ubuntu release to Bionic
The main reason why this is necessary is that gcc 5.x on Xenial doesn't
support ASan on s390x. Bumping the release to Bionic with gcc 7.x allows
us to build libbpf on s390x with ASan without issues.
2019-11-14 13:49:21 -08:00
Frantisek Sumsal
3720f31852 travis: add an s390x job
Travis now supports IBM Z and IBM Power architectures, so let's enable
them in our CI as well.

As libbpf won't compile on ppc64le right now (with the current CFLAGS), let's
skip it until the issue is resolved; see the discussion in
https://github.com/libbpf/libbpf/pull/98#issuecomment-553873098

See: https://blog.travis-ci.com/2019-11-12-multi-cpu-architecture-ibm-power-ibm-z
2019-11-14 13:49:21 -08:00
Andrii Nakryiko
c51c492a65 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   ed578021210e14f15a654c825fba6a700c9a39a7
Checkpoint bpf-next commit: e47a179997ceee6864fbae620eee09ea9c345a4d
Baseline bpf commit:        7de086909365cd60a5619a45af3f4152516fd75c
Checkpoint bpf commit:      d0fbb51dfaa612f960519b798387be436e8f83c5

Andrii Nakryiko (6):
  libbpf: Fix negative FD close() in xsk_setup_xdp_prog()
  libbpf: Fix memory leak/double free issue
  libbpf: Fix potential overflow issue
  libbpf: Fix another potential overflow issue in bpf_prog_linfo
  libbpf: Make btf__resolve_size logic always check size error condition
  libbpf: Improve handling of corrupted ELF during map initialization

Magnus Karlsson (2):
  libbpf: Support XDP_SHARED_UMEM with external XDP program
  libbpf: Allow for creating Rx or Tx only AF_XDP sockets

Toke Høiland-Jørgensen (5):
  libbpf: Unpin auto-pinned maps if loading fails
  libbpf: Propagate EPERM to caller on program load
  libbpf: Use pr_warn() when printing netlink errors
  libbpf: Add bpf_get_link_xdp_info() function to get more XDP
    information
  libbpf: Add getter for program size

 src/bpf.c            |  2 +-
 src/bpf_prog_linfo.c | 14 +++----
 src/btf.c            |  3 +-
 src/libbpf.c         | 47 ++++++++++++++----------
 src/libbpf.h         | 13 +++++++
 src/libbpf.map       |  2 +
 src/netlink.c        | 87 +++++++++++++++++++++++++++++---------------
 src/nlattr.c         | 10 ++---
 src/xsk.c            | 34 +++++++++++------
 9 files changed, 136 insertions(+), 76 deletions(-)

--
2.17.1
2019-11-13 16:39:58 -08:00
Magnus Karlsson
d3e68e036e libbpf: Allow for creating Rx or Tx only AF_XDP sockets
The libbpf AF_XDP code is extended to allow for the creation of Rx
only or Tx only sockets. Previously it returned an error if the socket
was not initialized for both Rx and Tx.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: William Tu <u9012063@gmail.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1573148860-30254-4-git-send-email-magnus.karlsson@intel.com
2019-11-13 16:39:58 -08:00
Magnus Karlsson
6ce8910d4d libbpf: Support XDP_SHARED_UMEM with external XDP program
Add support in libbpf to create multiple sockets that share a single
umem. Note that an external XDP program needs to be supplied that
routes the incoming traffic to the desired sockets. So you need to
supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load
your own XDP program.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: William Tu <u9012063@gmail.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1573148860-30254-2-git-send-email-magnus.karlsson@intel.com
2019-11-13 16:39:58 -08:00
Toke Høiland-Jørgensen
79b1d813f9 libbpf: Add getter for program size
This adds a new getter for the BPF program size (in bytes). This is useful
for a caller that is trying to predict how much memory will be locked by
loading a BPF object into the kernel.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333185272.88376.10996937115395724683.stgit@toke.dk
2019-11-13 16:39:58 -08:00
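
A caller-side sketch, assuming the getter is exported as bpf_program__size()
and returns the size in bytes:

#include <bpf/libbpf.h>

/* Sketch: sum program sizes to estimate how much memory loading the
 * object will lock. */
static size_t estimate_locked_bytes(struct bpf_object *obj)
{
        struct bpf_program *prog;
        size_t total = 0;

        bpf_object__for_each_program(prog, obj)
                total += bpf_program__size(prog);
        return total;
}
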
Toke Høiland-Jørgensen
26954e103d libbpf: Add bpf_get_link_xdp_info() function to get more XDP information
Currently, libbpf only provides a function to get a single ID for the XDP
program attached to the interface. However, it can be useful to get the
full set of program IDs attached, along with the attachment mode, in one
go. Add a new getter function to support this, using an extendible
structure to carry the information. Express the old bpf_get_link_xdp_id()
function in terms of the new function.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157333185164.88376.7520653040667637246.stgit@toke.dk
2019-11-13 16:39:58 -08:00
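
A usage sketch; the struct layout and the drv/skb/hw field names are assumed
from the description above:

#include <stdio.h>
#include <bpf/libbpf.h>

/* Sketch: query the full set of attached XDP program IDs plus the
 * attachment mode in one call. */
static void dump_xdp_state(int ifindex)
{
        struct xdp_link_info info = {};

        if (bpf_get_link_xdp_info(ifindex, &info, sizeof(info), 0))
                return;
        printf("drv=%u skb=%u hw=%u mode=%d\n", info.drv_prog_id,
               info.skb_prog_id, info.hw_prog_id, info.attach_mode);
}
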
Toke Høiland-Jørgensen
c8c02fca3a libbpf: Use pr_warn() when printing netlink errors
The netlink functions were using fprintf(stderr, ...) directly to print out
error messages, instead of going through the usual logging macros. This
makes it impossible for the calling application to silence or redirect
those error messages. Fix this by switching to pr_warn() in nlattr.c and
netlink.c.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333185055.88376.15999360127117901443.stgit@toke.dk
2019-11-13 16:39:58 -08:00
Toke Høiland-Jørgensen
0e2f5f9615 libbpf: Propagate EPERM to caller on program load
When loading an eBPF program, libbpf overrides the return code for EPERM
errors instead of returning it to the caller. This makes it hard to figure
out what went wrong on load.

In particular, EPERM is returned when the system rlimit is too low to lock
the memory required for the BPF program. Previously, this was somewhat
obscured because the rlimit error would be hit on map creation (which does
return it correctly). However, since maps can now be reused, object load
can proceed all the way to loading programs without hitting the error;
propagating it even in this case makes it possible for the caller to react
appropriately (and, e.g., attempt to raise the rlimit before retrying).

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333184946.88376.11768171652794234561.stgit@toke.dk
2019-11-13 16:39:58 -08:00
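
A caller-side sketch of reacting to the now-propagated EPERM by raising
RLIMIT_MEMLOCK; the object path is a placeholder:

#include <errno.h>
#include <sys/resource.h>
#include <bpf/libbpf.h>

/* Sketch: if load fails with EPERM, raise the memlock rlimit and try
 * again with a freshly opened object. */
static struct bpf_object *open_and_load(void)
{
        struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
        struct bpf_object *obj;
        int err;

        obj = bpf_object__open("prog.o");
        if (libbpf_get_error(obj))
                return NULL;

        err = bpf_object__load(obj);
        if (err == -EPERM && !setrlimit(RLIMIT_MEMLOCK, &rinf)) {
                bpf_object__close(obj);
                obj = bpf_object__open("prog.o");
                if (libbpf_get_error(obj))
                        return NULL;
                err = bpf_object__load(obj);
        }
        if (err) {
                bpf_object__close(obj);
                return NULL;
        }
        return obj;
}
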
Toke Høiland-Jørgensen
b539321838 libbpf: Unpin auto-pinned maps if loading fails
Since the automatic map-pinning happens during load, it will leave pinned
maps around if the load fails at a later stage. Fix this by unpinning any
pinned maps on cleanup. To avoid unpinning pinned maps that were reused
rather than newly pinned, add a new boolean property on struct bpf_map to
keep track of whether that map was reused or not; and only unpin those maps
that were not reused.

Fixes: 57a00f41644f ("libbpf: Add auto-pinning of maps when loading BPF objects")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333184731.88376.9992935027056165873.stgit@toke.dk
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
0f15f88443 libbpf: Improve handling of corrupted ELF during map initialization
If we get an ELF file with a "maps" section but no symbols pointing to it, we'll
end up with division by zero. Add a check against this situation and exit early
with error. Found by Coverity scan against Github libbpf sources.

Fixes: bf82927125dd ("libbpf: refactor map initialization")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-6-andriin@fb.com
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
bada95a5f3 libbpf: Make btf__resolve_size logic always check size error condition
Always perform the size check in btf__resolve_size. This makes the logic a bit
more robust against corrupted BTF and silences LGTM/Coverity complaints about
an always-true (size < 0) check.

Fixes: 69eaab04c675 ("btf: extract BTF type size calculation")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-5-andriin@fb.com
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
fb929625dc libbpf: Fix another potential overflow issue in bpf_prog_linfo
Fix a few issues found by Coverity and LGTM.

Fixes: b053b439b72a ("bpf: libbpf: bpftool: Print bpf_line_info during prog dump")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-4-andriin@fb.com
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
1a828b3d58 libbpf: Fix potential overflow issue
Fix a potential overflow issue found by LGTM analysis, based on Github libbpf
source code.

Fixes: 3d65014146c6 ("bpf: libbpf: Add btf_line_info support to libbpf")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-3-andriin@fb.com
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
330f4683e2 libbpf: Fix memory leak/double free issue
A Coverity scan against the Github libbpf code found an issue of not freeing
memory and leaving already freed memory still referenced from bpf_program. Fix
it by re-assigning the successfully reallocated memory sooner.

Fixes: 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-2-andriin@fb.com
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
2ef7f5607c libbpf: Fix negative FD close() in xsk_setup_xdp_prog()
Fix an issue reported by static analysis (Coverity). If bpf_prog_get_fd_by_id()
fails, xsk_lookup_bpf_maps() will fail as well and the clean-up code will
attempt close() with fd=-1. Fix this by checking the bpf_prog_get_fd_by_id()
return result and exiting early.

Fixes: 10a13bb40e54 ("libbpf: remove qidconf and better support external bpf programs.")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107054059.313884-1-andriin@fb.com
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
4da243c179 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f23c7ce341c2dfd187d4e3712ba6c110969463a0
Checkpoint bpf-next commit: ed578021210e14f15a654c825fba6a700c9a39a7
Baseline bpf commit:        7de086909365cd60a5619a45af3f4152516fd75c
Checkpoint bpf commit:      7de086909365cd60a5619a45af3f4152516fd75c

Andrii Nakryiko (1):
  libbpf: Simplify BPF_CORE_READ_BITFIELD_PROBED usage

 src/bpf_core_read.h | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

--
2.17.1
2019-11-06 14:11:45 -08:00
Andrii Nakryiko
4d8fc6d438 libbpf: Simplify BPF_CORE_READ_BITFIELD_PROBED usage
Streamline BPF_CORE_READ_BITFIELD_PROBED interface to follow
BPF_CORE_READ_BITFIELD (direct) and BPF_CORE_READ, in general, i.e., just
return read result or 0, if underlying bpf_probe_read() failed.

In practice, real applications rarely check the bpf_probe_read() result, because
it has to always work; otherwise it's a bug. So propagating the internal
bpf_probe_read() error from this macro hurts usability without providing real
benefits in practice. This patch fixes the issue and simplifies usage,
noticeable even in selftest itself.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191106201500.2582438-1-andriin@fb.com
2019-11-06 14:11:45 -08:00
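
A sketch of the streamlined usage; the struct and field are hypothetical:

#include <linux/types.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>

/* Hypothetical kernel struct view with a bitfield. */
struct sock_flags___sketch {
        unsigned int keepalive: 1;
};

/* The macro now simply returns the extracted bitfield value, or 0 if
 * the underlying bpf_probe_read() fails. */
static __always_inline __u64 read_keepalive(struct sock_flags___sketch *f)
{
        return BPF_CORE_READ_BITFIELD_PROBED(f, keepalive);
}
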
Andrii Nakryiko
6d4abdda08 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   a566e35f1e8b4b3be1e96a804d1cca38b578167c
Checkpoint bpf-next commit: f23c7ce341c2dfd187d4e3712ba6c110969463a0
Baseline bpf commit:        fc11078dd3514c65eabce166b8431a56d8a667cb
Checkpoint bpf commit:      7de086909365cd60a5619a45af3f4152516fd75c

Alexei Starovoitov (1):
  libbpf: Add support for prog_tracing

Andrii Nakryiko (2):
  libbpf: Add support for relocatable bitfields
  libbpf: Add support for field size relocations

Daniel Borkmann (1):
  bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str
    helpers

Toke Høiland-Jørgensen (4):
  libbpf: Fix error handling in bpf_map__reuse_fd()
  libbpf: Store map pin path and status in struct bpf_map
  libbpf: Move directory creation into _pin() functions
  libbpf: Add auto-pinning of maps when loading BPF objects

 include/uapi/linux/bpf.h | 124 ++++---
 src/bpf.c                |   8 +-
 src/bpf.h                |   5 +-
 src/bpf_core_read.h      |  79 +++++
 src/bpf_helpers.h        |   6 +
 src/libbpf.c             | 707 ++++++++++++++++++++++++++++++---------
 src/libbpf.h             |  23 +-
 src/libbpf.map           |   5 +
 src/libbpf_internal.h    |   4 +
 src/libbpf_probes.c      |   1 +
 10 files changed, 749 insertions(+), 213 deletions(-)

--
2.17.1
2019-11-05 16:00:11 -08:00
Andrii Nakryiko
67ab4c0f82 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2019-11-05 16:00:11 -08:00
Andrii Nakryiko
df45cf7a3e libbpf: Add support for field size relocations
Add bpf_core_field_size() macro, capturing a relocation against field size.
Adjust bits of internal libbpf relocation logic to allow capturing size
relocations of various field types: arrays, structs/unions, enums, etc.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-4-andriin@fb.com
2019-11-05 16:00:11 -08:00
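
A sketch of pairing the new macro with a bounded probe read; the struct is
hypothetical:

#include <linux/types.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>

/* Hypothetical kernel struct view. */
struct record___sketch {
        char data[64];
};

/* bpf_core_field_size() relocates to the field's size on the target
 * kernel, so the copy adapts if the array grows or shrinks. */
static __always_inline int read_data(void *dst, __u32 dst_sz,
                                     struct record___sketch *r)
{
        __u32 sz = bpf_core_field_size(r->data);

        if (sz > dst_sz)
                sz = dst_sz;
        return bpf_probe_read(dst, sz, r->data);
}
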
Andrii Nakryiko
4438972ccc libbpf: Add support for relocatable bitfields
Add support for the new field relocation kinds, necessary to support
relocatable bitfield reads. Provide macro for abstracting necessary code doing
full relocatable bitfield extraction into u64 value. Two separate macros are
provided:
- BPF_CORE_READ_BITFIELD macro for direct memory read-enabled BPF programs
(e.g., typed raw tracepoints). It uses direct memory dereference to extract
bitfield backing integer value.
- BPF_CORE_READ_BITFIELD_PROBED macro for cases where bpf_probe_read() needs
to be used to extract same backing integer value.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-3-andriin@fb.com
2019-11-05 16:00:11 -08:00
Daniel Borkmann
09cd9ff2db bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers
The current bpf_probe_read() and bpf_probe_read_str() helpers are broken
in that they assume they can be used for probing memory access for kernel
space addresses /as well as/ user space addresses.

However, plain use of probe_kernel_read() for both cases will attempt to
always access kernel space address space given access is performed under
KERNEL_DS, and some archs in fact have overlapping address spaces where a
kernel pointer and user pointer would have the /same/ address value and
therefore accessing application memory via bpf_probe_read{,_str}() would
read garbage values.

Let's fix the BPF side by making use of the recently added 3d7081822f7f ("uaccess:
Add non-pagefault user-space read functions"). Unfortunately, the only way
to fix this status quo is to add dedicated bpf_probe_read_{user,kernel}()
and bpf_probe_read_{user,kernel}_str() helpers. The bpf_probe_read{,_str}()
helpers are kept as-is to retain their current behavior.

The two *_user() variants attempt the access always under USER_DS set, the
two *_kernel() variants will -EFAULT when accessing user memory if the
underlying architecture has non-overlapping address ranges, also avoiding
throwing the kernel warning via 00c42373d397 ("x86-64: add warning for
non-canonical user access address dereferences").

Fixes: a5e8c07059d0 ("bpf: add bpf_probe_read_str helper")
Fixes: 2541517c32be ("tracing, perf: Implement BPF programs attached to kprobes")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/796ee46e948bc808d54891a1108435f8652c6ca4.1572649915.git.daniel@iogearbox.net
2019-11-05 16:00:11 -08:00
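
A BPF-side sketch; the kprobe target and the argument position are only
illustrative:

#include <linux/ptrace.h>
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char LICENSE[] SEC("license") = "GPL";

/* With the dedicated helpers, a user-space pointer is read explicitly
 * as user memory instead of going through the ambiguous bpf_probe_read(). */
SEC("kprobe/do_sys_open")
int trace_open(struct pt_regs *ctx)
{
        const char *ufilename = (const char *)PT_REGS_PARM2(ctx);
        char path[64] = {};

        bpf_probe_read_user_str(path, sizeof(path), ufilename);
        return 0;
}
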
Toke Høiland-Jørgensen
e7d860d2fc libbpf: Add auto-pinning of maps when loading BPF objects
This adds support to libbpf for setting map pinning information as part of
the BTF map declaration, to get automatic map pinning (and reuse) on load.
The pinning type currently only supports a single PIN_BY_NAME mode, where
each map will be pinned by its name in a path that can be overridden, but
defaults to /sys/fs/bpf.

Since auto-pinning only does something if any maps actually have a
'pinning' BTF attribute set, we default the new option to enabled, on the
assumption that seamless pinning is what most callers want.

When a map has a pin_path set at load time, libbpf will compare the map
pinned at that location (if any), and if the attributes match, will re-use
that map instead of creating a new one. If no existing map is found, the
newly created map will instead be pinned at the location.

Programs wanting to customise the pinning can override the pinning paths
using bpf_map__set_pin_path() before calling bpf_object__load() (including
setting it to NULL to disable pinning of a particular map).

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269298092.394725.3966306029218559681.stgit@toke.dk
2019-11-05 16:00:11 -08:00
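
A BPF-side sketch of a map opting into automatic pinning; the map itself is
illustrative, while LIBBPF_PIN_BY_NAME is the single mode described above:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Pinned under /sys/fs/bpf/counters by default, or reused if a
 * compatible map is already pinned there. */
struct {
        __uint(type, BPF_MAP_TYPE_HASH);
        __uint(max_entries, 1024);
        __type(key, __u32);
        __type(value, __u64);
        __uint(pinning, LIBBPF_PIN_BY_NAME);
} counters SEC(".maps");
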
Toke Høiland-Jørgensen
ff3d2702d8 libbpf: Move directory creation into _pin() functions
The existing pin_*() functions all try to create the parent directory
before pinning. Move this check into the per-object _pin() functions
instead. This ensures consistent behaviour when auto-pinning is
added (which doesn't go through the top-level pin_maps() function), at the
cost of a few more calls to mkdir().

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297985.394725.5882630952992598610.stgit@toke.dk
2019-11-05 16:00:11 -08:00
Toke Høiland-Jørgensen
44f9712f79 libbpf: Store map pin path and status in struct bpf_map
Support storing and setting a pin path in struct bpf_map, which can be used
for automatic pinning. Also store the pin status so we can avoid attempts
to re-pin a map that has already been pinned (or reused from a previous
pinning).

The behaviour of bpf_object__{un,}pin_maps() is changed so that if it is
called with a NULL path argument (which was previously illegal), it will
(un)pin only those maps that have a pin_path set.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297876.394725.14782206533681896279.stgit@toke.dk
2019-11-05 16:00:11 -08:00
Toke Høiland-Jørgensen
fe4cb796df libbpf: Fix error handling in bpf_map__reuse_fd()
bpf_map__reuse_fd() was calling close() in the error path before returning
an error value based on errno. However, close can change errno, so that can
lead to potentially misleading error messages. Instead, explicitly store
errno in the err variable before each goto.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297769.394725.12634985106772698611.stgit@toke.dk
2019-11-05 16:00:11 -08:00
Alexei Starovoitov
15de8ad80d libbpf: Add support for prog_tracing
Cleanup libbpf from expected_attach_type == attach_btf_id hack
and introduce BPF_PROG_TYPE_TRACING.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191030223212.953010-3-ast@kernel.org
2019-11-05 16:00:11 -08:00
Frantisek Sumsal
d7a137510a coverity: explicitly use bash instead of sh
On Ubuntu `/bin/sh` is a symlink to `/bin/dash`, which doesn't support
certain builtins used by the Coverity script (namely pushd/popd)
2019-11-05 13:28:13 -08:00
Frantisek Sumsal
91e4f27dd7 travis: use sudo during the 'install' phase 2019-11-04 15:08:38 -08:00
Frantisek Sumsal
1339ef70a3 README: add Coverity badge 2019-11-01 23:22:57 -07:00
Frantisek Sumsal
c204e3d610 travis: automate Coverity builds 2019-11-01 23:22:57 -07:00
Frantisek Sumsal
32d0a03332 README: add a LGTM badge 2019-10-29 15:45:36 -07:00
Andrii Nakryiko
05346cfd90 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   3820729160440158a014add69cc0d371061a96b2
Checkpoint bpf-next commit: a566e35f1e8b4b3be1e96a804d1cca38b578167c
Baseline bpf commit:        2afd23f78f39da84937006ecd24aa664a4ab052b
Checkpoint bpf commit:      fc11078dd3514c65eabce166b8431a56d8a667cb

Andrii Nakryiko (2):
  libbpf: Fix off-by-one error in ELF sanity check
  libbpf: Don't use kernel-side u32 type in xsk.c

Magnus Karlsson (1):
  libbpf: Fix compatibility for kernels without need_wakeup

 src/libbpf.c |  2 +-
 src/xsk.c    | 83 ++++++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 72 insertions(+), 13 deletions(-)

--
2.17.1
2019-10-29 09:25:36 -07:00
Andrii Nakryiko
a7a32b899c libbpf: Don't use kernel-side u32 type in xsk.c
u32 is a kernel-side typedef; a user-space library is supposed to use __u32.
This breaks Github's projection of libbpf. Do the u32 -> __u32 fix.

Fixes: 94ff9ebb49a5 ("libbpf: Fix compatibility for kernels without need_wakeup")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20191029055953.2461336-1-andriin@fb.com
2019-10-29 09:25:36 -07:00
Andrii Nakryiko
68a051f2d2 libbpf: Fix off-by-one error in ELF sanity check
libbpf's bpf_object__elf_collect() does a simple sanity check after iterating
over all ELF sections: it checks that the .strtab index is correct. Unfortunately,
due to section indices being 1-based, the check breaks for cases when .strtab
ends up being the very last section in ELF.

Fixes: 77ba9a5b48a7 ("tools lib bpf: Fetch map names from correct strtab")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191028233727.1286699-1-andriin@fb.com
2019-10-29 09:25:36 -07:00
Magnus Karlsson
8e80367637 libbpf: Fix compatibility for kernels without need_wakeup
When the need_wakeup flag was added to AF_XDP, the format of the
XDP_MMAP_OFFSETS getsockopt was extended. Code was added to the
kernel to take care of compatibility issues arising from running
applications using any of the two formats. However, libbpf was
not extended to take care of the case when the application/libbpf
uses the new format but the kernel only supports the old
format. This patch adds support in libbpf for parsing the old
format, before the need_wakeup flag was added, and emulating a
set of static need_wakeup flags that will always work for the
application.

v2 -> v3:
* Incorporated code improvements suggested by Jonathan Lemon

v1 -> v2:
* Rebased to bpf-next
* Rewrote the code as the previous version made you blind

Fixes: a4500432c2587cb2a ("libbpf: add support for need_wakeup flag in AF_XDP part")
Reported-by: Eloy Degen <degeneloy@gmail.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1571995035-21889-1-git-send-email-magnus.karlsson@intel.com
2019-10-29 09:25:36 -07:00
Andrii Nakryiko
9a5adecc62 sync: ignore test_libbpf.c
Adjust sync script to ignore test_libbpf.c, not test_libbpf.cpp.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-29 09:25:36 -07:00
Frantisek Sumsal
b923d0e3c6 lgtm: fix the extraction process
As this project uses only a Makefile, without any configuration step, and due to
a "non-standard" location of the source files, LGTM kept failing to find the
respective Makefile and build the sources. By tricking LGTM's build-system
auto-detection into thinking we use automake/configure, it correctly sets the
source dir, so the compilation, extraction & analysis steps now work in the src/
subdirectory, as expected.
2019-10-28 15:15:47 -07:00
Andrii Nakryiko
f02e248ae1 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   5e5b03d163e15a40b0fa57c70b4e8edd549b0b98
Checkpoint bpf-next commit: 3820729160440158a014add69cc0d371061a96b2
Baseline bpf commit:        cd7455f1013ef96d5cbf5c05d2b7c06f273810a6
Checkpoint bpf commit:      2afd23f78f39da84937006ecd24aa664a4ab052b

Björn Töpel (1):
  libbpf: Use implicit XSKMAP lookup from AF_XDP XDP program

KP Singh (1):
  libbpf: Fix strncat bounds error in libbpf_prog_type_by_name

 src/libbpf.c |  2 +-
 src/xsk.c    | 42 ++++++++++++++++++++++++++++++++----------
 2 files changed, 33 insertions(+), 11 deletions(-)

--
2.17.1
2019-10-24 22:59:06 -07:00
KP Singh
e152510d72 libbpf: Fix strncat bounds error in libbpf_prog_type_by_name
On compiling samples with this change, one gets an error:

 error: ‘strncat’ specified bound 118 equals destination size
  [-Werror=stringop-truncation]

    strncat(dst, name + section_names[i].len,
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     sizeof(raw_tp_btf_name) - (dst - raw_tp_btf_name));
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

strncat requires the destination to have enough space for the
terminating null byte.

Fixes: f75a697e09137 ("libbpf: Auto-detect btf_id of BTF-based raw_tracepoint")
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191023154038.24075-1-kpsingh@chromium.org
2019-10-24 22:59:06 -07:00
Björn Töpel
59ac1946b0 libbpf: Use implicit XSKMAP lookup from AF_XDP XDP program
In commit 43e74c0267a3 ("bpf_xdp_redirect_map: Perform map lookup in
eBPF helper") the bpf_redirect_map() helper learned to do map lookup,
which means that the explicit lookup in the XDP program for AF_XDP is
not needed for post-5.3 kernels.

This commit adds the implicit map lookup with default action, which
improves the performance for the "rx_drop" [1] scenario by ~4%.

For pre-5.3 kernels, the bpf_redirect_map() returns XDP_ABORTED, and a
fallback path for backward compatibility is entered, where explicit
lookup is still performed. This means a slight regression for older
kernels (an additional bpf_redirect_map() call), but I consider that a
fair punishment for users not upgrading their kernels. ;-)

v1->v2: Backward compatibility (Toke) [2]
v2->v3: Avoid masking/zero-extension by using JMP32 [3]

[1] # xdpsock -i eth0 -z -r
[2] https://lore.kernel.org/bpf/87pnirb3dc.fsf@toke.dk/
[3] https://lore.kernel.org/bpf/87v9sip0i8.fsf@toke.dk/

Suggested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20191022072206.6318-1-bjorn.topel@gmail.com
2019-10-24 22:59:06 -07:00
Andrii Nakryiko
5150a4a0fb includes: add BPF_JMP32_IMM macro to fix build
A recent xsk change started using the new BPF_JMP32_IMM macro. Add it to our
local copy of include/linux/filter.h to fix the build.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-24 22:59:06 -07:00
Frantisek Sumsal
2a25957df6 travis: add an aarch64 Xenial job 2019-10-23 10:13:54 -07:00
Andrii Nakryiko
e441f55089 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   da927466a152a9497c05926a95c6aebba6d3ad5b
Checkpoint bpf-next commit: 5e5b03d163e15a40b0fa57c70b4e8edd549b0b98
Baseline bpf commit:        9e8acd9c44a0dd52b2922eeb82398c04e356c058
Checkpoint bpf commit:      cd7455f1013ef96d5cbf5c05d2b7c06f273810a6

Alexei Starovoitov (3):
  bpf: Add attach_btf_id attribute to program load
  libbpf: Auto-detect btf_id of BTF-based raw_tracepoints
  bpf: Check types of arguments passed into helpers

Andrii Nakryiko (5):
  tools: Sync if_link.h
  libbpf: Add bpf_program__get_{type, expected_attach_type} APIs
  libbpf: Add uprobe/uretprobe and tp/raw_tp section suffixes
  libbpf: Teach bpf_object__open to guess program types
  libbpf: Make DECLARE_LIBBPF_OPTS macro strictly a variable declaration

John Fastabend (1):
  bpf, libbpf: Add kernel version section parsing back

Kefeng Wang (1):
  tools, bpf: Rename pr_warning to pr_warn to align with kernel logging

 include/uapi/linux/bpf.h     |  28 +-
 include/uapi/linux/if_link.h |   2 +
 src/bpf.c                    |   3 +
 src/btf.c                    |  56 +--
 src/btf_dump.c               |  18 +-
 src/libbpf.c                 | 830 +++++++++++++++++++----------------
 src/libbpf.h                 |  24 +-
 src/libbpf.map               |   2 +
 src/libbpf_internal.h        |   8 +-
 src/xsk.c                    |   4 +-
 10 files changed, 539 insertions(+), 436 deletions(-)

--
2.17.1
2019-10-22 16:15:55 -07:00
Andrii Nakryiko
beb9f88080 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2019-10-22 16:15:55 -07:00
Andrii Nakryiko
c7b5116f71 libbpf: Make DECLARE_LIBBPF_OPTS macro strictly a variable declaration
LIBBPF_OPTS is implemented as a mix of field declaration and memset
+ assignment. This makes it neither variable declaration nor purely
statements, which is a problem, because you can't mix it with either
other variable declarations nor other function statements, because C90
compiler mode emits warning on mixing all that together.

This patch changes LIBBPF_OPTS into a strictly declaration of variable
and solves this problem, as can be seen in case of bpftool, which
previously would emit compiler warning, if done this way (LIBBPF_OPTS as
part of function variables declaration block).

This patch also renames LIBBPF_OPTS into DECLARE_LIBBPF_OPTS to follow
kernel convention for similar macros more closely.

v1->v2:
- rename LIBBPF_OPTS into DECLARE_LIBBPF_OPTS (Jakub Sitnicki).

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20191022172100.3281465-1-andriin@fb.com
2019-10-22 16:15:55 -07:00
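
A sketch of the macro sitting in a normal declaration block; the object name
and path are placeholders:

#include <bpf/libbpf.h>

static struct bpf_object *open_named(const char *path)
{
        /* A plain variable declaration now, so it can be mixed with the
         * other declarations above the first statement. */
        DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
                .object_name = "my_object",
        );
        struct bpf_object *obj;

        obj = bpf_object__open_file(path, &opts);
        if (libbpf_get_error(obj))
                return NULL;
        return obj;
}
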
Andrii Nakryiko
2b0cd55bf5 libbpf: Teach bpf_object__open to guess program types
Teach bpf_object__open how to guess program type and expected attach
type from section names, similar to what bpf_prog_load() does. This
seems like a really useful feature and an oversight to not have this
done during bpf_object__open(). To preserve backwards-compatible
behavior of bpf_prog_load(), its attr->prog_type is treated as an
override of bpf_object__open() decisions, if attr->prog_type is not
UNSPECIFIED.

There is a slight difference in behavior for bpf_prog_load().
Previously, if bpf_prog_load() was loading BPF object with more than one
program, first program's guessed program type and expected attach type
would determine corresponding attributes of all the subsequent program
types, even if their section names suggest otherwise. That seems like
a rather dubious behavior and with this change it will behave more
sanely: each program's type is determined individually, unless they are
forced to uniformity through attr->prog_type.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191021033902.3856966-5-andriin@fb.com
2019-10-22 16:15:55 -07:00
Andrii Nakryiko
188276ca5f libbpf: Add uprobe/uretprobe and tp/raw_tp section suffixes
Map uprobe/uretprobe into KPROBE program type. tp/raw_tp are just an
alias for more verbose tracepoint/raw_tracepoint, respectively.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191021033902.3856966-4-andriin@fb.com
2019-10-22 16:15:55 -07:00
Andrii Nakryiko
87c4984da8 libbpf: Add bpf_program__get_{type, expected_attach_type} APIs
There are bpf_program__set_type() and
bpf_program__set_expected_attach_type(), but no corresponding getters,
which seems rather incomplete. Fix this.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191021033902.3856966-3-andriin@fb.com
2019-10-22 16:15:55 -07:00
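
A trivial usage sketch (prog is an already-opened struct bpf_program *):

/* Symmetric to the existing setters. */
enum bpf_prog_type type = bpf_program__get_type(prog);
enum bpf_attach_type attach_type = bpf_program__get_expected_attach_type(prog);
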
Andrii Nakryiko
a5611ba6e8 tools: Sync if_link.h
Sync if_link.h into tools/ and get rid of annoying libbpf Makefile warning.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191021033902.3856966-2-andriin@fb.com
2019-10-22 16:15:55 -07:00
Kefeng Wang
c6e01425b6 tools, bpf: Rename pr_warning to pr_warn to align with kernel logging
For kernel logging macros, pr_warning() is completely removed and
replaced by pr_warn(). By using pr_warn() in tools/lib/bpf/ for
symmetry to kernel logging macros, we could eventually drop the
use of pr_warning() in the whole kernel tree.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191021055532.185245-1-wangkefeng.wang@huawei.com
2019-10-22 16:15:55 -07:00
John Fastabend
58e3a8fac1 bpf, libbpf: Add kernel version section parsing back
With commit "libbpf: stop enforcing kern_version,..." we removed the
kernel version section parsing in favor of querying for the kernel
using uname() and populating the version using the result of the
query. After this any version sections were simply ignored.

Unfortunately, the world of kernels is not so friendly. I've found some
customized kernels where uname() does not match the in-kernel version.
To fix this, so programs can load in this environment, this patch adds
back parsing the section and, if it exists, uses the user-specified
kernel version to override the uname() result. However, keep most of the
kernel uname() discovery bits so users are not required to insert the
version except in these odd cases.

Fixes: 5e61f27070292 ("libbpf: stop enforcing kern_version, populate it for users")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157140968634.9073.6407090804163937103.stgit@john-XPS-13-9370
2019-10-22 16:15:55 -07:00
Alexei Starovoitov
1b27702c14 bpf: Check types of arguments passed into helpers
Introduce new helper that reuses existing skb perf_event output
implementation, but can be called from raw_tracepoint programs
that receive 'struct sk_buff *' as tracepoint argument or
can walk other kernel data structures to skb pointer.

In order to do that teach verifier to resolve true C types
of bpf helpers into in-kernel BTF ids.
The type of kernel pointer passed by raw tracepoint into bpf
program will be tracked by the verifier all the way until
it's passed into helper function.
For example:
kfree_skb() kernel function calls trace_kfree_skb(skb, loc);
bpf programs receives that skb pointer and may eventually
pass it into bpf_skb_output() bpf helper which in-kernel is
implemented via bpf_skb_event_output() kernel function.
Its first argument in the kernel is 'struct sk_buff *'.
The verifier makes sure that types match all the way.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191016032505.2089704-11-ast@kernel.org
2019-10-22 16:15:55 -07:00
Alexei Starovoitov
39cf9fc90f libbpf: Auto-detect btf_id of BTF-based raw_tracepoints
It's the responsibility of the bpf program author to annotate the program
with SEC("tp_btf/name") where "name" is a valid raw tracepoint.
The libbpf will try to find "name" in vmlinux BTF and error out
in case vmlinux BTF is not available or "name" is not found.
If "name" is indeed a valid raw tracepoint then in-kernel BTF
will have "btf_trace_##name" typedef that points to function
prototype of that raw tracepoint. BTF description captures
exact argument the kernel C code is passing into raw tracepoint.
The kernel verifier will check the types while loading bpf program.

libbpf keeps BTF type id in expected_attach_type, but since
kernel ignores this attribute for tracing programs copy it
into attach_btf_id attribute before loading.

Later the kernel will use prog->attach_btf_id to select raw tracepoint
during bpf_raw_tracepoint_open syscall command.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191016032505.2089704-6-ast@kernel.org
2019-10-22 16:15:55 -07:00
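
A BPF-side sketch; sched_switch is just an example of a valid raw tracepoint
name:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "GPL";

/* libbpf resolves "sched_switch" in vmlinux BTF and fills in
 * attach_btf_id before the program is loaded. */
SEC("tp_btf/sched_switch")
int handle_sched_switch(__u64 *ctx)
{
        return 0;
}
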
Alexei Starovoitov
bc4a6e9709 bpf: Add attach_btf_id attribute to program load
Add attach_btf_id attribute to prog_load command.
It's similar to existing expected_attach_type attribute which is
used in several cgroup based program types.
Unfortunately expected_attach_type is ignored for
tracing programs and cannot be reused for new purpose.
Hence introduce attach_btf_id to verify bpf programs against
given in-kernel BTF type id at load time.
It is strictly checked to be valid for raw_tp programs only.
In a later patches it will become:
btf_id == 0 semantics of existing raw_tp progs.
btd_id > 0 raw_tp with BTF and additional type safety.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191016032505.2089704-5-ast@kernel.org
2019-10-22 16:15:55 -07:00
Andrii Nakryiko
4a50ceb043 Makefile: back-port _FILE_OFFSET_BITS=64 and _LARGEFILE64_SOURCE to Makefile
Upstream commit 71dd77fd4bf7 ("libbpf: use LFS (_FILE_OFFSET_BITS) instead
of direct mmap2 syscall") added _FILE_OFFSET_BITS=64 and
_LARGEFILE64_SOURCE CFLAGS. Back-port them to Github's mirror to avoid
compilation problems on ARM.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-22 14:50:23 -07:00
Andrii Nakryiko
4d86cae4f0 ci: disable GCC's -Wstringop-truncation noisy error
This error is usually a false positive for us. Disable it.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
33b374395f sync: adjust sync script for test_libbpf.c rename and bpf_helper_defs.h
Accommodate changes:
- test_libbpf.cpp was renamed to test_libbpf.c;
- bpf_helper_defs.h should be ignored for consistency check at the end,
  as it's not checked in on linux side;

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
ade4409352 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f05c2001ecc98629cecd47728e4db11e5a17e58d
Checkpoint bpf-next commit: da927466a152a9497c05926a95c6aebba6d3ad5b
Baseline bpf commit:        106c35dda32f8b63f88cad7433f1b8bb0056958a
Checkpoint bpf commit:      9e8acd9c44a0dd52b2922eeb82398c04e356c058

Andrii Nakryiko (7):
  libbpf: Fix struct end padding in btf_dump
  libbpf: Generate more efficient BPF_CORE_READ code
  libbpf: Handle invalid typedef emitted by old GCC
  libbpf: Update BTF reloc support to latest Clang format
  libbpf: Refactor bpf_object__open APIs to use common opts
  libbpf: Add support for field existence CO-RE relocation
  libbpf: Add BPF-side definitions of supported field relocation kinds

Ilya Maximets (1):
  libbpf: Fix passing uninitialized bytes to setsockopt

 src/bpf_core_read.h   |  28 ++++++-
 src/btf.c             |  16 ++--
 src/btf.h             |   4 +-
 src/btf_dump.c        |  19 ++++-
 src/libbpf.c          | 169 ++++++++++++++++++++++++++----------------
 src/libbpf.h          |   4 +-
 src/libbpf_internal.h |  25 +++++--
 src/xsk.c             |   1 +
 8 files changed, 180 insertions(+), 86 deletions(-)

--
2.17.1
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
2f9abb2a26 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
fca60960ea libbpf: Add BPF-side definitions of supported field relocation kinds
Add enum definition for Clang's __builtin_preserve_field_info()
second argument (info_kind). Currently only byte offset and existence
are supported. Corresponding Clang changes introducing this built-in can
be found at [0]

  [0] https://reviews.llvm.org/D67980

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191015182849.3922287-5-andriin@fb.com
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
0db22b01a1 libbpf: Add support for field existence CO-RE relocation
Add support for the BPF_FRK_EXISTS relocation kind to detect the existence of
a captured field in the destination BTF, allowing conditional logic to
handle incompatible differences between kernels.

Also introduce an opt-in relaxed CO-RE relocation handling option, which
makes libbpf emit a warning for failed relocations but proceed with the other
relocations. An instruction whose relocation failed is patched with a
(u32)-1 value.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191015182849.3922287-4-andriin@fb.com
2019-10-15 19:43:48 -07:00
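
A sketch of guarding a field that only newer kernels have; the struct view and
field are hypothetical:

#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>

/* Hypothetical, minimal view of a kernel struct. */
struct task_struct___sketch {
        int prio;
        int new_field;  /* may not exist on older kernels */
};

static __always_inline int read_new_field(struct task_struct___sketch *t)
{
        if (bpf_core_field_exists(t->new_field))
                return BPF_CORE_READ(t, new_field);
        return -1;
}
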
Andrii Nakryiko
807b9d7be1 libbpf: Refactor bpf_object__open APIs to use common opts
Refactor all the various bpf_object__open variations to ultimately
specify common bpf_object_open_opts struct. This makes it easy to keep
extending this common struct w/ extra parameters without having to
update all the legacy APIs.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191015182849.3922287-3-andriin@fb.com
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
a3d02f9ab4 libbpf: Update BTF reloc support to latest Clang format
BTF offset reloc was generalized in recent Clang into field relocation,
capturing extra u32 field, specifying what aspect of captured field
needs to be relocated. This changes .BTF.ext's record size for this
relocation from 12 bytes to 16 bytes. Given these format changes
happened in Clang before an officially released version, it's ok to not
support the outdated 12-byte record size w/o breaking ABI.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191015182849.3922287-2-andriin@fb.com
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
54aac21f7e libbpf: Handle invalid typedef emitted by old GCC
Old GCC versions are producing invalid typedef for __gnuc_va_list
pointing to void. Special-case this and emit valid:

typedef __builtin_va_list __gnuc_va_list;

Reported-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20191011032901.452042-1-andriin@fb.com
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
d8dd0beb98 libbpf: Generate more efficient BPF_CORE_READ code
Existing BPF_CORE_READ() macro generates slightly suboptimal code. If
there are intermediate pointers to be read, initial source pointer is
going to be assigned into a temporary variable and then temporary
variable is going to be uniformly used as a "source" pointer for all
intermediate pointer reads. Schematically (ignoring all the type casts),
BPF_CORE_READ(s, a, b, c) is expanded into:
({
	const void *__t = src;
	bpf_probe_read(&__t, sizeof(*__t), &__t->a);
	bpf_probe_read(&__t, sizeof(*__t), &__t->b);

	typeof(s->a->b->c) __r;
	bpf_probe_read(&__r, sizeof(*__r), &__t->c);
})

This initial `__t = src` makes calls more uniform, but causes slightly
less optimal register usage sometimes when compiled with Clang. This can
cascade into, e.g., more register spills.

This patch fixes this issue by generating more optimal sequence:
({
	const void *__t;
	bpf_probe_read(&__t, sizeof(*__t), &src->a); /* <-- src here */
	bpf_probe_read(&__t, sizeof(*__t), &__t->b);

	typeof(s->a->b->c) __r;
	bpf_probe_read(&__r, sizeof(*__r), &__t->c);
})

Fixes: 7db3822ab991 ("libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191011023847.275936-1-andriin@fb.com
2019-10-15 19:43:48 -07:00
Ilya Maximets
e94f57a9ab libbpf: Fix passing uninitialized bytes to setsockopt
'struct xdp_umem_reg' has 4 bytes of padding at the end that makes
valgrind complain about passing uninitialized stack memory to the
syscall:

  Syscall param socketcall.setsockopt() points to uninitialised byte(s)
    at 0x4E7AB7E: setsockopt (in /usr/lib64/libc-2.29.so)
    by 0x4BDE035: xsk_umem__create@@LIBBPF_0.0.4 (xsk.c:172)
  Uninitialised value was created by a stack allocation
    at 0x4BDDEBA: xsk_umem__create@@LIBBPF_0.0.4 (xsk.c:140)

Padding bytes appeared after the introduction of a new 'flags' field.
memset() is required to clear them.

Fixes: 10d30e301732 ("libbpf: add flags to umem config")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191009164929.17242-1-i.maximets@ovn.org
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
bda436be4a libbpf: Fix struct end padding in btf_dump
Fix a case where explicit padding at the end of a struct is necessary
due to non-standard alignment requirements of fields (which BTF doesn't
capture explicitly).

Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Reported-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20191008231009.2991130-2-andriin@fb.com
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
a30df5c09f makefile: install new BPF-side headers along libbpf user-land ones
Install BPF-side helper headers:
- bpf_helpers.h
- bpf_helper_defs.h
- bpf_tracing.h
- bpf_endian.h
- bpf_core_read.h

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
e776bf7ec7 sync: teach sync script to generate bpf_helper_defs.h
The Linux repo doesn't commit bpf_helper_defs.h, as it's re-generated on
every build. For the Github projection, though, it's much nicer to have
this header be pre-generated during sync and committed. This makes the
integration story easier for all the users that use libbpf as
a submodule.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
46688687d5 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   02dc96ef6c25f990452c114c59d75c368a1f4c8f
Checkpoint bpf-next commit: f05c2001ecc98629cecd47728e4db11e5a17e58d
Baseline bpf commit:        1bd63524593b964934a33afd442df16b8f90e2b5
Checkpoint bpf commit:      106c35dda32f8b63f88cad7433f1b8bb0056958a

Andrii Nakryiko (7):
  libbpf: Bump current version to v0.0.6
  libbpf: stop enforcing kern_version, populate it for users
  libbpf: add bpf_object__open_{file, mem} w/ extensible opts
  libbpf: fix bpf_object__name() to actually return object name
  uapi/bpf: fix helper docs
  libbpf: Move bpf_{helpers, helper_defs, endian, tracing}.h into libbpf
  libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers

 include/uapi/linux/bpf.h |  32 +++----
 src/bpf_core_read.h      | 167 +++++++++++++++++++++++++++++++++
 src/bpf_endian.h         |  72 +++++++++++++++
 src/bpf_helpers.h        |  41 ++++++++
 src/bpf_tracing.h        | 195 +++++++++++++++++++++++++++++++++++++++
 src/libbpf.c             | 183 ++++++++++++++++++------------------
 src/libbpf.h             |  48 +++++++++-
 src/libbpf.map           |   6 ++
 src/libbpf_internal.h    |  32 +++++++
 9 files changed, 661 insertions(+), 115 deletions(-)
 create mode 100644 src/bpf_core_read.h
 create mode 100644 src/bpf_endian.h
 create mode 100644 src/bpf_helpers.h
 create mode 100644 src/bpf_tracing.h

--
2.17.1
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
19cbbd8f52 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
c87b3a6065 libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers
Add a few macros simplifying BCC-like multi-level probe reads, while also
emitting CO-RE relocations for each read.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191008175942.1769476-7-andriin@fb.com
2019-10-09 14:42:45 -07:00
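
A sketch of a multi-level read; the struct views are hypothetical, trimmed to
the fields being traversed:

#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>

/* Hypothetical, minimal views of the kernel structs involved. */
struct mm_struct___sketch {
        unsigned long start_stack;
};
struct task_struct___sketch {
        struct mm_struct___sketch *mm;
};

/* Equivalent to task->mm->start_stack in BCC, with one CO-RE
 * relocation emitted per field access. */
static __always_inline unsigned long stack_start(struct task_struct___sketch *t)
{
        return BPF_CORE_READ(t, mm, start_stack);
}
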
Andrii Nakryiko
4c55ba2b19 libbpf: Move bpf_{helpers, helper_defs, endian, tracing}.h into libbpf
Move bpf_helpers.h, bpf_tracing.h, and bpf_endian.h into libbpf. Move
bpf_helper_defs.h generation into libbpf's Makefile. Ensure all those
headers are installed along the other libbpf headers. Also, adjust
selftests and samples include path to include libbpf now.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191008175942.1769476-6-andriin@fb.com
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
104006a054 uapi/bpf: fix helper docs
Various small fixes to BPF helper documentation comments, enabling
automatic header generation with a list of BPF helpers.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
bf83a95dee libbpf: fix bpf_object__name() to actually return object name
bpf_object__name() was returning file path, not name. Fix this.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
732f598282 libbpf: add bpf_object__open_{file, mem} w/ extensible opts
Add a new set of bpf_object__open APIs using a new approach to optional
parameter extensibility that allows a simpler ABI compatibility approach.

This patch demonstrates an approach to implementing libbpf APIs that
makes it easy to extend existing APIs with extra optional parameters in
such a way, that ABI compatibility is preserved without having to do
symbol versioning and generating lots of boilerplate code to handle it.
To facilitate succinct code for working with options, add OPTS_VALID,
OPTS_HAS, and OPTS_GET macros that hide all the NULL, size, and zero
checks.

Additionally, newly added libbpf APIs are encouraged to follow a similar
pattern: have all mandatory parameters as formal function parameters and
always take an optional (NULL-able) xxx_opts struct, which should always
have the real struct size as its first field, with the rest being optional
parameters added over time that tune the behavior of the existing API, if
specified by the user.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
de3c5a17cb libbpf: stop enforcing kern_version, populate it for users
Kernel version enforcement for kprobes/kretprobes was removed from
5.0 kernel in 6c4fc209fcf9 ("bpf: remove useless version check for prog load").
Since then, BPF programs were specifying SEC("version") just to please
libbpf. We should stop enforcing this in libbpf, if even the kernel doesn't
care. Furthermore, libbpf will now pre-populate the current kernel version
of the host system, in case we are still running on an old kernel.

This patch also removes __bpf_object__open_xattr from libbpf.h, as
nothing in libbpf is relying on having it in that header. That function
was never exported as LIBBPF_API and even the name suggests it's an internal
version. So this should be safe to remove, as it doesn't break ABI.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
1a8a75037b libbpf: Bump current version to v0.0.6
New release cycle started, let's bump to v0.0.6 proactively.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20190930222503.519782-1-andriin@fb.com
2019-10-09 14:42:45 -07:00
99 changed files with 124918 additions and 4691 deletions

.github/actions/debian/action.yml

@@ -0,0 +1,16 @@
name: 'debian'
description: 'Build'
inputs:
  target:
    description: 'Run target'
    required: true
runs:
  using: "composite"
  steps:
    - run: |
        source /tmp/ci_setup
        bash -x $CI_ROOT/managers/debian.sh SETUP
        bash -x $CI_ROOT/managers/debian.sh ${{ inputs.target }}
        bash -x $CI_ROOT/managers/debian.sh CLEANUP
      shell: bash

.github/actions/setup/action.yml

@@ -0,0 +1,23 @@
name: 'setup'
description: 'setup env, create /tmp/ci_setup'
runs:
  using: "composite"
  steps:
    - id: variables
      run: |
        export REPO_ROOT=$GITHUB_WORKSPACE
        export CI_ROOT=$REPO_ROOT/travis-ci
        # this is somewhat ugly, but that is the easiest way to share this code with
        # arch specific docker
        echo 'echo ::group::Env setup' > /tmp/ci_setup
        echo export DEBIAN_FRONTEND=noninteractive >> /tmp/ci_setup
        echo sudo apt-get update >> /tmp/ci_setup
        echo sudo apt-get install -y aptitude qemu-kvm zstd binutils-dev elfutils libcap-dev libelf-dev libdw-dev >> /tmp/ci_setup
        echo export PROJECT_NAME='libbpf' >> /tmp/ci_setup
        echo export AUTHOR_EMAIL="$(git log -1 --pretty=\"%aE\")" >> /tmp/ci_setup
        echo export REPO_ROOT=$GITHUB_WORKSPACE >> /tmp/ci_setup
        echo export CI_ROOT=$REPO_ROOT/travis-ci >> /tmp/ci_setup
        echo export VMTEST_ROOT=$CI_ROOT/vmtest >> /tmp/ci_setup
        echo 'echo ::endgroup::' >> /tmp/ci_setup
      shell: bash

.github/actions/vmtest/action.yml

@@ -0,0 +1,35 @@
name: 'vmtest'
description: 'Build + run vmtest'
inputs:
  kernel:
    description: 'kernel version or LATEST'
    required: true
    default: 'LATEST'
  kernel-rev:
    description: 'CHECKPOINT or rev/tag/branch'
    required: true
    default: 'CHECKPOINT'
  kernel-origin:
    description: 'kernel repo'
    required: true
    default: 'https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git'
  pahole:
    description: 'pahole rev/tag/branch'
    required: true
    default: 'master'
  pahole-origin:
    description: 'pahole repo'
    required: true
    default: 'https://git.kernel.org/pub/scm/devel/pahole/pahole.git'
runs:
  using: "composite"
  steps:
    - run: |
        source /tmp/ci_setup
        export KERNEL=${{ inputs.kernel }}
        export KERNEL_BRANCH=${{ inputs.kernel-rev }}
        export KERNEL_ORIGIN=${{ inputs.kernel-origin }}
        export PAHOLE_BRANCH=${{ inputs.pahole }}
        export PAHOLE_ORIGIN=${{ inputs.pahole-origin }}
        $CI_ROOT/vmtest/run_vmtest.sh
      shell: bash

.github/workflows/coverity.yml

@@ -0,0 +1,30 @@
name: libbpf-ci-coverity
on:
  schedule:
    - cron: '0 18 * * *'
jobs:
  coverity:
    runs-on: ubuntu-latest
    name: Coverity
    steps:
      - uses: actions/checkout@v2
      - uses: ./.github/actions/setup
      - name: Run coverity
        run: |
          echo ::group::Setup CI env
          source /tmp/ci_setup
          export COVERITY_SCAN_NOTIFICATION_EMAIL="${AUTHOR_EMAIL}"
          export COVERITY_SCAN_BRANCH_PATTERN=${GITHUB_REF##refs/*/}
          export TRAVIS_BRANCH=${COVERITY_SCAN_BRANCH_PATTERN}
          echo ::endgroup::
          scripts/coverity.sh
        env:
          COVERITY_SCAN_TOKEN: ${{ secrets.COVERITY_SCAN_TOKEN }}
          COVERITY_SCAN_PROJECT_NAME: libbpf
          COVERITY_SCAN_BUILD_COMMAND_PREPEND: 'cd src/'
          COVERITY_SCAN_BUILD_COMMAND: 'make'
      - name: SCM log
        run: cat /home/runner/work/libbpf/libbpf/src/cov-int/scm_log.txt

.github/workflows/ondemand.yml

@@ -0,0 +1,36 @@
name: ondemand
on:
  workflow_dispatch:
    inputs:
      kernel-origin:
        description: 'git repo for linux kernel'
        default: 'https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git'
        required: true
      kernel-rev:
        description: 'rev/tag/branch for linux kernel'
        default: "master"
        required: true
      pahole-origin:
        description: 'git repo for pahole'
        default: 'https://git.kernel.org/pub/scm/devel/pahole/pahole.git'
        required: true
      pahole-rev:
        description: 'ref/tag/branch for pahole'
        default: "master"
        required: true
jobs:
  vmtest:
    runs-on: ubuntu-latest
    name: vmtest with customized pahole/Kernel
    steps:
      - uses: actions/checkout@v2
      - uses: ./.github/actions/setup
      - uses: ./.github/actions/vmtest
        with:
          kernel: 'LATEST'
          kernel-rev: ${{ github.event.inputs.kernel-rev }}
          kernel-origin: ${{ github.event.inputs.kernel-origin }}
          pahole: ${{ github.event.inputs.pahole-rev }}
          pahole-origin: ${{ github.event.inputs.pahole-origin }}

.github/workflows/pahole.yml

@@ -0,0 +1,20 @@
name: pahole-staging
on:
  schedule:
    - cron: '0 18 * * *'
jobs:
  vmtest:
    runs-on: ubuntu-latest
    name: Kernel LATEST + staging pahole
    env:
      STAGING: tmp.master
    steps:
      - uses: actions/checkout@v2
      - uses: ./.github/actions/setup
      - uses: ./.github/actions/vmtest
        with:
          kernel: LATEST
          pahole: $STAGING

.github/workflows/test.yml

@@ -0,0 +1,100 @@
name: libbpf-ci
on:
  pull_request:
  push:
  schedule:
    - cron: '0 18 * * *'
concurrency:
  group: ci-test-${{ github.head_ref }}
  cancel-in-progress: true
jobs:
  vmtest:
    runs-on: ubuntu-latest
    name: Kernel ${{ matrix.kernel }} + selftests
    strategy:
      fail-fast: false
      matrix:
        include:
          - kernel: 'LATEST'
          - kernel: '5.5.0'
          - kernel: '4.9.0'
    steps:
      - uses: actions/checkout@v2
        name: Checkout
      - uses: ./.github/actions/setup
        name: Setup
      - uses: ./.github/actions/vmtest
        name: vmtest
        with:
          kernel: ${{ matrix.kernel }}
  debian:
    runs-on: ubuntu-latest
    name: Debian Build (${{ matrix.name }})
    strategy:
      fail-fast: false
      matrix:
        include:
          - name: default
            target: RUN
          - name: ASan+UBSan
            target: RUN_ASAN
          - name: clang
            target: RUN_CLANG
          - name: clang ASan+UBSan
            target: RUN_CLANG_ASAN
          - name: gcc-10
            target: RUN_GCC10
          - name: gcc-10 ASan+UBSan
            target: RUN_GCC10_ASAN
    steps:
      - uses: actions/checkout@v2
        name: Checkout
      - uses: ./.github/actions/setup
        name: Setup
      - uses: ./.github/actions/debian
        name: Build
        with:
          target: ${{ matrix.target }}
  ubuntu:
    runs-on: ubuntu-latest
    name: Ubuntu Focal Build (${{ matrix.arch }})
    strategy:
      fail-fast: false
      matrix:
        include:
          - arch: aarch64
          - arch: ppc64le
          - arch: s390x
          - arch: x86
    steps:
      - uses: actions/checkout@v2
        name: Checkout
      - uses: ./.github/actions/setup
        name: Pre-Setup
      - run: source /tmp/ci_setup && sudo -E $CI_ROOT/managers/ubuntu.sh
        if: matrix.arch == 'x86'
        name: Setup
      - uses: uraimo/run-on-arch-action@v2.0.5
        name: Build in docker
        if: matrix.arch != 'x86'
        with:
          distro:
            ubuntu20.04
          arch:
            ${{ matrix.arch }}
          setup:
            cp /tmp/ci_setup $GITHUB_WORKSPACE
          dockerRunArgs: |
            --volume "${GITHUB_WORKSPACE}:${GITHUB_WORKSPACE}"
          shell: /bin/bash
          install: |
            export DEBIAN_FRONTEND=noninteractive
            export TZ="America/Los_Angeles"
            apt-get update -y
            apt-get install -y tzdata build-essential sudo
          run: source ${GITHUB_WORKSPACE}/ci_setup && $CI_ROOT/managers/ubuntu.sh

14
.lgtm.yml Normal file

@@ -0,0 +1,14 @@
# vi: set ts=2 sw=2:
extraction:
  cpp:
    prepare:
      packages:
        - libelf-dev
        - pkg-config
    after_prepare:
      # As the buildsystem detection by LGTM is performed _only_ during the
      # 'configure' phase, we need to trick LGTM into thinking we use a supported
      # build system (configure, meson, cmake, etc.). This way LGTM correctly
      # detects that our sources are in the src/ subfolder.
      - touch src/configure
      - chmod +x src/configure

17
.readthedocs.yaml Normal file

@@ -0,0 +1,17 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Build documentation in the docs/ directory with Sphinx
sphinx:
  builder: html
  configuration: docs/conf.py

# Optionally set the version of Python and requirements required to build your docs
python:
  version: 3.7
  install:
    - requirements: docs/sphinx/requirements.txt

.travis.yml

@@ -1,122 +1,130 @@
sudo: required
dist: xenial
language: bash
dist: focal
services:
- docker
env:
global:
- AUTHOR_EMAIL="$(git log -1 $TRAVIS_COMMIT --pretty=\"%aE\")"
- CI_MANAGERS="$TRAVIS_BUILD_DIR/travis-ci/managers"
- PROJECT_NAME='libbpf'
- AUTHOR_EMAIL="$(git log -1 --pretty=\"%aE\")"
- REPO_ROOT="$TRAVIS_BUILD_DIR"
- CI_ROOT="$REPO_ROOT/travis-ci"
- VMTEST_ROOT="$CI_ROOT/vmtest"
addons:
apt:
packages:
- qemu-kvm
- zstd
- binutils-dev
- elfutils
- libcap-dev
- libelf-dev
- libdw-dev
stages:
# Run Coverity periodically instead of for each PR for following reasons:
# 1) Coverity jobs are heavily rate-limited
# 2) Due to security restrictions of encrypted environment variables
# in Travis CI, pull requests made from forks can't access encrypted
# env variables, making Coverity unusable
# See: https://docs.travis-ci.com/user/pull-requests#pull-requests-and-security-restrictions
- name: Coverity
if: type = cron
jobs:
include:
- stage: Build & test
name: Debian Testing
- stage: Builds & Tests
name: Kernel 5.5.0 + selftests
language: bash
env: KERNEL=5.5.0
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
- name: Kernel LATEST + selftests
language: bash
env: KERNEL=LATEST
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
- name: Kernel 4.9.0 + selftests
language: bash
env: KERNEL=4.9.0
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
- name: Debian Build
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (ASan+UBSan)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_ASAN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (clang)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_CLANG || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (clang ASan+UBSan)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_CLANG_ASAN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (gcc-10)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_GCC10 || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (gcc-10 ASan+UBSan)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_GCC10_ASAN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Ubuntu Focal Build
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- name: Ubuntu Focal Build (arm)
arch: arm64
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- name: Ubuntu Focal Build (s390x)
arch: s390x
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- name: Ubuntu Focal Build (ppc64le)
arch: ppc64le
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- stage: Coverity
language: bash
env:
- DEBIAN_RELEASE="testing"
- CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"
before_install:
- sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
- docker --version
# Coverity configuration
# COVERITY_SCAN_TOKEN=xxx
# Encrypted using `travis encrypt --repo libbpf/libbpf COVERITY_SCAN_TOKEN=xxx`
- secure: "I9OsMRHbb82IUivDp+I+w/jEQFOJgBDAqYqf1ollqCM1QhocxMcS9bwIAgfPhdXi2hohV7sRrVMZstahY67FAvJLGxNopi4tAPDIAaIFxgO0yDxMhaTMx5xDfMwlIm2FOP/9gB9BQsd6M7CmoQZgXYwBIv7xd1ooxoQrh2rOK1YrRl7UQu3+c3zPTjDfIYZzR3bFttMqZ9/c4U0v8Ry5IFXrel3hCshndHA1TtttJrUSrILlZcmVc1ch7JIy6zCbCU/2lGv0B/7rWXfF8MT7O9jPtFOhJ1DEcd2zhw2n4j9YT3a8OhtnM61LA6ask632mwCOsxpFLTun7AzuR1Cb5mdPHsxhxnCHcXXARa2mJjem0QG1NhwxwJE8sbRDapojexxCvweYlEN40ofwMDSnj/qNt95XIcrk0tiIhGFx0gVNWvAdmZwx+N4mwGPMTAN0AEOFjpgI+ZdB89m+tL/CbEgE1flc8QxUxJhcp5OhH6yR0z9qYOp0nXIbHsIaCiRvt/7LqFRQfheifztWVz4mdQlCdKS9gcOQ09oKicPevKO1L0Ue3cb7Ug7jOpMs+cdh3XokJtUeYEr1NijMHT9+CTAhhO5RToWXIZRon719z3fwoUBNDREATwVFMlVxqSO/pbYgaKminigYbl785S89YYaZ6E5UvaKRHM6KHKMDszs="
- COVERITY_SCAN_PROJECT_NAME="libbpf"
- COVERITY_SCAN_NOTIFICATION_EMAIL="${AUTHOR_EMAIL}"
- COVERITY_SCAN_BRANCH_PATTERN="$TRAVIS_BRANCH"
# Note: `make -C src/` as a BUILD_COMMAND will not work here
- COVERITY_SCAN_BUILD_COMMAND_PREPEND="cd src/"
- COVERITY_SCAN_BUILD_COMMAND="make"
install:
- $CI_MANAGERS/debian.sh SETUP
- sudo echo 'deb-src http://archive.ubuntu.com/ubuntu/ focal main restricted universe multiverse' >>/etc/apt/sources.list
- sudo apt-get update
- sudo apt-get -y build-dep libelf-dev
- sudo apt-get install -y libelf-dev pkg-config
script:
- set -e
- $CI_MANAGERS/debian.sh RUN
- set +e
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
- name: Debian Testing (ASan+UBSan)
language: bash
env:
- DEBIAN_RELEASE="testing"
- CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"
before_install:
- sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
script:
- set -e
- $CI_MANAGERS/debian.sh RUN_ASAN
- set +e
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
- name: Debian Testing (clang)
language: bash
env:
- DEBIAN_RELEASE="testing"
- CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"
before_install:
- sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
script:
- set -e
- $CI_MANAGERS/debian.sh RUN_CLANG
- set +e
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
- name: Debian Testing (clang ASan+UBSan)
language: bash
env:
- DEBIAN_RELEASE="testing"
- CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"
before_install:
- sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
script:
- set -e
- $CI_MANAGERS/debian.sh RUN_CLANG_ASAN
- set +e
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
- name: Debian Testing (gcc-8)
language: bash
env:
- DEBIAN_RELEASE="testing"
- CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"
before_install:
- sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
script:
- set -e
- $CI_MANAGERS/debian.sh RUN_GCC8
- set +e
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
- name: Debian Testing (gcc-8 ASan+UBSan)
language: bash
env:
- DEBIAN_RELEASE="testing"
- CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"
before_install:
- sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
script:
- set -e
- $CI_MANAGERS/debian.sh RUN_GCC8_ASAN
- set +e
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
- name: Ubuntu Xenial
language: bash
script:
- set -e
- sudo $CI_MANAGERS/xenial.sh
- set +e
- scripts/coverity.sh || travis_terminate 1
allow_failures:
- env: KERNEL=x.x.x

BPF-CHECKPOINT-COMMIT

@@ -1 +1 @@
1bd63524593b964934a33afd442df16b8f90e2b5
8d6c414cd2fb74aa6812e9bfec6178f8246c4f3a

CHECKPOINT-COMMIT

@@ -1 +1 @@
02dc96ef6c25f990452c114c59d75c368a1f4c8f
5319255b8df9271474bc9027cabf82253934f28d

1
LICENSE Normal file

@@ -0,0 +1 @@
LGPL-2.1 OR BSD-2-Clause

32
LICENSE.BSD-2-Clause Normal file

@@ -0,0 +1,32 @@
Valid-License-Identifier: BSD-2-Clause
SPDX-URL: https://spdx.org/licenses/BSD-2-Clause.html
Usage-Guide:
To use the BSD 2-clause "Simplified" License put the following SPDX
tag/value pair into a comment according to the placement guidelines in
the licensing rules documentation:
SPDX-License-Identifier: BSD-2-Clause
License-Text:
Copyright (c) <year> <owner> . All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

503
LICENSE.LGPL-2.1 Normal file

@@ -0,0 +1,503 @@
Valid-License-Identifier: LGPL-2.1
Valid-License-Identifier: LGPL-2.1+
SPDX-URL: https://spdx.org/licenses/LGPL-2.1.html
Usage-Guide:
To use this license in source code, put one of the following SPDX
tag/value pairs into a comment according to the placement
guidelines in the licensing rules documentation.
For 'GNU Lesser General Public License (LGPL) version 2.1 only' use:
SPDX-License-Identifier: LGPL-2.1
For 'GNU Lesser General Public License (LGPL) version 2.1 or any later
version' use:
SPDX-License-Identifier: LGPL-2.1+
License-Text:
GNU LESSER GENERAL PUBLIC LICENSE
Version 2.1, February 1999
Copyright (C) 1991, 1999 Free Software Foundation, Inc.
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
[This is the first released version of the Lesser GPL. It also counts as
the successor of the GNU Library Public License, version 2, hence the
version number 2.1.]
Preamble
The licenses for most software are designed to take away your freedom to
share and change it. By contrast, the GNU General Public Licenses are
intended to guarantee your freedom to share and change free software--to
make sure the software is free for all its users.
This license, the Lesser General Public License, applies to some specially
designated software packages--typically libraries--of the Free Software
Foundation and other authors who decide to use it. You can use it too, but
we suggest you first think carefully about whether this license or the
ordinary General Public License is the better strategy to use in any
particular case, based on the explanations below.
When we speak of free software, we are referring to freedom of use, not
price. Our General Public Licenses are designed to make sure that you have
the freedom to distribute copies of free software (and charge for this
service if you wish); that you receive source code or can get it if you
want it; that you can change the software and use pieces of it in new free
programs; and that you are informed that you can do these things.
To protect your rights, we need to make restrictions that forbid
distributors to deny you these rights or to ask you to surrender these
rights. These restrictions translate to certain responsibilities for you if
you distribute copies of the library or if you modify it.
For example, if you distribute copies of the library, whether gratis or for
a fee, you must give the recipients all the rights that we gave you. You
must make sure that they, too, receive or can get the source code. If you
link other code with the library, you must provide complete object files to
the recipients, so that they can relink them with the library after making
changes to the library and recompiling it. And you must show them these
terms so they know their rights.
We protect your rights with a two-step method: (1) we copyright the
library, and (2) we offer you this license, which gives you legal
permission to copy, distribute and/or modify the library.
To protect each distributor, we want to make it very clear that there is no
warranty for the free library. Also, if the library is modified by someone
else and passed on, the recipients should know that what they have is not
the original version, so that the original author's reputation will not be
affected by problems that might be introduced by others.
Finally, software patents pose a constant threat to the existence of any
free program. We wish to make sure that a company cannot effectively
restrict the users of a free program by obtaining a restrictive license
from a patent holder. Therefore, we insist that any patent license obtained
for a version of the library must be consistent with the full freedom of
use specified in this license.
Most GNU software, including some libraries, is covered by the ordinary GNU
General Public License. This license, the GNU Lesser General Public
License, applies to certain designated libraries, and is quite different
from the ordinary General Public License. We use this license for certain
libraries in order to permit linking those libraries into non-free
programs.
When a program is linked with a library, whether statically or using a
shared library, the combination of the two is legally speaking a combined
work, a derivative of the original library. The ordinary General Public
License therefore permits such linking only if the entire combination fits
its criteria of freedom. The Lesser General Public License permits more lax
criteria for linking other code with the library.
We call this license the "Lesser" General Public License because it does
Less to protect the user's freedom than the ordinary General Public
License. It also provides other free software developers Less of an
advantage over competing non-free programs. These disadvantages are the
reason we use the ordinary General Public License for many
libraries. However, the Lesser license provides advantages in certain
special circumstances.
For example, on rare occasions, there may be a special need to encourage
the widest possible use of a certain library, so that it becomes a de-facto
standard. To achieve this, non-free programs must be allowed to use the
library. A more frequent case is that a free library does the same job as
widely used non-free libraries. In this case, there is little to gain by
limiting the free library to free software only, so we use the Lesser
General Public License.
In other cases, permission to use a particular library in non-free programs
enables a greater number of people to use a large body of free
software. For example, permission to use the GNU C Library in non-free
programs enables many more people to use the whole GNU operating system, as
well as its variant, the GNU/Linux operating system.
Although the Lesser General Public License is Less protective of the users'
freedom, it does ensure that the user of a program that is linked with the
Library has the freedom and the wherewithal to run that program using a
modified version of the Library.
The precise terms and conditions for copying, distribution and modification
follow. Pay close attention to the difference between a "work based on the
library" and a "work that uses the library". The former contains code
derived from the library, whereas the latter must be combined with the
library in order to run.
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License Agreement applies to any software library or other program
which contains a notice placed by the copyright holder or other
authorized party saying it may be distributed under the terms of this
Lesser General Public License (also called "this License"). Each
licensee is addressed as "you".
A "library" means a collection of software functions and/or data
prepared so as to be conveniently linked with application programs
(which use some of those functions and data) to form executables.
The "Library", below, refers to any such software library or work which
has been distributed under these terms. A "work based on the Library"
means either the Library or any derivative work under copyright law:
that is to say, a work containing the Library or a portion of it, either
verbatim or with modifications and/or translated straightforwardly into
another language. (Hereinafter, translation is included without
limitation in the term "modification".)
"Source code" for a work means the preferred form of the work for making
modifications to it. For a library, complete source code means all the
source code for all modules it contains, plus any associated interface
definition files, plus the scripts used to control compilation and
installation of the library.
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of running
a program using the Library is not restricted, and output from such a
program is covered only if its contents constitute a work based on the
Library (independent of the use of the Library in a tool for writing
it). Whether that is true depends on what the Library does and what the
program that uses the Library does.
1. You may copy and distribute verbatim copies of the Library's complete
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the notices
that refer to this License and to the absence of any warranty; and
distribute a copy of this License along with the Library.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Library or any portion of it,
thus forming a work based on the Library, and copy and distribute such
modifications or work under the terms of Section 1 above, provided that
you also meet all of these conditions:
a) The modified work must itself be a software library.
b) You must cause the files modified to carry prominent notices stating
that you changed the files and the date of any change.
c) You must cause the whole of the work to be licensed at no charge to
all third parties under the terms of this License.
d) If a facility in the modified Library refers to a function or a table
of data to be supplied by an application program that uses the
facility, other than as an argument passed when the facility is
invoked, then you must make a good faith effort to ensure that, in
the event an application does not supply such function or table, the
facility still operates, and performs whatever part of its purpose
remains meaningful.
(For example, a function in a library to compute square roots has a
purpose that is entirely well-defined independent of the
application. Therefore, Subsection 2d requires that any
application-supplied function or table used by this function must be
optional: if the application does not supply it, the square root
function must still compute square roots.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Library, and
can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based on
the Library, the distribution of the whole must be on the terms of this
License, whose permissions for other licensees extend to the entire
whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Library.
In addition, mere aggregation of another work not based on the Library
with the Library (or with a work based on the Library) on a volume of a
storage or distribution medium does not bring the other work under the
scope of this License.
3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library. To do
this, you must alter all the notices that refer to this License, so that
they refer to the ordinary GNU General Public License, version 2,
instead of to this License. (If a newer version than version 2 of the
ordinary GNU General Public License has appeared, then you can specify
that version instead if you wish.) Do not make any other change in these
notices.
Once this change is made in a given copy, it is irreversible for that
copy, so the ordinary GNU General Public License applies to all
subsequent copies and derivative works made from that copy.
This option is useful when you wish to copy part of the code of the
Library into a program that is not a library.
4. You may copy and distribute the Library (or a portion or derivative of
it, under Section 2) in object code or executable form under the terms
of Sections 1 and 2 above provided that you accompany it with the
complete corresponding machine-readable source code, which must be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange.
If distribution of object code is made by offering access to copy from a
designated place, then offering equivalent access to copy the source
code from the same place satisfies the requirement to distribute the
source code, even though third parties are not compelled to copy the
source along with the object code.
5. A program that contains no derivative of any portion of the Library, but
is designed to work with the Library by being compiled or linked with
it, is called a "work that uses the Library". Such a work, in isolation,
is not a derivative work of the Library, and therefore falls outside the
scope of this License.
However, linking a "work that uses the Library" with the Library creates
an executable that is a derivative of the Library (because it contains
portions of the Library), rather than a "work that uses the
library". The executable is therefore covered by this License. Section 6
states terms for distribution of such executables.
When a "work that uses the Library" uses material from a header file
that is part of the Library, the object code for the work may be a
derivative work of the Library even though the source code is
not. Whether this is true is especially significant if the work can be
linked without the Library, or if the work is itself a library. The
threshold for this to be true is not precisely defined by law.
If such an object file uses only numerical parameters, data structure
layouts and accessors, and small macros and small inline functions (ten
lines or less in length), then the use of the object file is
unrestricted, regardless of whether it is legally a derivative
work. (Executables containing this object code plus portions of the
Library will still fall under Section 6.)
Otherwise, if the work is a derivative of the Library, you may
distribute the object code for the work under the terms of Section
6. Any executables containing that work also fall under Section 6,
whether or not they are linked directly with the Library itself.
6. As an exception to the Sections above, you may also combine or link a
"work that uses the Library" with the Library to produce a work
containing portions of the Library, and distribute that work under terms
of your choice, provided that the terms permit modification of the work
for the customer's own use and reverse engineering for debugging such
modifications.
You must give prominent notice with each copy of the work that the
Library is used in it and that the Library and its use are covered by
this License. You must supply a copy of this License. If the work during
execution displays copyright notices, you must include the copyright
notice for the Library among them, as well as a reference directing the
user to the copy of this License. Also, you must do one of these things:
a) Accompany the work with the complete corresponding machine-readable
source code for the Library including whatever changes were used in
the work (which must be distributed under Sections 1 and 2 above);
and, if the work is an executable linked with the Library, with the
complete machine-readable "work that uses the Library", as object
code and/or source code, so that the user can modify the Library and
then relink to produce a modified executable containing the modified
Library. (It is understood that the user who changes the contents of
definitions files in the Library will not necessarily be able to
recompile the application to use the modified definitions.)
b) Use a suitable shared library mechanism for linking with the
Library. A suitable mechanism is one that (1) uses at run time a copy
of the library already present on the user's computer system, rather
than copying library functions into the executable, and (2) will
operate properly with a modified version of the library, if the user
installs one, as long as the modified version is interface-compatible
with the version that the work was made with.
c) Accompany the work with a written offer, valid for at least three
years, to give the same user the materials specified in Subsection
6a, above, for a charge no more than the cost of performing this
distribution.
d) If distribution of the work is made by offering access to copy from a
designated place, offer equivalent access to copy the above specified
materials from the same place.
e) Verify that the user has already received a copy of these materials
or that you have already sent this user a copy.
For an executable, the required form of the "work that uses the Library"
must include any data and utility programs needed for reproducing the
executable from it. However, as a special exception, the materials to be
distributed need not include anything that is normally distributed (in
either source or binary form) with the major components (compiler,
kernel, and so on) of the operating system on which the executable runs,
unless that component itself accompanies the executable.
It may happen that this requirement contradicts the license restrictions
of other proprietary libraries that do not normally accompany the
operating system. Such a contradiction means you cannot use both them
and the Library together in an executable that you distribute.
7. You may place library facilities that are a work based on the Library
side-by-side in a single library together with other library facilities
not covered by this License, and distribute such a combined library,
provided that the separate distribution of the work based on the Library
and of the other library facilities is otherwise permitted, and provided
that you do these two things:
a) Accompany the combined library with a copy of the same work based on
the Library, uncombined with any other library facilities. This must
be distributed under the terms of the Sections above.
b) Give prominent notice with the combined library of the fact that part
of it is a work based on the Library, and explaining where to find
the accompanying uncombined form of the same work.
8. You may not copy, modify, sublicense, link with, or distribute the
Library except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense, link with, or distribute the
Library is void, and will automatically terminate your rights under this
License. However, parties who have received copies, or rights, from you
under this License will not have their licenses terminated so long as
such parties remain in full compliance.
9. You are not required to accept this License, since you have not signed
it. However, nothing else grants you permission to modify or distribute
the Library or its derivative works. These actions are prohibited by law
if you do not accept this License. Therefore, by modifying or
distributing the Library (or any work based on the Library), you
indicate your acceptance of this License to do so, and all its terms and
conditions for copying, distributing or modifying the Library or works
based on it.
10. Each time you redistribute the Library (or any work based on the
Library), the recipient automatically receives a license from the
original licensor to copy, distribute, link with or modify the Library
subject to these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted
herein. You are not responsible for enforcing compliance by third
parties with this License.
11. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Library at all. For example, if a patent license
would not permit royalty-free redistribution of the Library by all
those who receive copies directly or indirectly through you, then the
only way you could satisfy both it and this License would be to refrain
entirely from distribution of the Library.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply, and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system which is implemented
by public license practices. Many people have made generous
contributions to the wide range of software distributed through that
system in reliance on consistent application of that system; it is up
to the author/donor to decide if he or she is willing to distribute
software through any other system and a licensee cannot impose that
choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
12. If the distribution and/or use of the Library is restricted in certain
countries either by patents or by copyrighted interfaces, the original
copyright holder who places the Library under this License may add an
explicit geographical distribution limitation excluding those
countries, so that distribution is permitted only in or among countries
not thus excluded. In such case, this License incorporates the
limitation as if written in the body of this License.
13. The Free Software Foundation may publish revised and/or new versions of
the Lesser General Public License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in
detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Library
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and
conditions either of that version or of any later version published by
the Free Software Foundation. If the Library does not specify a license
version number, you may choose any version ever published by the Free
Software Foundation.
14. If you wish to incorporate parts of the Library into other free
programs whose distribution conditions are incompatible with these,
write to the author to ask for permission. For software which is
copyrighted by the Free Software Foundation, write to the Free Software
Foundation; we sometimes make exceptions for this. Our decision will be
guided by the two goals of preserving the free status of all
derivatives of our free software and of promoting the sharing and reuse
of software generally.
NO WARRANTY
15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE
ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH
YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
NECESSARY SERVICING, REPAIR OR CORRECTION.
16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR
DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL
DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY
(INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED
INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF
THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR
OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Libraries
If you develop a new library, and you want it to be of the greatest
possible use to the public, we recommend making it free software that
everyone can redistribute and change. You can do so by permitting
redistribution under these terms (or, alternatively, under the terms of the
ordinary General Public License).
To apply these terms, attach the following notices to the library. It is
safest to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least the
"copyright" line and a pointer to where the full notice is found.
one line to give the library's name and an idea of what it does.
Copyright (C) year name of author
This library is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1 of the License, or (at
your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
for more details.
You should have received a copy of the GNU Lesser General Public License
along with this library; if not, write to the Free Software Foundation,
Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Also add
information on how to contact you by electronic and paper mail.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the library, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in
the library `Frob' (a library for tweaking knobs) written
by James Random Hacker.
signature of Ty Coon, 1 April 1990
Ty Coon, President of Vice
That's all there is to it!

141
README.md

@@ -1,29 +1,48 @@
This is a mirror of [bpf-next linux tree](https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next)'s
This is a mirror of [bpf-next Linux source
tree](https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next)'s
`tools/lib/bpf` directory plus its supporting header files.
The following files will be sync'ed with the bpf-next repo:
- `src/` <-> `bpf-next/tools/lib/bpf/`
- `include/uapi/linux/bpf_common.h` <-> `bpf-next/tools/include/uapi/linux/bpf_common.h`
- `include/uapi/linux/bpf.h` <-> `bpf-next/tools/include/uapi/linux/bpf.h`
- `include/uapi/linux/btf.h` <-> `bpf-next/tools/include/uapi/linux/btf.h`
- `include/uapi/linux/if_link.h` <-> `bpf-next/tools/include/uapi/linux/if_link.h`
- `include/uapi/linux/if_xdp.h` <-> `bpf-next/tools/include/uapi/linux/if_xdp.h`
- `include/uapi/linux/netlink.h` <-> `bpf-next/tools/include/uapi/linux/netlink.h`
- `include/tools/libc_compat.h` <-> `bpf-next/tools/include/tools/libc_compat.h`
All the gory details of syncing can be found in `scripts/sync-kernel.sh`
script.
Other header files at this repo (`include/linux/*.h`) are reduced versions of
their counterpart files at bpf-next's `tools/include/linux/*.h` to make compilation
successful.
Some header files in this repo (`include/linux/*.h`) are reduced versions of
their counterpart files at
[bpf-next](https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/)'s
`tools/include/linux/*.h` to make compilation successful.
Build [![Build Status](https://travis-ci.org/libbpf/libbpf.svg?branch=master)](https://travis-ci.org/libbpf/libbpf)
BPF/libbpf usage and questions
==============================
Please check out [libbpf-bootstrap](https://github.com/libbpf/libbpf-bootstrap)
and [the companion blog post](https://nakryiko.com/posts/libbpf-bootstrap/) for
the examples of building BPF applications with libbpf.
[libbpf-tools](https://github.com/iovisor/bcc/tree/master/libbpf-tools) are also
a good source of the real-world libbpf-based tracing tools.
All general BPF questions, including kernel functionality, libbpf APIs and
their application, should be sent to bpf@vger.kernel.org mailing list. You can
subscribe to it [here](http://vger.kernel.org/vger-lists.html#bpf) and search
its archive [here](https://lore.kernel.org/bpf/). Please search the archive
before asking new questions. It very well might be that this was already
addressed or answered before.
bpf@vger.kernel.org is monitored by many more people and they will happily try
to help you with whatever issue you have. This repository's PRs and issues
should be opened only for dealing with issues pertaining to the specific way this
libbpf mirror repo is set up and organized.
Build
[![Github Actions Builds & Tests](https://github.com/libbpf/libbpf/actions/workflows/test.yml/badge.svg)](https://github.com/libbpf/libbpf/actions/workflows/test.yml)
[![Total alerts](https://img.shields.io/lgtm/alerts/g/libbpf/libbpf.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/libbpf/libbpf/alerts/)
[![Coverity](https://img.shields.io/coverity/scan/18195.svg)](https://scan.coverity.com/projects/libbpf)
=====
libelf is an internal dependency of libbpf and thus it is required to link
against and must be installed on the system for applications to work.
pkg-config is used by default to find libelf, and the program called can be
overridden with `PKG_CONFIG`.
If using `pkg-config` at build time is not desired, it can be disabled by setting
`NO_PKG_CONFIG=1` when calling make.
If using `pkg-config` at build time is not desired, it can be disabled by
setting `NO_PKG_CONFIG=1` when calling make.
To build both static libbpf.a and shared libbpf.so:
```bash
@@ -47,3 +66,91 @@ headers in a build directory /build/root/:
$ cd src
$ PKG_CONFIG_PATH=/build/root/lib64/pkgconfig DESTDIR=/build/root make install
```
Distributions
=============
Distributions packaging libbpf from this mirror:
- [Fedora](https://src.fedoraproject.org/rpms/libbpf)
- [Gentoo](https://packages.gentoo.org/packages/dev-libs/libbpf)
- [Debian](https://packages.debian.org/source/sid/libbpf)
- [Arch](https://www.archlinux.org/packages/extra/x86_64/libbpf/)
- [Ubuntu](https://packages.ubuntu.com/source/groovy/libbpf)
- [Alpine](https://pkgs.alpinelinux.org/packages?name=libbpf)
Benefits of packaging from the mirror over packaging from kernel sources:
- Consistent versioning across distributions.
- No ties to any specific kernel, transparent handling of older kernels.
Libbpf is designed to be kernel-agnostic and to work across a multitude of
kernel versions. It has built-in mechanisms to gracefully handle older
kernels that are missing some of the features, by working around them or
gracefully degrading functionality. Thus libbpf is not tied to a specific
kernel version and can/should be packaged and versioned independently.
- Continuous integration testing via
[TravisCI](https://travis-ci.org/libbpf/libbpf).
- Static code analysis via [LGTM](https://lgtm.com/projects/g/libbpf/libbpf)
and [Coverity](https://scan.coverity.com/projects/libbpf).
Package dependencies of libbpf, package names may vary across distros:
- zlib
- libelf
BPF CO-RE (Compile Once Run Everywhere)
=========================================
Libbpf supports building BPF CO-RE-enabled applications, which, in contrast to
[BCC](https://github.com/iovisor/bcc/), do not require the Clang/LLVM runtime
to be deployed to target servers and do not rely on kernel-devel headers
being available.
It does rely on the kernel being built with [BTF type
information](https://www.kernel.org/doc/html/latest/bpf/btf.html), though.
Some major Linux distributions come with kernel BTF already built in:
- Fedora 31+
- RHEL 8.2+
- OpenSUSE Tumbleweed (in the next release, as of 2020-06-04)
- Arch Linux (from kernel 5.7.1.arch1-1)
- Manjaro (from kernel 5.4 if compiled after 2021-06-18)
- Ubuntu 20.10
- Debian 11 (amd64/arm64)
If your kernel doesn't come with BTF built-in, you'll need to build a custom
kernel. You'll need:
- `pahole` 1.16+ tool (part of `dwarves` package), which performs DWARF to
BTF conversion;
- kernel built with `CONFIG_DEBUG_INFO_BTF=y` option;
- you can check if your kernel has BTF built-in by looking for
`/sys/kernel/btf/vmlinux` file:
```shell
$ ls -la /sys/kernel/btf/vmlinux
-r--r--r--. 1 root root 3541561 Jun 2 18:16 /sys/kernel/btf/vmlinux
```
To develop and build BPF programs, you'll need Clang/LLVM 10+. The following
distributions have Clang/LLVM 10+ packaged by default:
- Fedora 32+
- Ubuntu 20.04+
- Arch Linux
- Ubuntu 20.10 (LLVM 11)
- Debian 11 (LLVM 11)
- Alpine 3.13+
Otherwise, please make sure to update it on your system.
The following resources are useful to understand what BPF CO-RE is and how to
use it:
- [BPF Portability and CO-RE](https://nakryiko.com/posts/bpf-portability-and-co-re/)
- [HOWTO: BCC to libbpf conversion](https://nakryiko.com/posts/bcc-to-libbpf-howto-guide/)
- [libbpf-tools in BCC repo](https://github.com/iovisor/bcc/tree/master/libbpf-tools)
contain lots of real-world tools converted from BCC to BPF CO-RE. Consider
converting some more to both contribute to the BPF community and gain some
more experience with it.
License
=======
This work is dual-licensed under BSD 2-clause license and GNU LGPL v2.1 license.
You can choose between one of them if you use this work.
`SPDX-License-Identifier: BSD-2-Clause OR LGPL-2.1`

2
docs/.gitignore vendored Normal file

@@ -0,0 +1,2 @@
sphinx/build
sphinx/doxygen/build

51
docs/api.rst Normal file

@@ -0,0 +1,51 @@
.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
.. _api:
.. toctree:: Table of Contents
LIBBPF API
==================
libbpf.h
--------
.. doxygenfile:: libbpf.h
:project: libbpf
:sections: func define public-type enum
bpf.h
-----
.. doxygenfile:: bpf.h
:project: libbpf
:sections: func define public-type enum
btf.h
-----
.. doxygenfile:: btf.h
:project: libbpf
:sections: func define public-type enum
xsk.h
-----
.. doxygenfile:: xsk.h
:project: libbpf
:sections: func define public-type enum
bpf_tracing.h
-------------
.. doxygenfile:: bpf_tracing.h
:project: libbpf
:sections: func define public-type enum
bpf_core_read.h
---------------
.. doxygenfile:: bpf_core_read.h
:project: libbpf
:sections: func define public-type enum
bpf_endian.h
------------
.. doxygenfile:: bpf_endian.h
:project: libbpf
:sections: func define public-type enum

40
docs/conf.py Normal file

@@ -0,0 +1,40 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

import os
import subprocess

project = "libbpf"

extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.doctest',
    'sphinx.ext.mathjax',
    'sphinx.ext.viewcode',
    'sphinx.ext.imgmath',
    'sphinx.ext.todo',
    'breathe',
]

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []

read_the_docs_build = os.environ.get('READTHEDOCS', None) == 'True'

if read_the_docs_build:
    subprocess.call('cd sphinx ; make clean', shell=True)
    subprocess.call('cd sphinx/doxygen ; doxygen', shell=True)

html_theme = 'sphinx_rtd_theme'

breathe_projects = { "libbpf": "./sphinx/doxygen/build/xml/" }
breathe_default_project = "libbpf"
breathe_show_define_initializer = True
breathe_show_enumvalue_initializer = True

22
docs/index.rst Normal file

@@ -0,0 +1,22 @@
.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
libbpf
======
For API documentation see the `versioned API documentation site <https://libbpf.readthedocs.io/en/latest/api.html>`_.
.. toctree::
:maxdepth: 1
libbpf_naming_convention
libbpf_build
This is documentation for libbpf, a userspace library for loading and
interacting with bpf programs.
All general BPF questions, including kernel functionality, libbpf APIs and
their application, should be sent to bpf@vger.kernel.org mailing list.
You can `subscribe <http://vger.kernel.org/vger-lists.html#bpf>`_ to the
mailing list and search its `archive <https://lore.kernel.org/bpf/>`_.
Please search the archive before asking new questions. It very well might
be that this was already addressed or answered before.

37
docs/libbpf_build.rst Normal file

@@ -0,0 +1,37 @@
.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
Building libbpf
===============
libelf and zlib are internal dependencies of libbpf and thus are required to link
against and must be installed on the system for applications to work.
pkg-config is used by default to find libelf, and the program called
can be overridden with PKG_CONFIG.
If using pkg-config at build time is not desired, it can be disabled by
setting NO_PKG_CONFIG=1 when calling make.
To build both static libbpf.a and shared libbpf.so:
.. code-block:: bash
$ cd src
$ make
To build only static libbpf.a library in directory build/ and install them
together with libbpf headers in a staging directory root/:
.. code-block:: bash
$ cd src
$ mkdir build root
$ BUILD_STATIC_ONLY=y OBJDIR=build DESTDIR=root make install
To build both static libbpf.a and shared libbpf.so against a custom libelf
dependency installed in /build/root/ and install them together with libbpf
headers in a build directory /build/root/:
.. code-block:: bash
$ cd src
$ PKG_CONFIG_PATH=/build/root/lib64/pkgconfig DESTDIR=/build/root make

docs/libbpf_naming_convention.rst

@@ -1,7 +1,7 @@
.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
libbpf API naming convention
============================
API naming convention
=====================
libbpf API provides access to a few logically separated groups of
functions and types. Every group has its own naming convention
@@ -10,14 +10,14 @@ new function or type is added to keep libbpf API clean and consistent.
All types and functions provided by libbpf API should have one of the
following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``xsk_``,
``perf_buffer_``.
``btf_dump_``, ``ring_buffer_``, ``perf_buffer_``.
System call wrappers
--------------------
System call wrappers are simple wrappers for commands supported by
sys_bpf system call. These wrappers should go to ``bpf.h`` header file
and map one-on-one to corresponding commands.
and map one to one to corresponding commands.
For example ``bpf_map_lookup_elem`` wraps ``BPF_MAP_LOOKUP_ELEM``
command of sys_bpf, ``bpf_prog_attach`` wraps ``BPF_PROG_ATTACH``, etc.
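For illustration, a minimal sketch of calling such a wrapper from application
code (a hedged example; it assumes ``map_fd`` is a valid BPF map file
descriptor, e.g. obtained with ``bpf_obj_get()``, and that libbpf headers are
installed under the ``bpf/`` include prefix):

.. code-block:: c

    #include <bpf/bpf.h>

    /* Read one value out of an existing BPF map. The call below is a thin
     * wrapper around the BPF_MAP_LOOKUP_ELEM command of sys_bpf. */
    static int lookup_counter(int map_fd, __u32 key, __u64 *value)
    {
            return bpf_map_lookup_elem(map_fd, &key, value);
    }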
@@ -49,10 +49,6 @@ object, ``bpf_object``, double underscore and ``open`` that defines the
purpose of the function to open ELF file and create ``bpf_object`` from
it.
Another example: ``bpf_program__load`` is named for corresponding
object, ``bpf_program``, that is separated from other part of the name
by double underscore.
All objects and corresponding functions other than BTF related should go
to ``libbpf.h``. BTF types and functions should go to ``btf.h``.
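As a brief, hedged sketch of the ``<object>__<action>`` pattern described
above in use (assuming a BPF ELF object file at ``path``):

.. code-block:: c

    #include <bpf/libbpf.h>

    static int open_and_load(const char *path)
    {
            /* bpf_object (object) + "__" + open/load/close (action) */
            struct bpf_object *obj = bpf_object__open(path);
            int err;

            if (libbpf_get_error(obj))
                    return -1;

            err = bpf_object__load(obj);
            bpf_object__close(obj);
            return err;
    }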
@@ -72,12 +68,8 @@ of both low-level ring access functions and high-level configuration
functions. These can be mixed and matched. Note that these functions
are not reentrant for performance reasons.
Please take a look at Documentation/networking/af_xdp.rst in the Linux
kernel source tree on how to use XDP sockets and for some common
mistakes in case you do not get any traffic up to user space.
libbpf ABI
==========
ABI
---
libbpf can be both linked statically or used as DSO. To avoid possible
conflicts with other libraries an application is linked with, all
@@ -116,7 +108,8 @@ This bump in ABI version is at most once per kernel development cycle.
For example, if current state of ``libbpf.map`` is:
.. code-block::
.. code-block:: none
LIBBPF_0.0.1 {
global:
bpf_func_a;
@@ -128,7 +121,8 @@ For example, if current state of ``libbpf.map`` is:
, and a new symbol ``bpf_func_c`` is being introduced, then
``libbpf.map`` should be changed like this:
.. code-block::
.. code-block:: none
LIBBPF_0.0.1 {
global:
bpf_func_a;
@@ -148,7 +142,7 @@ Format of version script and ways to handle ABI changes, including
incompatible ones, described in details in [1].
Stand-alone build
=================
-------------------
Under https://github.com/libbpf/libbpf there is a (semi-)automated
mirror of the mainline's version of libbpf for a stand-alone build.
@@ -156,13 +150,53 @@ mirror of the mainline's version of libbpf for a stand-alone build.
However, all changes to libbpf's code base must be upstreamed through
the mainline kernel tree.
API documentation convention
============================
The libbpf API is documented via comments above definitions in
header files. These comments can be rendered by doxygen and sphinx
for well organized html output. This section describes the
convention in which these comments should be formatted.
Here is an example from btf.h:
.. code-block:: c
/**
* @brief **btf__new()** creates a new instance of a BTF object from the raw
* bytes of an ELF's BTF section
* @param data raw bytes
* @param size number of bytes passed in `data`
* @return new BTF object instance which has to be eventually freed with
* **btf__free()**
*
* On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
* error code from such a pointer `libbpf_get_error()` should be used. If
* `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
* returned on error instead. In both cases thread-local `errno` variable is
* always set to error code as well.
*/
The comment must start with a block comment of the form '/\*\*'.
The documentation always starts with a @brief directive. This line is a short
description about this API. It starts with the name of the API, denoted in bold
like so: **api_name**. Please include an open and close parenthesis if this is a
function. Follow with the short description of the API. A longer form description
can be added below the last directive, at the bottom of the comment.
Parameters are denoted with the @param directive; there should be one for each
parameter. If this is a function with a non-void return, use the @return directive
to document it.
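Following that convention, caller code for the ``btf__new()`` example above
would look roughly like this (a hedged sketch, assuming ``data`` and ``size``
hold the raw bytes of an ELF BTF section):

.. code-block:: c

    #include <bpf/btf.h>
    #include <bpf/libbpf.h>   /* libbpf_get_error() */

    static struct btf *parse_raw_btf(const void *data, __u32 size)
    {
            struct btf *btf = btf__new(data, size);

            /* On error an error-code-encoded pointer is returned (unless
             * LIBBPF_STRICT_CLEAN_PTRS is enabled); check it before use. */
            if (libbpf_get_error(btf))
                    return NULL;
            return btf;
    }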
License
=======
-------------------
libbpf is dual-licensed under LGPL 2.1 and BSD 2-Clause.
Links
=====
-------------------
[1] https://www.akkadia.org/drepper/dsohowto.pdf
(Chapter 3. Maintaining APIs and ABIs).

9
docs/sphinx/Makefile Normal file

@@ -0,0 +1,9 @@
SPHINXBUILD ?= sphinx-build
SOURCEDIR = ../src
BUILDDIR = build

help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)"

%:
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)"

docs/sphinx/doxygen/Doxyfile

@@ -0,0 +1,277 @@
DOXYFILE_ENCODING = UTF-8
PROJECT_NAME = "libbpf"
PROJECT_NUMBER =
PROJECT_BRIEF =
PROJECT_LOGO =
OUTPUT_DIRECTORY = ./build
CREATE_SUBDIRS = NO
ALLOW_UNICODE_NAMES = NO
OUTPUT_LANGUAGE = English
OUTPUT_TEXT_DIRECTION = None
BRIEF_MEMBER_DESC = YES
REPEAT_BRIEF = YES
ALWAYS_DETAILED_SEC = NO
INLINE_INHERITED_MEMB = NO
FULL_PATH_NAMES = YES
STRIP_FROM_PATH =
STRIP_FROM_INC_PATH =
SHORT_NAMES = NO
JAVADOC_AUTOBRIEF = NO
JAVADOC_BANNER = NO
QT_AUTOBRIEF = NO
MULTILINE_CPP_IS_BRIEF = NO
PYTHON_DOCSTRING = NO
INHERIT_DOCS = YES
SEPARATE_MEMBER_PAGES = NO
TAB_SIZE = 4
ALIASES =
OPTIMIZE_OUTPUT_FOR_C = YES
OPTIMIZE_OUTPUT_JAVA = NO
OPTIMIZE_FOR_FORTRAN = NO
OPTIMIZE_OUTPUT_VHDL = NO
OPTIMIZE_OUTPUT_SLICE = NO
EXTENSION_MAPPING =
MARKDOWN_SUPPORT = YES
TOC_INCLUDE_HEADINGS = 5
AUTOLINK_SUPPORT = YES
BUILTIN_STL_SUPPORT = NO
CPP_CLI_SUPPORT = NO
SIP_SUPPORT = NO
IDL_PROPERTY_SUPPORT = YES
DISTRIBUTE_GROUP_DOC = NO
GROUP_NESTED_COMPOUNDS = NO
SUBGROUPING = YES
INLINE_GROUPED_CLASSES = NO
INLINE_SIMPLE_STRUCTS = NO
TYPEDEF_HIDES_STRUCT = NO
LOOKUP_CACHE_SIZE = 0
NUM_PROC_THREADS = 1
EXTRACT_ALL = NO
EXTRACT_PRIVATE = NO
EXTRACT_PRIV_VIRTUAL = NO
EXTRACT_PACKAGE = NO
EXTRACT_STATIC = NO
EXTRACT_LOCAL_CLASSES = YES
EXTRACT_LOCAL_METHODS = NO
EXTRACT_ANON_NSPACES = NO
RESOLVE_UNNAMED_PARAMS = YES
HIDE_UNDOC_MEMBERS = NO
HIDE_UNDOC_CLASSES = NO
HIDE_FRIEND_COMPOUNDS = NO
HIDE_IN_BODY_DOCS = NO
INTERNAL_DOCS = NO
CASE_SENSE_NAMES = YES
HIDE_SCOPE_NAMES = NO
HIDE_COMPOUND_REFERENCE= NO
SHOW_INCLUDE_FILES = YES
SHOW_GROUPED_MEMB_INC = NO
FORCE_LOCAL_INCLUDES = NO
INLINE_INFO = YES
SORT_MEMBER_DOCS = YES
SORT_BRIEF_DOCS = NO
SORT_MEMBERS_CTORS_1ST = NO
SORT_GROUP_NAMES = NO
SORT_BY_SCOPE_NAME = NO
STRICT_PROTO_MATCHING = NO
GENERATE_TODOLIST = YES
GENERATE_TESTLIST = YES
GENERATE_BUGLIST = YES
GENERATE_DEPRECATEDLIST= YES
ENABLED_SECTIONS =
MAX_INITIALIZER_LINES = 30
SHOW_USED_FILES = YES
SHOW_FILES = YES
SHOW_NAMESPACES = YES
FILE_VERSION_FILTER =
LAYOUT_FILE =
CITE_BIB_FILES =
QUIET = NO
WARNINGS = YES
WARN_IF_UNDOCUMENTED = YES
WARN_IF_DOC_ERROR = YES
WARN_NO_PARAMDOC = NO
WARN_AS_ERROR = NO
WARN_FORMAT = "$file:$line: $text"
WARN_LOGFILE =
INPUT = ../../../src
INPUT_ENCODING = UTF-8
FILE_PATTERNS = *.c \
*.h
RECURSIVE = NO
EXCLUDE =
EXCLUDE_SYMLINKS = NO
EXCLUDE_PATTERNS =
EXCLUDE_SYMBOLS = ___*
EXAMPLE_PATH =
EXAMPLE_PATTERNS = *
EXAMPLE_RECURSIVE = NO
IMAGE_PATH =
INPUT_FILTER =
FILTER_PATTERNS =
FILTER_SOURCE_FILES = NO
FILTER_SOURCE_PATTERNS =
USE_MDFILE_AS_MAINPAGE = YES
SOURCE_BROWSER = NO
INLINE_SOURCES = NO
STRIP_CODE_COMMENTS = YES
REFERENCED_BY_RELATION = NO
REFERENCES_RELATION = NO
REFERENCES_LINK_SOURCE = YES
SOURCE_TOOLTIPS = YES
USE_HTAGS = NO
VERBATIM_HEADERS = YES
ALPHABETICAL_INDEX = YES
IGNORE_PREFIX =
GENERATE_HTML = NO
HTML_OUTPUT = html
HTML_FILE_EXTENSION = .html
HTML_HEADER =
HTML_FOOTER =
HTML_STYLESHEET =
HTML_EXTRA_STYLESHEET =
HTML_EXTRA_FILES =
HTML_COLORSTYLE_HUE = 220
HTML_COLORSTYLE_SAT = 100
HTML_COLORSTYLE_GAMMA = 80
HTML_TIMESTAMP = NO
HTML_DYNAMIC_MENUS = YES
HTML_DYNAMIC_SECTIONS = NO
HTML_INDEX_NUM_ENTRIES = 100
GENERATE_DOCSET = NO
DOCSET_FEEDNAME = "Doxygen generated docs"
DOCSET_BUNDLE_ID = org.doxygen.Project
DOCSET_PUBLISHER_ID = org.doxygen.Publisher
DOCSET_PUBLISHER_NAME = Publisher
GENERATE_HTMLHELP = NO
CHM_FILE =
HHC_LOCATION =
GENERATE_CHI = NO
CHM_INDEX_ENCODING =
BINARY_TOC = NO
TOC_EXPAND = NO
GENERATE_QHP = NO
QCH_FILE =
QHP_NAMESPACE = org.doxygen.Project
QHP_VIRTUAL_FOLDER = doc
QHP_CUST_FILTER_NAME =
QHP_CUST_FILTER_ATTRS =
QHP_SECT_FILTER_ATTRS =
QHG_LOCATION =
GENERATE_ECLIPSEHELP = NO
ECLIPSE_DOC_ID = org.doxygen.Project
DISABLE_INDEX = NO
GENERATE_TREEVIEW = NO
ENUM_VALUES_PER_LINE = 4
TREEVIEW_WIDTH = 250
EXT_LINKS_IN_WINDOW = NO
HTML_FORMULA_FORMAT = png
FORMULA_FONTSIZE = 10
FORMULA_TRANSPARENT = YES
FORMULA_MACROFILE =
USE_MATHJAX = NO
MATHJAX_FORMAT = HTML-CSS
MATHJAX_RELPATH = https://cdn.jsdelivr.net/npm/mathjax@2
MATHJAX_EXTENSIONS =
MATHJAX_CODEFILE =
SEARCHENGINE = YES
SERVER_BASED_SEARCH = NO
EXTERNAL_SEARCH = NO
SEARCHENGINE_URL =
SEARCHDATA_FILE = searchdata.xml
EXTERNAL_SEARCH_ID =
EXTRA_SEARCH_MAPPINGS =
GENERATE_LATEX = NO
LATEX_OUTPUT = latex
LATEX_CMD_NAME =
MAKEINDEX_CMD_NAME = makeindex
LATEX_MAKEINDEX_CMD = makeindex
COMPACT_LATEX = NO
PAPER_TYPE = a4
EXTRA_PACKAGES =
LATEX_HEADER =
LATEX_FOOTER =
LATEX_EXTRA_STYLESHEET =
LATEX_EXTRA_FILES =
PDF_HYPERLINKS = YES
USE_PDFLATEX = YES
LATEX_BATCHMODE = NO
LATEX_HIDE_INDICES = NO
LATEX_SOURCE_CODE = NO
LATEX_BIB_STYLE = plain
LATEX_TIMESTAMP = NO
LATEX_EMOJI_DIRECTORY =
GENERATE_RTF = NO
RTF_OUTPUT = rtf
COMPACT_RTF = NO
RTF_HYPERLINKS = NO
RTF_STYLESHEET_FILE =
RTF_EXTENSIONS_FILE =
RTF_SOURCE_CODE = NO
GENERATE_MAN = NO
MAN_OUTPUT = man
MAN_EXTENSION = .3
MAN_SUBDIR =
MAN_LINKS = NO
GENERATE_XML = YES
XML_OUTPUT = xml
XML_PROGRAMLISTING = YES
XML_NS_MEMB_FILE_SCOPE = NO
GENERATE_DOCBOOK = NO
DOCBOOK_OUTPUT = docbook
DOCBOOK_PROGRAMLISTING = NO
GENERATE_AUTOGEN_DEF = NO
GENERATE_PERLMOD = NO
PERLMOD_LATEX = NO
PERLMOD_PRETTY = YES
PERLMOD_MAKEVAR_PREFIX =
ENABLE_PREPROCESSING = YES
MACRO_EXPANSION = NO
EXPAND_ONLY_PREDEF = YES
SEARCH_INCLUDES = YES
INCLUDE_PATH =
INCLUDE_FILE_PATTERNS =
PREDEFINED =
EXPAND_AS_DEFINED =
SKIP_FUNCTION_MACROS = NO
TAGFILES =
GENERATE_TAGFILE =
ALLEXTERNALS = NO
EXTERNAL_GROUPS = YES
EXTERNAL_PAGES = YES
CLASS_DIAGRAMS = YES
DIA_PATH =
HIDE_UNDOC_RELATIONS = YES
HAVE_DOT = NO
DOT_NUM_THREADS = 0
DOT_FONTNAME = Helvetica
DOT_FONTSIZE = 10
DOT_FONTPATH =
CLASS_GRAPH = YES
COLLABORATION_GRAPH = YES
GROUP_GRAPHS = YES
UML_LOOK = NO
UML_LIMIT_NUM_FIELDS = 10
DOT_UML_DETAILS = NO
DOT_WRAP_THRESHOLD = 17
TEMPLATE_RELATIONS = NO
INCLUDE_GRAPH = YES
INCLUDED_BY_GRAPH = YES
CALL_GRAPH = NO
CALLER_GRAPH = NO
GRAPHICAL_HIERARCHY = YES
DIRECTORY_GRAPH = YES
DOT_IMAGE_FORMAT = png
INTERACTIVE_SVG = NO
DOT_PATH =
DOTFILE_DIRS =
MSCFILE_DIRS =
DIAFILE_DIRS =
PLANTUML_JAR_PATH =
PLANTUML_CFG_FILE =
PLANTUML_INCLUDE_PATH =
DOT_GRAPH_MAX_NODES = 50
MAX_DOT_GRAPH_DEPTH = 0
DOT_TRANSPARENT = NO
DOT_MULTI_TARGETS = NO
GENERATE_LEGEND = YES
DOT_CLEANUP = YES


@@ -0,0 +1 @@
breathe


@@ -5,6 +5,14 @@
#include <linux/bpf.h>
#define BPF_RAW_INSN(CODE, DST, SRC, OFF, IMM) \
((struct bpf_insn) { \
.code = CODE, \
.dst_reg = DST, \
.src_reg = SRC, \
.off = OFF, \
.imm = IMM })
#define BPF_ALU64_IMM(OP, DST, IMM) \
((struct bpf_insn) { \
.code = BPF_ALU64 | BPF_OP(OP) | BPF_K, \
@@ -107,5 +115,12 @@
.off = OFF, \
.imm = IMM })
#define BPF_JMP32_IMM(OP, DST, IMM, OFF) \
((struct bpf_insn) { \
.code = BPF_JMP32 | BPF_OP(OP) | BPF_K, \
.dst_reg = DST, \
.src_reg = 0, \
.off = OFF, \
.imm = IMM })
#endif
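As a minimal sketch of how the instruction macros added here compose raw BPF instructions (register and opcode constants such as BPF_REG_0, BPF_MOV and BPF_JEQ come from <linux/bpf.h>; the include name of this header is assumed):

#include <linux/bpf.h>
#include "bpf_insn.h" /* assumed name/path of the header shown in this diff */

/* r0 = 1; if (w0 == 1) skip the next insn; r0 = 0; exit */
static const struct bpf_insn insns[] = {
	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
	BPF_JMP32_IMM(BPF_JEQ, BPF_REG_0, 1, 1),
	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
	BPF_RAW_INSN(BPF_JMP | BPF_EXIT, 0, 0, 0, 0),
};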


@@ -72,11 +72,20 @@ static inline void list_del(struct list_head *entry)
entry->prev = LIST_POISON2;
}
static inline int list_empty(const struct list_head *head)
{
return head->next == head;
}
#define list_entry(ptr, type, member) \
container_of(ptr, type, member)
#define list_first_entry(ptr, type, member) \
list_entry((ptr)->next, type, member)
#define list_next_entry(pos, member) \
list_entry((pos)->member.next, typeof(*(pos)), member)
#define list_for_each_entry(pos, head, member) \
for (pos = list_first_entry(head, typeof(*pos), member); \
&pos->member != (head); \
pos = list_next_entry(pos, member))
#endif
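A minimal usage sketch for the iteration helpers added above (the item struct is hypothetical; struct list_head, container_of() and list_empty() come from the same header):

struct item {
	int val;
	struct list_head node;
};

static int sum_items(struct list_head *head)
{
	struct item *pos;
	int sum = 0;

	if (list_empty(head))
		return 0;

	/* walks every embedded node and recovers the containing struct item */
	list_for_each_entry(pos, head, node)
		sum += pos->val;

	return sum;
}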


@@ -1,20 +0,0 @@
// SPDX-License-Identifier: (LGPL-2.0+ OR BSD-2-Clause)
/* Copyright (C) 2018 Netronome Systems, Inc. */
#ifndef __TOOLS_LIBC_COMPAT_H
#define __TOOLS_LIBC_COMPAT_H
#include <stdlib.h>
#include <linux/overflow.h>
#ifdef COMPAT_NEED_REALLOCARRAY
static inline void *reallocarray(void *ptr, size_t nmemb, size_t size)
{
size_t bytes;
if (unlikely(check_mul_overflow(nmemb, size, &bytes)))
return NULL;
return realloc(ptr, bytes);
}
#endif
#endif

File diff suppressed because it is too large.


@@ -22,9 +22,9 @@ struct btf_header {
};
/* Max # of type identifier */
#define BTF_MAX_TYPE 0x0000ffff
#define BTF_MAX_TYPE 0x000fffff
/* Max offset into the string section */
#define BTF_MAX_NAME_OFFSET 0x0000ffff
#define BTF_MAX_NAME_OFFSET 0x00ffffff
/* Max # of struct/union/enum members or func args */
#define BTF_MAX_VLEN 0xffff
@@ -43,7 +43,7 @@ struct btf_type {
* "size" tells the size of the type it is describing.
*
* "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
* FUNC, FUNC_PROTO and VAR.
* FUNC, FUNC_PROTO, VAR and TAG.
* "type" is a type_id referring to another type.
*/
union {
@@ -52,28 +52,33 @@ struct btf_type {
};
};
#define BTF_INFO_KIND(info) (((info) >> 24) & 0x0f)
#define BTF_INFO_KIND(info) (((info) >> 24) & 0x1f)
#define BTF_INFO_VLEN(info) ((info) & 0xffff)
#define BTF_INFO_KFLAG(info) ((info) >> 31)
#define BTF_KIND_UNKN 0 /* Unknown */
#define BTF_KIND_INT 1 /* Integer */
#define BTF_KIND_PTR 2 /* Pointer */
#define BTF_KIND_ARRAY 3 /* Array */
#define BTF_KIND_STRUCT 4 /* Struct */
#define BTF_KIND_UNION 5 /* Union */
#define BTF_KIND_ENUM 6 /* Enumeration */
#define BTF_KIND_FWD 7 /* Forward */
#define BTF_KIND_TYPEDEF 8 /* Typedef */
#define BTF_KIND_VOLATILE 9 /* Volatile */
#define BTF_KIND_CONST 10 /* Const */
#define BTF_KIND_RESTRICT 11 /* Restrict */
#define BTF_KIND_FUNC 12 /* Function */
#define BTF_KIND_FUNC_PROTO 13 /* Function Proto */
#define BTF_KIND_VAR 14 /* Variable */
#define BTF_KIND_DATASEC 15 /* Section */
#define BTF_KIND_MAX BTF_KIND_DATASEC
#define NR_BTF_KINDS (BTF_KIND_MAX + 1)
enum {
BTF_KIND_UNKN = 0, /* Unknown */
BTF_KIND_INT = 1, /* Integer */
BTF_KIND_PTR = 2, /* Pointer */
BTF_KIND_ARRAY = 3, /* Array */
BTF_KIND_STRUCT = 4, /* Struct */
BTF_KIND_UNION = 5, /* Union */
BTF_KIND_ENUM = 6, /* Enumeration */
BTF_KIND_FWD = 7, /* Forward */
BTF_KIND_TYPEDEF = 8, /* Typedef */
BTF_KIND_VOLATILE = 9, /* Volatile */
BTF_KIND_CONST = 10, /* Const */
BTF_KIND_RESTRICT = 11, /* Restrict */
BTF_KIND_FUNC = 12, /* Function */
BTF_KIND_FUNC_PROTO = 13, /* Function Proto */
BTF_KIND_VAR = 14, /* Variable */
BTF_KIND_DATASEC = 15, /* Section */
BTF_KIND_FLOAT = 16, /* Floating point */
BTF_KIND_TAG = 17, /* Tag */
NR_BTF_KINDS,
BTF_KIND_MAX = NR_BTF_KINDS - 1,
};
/* For some specific BTF_KIND, "struct btf_type" is immediately
* followed by extra data.
@@ -142,7 +147,14 @@ struct btf_param {
enum {
BTF_VAR_STATIC = 0,
BTF_VAR_GLOBAL_ALLOCATED,
BTF_VAR_GLOBAL_ALLOCATED = 1,
BTF_VAR_GLOBAL_EXTERN = 2,
};
enum btf_func_linkage {
BTF_FUNC_STATIC = 0,
BTF_FUNC_GLOBAL = 1,
BTF_FUNC_EXTERN = 2,
};
/* BTF_KIND_VAR is followed by a single "struct btf_var" to describe
@@ -162,4 +174,15 @@ struct btf_var_secinfo {
__u32 size;
};
/* BTF_KIND_TAG is followed by a single "struct btf_tag" to describe
* additional information related to the tag applied location.
* If component_idx == -1, the tag is applied to a struct, union,
* variable or function. Otherwise, it is applied to a struct/union
* member or a func argument, and component_idx indicates which member
* or argument (0 ... vlen-1).
*/
struct btf_tag {
__s32 component_idx;
};
#endif /* _UAPI__LINUX_BTF_H__ */
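A hedged sketch of consuming the widened kind field and the new BTF_KIND_TAG payload (assumes the updated <linux/btf.h> from this diff is in the include path; printf() only for illustration):

#include <stdio.h>
#include <linux/btf.h>

static void describe_tag(const struct btf_type *t)
{
	if (BTF_INFO_KIND(t->info) != BTF_KIND_TAG)
		return;

	/* for BTF_KIND_TAG, a struct btf_tag immediately follows struct btf_type */
	const struct btf_tag *tag = (const struct btf_tag *)(t + 1);

	if (tag->component_idx == -1)
		printf("tag applies to the struct/union/var/func as a whole\n");
	else
		printf("tag applies to member/argument #%d\n", tag->component_idx);
}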


@@ -167,6 +167,9 @@ enum {
IFLA_NEW_IFINDEX,
IFLA_MIN_MTU,
IFLA_MAX_MTU,
IFLA_PROP_LIST,
IFLA_ALT_IFNAME, /* Alternative ifname */
IFLA_PERM_ADDRESS,
__IFLA_MAX
};
@@ -227,6 +230,7 @@ enum {
IFLA_INET6_ICMP6STATS, /* statistics (icmpv6) */
IFLA_INET6_TOKEN, /* device token */
IFLA_INET6_ADDR_GEN_MODE, /* implicit address generator mode */
IFLA_INET6_RA_MTU, /* mtu carried in the RA message */
__IFLA_INET6_MAX
};
@@ -340,6 +344,8 @@ enum {
IFLA_BRPORT_NEIGH_SUPPRESS,
IFLA_BRPORT_ISOLATED,
IFLA_BRPORT_BACKUP_PORT,
IFLA_BRPORT_MRP_RING_OPEN,
IFLA_BRPORT_MRP_IN_OPEN,
__IFLA_BRPORT_MAX
};
#define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
@@ -404,6 +410,8 @@ enum {
IFLA_MACVLAN_MACADDR,
IFLA_MACVLAN_MACADDR_DATA,
IFLA_MACVLAN_MACADDR_COUNT,
IFLA_MACVLAN_BC_QUEUE_LEN,
IFLA_MACVLAN_BC_QUEUE_LEN_USED,
__IFLA_MACVLAN_MAX,
};
@@ -460,6 +468,7 @@ enum {
IFLA_MACSEC_REPLAY_PROTECT,
IFLA_MACSEC_VALIDATION,
IFLA_MACSEC_PAD,
IFLA_MACSEC_OFFLOAD,
__IFLA_MACSEC_MAX,
};
@@ -483,6 +492,14 @@ enum macsec_validation_type {
MACSEC_VALIDATE_MAX = __MACSEC_VALIDATE_END - 1,
};
enum macsec_offload {
MACSEC_OFFLOAD_OFF = 0,
MACSEC_OFFLOAD_PHY = 1,
MACSEC_OFFLOAD_MAC = 2,
__MACSEC_OFFLOAD_END,
MACSEC_OFFLOAD_MAX = __MACSEC_OFFLOAD_END - 1,
};
/* IPVLAN section */
enum {
IFLA_IPVLAN_UNSPEC,
@@ -637,6 +654,7 @@ enum {
IFLA_BOND_AD_ACTOR_SYSTEM,
IFLA_BOND_TLB_DYNAMIC_LB,
IFLA_BOND_PEER_NOTIF_DELAY,
IFLA_BOND_AD_LACP_ACTIVE,
__IFLA_BOND_MAX,
};
@@ -950,11 +968,12 @@ enum {
#define XDP_FLAGS_SKB_MODE (1U << 1)
#define XDP_FLAGS_DRV_MODE (1U << 2)
#define XDP_FLAGS_HW_MODE (1U << 3)
#define XDP_FLAGS_REPLACE (1U << 4)
#define XDP_FLAGS_MODES (XDP_FLAGS_SKB_MODE | \
XDP_FLAGS_DRV_MODE | \
XDP_FLAGS_HW_MODE)
#define XDP_FLAGS_MASK (XDP_FLAGS_UPDATE_IF_NOEXIST | \
XDP_FLAGS_MODES)
XDP_FLAGS_MODES | XDP_FLAGS_REPLACE)
/* These are stored into IFLA_XDP_ATTACHED on dump. */
enum {
@@ -974,6 +993,7 @@ enum {
IFLA_XDP_DRV_PROG_ID,
IFLA_XDP_SKB_PROG_ID,
IFLA_XDP_HW_PROG_ID,
IFLA_XDP_EXPECTED_FD,
__IFLA_XDP_MAX,
};
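The XDP_FLAGS_REPLACE / IFLA_XDP_EXPECTED_FD additions let a caller swap an XDP program only if the currently attached program is the one it expects. A hedged sketch using libbpf's netlink helper (bpf_set_link_xdp_fd_opts() and struct bpf_xdp_set_link_opts are assumed to be available; they drive these attributes under the hood):

#include <bpf/libbpf.h>

/* atomically replace old_prog_fd with new_prog_fd on ifindex; fails if a
 * different program got attached in the meantime */
static int swap_xdp_prog(int ifindex, int new_prog_fd, int old_prog_fd)
{
	DECLARE_LIBBPF_OPTS(bpf_xdp_set_link_opts, opts, .old_fd = old_prog_fd);

	return bpf_set_link_xdp_fd_opts(ifindex, new_prog_fd, 0, &opts);
}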


@@ -73,9 +73,12 @@ struct xdp_umem_reg {
};
struct xdp_statistics {
__u64 rx_dropped; /* Dropped for reasons other than invalid desc */
__u64 rx_dropped; /* Dropped for other reasons */
__u64 rx_invalid_descs; /* Dropped due to invalid descriptor */
__u64 tx_invalid_descs; /* Dropped due to invalid descriptor */
__u64 rx_ring_full; /* Dropped due to rx ring being full */
__u64 rx_fill_ring_empty_descs; /* Failed to retrieve item from fill ring */
__u64 tx_ring_empty_descs; /* Failed to retrieve item from tx ring */
};
struct xdp_options {


@@ -0,0 +1,612 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
#ifndef __LINUX_PKT_CLS_H
#define __LINUX_PKT_CLS_H
#include <linux/types.h>
#include <linux/pkt_sched.h>
#define TC_COOKIE_MAX_SIZE 16
/* Action attributes */
enum {
TCA_ACT_UNSPEC,
TCA_ACT_KIND,
TCA_ACT_OPTIONS,
TCA_ACT_INDEX,
TCA_ACT_STATS,
TCA_ACT_PAD,
TCA_ACT_COOKIE,
__TCA_ACT_MAX
};
#define TCA_ACT_MAX __TCA_ACT_MAX
#define TCA_OLD_COMPAT (TCA_ACT_MAX+1)
#define TCA_ACT_MAX_PRIO 32
#define TCA_ACT_BIND 1
#define TCA_ACT_NOBIND 0
#define TCA_ACT_UNBIND 1
#define TCA_ACT_NOUNBIND 0
#define TCA_ACT_REPLACE 1
#define TCA_ACT_NOREPLACE 0
#define TC_ACT_UNSPEC (-1)
#define TC_ACT_OK 0
#define TC_ACT_RECLASSIFY 1
#define TC_ACT_SHOT 2
#define TC_ACT_PIPE 3
#define TC_ACT_STOLEN 4
#define TC_ACT_QUEUED 5
#define TC_ACT_REPEAT 6
#define TC_ACT_REDIRECT 7
#define TC_ACT_TRAP 8 /* For hw path, this means "trap to cpu"
* and don't further process the frame
* in hardware. For sw path, this is
* equivalent of TC_ACT_STOLEN - drop
* the skb and act like everything
* is alright.
*/
#define TC_ACT_VALUE_MAX TC_ACT_TRAP
/* There is a special kind of actions called "extended actions",
* which need a value parameter. These have a local opcode located in
* the highest nibble, starting from 1. The rest of the bits
* are used to carry the value. These two parts together make
* a combined opcode.
*/
#define __TC_ACT_EXT_SHIFT 28
#define __TC_ACT_EXT(local) ((local) << __TC_ACT_EXT_SHIFT)
#define TC_ACT_EXT_VAL_MASK ((1 << __TC_ACT_EXT_SHIFT) - 1)
#define TC_ACT_EXT_OPCODE(combined) ((combined) & (~TC_ACT_EXT_VAL_MASK))
#define TC_ACT_EXT_CMP(combined, opcode) (TC_ACT_EXT_OPCODE(combined) == opcode)
#define TC_ACT_JUMP __TC_ACT_EXT(1)
#define TC_ACT_GOTO_CHAIN __TC_ACT_EXT(2)
#define TC_ACT_EXT_OPCODE_MAX TC_ACT_GOTO_CHAIN
/* Action type identifiers*/
enum {
TCA_ID_UNSPEC=0,
TCA_ID_POLICE=1,
/* other actions go here */
__TCA_ID_MAX=255
};
#define TCA_ID_MAX __TCA_ID_MAX
struct tc_police {
__u32 index;
int action;
#define TC_POLICE_UNSPEC TC_ACT_UNSPEC
#define TC_POLICE_OK TC_ACT_OK
#define TC_POLICE_RECLASSIFY TC_ACT_RECLASSIFY
#define TC_POLICE_SHOT TC_ACT_SHOT
#define TC_POLICE_PIPE TC_ACT_PIPE
__u32 limit;
__u32 burst;
__u32 mtu;
struct tc_ratespec rate;
struct tc_ratespec peakrate;
int refcnt;
int bindcnt;
__u32 capab;
};
struct tcf_t {
__u64 install;
__u64 lastuse;
__u64 expires;
__u64 firstuse;
};
struct tc_cnt {
int refcnt;
int bindcnt;
};
#define tc_gen \
__u32 index; \
__u32 capab; \
int action; \
int refcnt; \
int bindcnt
enum {
TCA_POLICE_UNSPEC,
TCA_POLICE_TBF,
TCA_POLICE_RATE,
TCA_POLICE_PEAKRATE,
TCA_POLICE_AVRATE,
TCA_POLICE_RESULT,
TCA_POLICE_TM,
TCA_POLICE_PAD,
__TCA_POLICE_MAX
#define TCA_POLICE_RESULT TCA_POLICE_RESULT
};
#define TCA_POLICE_MAX (__TCA_POLICE_MAX - 1)
/* tca flags definitions */
#define TCA_CLS_FLAGS_SKIP_HW (1 << 0) /* don't offload filter to HW */
#define TCA_CLS_FLAGS_SKIP_SW (1 << 1) /* don't use filter in SW */
#define TCA_CLS_FLAGS_IN_HW (1 << 2) /* filter is offloaded to HW */
#define TCA_CLS_FLAGS_NOT_IN_HW (1 << 3) /* filter isn't offloaded to HW */
#define TCA_CLS_FLAGS_VERBOSE (1 << 4) /* verbose logging */
/* U32 filters */
#define TC_U32_HTID(h) ((h)&0xFFF00000)
#define TC_U32_USERHTID(h) (TC_U32_HTID(h)>>20)
#define TC_U32_HASH(h) (((h)>>12)&0xFF)
#define TC_U32_NODE(h) ((h)&0xFFF)
#define TC_U32_KEY(h) ((h)&0xFFFFF)
#define TC_U32_UNSPEC 0
#define TC_U32_ROOT (0xFFF00000)
enum {
TCA_U32_UNSPEC,
TCA_U32_CLASSID,
TCA_U32_HASH,
TCA_U32_LINK,
TCA_U32_DIVISOR,
TCA_U32_SEL,
TCA_U32_POLICE,
TCA_U32_ACT,
TCA_U32_INDEV,
TCA_U32_PCNT,
TCA_U32_MARK,
TCA_U32_FLAGS,
TCA_U32_PAD,
__TCA_U32_MAX
};
#define TCA_U32_MAX (__TCA_U32_MAX - 1)
struct tc_u32_key {
__be32 mask;
__be32 val;
int off;
int offmask;
};
struct tc_u32_sel {
unsigned char flags;
unsigned char offshift;
unsigned char nkeys;
__be16 offmask;
__u16 off;
short offoff;
short hoff;
__be32 hmask;
struct tc_u32_key keys[0];
};
struct tc_u32_mark {
__u32 val;
__u32 mask;
__u32 success;
};
struct tc_u32_pcnt {
__u64 rcnt;
__u64 rhit;
__u64 kcnts[0];
};
/* Flags */
#define TC_U32_TERMINAL 1
#define TC_U32_OFFSET 2
#define TC_U32_VAROFFSET 4
#define TC_U32_EAT 8
#define TC_U32_MAXDEPTH 8
/* RSVP filter */
enum {
TCA_RSVP_UNSPEC,
TCA_RSVP_CLASSID,
TCA_RSVP_DST,
TCA_RSVP_SRC,
TCA_RSVP_PINFO,
TCA_RSVP_POLICE,
TCA_RSVP_ACT,
__TCA_RSVP_MAX
};
#define TCA_RSVP_MAX (__TCA_RSVP_MAX - 1 )
struct tc_rsvp_gpi {
__u32 key;
__u32 mask;
int offset;
};
struct tc_rsvp_pinfo {
struct tc_rsvp_gpi dpi;
struct tc_rsvp_gpi spi;
__u8 protocol;
__u8 tunnelid;
__u8 tunnelhdr;
__u8 pad;
};
/* ROUTE filter */
enum {
TCA_ROUTE4_UNSPEC,
TCA_ROUTE4_CLASSID,
TCA_ROUTE4_TO,
TCA_ROUTE4_FROM,
TCA_ROUTE4_IIF,
TCA_ROUTE4_POLICE,
TCA_ROUTE4_ACT,
__TCA_ROUTE4_MAX
};
#define TCA_ROUTE4_MAX (__TCA_ROUTE4_MAX - 1)
/* FW filter */
enum {
TCA_FW_UNSPEC,
TCA_FW_CLASSID,
TCA_FW_POLICE,
TCA_FW_INDEV,
TCA_FW_ACT, /* used by CONFIG_NET_CLS_ACT */
TCA_FW_MASK,
__TCA_FW_MAX
};
#define TCA_FW_MAX (__TCA_FW_MAX - 1)
/* TC index filter */
enum {
TCA_TCINDEX_UNSPEC,
TCA_TCINDEX_HASH,
TCA_TCINDEX_MASK,
TCA_TCINDEX_SHIFT,
TCA_TCINDEX_FALL_THROUGH,
TCA_TCINDEX_CLASSID,
TCA_TCINDEX_POLICE,
TCA_TCINDEX_ACT,
__TCA_TCINDEX_MAX
};
#define TCA_TCINDEX_MAX (__TCA_TCINDEX_MAX - 1)
/* Flow filter */
enum {
FLOW_KEY_SRC,
FLOW_KEY_DST,
FLOW_KEY_PROTO,
FLOW_KEY_PROTO_SRC,
FLOW_KEY_PROTO_DST,
FLOW_KEY_IIF,
FLOW_KEY_PRIORITY,
FLOW_KEY_MARK,
FLOW_KEY_NFCT,
FLOW_KEY_NFCT_SRC,
FLOW_KEY_NFCT_DST,
FLOW_KEY_NFCT_PROTO_SRC,
FLOW_KEY_NFCT_PROTO_DST,
FLOW_KEY_RTCLASSID,
FLOW_KEY_SKUID,
FLOW_KEY_SKGID,
FLOW_KEY_VLAN_TAG,
FLOW_KEY_RXHASH,
__FLOW_KEY_MAX,
};
#define FLOW_KEY_MAX (__FLOW_KEY_MAX - 1)
enum {
FLOW_MODE_MAP,
FLOW_MODE_HASH,
};
enum {
TCA_FLOW_UNSPEC,
TCA_FLOW_KEYS,
TCA_FLOW_MODE,
TCA_FLOW_BASECLASS,
TCA_FLOW_RSHIFT,
TCA_FLOW_ADDEND,
TCA_FLOW_MASK,
TCA_FLOW_XOR,
TCA_FLOW_DIVISOR,
TCA_FLOW_ACT,
TCA_FLOW_POLICE,
TCA_FLOW_EMATCHES,
TCA_FLOW_PERTURB,
__TCA_FLOW_MAX
};
#define TCA_FLOW_MAX (__TCA_FLOW_MAX - 1)
/* Basic filter */
enum {
TCA_BASIC_UNSPEC,
TCA_BASIC_CLASSID,
TCA_BASIC_EMATCHES,
TCA_BASIC_ACT,
TCA_BASIC_POLICE,
__TCA_BASIC_MAX
};
#define TCA_BASIC_MAX (__TCA_BASIC_MAX - 1)
/* Cgroup classifier */
enum {
TCA_CGROUP_UNSPEC,
TCA_CGROUP_ACT,
TCA_CGROUP_POLICE,
TCA_CGROUP_EMATCHES,
__TCA_CGROUP_MAX,
};
#define TCA_CGROUP_MAX (__TCA_CGROUP_MAX - 1)
/* BPF classifier */
#define TCA_BPF_FLAG_ACT_DIRECT (1 << 0)
enum {
TCA_BPF_UNSPEC,
TCA_BPF_ACT,
TCA_BPF_POLICE,
TCA_BPF_CLASSID,
TCA_BPF_OPS_LEN,
TCA_BPF_OPS,
TCA_BPF_FD,
TCA_BPF_NAME,
TCA_BPF_FLAGS,
TCA_BPF_FLAGS_GEN,
TCA_BPF_TAG,
TCA_BPF_ID,
__TCA_BPF_MAX,
};
#define TCA_BPF_MAX (__TCA_BPF_MAX - 1)
/* Flower classifier */
enum {
TCA_FLOWER_UNSPEC,
TCA_FLOWER_CLASSID,
TCA_FLOWER_INDEV,
TCA_FLOWER_ACT,
TCA_FLOWER_KEY_ETH_DST, /* ETH_ALEN */
TCA_FLOWER_KEY_ETH_DST_MASK, /* ETH_ALEN */
TCA_FLOWER_KEY_ETH_SRC, /* ETH_ALEN */
TCA_FLOWER_KEY_ETH_SRC_MASK, /* ETH_ALEN */
TCA_FLOWER_KEY_ETH_TYPE, /* be16 */
TCA_FLOWER_KEY_IP_PROTO, /* u8 */
TCA_FLOWER_KEY_IPV4_SRC, /* be32 */
TCA_FLOWER_KEY_IPV4_SRC_MASK, /* be32 */
TCA_FLOWER_KEY_IPV4_DST, /* be32 */
TCA_FLOWER_KEY_IPV4_DST_MASK, /* be32 */
TCA_FLOWER_KEY_IPV6_SRC, /* struct in6_addr */
TCA_FLOWER_KEY_IPV6_SRC_MASK, /* struct in6_addr */
TCA_FLOWER_KEY_IPV6_DST, /* struct in6_addr */
TCA_FLOWER_KEY_IPV6_DST_MASK, /* struct in6_addr */
TCA_FLOWER_KEY_TCP_SRC, /* be16 */
TCA_FLOWER_KEY_TCP_DST, /* be16 */
TCA_FLOWER_KEY_UDP_SRC, /* be16 */
TCA_FLOWER_KEY_UDP_DST, /* be16 */
TCA_FLOWER_FLAGS,
TCA_FLOWER_KEY_VLAN_ID, /* be16 */
TCA_FLOWER_KEY_VLAN_PRIO, /* u8 */
TCA_FLOWER_KEY_VLAN_ETH_TYPE, /* be16 */
TCA_FLOWER_KEY_ENC_KEY_ID, /* be32 */
TCA_FLOWER_KEY_ENC_IPV4_SRC, /* be32 */
TCA_FLOWER_KEY_ENC_IPV4_SRC_MASK,/* be32 */
TCA_FLOWER_KEY_ENC_IPV4_DST, /* be32 */
TCA_FLOWER_KEY_ENC_IPV4_DST_MASK,/* be32 */
TCA_FLOWER_KEY_ENC_IPV6_SRC, /* struct in6_addr */
TCA_FLOWER_KEY_ENC_IPV6_SRC_MASK,/* struct in6_addr */
TCA_FLOWER_KEY_ENC_IPV6_DST, /* struct in6_addr */
TCA_FLOWER_KEY_ENC_IPV6_DST_MASK,/* struct in6_addr */
TCA_FLOWER_KEY_TCP_SRC_MASK, /* be16 */
TCA_FLOWER_KEY_TCP_DST_MASK, /* be16 */
TCA_FLOWER_KEY_UDP_SRC_MASK, /* be16 */
TCA_FLOWER_KEY_UDP_DST_MASK, /* be16 */
TCA_FLOWER_KEY_SCTP_SRC_MASK, /* be16 */
TCA_FLOWER_KEY_SCTP_DST_MASK, /* be16 */
TCA_FLOWER_KEY_SCTP_SRC, /* be16 */
TCA_FLOWER_KEY_SCTP_DST, /* be16 */
TCA_FLOWER_KEY_ENC_UDP_SRC_PORT, /* be16 */
TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK, /* be16 */
TCA_FLOWER_KEY_ENC_UDP_DST_PORT, /* be16 */
TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK, /* be16 */
TCA_FLOWER_KEY_FLAGS, /* be32 */
TCA_FLOWER_KEY_FLAGS_MASK, /* be32 */
TCA_FLOWER_KEY_ICMPV4_CODE, /* u8 */
TCA_FLOWER_KEY_ICMPV4_CODE_MASK,/* u8 */
TCA_FLOWER_KEY_ICMPV4_TYPE, /* u8 */
TCA_FLOWER_KEY_ICMPV4_TYPE_MASK,/* u8 */
TCA_FLOWER_KEY_ICMPV6_CODE, /* u8 */
TCA_FLOWER_KEY_ICMPV6_CODE_MASK,/* u8 */
TCA_FLOWER_KEY_ICMPV6_TYPE, /* u8 */
TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,/* u8 */
TCA_FLOWER_KEY_ARP_SIP, /* be32 */
TCA_FLOWER_KEY_ARP_SIP_MASK, /* be32 */
TCA_FLOWER_KEY_ARP_TIP, /* be32 */
TCA_FLOWER_KEY_ARP_TIP_MASK, /* be32 */
TCA_FLOWER_KEY_ARP_OP, /* u8 */
TCA_FLOWER_KEY_ARP_OP_MASK, /* u8 */
TCA_FLOWER_KEY_ARP_SHA, /* ETH_ALEN */
TCA_FLOWER_KEY_ARP_SHA_MASK, /* ETH_ALEN */
TCA_FLOWER_KEY_ARP_THA, /* ETH_ALEN */
TCA_FLOWER_KEY_ARP_THA_MASK, /* ETH_ALEN */
TCA_FLOWER_KEY_MPLS_TTL, /* u8 - 8 bits */
TCA_FLOWER_KEY_MPLS_BOS, /* u8 - 1 bit */
TCA_FLOWER_KEY_MPLS_TC, /* u8 - 3 bits */
TCA_FLOWER_KEY_MPLS_LABEL, /* be32 - 20 bits */
TCA_FLOWER_KEY_TCP_FLAGS, /* be16 */
TCA_FLOWER_KEY_TCP_FLAGS_MASK, /* be16 */
TCA_FLOWER_KEY_IP_TOS, /* u8 */
TCA_FLOWER_KEY_IP_TOS_MASK, /* u8 */
TCA_FLOWER_KEY_IP_TTL, /* u8 */
TCA_FLOWER_KEY_IP_TTL_MASK, /* u8 */
TCA_FLOWER_KEY_CVLAN_ID, /* be16 */
TCA_FLOWER_KEY_CVLAN_PRIO, /* u8 */
TCA_FLOWER_KEY_CVLAN_ETH_TYPE, /* be16 */
TCA_FLOWER_KEY_ENC_IP_TOS, /* u8 */
TCA_FLOWER_KEY_ENC_IP_TOS_MASK, /* u8 */
TCA_FLOWER_KEY_ENC_IP_TTL, /* u8 */
TCA_FLOWER_KEY_ENC_IP_TTL_MASK, /* u8 */
TCA_FLOWER_KEY_ENC_OPTS,
TCA_FLOWER_KEY_ENC_OPTS_MASK,
TCA_FLOWER_IN_HW_COUNT,
__TCA_FLOWER_MAX,
};
#define TCA_FLOWER_MAX (__TCA_FLOWER_MAX - 1)
enum {
TCA_FLOWER_KEY_ENC_OPTS_UNSPEC,
TCA_FLOWER_KEY_ENC_OPTS_GENEVE, /* Nested
* TCA_FLOWER_KEY_ENC_OPT_GENEVE_
* attributes
*/
__TCA_FLOWER_KEY_ENC_OPTS_MAX,
};
#define TCA_FLOWER_KEY_ENC_OPTS_MAX (__TCA_FLOWER_KEY_ENC_OPTS_MAX - 1)
enum {
TCA_FLOWER_KEY_ENC_OPT_GENEVE_UNSPEC,
TCA_FLOWER_KEY_ENC_OPT_GENEVE_CLASS, /* u16 */
TCA_FLOWER_KEY_ENC_OPT_GENEVE_TYPE, /* u8 */
TCA_FLOWER_KEY_ENC_OPT_GENEVE_DATA, /* 4 to 128 bytes */
__TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX,
};
#define TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX \
(__TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX - 1)
enum {
TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = (1 << 0),
TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST = (1 << 1),
};
/* Match-all classifier */
enum {
TCA_MATCHALL_UNSPEC,
TCA_MATCHALL_CLASSID,
TCA_MATCHALL_ACT,
TCA_MATCHALL_FLAGS,
__TCA_MATCHALL_MAX,
};
#define TCA_MATCHALL_MAX (__TCA_MATCHALL_MAX - 1)
/* Extended Matches */
struct tcf_ematch_tree_hdr {
__u16 nmatches;
__u16 progid;
};
enum {
TCA_EMATCH_TREE_UNSPEC,
TCA_EMATCH_TREE_HDR,
TCA_EMATCH_TREE_LIST,
__TCA_EMATCH_TREE_MAX
};
#define TCA_EMATCH_TREE_MAX (__TCA_EMATCH_TREE_MAX - 1)
struct tcf_ematch_hdr {
__u16 matchid;
__u16 kind;
__u16 flags;
__u16 pad; /* currently unused */
};
/* 0 1
* 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
* +-----------------------+-+-+---+
* | Unused |S|I| R |
* +-----------------------+-+-+---+
*
* R(2) ::= relation to next ematch
* where: 0 0 END (last ematch)
* 0 1 AND
* 1 0 OR
* 1 1 Unused (invalid)
* I(1) ::= invert result
* S(1) ::= simple payload
*/
#define TCF_EM_REL_END 0
#define TCF_EM_REL_AND (1<<0)
#define TCF_EM_REL_OR (1<<1)
#define TCF_EM_INVERT (1<<2)
#define TCF_EM_SIMPLE (1<<3)
#define TCF_EM_REL_MASK 3
#define TCF_EM_REL_VALID(v) (((v) & TCF_EM_REL_MASK) != TCF_EM_REL_MASK)
enum {
TCF_LAYER_LINK,
TCF_LAYER_NETWORK,
TCF_LAYER_TRANSPORT,
__TCF_LAYER_MAX
};
#define TCF_LAYER_MAX (__TCF_LAYER_MAX - 1)
/* Ematch type assignments
* 1..32767 Reserved for ematches inside kernel tree
* 32768..65535 Free to use, not reliable
*/
#define TCF_EM_CONTAINER 0
#define TCF_EM_CMP 1
#define TCF_EM_NBYTE 2
#define TCF_EM_U32 3
#define TCF_EM_META 4
#define TCF_EM_TEXT 5
#define TCF_EM_VLAN 6
#define TCF_EM_CANID 7
#define TCF_EM_IPSET 8
#define TCF_EM_IPT 9
#define TCF_EM_MAX 9
enum {
TCF_EM_PROG_TC
};
enum {
TCF_EM_OPND_EQ,
TCF_EM_OPND_GT,
TCF_EM_OPND_LT
};
#endif
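To illustrate the "extended actions" encoding described in the comment near the top of this header (local opcode in the highest nibble, value in the remaining bits), a small sketch of decoding a goto-chain verdict:

#include <linux/pkt_cls.h>

/* e.g. verdict = TC_ACT_GOTO_CHAIN | 7 encodes "continue in TC chain 7" */
static int chain_of(int verdict)
{
	if (TC_ACT_EXT_CMP(verdict, TC_ACT_GOTO_CHAIN))
		return verdict & TC_ACT_EXT_VAL_MASK;
	return -1;	/* not a goto-chain verdict */
}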

File diff suppressed because it is too large.


@@ -1,18 +0,0 @@
#!/bin/sh
tfile=$(mktemp /tmp/test_reallocarray_XXXXXXXX.c)
ofile=${tfile%.c}.o
cat > $tfile <<EOL
#define _GNU_SOURCE
#include <stdlib.h>
int main(void)
{
return !!reallocarray(NULL, 1, 1);
}
EOL
gcc $tfile -o $ofile >/dev/null 2>&1
if [ $? -ne 0 ]; then echo "FAIL"; fi
/bin/rm -f $tfile $ofile

105
scripts/coverity.sh Executable file

@@ -0,0 +1,105 @@
#!/bin/bash
# Taken from: https://scan.coverity.com/scripts/travisci_build_coverity_scan.sh
# Local changes are annotated with "#[local]"
set -e
# Environment check
echo -e "\033[33;1mNote: COVERITY_SCAN_PROJECT_NAME and COVERITY_SCAN_TOKEN are available on Project Settings page on scan.coverity.com\033[0m"
[ -z "$COVERITY_SCAN_PROJECT_NAME" ] && echo "ERROR: COVERITY_SCAN_PROJECT_NAME must be set" && exit 1
[ -z "$COVERITY_SCAN_NOTIFICATION_EMAIL" ] && echo "ERROR: COVERITY_SCAN_NOTIFICATION_EMAIL must be set" && exit 1
[ -z "$COVERITY_SCAN_BRANCH_PATTERN" ] && echo "ERROR: COVERITY_SCAN_BRANCH_PATTERN must be set" && exit 1
[ -z "$COVERITY_SCAN_BUILD_COMMAND" ] && echo "ERROR: COVERITY_SCAN_BUILD_COMMAND must be set" && exit 1
[ -z "$COVERITY_SCAN_TOKEN" ] && echo "ERROR: COVERITY_SCAN_TOKEN must be set" && exit 1
PLATFORM=`uname`
#[local] Use /var/tmp for TOOL_ARCHIVE and TOOL_BASE, as on certain systems
# /tmp is tmpfs and is sometimes too small to handle all necessary tooling
TOOL_ARCHIVE=/var//tmp/cov-analysis-${PLATFORM}.tgz
TOOL_URL=https://scan.coverity.com/download/${PLATFORM}
TOOL_BASE=/var/tmp/coverity-scan-analysis
UPLOAD_URL="https://scan.coverity.com/builds"
SCAN_URL="https://scan.coverity.com"
# Do not run on pull requests
if [ "${TRAVIS_PULL_REQUEST}" = "true" ]; then
echo -e "\033[33;1mINFO: Skipping Coverity Analysis: branch is a pull request.\033[0m"
exit 0
fi
# Verify this branch should run
IS_COVERITY_SCAN_BRANCH=`ruby -e "puts '${TRAVIS_BRANCH}' =~ /\\A$COVERITY_SCAN_BRANCH_PATTERN\\z/ ? 1 : 0"`
if [ "$IS_COVERITY_SCAN_BRANCH" = "1" ]; then
echo -e "\033[33;1mCoverity Scan configured to run on branch ${TRAVIS_BRANCH}\033[0m"
else
echo -e "\033[33;1mCoverity Scan NOT configured to run on branch ${TRAVIS_BRANCH}\033[0m"
exit 1
fi
# Verify upload is permitted
AUTH_RES=`curl -s --form project="$COVERITY_SCAN_PROJECT_NAME" --form token="$COVERITY_SCAN_TOKEN" $SCAN_URL/api/upload_permitted`
if [ "$AUTH_RES" = "Access denied" ]; then
echo -e "\033[33;1mCoverity Scan API access denied. Check COVERITY_SCAN_PROJECT_NAME and COVERITY_SCAN_TOKEN.\033[0m"
exit 1
else
AUTH=`echo $AUTH_RES | ruby -e "require 'rubygems'; require 'json'; puts JSON[STDIN.read]['upload_permitted']"`
if [ "$AUTH" = "true" ]; then
echo -e "\033[33;1mCoverity Scan analysis authorized per quota.\033[0m"
else
WHEN=`echo $AUTH_RES | ruby -e "require 'rubygems'; require 'json'; puts JSON[STDIN.read]['next_upload_permitted_at']"`
echo -e "\033[33;1mCoverity Scan analysis NOT authorized until $WHEN.\033[0m"
exit 0
fi
fi
if [ ! -d $TOOL_BASE ]; then
# Download Coverity Scan Analysis Tool
if [ ! -e $TOOL_ARCHIVE ]; then
echo -e "\033[33;1mDownloading Coverity Scan Analysis Tool...\033[0m"
wget -nv -O $TOOL_ARCHIVE $TOOL_URL --post-data "project=$COVERITY_SCAN_PROJECT_NAME&token=$COVERITY_SCAN_TOKEN"
fi
# Extract Coverity Scan Analysis Tool
echo -e "\033[33;1mExtracting Coverity Scan Analysis Tool...\033[0m"
mkdir -p $TOOL_BASE
pushd $TOOL_BASE
tar xzf $TOOL_ARCHIVE
popd
fi
TOOL_DIR=`find $TOOL_BASE -type d -name 'cov-analysis*'`
export PATH=$TOOL_DIR/bin:$PATH
# Build
echo -e "\033[33;1mRunning Coverity Scan Analysis Tool...\033[0m"
COV_BUILD_OPTIONS=""
#COV_BUILD_OPTIONS="--return-emit-failures 8 --parse-error-threshold 85"
RESULTS_DIR="cov-int"
eval "${COVERITY_SCAN_BUILD_COMMAND_PREPEND}"
COVERITY_UNSUPPORTED=1 cov-build --dir $RESULTS_DIR $COV_BUILD_OPTIONS $COVERITY_SCAN_BUILD_COMMAND
cov-import-scm --dir $RESULTS_DIR --scm git --log $RESULTS_DIR/scm_log.txt 2>&1
# Upload results
echo -e "\033[33;1mTarring Coverity Scan Analysis results...\033[0m"
RESULTS_ARCHIVE=analysis-results.tgz
tar czf $RESULTS_ARCHIVE $RESULTS_DIR
SHA=`git rev-parse --short HEAD`
echo -e "\033[33;1mUploading Coverity Scan Analysis results...\033[0m"
response=$(curl \
--silent --write-out "\n%{http_code}\n" \
--form project=$COVERITY_SCAN_PROJECT_NAME \
--form token=$COVERITY_SCAN_TOKEN \
--form email=$COVERITY_SCAN_NOTIFICATION_EMAIL \
--form file=@$RESULTS_ARCHIVE \
--form version=$SHA \
--form description="Travis CI build" \
$UPLOAD_URL)
status_code=$(echo "$response" | sed -n '$p')
#[local] Coverity used to return 201 on success, but it's 200 now
# See https://github.com/systemd/systemd/blob/master/tools/coverity.sh#L145
if [ "$status_code" != "200" ]; then
TEXT=$(echo "$response" | sed '$d')
echo -e "\033[33;1mCoverity Scan upload failed: $TEXT.\033[0m"
exit 1
fi


@@ -6,7 +6,6 @@ usage () {
echo "Set BPF_NEXT_BASELINE to override bpf-next tree commit, otherwise read from <libbpf-repo>/CHECKPOINT-COMMIT."
echo "Set BPF_BASELINE to override bpf tree commit, otherwise read from <libbpf-repo>/BPF-CHECKPOINT-COMMIT."
echo "Set MANUAL_MODE to 1 to manually control every cherry-picked commits."
echo "Set IGNORE_CONSISTENCY to 1 to ignore failed contents consistency check."
exit 1
}
@@ -46,18 +45,21 @@ PATH_MAP=( \
[tools/include/uapi/linux/if_link.h]=include/uapi/linux/if_link.h \
[tools/include/uapi/linux/if_xdp.h]=include/uapi/linux/if_xdp.h \
[tools/include/uapi/linux/netlink.h]=include/uapi/linux/netlink.h \
[tools/include/tools/libc_compat.h]=include/tools/libc_compat.h \
[tools/include/uapi/linux/pkt_cls.h]=include/uapi/linux/pkt_cls.h \
[tools/include/uapi/linux/pkt_sched.h]=include/uapi/linux/pkt_sched.h \
[Documentation/bpf/libbpf]=docs \
)
LIBBPF_PATHS="${!PATH_MAP[@]}"
LIBBPF_PATHS="${!PATH_MAP[@]} :^tools/lib/bpf/Makefile :^tools/lib/bpf/Build :^tools/lib/bpf/.gitignore :^tools/include/tools/libc_compat.h"
LIBBPF_VIEW_PATHS="${PATH_MAP[@]}"
LIBBPF_VIEW_EXCLUDE_REGEX='^src/(Makefile|Build|test_libbpf.cpp|\.gitignore)$'
LIBBPF_VIEW_EXCLUDE_REGEX='^src/(Makefile|Build|test_libbpf\.c|bpf_helper_defs\.h|\.gitignore)$|^docs/(\.gitignore|api\.rst|conf\.py)$|^docs/sphinx/.*'
LINUX_VIEW_EXCLUDE_REGEX='^include/tools/libc_compat.h$'
LIBBPF_TREE_FILTER="mkdir -p __libbpf/include/uapi/linux __libbpf/include/tools && "$'\\\n'
for p in "${!PATH_MAP[@]}"; do
LIBBPF_TREE_FILTER+="git mv -kf ${p} __libbpf/${PATH_MAP[${p}]} && "$'\\\n'
done
LIBBPF_TREE_FILTER+="git rm --ignore-unmatch -f __libbpf/src/{Makefile,Build,test_libbpf.cpp,.gitignore} >/dev/null"
LIBBPF_TREE_FILTER+="git rm --ignore-unmatch -f __libbpf/src/{Makefile,Build,test_libbpf.c,.gitignore} >/dev/null"
cd_to()
{
@@ -79,39 +81,10 @@ commit_desc()
# The idea is that this single-line signature is good enough to make final
# decision about whether two commits are the same, across different repos.
# $1 - commit ref
# $2 - paths filter
commit_signature()
{
git log -n1 --pretty='("%s")|%aI|%b' --shortstat $1 | tr '\n' '|'
}
# Validate there are no non-empty merges (we can't handle them)
# $1 - baseline tag
# $2 - tip tag
validate_merges()
{
local baseline_tag=$1
local tip_tag=$2
local new_merges
local merge_change_cnt
local ignore_merge_resolutions
local desc
new_merges=$(git rev-list --merges --topo-order --reverse ${baseline_tag}..${tip_tag} ${LIBBPF_PATHS[@]})
for new_merge in ${new_merges}; do
desc=$(commit_desc ${new_merge})
echo "MERGE: ${desc}"
merge_change_cnt=$(git show --format='' ${new_merge} | wc -l)
if ((${merge_change_cnt} > 0)); then
read -p "Merge '${desc}' is non-empty, which will cause conflicts! Do you want to proceed? [y/N]: " ignore_merge_resolutions
case "${ignore_merge_resolutions}" in
"y" | "Y")
echo "Skipping '${desc}'..."
continue
;;
esac
exit 3
fi
done
git show --pretty='("%s")|%aI|%b' --shortstat $1 -- ${2-.} | tr '\n' '|'
}
# Cherry-pick commits touching libbpf-related files
@@ -133,7 +106,7 @@ cherry_pick_commits()
new_commits=$(git rev-list --no-merges --topo-order --reverse ${baseline_tag}..${tip_tag} ${LIBBPF_PATHS[@]})
for new_commit in ${new_commits}; do
desc="$(commit_desc ${new_commit})"
signature="$(commit_signature ${new_commit})"
signature="$(commit_signature ${new_commit} "${LIBBPF_PATHS[@]}")"
synced_cnt=$(grep -F "${signature}" ${TMP_DIR}/libbpf_commits.txt | wc -l)
manual_check=0
if ((${synced_cnt} > 0)); then
@@ -166,6 +139,7 @@ cherry_pick_commits()
echo "Warning! Cherry-picking '${desc} failed, checking if it's non-libbpf files causing problems..."
libbpf_conflict_cnt=$(git diff --name-only --diff-filter=U -- ${LIBBPF_PATHS[@]} | wc -l)
conflict_cnt=$(git diff --name-only | wc -l)
prompt_resolution=1
if ((${libbpf_conflict_cnt} == 0)); then
echo "Looks like only non-libbpf files have conflicts, ignoring..."
@@ -181,16 +155,39 @@ cherry_pick_commits()
echo "Error! That still failed! Please resolve manually."
else
echo "Success! All cherry-pick conflicts were resolved for '${desc}'!"
continue
prompt_resolution=0
fi
fi
read -p "Error! Cherry-picking '${desc}' failed, please fix manually and press <return> to proceed..."
if ((${prompt_resolution} == 1)); then
read -p "Error! Cherry-picking '${desc}' failed, please fix manually and press <return> to proceed..."
fi
fi
# Append signature of just cherry-picked commit to avoid
# potentially cherry-picking the same commit twice later when
# processing bpf tree commits. At this point we don't know yet
# the final commit sha in libbpf repo, so we record Linux SHA
# instead as LINUX_<sha>.
echo LINUX_$(git log --pretty='%h' -n1) "${signature}" >> ${TMP_DIR}/libbpf_commits.txt
done
}
cleanup()
{
echo "Cleaning up..."
rm -r ${TMP_DIR}
cd_to ${LINUX_REPO}
git checkout ${TIP_SYM_REF}
git branch -D ${BASELINE_TAG} ${TIP_TAG} ${BPF_BASELINE_TAG} ${BPF_TIP_TAG} \
${SQUASH_BASE_TAG} ${SQUASH_TIP_TAG} ${VIEW_TAG} || true
cd_to .
echo "DONE."
}
cd_to ${LIBBPF_REPO}
GITHUB_ABS_DIR=$(pwd)
echo "Dumping existing libbpf commit signatures..."
for h in $(git log --pretty='%h' -n500); do
echo $h "$(commit_signature $h)" >> ${TMP_DIR}/libbpf_commits.txt
@@ -198,6 +195,7 @@ done
# Use current kernel repo HEAD as a source of patches
cd_to ${LINUX_REPO}
LINUX_ABS_DIR=$(pwd)
TIP_SYM_REF=$(git symbolic-ref -q --short HEAD || git rev-parse HEAD)
TIP_COMMIT=$(git rev-parse HEAD)
BPF_TIP_COMMIT=$(git rev-parse ${BPF_BRANCH})
@@ -240,23 +238,20 @@ git branch ${BPF_TIP_TAG} ${BPF_TIP_COMMIT}
git branch ${SQUASH_BASE_TAG} ${SQUASH_COMMIT}
git checkout -b ${SQUASH_TIP_TAG} ${SQUASH_COMMIT}
# Validate there are no non-empty merges in bpf-next and bpf trees
validate_merges ${BASELINE_TAG} ${TIP_TAG}
validate_merges ${BPF_BASELINE_TAG} ${BPF_TIP_TAG}
# Cherry-pick new commits onto squashed baseline commit
cherry_pick_commits ${BASELINE_TAG} ${TIP_TAG}
cherry_pick_commits ${BPF_BASELINE_TAG} ${BPF_TIP_TAG}
# Move all libbpf files into __libbpf directory.
git filter-branch --prune-empty -f --tree-filter "${LIBBPF_TREE_FILTER}" ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --prune-empty -f --tree-filter "${LIBBPF_TREE_FILTER}" ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}
# Make __libbpf a new root directory
git filter-branch --prune-empty -f --subdirectory-filter __libbpf ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --prune-empty -f --subdirectory-filter __libbpf ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}
# If there are no new commits with libbpf-related changes, bail out
COMMIT_CNT=$(git rev-list --count ${SQUASH_BASE_TAG}..${SQUASH_TIP_TAG})
if ((${COMMIT_CNT} <= 0)); then
echo "No new changes to apply, we are done!"
cleanup
exit 2
fi
@@ -268,11 +263,29 @@ cd_to ${LIBBPF_REPO}
git checkout -b ${LIBBPF_SYNC_TAG}
for patch in $(ls -1 ${TMP_DIR}/patches | tail -n +2); do
if ! git am --committer-date-is-author-date "${TMP_DIR}/patches/${patch}"; then
if ! git am --3way --committer-date-is-author-date "${TMP_DIR}/patches/${patch}"; then
read -p "Applying ${TMP_DIR}/patches/${patch} failed, please resolve manually and press <return> to proceed..."
fi
done
# Generate bpf_helper_defs.h and commit, if anything changed
# restore Linux tip to use bpf_doc.py
cd_to ${LINUX_REPO}
git checkout ${TIP_TAG}
# re-generate bpf_helper_defs.h
cd_to ${LIBBPF_REPO}
"${LINUX_ABS_DIR}/scripts/bpf_doc.py" --header \
--file include/uapi/linux/bpf.h > src/bpf_helper_defs.h
# if anything changed, commit it
helpers_changes=$(git status --porcelain src/bpf_helper_defs.h | wc -l)
if ((${helpers_changes} == 1)); then
git add src/bpf_helper_defs.h
git commit -m "sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
" -- src/bpf_helper_defs.h
fi
# Use generated cover-letter as a template for "sync commit" with
# baseline and checkpoint commits from kernel repo (and leave summary
# from cover letter intact, of course)
@@ -296,14 +309,12 @@ echo "SUCCESS! ${COMMIT_CNT} commits synced."
echo "Verifying Linux's and Github's libbpf state"
cd_to ${LINUX_REPO}
LINUX_ABS_DIR=$(pwd)
git checkout -b ${VIEW_TAG} ${TIP_COMMIT}
git filter-branch -f --tree-filter "${LIBBPF_TREE_FILTER}" ${VIEW_TAG}^..${VIEW_TAG}
git filter-branch -f --subdirectory-filter __libbpf ${VIEW_TAG}^..${VIEW_TAG}
git ls-files -- ${LIBBPF_VIEW_PATHS[@]} > ${TMP_DIR}/linux-view.ls
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --tree-filter "${LIBBPF_TREE_FILTER}" ${VIEW_TAG}^..${VIEW_TAG}
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --subdirectory-filter __libbpf ${VIEW_TAG}^..${VIEW_TAG}
git ls-files -- ${LIBBPF_VIEW_PATHS[@]} | grep -v -E "${LINUX_VIEW_EXCLUDE_REGEX}" > ${TMP_DIR}/linux-view.ls
cd_to ${LIBBPF_REPO}
GITHUB_ABS_DIR=$(pwd)
git ls-files -- ${LIBBPF_VIEW_PATHS[@]} | grep -v -E "${LIBBPF_VIEW_EXCLUDE_REGEX}" > ${TMP_DIR}/github-view.ls
echo "Comparing list of files..."
@@ -319,19 +330,17 @@ done
if ((${CONSISTENT} == 1)); then
echo "Great! Content is identical!"
else
echo "Unfortunately, there are consistency problems!"
if ((${IGNORE_CONSISTENCY-0} != 1)); then
exit 4
fi
ignore_inconsistency=n
echo "Unfortunately, there are some inconsistencies, please double check."
read -p "Does everything look good? [y/N]: " ignore_inconsistency
case "${ignore_inconsistency}" in
"y" | "Y")
echo "Ok, proceeding..."
;;
*)
echo "Oops, exiting with error..."
exit 4
esac
fi
echo "Cleaning up..."
rm -r ${TMP_DIR}
cd_to ${LINUX_REPO}
git checkout ${TIP_SYM_REF}
git branch -D ${BASELINE_TAG} ${TIP_TAG} ${BPF_BASELINE_TAG} ${BPF_TIP_TAG} \
${SQUASH_BASE_TAG} ${SQUASH_TIP_TAG} ${VIEW_TAG}
cd_to .
echo "DONE."
cleanup


@@ -1,5 +1,13 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
ifeq ($(V),1)
Q =
msg =
else
Q = @
msg = @printf ' %-8s %s%s\n' "$(1)" "$(notdir $(2))" "$(if $(3), $(3))";
endif
LIBBPF_VERSION := $(shell \
grep -oE '^LIBBPF_([0-9.]+)' libbpf.map | \
sort -rV | head -n1 | cut -d'_' -f2)
@@ -10,22 +18,17 @@ TOPDIR = ..
INCLUDES := -I. -I$(TOPDIR)/include -I$(TOPDIR)/include/uapi
ALL_CFLAGS := $(INCLUDES)
FEATURE_REALLOCARRAY := $(shell $(TOPDIR)/scripts/check-reallocarray.sh)
ifneq ($(FEATURE_REALLOCARRAY),)
ALL_CFLAGS += -DCOMPAT_NEED_REALLOCARRAY
endif
SHARED_CFLAGS += -fPIC -fvisibility=hidden -DSHARED
CFLAGS ?= -g -O2 -Werror -Wall
ALL_CFLAGS += $(CFLAGS)
ALL_CFLAGS += $(CFLAGS) -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64
ALL_LDFLAGS += $(LDFLAGS)
ifdef NO_PKG_CONFIG
ALL_LDFLAGS += -lelf
ALL_LDFLAGS += -lelf -lz
else
PKG_CONFIG ?= pkg-config
ALL_CFLAGS += $(shell $(PKG_CONFIG) --cflags libelf)
ALL_LDFLAGS += $(shell $(PKG_CONFIG) --libs libelf)
ALL_CFLAGS += $(shell $(PKG_CONFIG) --cflags libelf zlib)
ALL_LDFLAGS += $(shell $(PKG_CONFIG) --libs libelf zlib)
endif
OBJDIR ?= .
@@ -33,7 +36,8 @@ SHARED_OBJDIR := $(OBJDIR)/sharedobjs
STATIC_OBJDIR := $(OBJDIR)/staticobjs
OBJS := bpf.o btf.o libbpf.o libbpf_errno.o netlink.o \
nlattr.o str_error.o libbpf_probes.o bpf_prog_linfo.o xsk.o \
btf_dump.o hashmap.o
btf_dump.o hashmap.o ringbuf.o strset.o linker.o gen_loader.o \
relo_core.o
SHARED_OBJS := $(addprefix $(SHARED_OBJDIR)/,$(OBJS))
STATIC_OBJS := $(addprefix $(STATIC_OBJDIR)/,$(OBJS))
@@ -45,7 +49,9 @@ ifndef BUILD_STATIC_ONLY
VERSION_SCRIPT := libbpf.map
endif
HEADERS := bpf.h libbpf.h btf.h xsk.h libbpf_util.h
HEADERS := bpf.h libbpf.h btf.h libbpf_common.h libbpf_legacy.h xsk.h \
bpf_helpers.h bpf_helper_defs.h bpf_tracing.h \
bpf_endian.h bpf_core_read.h skel_internal.h libbpf_version.h
UAPI_HEADERS := $(addprefix $(TOPDIR)/include/uapi/linux/,\
bpf.h bpf_common.h btf.h)
@@ -55,64 +61,77 @@ INSTALL = install
DESTDIR ?=
ifeq ($(shell uname -m),x86_64)
ifeq ($(filter-out %64 %64be %64eb %64le %64el s390x, $(shell uname -m)),)
LIBSUBDIR := lib64
else
LIBSUBDIR := lib
endif
# By default let the pc file itself use ${prefix} in includedir/libdir so that
# the prefix can be overridden at runtime (eg: --define-prefix)
ifndef LIBDIR
LIBDIR_PC := $$\{prefix\}/$(LIBSUBDIR)
else
LIBDIR_PC := $(LIBDIR)
endif
PREFIX ?= /usr
LIBDIR ?= $(PREFIX)/$(LIBSUBDIR)
INCLUDEDIR ?= $(PREFIX)/include
UAPIDIR ?= $(PREFIX)/include
TAGS_PROG := $(if $(shell which etags 2>/dev/null),etags,ctags)
all: $(STATIC_LIBS) $(SHARED_LIBS) $(PC_FILE)
$(OBJDIR)/libbpf.a: $(STATIC_OBJS)
$(AR) rcs $@ $^
$(call msg,AR,$@)
$(Q)$(AR) rcs $@ $^
$(OBJDIR)/libbpf.so: $(OBJDIR)/libbpf.so.$(LIBBPF_MAJOR_VERSION)
ln -sf $(^F) $@
$(Q)ln -sf $(^F) $@
$(OBJDIR)/libbpf.so.$(LIBBPF_MAJOR_VERSION): $(OBJDIR)/libbpf.so.$(LIBBPF_VERSION)
ln -sf $(^F) $@
$(Q)ln -sf $(^F) $@
$(OBJDIR)/libbpf.so.$(LIBBPF_VERSION): $(SHARED_OBJS)
$(CC) -shared -Wl,--version-script=$(VERSION_SCRIPT) \
-Wl,-soname,libbpf.so.$(LIBBPF_MAJOR_VERSION) \
$^ $(ALL_LDFLAGS) -o $@
$(call msg,CC,$@)
$(Q)$(CC) -shared -Wl,--version-script=$(VERSION_SCRIPT) \
-Wl,-soname,libbpf.so.$(LIBBPF_MAJOR_VERSION) \
$^ $(ALL_LDFLAGS) -o $@
$(OBJDIR)/libbpf.pc:
sed -e "s|@PREFIX@|$(PREFIX)|" \
-e "s|@LIBDIR@|$(LIBDIR)|" \
$(Q)sed -e "s|@PREFIX@|$(PREFIX)|" \
-e "s|@LIBDIR@|$(LIBDIR_PC)|" \
-e "s|@VERSION@|$(LIBBPF_VERSION)|" \
< libbpf.pc.template > $@
$(STATIC_OBJDIR):
mkdir -p $(STATIC_OBJDIR)
$(SHARED_OBJDIR):
mkdir -p $(SHARED_OBJDIR)
$(STATIC_OBJDIR) $(SHARED_OBJDIR):
$(call msg,MKDIR,$@)
$(Q)mkdir -p $@
$(STATIC_OBJDIR)/%.o: %.c | $(STATIC_OBJDIR)
$(CC) $(ALL_CFLAGS) $(CPPFLAGS) -c $< -o $@
$(call msg,CC,$@)
$(Q)$(CC) $(ALL_CFLAGS) $(CPPFLAGS) -c $< -o $@
$(SHARED_OBJDIR)/%.o: %.c | $(SHARED_OBJDIR)
$(CC) $(ALL_CFLAGS) $(SHARED_CFLAGS) $(CPPFLAGS) -c $< -o $@
$(call msg,CC,$@)
$(Q)$(CC) $(ALL_CFLAGS) $(SHARED_CFLAGS) $(CPPFLAGS) -c $< -o $@
define do_install
if [ ! -d '$(DESTDIR)$2' ]; then \
$(call msg,INSTALL,$1)
$(Q)if [ ! -d '$(DESTDIR)$2' ]; then \
$(INSTALL) -d -m 755 '$(DESTDIR)$2'; \
fi; \
$(INSTALL) $1 $(if $3,-m $3,) '$(DESTDIR)$2'
fi;
$(Q)$(INSTALL) $(if $3,-m $3,) $1 '$(DESTDIR)$2'
endef
# Preserve symlinks at installation.
define do_s_install
if [ ! -d '$(DESTDIR)$2' ]; then \
$(call msg,INSTALL,$1)
$(Q)if [ ! -d '$(DESTDIR)$2' ]; then \
$(INSTALL) -d -m 755 '$(DESTDIR)$2'; \
fi; \
cp -fpR $1 '$(DESTDIR)$2'
fi;
$(Q)cp -fR $1 '$(DESTDIR)$2'
endef
install: all install_headers install_pkgconfig
@@ -130,4 +149,16 @@ install_pkgconfig: $(PC_FILE)
$(call do_install,$(PC_FILE),$(LIBDIR)/pkgconfig,644)
clean:
rm -rf *.o *.a *.so *.so.* *.pc $(SHARED_OBJDIR) $(STATIC_OBJDIR)
$(call msg,CLEAN)
$(Q)rm -rf *.o *.a *.so *.so.* *.pc $(SHARED_OBJDIR) $(STATIC_OBJDIR)
.PHONY: cscope tags
cscope:
$(call msg,CSCOPE)
$(Q)ls *.c *.h > cscope.files
$(Q)cscope -b -q -f cscope.out
tags:
$(call msg,CTAGS)
$(Q)rm -f TAGS tags
$(Q)ls *.c *.h | xargs $(TAGS_PROG) -a

514
src/bpf.c

@@ -67,11 +67,12 @@ static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
static inline int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size)
{
int retries = 5;
int fd;
do {
fd = sys_bpf(BPF_PROG_LOAD, attr, size);
} while (fd < 0 && errno == EAGAIN);
} while (fd < 0 && errno == EAGAIN && retries-- > 0);
return fd;
}
@@ -79,6 +80,7 @@ static inline int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size)
int bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr)
{
union bpf_attr attr;
int fd;
memset(&attr, '\0', sizeof(attr));
@@ -95,9 +97,14 @@ int bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr)
attr.btf_key_type_id = create_attr->btf_key_type_id;
attr.btf_value_type_id = create_attr->btf_value_type_id;
attr.map_ifindex = create_attr->map_ifindex;
attr.inner_map_fd = create_attr->inner_map_fd;
if (attr.map_type == BPF_MAP_TYPE_STRUCT_OPS)
attr.btf_vmlinux_value_type_id =
create_attr->btf_vmlinux_value_type_id;
else
attr.inner_map_fd = create_attr->inner_map_fd;
return sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
fd = sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
return libbpf_err_errno(fd);
}
int bpf_create_map_node(enum bpf_map_type map_type, const char *name,
@@ -155,6 +162,7 @@ int bpf_create_map_in_map_node(enum bpf_map_type map_type, const char *name,
__u32 map_flags, int node)
{
union bpf_attr attr;
int fd;
memset(&attr, '\0', sizeof(attr));
@@ -173,7 +181,8 @@ int bpf_create_map_in_map_node(enum bpf_map_type map_type, const char *name,
attr.numa_node = node;
}
return sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
fd = sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
return libbpf_err_errno(fd);
}
int bpf_create_map_in_map(enum bpf_map_type map_type, const char *name,
@@ -189,7 +198,7 @@ static void *
alloc_zero_tailing_info(const void *orecord, __u32 cnt,
__u32 actual_rec_size, __u32 expected_rec_size)
{
__u64 info_len = actual_rec_size * cnt;
__u64 info_len = (__u64)actual_rec_size * cnt;
void *info, *nrecord;
int i;
@@ -210,50 +219,56 @@ alloc_zero_tailing_info(const void *orecord, __u32 cnt,
return info;
}
int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
char *log_buf, size_t log_buf_sz)
int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr)
{
void *finfo = NULL, *linfo = NULL;
union bpf_attr attr;
__u32 log_level;
int fd;
if (!load_attr || !log_buf != !log_buf_sz)
return -EINVAL;
if (!load_attr->log_buf != !load_attr->log_buf_sz)
return libbpf_err(-EINVAL);
log_level = load_attr->log_level;
if (log_level > (4 | 2 | 1) || (log_level && !log_buf))
return -EINVAL;
if (load_attr->log_level > (4 | 2 | 1) || (load_attr->log_level && !load_attr->log_buf))
return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr));
attr.prog_type = load_attr->prog_type;
attr.expected_attach_type = load_attr->expected_attach_type;
attr.insn_cnt = (__u32)load_attr->insns_cnt;
if (load_attr->attach_prog_fd)
attr.attach_prog_fd = load_attr->attach_prog_fd;
else
attr.attach_btf_obj_fd = load_attr->attach_btf_obj_fd;
attr.attach_btf_id = load_attr->attach_btf_id;
attr.prog_ifindex = load_attr->prog_ifindex;
attr.kern_version = load_attr->kern_version;
attr.insn_cnt = (__u32)load_attr->insn_cnt;
attr.insns = ptr_to_u64(load_attr->insns);
attr.license = ptr_to_u64(load_attr->license);
attr.log_level = log_level;
if (log_level) {
attr.log_buf = ptr_to_u64(log_buf);
attr.log_size = log_buf_sz;
} else {
attr.log_buf = ptr_to_u64(NULL);
attr.log_size = 0;
attr.log_level = load_attr->log_level;
if (attr.log_level) {
attr.log_buf = ptr_to_u64(load_attr->log_buf);
attr.log_size = load_attr->log_buf_sz;
}
attr.kern_version = load_attr->kern_version;
attr.prog_ifindex = load_attr->prog_ifindex;
attr.prog_btf_fd = load_attr->prog_btf_fd;
attr.prog_flags = load_attr->prog_flags;
attr.func_info_rec_size = load_attr->func_info_rec_size;
attr.func_info_cnt = load_attr->func_info_cnt;
attr.func_info = ptr_to_u64(load_attr->func_info);
attr.line_info_rec_size = load_attr->line_info_rec_size;
attr.line_info_cnt = load_attr->line_info_cnt;
attr.line_info = ptr_to_u64(load_attr->line_info);
attr.fd_array = ptr_to_u64(load_attr->fd_array);
if (load_attr->name)
memcpy(attr.prog_name, load_attr->name,
min(strlen(load_attr->name), BPF_OBJ_NAME_LEN - 1));
attr.prog_flags = load_attr->prog_flags;
min(strlen(load_attr->name), (size_t)BPF_OBJ_NAME_LEN - 1));
fd = sys_bpf_prog_load(&attr, sizeof(attr));
if (fd >= 0)
@@ -271,8 +286,10 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
load_attr->func_info_cnt,
load_attr->func_info_rec_size,
attr.func_info_rec_size);
if (!finfo)
if (!finfo) {
errno = E2BIG;
goto done;
}
attr.func_info = ptr_to_u64(finfo);
attr.func_info_rec_size = load_attr->func_info_rec_size;
@@ -283,8 +300,10 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
load_attr->line_info_cnt,
load_attr->line_info_rec_size,
attr.line_info_rec_size);
if (!linfo)
if (!linfo) {
errno = E2BIG;
goto done;
}
attr.line_info = ptr_to_u64(linfo);
attr.line_info_rec_size = load_attr->line_info_rec_size;
@@ -293,24 +312,68 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
}
fd = sys_bpf_prog_load(&attr, sizeof(attr));
if (fd >= 0)
goto done;
}
if (log_level || !log_buf)
if (load_attr->log_level || !load_attr->log_buf)
goto done;
/* Try again with log */
attr.log_buf = ptr_to_u64(log_buf);
attr.log_size = log_buf_sz;
attr.log_buf = ptr_to_u64(load_attr->log_buf);
attr.log_size = load_attr->log_buf_sz;
attr.log_level = 1;
log_buf[0] = 0;
load_attr->log_buf[0] = 0;
fd = sys_bpf_prog_load(&attr, sizeof(attr));
done:
/* free() doesn't affect errno, so we don't need to restore it */
free(finfo);
free(linfo);
return fd;
return libbpf_err_errno(fd);
}
int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
char *log_buf, size_t log_buf_sz)
{
struct bpf_prog_load_params p = {};
if (!load_attr || !log_buf != !log_buf_sz)
return libbpf_err(-EINVAL);
p.prog_type = load_attr->prog_type;
p.expected_attach_type = load_attr->expected_attach_type;
switch (p.prog_type) {
case BPF_PROG_TYPE_STRUCT_OPS:
case BPF_PROG_TYPE_LSM:
p.attach_btf_id = load_attr->attach_btf_id;
break;
case BPF_PROG_TYPE_TRACING:
case BPF_PROG_TYPE_EXT:
p.attach_btf_id = load_attr->attach_btf_id;
p.attach_prog_fd = load_attr->attach_prog_fd;
break;
default:
p.prog_ifindex = load_attr->prog_ifindex;
p.kern_version = load_attr->kern_version;
}
p.insn_cnt = load_attr->insns_cnt;
p.insns = load_attr->insns;
p.license = load_attr->license;
p.log_level = load_attr->log_level;
p.log_buf = log_buf;
p.log_buf_sz = log_buf_sz;
p.prog_btf_fd = load_attr->prog_btf_fd;
p.func_info_rec_size = load_attr->func_info_rec_size;
p.func_info_cnt = load_attr->func_info_cnt;
p.func_info = load_attr->func_info;
p.line_info_rec_size = load_attr->line_info_rec_size;
p.line_info_cnt = load_attr->line_info_cnt;
p.line_info = load_attr->line_info;
p.name = load_attr->name;
p.prog_flags = load_attr->prog_flags;
return libbpf__bpf_prog_load(&p);
}
int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
@@ -338,6 +401,7 @@ int bpf_verify_program(enum bpf_prog_type type, const struct bpf_insn *insns,
int log_level)
{
union bpf_attr attr;
int fd;
memset(&attr, 0, sizeof(attr));
attr.prog_type = type;
@@ -351,13 +415,15 @@ int bpf_verify_program(enum bpf_prog_type type, const struct bpf_insn *insns,
attr.kern_version = kern_version;
attr.prog_flags = prog_flags;
return sys_bpf_prog_load(&attr, sizeof(attr));
fd = sys_bpf_prog_load(&attr, sizeof(attr));
return libbpf_err_errno(fd);
}
int bpf_map_update_elem(int fd, const void *key, const void *value,
__u64 flags)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.map_fd = fd;
@@ -365,22 +431,54 @@ int bpf_map_update_elem(int fd, const void *key, const void *value,
attr.value = ptr_to_u64(value);
attr.flags = flags;
return sys_bpf(BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
ret = sys_bpf(BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
int bpf_map_lookup_elem(int fd, const void *key, void *value)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.map_fd = fd;
attr.key = ptr_to_u64(key);
attr.value = ptr_to_u64(value);
return sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
ret = sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
int bpf_map_lookup_elem_flags(int fd, const void *key, void *value, __u64 flags)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.map_fd = fd;
attr.key = ptr_to_u64(key);
attr.value = ptr_to_u64(value);
attr.flags = flags;
ret = sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
int bpf_map_lookup_and_delete_elem(int fd, const void *key, void *value)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.map_fd = fd;
attr.key = ptr_to_u64(key);
attr.value = ptr_to_u64(value);
ret = sys_bpf(BPF_MAP_LOOKUP_AND_DELETE_ELEM, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
int bpf_map_lookup_and_delete_elem_flags(int fd, const void *key, void *value, __u64 flags)
{
union bpf_attr attr;
@@ -390,110 +488,283 @@ int bpf_map_lookup_elem_flags(int fd, const void *key, void *value, __u64 flags)
attr.value = ptr_to_u64(value);
attr.flags = flags;
return sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
}
int bpf_map_lookup_and_delete_elem(int fd, const void *key, void *value)
{
union bpf_attr attr;
memset(&attr, 0, sizeof(attr));
attr.map_fd = fd;
attr.key = ptr_to_u64(key);
attr.value = ptr_to_u64(value);
return sys_bpf(BPF_MAP_LOOKUP_AND_DELETE_ELEM, &attr, sizeof(attr));
}
int bpf_map_delete_elem(int fd, const void *key)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.map_fd = fd;
attr.key = ptr_to_u64(key);
return sys_bpf(BPF_MAP_DELETE_ELEM, &attr, sizeof(attr));
ret = sys_bpf(BPF_MAP_DELETE_ELEM, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
int bpf_map_get_next_key(int fd, const void *key, void *next_key)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.map_fd = fd;
attr.key = ptr_to_u64(key);
attr.next_key = ptr_to_u64(next_key);
return sys_bpf(BPF_MAP_GET_NEXT_KEY, &attr, sizeof(attr));
ret = sys_bpf(BPF_MAP_GET_NEXT_KEY, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
int bpf_map_freeze(int fd)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.map_fd = fd;
return sys_bpf(BPF_MAP_FREEZE, &attr, sizeof(attr));
ret = sys_bpf(BPF_MAP_FREEZE, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
static int bpf_map_batch_common(int cmd, int fd, void *in_batch,
void *out_batch, void *keys, void *values,
__u32 *count,
const struct bpf_map_batch_opts *opts)
{
union bpf_attr attr;
int ret;
if (!OPTS_VALID(opts, bpf_map_batch_opts))
return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr));
attr.batch.map_fd = fd;
attr.batch.in_batch = ptr_to_u64(in_batch);
attr.batch.out_batch = ptr_to_u64(out_batch);
attr.batch.keys = ptr_to_u64(keys);
attr.batch.values = ptr_to_u64(values);
attr.batch.count = *count;
attr.batch.elem_flags = OPTS_GET(opts, elem_flags, 0);
attr.batch.flags = OPTS_GET(opts, flags, 0);
ret = sys_bpf(cmd, &attr, sizeof(attr));
*count = attr.batch.count;
return libbpf_err_errno(ret);
}
int bpf_map_delete_batch(int fd, void *keys, __u32 *count,
const struct bpf_map_batch_opts *opts)
{
return bpf_map_batch_common(BPF_MAP_DELETE_BATCH, fd, NULL,
NULL, keys, NULL, count, opts);
}
int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch, void *keys,
void *values, __u32 *count,
const struct bpf_map_batch_opts *opts)
{
return bpf_map_batch_common(BPF_MAP_LOOKUP_BATCH, fd, in_batch,
out_batch, keys, values, count, opts);
}
int bpf_map_lookup_and_delete_batch(int fd, void *in_batch, void *out_batch,
void *keys, void *values, __u32 *count,
const struct bpf_map_batch_opts *opts)
{
return bpf_map_batch_common(BPF_MAP_LOOKUP_AND_DELETE_BATCH,
fd, in_batch, out_batch, keys, values,
count, opts);
}
int bpf_map_update_batch(int fd, void *keys, void *values, __u32 *count,
const struct bpf_map_batch_opts *opts)
{
return bpf_map_batch_common(BPF_MAP_UPDATE_BATCH, fd, NULL, NULL,
keys, values, count, opts);
}
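A hedged usage sketch for the batch helpers added above, assuming a hash map with 4-byte keys and 8-byte values (a real caller loops, feeding the returned out_batch token back as in_batch until the kernel reports -ENOENT):

#include <bpf/bpf.h>

static int dump_first_batch(int map_fd)
{
	DECLARE_LIBBPF_OPTS(bpf_map_batch_opts, opts,
		.elem_flags = 0,
		.flags = 0,
	);
	__u32 keys[128], out_batch, count = 128;
	__u64 values[128];

	/* NULL in_batch starts from the beginning; count is updated in place */
	return bpf_map_lookup_batch(map_fd, NULL, &out_batch,
				    keys, values, &count, &opts);
}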
int bpf_obj_pin(int fd, const char *pathname)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.pathname = ptr_to_u64((void *)pathname);
attr.bpf_fd = fd;
return sys_bpf(BPF_OBJ_PIN, &attr, sizeof(attr));
ret = sys_bpf(BPF_OBJ_PIN, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
int bpf_obj_get(const char *pathname)
{
union bpf_attr attr;
int fd;
memset(&attr, 0, sizeof(attr));
attr.pathname = ptr_to_u64((void *)pathname);
return sys_bpf(BPF_OBJ_GET, &attr, sizeof(attr));
fd = sys_bpf(BPF_OBJ_GET, &attr, sizeof(attr));
return libbpf_err_errno(fd);
}
int bpf_prog_attach(int prog_fd, int target_fd, enum bpf_attach_type type,
unsigned int flags)
{
DECLARE_LIBBPF_OPTS(bpf_prog_attach_opts, opts,
.flags = flags,
);
return bpf_prog_attach_xattr(prog_fd, target_fd, type, &opts);
}
int bpf_prog_attach_xattr(int prog_fd, int target_fd,
enum bpf_attach_type type,
const struct bpf_prog_attach_opts *opts)
{
union bpf_attr attr;
int ret;
if (!OPTS_VALID(opts, bpf_prog_attach_opts))
return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr));
attr.target_fd = target_fd;
attr.attach_bpf_fd = prog_fd;
attr.attach_type = type;
attr.attach_flags = flags;
attr.attach_flags = OPTS_GET(opts, flags, 0);
attr.replace_bpf_fd = OPTS_GET(opts, replace_prog_fd, 0);
return sys_bpf(BPF_PROG_ATTACH, &attr, sizeof(attr));
ret = sys_bpf(BPF_PROG_ATTACH, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
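A hedged sketch of the replace semantics now reachable through bpf_prog_attach_xattr() (BPF_F_ALLOW_MULTI, BPF_F_REPLACE and the cgroup attach type come from <linux/bpf.h>; the fds are assumed valid):

#include <linux/bpf.h>
#include <bpf/bpf.h>

/* swap old_fd for new_fd in the cgroup's multi-prog list */
static int replace_cgroup_prog(int cgroup_fd, int new_fd, int old_fd)
{
	DECLARE_LIBBPF_OPTS(bpf_prog_attach_opts, opts,
		.flags = BPF_F_ALLOW_MULTI | BPF_F_REPLACE,
		.replace_prog_fd = old_fd,
	);

	return bpf_prog_attach_xattr(new_fd, cgroup_fd,
				     BPF_CGROUP_INET_EGRESS, &opts);
}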
int bpf_prog_detach(int target_fd, enum bpf_attach_type type)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.target_fd = target_fd;
attr.attach_type = type;
return sys_bpf(BPF_PROG_DETACH, &attr, sizeof(attr));
ret = sys_bpf(BPF_PROG_DETACH, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
int bpf_prog_detach2(int prog_fd, int target_fd, enum bpf_attach_type type)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.target_fd = target_fd;
attr.attach_bpf_fd = prog_fd;
attr.attach_type = type;
return sys_bpf(BPF_PROG_DETACH, &attr, sizeof(attr));
ret = sys_bpf(BPF_PROG_DETACH, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
int bpf_link_create(int prog_fd, int target_fd,
enum bpf_attach_type attach_type,
const struct bpf_link_create_opts *opts)
{
__u32 target_btf_id, iter_info_len;
union bpf_attr attr;
int fd;
if (!OPTS_VALID(opts, bpf_link_create_opts))
return libbpf_err(-EINVAL);
iter_info_len = OPTS_GET(opts, iter_info_len, 0);
target_btf_id = OPTS_GET(opts, target_btf_id, 0);
/* validate we don't have unexpected combinations of non-zero fields */
if (iter_info_len || target_btf_id) {
if (iter_info_len && target_btf_id)
return libbpf_err(-EINVAL);
if (!OPTS_ZEROED(opts, target_btf_id))
return libbpf_err(-EINVAL);
}
memset(&attr, 0, sizeof(attr));
attr.link_create.prog_fd = prog_fd;
attr.link_create.target_fd = target_fd;
attr.link_create.attach_type = attach_type;
attr.link_create.flags = OPTS_GET(opts, flags, 0);
if (target_btf_id) {
attr.link_create.target_btf_id = target_btf_id;
goto proceed;
}
switch (attach_type) {
case BPF_TRACE_ITER:
attr.link_create.iter_info = ptr_to_u64(OPTS_GET(opts, iter_info, (void *)0));
attr.link_create.iter_info_len = iter_info_len;
break;
case BPF_PERF_EVENT:
attr.link_create.perf_event.bpf_cookie = OPTS_GET(opts, perf_event.bpf_cookie, 0);
if (!OPTS_ZEROED(opts, perf_event))
return libbpf_err(-EINVAL);
break;
default:
if (!OPTS_ZEROED(opts, flags))
return libbpf_err(-EINVAL);
break;
}
proceed:
fd = sys_bpf(BPF_LINK_CREATE, &attr, sizeof(attr));
return libbpf_err_errno(fd);
}
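/* Illustrative usage sketch: create a BPF link for an already-opened perf
 * event FD, passing a bpf_cookie that the program can later retrieve with the
 * bpf_get_attach_cookie() helper. 'prog_fd' and 'perf_fd' are assumptions.
 */
#include <bpf/bpf.h>

static int attach_perf_with_cookie(int prog_fd, int perf_fd, __u64 cookie)
{
	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
		.perf_event.bpf_cookie = cookie,
	);

	return bpf_link_create(prog_fd, perf_fd, BPF_PERF_EVENT, &opts);
}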
int bpf_link_detach(int link_fd)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.link_detach.link_fd = link_fd;
ret = sys_bpf(BPF_LINK_DETACH, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
int bpf_link_update(int link_fd, int new_prog_fd,
const struct bpf_link_update_opts *opts)
{
union bpf_attr attr;
int ret;
if (!OPTS_VALID(opts, bpf_link_update_opts))
return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr));
attr.link_update.link_fd = link_fd;
attr.link_update.new_prog_fd = new_prog_fd;
attr.link_update.flags = OPTS_GET(opts, flags, 0);
attr.link_update.old_prog_fd = OPTS_GET(opts, old_prog_fd, 0);
ret = sys_bpf(BPF_LINK_UPDATE, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
int bpf_iter_create(int link_fd)
{
union bpf_attr attr;
int fd;
memset(&attr, 0, sizeof(attr));
attr.iter_create.link_fd = link_fd;
fd = sys_bpf(BPF_ITER_CREATE, &attr, sizeof(attr));
return libbpf_err_errno(fd);
}
int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
@@ -510,10 +781,12 @@ int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
attr.query.prog_ids = ptr_to_u64(prog_ids);
ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr));
if (attach_flags)
*attach_flags = attr.query.attach_flags;
*prog_cnt = attr.query.prog_cnt;
return libbpf_err_errno(ret);
}
int bpf_prog_test_run(int prog_fd, int repeat, void *data, __u32 size,
@@ -531,13 +804,15 @@ int bpf_prog_test_run(int prog_fd, int repeat, void *data, __u32 size,
attr.test.repeat = repeat;
ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr));
if (size_out)
*size_out = attr.test.data_size_out;
if (retval)
*retval = attr.test.retval;
if (duration)
*duration = attr.test.duration;
return libbpf_err_errno(ret);
}
int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr)
@@ -546,7 +821,7 @@ int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr)
int ret;
if (!test_attr->data_out && test_attr->data_size_out > 0)
return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr));
attr.test.prog_fd = test_attr->prog_fd;
@@ -561,11 +836,46 @@ int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr)
attr.test.repeat = test_attr->repeat;
ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr));
test_attr->data_size_out = attr.test.data_size_out;
test_attr->ctx_size_out = attr.test.ctx_size_out;
test_attr->retval = attr.test.retval;
test_attr->duration = attr.test.duration;
return libbpf_err_errno(ret);
}
int bpf_prog_test_run_opts(int prog_fd, struct bpf_test_run_opts *opts)
{
union bpf_attr attr;
int ret;
if (!OPTS_VALID(opts, bpf_test_run_opts))
return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr));
attr.test.prog_fd = prog_fd;
attr.test.cpu = OPTS_GET(opts, cpu, 0);
attr.test.flags = OPTS_GET(opts, flags, 0);
attr.test.repeat = OPTS_GET(opts, repeat, 0);
attr.test.duration = OPTS_GET(opts, duration, 0);
attr.test.ctx_size_in = OPTS_GET(opts, ctx_size_in, 0);
attr.test.ctx_size_out = OPTS_GET(opts, ctx_size_out, 0);
attr.test.data_size_in = OPTS_GET(opts, data_size_in, 0);
attr.test.data_size_out = OPTS_GET(opts, data_size_out, 0);
attr.test.ctx_in = ptr_to_u64(OPTS_GET(opts, ctx_in, NULL));
attr.test.ctx_out = ptr_to_u64(OPTS_GET(opts, ctx_out, NULL));
attr.test.data_in = ptr_to_u64(OPTS_GET(opts, data_in, NULL));
attr.test.data_out = ptr_to_u64(OPTS_GET(opts, data_out, NULL));
ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr));
OPTS_SET(opts, data_size_out, attr.test.data_size_out);
OPTS_SET(opts, ctx_size_out, attr.test.ctx_size_out);
OPTS_SET(opts, duration, attr.test.duration);
OPTS_SET(opts, retval, attr.test.retval);
return libbpf_err_errno(ret);
}
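/* Illustrative usage sketch: run an XDP program once over a synthetic packet
 * with the opts-based test-run API. The packet contents and sizes are
 * placeholders.
 */
#include <bpf/bpf.h>

static int test_run_once(int prog_fd)
{
	char pkt_in[64] = {0}, pkt_out[128];
	DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
		.data_in = pkt_in,
		.data_size_in = sizeof(pkt_in),
		.data_out = pkt_out,
		.data_size_out = sizeof(pkt_out),
		.repeat = 1,
	);
	int err;

	err = bpf_prog_test_run_opts(prog_fd, &opts);
	if (err)
		return err;
	/* opts.retval holds the program's return code, opts.duration the
	 * average run time in nanoseconds
	 */
	return opts.retval;
}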
static int bpf_obj_get_next_id(__u32 start_id, __u32 *next_id, int cmd)
@@ -580,7 +890,7 @@ static int bpf_obj_get_next_id(__u32 start_id, __u32 *next_id, int cmd)
if (!err)
*next_id = attr.next_id;
return libbpf_err_errno(err);
}
int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id)
@@ -598,65 +908,91 @@ int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id)
return bpf_obj_get_next_id(start_id, next_id, BPF_BTF_GET_NEXT_ID);
}
int bpf_link_get_next_id(__u32 start_id, __u32 *next_id)
{
return bpf_obj_get_next_id(start_id, next_id, BPF_LINK_GET_NEXT_ID);
}
int bpf_prog_get_fd_by_id(__u32 id)
{
union bpf_attr attr;
int fd;
memset(&attr, 0, sizeof(attr));
attr.prog_id = id;
fd = sys_bpf(BPF_PROG_GET_FD_BY_ID, &attr, sizeof(attr));
return libbpf_err_errno(fd);
}
int bpf_map_get_fd_by_id(__u32 id)
{
union bpf_attr attr;
int fd;
memset(&attr, 0, sizeof(attr));
attr.map_id = id;
fd = sys_bpf(BPF_MAP_GET_FD_BY_ID, &attr, sizeof(attr));
return libbpf_err_errno(fd);
}
int bpf_btf_get_fd_by_id(__u32 id)
{
union bpf_attr attr;
int fd;
memset(&attr, 0, sizeof(attr));
attr.btf_id = id;
fd = sys_bpf(BPF_BTF_GET_FD_BY_ID, &attr, sizeof(attr));
return libbpf_err_errno(fd);
}
int bpf_link_get_fd_by_id(__u32 id)
{
union bpf_attr attr;
int fd;
memset(&attr, 0, sizeof(attr));
attr.link_id = id;
fd = sys_bpf(BPF_LINK_GET_FD_BY_ID, &attr, sizeof(attr));
return libbpf_err_errno(fd);
}
int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len)
{
union bpf_attr attr;
int err;
memset(&attr, 0, sizeof(attr));
attr.info.bpf_fd = bpf_fd;
attr.info.info_len = *info_len;
attr.info.info = ptr_to_u64(info);
err = sys_bpf(BPF_OBJ_GET_INFO_BY_FD, &attr, sizeof(attr));
if (!err)
*info_len = attr.info.info_len;
return libbpf_err_errno(err);
}
int bpf_raw_tracepoint_open(const char *name, int prog_fd)
{
union bpf_attr attr;
int fd;
memset(&attr, 0, sizeof(attr));
attr.raw_tracepoint.name = ptr_to_u64(name);
attr.raw_tracepoint.prog_fd = prog_fd;
fd = sys_bpf(BPF_RAW_TRACEPOINT_OPEN, &attr, sizeof(attr));
return libbpf_err_errno(fd);
}
int bpf_load_btf(const void *btf, __u32 btf_size, char *log_buf, __u32 log_buf_size,
bool do_log)
{
union bpf_attr attr = {};
@@ -673,12 +1009,13 @@ retry:
}
fd = sys_bpf(BPF_BTF_LOAD, &attr, sizeof(attr));
if (fd < 0 && !do_log && log_buf && log_buf_size) {
do_log = true;
goto retry;
}
return libbpf_err_errno(fd);
}
int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf, __u32 *buf_len,
@@ -695,11 +1032,42 @@ int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf, __u32 *buf_len,
attr.task_fd_query.buf_len = *buf_len;
err = sys_bpf(BPF_TASK_FD_QUERY, &attr, sizeof(attr));
*buf_len = attr.task_fd_query.buf_len;
*prog_id = attr.task_fd_query.prog_id;
*fd_type = attr.task_fd_query.fd_type;
*probe_offset = attr.task_fd_query.probe_offset;
*probe_addr = attr.task_fd_query.probe_addr;
return libbpf_err_errno(err);
}
int bpf_enable_stats(enum bpf_stats_type type)
{
union bpf_attr attr;
int fd;
memset(&attr, 0, sizeof(attr));
attr.enable_stats.type = type;
fd = sys_bpf(BPF_ENABLE_STATS, &attr, sizeof(attr));
return libbpf_err_errno(fd);
}
int bpf_prog_bind_map(int prog_fd, int map_fd,
const struct bpf_prog_bind_opts *opts)
{
union bpf_attr attr;
int ret;
if (!OPTS_VALID(opts, bpf_prog_bind_opts))
return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr));
attr.prog_bind_map.prog_fd = prog_fd;
attr.prog_bind_map.map_fd = map_fd;
attr.prog_bind_map.flags = OPTS_GET(opts, flags, 0);
ret = sys_bpf(BPF_PROG_BIND_MAP, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
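/* Illustrative usage sketch: turn on run-time statistics collection; stats
 * stay enabled only while the returned FD is kept open, so the caller is
 * expected to hold on to it.
 */
#include <bpf/bpf.h>

static int enable_runtime_stats(void)
{
	int stats_fd = bpf_enable_stats(BPF_STATS_RUN_TIME);

	return stats_fd; /* negative error, or an FD to keep open */
}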

133
src/bpf.h

@@ -28,14 +28,12 @@
#include <stddef.h>
#include <stdint.h>
#include "libbpf_common.h"
#ifdef __cplusplus
extern "C" {
#endif
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
#endif
struct bpf_create_map_attr {
const char *name;
enum bpf_map_type map_type;
@@ -48,7 +46,10 @@ struct bpf_create_map_attr {
__u32 btf_key_type_id;
__u32 btf_value_type_id;
__u32 map_ifindex;
union {
__u32 inner_map_fd;
__u32 btf_vmlinux_value_type_id;
};
};
LIBBPF_API int
@@ -77,8 +78,14 @@ struct bpf_load_program_attr {
const struct bpf_insn *insns;
size_t insns_cnt;
const char *license;
union {
__u32 kern_version;
__u32 attach_prog_fd;
};
union {
__u32 prog_ifindex;
__u32 attach_btf_id;
};
__u32 prog_btf_fd;
__u32 func_info_rec_size;
const void *func_info;
@@ -117,17 +124,86 @@ LIBBPF_API int bpf_map_lookup_elem_flags(int fd, const void *key, void *value,
__u64 flags);
LIBBPF_API int bpf_map_lookup_and_delete_elem(int fd, const void *key,
void *value);
LIBBPF_API int bpf_map_lookup_and_delete_elem_flags(int fd, const void *key,
void *value, __u64 flags);
LIBBPF_API int bpf_map_delete_elem(int fd, const void *key);
LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key);
LIBBPF_API int bpf_map_freeze(int fd);
struct bpf_map_batch_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u64 elem_flags;
__u64 flags;
};
#define bpf_map_batch_opts__last_field flags
LIBBPF_API int bpf_map_delete_batch(int fd, void *keys,
__u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch,
void *keys, void *values, __u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_map_lookup_and_delete_batch(int fd, void *in_batch,
void *out_batch, void *keys,
void *values, __u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_map_update_batch(int fd, void *keys, void *values,
__u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_obj_pin(int fd, const char *pathname);
LIBBPF_API int bpf_obj_get(const char *pathname);
struct bpf_prog_attach_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
unsigned int flags;
int replace_prog_fd;
};
#define bpf_prog_attach_opts__last_field replace_prog_fd
LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd,
enum bpf_attach_type type, unsigned int flags);
LIBBPF_API int bpf_prog_attach_xattr(int prog_fd, int attachable_fd,
enum bpf_attach_type type,
const struct bpf_prog_attach_opts *opts);
LIBBPF_API int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);
LIBBPF_API int bpf_prog_detach2(int prog_fd, int attachable_fd,
enum bpf_attach_type type);
union bpf_iter_link_info; /* defined in up-to-date linux/bpf.h */
struct bpf_link_create_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u32 flags;
union bpf_iter_link_info *iter_info;
__u32 iter_info_len;
__u32 target_btf_id;
union {
struct {
__u64 bpf_cookie;
} perf_event;
};
size_t :0;
};
#define bpf_link_create_opts__last_field perf_event
LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
enum bpf_attach_type attach_type,
const struct bpf_link_create_opts *opts);
LIBBPF_API int bpf_link_detach(int link_fd);
struct bpf_link_update_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u32 flags; /* extra flags */
__u32 old_prog_fd; /* expected old program FD */
};
#define bpf_link_update_opts__last_field old_prog_fd
LIBBPF_API int bpf_link_update(int link_fd, int new_prog_fd,
const struct bpf_link_update_opts *opts);
LIBBPF_API int bpf_iter_create(int link_fd);
struct bpf_prog_test_run_attr {
int prog_fd;
int repeat;
@@ -157,20 +233,59 @@ LIBBPF_API int bpf_prog_test_run(int prog_fd, int repeat, void *data,
LIBBPF_API int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_map_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_link_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_prog_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_map_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_link_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len);
LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type,
__u32 query_flags, __u32 *attach_flags,
__u32 *prog_ids, __u32 *prog_cnt);
LIBBPF_API int bpf_raw_tracepoint_open(const char *name, int prog_fd);
LIBBPF_API int bpf_load_btf(const void *btf, __u32 btf_size, char *log_buf,
__u32 log_buf_size, bool do_log);
LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
__u32 *buf_len, __u32 *prog_id, __u32 *fd_type,
__u64 *probe_offset, __u64 *probe_addr);
enum bpf_stats_type; /* defined in up-to-date linux/bpf.h */
LIBBPF_API int bpf_enable_stats(enum bpf_stats_type type);
struct bpf_prog_bind_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u32 flags;
};
#define bpf_prog_bind_opts__last_field flags
LIBBPF_API int bpf_prog_bind_map(int prog_fd, int map_fd,
const struct bpf_prog_bind_opts *opts);
struct bpf_test_run_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
const void *data_in; /* optional */
void *data_out; /* optional */
__u32 data_size_in;
__u32 data_size_out; /* in: max length of data_out
* out: length of data_out
*/
const void *ctx_in; /* optional */
void *ctx_out; /* optional */
__u32 ctx_size_in;
__u32 ctx_size_out; /* in: max length of ctx_out
* out: length of ctx_out
*/
__u32 retval; /* out: return code of the BPF program */
int repeat;
__u32 duration; /* out: average per repetition in ns */
__u32 flags;
__u32 cpu;
};
#define bpf_test_run_opts__last_field cpu
LIBBPF_API int bpf_prog_test_run_opts(int prog_fd,
struct bpf_test_run_opts *opts);
#ifdef __cplusplus
} /* extern "C" */
#endif

444
src/bpf_core_read.h Normal file

@@ -0,0 +1,444 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
#ifndef __BPF_CORE_READ_H__
#define __BPF_CORE_READ_H__
/*
* enum bpf_field_info_kind is passed as a second argument into
* __builtin_preserve_field_info() built-in to get a specific aspect of
* a field, captured as a first argument. __builtin_preserve_field_info(field,
* info_kind) returns __u32 integer and produces BTF field relocation, which
* is understood and processed by libbpf during BPF object loading. See
* selftests/bpf for examples.
*/
enum bpf_field_info_kind {
BPF_FIELD_BYTE_OFFSET = 0, /* field byte offset */
BPF_FIELD_BYTE_SIZE = 1,
BPF_FIELD_EXISTS = 2, /* field existence in target kernel */
BPF_FIELD_SIGNED = 3,
BPF_FIELD_LSHIFT_U64 = 4,
BPF_FIELD_RSHIFT_U64 = 5,
};
/* second argument to __builtin_btf_type_id() built-in */
enum bpf_type_id_kind {
BPF_TYPE_ID_LOCAL = 0, /* BTF type ID in local program */
BPF_TYPE_ID_TARGET = 1, /* BTF type ID in target kernel */
};
/* second argument to __builtin_preserve_type_info() built-in */
enum bpf_type_info_kind {
BPF_TYPE_EXISTS = 0, /* type existence in target kernel */
BPF_TYPE_SIZE = 1, /* type size in target kernel */
};
/* second argument to __builtin_preserve_enum_value() built-in */
enum bpf_enum_value_kind {
BPF_ENUMVAL_EXISTS = 0, /* enum value existence in kernel */
BPF_ENUMVAL_VALUE = 1, /* enum value value relocation */
};
#define __CORE_RELO(src, field, info) \
__builtin_preserve_field_info((src)->field, BPF_FIELD_##info)
#if __BYTE_ORDER == __LITTLE_ENDIAN
#define __CORE_BITFIELD_PROBE_READ(dst, src, fld) \
bpf_probe_read_kernel( \
(void *)dst, \
__CORE_RELO(src, fld, BYTE_SIZE), \
(const void *)src + __CORE_RELO(src, fld, BYTE_OFFSET))
#else
/* semantics of LSHIFT_64 assumes loading values into low-ordered bytes, so
* for big-endian we need to adjust destination pointer accordingly, based on
* field byte size
*/
#define __CORE_BITFIELD_PROBE_READ(dst, src, fld) \
bpf_probe_read_kernel( \
(void *)dst + (8 - __CORE_RELO(src, fld, BYTE_SIZE)), \
__CORE_RELO(src, fld, BYTE_SIZE), \
(const void *)src + __CORE_RELO(src, fld, BYTE_OFFSET))
#endif
/*
* Extract bitfield, identified by s->field, and return its value as u64.
* All this is done in a relocatable manner, so bitfield changes such as
* signedness, bit size, or offset are handled automatically.
* This version of the macro uses bpf_probe_read_kernel() to read the
* underlying integer storage. The macro functions as an expression whose
* value is bpf_probe_read_kernel()'s return value: 0 on success, <0 on error.
*/
#define BPF_CORE_READ_BITFIELD_PROBED(s, field) ({ \
unsigned long long val = 0; \
\
__CORE_BITFIELD_PROBE_READ(&val, s, field); \
val <<= __CORE_RELO(s, field, LSHIFT_U64); \
if (__CORE_RELO(s, field, SIGNED)) \
val = ((long long)val) >> __CORE_RELO(s, field, RSHIFT_U64); \
else \
val = val >> __CORE_RELO(s, field, RSHIFT_U64); \
val; \
})
/*
* Extract bitfield, identified by s->field, and return its value as u64.
* This version of the macro uses direct memory reads and should be used from
* BPF program types that support such functionality (e.g., typed raw
* tracepoints).
*/
#define BPF_CORE_READ_BITFIELD(s, field) ({ \
const void *p = (const void *)s + __CORE_RELO(s, field, BYTE_OFFSET); \
unsigned long long val; \
\
/* This is a so-called barrier_var() operation that makes specified \
* variable "a black box" for optimizing compiler. \
* It forces compiler to perform BYTE_OFFSET relocation on p and use \
* its calculated value in the switch below, instead of applying \
* the same relocation 4 times for each individual memory load. \
*/ \
asm volatile("" : "=r"(p) : "0"(p)); \
\
switch (__CORE_RELO(s, field, BYTE_SIZE)) { \
case 1: val = *(const unsigned char *)p; break; \
case 2: val = *(const unsigned short *)p; break; \
case 4: val = *(const unsigned int *)p; break; \
case 8: val = *(const unsigned long long *)p; break; \
} \
val <<= __CORE_RELO(s, field, LSHIFT_U64); \
if (__CORE_RELO(s, field, SIGNED)) \
val = ((long long)val) >> __CORE_RELO(s, field, RSHIFT_U64); \
else \
val = val >> __CORE_RELO(s, field, RSHIFT_U64); \
val; \
})
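/* Illustrative usage sketch (BPF program side): read a kernel bitfield in a
 * relocatable way. 'struct tcp_sock' and its 'is_cwnd_limited' bitfield are
 * example targets assumed to come from vmlinux.h; the direct-read variant
 * requires a program type with direct memory access (e.g. fentry/tp_btf).
 */
static inline unsigned long long tp_cwnd_limited(const struct tcp_sock *tp)
{
	return BPF_CORE_READ_BITFIELD(tp, is_cwnd_limited);
}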
/*
* Convenience macro to check that a field actually exists in the target kernel.
* Returns:
* 1, if matching field is present in target kernel;
* 0, if no matching field found.
*/
#define bpf_core_field_exists(field) \
__builtin_preserve_field_info(field, BPF_FIELD_EXISTS)
/*
* Convenience macro to get the byte size of a field. Works for integers,
* struct/unions, pointers, arrays, and enums.
*/
#define bpf_core_field_size(field) \
__builtin_preserve_field_info(field, BPF_FIELD_BYTE_SIZE)
/*
* Convenience macro to get the BTF type ID of a specified type, using local BTF
* information. Returns a 32-bit unsigned integer with the type ID from the
* program's own BTF. Always succeeds.
*/
#define bpf_core_type_id_local(type) \
__builtin_btf_type_id(*(typeof(type) *)0, BPF_TYPE_ID_LOCAL)
/*
* Convenience macro to get BTF type ID of a target kernel's type that matches
* specified local type.
* Returns:
* - valid 32-bit unsigned type ID in kernel BTF;
* - 0, if no matching type was found in a target kernel BTF.
*/
#define bpf_core_type_id_kernel(type) \
__builtin_btf_type_id(*(typeof(type) *)0, BPF_TYPE_ID_TARGET)
/*
* Convenience macro to check that provided named type
* (struct/union/enum/typedef) exists in a target kernel.
* Returns:
* 1, if such type is present in target kernel's BTF;
* 0, if no matching type is found.
*/
#define bpf_core_type_exists(type) \
__builtin_preserve_type_info(*(typeof(type) *)0, BPF_TYPE_EXISTS)
/*
* Convenience macro to get the byte size of a provided named type
* (struct/union/enum/typedef) in a target kernel.
* Returns:
* >= 0 size (in bytes), if type is present in target kernel's BTF;
* 0, if no matching type is found.
*/
#define bpf_core_type_size(type) \
__builtin_preserve_type_info(*(typeof(type) *)0, BPF_TYPE_SIZE)
/*
* Convenience macro to check that provided enumerator value is defined in
* a target kernel.
* Returns:
* 1, if specified enum type and its enumerator value are present in target
* kernel's BTF;
* 0, if no matching enum and/or enum value within that enum is found.
*/
#define bpf_core_enum_value_exists(enum_type, enum_value) \
__builtin_preserve_enum_value(*(typeof(enum_type) *)enum_value, BPF_ENUMVAL_EXISTS)
/*
* Convenience macro to get the integer value of an enumerator value in
* a target kernel.
* Returns:
* 64-bit value, if specified enum type and its enumerator value are
* present in target kernel's BTF;
* 0, if no matching enum and/or enum value within that enum is found.
*/
#define bpf_core_enum_value(enum_type, enum_value) \
__builtin_preserve_enum_value(*(typeof(enum_type) *)enum_value, BPF_ENUMVAL_VALUE)
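/* Illustrative usage sketch: guard access to a field that may be missing on
 * older kernels and look up a kernel enum value by name. 'struct task_struct',
 * 'enum pid_type' and PIDTYPE_MAX are assumed to come from vmlinux.h;
 * BPF_CORE_READ() is defined further below in this header.
 */
static inline int task_recent_cpu(const struct task_struct *task)
{
	if (bpf_core_field_exists(task->recent_used_cpu))
		return BPF_CORE_READ(task, recent_used_cpu);
	return -1;
}

static inline long long kernel_pidtype_max(void)
{
	if (bpf_core_enum_value_exists(enum pid_type, PIDTYPE_MAX))
		return bpf_core_enum_value(enum pid_type, PIDTYPE_MAX);
	return -1;
}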
/*
* bpf_core_read() abstracts away bpf_probe_read_kernel() call and captures
* offset relocation for source address using __builtin_preserve_access_index()
* built-in, provided by Clang.
*
* __builtin_preserve_access_index() takes as an argument an expression of
* taking an address of a field within struct/union. It makes compiler emit
* a relocation, which records BTF type ID describing root struct/union and an
* accessor string which describes exact embedded field that was used to take
* an address. See detailed description of this relocation format and
* semantics in comments to struct bpf_field_reloc in libbpf_internal.h.
*
* This relocation allows libbpf to adjust BPF instruction to use correct
* actual field offset, based on target kernel BTF type that matches original
* (local) BTF, used to record relocation.
*/
#define bpf_core_read(dst, sz, src) \
bpf_probe_read_kernel(dst, sz, (const void *)__builtin_preserve_access_index(src))
/* NOTE: see comments for BPF_CORE_READ_USER() about the proper types use. */
#define bpf_core_read_user(dst, sz, src) \
bpf_probe_read_user(dst, sz, (const void *)__builtin_preserve_access_index(src))
/*
* bpf_core_read_str() is a thin wrapper around bpf_probe_read_str()
* additionally emitting BPF CO-RE field relocation for specified source
* argument.
*/
#define bpf_core_read_str(dst, sz, src) \
bpf_probe_read_kernel_str(dst, sz, (const void *)__builtin_preserve_access_index(src))
/* NOTE: see comments for BPF_CORE_READ_USER() about the proper types use. */
#define bpf_core_read_user_str(dst, sz, src) \
bpf_probe_read_user_str(dst, sz, (const void *)__builtin_preserve_access_index(src))
#define ___concat(a, b) a ## b
#define ___apply(fn, n) ___concat(fn, n)
#define ___nth(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, __11, N, ...) N
/*
* return number of provided arguments; used for switch-based variadic macro
* definitions (see ___last, ___arrow, etc below)
*/
#define ___narg(...) ___nth(_, ##__VA_ARGS__, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
/*
* return 0 if no arguments are passed, N - otherwise; used for
* recursively-defined macros to specify termination (0) case, and generic
* (N) case (e.g., ___read_ptrs, ___core_read)
*/
#define ___empty(...) ___nth(_, ##__VA_ARGS__, N, N, N, N, N, N, N, N, N, N, 0)
#define ___last1(x) x
#define ___last2(a, x) x
#define ___last3(a, b, x) x
#define ___last4(a, b, c, x) x
#define ___last5(a, b, c, d, x) x
#define ___last6(a, b, c, d, e, x) x
#define ___last7(a, b, c, d, e, f, x) x
#define ___last8(a, b, c, d, e, f, g, x) x
#define ___last9(a, b, c, d, e, f, g, h, x) x
#define ___last10(a, b, c, d, e, f, g, h, i, x) x
#define ___last(...) ___apply(___last, ___narg(__VA_ARGS__))(__VA_ARGS__)
#define ___nolast2(a, _) a
#define ___nolast3(a, b, _) a, b
#define ___nolast4(a, b, c, _) a, b, c
#define ___nolast5(a, b, c, d, _) a, b, c, d
#define ___nolast6(a, b, c, d, e, _) a, b, c, d, e
#define ___nolast7(a, b, c, d, e, f, _) a, b, c, d, e, f
#define ___nolast8(a, b, c, d, e, f, g, _) a, b, c, d, e, f, g
#define ___nolast9(a, b, c, d, e, f, g, h, _) a, b, c, d, e, f, g, h
#define ___nolast10(a, b, c, d, e, f, g, h, i, _) a, b, c, d, e, f, g, h, i
#define ___nolast(...) ___apply(___nolast, ___narg(__VA_ARGS__))(__VA_ARGS__)
#define ___arrow1(a) a
#define ___arrow2(a, b) a->b
#define ___arrow3(a, b, c) a->b->c
#define ___arrow4(a, b, c, d) a->b->c->d
#define ___arrow5(a, b, c, d, e) a->b->c->d->e
#define ___arrow6(a, b, c, d, e, f) a->b->c->d->e->f
#define ___arrow7(a, b, c, d, e, f, g) a->b->c->d->e->f->g
#define ___arrow8(a, b, c, d, e, f, g, h) a->b->c->d->e->f->g->h
#define ___arrow9(a, b, c, d, e, f, g, h, i) a->b->c->d->e->f->g->h->i
#define ___arrow10(a, b, c, d, e, f, g, h, i, j) a->b->c->d->e->f->g->h->i->j
#define ___arrow(...) ___apply(___arrow, ___narg(__VA_ARGS__))(__VA_ARGS__)
#define ___type(...) typeof(___arrow(__VA_ARGS__))
#define ___read(read_fn, dst, src_type, src, accessor) \
read_fn((void *)(dst), sizeof(*(dst)), &((src_type)(src))->accessor)
/* "recursively" read a sequence of inner pointers using local __t var */
#define ___rd_first(fn, src, a) ___read(fn, &__t, ___type(src), src, a);
#define ___rd_last(fn, ...) \
___read(fn, &__t, ___type(___nolast(__VA_ARGS__)), __t, ___last(__VA_ARGS__));
#define ___rd_p1(fn, ...) const void *__t; ___rd_first(fn, __VA_ARGS__)
#define ___rd_p2(fn, ...) ___rd_p1(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p3(fn, ...) ___rd_p2(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p4(fn, ...) ___rd_p3(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p5(fn, ...) ___rd_p4(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p6(fn, ...) ___rd_p5(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p7(fn, ...) ___rd_p6(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p8(fn, ...) ___rd_p7(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p9(fn, ...) ___rd_p8(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___read_ptrs(fn, src, ...) \
___apply(___rd_p, ___narg(__VA_ARGS__))(fn, src, __VA_ARGS__)
#define ___core_read0(fn, fn_ptr, dst, src, a) \
___read(fn, dst, ___type(src), src, a);
#define ___core_readN(fn, fn_ptr, dst, src, ...) \
___read_ptrs(fn_ptr, src, ___nolast(__VA_ARGS__)) \
___read(fn, dst, ___type(src, ___nolast(__VA_ARGS__)), __t, \
___last(__VA_ARGS__));
#define ___core_read(fn, fn_ptr, dst, src, a, ...) \
___apply(___core_read, ___empty(__VA_ARGS__))(fn, fn_ptr, dst, \
src, a, ##__VA_ARGS__)
/*
* BPF_CORE_READ_INTO() is a more performance-conscious variant of
* BPF_CORE_READ(), in which final field is read into user-provided storage.
* See BPF_CORE_READ() below for more details on general usage.
*/
#define BPF_CORE_READ_INTO(dst, src, a, ...) ({ \
___core_read(bpf_core_read, bpf_core_read, \
dst, (src), a, ##__VA_ARGS__) \
})
/*
* Variant of BPF_CORE_READ_INTO() for reading from user-space memory.
*
* NOTE: see comments for BPF_CORE_READ_USER() about the proper types use.
*/
#define BPF_CORE_READ_USER_INTO(dst, src, a, ...) ({ \
___core_read(bpf_core_read_user, bpf_core_read_user, \
dst, (src), a, ##__VA_ARGS__) \
})
/* Non-CO-RE variant of BPF_CORE_READ_INTO() */
#define BPF_PROBE_READ_INTO(dst, src, a, ...) ({ \
___core_read(bpf_probe_read, bpf_probe_read, \
dst, (src), a, ##__VA_ARGS__) \
})
/* Non-CO-RE variant of BPF_CORE_READ_USER_INTO().
*
* As no CO-RE relocations are emitted, source types can be arbitrary and are
* not restricted to kernel types only.
*/
#define BPF_PROBE_READ_USER_INTO(dst, src, a, ...) ({ \
___core_read(bpf_probe_read_user, bpf_probe_read_user, \
dst, (src), a, ##__VA_ARGS__) \
})
/*
* BPF_CORE_READ_STR_INTO() does the same "pointer chasing" as
* BPF_CORE_READ() for intermediate pointers, but then executes (and returns
* the error code of) bpf_core_read_str() for the final string read.
*/
#define BPF_CORE_READ_STR_INTO(dst, src, a, ...) ({ \
___core_read(bpf_core_read_str, bpf_core_read, \
dst, (src), a, ##__VA_ARGS__) \
})
/*
* Variant of BPF_CORE_READ_STR_INTO() for reading from user-space memory.
*
* NOTE: see comments for BPF_CORE_READ_USER() about the proper types use.
*/
#define BPF_CORE_READ_USER_STR_INTO(dst, src, a, ...) ({ \
___core_read(bpf_core_read_user_str, bpf_core_read_user, \
dst, (src), a, ##__VA_ARGS__) \
})
/* Non-CO-RE variant of BPF_CORE_READ_STR_INTO() */
#define BPF_PROBE_READ_STR_INTO(dst, src, a, ...) ({ \
___core_read(bpf_probe_read_str, bpf_probe_read, \
dst, (src), a, ##__VA_ARGS__) \
})
/*
* Non-CO-RE variant of BPF_CORE_READ_USER_STR_INTO().
*
* As no CO-RE relocations are emitted, source types can be arbitrary and are
* not restricted to kernel types only.
*/
#define BPF_PROBE_READ_USER_STR_INTO(dst, src, a, ...) ({ \
___core_read(bpf_probe_read_user_str, bpf_probe_read_user, \
dst, (src), a, ##__VA_ARGS__) \
})
/*
* BPF_CORE_READ() is used to simplify BPF CO-RE relocatable read, especially
* when there are few pointer chasing steps.
* E.g., what in non-BPF world (or in BPF w/ BCC) would be something like:
* int x = s->a.b.c->d.e->f->g;
* can be succinctly achieved using BPF_CORE_READ as:
* int x = BPF_CORE_READ(s, a.b.c, d.e, f, g);
*
* BPF_CORE_READ will decompose above statement into 4 bpf_core_read (BPF
* CO-RE relocatable bpf_probe_read_kernel() wrapper) calls, logically
* equivalent to:
* 1. const void *__t = s->a.b.c;
* 2. __t = __t->d.e;
* 3. __t = __t->f;
* 4. return __t->g;
*
* Equivalence is logical, because there is a heavy type casting/preservation
* involved, as well as all the reads are happening through
* bpf_probe_read_kernel() calls using __builtin_preserve_access_index() to
* emit CO-RE relocations.
*
* N.B. Only up to 9 "field accessors" are supported, which should be more
* than enough for any practical purpose.
*/
#define BPF_CORE_READ(src, a, ...) ({ \
___type((src), a, ##__VA_ARGS__) __r; \
BPF_CORE_READ_INTO(&__r, (src), a, ##__VA_ARGS__); \
__r; \
})
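/* Illustrative usage sketch: a four-step pointer chase, equivalent to the
 * plain-C expression task->mm->exe_file->f_inode->i_ino. Kernel types are
 * assumed to come from vmlinux.h.
 */
static inline unsigned long long task_exe_inode(const struct task_struct *task)
{
	return BPF_CORE_READ(task, mm, exe_file, f_inode, i_ino);
}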
/*
* Variant of BPF_CORE_READ() for reading from user-space memory.
*
* NOTE: all the source types involved are still *kernel types* and need to
* exist in kernel (or kernel module) BTF, otherwise CO-RE relocation will
* fail. Custom user types are not relocatable with CO-RE.
* The typical situation in which BPF_CORE_READ_USER() might be used is to
* read kernel UAPI types from the user-space memory passed in as a syscall
* input argument.
*/
#define BPF_CORE_READ_USER(src, a, ...) ({ \
___type((src), a, ##__VA_ARGS__) __r; \
BPF_CORE_READ_USER_INTO(&__r, (src), a, ##__VA_ARGS__); \
__r; \
})
/* Non-CO-RE variant of BPF_CORE_READ() */
#define BPF_PROBE_READ(src, a, ...) ({ \
___type((src), a, ##__VA_ARGS__) __r; \
BPF_PROBE_READ_INTO(&__r, (src), a, ##__VA_ARGS__); \
__r; \
})
/*
* Non-CO-RE variant of BPF_CORE_READ_USER().
*
* As no CO-RE relocations are emitted, source types can be arbitrary and are
* not restricted to kernel types only.
*/
#define BPF_PROBE_READ_USER(src, a, ...) ({ \
___type((src), a, ##__VA_ARGS__) __r; \
BPF_PROBE_READ_USER_INTO(&__r, (src), a, ##__VA_ARGS__); \
__r; \
})
#endif

99
src/bpf_endian.h Normal file

@@ -0,0 +1,99 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
#ifndef __BPF_ENDIAN__
#define __BPF_ENDIAN__
/*
* Isolate byte #n and put it into byte #m, for __u##b type.
* E.g., moving byte #6 (nnnnnnnn) into byte #1 (mmmmmmmm) for __u64:
* 1) xxxxxxxx nnnnnnnn xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx mmmmmmmm xxxxxxxx
* 2) nnnnnnnn xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx mmmmmmmm xxxxxxxx 00000000
* 3) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 nnnnnnnn
* 4) 00000000 00000000 00000000 00000000 00000000 00000000 nnnnnnnn 00000000
*/
#define ___bpf_mvb(x, b, n, m) ((__u##b)(x) << (b-(n+1)*8) >> (b-8) << (m*8))
#define ___bpf_swab16(x) ((__u16)( \
___bpf_mvb(x, 16, 0, 1) | \
___bpf_mvb(x, 16, 1, 0)))
#define ___bpf_swab32(x) ((__u32)( \
___bpf_mvb(x, 32, 0, 3) | \
___bpf_mvb(x, 32, 1, 2) | \
___bpf_mvb(x, 32, 2, 1) | \
___bpf_mvb(x, 32, 3, 0)))
#define ___bpf_swab64(x) ((__u64)( \
___bpf_mvb(x, 64, 0, 7) | \
___bpf_mvb(x, 64, 1, 6) | \
___bpf_mvb(x, 64, 2, 5) | \
___bpf_mvb(x, 64, 3, 4) | \
___bpf_mvb(x, 64, 4, 3) | \
___bpf_mvb(x, 64, 5, 2) | \
___bpf_mvb(x, 64, 6, 1) | \
___bpf_mvb(x, 64, 7, 0)))
/* LLVM's BPF target selects the endianness of the CPU
* it compiles on, or the user specifies (bpfel/bpfeb),
* respectively. The used __BYTE_ORDER__ is defined by
* the compiler, we cannot rely on __BYTE_ORDER from
* libc headers, since it doesn't reflect the actual
* requested byte order.
*
* Note, LLVM's BPF target has different __builtin_bswapX()
* semantics. It does map to BPF_ALU | BPF_END | BPF_TO_BE
* in bpfel and bpfeb case, which means below, that we map
* to cpu_to_be16(). We could use it unconditionally in BPF
* case, but better not rely on it, so that this header here
* can be used from application and BPF program side, which
* use different targets.
*/
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define __bpf_ntohs(x) __builtin_bswap16(x)
# define __bpf_htons(x) __builtin_bswap16(x)
# define __bpf_constant_ntohs(x) ___bpf_swab16(x)
# define __bpf_constant_htons(x) ___bpf_swab16(x)
# define __bpf_ntohl(x) __builtin_bswap32(x)
# define __bpf_htonl(x) __builtin_bswap32(x)
# define __bpf_constant_ntohl(x) ___bpf_swab32(x)
# define __bpf_constant_htonl(x) ___bpf_swab32(x)
# define __bpf_be64_to_cpu(x) __builtin_bswap64(x)
# define __bpf_cpu_to_be64(x) __builtin_bswap64(x)
# define __bpf_constant_be64_to_cpu(x) ___bpf_swab64(x)
# define __bpf_constant_cpu_to_be64(x) ___bpf_swab64(x)
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
# define __bpf_ntohs(x) (x)
# define __bpf_htons(x) (x)
# define __bpf_constant_ntohs(x) (x)
# define __bpf_constant_htons(x) (x)
# define __bpf_ntohl(x) (x)
# define __bpf_htonl(x) (x)
# define __bpf_constant_ntohl(x) (x)
# define __bpf_constant_htonl(x) (x)
# define __bpf_be64_to_cpu(x) (x)
# define __bpf_cpu_to_be64(x) (x)
# define __bpf_constant_be64_to_cpu(x) (x)
# define __bpf_constant_cpu_to_be64(x) (x)
#else
# error "Fix your compiler's __BYTE_ORDER__?!"
#endif
#define bpf_htons(x) \
(__builtin_constant_p(x) ? \
__bpf_constant_htons(x) : __bpf_htons(x))
#define bpf_ntohs(x) \
(__builtin_constant_p(x) ? \
__bpf_constant_ntohs(x) : __bpf_ntohs(x))
#define bpf_htonl(x) \
(__builtin_constant_p(x) ? \
__bpf_constant_htonl(x) : __bpf_htonl(x))
#define bpf_ntohl(x) \
(__builtin_constant_p(x) ? \
__bpf_constant_ntohl(x) : __bpf_ntohl(x))
#define bpf_cpu_to_be64(x) \
(__builtin_constant_p(x) ? \
__bpf_constant_cpu_to_be64(x) : __bpf_cpu_to_be64(x))
#define bpf_be64_to_cpu(x) \
(__builtin_constant_p(x) ? \
__bpf_constant_be64_to_cpu(x) : __bpf_be64_to_cpu(x))
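/* Illustrative usage sketch: compare a network-byte-order port against a host
 * constant; with a constant argument bpf_htons() folds at compile time.
 */
static inline int is_http_port(__u16 dport_be)
{
	return dport_be == bpf_htons(80);
}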
#endif /* __BPF_ENDIAN__ */

55
src/bpf_gen_internal.h Normal file

@@ -0,0 +1,55 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2021 Facebook */
#ifndef __BPF_GEN_INTERNAL_H
#define __BPF_GEN_INTERNAL_H
struct ksym_relo_desc {
const char *name;
int kind;
int insn_idx;
bool is_weak;
};
struct ksym_desc {
const char *name;
int ref;
int kind;
int off;
int insn;
};
struct bpf_gen {
struct gen_loader_opts *opts;
void *data_start;
void *data_cur;
void *insn_start;
void *insn_cur;
ssize_t cleanup_label;
__u32 nr_progs;
__u32 nr_maps;
int log_level;
int error;
struct ksym_relo_desc *relos;
int relo_cnt;
char attach_target[128];
int attach_kind;
struct ksym_desc *ksyms;
__u32 nr_ksyms;
int fd_array;
int nr_fd_array;
};
void bpf_gen__init(struct bpf_gen *gen, int log_level);
int bpf_gen__finish(struct bpf_gen *gen);
void bpf_gen__free(struct bpf_gen *gen);
void bpf_gen__load_btf(struct bpf_gen *gen, const void *raw_data, __u32 raw_size);
void bpf_gen__map_create(struct bpf_gen *gen, struct bpf_create_map_attr *map_attr, int map_idx);
struct bpf_prog_load_params;
void bpf_gen__prog_load(struct bpf_gen *gen, struct bpf_prog_load_params *load_attr, int prog_idx);
void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *value, __u32 value_size);
void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx);
void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *name, enum bpf_attach_type type);
void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak, int kind,
int insn_idx);
#endif

4086
src/bpf_helper_defs.h Normal file

File diff suppressed because it is too large

262
src/bpf_helpers.h Normal file

@@ -0,0 +1,262 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
#ifndef __BPF_HELPERS__
#define __BPF_HELPERS__
/*
* Note that bpf programs need to include either
* vmlinux.h (auto-generated from BTF) or linux/types.h
* in advance since bpf_helper_defs.h uses such types
* as __u64.
*/
#include "bpf_helper_defs.h"
#define __uint(name, val) int (*name)[val]
#define __type(name, val) typeof(val) *name
#define __array(name, val) typeof(val) *name[]
/*
* Helper macro to place programs, maps, license in
* different sections in elf_bpf file. Section names
* are interpreted by libbpf depending on the context (BPF programs, BPF maps,
* extern variables, etc).
* To allow use of SEC() with externs (e.g., for extern .maps declarations),
* make sure __attribute__((unused)) doesn't trigger compilation warning.
*/
#define SEC(name) \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wignored-attributes\"") \
__attribute__((section(name), used)) \
_Pragma("GCC diagnostic pop") \
/* Avoid 'linux/stddef.h' definition of '__always_inline'. */
#undef __always_inline
#define __always_inline inline __attribute__((always_inline))
#ifndef __noinline
#define __noinline __attribute__((noinline))
#endif
#ifndef __weak
#define __weak __attribute__((weak))
#endif
/*
* Use __hidden attribute to mark a non-static BPF subprogram effectively
* static for BPF verifier's verification algorithm purposes, allowing more
* extensive and permissive BPF verification process, taking into account
* subprogram's caller context.
*/
#define __hidden __attribute__((visibility("hidden")))
/* When utilizing vmlinux.h with BPF CO-RE, user BPF programs can't include
* any system-level headers (such as stddef.h, linux/version.h, etc), and
* commonly-used macros like NULL and KERNEL_VERSION aren't available through
* vmlinux.h. This just adds unnecessary hurdles and forces users to re-define
* them on their own. So as a convenience, provide such definitions here.
*/
#ifndef NULL
#define NULL ((void *)0)
#endif
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))
#endif
/*
* Helper macros to manipulate data structures
*/
#ifndef offsetof
#define offsetof(TYPE, MEMBER) ((unsigned long)&((TYPE *)0)->MEMBER)
#endif
#ifndef container_of
#define container_of(ptr, type, member) \
({ \
void *__mptr = (void *)(ptr); \
((type *)(__mptr - offsetof(type, member))); \
})
#endif
/*
* Helper macro to throw a compilation error if __bpf_unreachable() gets
* built into the resulting code. This works given BPF back end does not
* implement __builtin_trap(). This is useful to assert that certain paths
* of the program code are never used and hence eliminated by the compiler.
*
* For example, consider a switch statement that covers known cases used by
* the program. __bpf_unreachable() can then reside in the default case. If
* the program gets extended such that a case is not covered in the switch
* statement, then it will throw a build error due to the default case not
* being compiled out.
*/
#ifndef __bpf_unreachable
# define __bpf_unreachable() __builtin_trap()
#endif
/*
* Helper function to perform a tail call with a constant/immediate map slot.
*/
#if __clang_major__ >= 8 && defined(__bpf__)
static __always_inline void
bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)
{
if (!__builtin_constant_p(slot))
__bpf_unreachable();
/*
* Provide a hard guarantee that LLVM won't optimize setting r2 (map
* pointer) and r3 (constant map index) from _different paths_ ending
* up at the _same_ call insn as otherwise we won't be able to use the
* jmpq/nopl retpoline-free patching by the x86-64 JIT in the kernel
* given they mismatch. See also d2e4c1e6c294 ("bpf: Constant map key
* tracking for prog array pokes") for details on verifier tracking.
*
* Note on clobber list: we need to stay in-line with BPF calling
* convention, so even if we don't end up using r0, r4, r5, we need
* to mark them as clobber so that LLVM doesn't end up using them
* before / after the call.
*/
asm volatile("r1 = %[ctx]\n\t"
"r2 = %[map]\n\t"
"r3 = %[slot]\n\t"
"call 12"
:: [ctx]"r"(ctx), [map]"r"(map), [slot]"i"(slot)
: "r0", "r1", "r2", "r3", "r4", "r5");
}
#endif
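/* Illustrative usage sketch: a prog-array map plus a constant-index tail call
 * that the x86-64 JIT can patch into a direct jump. BPF_MAP_TYPE_PROG_ARRAY is
 * assumed to come from vmlinux.h/linux/bpf.h; map and slot are placeholders.
 */
struct {
	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
	__uint(max_entries, 4);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

static __always_inline void dispatch(void *ctx)
{
	bpf_tail_call_static(ctx, &jmp_table, 0); /* slot must be constant */
}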
/*
* Helper structure used by eBPF C program
* to describe BPF map attributes to libbpf loader
*/
struct bpf_map_def {
unsigned int type;
unsigned int key_size;
unsigned int value_size;
unsigned int max_entries;
unsigned int map_flags;
};
enum libbpf_pin_type {
LIBBPF_PIN_NONE,
/* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
LIBBPF_PIN_BY_NAME,
};
enum libbpf_tristate {
TRI_NO = 0,
TRI_YES = 1,
TRI_MODULE = 2,
};
#define __kconfig __attribute__((section(".kconfig")))
#define __ksym __attribute__((section(".ksyms")))
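/* Illustrative usage sketch: a BTF-defined map declared with the __uint/__type
 * macros and a minimal program placed via SEC(). BPF_MAP_TYPE_HASH, BPF_ANY
 * and the helper prototypes are assumed to come from vmlinux.h and
 * bpf_helper_defs.h; section and map names are placeholders.
 */
struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 1024);
	__type(key, __u32);
	__type(value, __u64);
} exec_counts SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_execve")
int count_execve(void *ctx)
{
	__u32 key = 0;
	__u64 one = 1, *val;

	val = bpf_map_lookup_elem(&exec_counts, &key);
	if (val)
		__sync_fetch_and_add(val, 1);
	else
		bpf_map_update_elem(&exec_counts, &key, &one, BPF_ANY);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";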
#ifndef ___bpf_concat
#define ___bpf_concat(a, b) a ## b
#endif
#ifndef ___bpf_apply
#define ___bpf_apply(fn, n) ___bpf_concat(fn, n)
#endif
#ifndef ___bpf_nth
#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N
#endif
#ifndef ___bpf_narg
#define ___bpf_narg(...) \
___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#endif
#define ___bpf_fill0(arr, p, x) do {} while (0)
#define ___bpf_fill1(arr, p, x) arr[p] = x
#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, args)
#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, args)
#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, args)
#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, args)
#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, args)
#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, args)
#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, args)
#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, args)
#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, args)
#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 1, args)
#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 1, args)
#define ___bpf_fill(arr, args...) \
___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args)
/*
* BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
* in a structure.
*/
#define BPF_SEQ_PRINTF(seq, fmt, args...) \
({ \
static const char ___fmt[] = fmt; \
unsigned long long ___param[___bpf_narg(args)]; \
\
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
___bpf_fill(___param, args); \
_Pragma("GCC diagnostic pop") \
\
bpf_seq_printf(seq, ___fmt, sizeof(___fmt), \
___param, sizeof(___param)); \
})
/*
* BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead of
* an array of u64.
*/
#define BPF_SNPRINTF(out, out_size, fmt, args...) \
({ \
static const char ___fmt[] = fmt; \
unsigned long long ___param[___bpf_narg(args)]; \
\
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
___bpf_fill(___param, args); \
_Pragma("GCC diagnostic pop") \
\
bpf_snprintf(out, out_size, ___fmt, \
___param, sizeof(___param)); \
})
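/* Illustrative usage sketch: format a string from variadic arguments with
 * BPF_SNPRINTF; buffer and values are placeholders.
 */
static inline long fmt_event(char *buf, unsigned int buf_sz,
			     unsigned long long id, unsigned int cpu)
{
	return BPF_SNPRINTF(buf, buf_sz, "id=%llu cpu=%u", id, cpu);
}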
#ifdef BPF_NO_GLOBAL_DATA
#define BPF_PRINTK_FMT_MOD
#else
#define BPF_PRINTK_FMT_MOD static const
#endif
#define __bpf_printk(fmt, ...) \
({ \
BPF_PRINTK_FMT_MOD char ____fmt[] = fmt; \
bpf_trace_printk(____fmt, sizeof(____fmt), \
##__VA_ARGS__); \
})
/*
* __bpf_vprintk wraps the bpf_trace_vprintk helper with variadic arguments
* instead of an array of u64.
*/
#define __bpf_vprintk(fmt, args...) \
({ \
static const char ___fmt[] = fmt; \
unsigned long long ___param[___bpf_narg(args)]; \
\
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
___bpf_fill(___param, args); \
_Pragma("GCC diagnostic pop") \
\
bpf_trace_vprintk(___fmt, sizeof(___fmt), \
___param, sizeof(___param)); \
})
/* Use __bpf_printk when bpf_printk call has 3 or fewer fmt args
* Otherwise use __bpf_vprintk
*/
#define ___bpf_pick_printk(...) \
___bpf_nth(_, ##__VA_ARGS__, __bpf_vprintk, __bpf_vprintk, __bpf_vprintk, \
__bpf_vprintk, __bpf_vprintk, __bpf_vprintk, __bpf_vprintk, \
__bpf_vprintk, __bpf_vprintk, __bpf_printk /*3*/, __bpf_printk /*2*/,\
__bpf_printk /*1*/, __bpf_printk /*0*/)
/* Helper macro to print out debug messages */
#define bpf_printk(fmt, args...) ___bpf_pick_printk(args)(fmt, ##args)
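/* Illustrative usage sketch: with one format argument bpf_printk() resolves to
 * __bpf_printk(); output lands in /sys/kernel/debug/tracing/trace_pipe.
 */
SEC("tracepoint/syscalls/sys_enter_openat")
int log_openat(void *ctx)
{
	bpf_printk("openat on cpu %u", bpf_get_smp_processor_id());
	return 0;
}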
#endif


@@ -101,11 +101,12 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)
{
struct bpf_prog_linfo *prog_linfo;
__u32 nr_linfo, nr_jited_func;
__u64 data_sz;
nr_linfo = info->nr_line_info;
if (!nr_linfo)
return errno = EINVAL, NULL;
/*
* The min size that bpf_prog_linfo has to access for
@@ -113,20 +114,20 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)
*/
if (info->line_info_rec_size <
offsetof(struct bpf_line_info, file_name_off))
return errno = EINVAL, NULL;
prog_linfo = calloc(1, sizeof(*prog_linfo));
if (!prog_linfo)
return errno = ENOMEM, NULL;
/* Copy xlated line_info */
prog_linfo->nr_linfo = nr_linfo;
prog_linfo->rec_size = info->line_info_rec_size;
data_sz = (__u64)nr_linfo * prog_linfo->rec_size;
prog_linfo->raw_linfo = malloc(data_sz);
if (!prog_linfo->raw_linfo)
goto err_free;
memcpy(prog_linfo->raw_linfo, (void *)(long)info->line_info, data_sz);
nr_jited_func = info->nr_jited_ksyms;
if (!nr_jited_func ||
@@ -142,13 +143,12 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)
/* Copy jited_line_info */
prog_linfo->nr_jited_func = nr_jited_func;
prog_linfo->jited_rec_size = info->jited_line_info_rec_size;
data_sz = (__u64)nr_linfo * prog_linfo->jited_rec_size;
prog_linfo->raw_jited_linfo = malloc(data_sz);
if (!prog_linfo->raw_jited_linfo)
goto err_free;
memcpy(prog_linfo->raw_jited_linfo,
(void *)(long)info->jited_line_info, data_sz);
/* Number of jited_line_info per jited func */
prog_linfo->nr_jited_linfo_per_func = malloc(nr_jited_func *
@@ -174,7 +174,7 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)
err_free:
bpf_prog_linfo__free(prog_linfo);
return errno = EINVAL, NULL;
}
const struct bpf_line_info *
@@ -186,11 +186,11 @@ bpf_prog_linfo__lfind_addr_func(const struct bpf_prog_linfo *prog_linfo,
const __u64 *jited_linfo;
if (func_idx >= prog_linfo->nr_jited_func)
return errno = ENOENT, NULL;
nr_linfo = prog_linfo->nr_jited_linfo_per_func[func_idx];
if (nr_skip >= nr_linfo)
return errno = ENOENT, NULL;
start = prog_linfo->jited_linfo_func_idx[func_idx] + nr_skip;
jited_rec_size = prog_linfo->jited_rec_size;
@@ -198,7 +198,7 @@ bpf_prog_linfo__lfind_addr_func(const struct bpf_prog_linfo *prog_linfo,
(start * jited_rec_size);
jited_linfo = raw_jited_linfo;
if (addr < *jited_linfo)
return errno = ENOENT, NULL;
nr_linfo -= nr_skip;
rec_size = prog_linfo->rec_size;
@@ -225,13 +225,13 @@ bpf_prog_linfo__lfind(const struct bpf_prog_linfo *prog_linfo,
nr_linfo = prog_linfo->nr_linfo;
if (nr_skip >= nr_linfo)
return errno = ENOENT, NULL;
rec_size = prog_linfo->rec_size;
raw_linfo = prog_linfo->raw_linfo + (nr_skip * rec_size);
linfo = raw_linfo;
if (insn_off < linfo->insn_off)
return errno = ENOENT, NULL;
nr_linfo -= nr_skip;
for (i = 0; i < nr_linfo; i++) {

460
src/bpf_tracing.h Normal file

@@ -0,0 +1,460 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
#ifndef __BPF_TRACING_H__
#define __BPF_TRACING_H__
/* Scan the ARCH passed in from ARCH env variable (see Makefile) */
#if defined(__TARGET_ARCH_x86)
#define bpf_target_x86
#define bpf_target_defined
#elif defined(__TARGET_ARCH_s390)
#define bpf_target_s390
#define bpf_target_defined
#elif defined(__TARGET_ARCH_arm)
#define bpf_target_arm
#define bpf_target_defined
#elif defined(__TARGET_ARCH_arm64)
#define bpf_target_arm64
#define bpf_target_defined
#elif defined(__TARGET_ARCH_mips)
#define bpf_target_mips
#define bpf_target_defined
#elif defined(__TARGET_ARCH_powerpc)
#define bpf_target_powerpc
#define bpf_target_defined
#elif defined(__TARGET_ARCH_sparc)
#define bpf_target_sparc
#define bpf_target_defined
#else
/* Fall back to what the compiler says */
#if defined(__x86_64__)
#define bpf_target_x86
#define bpf_target_defined
#elif defined(__s390__)
#define bpf_target_s390
#define bpf_target_defined
#elif defined(__arm__)
#define bpf_target_arm
#define bpf_target_defined
#elif defined(__aarch64__)
#define bpf_target_arm64
#define bpf_target_defined
#elif defined(__mips__)
#define bpf_target_mips
#define bpf_target_defined
#elif defined(__powerpc__)
#define bpf_target_powerpc
#define bpf_target_defined
#elif defined(__sparc__)
#define bpf_target_sparc
#define bpf_target_defined
#endif /* no compiler target */
#endif
#ifndef __BPF_TARGET_MISSING
#define __BPF_TARGET_MISSING "GCC error \"Must specify a BPF target arch via __TARGET_ARCH_xxx\""
#endif
#if defined(bpf_target_x86)
#if defined(__KERNEL__) || defined(__VMLINUX_H__)
#define PT_REGS_PARM1(x) ((x)->di)
#define PT_REGS_PARM2(x) ((x)->si)
#define PT_REGS_PARM3(x) ((x)->dx)
#define PT_REGS_PARM4(x) ((x)->cx)
#define PT_REGS_PARM5(x) ((x)->r8)
#define PT_REGS_RET(x) ((x)->sp)
#define PT_REGS_FP(x) ((x)->bp)
#define PT_REGS_RC(x) ((x)->ax)
#define PT_REGS_SP(x) ((x)->sp)
#define PT_REGS_IP(x) ((x)->ip)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), di)
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), si)
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), dx)
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), cx)
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), r8)
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), sp)
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), bp)
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), ax)
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), sp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), ip)
#else
#ifdef __i386__
/* i386 kernel is built with -mregparm=3 */
#define PT_REGS_PARM1(x) ((x)->eax)
#define PT_REGS_PARM2(x) ((x)->edx)
#define PT_REGS_PARM3(x) ((x)->ecx)
#define PT_REGS_PARM4(x) 0
#define PT_REGS_PARM5(x) 0
#define PT_REGS_RET(x) ((x)->esp)
#define PT_REGS_FP(x) ((x)->ebp)
#define PT_REGS_RC(x) ((x)->eax)
#define PT_REGS_SP(x) ((x)->esp)
#define PT_REGS_IP(x) ((x)->eip)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), eax)
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), edx)
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), ecx)
#define PT_REGS_PARM4_CORE(x) 0
#define PT_REGS_PARM5_CORE(x) 0
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), esp)
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), ebp)
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), eax)
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), esp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), eip)
#else
#define PT_REGS_PARM1(x) ((x)->rdi)
#define PT_REGS_PARM2(x) ((x)->rsi)
#define PT_REGS_PARM3(x) ((x)->rdx)
#define PT_REGS_PARM4(x) ((x)->rcx)
#define PT_REGS_PARM5(x) ((x)->r8)
#define PT_REGS_RET(x) ((x)->rsp)
#define PT_REGS_FP(x) ((x)->rbp)
#define PT_REGS_RC(x) ((x)->rax)
#define PT_REGS_SP(x) ((x)->rsp)
#define PT_REGS_IP(x) ((x)->rip)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), rdi)
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), rsi)
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), rdx)
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), rcx)
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), r8)
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), rsp)
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), rbp)
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), rax)
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), rsp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), rip)
#endif
#endif
#elif defined(bpf_target_s390)
/* s390 provides user_pt_regs instead of struct pt_regs to userspace */
struct pt_regs;
#define PT_REGS_S390 const volatile user_pt_regs
#define PT_REGS_PARM1(x) (((PT_REGS_S390 *)(x))->gprs[2])
#define PT_REGS_PARM2(x) (((PT_REGS_S390 *)(x))->gprs[3])
#define PT_REGS_PARM3(x) (((PT_REGS_S390 *)(x))->gprs[4])
#define PT_REGS_PARM4(x) (((PT_REGS_S390 *)(x))->gprs[5])
#define PT_REGS_PARM5(x) (((PT_REGS_S390 *)(x))->gprs[6])
#define PT_REGS_RET(x) (((PT_REGS_S390 *)(x))->gprs[14])
/* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_FP(x) (((PT_REGS_S390 *)(x))->gprs[11])
#define PT_REGS_RC(x) (((PT_REGS_S390 *)(x))->gprs[2])
#define PT_REGS_SP(x) (((PT_REGS_S390 *)(x))->gprs[15])
#define PT_REGS_IP(x) (((PT_REGS_S390 *)(x))->psw.addr)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[2])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[3])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[4])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[5])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[6])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[14])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[11])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[2])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[15])
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), psw.addr)
#elif defined(bpf_target_arm)
#define PT_REGS_PARM1(x) ((x)->uregs[0])
#define PT_REGS_PARM2(x) ((x)->uregs[1])
#define PT_REGS_PARM3(x) ((x)->uregs[2])
#define PT_REGS_PARM4(x) ((x)->uregs[3])
#define PT_REGS_PARM5(x) ((x)->uregs[4])
#define PT_REGS_RET(x) ((x)->uregs[14])
#define PT_REGS_FP(x) ((x)->uregs[11]) /* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_RC(x) ((x)->uregs[0])
#define PT_REGS_SP(x) ((x)->uregs[13])
#define PT_REGS_IP(x) ((x)->uregs[12])
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), uregs[0])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), uregs[1])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), uregs[2])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), uregs[3])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), uregs[4])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), uregs[14])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), uregs[11])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), uregs[0])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), uregs[13])
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), uregs[12])
#elif defined(bpf_target_arm64)
/* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */
struct pt_regs;
#define PT_REGS_ARM64 const volatile struct user_pt_regs
#define PT_REGS_PARM1(x) (((PT_REGS_ARM64 *)(x))->regs[0])
#define PT_REGS_PARM2(x) (((PT_REGS_ARM64 *)(x))->regs[1])
#define PT_REGS_PARM3(x) (((PT_REGS_ARM64 *)(x))->regs[2])
#define PT_REGS_PARM4(x) (((PT_REGS_ARM64 *)(x))->regs[3])
#define PT_REGS_PARM5(x) (((PT_REGS_ARM64 *)(x))->regs[4])
#define PT_REGS_RET(x) (((PT_REGS_ARM64 *)(x))->regs[30])
/* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_FP(x) (((PT_REGS_ARM64 *)(x))->regs[29])
#define PT_REGS_RC(x) (((PT_REGS_ARM64 *)(x))->regs[0])
#define PT_REGS_SP(x) (((PT_REGS_ARM64 *)(x))->sp)
#define PT_REGS_IP(x) (((PT_REGS_ARM64 *)(x))->pc)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[0])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[1])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[2])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[3])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[4])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[30])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[29])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[0])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), sp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), pc)
#elif defined(bpf_target_mips)
#define PT_REGS_PARM1(x) ((x)->regs[4])
#define PT_REGS_PARM2(x) ((x)->regs[5])
#define PT_REGS_PARM3(x) ((x)->regs[6])
#define PT_REGS_PARM4(x) ((x)->regs[7])
#define PT_REGS_PARM5(x) ((x)->regs[8])
#define PT_REGS_RET(x) ((x)->regs[31])
#define PT_REGS_FP(x) ((x)->regs[30]) /* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_RC(x) ((x)->regs[2])
#define PT_REGS_SP(x) ((x)->regs[29])
#define PT_REGS_IP(x) ((x)->cp0_epc)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), regs[4])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), regs[5])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), regs[6])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), regs[7])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), regs[8])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), regs[31])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), regs[30])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), regs[2])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), regs[29])
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), cp0_epc)
#elif defined(bpf_target_powerpc)
#define PT_REGS_PARM1(x) ((x)->gpr[3])
#define PT_REGS_PARM2(x) ((x)->gpr[4])
#define PT_REGS_PARM3(x) ((x)->gpr[5])
#define PT_REGS_PARM4(x) ((x)->gpr[6])
#define PT_REGS_PARM5(x) ((x)->gpr[7])
#define PT_REGS_RC(x) ((x)->gpr[3])
#define PT_REGS_SP(x) ((x)->sp)
#define PT_REGS_IP(x) ((x)->nip)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), gpr[3])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), gpr[4])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), gpr[5])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), gpr[6])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), gpr[7])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), gpr[3])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), sp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), nip)
#elif defined(bpf_target_sparc)
#define PT_REGS_PARM1(x) ((x)->u_regs[UREG_I0])
#define PT_REGS_PARM2(x) ((x)->u_regs[UREG_I1])
#define PT_REGS_PARM3(x) ((x)->u_regs[UREG_I2])
#define PT_REGS_PARM4(x) ((x)->u_regs[UREG_I3])
#define PT_REGS_PARM5(x) ((x)->u_regs[UREG_I4])
#define PT_REGS_RET(x) ((x)->u_regs[UREG_I7])
#define PT_REGS_RC(x) ((x)->u_regs[UREG_I0])
#define PT_REGS_SP(x) ((x)->u_regs[UREG_FP])
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I0])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I1])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I2])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I3])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I4])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I7])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I0])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), u_regs[UREG_FP])
/* Should this also be a bpf_target check for the sparc case? */
#if defined(__arch64__)
#define PT_REGS_IP(x) ((x)->tpc)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), tpc)
#else
#define PT_REGS_IP(x) ((x)->pc)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), pc)
#endif
#endif
#if defined(bpf_target_powerpc)
#define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ (ip) = (ctx)->link; })
#define BPF_KRETPROBE_READ_RET_IP BPF_KPROBE_READ_RET_IP
#elif defined(bpf_target_sparc)
#define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ (ip) = PT_REGS_RET(ctx); })
#define BPF_KRETPROBE_READ_RET_IP BPF_KPROBE_READ_RET_IP
#elif defined(bpf_target_defined)
#define BPF_KPROBE_READ_RET_IP(ip, ctx) \
({ bpf_probe_read_kernel(&(ip), sizeof(ip), (void *)PT_REGS_RET(ctx)); })
#define BPF_KRETPROBE_READ_RET_IP(ip, ctx) \
({ bpf_probe_read_kernel(&(ip), sizeof(ip), \
(void *)(PT_REGS_FP(ctx) + sizeof(ip))); })
#endif
#if !defined(bpf_target_defined)
#define PT_REGS_PARM1(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM2(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM3(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM4(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM5(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_RET(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_FP(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_RC(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_SP(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_IP(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM1_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM2_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM3_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM4_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM5_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_RET_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_FP_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_RC_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_SP_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_IP_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define BPF_KRETPROBE_READ_RET_IP(ip, ctx) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#endif /* !defined(bpf_target_defined) */
#ifndef ___bpf_concat
#define ___bpf_concat(a, b) a ## b
#endif
#ifndef ___bpf_apply
#define ___bpf_apply(fn, n) ___bpf_concat(fn, n)
#endif
#ifndef ___bpf_nth
#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N
#endif
#ifndef ___bpf_narg
#define ___bpf_narg(...) \
___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#endif
#define ___bpf_ctx_cast0() ctx
#define ___bpf_ctx_cast1(x) ___bpf_ctx_cast0(), (void *)ctx[0]
#define ___bpf_ctx_cast2(x, args...) ___bpf_ctx_cast1(args), (void *)ctx[1]
#define ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), (void *)ctx[2]
#define ___bpf_ctx_cast4(x, args...) ___bpf_ctx_cast3(args), (void *)ctx[3]
#define ___bpf_ctx_cast5(x, args...) ___bpf_ctx_cast4(args), (void *)ctx[4]
#define ___bpf_ctx_cast6(x, args...) ___bpf_ctx_cast5(args), (void *)ctx[5]
#define ___bpf_ctx_cast7(x, args...) ___bpf_ctx_cast6(args), (void *)ctx[6]
#define ___bpf_ctx_cast8(x, args...) ___bpf_ctx_cast7(args), (void *)ctx[7]
#define ___bpf_ctx_cast9(x, args...) ___bpf_ctx_cast8(args), (void *)ctx[8]
#define ___bpf_ctx_cast10(x, args...) ___bpf_ctx_cast9(args), (void *)ctx[9]
#define ___bpf_ctx_cast11(x, args...) ___bpf_ctx_cast10(args), (void *)ctx[10]
#define ___bpf_ctx_cast12(x, args...) ___bpf_ctx_cast11(args), (void *)ctx[11]
#define ___bpf_ctx_cast(args...) \
___bpf_apply(___bpf_ctx_cast, ___bpf_narg(args))(args)
/*
* BPF_PROG is a convenience wrapper for generic tp_btf/fentry/fexit and
* similar kinds of BPF programs, that accept input arguments as a single
* pointer to untyped u64 array, where each u64 can actually be a typed
* pointer or integer of a different size. Instead of requiring the user to
* write manual casts and work with array elements by index, the BPF_PROG
* macro allows the user to declare a list of named and typed input arguments
* in the same syntax as for a normal C function. All the casting is hidden
* and performed transparently, so user code can simply work with function
* arguments of the specified type and name.
*
* The original raw context argument is also preserved, available as 'ctx'.
* This is useful when using BPF helpers that expect original context
* as one of the parameters (e.g., for bpf_perf_event_output()).
*/
#define BPF_PROG(name, args...) \
name(unsigned long long *ctx); \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(unsigned long long *ctx, ##args); \
typeof(name(0)) name(unsigned long long *ctx) \
{ \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
return ____##name(___bpf_ctx_cast(args)); \
_Pragma("GCC diagnostic pop") \
} \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(unsigned long long *ctx, ##args)
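/*
 * Illustrative sketch (not from the original header), assuming vmlinux.h and
 * bpf_helpers.h (for SEC() and bpf_printk()) are included; the traced
 * function and its argument list are only an example:
 *
 *   SEC("fentry/do_unlinkat")
 *   int BPF_PROG(trace_unlinkat, int dfd, struct filename *name)
 *   {
 *           // 'dfd' and 'name' are typed views of ctx[0] and ctx[1];
 *           // the raw u64 array remains available as 'ctx'
 *           bpf_printk("unlinkat called, dfd=%d", dfd);
 *           return 0;
 *   }
 */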
struct pt_regs;
#define ___bpf_kprobe_args0() ctx
#define ___bpf_kprobe_args1(x) \
___bpf_kprobe_args0(), (void *)PT_REGS_PARM1(ctx)
#define ___bpf_kprobe_args2(x, args...) \
___bpf_kprobe_args1(args), (void *)PT_REGS_PARM2(ctx)
#define ___bpf_kprobe_args3(x, args...) \
___bpf_kprobe_args2(args), (void *)PT_REGS_PARM3(ctx)
#define ___bpf_kprobe_args4(x, args...) \
___bpf_kprobe_args3(args), (void *)PT_REGS_PARM4(ctx)
#define ___bpf_kprobe_args5(x, args...) \
___bpf_kprobe_args4(args), (void *)PT_REGS_PARM5(ctx)
#define ___bpf_kprobe_args(args...) \
___bpf_apply(___bpf_kprobe_args, ___bpf_narg(args))(args)
/*
* BPF_KPROBE serves the same purpose for kprobes as BPF_PROG for
* tp_btf/fentry/fexit BPF programs. It hides the underlying platform-specific
* low-level way of getting kprobe input arguments from struct pt_regs, and
* provides familiar typed and named function argument syntax and semantics
* for accessing kprobe input parameters.
*
* Original struct pt_regs* context is preserved as 'ctx' argument. This might
* be necessary when using BPF helpers like bpf_perf_event_output().
*/
#define BPF_KPROBE(name, args...) \
name(struct pt_regs *ctx); \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(struct pt_regs *ctx, ##args); \
typeof(name(0)) name(struct pt_regs *ctx) \
{ \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
return ____##name(___bpf_kprobe_args(args)); \
_Pragma("GCC diagnostic pop") \
} \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(struct pt_regs *ctx, ##args)
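/*
 * Illustrative sketch (not from the original header): a kprobe program
 * declared with BPF_KPROBE, assuming bpf_helpers.h for SEC(); the probed
 * symbol and its argument types are examples only:
 *
 *   SEC("kprobe/do_unlinkat")
 *   int BPF_KPROBE(trace_unlinkat_entry, int dfd, struct filename *name)
 *   {
 *           // 'dfd' comes from PT_REGS_PARM1(ctx), 'name' from PT_REGS_PARM2(ctx)
 *           return 0;
 *   }
 */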
#define ___bpf_kretprobe_args0() ctx
#define ___bpf_kretprobe_args1(x) \
___bpf_kretprobe_args0(), (void *)PT_REGS_RC(ctx)
#define ___bpf_kretprobe_args(args...) \
___bpf_apply(___bpf_kretprobe_args, ___bpf_narg(args))(args)
/*
* BPF_KRETPROBE is similar to BPF_KPROBE, except it provides only an
* optional return value (in addition to `struct pt_regs *ctx`) and no input
* arguments, because they will have been clobbered by the time the probed
* function returns.
*/
#define BPF_KRETPROBE(name, args...) \
name(struct pt_regs *ctx); \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(struct pt_regs *ctx, ##args); \
typeof(name(0)) name(struct pt_regs *ctx) \
{ \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
return ____##name(___bpf_kretprobe_args(args)); \
_Pragma("GCC diagnostic pop") \
} \
static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
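/*
 * Illustrative sketch (not from the original header): a kretprobe program
 * declared with BPF_KRETPROBE; only the return value is usable, since the
 * input arguments are gone by the time the probed function returns:
 *
 *   SEC("kretprobe/do_unlinkat")
 *   int BPF_KRETPROBE(trace_unlinkat_exit, long ret)
 *   {
 *           // 'ret' is read from PT_REGS_RC(ctx)
 *           bpf_printk("unlinkat returned %ld", ret);
 *           return 0;
 *   }
 */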
#endif

src/btf.c (3233 lines changed): file diff suppressed because it is too large

src/btf.h (307 lines changed)

@@ -1,21 +1,21 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2018 Facebook */
/*! \file */
#ifndef __LIBBPF_BTF_H
#define __LIBBPF_BTF_H
#include <stdarg.h>
#include <stdbool.h>
#include <linux/btf.h>
#include <linux/types.h>
#include "libbpf_common.h"
#ifdef __cplusplus
extern "C" {
#endif
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
#endif
#define BTF_ELF_SEC ".BTF"
#define BTF_EXT_ELF_SEC ".BTF.ext"
#define MAPS_ELF_SEC ".maps"
@@ -26,61 +26,127 @@ struct btf_type;
struct bpf_object;
/*
* The .BTF.ext ELF section layout defined as
* struct btf_ext_header
* func_info subsection
*
* The func_info subsection layout:
* record size for struct bpf_func_info in the func_info subsection
* struct btf_sec_func_info for section #1
* a list of bpf_func_info records for section #1
* where struct bpf_func_info mimics one in include/uapi/linux/bpf.h
* but may not be identical
* struct btf_sec_func_info for section #2
* a list of bpf_func_info records for section #2
* ......
*
* Note that the bpf_func_info record size in .BTF.ext may not
* be the same as the one defined in include/uapi/linux/bpf.h.
* The loader should ensure that record_size meets minimum
* requirement and pass the record as is to the kernel. The
* kernel will handle the func_info properly based on its contents.
*/
struct btf_ext_header {
__u16 magic;
__u8 version;
__u8 flags;
__u32 hdr_len;
/* All offsets are in bytes relative to the end of this header */
__u32 func_info_off;
__u32 func_info_len;
__u32 line_info_off;
__u32 line_info_len;
/* optional part of .BTF.ext header */
__u32 offset_reloc_off;
__u32 offset_reloc_len;
};
enum btf_endianness {
BTF_LITTLE_ENDIAN = 0,
BTF_BIG_ENDIAN = 1,
};
/**
* @brief **btf__free()** frees all data of a BTF object
* @param btf BTF object to free
*/
LIBBPF_API void btf__free(struct btf *btf);
LIBBPF_API struct btf *btf__new(__u8 *data, __u32 size);
LIBBPF_API struct btf *btf__parse_elf(const char *path,
struct btf_ext **btf_ext);
/**
* @brief **btf__new()** creates a new instance of a BTF object from the raw
* bytes of an ELF's BTF section
* @param data raw bytes
* @param size number of bytes passed in `data`
* @return new BTF object instance which has to be eventually freed with
* **btf__free()**
*
* On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
* error code from such a pointer `libbpf_get_error()` should be used. If
* `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
* returned on error instead. In both cases thread-local `errno` variable is
* always set to error code as well.
*/
LIBBPF_API struct btf *btf__new(const void *data, __u32 size);
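/*
 * Illustrative sketch (not from the original header) of the error convention
 * described above, assuming the caller already has raw BTF bytes in
 * 'data'/'size':
 *
 *   struct btf *btf = btf__new(data, size);
 *   long err = libbpf_get_error(btf);
 *
 *   if (err) {
 *           // without LIBBPF_STRICT_CLEAN_PTRS, 'btf' itself encodes the error
 *           fprintf(stderr, "btf__new failed: %ld\n", err);
 *           return err;
 *   }
 *   // ... use btf ...
 *   btf__free(btf);
 */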
/**
* @brief **btf__new_split()** creates a new instance of a BTF object from
* the provided raw data bytes. It takes another BTF instance, **base_btf**,
* which serves as the base BTF and is extended by the types in the newly
* created BTF instance
* @param data raw bytes
* @param size length of raw bytes
* @param base_btf the base BTF object
* @return new BTF object instance which has to be eventually freed with
* **btf__free()**
*
* If *base_btf* is NULL, `btf__new_split()` is equivalent to `btf__new()` and
* creates non-split BTF.
*
* On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
* error code from such a pointer `libbpf_get_error()` should be used. If
* `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
* returned on error instead. In both cases thread-local `errno` variable is
* always set to error code as well.
*/
LIBBPF_API struct btf *btf__new_split(const void *data, __u32 size, struct btf *base_btf);
/**
* @brief **btf__new_empty()** creates an empty BTF object. Use
* `btf__add_*()` to populate such BTF object.
* @return new BTF object instance which has to be eventually freed with
* **btf__free()**
*
* On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
* error code from such a pointer `libbpf_get_error()` should be used. If
* `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
* returned on error instead. In both cases thread-local `errno` variable is
* always set to error code as well.
*/
LIBBPF_API struct btf *btf__new_empty(void);
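/*
 * Illustrative sketch (not from the original header): building a tiny BTF
 * object from scratch with btf__new_empty() and btf__add_*(); negative
 * return values from the add APIs are error codes:
 *
 *   struct btf *btf = btf__new_empty();
 *   int int_id, ptr_id;
 *
 *   if (libbpf_get_error(btf))
 *           return -1;
 *   int_id = btf__add_int(btf, "int", 4, BTF_INT_SIGNED);  // becomes type ID 1
 *   ptr_id = btf__add_ptr(btf, int_id);                     // becomes type ID 2
 *   if (int_id < 0 || ptr_id < 0)
 *           return -1;  // IDs are negative error codes on failure
 *   btf__free(btf);
 */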
/**
* @brief **btf__new_empty_split()** creates an unpopulated BTF object on top
* of the provided base BTF, so that types later added to it form split BTF
* extending *base_btf*
* @return new BTF object instance which has to be eventually freed with
* **btf__free()**
*
* If *base_btf* is NULL, `btf__new_empty_split()` is equivalent to
* `btf__new_empty()` and creates non-split BTF.
*
* On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
* error code from such a pointer `libbpf_get_error()` should be used. If
* `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
* returned on error instead. In both cases thread-local `errno` variable is
* always set to error code as well.
*/
LIBBPF_API struct btf *btf__new_empty_split(struct btf *base_btf);
LIBBPF_API struct btf *btf__parse(const char *path, struct btf_ext **btf_ext);
LIBBPF_API struct btf *btf__parse_split(const char *path, struct btf *base_btf);
LIBBPF_API struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext);
LIBBPF_API struct btf *btf__parse_elf_split(const char *path, struct btf *base_btf);
LIBBPF_API struct btf *btf__parse_raw(const char *path);
LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf);
LIBBPF_API struct btf *btf__load_vmlinux_btf(void);
LIBBPF_API struct btf *btf__load_module_btf(const char *module_name, struct btf *vmlinux_btf);
LIBBPF_API struct btf *libbpf_find_kernel_btf(void);
LIBBPF_API struct btf *btf__load_from_kernel_by_id(__u32 id);
LIBBPF_API struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *base_btf);
LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_from_kernel_by_id instead")
LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);
LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf);
LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_into_kernel instead")
LIBBPF_API int btf__load(struct btf *btf);
LIBBPF_API int btf__load_into_kernel(struct btf *btf);
LIBBPF_API __s32 btf__find_by_name(const struct btf *btf,
const char *type_name);
LIBBPF_API __s32 btf__find_by_name_kind(const struct btf *btf,
const char *type_name, __u32 kind);
LIBBPF_API __u32 btf__get_nr_types(const struct btf *btf);
LIBBPF_API const struct btf *btf__base_btf(const struct btf *btf);
LIBBPF_API const struct btf_type *btf__type_by_id(const struct btf *btf,
__u32 id);
LIBBPF_API size_t btf__pointer_size(const struct btf *btf);
LIBBPF_API int btf__set_pointer_size(struct btf *btf, size_t ptr_sz);
LIBBPF_API enum btf_endianness btf__endianness(const struct btf *btf);
LIBBPF_API int btf__set_endianness(struct btf *btf, enum btf_endianness endian);
LIBBPF_API __s64 btf__resolve_size(const struct btf *btf, __u32 type_id);
LIBBPF_API int btf__resolve_type(const struct btf *btf, __u32 type_id);
LIBBPF_API int btf__align_of(const struct btf *btf, __u32 id);
LIBBPF_API int btf__fd(const struct btf *btf);
LIBBPF_API void btf__set_fd(struct btf *btf, int fd);
LIBBPF_API const void *btf__get_raw_data(const struct btf *btf, __u32 *size);
LIBBPF_API const char *btf__name_by_offset(const struct btf *btf, __u32 offset);
LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);
LIBBPF_API const char *btf__str_by_offset(const struct btf *btf, __u32 offset);
LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
__u32 expected_key_size,
__u32 expected_value_size,
@@ -90,17 +156,89 @@ LIBBPF_API struct btf_ext *btf_ext__new(__u8 *data, __u32 size);
LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext);
LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext,
__u32 *size);
LIBBPF_API int btf_ext__reloc_func_info(const struct btf *btf,
const struct btf_ext *btf_ext,
const char *sec_name, __u32 insns_cnt,
void **func_info, __u32 *cnt);
LIBBPF_API int btf_ext__reloc_line_info(const struct btf *btf,
const struct btf_ext *btf_ext,
const char *sec_name, __u32 insns_cnt,
void **line_info, __u32 *cnt);
LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_func_info was never meant as a public API and has wrong assumptions embedded in it; it will be removed in the future libbpf versions")
int btf_ext__reloc_func_info(const struct btf *btf,
const struct btf_ext *btf_ext,
const char *sec_name, __u32 insns_cnt,
void **func_info, __u32 *cnt);
LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_line_info was never meant as a public API and has wrong assumptions embedded in it; it will be removed in the future libbpf versions")
int btf_ext__reloc_line_info(const struct btf *btf,
const struct btf_ext *btf_ext,
const char *sec_name, __u32 insns_cnt,
void **line_info, __u32 *cnt);
LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
LIBBPF_API int btf__find_str(struct btf *btf, const char *s);
LIBBPF_API int btf__add_str(struct btf *btf, const char *s);
LIBBPF_API int btf__add_type(struct btf *btf, const struct btf *src_btf,
const struct btf_type *src_type);
/**
* @brief **btf__add_btf()** appends all the BTF types from *src_btf* into *btf*
* @param btf BTF object which all the BTF types and strings are added to
* @param src_btf BTF object which all BTF types and referenced strings are copied from
* @return BTF type ID of the first appended BTF type, or negative error code
*
* **btf__add_btf()** can be used to simply and efficiently append the entire
* contents of one BTF object to another one. All the BTF type data is copied
* over, all referenced type IDs are adjusted by adding a necessary ID offset.
* Only strings referenced from BTF types are copied over and deduplicated, so
* if there were some unused strings in *src_btf*, those won't be copied over,
* which is consistent with the general string deduplication semantics of BTF
* writing APIs.
*
* If any error is encountered during this process, the contents of *btf* is
* left intact, which means that **btf__add_btf()** follows the transactional
* semantics and the operation as a whole is all-or-nothing.
*
* *src_btf* has to be non-split BTF, as of now copying types from split BTF
* is not supported and will result in -ENOTSUP error code returned.
*/
LIBBPF_API int btf__add_btf(struct btf *btf, const struct btf *src_btf);
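/*
 * Illustrative sketch (not from the original header) of the ID-offset
 * adjustment described above; 'dst_btf' and 'src_btf' are caller-provided:
 *
 *   int base_id = btf__add_btf(dst_btf, src_btf);
 *
 *   if (base_id < 0)
 *           return base_id;  // e.g. -ENOTSUP if src_btf is split BTF
 *   // a type with ID i in src_btf is now type (base_id + i - 1) in dst_btf
 */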
LIBBPF_API int btf__add_int(struct btf *btf, const char *name, size_t byte_sz, int encoding);
LIBBPF_API int btf__add_float(struct btf *btf, const char *name, size_t byte_sz);
LIBBPF_API int btf__add_ptr(struct btf *btf, int ref_type_id);
LIBBPF_API int btf__add_array(struct btf *btf,
int index_type_id, int elem_type_id, __u32 nr_elems);
/* struct/union construction APIs */
LIBBPF_API int btf__add_struct(struct btf *btf, const char *name, __u32 sz);
LIBBPF_API int btf__add_union(struct btf *btf, const char *name, __u32 sz);
LIBBPF_API int btf__add_field(struct btf *btf, const char *name, int field_type_id,
__u32 bit_offset, __u32 bit_size);
/* enum construction APIs */
LIBBPF_API int btf__add_enum(struct btf *btf, const char *name, __u32 bytes_sz);
LIBBPF_API int btf__add_enum_value(struct btf *btf, const char *name, __s64 value);
enum btf_fwd_kind {
BTF_FWD_STRUCT = 0,
BTF_FWD_UNION = 1,
BTF_FWD_ENUM = 2,
};
LIBBPF_API int btf__add_fwd(struct btf *btf, const char *name, enum btf_fwd_kind fwd_kind);
LIBBPF_API int btf__add_typedef(struct btf *btf, const char *name, int ref_type_id);
LIBBPF_API int btf__add_volatile(struct btf *btf, int ref_type_id);
LIBBPF_API int btf__add_const(struct btf *btf, int ref_type_id);
LIBBPF_API int btf__add_restrict(struct btf *btf, int ref_type_id);
/* func and func_proto construction APIs */
LIBBPF_API int btf__add_func(struct btf *btf, const char *name,
enum btf_func_linkage linkage, int proto_type_id);
LIBBPF_API int btf__add_func_proto(struct btf *btf, int ret_type_id);
LIBBPF_API int btf__add_func_param(struct btf *btf, const char *name, int type_id);
/* var & datasec construction APIs */
LIBBPF_API int btf__add_var(struct btf *btf, const char *name, int linkage, int type_id);
LIBBPF_API int btf__add_datasec(struct btf *btf, const char *name, __u32 byte_sz);
LIBBPF_API int btf__add_datasec_var_info(struct btf *btf, int var_type_id,
__u32 offset, __u32 byte_sz);
/* tag construction API */
LIBBPF_API int btf__add_tag(struct btf *btf, const char *value, int ref_type_id,
int component_idx);
struct btf_dedup_opts {
unsigned int dedup_table_size;
bool dont_resolve_fwds;
@@ -125,6 +263,50 @@ LIBBPF_API void btf_dump__free(struct btf_dump *d);
LIBBPF_API int btf_dump__dump_type(struct btf_dump *d, __u32 id);
struct btf_dump_emit_type_decl_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
/* optional field name for type declaration, e.g.:
* - struct my_struct <FNAME>
* - void (*<FNAME>)(int)
* - char (*<FNAME>)[123]
*/
const char *field_name;
/* extra indentation level (in number of tabs) to emit for multi-line
* type declarations (e.g., anonymous struct); applies for lines
* starting from the second one (first line is assumed to have
* necessary indentation already
*/
int indent_level;
/* strip all the const/volatile/restrict mods */
bool strip_mods;
size_t :0;
};
#define btf_dump_emit_type_decl_opts__last_field strip_mods
LIBBPF_API int
btf_dump__emit_type_decl(struct btf_dump *d, __u32 id,
const struct btf_dump_emit_type_decl_opts *opts);
struct btf_dump_type_data_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
const char *indent_str;
int indent_level;
/* below match "show" flags for bpf_show_snprintf() */
bool compact; /* no newlines/indentation */
bool skip_names; /* skip member/type names */
bool emit_zeroes; /* show 0-valued fields */
size_t :0;
};
#define btf_dump_type_data_opts__last_field emit_zeroes
LIBBPF_API int
btf_dump__dump_type_data(struct btf_dump *d, __u32 id,
const void *data, size_t data_sz,
const struct btf_dump_type_data_opts *opts);
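/*
 * Illustrative sketch (not from the original header): dumping the memory in
 * 'data' as a value of BTF type 'id' in compact form, assuming 'd' is a
 * btf_dump instance created earlier with btf_dump__new():
 *
 *   DECLARE_LIBBPF_OPTS(btf_dump_type_data_opts, opts,
 *           .compact = true,
 *   );
 *   int n = btf_dump__dump_type_data(d, id, data, data_sz, &opts);
 *
 *   if (n < 0)
 *           return n;  // e.g. -E2BIG when data_sz is smaller than the type
 */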
/*
* A set of helpers for easier BTF types handling
*/
@@ -143,6 +325,11 @@ static inline bool btf_kflag(const struct btf_type *t)
return BTF_INFO_KFLAG(t->info);
}
static inline bool btf_is_void(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_UNKN;
}
static inline bool btf_is_int(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_INT;
@@ -234,6 +421,16 @@ static inline bool btf_is_datasec(const struct btf_type *t)
return btf_kind(t) == BTF_KIND_DATASEC;
}
static inline bool btf_is_float(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_FLOAT;
}
static inline bool btf_is_tag(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_TAG;
}
static inline __u8 btf_int_encoding(const struct btf_type *t)
{
return BTF_INT_ENCODING(*(__u32 *)(t + 1));
@@ -302,6 +499,12 @@ btf_var_secinfos(const struct btf_type *t)
return (struct btf_var_secinfo *)(t + 1);
}
struct btf_tag;
static inline struct btf_tag *btf_tag(const struct btf_type *t)
{
return (struct btf_tag *)(t + 1);
}
#ifdef __cplusplus
} /* extern "C" */
#endif

File diff suppressed because it is too large

src/gen_loader.c (942 lines, new file)

@@ -0,0 +1,942 @@
// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/* Copyright (c) 2021 Facebook */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <linux/filter.h>
#include <sys/param.h>
#include "btf.h"
#include "bpf.h"
#include "libbpf.h"
#include "libbpf_internal.h"
#include "hashmap.h"
#include "bpf_gen_internal.h"
#include "skel_internal.h"
#define MAX_USED_MAPS 64
#define MAX_USED_PROGS 32
#define MAX_KFUNC_DESCS 256
#define MAX_FD_ARRAY_SZ (MAX_USED_PROGS + MAX_KFUNC_DESCS)
/* The following structure describes the stack layout of the loader program.
* In addition R6 contains the pointer to context.
* R7 contains the result of the last sys_bpf command (typically error or FD).
* R9 contains the result of the last sys_close command.
*
* Naming convention:
* ctx - bpf program context
* stack - bpf program stack
* blob - bpf_attr-s, strings, insns, map data.
* All the bytes that loader prog will use for read/write.
*/
struct loader_stack {
__u32 btf_fd;
__u32 prog_fd[MAX_USED_PROGS];
__u32 inner_map_fd;
};
#define stack_off(field) \
(__s16)(-sizeof(struct loader_stack) + offsetof(struct loader_stack, field))
#define attr_field(attr, field) (attr + offsetof(union bpf_attr, field))
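/* Illustrative note (not from the original source): with MAX_USED_PROGS == 32
 * as defined above, sizeof(struct loader_stack) is 4 + 32 * 4 + 4 = 136, so
 * stack_off(btf_fd) == -136, stack_off(prog_fd[0]) == -132 and
 * stack_off(inner_map_fd) == -4, i.e. negative offsets from R10 (the frame pointer).
 */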
static int realloc_insn_buf(struct bpf_gen *gen, __u32 size)
{
size_t off = gen->insn_cur - gen->insn_start;
void *insn_start;
if (gen->error)
return gen->error;
if (size > INT32_MAX || off + size > INT32_MAX) {
gen->error = -ERANGE;
return -ERANGE;
}
insn_start = realloc(gen->insn_start, off + size);
if (!insn_start) {
gen->error = -ENOMEM;
free(gen->insn_start);
gen->insn_start = NULL;
return -ENOMEM;
}
gen->insn_start = insn_start;
gen->insn_cur = insn_start + off;
return 0;
}
static int realloc_data_buf(struct bpf_gen *gen, __u32 size)
{
size_t off = gen->data_cur - gen->data_start;
void *data_start;
if (gen->error)
return gen->error;
if (size > INT32_MAX || off + size > INT32_MAX) {
gen->error = -ERANGE;
return -ERANGE;
}
data_start = realloc(gen->data_start, off + size);
if (!data_start) {
gen->error = -ENOMEM;
free(gen->data_start);
gen->data_start = NULL;
return -ENOMEM;
}
gen->data_start = data_start;
gen->data_cur = data_start + off;
return 0;
}
static void emit(struct bpf_gen *gen, struct bpf_insn insn)
{
if (realloc_insn_buf(gen, sizeof(insn)))
return;
memcpy(gen->insn_cur, &insn, sizeof(insn));
gen->insn_cur += sizeof(insn);
}
static void emit2(struct bpf_gen *gen, struct bpf_insn insn1, struct bpf_insn insn2)
{
emit(gen, insn1);
emit(gen, insn2);
}
void bpf_gen__init(struct bpf_gen *gen, int log_level)
{
size_t stack_sz = sizeof(struct loader_stack);
int i;
gen->log_level = log_level;
/* save ctx pointer into R6 */
emit(gen, BPF_MOV64_REG(BPF_REG_6, BPF_REG_1));
/* bzero stack */
emit(gen, BPF_MOV64_REG(BPF_REG_1, BPF_REG_10));
emit(gen, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -stack_sz));
emit(gen, BPF_MOV64_IMM(BPF_REG_2, stack_sz));
emit(gen, BPF_MOV64_IMM(BPF_REG_3, 0));
emit(gen, BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel));
/* jump over cleanup code */
emit(gen, BPF_JMP_IMM(BPF_JA, 0, 0,
/* size of cleanup code below */
(stack_sz / 4) * 3 + 2));
/* remember the label where all error branches will jump to */
gen->cleanup_label = gen->insn_cur - gen->insn_start;
/* emit cleanup code: close all temp FDs */
for (i = 0; i < stack_sz; i += 4) {
emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, -stack_sz + i));
emit(gen, BPF_JMP_IMM(BPF_JSLE, BPF_REG_1, 0, 1));
emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_close));
}
/* R7 contains the error code from sys_bpf. Copy it into R0 and exit. */
emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_7));
emit(gen, BPF_EXIT_INSN());
}
static int add_data(struct bpf_gen *gen, const void *data, __u32 size)
{
__u32 size8 = roundup(size, 8);
__u64 zero = 0;
void *prev;
if (realloc_data_buf(gen, size8))
return 0;
prev = gen->data_cur;
if (data) {
memcpy(gen->data_cur, data, size);
memcpy(gen->data_cur + size, &zero, size8 - size);
} else {
memset(gen->data_cur, 0, size8);
}
gen->data_cur += size8;
return prev - gen->data_start;
}
/* Get index for map_fd/btf_fd slot in reserved fd_array, or in data relative
* to start of fd_array. Caller can decide if it is usable or not.
*/
static int add_map_fd(struct bpf_gen *gen)
{
if (!gen->fd_array)
gen->fd_array = add_data(gen, NULL, MAX_FD_ARRAY_SZ * sizeof(int));
if (gen->nr_maps == MAX_USED_MAPS) {
pr_warn("Total maps exceeds %d\n", MAX_USED_MAPS);
gen->error = -E2BIG;
return 0;
}
return gen->nr_maps++;
}
static int add_kfunc_btf_fd(struct bpf_gen *gen)
{
int cur;
if (!gen->fd_array)
gen->fd_array = add_data(gen, NULL, MAX_FD_ARRAY_SZ * sizeof(int));
if (gen->nr_fd_array == MAX_KFUNC_DESCS) {
cur = add_data(gen, NULL, sizeof(int));
return (cur - gen->fd_array) / sizeof(int);
}
return MAX_USED_MAPS + gen->nr_fd_array++;
}
static int blob_fd_array_off(struct bpf_gen *gen, int index)
{
return gen->fd_array + index * sizeof(int);
}
static int insn_bytes_to_bpf_size(__u32 sz)
{
switch (sz) {
case 8: return BPF_DW;
case 4: return BPF_W;
case 2: return BPF_H;
case 1: return BPF_B;
default: return -1;
}
}
/* *(u64 *)(blob + off) = (u64)(void *)(blob + data) */
static void emit_rel_store(struct bpf_gen *gen, int off, int data)
{
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, data));
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, off));
emit(gen, BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0));
}
static void move_blob2blob(struct bpf_gen *gen, int off, int size, int blob_off)
{
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, blob_off));
emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_2, 0));
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, off));
emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_1, BPF_REG_0, 0));
}
static void move_blob2ctx(struct bpf_gen *gen, int ctx_off, int size, int blob_off)
{
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, blob_off));
emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_1, 0));
emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_6, BPF_REG_0, ctx_off));
}
static void move_ctx2blob(struct bpf_gen *gen, int off, int size, int ctx_off,
bool check_non_zero)
{
emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_6, ctx_off));
if (check_non_zero)
/* If value in ctx is zero don't update the blob.
* For example: when ctx->map.max_entries == 0, keep default max_entries from bpf.c
*/
emit(gen, BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3));
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, off));
emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_1, BPF_REG_0, 0));
}
static void move_stack2blob(struct bpf_gen *gen, int off, int size, int stack_off)
{
emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_10, stack_off));
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, off));
emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_1, BPF_REG_0, 0));
}
static void move_stack2ctx(struct bpf_gen *gen, int ctx_off, int size, int stack_off)
{
emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_10, stack_off));
emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_6, BPF_REG_0, ctx_off));
}
static void emit_sys_bpf(struct bpf_gen *gen, int cmd, int attr, int attr_size)
{
emit(gen, BPF_MOV64_IMM(BPF_REG_1, cmd));
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, attr));
emit(gen, BPF_MOV64_IMM(BPF_REG_3, attr_size));
emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_bpf));
/* remember the result in R7 */
emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
}
static bool is_simm16(__s64 value)
{
return value == (__s64)(__s16)value;
}
static void emit_check_err(struct bpf_gen *gen)
{
__s64 off = -(gen->insn_cur - gen->insn_start - gen->cleanup_label) / 8 - 1;
/* R7 contains result of last sys_bpf command.
* if (R7 < 0) goto cleanup;
*/
if (is_simm16(off)) {
emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, off));
} else {
gen->error = -ERANGE;
emit(gen, BPF_JMP_IMM(BPF_JA, 0, 0, -1));
}
}
/* reg1 and reg2 should not be R1 - R5. They can be R0, R6 - R10 */
static void emit_debug(struct bpf_gen *gen, int reg1, int reg2,
const char *fmt, va_list args)
{
char buf[1024];
int addr, len, ret;
if (!gen->log_level)
return;
ret = vsnprintf(buf, sizeof(buf), fmt, args);
if (ret < 1024 - 7 && reg1 >= 0 && reg2 < 0)
/* The special case to accommodate common debug_ret():
* to avoid specifying BPF_REG_7 and adding " r=%%d" to
* prints explicitly.
*/
strcat(buf, " r=%d");
len = strlen(buf) + 1;
addr = add_data(gen, buf, len);
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, addr));
emit(gen, BPF_MOV64_IMM(BPF_REG_2, len));
if (reg1 >= 0)
emit(gen, BPF_MOV64_REG(BPF_REG_3, reg1));
if (reg2 >= 0)
emit(gen, BPF_MOV64_REG(BPF_REG_4, reg2));
emit(gen, BPF_EMIT_CALL(BPF_FUNC_trace_printk));
}
static void debug_regs(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, ...)
{
va_list args;
va_start(args, fmt);
emit_debug(gen, reg1, reg2, fmt, args);
va_end(args);
}
static void debug_ret(struct bpf_gen *gen, const char *fmt, ...)
{
va_list args;
va_start(args, fmt);
emit_debug(gen, BPF_REG_7, -1, fmt, args);
va_end(args);
}
static void __emit_sys_close(struct bpf_gen *gen)
{
emit(gen, BPF_JMP_IMM(BPF_JSLE, BPF_REG_1, 0,
/* 2 is the number of the following insns
* 6 is additional insns in debug_regs
*/
2 + (gen->log_level ? 6 : 0)));
emit(gen, BPF_MOV64_REG(BPF_REG_9, BPF_REG_1));
emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_close));
debug_regs(gen, BPF_REG_9, BPF_REG_0, "close(%%d) = %%d");
}
static void emit_sys_close_stack(struct bpf_gen *gen, int stack_off)
{
emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, stack_off));
__emit_sys_close(gen);
}
static void emit_sys_close_blob(struct bpf_gen *gen, int blob_off)
{
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, blob_off));
emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, 0));
__emit_sys_close(gen);
}
int bpf_gen__finish(struct bpf_gen *gen)
{
int i;
emit_sys_close_stack(gen, stack_off(btf_fd));
for (i = 0; i < gen->nr_progs; i++)
move_stack2ctx(gen,
sizeof(struct bpf_loader_ctx) +
sizeof(struct bpf_map_desc) * gen->nr_maps +
sizeof(struct bpf_prog_desc) * i +
offsetof(struct bpf_prog_desc, prog_fd), 4,
stack_off(prog_fd[i]));
for (i = 0; i < gen->nr_maps; i++)
move_blob2ctx(gen,
sizeof(struct bpf_loader_ctx) +
sizeof(struct bpf_map_desc) * i +
offsetof(struct bpf_map_desc, map_fd), 4,
blob_fd_array_off(gen, i));
emit(gen, BPF_MOV64_IMM(BPF_REG_0, 0));
emit(gen, BPF_EXIT_INSN());
pr_debug("gen: finish %d\n", gen->error);
if (!gen->error) {
struct gen_loader_opts *opts = gen->opts;
opts->insns = gen->insn_start;
opts->insns_sz = gen->insn_cur - gen->insn_start;
opts->data = gen->data_start;
opts->data_sz = gen->data_cur - gen->data_start;
}
return gen->error;
}
void bpf_gen__free(struct bpf_gen *gen)
{
if (!gen)
return;
free(gen->data_start);
free(gen->insn_start);
free(gen);
}
void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data,
__u32 btf_raw_size)
{
int attr_size = offsetofend(union bpf_attr, btf_log_level);
int btf_data, btf_load_attr;
union bpf_attr attr;
memset(&attr, 0, attr_size);
pr_debug("gen: load_btf: size %d\n", btf_raw_size);
btf_data = add_data(gen, btf_raw_data, btf_raw_size);
attr.btf_size = btf_raw_size;
btf_load_attr = add_data(gen, &attr, attr_size);
/* populate union bpf_attr with user provided log details */
move_ctx2blob(gen, attr_field(btf_load_attr, btf_log_level), 4,
offsetof(struct bpf_loader_ctx, log_level), false);
move_ctx2blob(gen, attr_field(btf_load_attr, btf_log_size), 4,
offsetof(struct bpf_loader_ctx, log_size), false);
move_ctx2blob(gen, attr_field(btf_load_attr, btf_log_buf), 8,
offsetof(struct bpf_loader_ctx, log_buf), false);
/* populate union bpf_attr with a pointer to the BTF data */
emit_rel_store(gen, attr_field(btf_load_attr, btf), btf_data);
/* emit BTF_LOAD command */
emit_sys_bpf(gen, BPF_BTF_LOAD, btf_load_attr, attr_size);
debug_ret(gen, "btf_load size %d", btf_raw_size);
emit_check_err(gen);
/* remember btf_fd in the stack, if successful */
emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(btf_fd)));
}
void bpf_gen__map_create(struct bpf_gen *gen,
struct bpf_create_map_attr *map_attr, int map_idx)
{
int attr_size = offsetofend(union bpf_attr, btf_vmlinux_value_type_id);
bool close_inner_map_fd = false;
int map_create_attr, idx;
union bpf_attr attr;
memset(&attr, 0, attr_size);
attr.map_type = map_attr->map_type;
attr.key_size = map_attr->key_size;
attr.value_size = map_attr->value_size;
attr.map_flags = map_attr->map_flags;
memcpy(attr.map_name, map_attr->name,
min((unsigned)strlen(map_attr->name), BPF_OBJ_NAME_LEN - 1));
attr.numa_node = map_attr->numa_node;
attr.map_ifindex = map_attr->map_ifindex;
attr.max_entries = map_attr->max_entries;
switch (attr.map_type) {
case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
case BPF_MAP_TYPE_CGROUP_ARRAY:
case BPF_MAP_TYPE_STACK_TRACE:
case BPF_MAP_TYPE_ARRAY_OF_MAPS:
case BPF_MAP_TYPE_HASH_OF_MAPS:
case BPF_MAP_TYPE_DEVMAP:
case BPF_MAP_TYPE_DEVMAP_HASH:
case BPF_MAP_TYPE_CPUMAP:
case BPF_MAP_TYPE_XSKMAP:
case BPF_MAP_TYPE_SOCKMAP:
case BPF_MAP_TYPE_SOCKHASH:
case BPF_MAP_TYPE_QUEUE:
case BPF_MAP_TYPE_STACK:
case BPF_MAP_TYPE_RINGBUF:
break;
default:
attr.btf_key_type_id = map_attr->btf_key_type_id;
attr.btf_value_type_id = map_attr->btf_value_type_id;
}
pr_debug("gen: map_create: %s idx %d type %d value_type_id %d\n",
attr.map_name, map_idx, map_attr->map_type, attr.btf_value_type_id);
map_create_attr = add_data(gen, &attr, attr_size);
if (attr.btf_value_type_id)
/* populate union bpf_attr with btf_fd saved in the stack earlier */
move_stack2blob(gen, attr_field(map_create_attr, btf_fd), 4,
stack_off(btf_fd));
switch (attr.map_type) {
case BPF_MAP_TYPE_ARRAY_OF_MAPS:
case BPF_MAP_TYPE_HASH_OF_MAPS:
move_stack2blob(gen, attr_field(map_create_attr, inner_map_fd), 4,
stack_off(inner_map_fd));
close_inner_map_fd = true;
break;
default:
break;
}
/* conditionally update max_entries */
if (map_idx >= 0)
move_ctx2blob(gen, attr_field(map_create_attr, max_entries), 4,
sizeof(struct bpf_loader_ctx) +
sizeof(struct bpf_map_desc) * map_idx +
offsetof(struct bpf_map_desc, max_entries),
true /* check that max_entries != 0 */);
/* emit MAP_CREATE command */
emit_sys_bpf(gen, BPF_MAP_CREATE, map_create_attr, attr_size);
debug_ret(gen, "map_create %s idx %d type %d value_size %d value_btf_id %d",
attr.map_name, map_idx, map_attr->map_type, attr.value_size,
attr.btf_value_type_id);
emit_check_err(gen);
/* remember map_fd in the stack, if successful */
if (map_idx < 0) {
/* This bpf_gen__map_create() function is called with map_idx >= 0
* for all maps that libbpf loading logic tracks.
* It's called with -1 to create an inner map.
*/
emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7,
stack_off(inner_map_fd)));
} else if (map_idx != gen->nr_maps) {
gen->error = -EDOM; /* internal bug */
return;
} else {
/* add_map_fd does gen->nr_maps++ */
idx = add_map_fd(gen);
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, blob_fd_array_off(gen, idx)));
emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_7, 0));
}
if (close_inner_map_fd)
emit_sys_close_stack(gen, stack_off(inner_map_fd));
}
void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *attach_name,
enum bpf_attach_type type)
{
const char *prefix;
int kind, ret;
btf_get_kernel_prefix_kind(type, &prefix, &kind);
gen->attach_kind = kind;
ret = snprintf(gen->attach_target, sizeof(gen->attach_target), "%s%s",
prefix, attach_name);
if (ret == sizeof(gen->attach_target))
gen->error = -ENOSPC;
}
static void emit_find_attach_target(struct bpf_gen *gen)
{
int name, len = strlen(gen->attach_target) + 1;
pr_debug("gen: find_attach_tgt %s %d\n", gen->attach_target, gen->attach_kind);
name = add_data(gen, gen->attach_target, len);
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, name));
emit(gen, BPF_MOV64_IMM(BPF_REG_2, len));
emit(gen, BPF_MOV64_IMM(BPF_REG_3, gen->attach_kind));
emit(gen, BPF_MOV64_IMM(BPF_REG_4, 0));
emit(gen, BPF_EMIT_CALL(BPF_FUNC_btf_find_by_name_kind));
emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
debug_ret(gen, "find_by_name_kind(%s,%d)",
gen->attach_target, gen->attach_kind);
emit_check_err(gen);
/* if successful, btf_id is in lower 32-bit of R7 and
* btf_obj_fd is in upper 32-bit
*/
}
void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak,
int kind, int insn_idx)
{
struct ksym_relo_desc *relo;
relo = libbpf_reallocarray(gen->relos, gen->relo_cnt + 1, sizeof(*relo));
if (!relo) {
gen->error = -ENOMEM;
return;
}
gen->relos = relo;
relo += gen->relo_cnt;
relo->name = name;
relo->is_weak = is_weak;
relo->kind = kind;
relo->insn_idx = insn_idx;
gen->relo_cnt++;
}
/* returns existing ksym_desc with ref incremented, or inserts a new one */
static struct ksym_desc *get_ksym_desc(struct bpf_gen *gen, struct ksym_relo_desc *relo)
{
struct ksym_desc *kdesc;
for (int i = 0; i < gen->nr_ksyms; i++) {
if (!strcmp(gen->ksyms[i].name, relo->name)) {
gen->ksyms[i].ref++;
return &gen->ksyms[i];
}
}
kdesc = libbpf_reallocarray(gen->ksyms, gen->nr_ksyms + 1, sizeof(*kdesc));
if (!kdesc) {
gen->error = -ENOMEM;
return NULL;
}
gen->ksyms = kdesc;
kdesc = &gen->ksyms[gen->nr_ksyms++];
kdesc->name = relo->name;
kdesc->kind = relo->kind;
kdesc->ref = 1;
kdesc->off = 0;
kdesc->insn = 0;
return kdesc;
}
/* Overwrites BPF_REG_{0, 1, 2, 3, 4, 7}
* Returns result in BPF_REG_7
*/
static void emit_bpf_find_by_name_kind(struct bpf_gen *gen, struct ksym_relo_desc *relo)
{
int name_off, len = strlen(relo->name) + 1;
name_off = add_data(gen, relo->name, len);
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, name_off));
emit(gen, BPF_MOV64_IMM(BPF_REG_2, len));
emit(gen, BPF_MOV64_IMM(BPF_REG_3, relo->kind));
emit(gen, BPF_MOV64_IMM(BPF_REG_4, 0));
emit(gen, BPF_EMIT_CALL(BPF_FUNC_btf_find_by_name_kind));
emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
debug_ret(gen, "find_by_name_kind(%s,%d)", relo->name, relo->kind);
}
/* Expects:
* BPF_REG_8 - pointer to instruction
*
* We need to reuse BTF fd for same symbol otherwise each relocation takes a new
* index, while kernel limits total kfunc BTFs to 256. For duplicate symbols,
* this would mean a new BTF fd index for each entry. By pairing symbol name
* with index, we get the insn->imm, insn->off pairing that kernel uses for
* kfunc_tab, which becomes the effective limit even though all of them may
* share same index in fd_array (such that kfunc_btf_tab has 1 element).
*/
static void emit_relo_kfunc_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insn)
{
struct ksym_desc *kdesc;
int btf_fd_idx;
kdesc = get_ksym_desc(gen, relo);
if (!kdesc)
return;
/* try to copy from existing bpf_insn */
if (kdesc->ref > 1) {
move_blob2blob(gen, insn + offsetof(struct bpf_insn, imm), 4,
kdesc->insn + offsetof(struct bpf_insn, imm));
move_blob2blob(gen, insn + offsetof(struct bpf_insn, off), 2,
kdesc->insn + offsetof(struct bpf_insn, off));
goto log;
}
/* remember insn offset, so we can copy BTF ID and FD later */
kdesc->insn = insn;
emit_bpf_find_by_name_kind(gen, relo);
if (!relo->is_weak)
emit_check_err(gen);
/* get index in fd_array to store BTF FD at */
btf_fd_idx = add_kfunc_btf_fd(gen);
if (btf_fd_idx > INT16_MAX) {
pr_warn("BTF fd off %d for kfunc %s exceeds INT16_MAX, cannot process relocation\n",
btf_fd_idx, relo->name);
gen->error = -E2BIG;
return;
}
kdesc->off = btf_fd_idx;
/* set a default value for imm */
emit(gen, BPF_ST_MEM(BPF_W, BPF_REG_8, offsetof(struct bpf_insn, imm), 0));
/* skip success case store if ret < 0 */
emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, 1));
/* store btf_id into insn[insn_idx].imm */
emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_7, offsetof(struct bpf_insn, imm)));
/* load fd_array slot pointer */
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, blob_fd_array_off(gen, btf_fd_idx)));
/* skip store of BTF fd if ret < 0 */
emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, 3));
/* store BTF fd in slot */
emit(gen, BPF_MOV64_REG(BPF_REG_9, BPF_REG_7));
emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_9, 32));
emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_9, 0));
/* set a default value for off */
emit(gen, BPF_ST_MEM(BPF_H, BPF_REG_8, offsetof(struct bpf_insn, off), 0));
/* skip insn->off store if ret < 0 */
emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, 2));
/* skip if vmlinux BTF */
emit(gen, BPF_JMP_IMM(BPF_JEQ, BPF_REG_9, 0, 1));
/* store index into insn[insn_idx].off */
emit(gen, BPF_ST_MEM(BPF_H, BPF_REG_8, offsetof(struct bpf_insn, off), btf_fd_idx));
log:
if (!gen->log_level)
return;
emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_8,
offsetof(struct bpf_insn, imm)));
emit(gen, BPF_LDX_MEM(BPF_H, BPF_REG_9, BPF_REG_8,
offsetof(struct bpf_insn, off)));
debug_regs(gen, BPF_REG_7, BPF_REG_9, " func (%s:count=%d): imm: %%d, off: %%d",
relo->name, kdesc->ref);
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, blob_fd_array_off(gen, kdesc->off)));
emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_9, BPF_REG_0, 0));
debug_regs(gen, BPF_REG_9, -1, " func (%s:count=%d): btf_fd",
relo->name, kdesc->ref);
}
/* Expects:
* BPF_REG_8 - pointer to instruction
*/
static void emit_relo_ksym_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insn)
{
struct ksym_desc *kdesc;
kdesc = get_ksym_desc(gen, relo);
if (!kdesc)
return;
/* try to copy from existing ldimm64 insn */
if (kdesc->ref > 1) {
move_blob2blob(gen, insn + offsetof(struct bpf_insn, imm), 4,
kdesc->insn + offsetof(struct bpf_insn, imm));
move_blob2blob(gen, insn + sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm), 4,
kdesc->insn + sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm));
goto log;
}
/* remember insn offset, so we can copy BTF ID and FD later */
kdesc->insn = insn;
emit_bpf_find_by_name_kind(gen, relo);
emit_check_err(gen);
/* store btf_id into insn[insn_idx].imm */
emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_7, offsetof(struct bpf_insn, imm)));
/* store btf_obj_fd into insn[insn_idx + 1].imm */
emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_7, 32));
emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_7,
sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm)));
log:
if (!gen->log_level)
return;
emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_8,
offsetof(struct bpf_insn, imm)));
emit(gen, BPF_LDX_MEM(BPF_H, BPF_REG_9, BPF_REG_8, sizeof(struct bpf_insn) +
offsetof(struct bpf_insn, imm)));
debug_regs(gen, BPF_REG_7, BPF_REG_9, " var (%s:count=%d): imm: %%d, fd: %%d",
relo->name, kdesc->ref);
}
static void emit_relo(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insns)
{
int insn;
pr_debug("gen: emit_relo (%d): %s at %d\n", relo->kind, relo->name, relo->insn_idx);
insn = insns + sizeof(struct bpf_insn) * relo->insn_idx;
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_8, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, insn));
switch (relo->kind) {
case BTF_KIND_VAR:
emit_relo_ksym_btf(gen, relo, insn);
break;
case BTF_KIND_FUNC:
emit_relo_kfunc_btf(gen, relo, insn);
break;
default:
pr_warn("Unknown relocation kind '%d'\n", relo->kind);
gen->error = -EDOM;
return;
}
}
static void emit_relos(struct bpf_gen *gen, int insns)
{
int i;
for (i = 0; i < gen->relo_cnt; i++)
emit_relo(gen, gen->relos + i, insns);
}
static void cleanup_relos(struct bpf_gen *gen, int insns)
{
int i, insn;
for (i = 0; i < gen->nr_ksyms; i++) {
if (gen->ksyms[i].kind == BTF_KIND_VAR) {
/* close fd recorded in insn[insn_idx + 1].imm */
insn = gen->ksyms[i].insn;
insn += sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm);
emit_sys_close_blob(gen, insn);
} else { /* BTF_KIND_FUNC */
emit_sys_close_blob(gen, blob_fd_array_off(gen, gen->ksyms[i].off));
if (gen->ksyms[i].off < MAX_FD_ARRAY_SZ)
gen->nr_fd_array--;
}
}
if (gen->nr_ksyms) {
free(gen->ksyms);
gen->nr_ksyms = 0;
gen->ksyms = NULL;
}
if (gen->relo_cnt) {
free(gen->relos);
gen->relo_cnt = 0;
gen->relos = NULL;
}
}
void bpf_gen__prog_load(struct bpf_gen *gen,
struct bpf_prog_load_params *load_attr, int prog_idx)
{
int attr_size = offsetofend(union bpf_attr, fd_array);
int prog_load_attr, license, insns, func_info, line_info;
union bpf_attr attr;
memset(&attr, 0, attr_size);
pr_debug("gen: prog_load: type %d insns_cnt %zd\n",
load_attr->prog_type, load_attr->insn_cnt);
/* add license string to blob of bytes */
license = add_data(gen, load_attr->license, strlen(load_attr->license) + 1);
/* add insns to blob of bytes */
insns = add_data(gen, load_attr->insns,
load_attr->insn_cnt * sizeof(struct bpf_insn));
attr.prog_type = load_attr->prog_type;
attr.expected_attach_type = load_attr->expected_attach_type;
attr.attach_btf_id = load_attr->attach_btf_id;
attr.prog_ifindex = load_attr->prog_ifindex;
attr.kern_version = 0;
attr.insn_cnt = (__u32)load_attr->insn_cnt;
attr.prog_flags = load_attr->prog_flags;
attr.func_info_rec_size = load_attr->func_info_rec_size;
attr.func_info_cnt = load_attr->func_info_cnt;
func_info = add_data(gen, load_attr->func_info,
attr.func_info_cnt * attr.func_info_rec_size);
attr.line_info_rec_size = load_attr->line_info_rec_size;
attr.line_info_cnt = load_attr->line_info_cnt;
line_info = add_data(gen, load_attr->line_info,
attr.line_info_cnt * attr.line_info_rec_size);
memcpy(attr.prog_name, load_attr->name,
min((unsigned)strlen(load_attr->name), BPF_OBJ_NAME_LEN - 1));
prog_load_attr = add_data(gen, &attr, attr_size);
/* populate union bpf_attr with a pointer to license */
emit_rel_store(gen, attr_field(prog_load_attr, license), license);
/* populate union bpf_attr with a pointer to instructions */
emit_rel_store(gen, attr_field(prog_load_attr, insns), insns);
/* populate union bpf_attr with a pointer to func_info */
emit_rel_store(gen, attr_field(prog_load_attr, func_info), func_info);
/* populate union bpf_attr with a pointer to line_info */
emit_rel_store(gen, attr_field(prog_load_attr, line_info), line_info);
/* populate union bpf_attr fd_array with a pointer to data where map_fds are saved */
emit_rel_store(gen, attr_field(prog_load_attr, fd_array), gen->fd_array);
/* populate union bpf_attr with user provided log details */
move_ctx2blob(gen, attr_field(prog_load_attr, log_level), 4,
offsetof(struct bpf_loader_ctx, log_level), false);
move_ctx2blob(gen, attr_field(prog_load_attr, log_size), 4,
offsetof(struct bpf_loader_ctx, log_size), false);
move_ctx2blob(gen, attr_field(prog_load_attr, log_buf), 8,
offsetof(struct bpf_loader_ctx, log_buf), false);
/* populate union bpf_attr with btf_fd saved in the stack earlier */
move_stack2blob(gen, attr_field(prog_load_attr, prog_btf_fd), 4,
stack_off(btf_fd));
if (gen->attach_kind) {
emit_find_attach_target(gen);
/* populate union bpf_attr with btf_id and btf_obj_fd found by helper */
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, prog_load_attr));
emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7,
offsetof(union bpf_attr, attach_btf_id)));
emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_7, 32));
emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7,
offsetof(union bpf_attr, attach_btf_obj_fd)));
}
emit_relos(gen, insns);
/* emit PROG_LOAD command */
emit_sys_bpf(gen, BPF_PROG_LOAD, prog_load_attr, attr_size);
debug_ret(gen, "prog_load %s insn_cnt %d", attr.prog_name, attr.insn_cnt);
/* successful or not, close btf module FDs used in extern ksyms and attach_btf_obj_fd */
cleanup_relos(gen, insns);
if (gen->attach_kind)
emit_sys_close_blob(gen,
attr_field(prog_load_attr, attach_btf_obj_fd));
emit_check_err(gen);
/* remember prog_fd in the stack, if successful */
emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7,
stack_off(prog_fd[gen->nr_progs])));
gen->nr_progs++;
}
void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,
__u32 value_size)
{
int attr_size = offsetofend(union bpf_attr, flags);
int map_update_attr, value, key;
union bpf_attr attr;
int zero = 0;
memset(&attr, 0, attr_size);
pr_debug("gen: map_update_elem: idx %d\n", map_idx);
value = add_data(gen, pvalue, value_size);
key = add_data(gen, &zero, sizeof(zero));
/* if (map_desc[map_idx].initial_value)
* copy_from_user(value, initial_value, value_size);
*/
emit(gen, BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_6,
sizeof(struct bpf_loader_ctx) +
sizeof(struct bpf_map_desc) * map_idx +
offsetof(struct bpf_map_desc, initial_value)));
emit(gen, BPF_JMP_IMM(BPF_JEQ, BPF_REG_3, 0, 4));
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, value));
emit(gen, BPF_MOV64_IMM(BPF_REG_2, value_size));
emit(gen, BPF_EMIT_CALL(BPF_FUNC_copy_from_user));
map_update_attr = add_data(gen, &attr, attr_size);
move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4,
blob_fd_array_off(gen, map_idx));
emit_rel_store(gen, attr_field(map_update_attr, key), key);
emit_rel_store(gen, attr_field(map_update_attr, value), value);
/* emit MAP_UPDATE_ELEM command */
emit_sys_bpf(gen, BPF_MAP_UPDATE_ELEM, map_update_attr, attr_size);
debug_ret(gen, "update_elem idx %d value_size %d", map_idx, value_size);
emit_check_err(gen);
}
void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx)
{
int attr_size = offsetofend(union bpf_attr, map_fd);
int map_freeze_attr;
union bpf_attr attr;
memset(&attr, 0, attr_size);
pr_debug("gen: map_freeze: idx %d\n", map_idx);
map_freeze_attr = add_data(gen, &attr, attr_size);
move_blob2blob(gen, attr_field(map_freeze_attr, map_fd), 4,
blob_fd_array_off(gen, map_idx));
/* emit MAP_FREEZE command */
emit_sys_bpf(gen, BPF_MAP_FREEZE, map_freeze_attr, attr_size);
debug_ret(gen, "map_freeze");
emit_check_err(gen);
}

src/hashmap.c

@@ -12,6 +12,12 @@
#include <linux/err.h>
#include "hashmap.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
/* prevent accidental re-addition of reallocarray() */
#pragma GCC poison reallocarray
/* start with 4 buckets */
#define HASHMAP_MIN_CAP_BITS 2
@@ -56,7 +62,14 @@ struct hashmap *hashmap__new(hashmap_hash_fn hash_fn,
void hashmap__clear(struct hashmap *map)
{
struct hashmap_entry *cur, *tmp;
size_t bkt;
hashmap__for_each_entry_safe(map, cur, tmp, bkt) {
free(cur);
}
free(map->buckets);
map->buckets = NULL;
map->cap = map->cap_bits = map->sz = 0;
}
@@ -90,8 +103,7 @@ static int hashmap_grow(struct hashmap *map)
struct hashmap_entry **new_buckets;
struct hashmap_entry *cur, *tmp;
size_t new_cap_bits, new_cap;
size_t h;
int bkt;
size_t h, bkt;
new_cap_bits = map->cap_bits + 1;
if (new_cap_bits < HASHMAP_MIN_CAP_BITS)


@@ -10,17 +10,34 @@
#include <stdbool.h>
#include <stddef.h>
#ifdef __GLIBC__
#include <bits/wordsize.h>
#else
#include <bits/reg.h>
#endif
#include "libbpf_internal.h"
#include <limits.h>
static inline size_t hash_bits(size_t h, int bits)
{
/* shuffle bits and return requested number of upper bits */
return (h * 11400714819323198485llu) >> (__WORDSIZE - bits);
if (bits == 0)
return 0;
#if (__SIZEOF_SIZE_T__ == __SIZEOF_LONG_LONG__)
/* LP64 case */
return (h * 11400714819323198485llu) >> (__SIZEOF_LONG_LONG__ * 8 - bits);
#elif (__SIZEOF_SIZE_T__ <= __SIZEOF_LONG__)
return (h * 2654435769lu) >> (__SIZEOF_LONG__ * 8 - bits);
#else
# error "Unsupported size_t size"
#endif
}
/* generic C-string hashing function */
static inline size_t str_hash(const char *s)
{
size_t h = 0;
while (*s) {
h = h * 31 + *s;
s++;
}
return h;
}
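The two inline helpers above are the building blocks libbpf's hashmap uses to turn a string key into a bucket index. A minimal sketch of combining them (assuming this hashmap.h is on the include path; the key string and bucket count are arbitrary examples):

#include <stdio.h>
#include "hashmap.h"

int main(void)
{
	int cap_bits = 4; /* 2^4 = 16 buckets */
	size_t h = str_hash("bpf_prog_name");

	/* hash_bits() keeps only the top cap_bits bits of the mixed hash */
	printf("bucket index: %zu of %u\n", hash_bits(h, cap_bits), 1u << cap_bits);
	return 0;
}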
typedef size_t (*hashmap_hash_fn)(const void *key, void *ctx);
@@ -160,17 +177,17 @@ bool hashmap__find(const struct hashmap *map, const void *key, void **value);
* @key: key to iterate entries for
*/
#define hashmap__for_each_key_entry(map, cur, _key) \
for (cur = ({ size_t bkt = hash_bits(map->hash_fn((_key), map->ctx),\
map->cap_bits); \
map->buckets ? map->buckets[bkt] : NULL; }); \
for (cur = map->buckets \
? map->buckets[hash_bits(map->hash_fn((_key), map->ctx), map->cap_bits)] \
: NULL; \
cur; \
cur = cur->next) \
if (map->equal_fn(cur->key, (_key), map->ctx))
#define hashmap__for_each_key_entry_safe(map, cur, tmp, _key) \
for (cur = ({ size_t bkt = hash_bits(map->hash_fn((_key), map->ctx),\
map->cap_bits); \
cur = map->buckets ? map->buckets[bkt] : NULL; }); \
for (cur = map->buckets \
? map->buckets[hash_bits(map->hash_fn((_key), map->ctx), map->cap_bits)] \
: NULL; \
cur && ({ tmp = cur->next; true; }); \
cur = tmp) \
if (map->equal_fn(cur->key, (_key), map->ctx))

src/libbpf.c: 10479 changed lines (diff suppressed because it is too large)


@@ -17,14 +17,13 @@
#include <sys/types.h> // for size_t
#include <linux/bpf.h>
#include "libbpf_common.h"
#include "libbpf_legacy.h"
#ifdef __cplusplus
extern "C" {
#endif
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
#endif
enum libbpf_errno {
__LIBBPF_ERRNO__START = 4000,
@@ -67,18 +66,71 @@ struct bpf_object_open_attr {
enum bpf_prog_type prog_type;
};
struct bpf_object_open_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
/* object name override, if provided:
* - for object open from file, this will override setting object
* name from file path's base name;
* - for object open from memory buffer, this will specify an object
* name and will override default "<addr>-<buf-size>" name;
*/
const char *object_name;
/* parse map definitions non-strictly, allowing extra attributes/data */
bool relaxed_maps;
/* DEPRECATED: handle CO-RE relocations non-strictly, allowing failures.
* The value is ignored; relocations are always processed non-strictly.
* Non-relocatable instructions are replaced with invalid ones to
* prevent accidental errors.
*/
LIBBPF_DEPRECATED_SINCE(0, 6, "field has no effect")
bool relaxed_core_relocs;
/* maps that set the 'pinning' attribute in their definition will have
* their pin_path attribute set to a file in this directory, and be
* auto-pinned to that path on load; defaults to "/sys/fs/bpf".
*/
const char *pin_root_path;
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__set_attach_target() on each individual bpf_program")
__u32 attach_prog_fd;
/* Additional kernel config content that augments and overrides
* system Kconfig for CONFIG_xxx externs.
*/
const char *kconfig;
/* Path to the custom BTF to be used for BPF CO-RE relocations.
* This custom BTF completely replaces the use of vmlinux BTF
* for the purpose of CO-RE relocations.
* NOTE: any other BPF feature (e.g., fentry/fexit programs,
* struct_ops, etc) will need actual kernel BTF at /sys/kernel/btf/vmlinux.
*/
const char *btf_custom_path;
};
#define bpf_object_open_opts__last_field btf_custom_path
LIBBPF_API struct bpf_object *bpf_object__open(const char *path);
LIBBPF_API struct bpf_object *
bpf_object__open_file(const char *path, const struct bpf_object_open_opts *opts);
LIBBPF_API struct bpf_object *
bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz,
const struct bpf_object_open_opts *opts);
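A minimal usage sketch of the opts-based open API with a custom BTF path for CO-RE (the object file name and BTF path below are placeholders, and <bpf/libbpf.h> is assumed to be the installed header location):

#include <bpf/libbpf.h>

static struct bpf_object *open_with_custom_btf(void)
{
	/* .sz is filled in automatically by DECLARE_LIBBPF_OPTS() */
	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
		.btf_custom_path = "/tmp/custom_vmlinux.btf", /* placeholder path */
	);

	/* on error, inspect the result with libbpf_get_error() */
	return bpf_object__open_file("prog.bpf.o", &opts);
}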
/* deprecated bpf_object__open variants */
LIBBPF_API struct bpf_object *
bpf_object__open_buffer(const void *obj_buf, size_t obj_buf_sz,
const char *name);
LIBBPF_API struct bpf_object *
bpf_object__open_xattr(struct bpf_object_open_attr *attr);
struct bpf_object *__bpf_object__open_xattr(struct bpf_object_open_attr *attr,
int flags);
LIBBPF_API struct bpf_object *bpf_object__open_buffer(void *obj_buf,
size_t obj_buf_sz,
const char *name);
int bpf_object__section_size(const struct bpf_object *obj, const char *name,
__u32 *size);
int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
__u32 *off);
enum libbpf_pin_type {
LIBBPF_PIN_NONE,
/* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
LIBBPF_PIN_BY_NAME,
};
/* pin_maps and unpin_maps can both be called with a NULL path, in which case
* they will use the pin_path attribute of each map (and ignore all maps that
* don't have a pin_path set).
*/
LIBBPF_API int bpf_object__pin_maps(struct bpf_object *obj, const char *path);
LIBBPF_API int bpf_object__unpin_maps(struct bpf_object *obj,
const char *path);
@@ -98,9 +150,12 @@ struct bpf_object_load_attr {
/* Load/unload object into/from kernel */
LIBBPF_API int bpf_object__load(struct bpf_object *obj);
LIBBPF_API int bpf_object__load_xattr(struct bpf_object_load_attr *attr);
LIBBPF_DEPRECATED_SINCE(0, 6, "bpf_object__unload() is deprecated, use bpf_object__close() instead")
LIBBPF_API int bpf_object__unload(struct bpf_object *obj);
LIBBPF_API const char *bpf_object__name(const struct bpf_object *obj);
LIBBPF_API unsigned int bpf_object__kversion(const struct bpf_object *obj);
LIBBPF_API int bpf_object__set_kversion(struct bpf_object *obj, __u32 kern_version);
struct btf;
LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj);
@@ -109,6 +164,9 @@ LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj);
LIBBPF_API struct bpf_program *
bpf_object__find_program_by_title(const struct bpf_object *obj,
const char *title);
LIBBPF_API struct bpf_program *
bpf_object__find_program_by_name(const struct bpf_object *obj,
const char *name);
LIBBPF_API struct bpf_object *bpf_object__next(struct bpf_object *prev);
#define bpf_object__for_each_safe(pos, tmp) \
@@ -127,19 +185,27 @@ libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type,
enum bpf_attach_type *expected_attach_type);
LIBBPF_API int libbpf_attach_type_by_name(const char *name,
enum bpf_attach_type *attach_type);
LIBBPF_API int libbpf_find_vmlinux_btf_id(const char *name,
enum bpf_attach_type attach_type);
/* Accessors of bpf_program */
struct bpf_program;
LIBBPF_API struct bpf_program *bpf_program__next(struct bpf_program *prog,
const struct bpf_object *obj);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__next_program() instead")
struct bpf_program *bpf_program__next(struct bpf_program *prog,
const struct bpf_object *obj);
LIBBPF_API struct bpf_program *
bpf_object__next_program(const struct bpf_object *obj, struct bpf_program *prog);
#define bpf_object__for_each_program(pos, obj) \
for ((pos) = bpf_program__next(NULL, (obj)); \
(pos) != NULL; \
(pos) = bpf_program__next((pos), (obj)))
#define bpf_object__for_each_program(pos, obj) \
for ((pos) = bpf_object__next_program((obj), NULL); \
(pos) != NULL; \
(pos) = bpf_object__next_program((obj), (pos)))
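With the non-deprecated iteration macro, walking all programs of an object looks like this (a short sketch; obj is assumed to be an already opened bpf_object):

#include <stdio.h>
#include <bpf/libbpf.h>

static void list_progs(const struct bpf_object *obj)
{
	struct bpf_program *prog;

	bpf_object__for_each_program(prog, obj)
		printf("%s (section %s)\n",
		       bpf_program__name(prog), bpf_program__section_name(prog));
}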
LIBBPF_API struct bpf_program *bpf_program__prev(struct bpf_program *prog,
const struct bpf_object *obj);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__prev_program() instead")
struct bpf_program *bpf_program__prev(struct bpf_program *prog,
const struct bpf_object *obj);
LIBBPF_API struct bpf_program *
bpf_object__prev_program(const struct bpf_object *obj, struct bpf_program *prog);
typedef void (*bpf_program_clear_priv_t)(struct bpf_program *, void *);
@@ -150,8 +216,15 @@ LIBBPF_API void *bpf_program__priv(const struct bpf_program *prog);
LIBBPF_API void bpf_program__set_ifindex(struct bpf_program *prog,
__u32 ifindex);
LIBBPF_API const char *bpf_program__title(const struct bpf_program *prog,
bool needs_copy);
LIBBPF_API const char *bpf_program__name(const struct bpf_program *prog);
LIBBPF_API const char *bpf_program__section_name(const struct bpf_program *prog);
LIBBPF_API LIBBPF_DEPRECATED("BPF program title is confusing term; please use bpf_program__section_name() instead")
const char *bpf_program__title(const struct bpf_program *prog, bool needs_copy);
LIBBPF_API bool bpf_program__autoload(const struct bpf_program *prog);
LIBBPF_API int bpf_program__set_autoload(struct bpf_program *prog, bool autoload);
/* returns program size in bytes */
LIBBPF_API size_t bpf_program__size(const struct bpf_program *prog);
LIBBPF_API int bpf_program__load(struct bpf_program *prog, char *license,
__u32 kern_version);
@@ -168,24 +241,129 @@ LIBBPF_API void bpf_program__unload(struct bpf_program *prog);
struct bpf_link;
LIBBPF_API struct bpf_link *bpf_link__open(const char *path);
LIBBPF_API int bpf_link__fd(const struct bpf_link *link);
LIBBPF_API const char *bpf_link__pin_path(const struct bpf_link *link);
LIBBPF_API int bpf_link__pin(struct bpf_link *link, const char *path);
LIBBPF_API int bpf_link__unpin(struct bpf_link *link);
LIBBPF_API int bpf_link__update_program(struct bpf_link *link,
struct bpf_program *prog);
LIBBPF_API void bpf_link__disconnect(struct bpf_link *link);
LIBBPF_API int bpf_link__detach(struct bpf_link *link);
LIBBPF_API int bpf_link__destroy(struct bpf_link *link);
LIBBPF_API struct bpf_link *
bpf_program__attach_perf_event(struct bpf_program *prog, int pfd);
bpf_program__attach(const struct bpf_program *prog);
struct bpf_perf_event_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
/* custom user-provided value fetchable through bpf_get_attach_cookie() */
__u64 bpf_cookie;
};
#define bpf_perf_event_opts__last_field bpf_cookie
LIBBPF_API struct bpf_link *
bpf_program__attach_kprobe(struct bpf_program *prog, bool retprobe,
bpf_program__attach_perf_event(const struct bpf_program *prog, int pfd);
LIBBPF_API struct bpf_link *
bpf_program__attach_perf_event_opts(const struct bpf_program *prog, int pfd,
const struct bpf_perf_event_opts *opts);
struct bpf_kprobe_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
/* custom user-provided value fetchable through bpf_get_attach_cookie() */
__u64 bpf_cookie;
/* function's offset to install kprobe to */
size_t offset;
/* kprobe is return probe */
bool retprobe;
size_t :0;
};
#define bpf_kprobe_opts__last_field retprobe
LIBBPF_API struct bpf_link *
bpf_program__attach_kprobe(const struct bpf_program *prog, bool retprobe,
const char *func_name);
LIBBPF_API struct bpf_link *
bpf_program__attach_uprobe(struct bpf_program *prog, bool retprobe,
bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
const char *func_name,
const struct bpf_kprobe_opts *opts);
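A hedged sketch of the opts-based kprobe attach: it installs the program as a kretprobe and stashes a cookie that the program can later read back with bpf_get_attach_cookie(). The kernel function name below is just an example.

#include <bpf/libbpf.h>

static struct bpf_link *attach_kretprobe_with_cookie(const struct bpf_program *prog)
{
	DECLARE_LIBBPF_OPTS(bpf_kprobe_opts, opts,
		.retprobe = true,
		.bpf_cookie = 0x1234, /* surfaces in the prog via bpf_get_attach_cookie() */
	);

	/* "do_sys_openat2" is an illustrative kernel function name */
	return bpf_program__attach_kprobe_opts(prog, "do_sys_openat2", &opts);
}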
struct bpf_uprobe_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
/* offset of kernel reference counted USDT semaphore, added in
* a6ca88b241d5 ("trace_uprobe: support reference counter in fd-based uprobe")
*/
size_t ref_ctr_offset;
/* custom user-provided value fetchable through bpf_get_attach_cookie() */
__u64 bpf_cookie;
/* uprobe is return probe, invoked at function return time */
bool retprobe;
size_t :0;
};
#define bpf_uprobe_opts__last_field retprobe
LIBBPF_API struct bpf_link *
bpf_program__attach_uprobe(const struct bpf_program *prog, bool retprobe,
pid_t pid, const char *binary_path,
size_t func_offset);
LIBBPF_API struct bpf_link *
bpf_program__attach_tracepoint(struct bpf_program *prog,
bpf_program__attach_uprobe_opts(const struct bpf_program *prog, pid_t pid,
const char *binary_path, size_t func_offset,
const struct bpf_uprobe_opts *opts);
struct bpf_tracepoint_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
/* custom user-provided value fetchable through bpf_get_attach_cookie() */
__u64 bpf_cookie;
};
#define bpf_tracepoint_opts__last_field bpf_cookie
LIBBPF_API struct bpf_link *
bpf_program__attach_tracepoint(const struct bpf_program *prog,
const char *tp_category,
const char *tp_name);
LIBBPF_API struct bpf_link *
bpf_program__attach_raw_tracepoint(struct bpf_program *prog,
bpf_program__attach_tracepoint_opts(const struct bpf_program *prog,
const char *tp_category,
const char *tp_name,
const struct bpf_tracepoint_opts *opts);
LIBBPF_API struct bpf_link *
bpf_program__attach_raw_tracepoint(const struct bpf_program *prog,
const char *tp_name);
LIBBPF_API struct bpf_link *
bpf_program__attach_trace(const struct bpf_program *prog);
LIBBPF_API struct bpf_link *
bpf_program__attach_lsm(const struct bpf_program *prog);
LIBBPF_API struct bpf_link *
bpf_program__attach_cgroup(const struct bpf_program *prog, int cgroup_fd);
LIBBPF_API struct bpf_link *
bpf_program__attach_netns(const struct bpf_program *prog, int netns_fd);
LIBBPF_API struct bpf_link *
bpf_program__attach_xdp(const struct bpf_program *prog, int ifindex);
LIBBPF_API struct bpf_link *
bpf_program__attach_freplace(const struct bpf_program *prog,
int target_fd, const char *attach_func_name);
struct bpf_map;
LIBBPF_API struct bpf_link *bpf_map__attach_struct_ops(const struct bpf_map *map);
struct bpf_iter_attach_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
union bpf_iter_link_info *link_info;
__u32 link_info_len;
};
#define bpf_iter_attach_opts__last_field link_info_len
LIBBPF_API struct bpf_link *
bpf_program__attach_iter(const struct bpf_program *prog,
const struct bpf_iter_attach_opts *opts);
struct bpf_insn;
@@ -258,24 +436,43 @@ LIBBPF_API int bpf_program__set_socket_filter(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_tracepoint(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_raw_tracepoint(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_kprobe(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_lsm(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_sched_cls(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_sched_act(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_xdp(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_perf_event(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_tracing(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_struct_ops(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_extension(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_sk_lookup(struct bpf_program *prog);
LIBBPF_API enum bpf_prog_type bpf_program__get_type(const struct bpf_program *prog);
LIBBPF_API void bpf_program__set_type(struct bpf_program *prog,
enum bpf_prog_type type);
LIBBPF_API enum bpf_attach_type
bpf_program__get_expected_attach_type(const struct bpf_program *prog);
LIBBPF_API void
bpf_program__set_expected_attach_type(struct bpf_program *prog,
enum bpf_attach_type type);
LIBBPF_API int
bpf_program__set_attach_target(struct bpf_program *prog, int attach_prog_fd,
const char *attach_func_name);
LIBBPF_API bool bpf_program__is_socket_filter(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_tracepoint(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_raw_tracepoint(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_kprobe(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_lsm(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sched_cls(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sched_act(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_xdp(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_perf_event(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_tracing(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_struct_ops(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_extension(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sk_lookup(const struct bpf_program *prog);
/*
* No need for __attribute__((packed)), all members of 'bpf_map_def'
@@ -291,11 +488,14 @@ struct bpf_map_def {
unsigned int map_flags;
};
/*
* The 'struct bpf_map' in include/linux/bpf.h is internal to the kernel,
* so no need to worry about a name clash.
/**
* @brief **bpf_object__find_map_by_name()** returns BPF map of
* the given name, if it exists within the passed BPF object
* @param obj BPF object
* @param name name of the BPF map
* @return BPF map instance, if such map exists within the BPF object;
* or NULL otherwise.
*/
struct bpf_map;
LIBBPF_API struct bpf_map *
bpf_object__find_map_by_name(const struct bpf_object *obj, const char *name);
@@ -309,37 +509,119 @@ bpf_object__find_map_fd_by_name(const struct bpf_object *obj, const char *name);
LIBBPF_API struct bpf_map *
bpf_object__find_map_by_offset(struct bpf_object *obj, size_t offset);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__next_map() instead")
struct bpf_map *bpf_map__next(const struct bpf_map *map, const struct bpf_object *obj);
LIBBPF_API struct bpf_map *
bpf_map__next(const struct bpf_map *map, const struct bpf_object *obj);
bpf_object__next_map(const struct bpf_object *obj, const struct bpf_map *map);
#define bpf_object__for_each_map(pos, obj) \
for ((pos) = bpf_map__next(NULL, (obj)); \
for ((pos) = bpf_object__next_map((obj), NULL); \
(pos) != NULL; \
(pos) = bpf_map__next((pos), (obj)))
(pos) = bpf_object__next_map((obj), (pos)))
#define bpf_map__for_each bpf_object__for_each_map
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__prev_map() instead")
struct bpf_map *bpf_map__prev(const struct bpf_map *map, const struct bpf_object *obj);
LIBBPF_API struct bpf_map *
bpf_map__prev(const struct bpf_map *map, const struct bpf_object *obj);
bpf_object__prev_map(const struct bpf_object *obj, const struct bpf_map *map);
/**
* @brief **bpf_map__fd()** gets the file descriptor of the passed
* BPF map
* @param map the BPF map instance
* @return the file descriptor; or -EINVAL in case of an error
*/
LIBBPF_API int bpf_map__fd(const struct bpf_map *map);
LIBBPF_API int bpf_map__reuse_fd(struct bpf_map *map, int fd);
/* get map definition */
LIBBPF_API const struct bpf_map_def *bpf_map__def(const struct bpf_map *map);
/* get map name */
LIBBPF_API const char *bpf_map__name(const struct bpf_map *map);
/* get/set map type */
LIBBPF_API enum bpf_map_type bpf_map__type(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_type(struct bpf_map *map, enum bpf_map_type type);
/* get/set map size (max_entries) */
LIBBPF_API __u32 bpf_map__max_entries(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_max_entries(struct bpf_map *map, __u32 max_entries);
LIBBPF_API int bpf_map__resize(struct bpf_map *map, __u32 max_entries);
/* get/set map flags */
LIBBPF_API __u32 bpf_map__map_flags(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_map_flags(struct bpf_map *map, __u32 flags);
/* get/set map NUMA node */
LIBBPF_API __u32 bpf_map__numa_node(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_numa_node(struct bpf_map *map, __u32 numa_node);
/* get/set map key size */
LIBBPF_API __u32 bpf_map__key_size(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_key_size(struct bpf_map *map, __u32 size);
/* get/set map value size */
LIBBPF_API __u32 bpf_map__value_size(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_value_size(struct bpf_map *map, __u32 size);
/* get map key/value BTF type IDs */
LIBBPF_API __u32 bpf_map__btf_key_type_id(const struct bpf_map *map);
LIBBPF_API __u32 bpf_map__btf_value_type_id(const struct bpf_map *map);
/* get/set map if_index */
LIBBPF_API __u32 bpf_map__ifindex(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex);
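These getters and setters are meant to be called between bpf_object__open*() and bpf_object__load(). A small sketch that bumps a map's max_entries before load ("events" is a placeholder map name):

#include <errno.h>
#include <bpf/libbpf.h>

static int bump_events_map(struct bpf_object *obj)
{
	struct bpf_map *map = bpf_object__find_map_by_name(obj, "events");

	if (!map)
		return -ENOENT;

	/* only legal before the object (and thus the map) is loaded */
	return bpf_map__set_max_entries(map, 16384);
}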
typedef void (*bpf_map_clear_priv_t)(struct bpf_map *, void *);
LIBBPF_API int bpf_map__set_priv(struct bpf_map *map, void *priv,
bpf_map_clear_priv_t clear_priv);
LIBBPF_API void *bpf_map__priv(const struct bpf_map *map);
LIBBPF_API int bpf_map__reuse_fd(struct bpf_map *map, int fd);
LIBBPF_API int bpf_map__resize(struct bpf_map *map, __u32 max_entries);
LIBBPF_API int bpf_map__set_initial_value(struct bpf_map *map,
const void *data, size_t size);
LIBBPF_API const void *bpf_map__initial_value(struct bpf_map *map, size_t *psize);
LIBBPF_API bool bpf_map__is_offload_neutral(const struct bpf_map *map);
/**
* @brief **bpf_map__is_internal()** tells the caller whether or not the
* passed map is a special map created by libbpf automatically for things like
* global variables, __ksym externs, Kconfig values, etc
* @param map the bpf_map
* @return true, if the map is an internal map; false, otherwise
*/
LIBBPF_API bool bpf_map__is_internal(const struct bpf_map *map);
LIBBPF_API void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex);
LIBBPF_API int bpf_map__set_pin_path(struct bpf_map *map, const char *path);
LIBBPF_API const char *bpf_map__get_pin_path(const struct bpf_map *map);
LIBBPF_API const char *bpf_map__pin_path(const struct bpf_map *map);
LIBBPF_API bool bpf_map__is_pinned(const struct bpf_map *map);
LIBBPF_API int bpf_map__pin(struct bpf_map *map, const char *path);
LIBBPF_API int bpf_map__unpin(struct bpf_map *map, const char *path);
LIBBPF_API int bpf_map__set_inner_map_fd(struct bpf_map *map, int fd);
LIBBPF_API struct bpf_map *bpf_map__inner_map(struct bpf_map *map);
/**
* @brief **libbpf_get_error()** extracts the error code from the passed
* pointer
* @param ptr pointer returned from libbpf API function
* @return error code; or 0 if no error occurred
*
* Many libbpf API functions which return pointers encode error codes in the
* pointer itself rather than returning NULL. For those functions,
* **libbpf_get_error()** should be called on the return value immediately
* after the API call, with no intervening calls that could clobber the
* `errno` variable. Consult each function's documentation to verify whether
* this logic applies.
*
* For these API functions, if `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)`
* is enabled, NULL is returned on error instead.
*
* If ptr is NULL, then errno should be already set by the failing
* API, because libbpf never returns NULL on success and it now always
* sets errno on error.
*
* Example usage:
*
* struct perf_buffer *pb;
*
* pb = perf_buffer__new(bpf_map__fd(obj->maps.events), PERF_BUFFER_PAGES, &opts);
* err = libbpf_get_error(pb);
* if (err) {
* pb = NULL;
* fprintf(stderr, "failed to open perf buffer: %d\n", err);
* goto cleanup;
* }
*/
LIBBPF_API long libbpf_get_error(const void *ptr);
struct bpf_prog_load_attr {
@@ -356,9 +638,94 @@ LIBBPF_API int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
LIBBPF_API int bpf_prog_load(const char *file, enum bpf_prog_type type,
struct bpf_object **pobj, int *prog_fd);
LIBBPF_API int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags);
LIBBPF_API int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags);
/* XDP related API */
struct xdp_link_info {
__u32 prog_id;
__u32 drv_prog_id;
__u32 hw_prog_id;
__u32 skb_prog_id;
__u8 attach_mode;
};
struct bpf_xdp_set_link_opts {
size_t sz;
int old_fd;
size_t :0;
};
#define bpf_xdp_set_link_opts__last_field old_fd
LIBBPF_API int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags);
LIBBPF_API int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags,
const struct bpf_xdp_set_link_opts *opts);
LIBBPF_API int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags);
LIBBPF_API int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info,
size_t info_size, __u32 flags);
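A hedged sketch of the opts-based XDP attach: with XDP_FLAGS_REPLACE (from <linux/if_link.h>) and old_fd set, the kernel swaps the program only if old_fd is still the one attached, which avoids racing with other XDP users on the same interface.

#include <bpf/libbpf.h>
#include <linux/if_link.h>

static int replace_xdp_prog(int ifindex, int new_prog_fd, int old_prog_fd)
{
	DECLARE_LIBBPF_OPTS(bpf_xdp_set_link_opts, opts,
		.old_fd = old_prog_fd,
	);

	return bpf_set_link_xdp_fd_opts(ifindex, new_prog_fd,
					XDP_FLAGS_REPLACE, &opts);
}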
/* TC related API */
enum bpf_tc_attach_point {
BPF_TC_INGRESS = 1 << 0,
BPF_TC_EGRESS = 1 << 1,
BPF_TC_CUSTOM = 1 << 2,
};
#define BPF_TC_PARENT(a, b) \
((((a) << 16) & 0xFFFF0000U) | ((b) & 0x0000FFFFU))
enum bpf_tc_flags {
BPF_TC_F_REPLACE = 1 << 0,
};
struct bpf_tc_hook {
size_t sz;
int ifindex;
enum bpf_tc_attach_point attach_point;
__u32 parent;
size_t :0;
};
#define bpf_tc_hook__last_field parent
struct bpf_tc_opts {
size_t sz;
int prog_fd;
__u32 flags;
__u32 prog_id;
__u32 handle;
__u32 priority;
size_t :0;
};
#define bpf_tc_opts__last_field priority
LIBBPF_API int bpf_tc_hook_create(struct bpf_tc_hook *hook);
LIBBPF_API int bpf_tc_hook_destroy(struct bpf_tc_hook *hook);
LIBBPF_API int bpf_tc_attach(const struct bpf_tc_hook *hook,
struct bpf_tc_opts *opts);
LIBBPF_API int bpf_tc_detach(const struct bpf_tc_hook *hook,
const struct bpf_tc_opts *opts);
LIBBPF_API int bpf_tc_query(const struct bpf_tc_hook *hook,
struct bpf_tc_opts *opts);
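A hedged sketch of the TC hook API: create (or reuse) the clsact qdisc on an interface and attach an already-loaded program FD at ingress. The handle and priority values are arbitrary examples.

#include <errno.h>
#include <bpf/libbpf.h>

static int tc_attach_ingress(int ifindex, int prog_fd)
{
	DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook,
		.ifindex = ifindex,
		.attach_point = BPF_TC_INGRESS,
	);
	DECLARE_LIBBPF_OPTS(bpf_tc_opts, opts,
		.prog_fd = prog_fd,
		.handle = 1,
		.priority = 1,
	);
	int err;

	err = bpf_tc_hook_create(&hook);
	if (err && err != -EEXIST) /* -EEXIST: clsact qdisc already present */
		return err;

	return bpf_tc_attach(&hook, &opts);
}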
/* Ring buffer APIs */
struct ring_buffer;
typedef int (*ring_buffer_sample_fn)(void *ctx, void *data, size_t size);
struct ring_buffer_opts {
size_t sz; /* size of this struct, for forward/backward compatibility */
};
#define ring_buffer_opts__last_field sz
LIBBPF_API struct ring_buffer *
ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, void *ctx,
const struct ring_buffer_opts *opts);
LIBBPF_API void ring_buffer__free(struct ring_buffer *rb);
LIBBPF_API int ring_buffer__add(struct ring_buffer *rb, int map_fd,
ring_buffer_sample_fn sample_cb, void *ctx);
LIBBPF_API int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms);
LIBBPF_API int ring_buffer__consume(struct ring_buffer *rb);
LIBBPF_API int ring_buffer__epoll_fd(const struct ring_buffer *rb);
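A short usage sketch of the ring buffer API: register one callback for a BPF_MAP_TYPE_RINGBUF map FD and poll it in a loop. Error handling follows the libbpf_get_error() convention documented earlier in this header.

#include <stdio.h>
#include <bpf/libbpf.h>

static int on_event(void *ctx, void *data, size_t size)
{
	printf("event of %zu bytes\n", size);
	return 0; /* a negative return would abort polling */
}

static int poll_ringbuf(int ringbuf_map_fd)
{
	struct ring_buffer *rb;
	long err;

	rb = ring_buffer__new(ringbuf_map_fd, on_event, NULL, NULL);
	err = libbpf_get_error(rb);
	if (err)
		return (int)err;

	while ((err = ring_buffer__poll(rb, 100 /* timeout, ms */)) >= 0)
		; /* keep polling until an error (e.g. -EINTR) */

	ring_buffer__free(rb);
	return (int)err;
}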
/* Perf buffer APIs */
struct perf_buffer;
typedef void (*perf_buffer_sample_fn)(void *ctx, int cpu,
@@ -413,7 +780,12 @@ perf_buffer__new_raw(int map_fd, size_t page_cnt,
const struct perf_buffer_raw_opts *opts);
LIBBPF_API void perf_buffer__free(struct perf_buffer *pb);
LIBBPF_API int perf_buffer__epoll_fd(const struct perf_buffer *pb);
LIBBPF_API int perf_buffer__poll(struct perf_buffer *pb, int timeout_ms);
LIBBPF_API int perf_buffer__consume(struct perf_buffer *pb);
LIBBPF_API int perf_buffer__consume_buffer(struct perf_buffer *pb, size_t buf_idx);
LIBBPF_API size_t perf_buffer__buffer_cnt(const struct perf_buffer *pb);
LIBBPF_API int perf_buffer__buffer_fd(const struct perf_buffer *pb, size_t buf_idx);
typedef enum bpf_perf_event_ret
(*bpf_perf_event_print_t)(struct perf_event_header *hdr,
@@ -423,18 +795,6 @@ bpf_perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size,
void **copy_mem, size_t *copy_size,
bpf_perf_event_print_t fn, void *private_data);
struct nlattr;
typedef int (*libbpf_dump_nlmsg_t)(void *cookie, void *msg, struct nlattr **tb);
int libbpf_netlink_open(unsigned int *nl_pid);
int libbpf_nl_get_link(int sock, unsigned int nl_pid,
libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie);
int libbpf_nl_get_class(int sock, unsigned int nl_pid, int ifindex,
libbpf_dump_nlmsg_t dump_class_nlmsg, void *cookie);
int libbpf_nl_get_qdisc(int sock, unsigned int nl_pid, int ifindex,
libbpf_dump_nlmsg_t dump_qdisc_nlmsg, void *cookie);
int libbpf_nl_get_filter(int sock, unsigned int nl_pid, int ifindex, int handle,
libbpf_dump_nlmsg_t dump_filter_nlmsg, void *cookie);
struct bpf_prog_linfo;
struct bpf_prog_info;
@@ -461,6 +821,7 @@ LIBBPF_API bool bpf_probe_prog_type(enum bpf_prog_type prog_type,
LIBBPF_API bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex);
LIBBPF_API bool bpf_probe_helper(enum bpf_func_id id,
enum bpf_prog_type prog_type, __u32 ifindex);
LIBBPF_API bool bpf_probe_large_insn_limit(__u32 ifindex);
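A small sketch of the feature-probing helpers, checking a few capabilities of the running kernel before deciding what to load (ifindex 0 means no device offload):

#include <stdio.h>
#include <bpf/libbpf.h>

static void report_kernel_features(void)
{
	printf("ringbuf maps:      %s\n",
	       bpf_probe_map_type(BPF_MAP_TYPE_RINGBUF, 0) ? "yes" : "no");
	printf("tracing programs:  %s\n",
	       bpf_probe_prog_type(BPF_PROG_TYPE_TRACING, 0) ? "yes" : "no");
	printf("1M insn limit:     %s\n",
	       bpf_probe_large_insn_limit(0) ? "yes" : "no");
}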
/*
* Get bpf_prog_info in continuous memory
@@ -525,9 +886,10 @@ bpf_program__bpil_addr_to_offs(struct bpf_prog_info_linear *info_linear);
LIBBPF_API void
bpf_program__bpil_offs_to_addr(struct bpf_prog_info_linear *info_linear);
/*
* A helper function to get the number of possible CPUs before looking up
* per-CPU maps. Negative errno is returned on failure.
/**
* @brief **libbpf_num_possible_cpus()** is a helper function to get the
* number of possible CPUs that the host kernel supports and expects.
* @return number of possible CPUs; or error code on failure
*
* Example usage:
*
@@ -537,10 +899,86 @@ bpf_program__bpil_offs_to_addr(struct bpf_prog_info_linear *info_linear);
* }
* long values[ncpus];
* bpf_map_lookup_elem(per_cpu_map_fd, key, values);
*
*/
LIBBPF_API int libbpf_num_possible_cpus(void);
struct bpf_map_skeleton {
const char *name;
struct bpf_map **map;
void **mmaped;
};
struct bpf_prog_skeleton {
const char *name;
struct bpf_program **prog;
struct bpf_link **link;
};
struct bpf_object_skeleton {
size_t sz; /* size of this struct, for forward/backward compatibility */
const char *name;
const void *data;
size_t data_sz;
struct bpf_object **obj;
int map_cnt;
int map_skel_sz; /* sizeof(struct bpf_skeleton_map) */
struct bpf_map_skeleton *maps;
int prog_cnt;
int prog_skel_sz; /* sizeof(struct bpf_skeleton_prog) */
struct bpf_prog_skeleton *progs;
};
LIBBPF_API int
bpf_object__open_skeleton(struct bpf_object_skeleton *s,
const struct bpf_object_open_opts *opts);
LIBBPF_API int bpf_object__load_skeleton(struct bpf_object_skeleton *s);
LIBBPF_API int bpf_object__attach_skeleton(struct bpf_object_skeleton *s);
LIBBPF_API void bpf_object__detach_skeleton(struct bpf_object_skeleton *s);
LIBBPF_API void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s);
struct gen_loader_opts {
size_t sz; /* size of this struct, for forward/backward compatibility */
const char *data;
const char *insns;
__u32 data_sz;
__u32 insns_sz;
};
#define gen_loader_opts__last_field insns_sz
LIBBPF_API int bpf_object__gen_loader(struct bpf_object *obj,
struct gen_loader_opts *opts);
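A hedged sketch of the light-skeleton flow (roughly what bpftool's "gen skeleton -L" drives): registering gen_loader_opts makes the subsequent bpf_object__load() record loader instructions and data instead of performing syscalls.

#include <bpf/libbpf.h>

static int generate_loader(struct bpf_object *obj)
{
	DECLARE_LIBBPF_OPTS(gen_loader_opts, gen);
	int err;

	err = bpf_object__gen_loader(obj, &gen);
	if (err)
		return err;

	err = bpf_object__load(obj); /* fills gen.insns/gen.data, no syscalls */
	if (err)
		return err;

	/* gen.insns_sz / gen.data_sz bytes are now ready to be embedded
	 * into a generated light-skeleton source file.
	 */
	return 0;
}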
enum libbpf_tristate {
TRI_NO = 0,
TRI_YES = 1,
TRI_MODULE = 2,
};
struct bpf_linker_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
};
#define bpf_linker_opts__last_field sz
struct bpf_linker_file_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
};
#define bpf_linker_file_opts__last_field sz
struct bpf_linker;
LIBBPF_API struct bpf_linker *bpf_linker__new(const char *filename, struct bpf_linker_opts *opts);
LIBBPF_API int bpf_linker__add_file(struct bpf_linker *linker,
const char *filename,
const struct bpf_linker_file_opts *opts);
LIBBPF_API int bpf_linker__finalize(struct bpf_linker *linker);
LIBBPF_API void bpf_linker__free(struct bpf_linker *linker);
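A usage sketch of the static linker API, combining two BPF object files into one (file names are placeholders; this is similar in spirit to what "bpftool gen object" does):

#include <bpf/libbpf.h>

static int link_bpf_objects(void)
{
	struct bpf_linker *linker;
	long err;

	linker = bpf_linker__new("combined.bpf.o", NULL);
	err = libbpf_get_error(linker);
	if (err)
		return (int)err;

	err = bpf_linker__add_file(linker, "a.bpf.o", NULL);
	if (!err)
		err = bpf_linker__add_file(linker, "b.bpf.o", NULL);
	if (!err)
		err = bpf_linker__finalize(linker);

	bpf_linker__free(linker);
	return (int)err;
}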
#ifdef __cplusplus
} /* extern "C" */
#endif


@@ -190,3 +190,209 @@ LIBBPF_0.0.5 {
global:
bpf_btf_get_next_id;
} LIBBPF_0.0.4;
LIBBPF_0.0.6 {
global:
bpf_get_link_xdp_info;
bpf_map__get_pin_path;
bpf_map__is_pinned;
bpf_map__set_pin_path;
bpf_object__open_file;
bpf_object__open_mem;
bpf_program__attach_trace;
bpf_program__get_expected_attach_type;
bpf_program__get_type;
bpf_program__is_tracing;
bpf_program__set_tracing;
bpf_program__size;
btf__find_by_name_kind;
libbpf_find_vmlinux_btf_id;
} LIBBPF_0.0.5;
LIBBPF_0.0.7 {
global:
btf_dump__emit_type_decl;
bpf_link__disconnect;
bpf_map__attach_struct_ops;
bpf_map_delete_batch;
bpf_map_lookup_and_delete_batch;
bpf_map_lookup_batch;
bpf_map_update_batch;
bpf_object__find_program_by_name;
bpf_object__attach_skeleton;
bpf_object__destroy_skeleton;
bpf_object__detach_skeleton;
bpf_object__load_skeleton;
bpf_object__open_skeleton;
bpf_probe_large_insn_limit;
bpf_prog_attach_xattr;
bpf_program__attach;
bpf_program__name;
bpf_program__is_extension;
bpf_program__is_struct_ops;
bpf_program__set_extension;
bpf_program__set_struct_ops;
btf__align_of;
libbpf_find_kernel_btf;
} LIBBPF_0.0.6;
LIBBPF_0.0.8 {
global:
bpf_link__fd;
bpf_link__open;
bpf_link__pin;
bpf_link__pin_path;
bpf_link__unpin;
bpf_link__update_program;
bpf_link_create;
bpf_link_update;
bpf_map__set_initial_value;
bpf_program__attach_cgroup;
bpf_program__attach_lsm;
bpf_program__is_lsm;
bpf_program__set_attach_target;
bpf_program__set_lsm;
bpf_set_link_xdp_fd_opts;
} LIBBPF_0.0.7;
LIBBPF_0.0.9 {
global:
bpf_enable_stats;
bpf_iter_create;
bpf_link_get_fd_by_id;
bpf_link_get_next_id;
bpf_program__attach_iter;
bpf_program__attach_netns;
perf_buffer__consume;
ring_buffer__add;
ring_buffer__consume;
ring_buffer__free;
ring_buffer__new;
ring_buffer__poll;
} LIBBPF_0.0.8;
LIBBPF_0.1.0 {
global:
bpf_link__detach;
bpf_link_detach;
bpf_map__ifindex;
bpf_map__key_size;
bpf_map__map_flags;
bpf_map__max_entries;
bpf_map__numa_node;
bpf_map__set_key_size;
bpf_map__set_map_flags;
bpf_map__set_max_entries;
bpf_map__set_numa_node;
bpf_map__set_type;
bpf_map__set_value_size;
bpf_map__type;
bpf_map__value_size;
bpf_program__attach_xdp;
bpf_program__autoload;
bpf_program__is_sk_lookup;
bpf_program__set_autoload;
bpf_program__set_sk_lookup;
btf__parse;
btf__parse_raw;
btf__pointer_size;
btf__set_fd;
btf__set_pointer_size;
} LIBBPF_0.0.9;
LIBBPF_0.2.0 {
global:
bpf_prog_bind_map;
bpf_prog_test_run_opts;
bpf_program__attach_freplace;
bpf_program__section_name;
btf__add_array;
btf__add_const;
btf__add_enum;
btf__add_enum_value;
btf__add_datasec;
btf__add_datasec_var_info;
btf__add_field;
btf__add_func;
btf__add_func_param;
btf__add_func_proto;
btf__add_fwd;
btf__add_int;
btf__add_ptr;
btf__add_restrict;
btf__add_str;
btf__add_struct;
btf__add_typedef;
btf__add_union;
btf__add_var;
btf__add_volatile;
btf__endianness;
btf__find_str;
btf__new_empty;
btf__set_endianness;
btf__str_by_offset;
perf_buffer__buffer_cnt;
perf_buffer__buffer_fd;
perf_buffer__epoll_fd;
perf_buffer__consume_buffer;
xsk_socket__create_shared;
} LIBBPF_0.1.0;
LIBBPF_0.3.0 {
global:
btf__base_btf;
btf__parse_elf_split;
btf__parse_raw_split;
btf__parse_split;
btf__new_empty_split;
btf__new_split;
ring_buffer__epoll_fd;
xsk_setup_xdp_prog;
xsk_socket__update_xskmap;
} LIBBPF_0.2.0;
LIBBPF_0.4.0 {
global:
btf__add_float;
btf__add_type;
bpf_linker__add_file;
bpf_linker__finalize;
bpf_linker__free;
bpf_linker__new;
bpf_map__inner_map;
bpf_object__set_kversion;
bpf_tc_attach;
bpf_tc_detach;
bpf_tc_hook_create;
bpf_tc_hook_destroy;
bpf_tc_query;
} LIBBPF_0.3.0;
LIBBPF_0.5.0 {
global:
bpf_map__initial_value;
bpf_map__pin_path;
bpf_map_lookup_and_delete_elem_flags;
bpf_program__attach_kprobe_opts;
bpf_program__attach_perf_event_opts;
bpf_program__attach_tracepoint_opts;
bpf_program__attach_uprobe_opts;
bpf_object__gen_loader;
btf__load_from_kernel_by_id;
btf__load_from_kernel_by_id_split;
btf__load_into_kernel;
btf__load_module_btf;
btf__load_vmlinux_btf;
btf_dump__dump_type_data;
libbpf_set_strict_mode;
} LIBBPF_0.4.0;
LIBBPF_0.6.0 {
global:
bpf_object__next_map;
bpf_object__next_program;
bpf_object__prev_map;
bpf_object__prev_program;
btf__add_btf;
btf__add_tag;
} LIBBPF_0.5.0;


@@ -8,5 +8,5 @@ Name: libbpf
Description: BPF library
Version: @VERSION@
Libs: -L${libdir} -lbpf
Requires.private: libelf
Requires.private: libelf zlib
Cflags: -I${includedir}

src/libbpf_common.h (new file, 66 lines)

@@ -0,0 +1,66 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/*
* Common user-facing libbpf helpers.
*
* Copyright (c) 2019 Facebook
*/
#ifndef __LIBBPF_LIBBPF_COMMON_H
#define __LIBBPF_LIBBPF_COMMON_H
#include <string.h>
#include "libbpf_version.h"
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
#endif
#define LIBBPF_DEPRECATED(msg) __attribute__((deprecated(msg)))
/* Mark a symbol as deprecated when libbpf version is >= {major}.{minor} */
#define LIBBPF_DEPRECATED_SINCE(major, minor, msg) \
__LIBBPF_MARK_DEPRECATED_ ## major ## _ ## minor \
(LIBBPF_DEPRECATED("libbpf v" # major "." # minor "+: " msg))
#define __LIBBPF_CURRENT_VERSION_GEQ(major, minor) \
(LIBBPF_MAJOR_VERSION > (major) || \
(LIBBPF_MAJOR_VERSION == (major) && LIBBPF_MINOR_VERSION >= (minor)))
/* Add checks for other versions below when planning deprecation of API symbols
* with the LIBBPF_DEPRECATED_SINCE macro.
*/
#if __LIBBPF_CURRENT_VERSION_GEQ(0, 6)
#define __LIBBPF_MARK_DEPRECATED_0_6(X) X
#else
#define __LIBBPF_MARK_DEPRECATED_0_6(X)
#endif
#if __LIBBPF_CURRENT_VERSION_GEQ(0, 7)
#define __LIBBPF_MARK_DEPRECATED_0_7(X) X
#else
#define __LIBBPF_MARK_DEPRECATED_0_7(X)
#endif
/* Helper macro to declare and initialize libbpf options struct
*
* This dance of an uninitialized declaration, followed by a memset() to zero,
* followed by an assignment using compound literal syntax, is done to keep
* the convenient struct field initialization syntax while **hopefully**
* having all the padding bytes initialized to zero. It's not guaranteed,
* though, that the compiler won't copy garbage into the literal's padding
* bytes, but this is the best approach found so far and it works in practice.
*
* The macro declares an opts struct of the given type and name,
* zero-initializes it (including any extra padding) with memset(), and then
* assigns the initial values provided by the user as designated initializers
* in the varargs.
*/
#define DECLARE_LIBBPF_OPTS(TYPE, NAME, ...) \
struct TYPE NAME = ({ \
memset(&NAME, 0, sizeof(struct TYPE)); \
(struct TYPE) { \
.sz = sizeof(struct TYPE), \
__VA_ARGS__ \
}; \
})
#endif /* __LIBBPF_LIBBPF_COMMON_H */


@@ -12,6 +12,10 @@
#include <string.h>
#include "libbpf.h"
#include "libbpf_internal.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
#define ERRNO_OFFSET(e) ((e) - __LIBBPF_ERRNO__START)
#define ERRCODE_OFFSET(c) ERRNO_OFFSET(LIBBPF_ERRNO__##c)
@@ -36,7 +40,7 @@ static const char *libbpf_strerror_table[NR_ERRNO] = {
int libbpf_strerror(int err, char *buf, size_t size)
{
if (!buf || !size)
return -1;
return libbpf_err(-EINVAL);
err = err > 0 ? err : -err;
@@ -45,7 +49,7 @@ int libbpf_strerror(int err, char *buf, size_t size)
ret = strerror_r(err, buf, size);
buf[size - 1] = '\0';
return ret;
return libbpf_err_errno(ret);
}
if (err < __LIBBPF_ERRNO__END) {
@@ -59,5 +63,5 @@ int libbpf_strerror(int err, char *buf, size_t size)
snprintf(buf, size, "Unknown libbpf error %d", err);
buf[size - 1] = '\0';
return -1;
return libbpf_err(-ENOENT);
}


@@ -9,7 +9,52 @@
#ifndef __LIBBPF_LIBBPF_INTERNAL_H
#define __LIBBPF_LIBBPF_INTERNAL_H
#include <stdlib.h>
#include <limits.h>
#include <errno.h>
#include <linux/err.h>
#include "libbpf_legacy.h"
#include "relo_core.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
/* prevent accidental re-addition of reallocarray() */
#pragma GCC poison reallocarray
#include "libbpf.h"
#include "btf.h"
#ifndef EM_BPF
#define EM_BPF 247
#endif
#ifndef R_BPF_64_64
#define R_BPF_64_64 1
#endif
#ifndef R_BPF_64_ABS64
#define R_BPF_64_ABS64 2
#endif
#ifndef R_BPF_64_ABS32
#define R_BPF_64_ABS32 3
#endif
#ifndef R_BPF_64_32
#define R_BPF_64_32 10
#endif
#ifndef SHT_LLVM_ADDRSIG
#define SHT_LLVM_ADDRSIG 0x6FFF4C03
#endif
/* if libelf is old and doesn't support mmap(), fall back to read() */
#ifndef ELF_C_READ_MMAP
#define ELF_C_READ_MMAP ELF_C_READ
#endif
/* Older libelf versions all end up using this expression, for both 32-bit and 64-bit */
#ifndef GELF_ST_VISIBILITY
#define GELF_ST_VISIBILITY(o) ((o) & 0x03)
#endif
#define BTF_INFO_ENC(kind, kind_flag, vlen) \
((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
@@ -22,7 +67,17 @@
#define BTF_MEMBER_ENC(name, type, bits_offset) (name), (type), (bits_offset)
#define BTF_PARAM_ENC(name, type) (name), (type)
#define BTF_VAR_SECINFO_ENC(type, offset, size) (type), (offset), (size)
#define BTF_TYPE_FLOAT_ENC(name, sz) \
BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_FLOAT, 0, 0), sz)
#define BTF_TYPE_TAG_ENC(value, type, component_idx) \
BTF_TYPE_ENC(value, BTF_INFO_ENC(BTF_KIND_TAG, 0, 0), type), (component_idx)
#ifndef likely
#define likely(x) __builtin_expect(!!(x), 1)
#endif
#ifndef unlikely
#define unlikely(x) __builtin_expect(!!(x), 0)
#endif
#ifndef min
# define min(x, y) ((x) < (y) ? (x) : (y))
#endif
@@ -34,20 +89,40 @@
(offsetof(TYPE, FIELD) + sizeof(((TYPE *)0)->FIELD))
#endif
/* Check whether a string `str` has prefix `pfx`, regardless of whether `pfx`
* is a string literal known at compile time or a char * pointer known only at
* runtime.
*/
*/
#define str_has_pfx(str, pfx) \
(strncmp(str, pfx, __builtin_constant_p(pfx) ? sizeof(pfx) - 1 : strlen(pfx)) == 0)
/* Symbol versioning differs between static and shared libraries.
* Properly versioned symbols are needed for a shared library, but
* only the symbol of the new version is needed for a static library.
* Starting with GNU C 10, use symver attribute instead of .symver assembler
* directive, which works better with GCC LTO builds.
*/
#ifdef SHARED
# define COMPAT_VERSION(internal_name, api_name, version) \
#if defined(SHARED) && defined(__GNUC__) && __GNUC__ >= 10
#define DEFAULT_VERSION(internal_name, api_name, version) \
__attribute__((symver(#api_name "@@" #version)))
#define COMPAT_VERSION(internal_name, api_name, version) \
__attribute__((symver(#api_name "@" #version)))
#elif defined(SHARED)
#define COMPAT_VERSION(internal_name, api_name, version) \
asm(".symver " #internal_name "," #api_name "@" #version);
# define DEFAULT_VERSION(internal_name, api_name, version) \
#define DEFAULT_VERSION(internal_name, api_name, version) \
asm(".symver " #internal_name "," #api_name "@@" #version);
#else
# define COMPAT_VERSION(internal_name, api_name, version)
# define DEFAULT_VERSION(internal_name, api_name, version) \
#else /* !SHARED */
#define COMPAT_VERSION(internal_name, api_name, version)
#define DEFAULT_VERSION(internal_name, api_name, version) \
extern typeof(internal_name) api_name \
__attribute__((alias(#internal_name)));
#endif
extern void libbpf_print(enum libbpf_print_level level,
@@ -59,13 +134,183 @@ do { \
libbpf_print(level, "libbpf: " fmt, ##__VA_ARGS__); \
} while (0)
#define pr_warning(fmt, ...) __pr(LIBBPF_WARN, fmt, ##__VA_ARGS__)
#define pr_warn(fmt, ...) __pr(LIBBPF_WARN, fmt, ##__VA_ARGS__)
#define pr_info(fmt, ...) __pr(LIBBPF_INFO, fmt, ##__VA_ARGS__)
#define pr_debug(fmt, ...) __pr(LIBBPF_DEBUG, fmt, ##__VA_ARGS__)
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif
/*
* Re-implement glibc's reallocarray() for libbpf internal-only use.
* reallocarray(), unfortunately, is not available in all versions of glibc,
* so requires extra feature detection and using reallocarray() stub from
* <tools/libc_compat.h> and COMPAT_NEED_REALLOCARRAY. All this complicates
* build of libbpf unnecessarily and is just a maintenance burden. Instead,
* it's trivial to implement libbpf-specific internal version and use it
* throughout libbpf.
*/
static inline void *libbpf_reallocarray(void *ptr, size_t nmemb, size_t size)
{
size_t total;
#if __has_builtin(__builtin_mul_overflow)
if (unlikely(__builtin_mul_overflow(nmemb, size, &total)))
return NULL;
#else
if (size == 0 || nmemb > ULONG_MAX / size)
return NULL;
total = nmemb * size;
#endif
return realloc(ptr, total);
}
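A short internal-style usage sketch: growing a dynamically sized array with overflow-safe multiplication (the function and variable names are illustrative only):

#include <errno.h>
#include "libbpf_internal.h"

static int grow_ids(__u32 **ids, size_t *cap, size_t new_cap)
{
	__u32 *tmp;

	/* nmemb * size overflow is detected inside libbpf_reallocarray() */
	tmp = libbpf_reallocarray(*ids, new_cap, sizeof(**ids));
	if (!tmp)
		return -ENOMEM;

	*ids = tmp;
	*cap = new_cap;
	return 0;
}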
struct btf;
struct btf_type;
struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id);
const char *btf_kind_str(const struct btf_type *t);
const struct btf_type *skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id);
static inline enum btf_func_linkage btf_func_linkage(const struct btf_type *t)
{
return (enum btf_func_linkage)(int)btf_vlen(t);
}
static inline __u32 btf_type_info(int kind, int vlen, int kflag)
{
return (kflag << 31) | (kind << 24) | vlen;
}
enum map_def_parts {
MAP_DEF_MAP_TYPE = 0x001,
MAP_DEF_KEY_TYPE = 0x002,
MAP_DEF_KEY_SIZE = 0x004,
MAP_DEF_VALUE_TYPE = 0x008,
MAP_DEF_VALUE_SIZE = 0x010,
MAP_DEF_MAX_ENTRIES = 0x020,
MAP_DEF_MAP_FLAGS = 0x040,
MAP_DEF_NUMA_NODE = 0x080,
MAP_DEF_PINNING = 0x100,
MAP_DEF_INNER_MAP = 0x200,
MAP_DEF_ALL = 0x3ff, /* combination of all above */
};
struct btf_map_def {
enum map_def_parts parts;
__u32 map_type;
__u32 key_type_id;
__u32 key_size;
__u32 value_type_id;
__u32 value_size;
__u32 max_entries;
__u32 map_flags;
__u32 numa_node;
__u32 pinning;
};
int parse_btf_map_def(const char *map_name, struct btf *btf,
const struct btf_type *def_t, bool strict,
struct btf_map_def *map_def, struct btf_map_def *inner_def);
void *libbpf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz,
size_t cur_cnt, size_t max_cnt, size_t add_cnt);
int libbpf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_cnt);
static inline bool libbpf_is_mem_zeroed(const char *p, ssize_t len)
{
while (len > 0) {
if (*p)
return false;
p++;
len--;
}
return true;
}
static inline bool libbpf_validate_opts(const char *opts,
size_t opts_sz, size_t user_sz,
const char *type_name)
{
if (user_sz < sizeof(size_t)) {
pr_warn("%s size (%zu) is too small\n", type_name, user_sz);
return false;
}
if (!libbpf_is_mem_zeroed(opts + opts_sz, (ssize_t)user_sz - opts_sz)) {
pr_warn("%s has non-zero extra bytes\n", type_name);
return false;
}
return true;
}
#define OPTS_VALID(opts, type) \
(!(opts) || libbpf_validate_opts((const char *)opts, \
offsetofend(struct type, \
type##__last_field), \
(opts)->sz, #type))
#define OPTS_HAS(opts, field) \
((opts) && opts->sz >= offsetofend(typeof(*(opts)), field))
#define OPTS_GET(opts, field, fallback_value) \
(OPTS_HAS(opts, field) ? (opts)->field : fallback_value)
#define OPTS_SET(opts, field, value) \
do { \
if (OPTS_HAS(opts, field)) \
(opts)->field = value; \
} while (0)
#define OPTS_ZEROED(opts, last_nonzero_field) \
({ \
ssize_t __off = offsetofend(typeof(*(opts)), last_nonzero_field); \
!(opts) || libbpf_is_mem_zeroed((const void *)opts + __off, \
(opts)->sz - __off); \
})
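A hedged sketch of how an opts-taking internal API typically uses these macros; the opts struct and function names below are made up for illustration:

#include <errno.h>
#include "libbpf_internal.h"

struct example_opts {
	size_t sz;    /* must be first, set by the caller */
	__u32 flags;
	size_t :0;
};
#define example_opts__last_field flags

int example_api(const struct example_opts *opts)
{
	__u32 flags;

	/* reject callers whose struct has unknown non-zero tail bytes */
	if (!OPTS_VALID(opts, example_opts))
		return libbpf_err(-EINVAL);

	/* read flags only if the caller's struct is new enough to have it */
	flags = OPTS_GET(opts, flags, 0);
	pr_debug("example_api: flags 0x%x\n", flags);
	return 0;
}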
int parse_cpu_mask_str(const char *s, bool **mask, int *mask_sz);
int parse_cpu_mask_file(const char *fcpu, bool **mask, int *mask_sz);
int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len);
struct bpf_prog_load_params {
enum bpf_prog_type prog_type;
enum bpf_attach_type expected_attach_type;
const char *name;
const struct bpf_insn *insns;
size_t insn_cnt;
const char *license;
__u32 kern_version;
__u32 attach_prog_fd;
__u32 attach_btf_obj_fd;
__u32 attach_btf_id;
__u32 prog_ifindex;
__u32 prog_btf_fd;
__u32 prog_flags;
__u32 func_info_rec_size;
const void *func_info;
__u32 func_info_cnt;
__u32 line_info_rec_size;
const void *line_info;
__u32 line_info_cnt;
__u32 log_level;
char *log_buf;
size_t log_buf_sz;
int *fd_array;
};
int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr);
int bpf_object__section_size(const struct bpf_object *obj, const char *name,
__u32 *size);
int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
__u32 *off);
struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf);
void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
const char **prefix, int *kind);
struct btf_ext_info {
/*
* info points to the individual info section (e.g. func_info and
@@ -87,6 +332,44 @@ struct btf_ext_info {
i < (sec)->num_info; \
i++, rec = (void *)rec + (seg)->rec_size)
/*
* The .BTF.ext ELF section layout defined as
* struct btf_ext_header
* func_info subsection
*
* The func_info subsection layout:
* record size for struct bpf_func_info in the func_info subsection
* struct btf_sec_func_info for section #1
* a list of bpf_func_info records for section #1
* where struct bpf_func_info mimics one in include/uapi/linux/bpf.h
* but may not be identical
* struct btf_sec_func_info for section #2
* a list of bpf_func_info records for section #2
* ......
*
* Note that the bpf_func_info record size in .BTF.ext may not
* be the same as the one defined in include/uapi/linux/bpf.h.
* The loader should ensure that record_size meets minimum
* requirement and pass the record as is to the kernel. The
* kernel will handle the func_info properly based on its contents.
*/
struct btf_ext_header {
__u16 magic;
__u8 version;
__u8 flags;
__u32 hdr_len;
/* All offsets are in bytes relative to the end of this header */
__u32 func_info_off;
__u32 func_info_len;
__u32 line_info_off;
__u32 line_info_len;
/* optional part of .BTF.ext header */
__u32 core_relo_off;
__u32 core_relo_len;
};
struct btf_ext {
union {
struct btf_ext_header *hdr;
@@ -94,7 +377,7 @@ struct btf_ext {
};
struct btf_ext_info func_info;
struct btf_ext_info line_info;
struct btf_ext_info offset_reloc_info;
struct btf_ext_info core_relo_info;
__u32 data_size;
};
@@ -102,7 +385,7 @@ struct btf_ext_info_sec {
__u32 sec_name_off;
__u32 num_info;
/* Followed by num_info * record_size number of bytes */
__u8 data[0];
__u8 data[];
};
/* The minimum bpf_func_info checked by the loader */
@@ -119,52 +402,74 @@ struct bpf_line_info_min {
__u32 line_col;
};
/* The minimum bpf_offset_reloc checked by the loader
*
* Offset relocation captures the following data:
* - insn_off - instruction offset (in bytes) within a BPF program that needs
* its insn->imm field to be relocated with actual offset;
* - type_id - BTF type ID of the "root" (containing) entity of a relocatable
* offset;
* - access_str_off - offset into corresponding .BTF string section. String
* itself encodes an accessed field using a sequence of field and array
* indices, separated by colon (:). It's conceptually very close to LLVM's
* getelementptr ([0]) instruction's arguments for identifying offset to
* a field.
*
* Example to provide a better feel.
*
* struct sample {
* int a;
* struct {
* int b[10];
* };
* };
*
* struct sample *s = ...;
* int x = &s->a; // encoded as "0:0" (a is field #0)
* int y = &s->b[5]; // encoded as "0:1:0:5" (anon struct is field #1,
* // b is field #0 inside anon struct, accessing elem #5)
* int z = &s[10]->b; // encoded as "10:1" (ptr is used as an array)
*
* type_id for all relocs in this example will capture BTF type id of
* `struct sample`.
*
* Such relocation is emitted when using __builtin_preserve_access_index()
* Clang built-in, passing expression that captures field address, e.g.:
*
* bpf_probe_read(&dst, sizeof(dst),
* __builtin_preserve_access_index(&src->a.b.c));
*
* In this case Clang will emit offset relocation recording necessary data to
* be able to find offset of embedded `a.b.c` field within `src` struct.
*
* [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction
typedef int (*type_id_visit_fn)(__u32 *type_id, void *ctx);
typedef int (*str_off_visit_fn)(__u32 *str_off, void *ctx);
int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ctx);
int btf_type_visit_str_offs(struct btf_type *t, str_off_visit_fn visit, void *ctx);
int btf_ext_visit_type_ids(struct btf_ext *btf_ext, type_id_visit_fn visit, void *ctx);
int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void *ctx);
__s32 btf__find_by_name_kind_own(const struct btf *btf, const char *type_name,
__u32 kind);
extern enum libbpf_strict_mode libbpf_mode;
/* handle direct returned errors */
static inline int libbpf_err(int ret)
{
if (ret < 0)
errno = -ret;
return ret;
}
/* handle errno-based (e.g., syscall or libc) errors according to libbpf's
* strict mode settings
*/
struct bpf_offset_reloc {
__u32 insn_off;
__u32 type_id;
__u32 access_str_off;
};
static inline int libbpf_err_errno(int ret)
{
if (libbpf_mode & LIBBPF_STRICT_DIRECT_ERRS)
/* errno is already assumed to be set on error */
return ret < 0 ? -errno : ret;
/* legacy: on error return -1 directly and don't touch errno */
return ret;
}
/* handle error for pointer-returning APIs, err is assumed to be < 0 always */
static inline void *libbpf_err_ptr(int err)
{
/* set errno on error, this doesn't break anything */
errno = -err;
if (libbpf_mode & LIBBPF_STRICT_CLEAN_PTRS)
return NULL;
/* legacy: encode err as ptr */
return ERR_PTR(err);
}
/* handle pointer-returning APIs' error handling */
static inline void *libbpf_ptr(void *ret)
{
/* set errno on error, this doesn't break anything */
if (IS_ERR(ret))
errno = -PTR_ERR(ret);
if (libbpf_mode & LIBBPF_STRICT_CLEAN_PTRS)
return IS_ERR(ret) ? NULL : ret;
/* legacy: pass-through original pointer */
return ret;
}
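A hedged sketch of how a pointer-returning API uses these helpers to honor both the legacy ERR_PTR convention and LIBBPF_STRICT_CLEAN_PTRS (the struct and function names are invented for illustration):

#include <errno.h>
#include <stdlib.h>
#include "libbpf_internal.h"

struct example_obj { int fd; };

struct example_obj *example_obj__new(void)
{
	struct example_obj *obj = calloc(1, sizeof(*obj));

	if (!obj)
		/* ERR_PTR(-ENOMEM) in legacy mode, NULL + errno under CLEAN_PTRS */
		return libbpf_err_ptr(-ENOMEM);

	obj->fd = -1;
	return obj;
}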
static inline bool str_is_empty(const char *s)
{
return !s || !s[0];
}
static inline bool is_ldimm64_insn(struct bpf_insn *insn)
{
return insn->code == (BPF_LD | BPF_IMM | BPF_DW);
}
#endif /* __LIBBPF_LIBBPF_INTERNAL_H */

src/libbpf_legacy.h (new file, 68 lines)

@@ -0,0 +1,68 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/*
* Libbpf legacy APIs (either discouraged or deprecated, as mentioned in [0])
*
* [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54nNTY
*
* Copyright (C) 2021 Facebook
*/
#ifndef __LIBBPF_LEGACY_BPF_H
#define __LIBBPF_LEGACY_BPF_H
#include <linux/bpf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include "libbpf_common.h"
#ifdef __cplusplus
extern "C" {
#endif
enum libbpf_strict_mode {
/* Turn on all supported strict features of libbpf to simulate libbpf
* v1.0 behavior.
* This will be the default behavior in libbpf v1.0.
*/
LIBBPF_STRICT_ALL = 0xffffffff,
/*
* Disable any libbpf 1.0 behaviors. This is the default before libbpf
* v1.0. It won't be supported in v1.0 anymore; please update your
* code to handle LIBBPF_STRICT_ALL mode before libbpf v1.0.
*/
LIBBPF_STRICT_NONE = 0x00,
/*
* Return NULL pointers on error, not ERR_PTR(err).
* Additionally, libbpf also always sets errno to corresponding Exx
* (positive) error code.
*/
LIBBPF_STRICT_CLEAN_PTRS = 0x01,
/*
* Return actual error codes from low-level APIs directly, not just -1.
* Additionally, libbpf also always sets errno to corresponding Exx
* (positive) error code.
*/
LIBBPF_STRICT_DIRECT_ERRS = 0x02,
/*
* Enforce strict BPF program section (SEC()) names.
* E.g., while previously SEC("xdp_whatever") or SEC("perf_event_blah") were
* allowed, with LIBBPF_STRICT_SEC_NAME they become unrecognized by libbpf
* and would have to be just SEC("xdp") and SEC("perf_event").
*/
LIBBPF_STRICT_SEC_NAME = 0x04,
__LIBBPF_STRICT_LAST,
};
LIBBPF_API int libbpf_set_strict_mode(enum libbpf_strict_mode mode);
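Applications opt in from their startup path; a minimal sketch (the declaration is reachable via <bpf/libbpf.h>, which includes libbpf_legacy.h):

#include <bpf/libbpf.h>

int main(void)
{
	/* individual flags can be OR'ed instead of enabling everything */
	if (libbpf_set_strict_mode(LIBBPF_STRICT_ALL))
		return 1;

	/* ... open/load/attach BPF objects with v1.0 semantics ... */
	return 0;
}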
#ifdef __cplusplus
} /* extern "C" */
#endif
#endif /* __LIBBPF_LEGACY_BPF_H */


@@ -75,6 +75,12 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
xattr.expected_attach_type = BPF_CGROUP_INET4_CONNECT;
break;
case BPF_PROG_TYPE_CGROUP_SOCKOPT:
xattr.expected_attach_type = BPF_CGROUP_GETSOCKOPT;
break;
case BPF_PROG_TYPE_SK_LOOKUP:
xattr.expected_attach_type = BPF_SK_LOOKUP;
break;
case BPF_PROG_TYPE_KPROBE:
xattr.kern_version = get_kernel_version();
break;
@@ -101,7 +107,10 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
case BPF_PROG_TYPE_SK_REUSEPORT:
case BPF_PROG_TYPE_FLOW_DISSECTOR:
case BPF_PROG_TYPE_CGROUP_SYSCTL:
case BPF_PROG_TYPE_CGROUP_SOCKOPT:
case BPF_PROG_TYPE_TRACING:
case BPF_PROG_TYPE_STRUCT_OPS:
case BPF_PROG_TYPE_EXT:
case BPF_PROG_TYPE_LSM:
default:
break;
}
@@ -163,7 +172,7 @@ int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
return btf_fd;
}
static int load_sk_storage_btf(void)
static int load_local_storage_btf(void)
{
const char strs[] = "\0bpf_spin_lock\0val\0cnt\0l";
/* struct bpf_spin_lock {
@@ -222,15 +231,22 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
key_size = 0;
break;
case BPF_MAP_TYPE_SK_STORAGE:
case BPF_MAP_TYPE_INODE_STORAGE:
case BPF_MAP_TYPE_TASK_STORAGE:
btf_key_type_id = 1;
btf_value_type_id = 3;
value_size = 8;
max_entries = 0;
map_flags = BPF_F_NO_PREALLOC;
btf_fd = load_sk_storage_btf();
btf_fd = load_local_storage_btf();
if (btf_fd < 0)
return false;
break;
case BPF_MAP_TYPE_RINGBUF:
key_size = 0;
value_size = 0;
max_entries = 4096;
break;
case BPF_MAP_TYPE_UNSPEC:
case BPF_MAP_TYPE_HASH:
case BPF_MAP_TYPE_ARRAY:
@@ -250,6 +266,7 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
case BPF_MAP_TYPE_XSKMAP:
case BPF_MAP_TYPE_SOCKHASH:
case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
case BPF_MAP_TYPE_STRUCT_OPS:
default:
break;
}
@@ -320,3 +337,24 @@ bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type,
return res;
}
/*
* Probe for availability of kernel commit (5.3):
*
* c04c0d2b968a ("bpf: increase complexity limit and maximum program size")
*/
bool bpf_probe_large_insn_limit(__u32 ifindex)
{
struct bpf_insn insns[BPF_MAXINSNS + 1];
int i;
for (i = 0; i < BPF_MAXINSNS; i++)
insns[i] = BPF_MOV64_IMM(BPF_REG_0, 1);
insns[BPF_MAXINSNS] = BPF_EXIT_INSN();
errno = 0;
probe_load(BPF_PROG_TYPE_SCHED_CLS, insns, ARRAY_SIZE(insns), NULL, 0,
ifindex);
return errno != E2BIG && errno != EINVAL;
}


@@ -1,47 +0,0 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2019 Facebook */
#ifndef __LIBBPF_LIBBPF_UTIL_H
#define __LIBBPF_LIBBPF_UTIL_H
#include <stdbool.h>
#ifdef __cplusplus
extern "C" {
#endif
/* Use these barrier functions instead of smp_[rw]mb() when they are
* used in a libbpf header file. That way they can be built into the
* application that uses libbpf.
*/
#if defined(__i386__) || defined(__x86_64__)
# define libbpf_smp_rmb() asm volatile("" : : : "memory")
# define libbpf_smp_wmb() asm volatile("" : : : "memory")
# define libbpf_smp_mb() \
asm volatile("lock; addl $0,-4(%%rsp)" : : : "memory", "cc")
/* Prevents stores from being observed before older loads. */
# define libbpf_smp_rwmb() asm volatile("" : : : "memory")
#elif defined(__aarch64__)
# define libbpf_smp_rmb() asm volatile("dmb ishld" : : : "memory")
# define libbpf_smp_wmb() asm volatile("dmb ishst" : : : "memory")
# define libbpf_smp_mb() asm volatile("dmb ish" : : : "memory")
# define libbpf_smp_rwmb() libbpf_smp_mb()
#elif defined(__arm__)
/* These are only valid for armv7 and above */
# define libbpf_smp_rmb() asm volatile("dmb ish" : : : "memory")
# define libbpf_smp_wmb() asm volatile("dmb ishst" : : : "memory")
# define libbpf_smp_mb() asm volatile("dmb ish" : : : "memory")
# define libbpf_smp_rwmb() libbpf_smp_mb()
#else
/* Architecture missing native barrier functions. */
# define libbpf_smp_rmb() __sync_synchronize()
# define libbpf_smp_wmb() __sync_synchronize()
# define libbpf_smp_mb() __sync_synchronize()
# define libbpf_smp_rwmb() __sync_synchronize()
#endif
#ifdef __cplusplus
} /* extern "C" */
#endif
#endif

9
src/libbpf_version.h Normal file
View File

@@ -0,0 +1,9 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (C) 2021 Facebook */
#ifndef __LIBBPF_VERSION_H
#define __LIBBPF_VERSION_H
#define LIBBPF_MAJOR_VERSION 0
#define LIBBPF_MINOR_VERSION 6
#endif /* __LIBBPF_VERSION_H */
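The new macros allow a simple compile-time guard; a minimal sketch, assuming the header is installed as <bpf/libbpf_version.h>:

#include <bpf/libbpf_version.h>

/* refuse to build against libbpf releases older than 0.6 */
#if LIBBPF_MAJOR_VERSION == 0 && LIBBPF_MINOR_VERSION < 6
#error "this code needs libbpf 0.6 or newer"
#endif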

2898
src/linker.c Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -4,7 +4,10 @@
#include <stdlib.h>
#include <memory.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/pkt_cls.h>
#include <linux/rtnetlink.h>
#include <sys/socket.h>
#include <errno.h>
@@ -12,22 +15,25 @@
#include "bpf.h"
#include "libbpf.h"
#include "libbpf_internal.h"
#include "nlattr.h"
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
typedef int (*libbpf_dump_nlmsg_t)(void *cookie, void *msg, struct nlattr **tb);
typedef int (*__dump_nlmsg_t)(struct nlmsghdr *nlmsg, libbpf_dump_nlmsg_t,
void *cookie);
struct xdp_id_md {
int ifindex;
__u32 flags;
__u32 id;
struct xdp_link_info info;
};
int libbpf_netlink_open(__u32 *nl_pid)
static int libbpf_netlink_open(__u32 *nl_pid)
{
struct sockaddr_nl sa;
socklen_t addrlen;
@@ -37,13 +43,13 @@ int libbpf_netlink_open(__u32 *nl_pid)
memset(&sa, 0, sizeof(sa));
sa.nl_family = AF_NETLINK;
sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
sock = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);
if (sock < 0)
return -errno;
if (setsockopt(sock, SOL_NETLINK, NETLINK_EXT_ACK,
&one, sizeof(one)) < 0) {
fprintf(stderr, "Netlink error reporting not supported\n");
pr_warn("Netlink error reporting not supported\n");
}
if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
@@ -70,9 +76,20 @@ cleanup:
return ret;
}
static int bpf_netlink_recv(int sock, __u32 nl_pid, int seq,
__dump_nlmsg_t _fn, libbpf_dump_nlmsg_t fn,
void *cookie)
static void libbpf_netlink_close(int sock)
{
close(sock);
}
enum {
NL_CONT,
NL_NEXT,
NL_DONE,
};
static int libbpf_netlink_recv(int sock, __u32 nl_pid, int seq,
__dump_nlmsg_t _fn, libbpf_dump_nlmsg_t fn,
void *cookie)
{
bool multipart = true;
struct nlmsgerr *err;
@@ -81,6 +98,7 @@ static int bpf_netlink_recv(int sock, __u32 nl_pid, int seq,
int len, ret;
while (multipart) {
start:
multipart = false;
len = recv(sock, buf, sizeof(buf), 0);
if (len < 0) {
@@ -118,8 +136,16 @@ static int bpf_netlink_recv(int sock, __u32 nl_pid, int seq,
}
if (_fn) {
ret = _fn(nh, fn, cookie);
if (ret)
switch (ret) {
case NL_CONT:
break;
case NL_NEXT:
goto start;
case NL_DONE:
return 0;
default:
return ret;
}
}
}
}
@@ -128,65 +154,94 @@ done:
return ret;
}
int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
static int libbpf_netlink_send_recv(struct libbpf_nla_req *req,
__dump_nlmsg_t parse_msg,
libbpf_dump_nlmsg_t parse_attr,
void *cookie)
{
int sock, seq = 0, ret;
struct nlattr *nla, *nla_xdp;
struct {
struct nlmsghdr nh;
struct ifinfomsg ifinfo;
char attrbuf[64];
} req;
__u32 nl_pid;
__u32 nl_pid = 0;
int sock, ret;
sock = libbpf_netlink_open(&nl_pid);
if (sock < 0)
return sock;
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
req.nh.nlmsg_type = RTM_SETLINK;
req.nh.nlmsg_pid = 0;
req.nh.nlmsg_seq = ++seq;
req.ifinfo.ifi_family = AF_UNSPEC;
req.ifinfo.ifi_index = ifindex;
req->nh.nlmsg_pid = 0;
req->nh.nlmsg_seq = time(NULL);
/* started nested attribute for XDP */
nla = (struct nlattr *)(((char *)&req)
+ NLMSG_ALIGN(req.nh.nlmsg_len));
nla->nla_type = NLA_F_NESTED | IFLA_XDP;
nla->nla_len = NLA_HDRLEN;
/* add XDP fd */
nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
nla_xdp->nla_type = IFLA_XDP_FD;
nla_xdp->nla_len = NLA_HDRLEN + sizeof(int);
memcpy((char *)nla_xdp + NLA_HDRLEN, &fd, sizeof(fd));
nla->nla_len += nla_xdp->nla_len;
/* if user passed in any flags, add those too */
if (flags) {
nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
nla_xdp->nla_type = IFLA_XDP_FLAGS;
nla_xdp->nla_len = NLA_HDRLEN + sizeof(flags);
memcpy((char *)nla_xdp + NLA_HDRLEN, &flags, sizeof(flags));
nla->nla_len += nla_xdp->nla_len;
}
req.nh.nlmsg_len += NLA_ALIGN(nla->nla_len);
if (send(sock, &req, req.nh.nlmsg_len, 0) < 0) {
if (send(sock, req, req->nh.nlmsg_len, 0) < 0) {
ret = -errno;
goto cleanup;
goto out;
}
ret = bpf_netlink_recv(sock, nl_pid, seq, NULL, NULL, NULL);
cleanup:
close(sock);
ret = libbpf_netlink_recv(sock, nl_pid, req->nh.nlmsg_seq,
parse_msg, parse_attr, cookie);
out:
libbpf_netlink_close(sock);
return ret;
}
static int __bpf_set_link_xdp_fd_replace(int ifindex, int fd, int old_fd,
__u32 flags)
{
struct nlattr *nla;
int ret;
struct libbpf_nla_req req;
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
req.nh.nlmsg_type = RTM_SETLINK;
req.ifinfo.ifi_family = AF_UNSPEC;
req.ifinfo.ifi_index = ifindex;
nla = nlattr_begin_nested(&req, IFLA_XDP);
if (!nla)
return -EMSGSIZE;
ret = nlattr_add(&req, IFLA_XDP_FD, &fd, sizeof(fd));
if (ret < 0)
return ret;
if (flags) {
ret = nlattr_add(&req, IFLA_XDP_FLAGS, &flags, sizeof(flags));
if (ret < 0)
return ret;
}
if (flags & XDP_FLAGS_REPLACE) {
ret = nlattr_add(&req, IFLA_XDP_EXPECTED_FD, &old_fd,
sizeof(old_fd));
if (ret < 0)
return ret;
}
nlattr_end_nested(&req, nla);
return libbpf_netlink_send_recv(&req, NULL, NULL, NULL);
}
int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags,
const struct bpf_xdp_set_link_opts *opts)
{
int old_fd = -1, ret;
if (!OPTS_VALID(opts, bpf_xdp_set_link_opts))
return libbpf_err(-EINVAL);
if (OPTS_HAS(opts, old_fd)) {
old_fd = OPTS_GET(opts, old_fd, -1);
flags |= XDP_FLAGS_REPLACE;
}
ret = __bpf_set_link_xdp_fd_replace(ifindex, fd, old_fd, flags);
return libbpf_err(ret);
}
int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
{
int ret;
ret = __bpf_set_link_xdp_fd_replace(ifindex, fd, 0, flags);
return libbpf_err(ret);
}
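To illustrate the opts-based attach path above, a hedged usage sketch; prog_fd and old_fd are assumed to come from the caller, and driver mode is just an example flag:

#include <linux/if_link.h>
#include <bpf/libbpf.h>

static int attach_xdp(int ifindex, int prog_fd, int old_fd)
{
	/* atomically replace old_fd with prog_fd on the given interface */
	DECLARE_LIBBPF_OPTS(bpf_xdp_set_link_opts, opts, .old_fd = old_fd);

	return bpf_set_link_xdp_fd_opts(ifindex, prog_fd,
					XDP_FLAGS_DRV_MODE, &opts);
}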
static int __dump_link_nlmsg(struct nlmsghdr *nlh,
libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie)
{
@@ -196,32 +251,18 @@ static int __dump_link_nlmsg(struct nlmsghdr *nlh,
len = nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*ifi));
attr = (struct nlattr *) ((void *) ifi + NLMSG_ALIGN(sizeof(*ifi)));
if (libbpf_nla_parse(tb, IFLA_MAX, attr, len, NULL) != 0)
return -LIBBPF_ERRNO__NLPARSE;
return dump_link_nlmsg(cookie, ifi, tb);
}
static unsigned char get_xdp_id_attr(unsigned char mode, __u32 flags)
{
if (mode != XDP_ATTACHED_MULTI)
return IFLA_XDP_PROG_ID;
if (flags & XDP_FLAGS_DRV_MODE)
return IFLA_XDP_DRV_PROG_ID;
if (flags & XDP_FLAGS_HW_MODE)
return IFLA_XDP_HW_PROG_ID;
if (flags & XDP_FLAGS_SKB_MODE)
return IFLA_XDP_SKB_PROG_ID;
return IFLA_XDP_UNSPEC;
}
static int get_xdp_id(void *cookie, void *msg, struct nlattr **tb)
static int get_xdp_info(void *cookie, void *msg, struct nlattr **tb)
{
struct nlattr *xdp_tb[IFLA_XDP_MAX + 1];
struct xdp_id_md *xdp_id = cookie;
struct ifinfomsg *ifinfo = msg;
unsigned char mode, xdp_attr;
int ret;
if (xdp_id->ifindex && xdp_id->ifindex != ifinfo->ifi_index)
@@ -237,186 +278,481 @@ static int get_xdp_id(void *cookie, void *msg, struct nlattr **tb)
if (!xdp_tb[IFLA_XDP_ATTACHED])
return 0;
mode = libbpf_nla_getattr_u8(xdp_tb[IFLA_XDP_ATTACHED]);
if (mode == XDP_ATTACHED_NONE)
xdp_id->info.attach_mode = libbpf_nla_getattr_u8(
xdp_tb[IFLA_XDP_ATTACHED]);
if (xdp_id->info.attach_mode == XDP_ATTACHED_NONE)
return 0;
xdp_attr = get_xdp_id_attr(mode, xdp_id->flags);
if (!xdp_attr || !xdp_tb[xdp_attr])
return 0;
if (xdp_tb[IFLA_XDP_PROG_ID])
xdp_id->info.prog_id = libbpf_nla_getattr_u32(
xdp_tb[IFLA_XDP_PROG_ID]);
xdp_id->id = libbpf_nla_getattr_u32(xdp_tb[xdp_attr]);
if (xdp_tb[IFLA_XDP_SKB_PROG_ID])
xdp_id->info.skb_prog_id = libbpf_nla_getattr_u32(
xdp_tb[IFLA_XDP_SKB_PROG_ID]);
if (xdp_tb[IFLA_XDP_DRV_PROG_ID])
xdp_id->info.drv_prog_id = libbpf_nla_getattr_u32(
xdp_tb[IFLA_XDP_DRV_PROG_ID]);
if (xdp_tb[IFLA_XDP_HW_PROG_ID])
xdp_id->info.hw_prog_id = libbpf_nla_getattr_u32(
xdp_tb[IFLA_XDP_HW_PROG_ID]);
return 0;
}
int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info,
size_t info_size, __u32 flags)
{
struct xdp_id_md xdp_id = {};
__u32 mask;
int ret;
struct libbpf_nla_req req = {
.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
.nh.nlmsg_type = RTM_GETLINK,
.nh.nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
.ifinfo.ifi_family = AF_PACKET,
};
if (flags & ~XDP_FLAGS_MASK || !info_size)
return libbpf_err(-EINVAL);
/* Check whether the single {HW,DRV,SKB} mode is set */
flags &= (XDP_FLAGS_SKB_MODE | XDP_FLAGS_DRV_MODE | XDP_FLAGS_HW_MODE);
mask = flags - 1;
if (flags && flags & mask)
return libbpf_err(-EINVAL);
xdp_id.ifindex = ifindex;
xdp_id.flags = flags;
ret = libbpf_netlink_send_recv(&req, __dump_link_nlmsg,
get_xdp_info, &xdp_id);
if (!ret) {
size_t sz = min(info_size, sizeof(xdp_id.info));
memcpy(info, &xdp_id.info, sz);
memset((void *) info + sz, 0, info_size - sz);
}
return libbpf_err(ret);
}
static __u32 get_xdp_id(struct xdp_link_info *info, __u32 flags)
{
flags &= XDP_FLAGS_MODES;
if (info->attach_mode != XDP_ATTACHED_MULTI && !flags)
return info->prog_id;
if (flags & XDP_FLAGS_DRV_MODE)
return info->drv_prog_id;
if (flags & XDP_FLAGS_HW_MODE)
return info->hw_prog_id;
if (flags & XDP_FLAGS_SKB_MODE)
return info->skb_prog_id;
return 0;
}
int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags)
{
struct xdp_id_md xdp_id = {};
int sock, ret;
__u32 nl_pid;
__u32 mask;
struct xdp_link_info info;
int ret;
if (flags & ~XDP_FLAGS_MASK)
return -EINVAL;
/* Check whether the single {HW,DRV,SKB} mode is set */
flags &= (XDP_FLAGS_SKB_MODE | XDP_FLAGS_DRV_MODE | XDP_FLAGS_HW_MODE);
mask = flags - 1;
if (flags && flags & mask)
return -EINVAL;
sock = libbpf_netlink_open(&nl_pid);
if (sock < 0)
return sock;
xdp_id.ifindex = ifindex;
xdp_id.flags = flags;
ret = libbpf_nl_get_link(sock, nl_pid, get_xdp_id, &xdp_id);
ret = bpf_get_link_xdp_info(ifindex, &info, sizeof(info), flags);
if (!ret)
*prog_id = xdp_id.id;
*prog_id = get_xdp_id(&info, flags);
close(sock);
return libbpf_err(ret);
}
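And the corresponding query side, as a minimal sketch of bpf_get_link_xdp_id():

#include <bpf/libbpf.h>

static __u32 query_xdp_prog_id(int ifindex)
{
	__u32 prog_id = 0;

	/* flags == 0 reports the single attached program, if any */
	if (bpf_get_link_xdp_id(ifindex, &prog_id, 0))
		return 0;
	return prog_id;
}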
typedef int (*qdisc_config_t)(struct libbpf_nla_req *req);
static int clsact_config(struct libbpf_nla_req *req)
{
req->tc.tcm_parent = TC_H_CLSACT;
req->tc.tcm_handle = TC_H_MAKE(TC_H_CLSACT, 0);
return nlattr_add(req, TCA_KIND, "clsact", sizeof("clsact"));
}
static int attach_point_to_config(struct bpf_tc_hook *hook,
qdisc_config_t *config)
{
switch (OPTS_GET(hook, attach_point, 0)) {
case BPF_TC_INGRESS:
case BPF_TC_EGRESS:
case BPF_TC_INGRESS | BPF_TC_EGRESS:
if (OPTS_GET(hook, parent, 0))
return -EINVAL;
*config = &clsact_config;
return 0;
case BPF_TC_CUSTOM:
return -EOPNOTSUPP;
default:
return -EINVAL;
}
}
static int tc_get_tcm_parent(enum bpf_tc_attach_point attach_point,
__u32 *parent)
{
switch (attach_point) {
case BPF_TC_INGRESS:
case BPF_TC_EGRESS:
if (*parent)
return -EINVAL;
*parent = TC_H_MAKE(TC_H_CLSACT,
attach_point == BPF_TC_INGRESS ?
TC_H_MIN_INGRESS : TC_H_MIN_EGRESS);
break;
case BPF_TC_CUSTOM:
if (!*parent)
return -EINVAL;
break;
default:
return -EINVAL;
}
return 0;
}
static int tc_qdisc_modify(struct bpf_tc_hook *hook, int cmd, int flags)
{
qdisc_config_t config;
int ret;
struct libbpf_nla_req req;
ret = attach_point_to_config(hook, &config);
if (ret < 0)
return ret;
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | flags;
req.nh.nlmsg_type = cmd;
req.tc.tcm_family = AF_UNSPEC;
req.tc.tcm_ifindex = OPTS_GET(hook, ifindex, 0);
ret = config(&req);
if (ret < 0)
return ret;
return libbpf_netlink_send_recv(&req, NULL, NULL, NULL);
}
static int tc_qdisc_create_excl(struct bpf_tc_hook *hook)
{
return tc_qdisc_modify(hook, RTM_NEWQDISC, NLM_F_CREATE | NLM_F_EXCL);
}
static int tc_qdisc_delete(struct bpf_tc_hook *hook)
{
return tc_qdisc_modify(hook, RTM_DELQDISC, 0);
}
int bpf_tc_hook_create(struct bpf_tc_hook *hook)
{
int ret;
if (!hook || !OPTS_VALID(hook, bpf_tc_hook) ||
OPTS_GET(hook, ifindex, 0) <= 0)
return libbpf_err(-EINVAL);
ret = tc_qdisc_create_excl(hook);
return libbpf_err(ret);
}
static int __bpf_tc_detach(const struct bpf_tc_hook *hook,
const struct bpf_tc_opts *opts,
const bool flush);
int bpf_tc_hook_destroy(struct bpf_tc_hook *hook)
{
if (!hook || !OPTS_VALID(hook, bpf_tc_hook) ||
OPTS_GET(hook, ifindex, 0) <= 0)
return libbpf_err(-EINVAL);
switch (OPTS_GET(hook, attach_point, 0)) {
case BPF_TC_INGRESS:
case BPF_TC_EGRESS:
return libbpf_err(__bpf_tc_detach(hook, NULL, true));
case BPF_TC_INGRESS | BPF_TC_EGRESS:
return libbpf_err(tc_qdisc_delete(hook));
case BPF_TC_CUSTOM:
return libbpf_err(-EOPNOTSUPP);
default:
return libbpf_err(-EINVAL);
}
}
struct bpf_cb_ctx {
struct bpf_tc_opts *opts;
bool processed;
};
static int __get_tc_info(void *cookie, struct tcmsg *tc, struct nlattr **tb,
bool unicast)
{
struct nlattr *tbb[TCA_BPF_MAX + 1];
struct bpf_cb_ctx *info = cookie;
if (!info || !info->opts)
return -EINVAL;
if (unicast && info->processed)
return -EINVAL;
if (!tb[TCA_OPTIONS])
return NL_CONT;
libbpf_nla_parse_nested(tbb, TCA_BPF_MAX, tb[TCA_OPTIONS], NULL);
if (!tbb[TCA_BPF_ID])
return -EINVAL;
OPTS_SET(info->opts, prog_id, libbpf_nla_getattr_u32(tbb[TCA_BPF_ID]));
OPTS_SET(info->opts, handle, tc->tcm_handle);
OPTS_SET(info->opts, priority, TC_H_MAJ(tc->tcm_info) >> 16);
info->processed = true;
return unicast ? NL_NEXT : NL_DONE;
}
static int get_tc_info(struct nlmsghdr *nh, libbpf_dump_nlmsg_t fn,
void *cookie)
{
struct tcmsg *tc = NLMSG_DATA(nh);
struct nlattr *tb[TCA_MAX + 1];
libbpf_nla_parse(tb, TCA_MAX,
(struct nlattr *)((void *)tc + NLMSG_ALIGN(sizeof(*tc))),
NLMSG_PAYLOAD(nh, sizeof(*tc)), NULL);
if (!tb[TCA_KIND])
return NL_CONT;
return __get_tc_info(cookie, tc, tb, nh->nlmsg_flags & NLM_F_ECHO);
}
static int tc_add_fd_and_name(struct libbpf_nla_req *req, int fd)
{
struct bpf_prog_info info = {};
__u32 info_len = sizeof(info);
char name[256];
int len, ret;
ret = bpf_obj_get_info_by_fd(fd, &info, &info_len);
if (ret < 0)
return ret;
ret = nlattr_add(req, TCA_BPF_FD, &fd, sizeof(fd));
if (ret < 0)
return ret;
len = snprintf(name, sizeof(name), "%s:[%u]", info.name, info.id);
if (len < 0)
return -errno;
if (len >= sizeof(name))
return -ENAMETOOLONG;
return nlattr_add(req, TCA_BPF_NAME, name, len + 1);
}
int bpf_tc_attach(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts)
{
__u32 protocol, bpf_flags, handle, priority, parent, prog_id, flags;
int ret, ifindex, attach_point, prog_fd;
struct bpf_cb_ctx info = {};
struct libbpf_nla_req req;
struct nlattr *nla;
if (!hook || !opts ||
!OPTS_VALID(hook, bpf_tc_hook) ||
!OPTS_VALID(opts, bpf_tc_opts))
return libbpf_err(-EINVAL);
ifindex = OPTS_GET(hook, ifindex, 0);
parent = OPTS_GET(hook, parent, 0);
attach_point = OPTS_GET(hook, attach_point, 0);
handle = OPTS_GET(opts, handle, 0);
priority = OPTS_GET(opts, priority, 0);
prog_fd = OPTS_GET(opts, prog_fd, 0);
prog_id = OPTS_GET(opts, prog_id, 0);
flags = OPTS_GET(opts, flags, 0);
if (ifindex <= 0 || !prog_fd || prog_id)
return libbpf_err(-EINVAL);
if (priority > UINT16_MAX)
return libbpf_err(-EINVAL);
if (flags & ~BPF_TC_F_REPLACE)
return libbpf_err(-EINVAL);
flags = (flags & BPF_TC_F_REPLACE) ? NLM_F_REPLACE : NLM_F_EXCL;
protocol = ETH_P_ALL;
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE |
NLM_F_ECHO | flags;
req.nh.nlmsg_type = RTM_NEWTFILTER;
req.tc.tcm_family = AF_UNSPEC;
req.tc.tcm_ifindex = ifindex;
req.tc.tcm_handle = handle;
req.tc.tcm_info = TC_H_MAKE(priority << 16, htons(protocol));
ret = tc_get_tcm_parent(attach_point, &parent);
if (ret < 0)
return libbpf_err(ret);
req.tc.tcm_parent = parent;
ret = nlattr_add(&req, TCA_KIND, "bpf", sizeof("bpf"));
if (ret < 0)
return libbpf_err(ret);
nla = nlattr_begin_nested(&req, TCA_OPTIONS);
if (!nla)
return libbpf_err(-EMSGSIZE);
ret = tc_add_fd_and_name(&req, prog_fd);
if (ret < 0)
return libbpf_err(ret);
bpf_flags = TCA_BPF_FLAG_ACT_DIRECT;
ret = nlattr_add(&req, TCA_BPF_FLAGS, &bpf_flags, sizeof(bpf_flags));
if (ret < 0)
return libbpf_err(ret);
nlattr_end_nested(&req, nla);
info.opts = opts;
ret = libbpf_netlink_send_recv(&req, get_tc_info, NULL, &info);
if (ret < 0)
return libbpf_err(ret);
if (!info.processed)
return libbpf_err(-ENOENT);
return ret;
}
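A hedged usage sketch for the TC hook/attach API exercised above; prog_fd is assumed to be a loaded BPF_PROG_TYPE_SCHED_CLS program, and the handle/priority values are arbitrary examples:

#include <bpf/libbpf.h>

static int attach_tc_ingress(int ifindex, int prog_fd)
{
	DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook,
			    .ifindex = ifindex,
			    .attach_point = BPF_TC_INGRESS);
	DECLARE_LIBBPF_OPTS(bpf_tc_opts, opts,
			    .handle = 1, .priority = 1, .prog_fd = prog_fd);
	int err;

	/* creates the clsact qdisc; -EEXIST just means it already existed */
	err = bpf_tc_hook_create(&hook);
	if (err && err != -EEXIST)
		return err;
	return bpf_tc_attach(&hook, &opts);
}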
int libbpf_nl_get_link(int sock, unsigned int nl_pid,
libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie)
static int __bpf_tc_detach(const struct bpf_tc_hook *hook,
const struct bpf_tc_opts *opts,
const bool flush)
{
struct {
struct nlmsghdr nlh;
struct ifinfomsg ifm;
} req = {
.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
.nlh.nlmsg_type = RTM_GETLINK,
.nlh.nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
.ifm.ifi_family = AF_PACKET,
};
int seq = time(NULL);
__u32 protocol = 0, handle, priority, parent, prog_id, flags;
int ret, ifindex, attach_point, prog_fd;
struct libbpf_nla_req req;
req.nlh.nlmsg_seq = seq;
if (send(sock, &req, req.nlh.nlmsg_len, 0) < 0)
return -errno;
if (!hook ||
!OPTS_VALID(hook, bpf_tc_hook) ||
!OPTS_VALID(opts, bpf_tc_opts))
return -EINVAL;
return bpf_netlink_recv(sock, nl_pid, seq, __dump_link_nlmsg,
dump_link_nlmsg, cookie);
ifindex = OPTS_GET(hook, ifindex, 0);
parent = OPTS_GET(hook, parent, 0);
attach_point = OPTS_GET(hook, attach_point, 0);
handle = OPTS_GET(opts, handle, 0);
priority = OPTS_GET(opts, priority, 0);
prog_fd = OPTS_GET(opts, prog_fd, 0);
prog_id = OPTS_GET(opts, prog_id, 0);
flags = OPTS_GET(opts, flags, 0);
if (ifindex <= 0 || flags || prog_fd || prog_id)
return -EINVAL;
if (priority > UINT16_MAX)
return -EINVAL;
if (!flush) {
if (!handle || !priority)
return -EINVAL;
protocol = ETH_P_ALL;
} else {
if (handle || priority)
return -EINVAL;
}
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
req.nh.nlmsg_type = RTM_DELTFILTER;
req.tc.tcm_family = AF_UNSPEC;
req.tc.tcm_ifindex = ifindex;
if (!flush) {
req.tc.tcm_handle = handle;
req.tc.tcm_info = TC_H_MAKE(priority << 16, htons(protocol));
}
ret = tc_get_tcm_parent(attach_point, &parent);
if (ret < 0)
return ret;
req.tc.tcm_parent = parent;
if (!flush) {
ret = nlattr_add(&req, TCA_KIND, "bpf", sizeof("bpf"));
if (ret < 0)
return ret;
}
return libbpf_netlink_send_recv(&req, NULL, NULL, NULL);
}
static int __dump_class_nlmsg(struct nlmsghdr *nlh,
libbpf_dump_nlmsg_t dump_class_nlmsg,
void *cookie)
int bpf_tc_detach(const struct bpf_tc_hook *hook,
const struct bpf_tc_opts *opts)
{
struct nlattr *tb[TCA_MAX + 1], *attr;
struct tcmsg *t = NLMSG_DATA(nlh);
int len;
int ret;
len = nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*t));
attr = (struct nlattr *) ((void *) t + NLMSG_ALIGN(sizeof(*t)));
if (libbpf_nla_parse(tb, TCA_MAX, attr, len, NULL) != 0)
return -LIBBPF_ERRNO__NLPARSE;
if (!opts)
return libbpf_err(-EINVAL);
return dump_class_nlmsg(cookie, t, tb);
ret = __bpf_tc_detach(hook, opts, false);
return libbpf_err(ret);
}
int libbpf_nl_get_class(int sock, unsigned int nl_pid, int ifindex,
libbpf_dump_nlmsg_t dump_class_nlmsg, void *cookie)
int bpf_tc_query(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts)
{
struct {
struct nlmsghdr nlh;
struct tcmsg t;
} req = {
.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg)),
.nlh.nlmsg_type = RTM_GETTCLASS,
.nlh.nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
.t.tcm_family = AF_UNSPEC,
.t.tcm_ifindex = ifindex,
};
int seq = time(NULL);
__u32 protocol, handle, priority, parent, prog_id, flags;
int ret, ifindex, attach_point, prog_fd;
struct bpf_cb_ctx info = {};
struct libbpf_nla_req req;
req.nlh.nlmsg_seq = seq;
if (send(sock, &req, req.nlh.nlmsg_len, 0) < 0)
return -errno;
if (!hook || !opts ||
!OPTS_VALID(hook, bpf_tc_hook) ||
!OPTS_VALID(opts, bpf_tc_opts))
return libbpf_err(-EINVAL);
return bpf_netlink_recv(sock, nl_pid, seq, __dump_class_nlmsg,
dump_class_nlmsg, cookie);
}
static int __dump_qdisc_nlmsg(struct nlmsghdr *nlh,
libbpf_dump_nlmsg_t dump_qdisc_nlmsg,
void *cookie)
{
struct nlattr *tb[TCA_MAX + 1], *attr;
struct tcmsg *t = NLMSG_DATA(nlh);
int len;
len = nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*t));
attr = (struct nlattr *) ((void *) t + NLMSG_ALIGN(sizeof(*t)));
if (libbpf_nla_parse(tb, TCA_MAX, attr, len, NULL) != 0)
return -LIBBPF_ERRNO__NLPARSE;
return dump_qdisc_nlmsg(cookie, t, tb);
}
int libbpf_nl_get_qdisc(int sock, unsigned int nl_pid, int ifindex,
libbpf_dump_nlmsg_t dump_qdisc_nlmsg, void *cookie)
{
struct {
struct nlmsghdr nlh;
struct tcmsg t;
} req = {
.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg)),
.nlh.nlmsg_type = RTM_GETQDISC,
.nlh.nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
.t.tcm_family = AF_UNSPEC,
.t.tcm_ifindex = ifindex,
};
int seq = time(NULL);
req.nlh.nlmsg_seq = seq;
if (send(sock, &req, req.nlh.nlmsg_len, 0) < 0)
return -errno;
return bpf_netlink_recv(sock, nl_pid, seq, __dump_qdisc_nlmsg,
dump_qdisc_nlmsg, cookie);
}
static int __dump_filter_nlmsg(struct nlmsghdr *nlh,
libbpf_dump_nlmsg_t dump_filter_nlmsg,
void *cookie)
{
struct nlattr *tb[TCA_MAX + 1], *attr;
struct tcmsg *t = NLMSG_DATA(nlh);
int len;
len = nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*t));
attr = (struct nlattr *) ((void *) t + NLMSG_ALIGN(sizeof(*t)));
if (libbpf_nla_parse(tb, TCA_MAX, attr, len, NULL) != 0)
return -LIBBPF_ERRNO__NLPARSE;
return dump_filter_nlmsg(cookie, t, tb);
}
int libbpf_nl_get_filter(int sock, unsigned int nl_pid, int ifindex, int handle,
libbpf_dump_nlmsg_t dump_filter_nlmsg, void *cookie)
{
struct {
struct nlmsghdr nlh;
struct tcmsg t;
} req = {
.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg)),
.nlh.nlmsg_type = RTM_GETTFILTER,
.nlh.nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
.t.tcm_family = AF_UNSPEC,
.t.tcm_ifindex = ifindex,
.t.tcm_parent = handle,
};
int seq = time(NULL);
req.nlh.nlmsg_seq = seq;
if (send(sock, &req, req.nlh.nlmsg_len, 0) < 0)
return -errno;
return bpf_netlink_recv(sock, nl_pid, seq, __dump_filter_nlmsg,
dump_filter_nlmsg, cookie);
ifindex = OPTS_GET(hook, ifindex, 0);
parent = OPTS_GET(hook, parent, 0);
attach_point = OPTS_GET(hook, attach_point, 0);
handle = OPTS_GET(opts, handle, 0);
priority = OPTS_GET(opts, priority, 0);
prog_fd = OPTS_GET(opts, prog_fd, 0);
prog_id = OPTS_GET(opts, prog_id, 0);
flags = OPTS_GET(opts, flags, 0);
if (ifindex <= 0 || flags || prog_fd || prog_id ||
!handle || !priority)
return libbpf_err(-EINVAL);
if (priority > UINT16_MAX)
return libbpf_err(-EINVAL);
protocol = ETH_P_ALL;
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
req.nh.nlmsg_flags = NLM_F_REQUEST;
req.nh.nlmsg_type = RTM_GETTFILTER;
req.tc.tcm_family = AF_UNSPEC;
req.tc.tcm_ifindex = ifindex;
req.tc.tcm_handle = handle;
req.tc.tcm_info = TC_H_MAKE(priority << 16, htons(protocol));
ret = tc_get_tcm_parent(attach_point, &parent);
if (ret < 0)
return libbpf_err(ret);
req.tc.tcm_parent = parent;
ret = nlattr_add(&req, TCA_KIND, "bpf", sizeof("bpf"));
if (ret < 0)
return libbpf_err(ret);
info.opts = opts;
ret = libbpf_netlink_send_recv(&req, get_tc_info, NULL, &info);
if (ret < 0)
return libbpf_err(ret);
if (!info.processed)
return libbpf_err(-ENOENT);
return ret;
}

View File

@@ -7,10 +7,11 @@
*/
#include <errno.h>
#include "nlattr.h"
#include <linux/rtnetlink.h>
#include <string.h>
#include <stdio.h>
#include <linux/rtnetlink.h>
#include "nlattr.h"
#include "libbpf_internal.h"
static uint16_t nla_attr_minlen[LIBBPF_NLA_TYPE_MAX+1] = {
[LIBBPF_NLA_U8] = sizeof(uint8_t),
@@ -26,7 +27,7 @@ static struct nlattr *nla_next(const struct nlattr *nla, int *remaining)
int totlen = NLA_ALIGN(nla->nla_len);
*remaining -= totlen;
return (struct nlattr *) ((char *) nla + totlen);
return (struct nlattr *)((void *)nla + totlen);
}
static int nla_ok(const struct nlattr *nla, int remaining)
@@ -121,8 +122,8 @@ int libbpf_nla_parse(struct nlattr *tb[], int maxtype, struct nlattr *head,
}
if (tb[type])
fprintf(stderr, "Attribute of type %#x found multiple times in message, "
"previous attribute is being ignored.\n", type);
pr_warn("Attribute of type %#x found multiple times in message, "
"previous attribute is being ignored.\n", type);
tb[type] = nla;
}
@@ -181,15 +182,14 @@ int libbpf_nla_dump_errormsg(struct nlmsghdr *nlh)
if (libbpf_nla_parse(tb, NLMSGERR_ATTR_MAX, attr, alen,
extack_policy) != 0) {
fprintf(stderr,
"Failed to parse extended error attributes\n");
pr_warn("Failed to parse extended error attributes\n");
return 0;
}
if (tb[NLMSGERR_ATTR_MSG])
errmsg = (char *) libbpf_nla_data(tb[NLMSGERR_ATTR_MSG]);
fprintf(stderr, "Kernel error message: %s\n", errmsg);
pr_warn("Kernel error message: %s\n", errmsg);
return 0;
}

View File

@@ -10,7 +10,11 @@
#define __LIBBPF_NLATTR_H
#include <stdint.h>
#include <string.h>
#include <errno.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
/* avoid multiple definition of netlink features */
#define __LINUX_NETLINK_H
@@ -49,6 +53,15 @@ struct libbpf_nla_policy {
uint16_t maxlen;
};
struct libbpf_nla_req {
struct nlmsghdr nh;
union {
struct ifinfomsg ifinfo;
struct tcmsg tc;
};
char buf[128];
};
/**
* @ingroup attr
* Iterate over a stream of attributes
@@ -68,7 +81,7 @@ struct libbpf_nla_policy {
*/
static inline void *libbpf_nla_data(const struct nlattr *nla)
{
return (char *) nla + NLA_HDRLEN;
return (void *)nla + NLA_HDRLEN;
}
static inline uint8_t libbpf_nla_getattr_u8(const struct nlattr *nla)
@@ -103,4 +116,49 @@ int libbpf_nla_parse_nested(struct nlattr *tb[], int maxtype,
int libbpf_nla_dump_errormsg(struct nlmsghdr *nlh);
static inline struct nlattr *nla_data(struct nlattr *nla)
{
return (struct nlattr *)((void *)nla + NLA_HDRLEN);
}
static inline struct nlattr *req_tail(struct libbpf_nla_req *req)
{
return (struct nlattr *)((void *)req + NLMSG_ALIGN(req->nh.nlmsg_len));
}
static inline int nlattr_add(struct libbpf_nla_req *req, int type,
const void *data, int len)
{
struct nlattr *nla;
if (NLMSG_ALIGN(req->nh.nlmsg_len) + NLA_ALIGN(NLA_HDRLEN + len) > sizeof(*req))
return -EMSGSIZE;
if (!!data != !!len)
return -EINVAL;
nla = req_tail(req);
nla->nla_type = type;
nla->nla_len = NLA_HDRLEN + len;
if (data)
memcpy(nla_data(nla), data, len);
req->nh.nlmsg_len = NLMSG_ALIGN(req->nh.nlmsg_len) + NLA_ALIGN(nla->nla_len);
return 0;
}
static inline struct nlattr *nlattr_begin_nested(struct libbpf_nla_req *req, int type)
{
struct nlattr *tail;
tail = req_tail(req);
if (nlattr_add(req, type | NLA_F_NESTED, NULL, 0))
return NULL;
return tail;
}
static inline void nlattr_end_nested(struct libbpf_nla_req *req,
struct nlattr *tail)
{
tail->nla_len = (void *)req_tail(req) - (void *)tail;
}
#endif /* __LIBBPF_NLATTR_H */
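As an internal-usage sketch, the request helpers above compose like this when building a netlink message, mirroring what netlink.c does for XDP; the function is hypothetical and assumes the same includes netlink.c already pulls in (<sys/socket.h>, <linux/if_link.h>, <linux/rtnetlink.h>):

static int build_xdp_set_req(struct libbpf_nla_req *req, int ifindex, int fd)
{
	struct nlattr *nest;
	int err;

	memset(req, 0, sizeof(*req));
	req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	req->nh.nlmsg_type = RTM_SETLINK;
	req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
	req->ifinfo.ifi_family = AF_UNSPEC;
	req->ifinfo.ifi_index = ifindex;

	nest = nlattr_begin_nested(req, IFLA_XDP);	/* open IFLA_XDP nest */
	if (!nest)
		return -EMSGSIZE;
	err = nlattr_add(req, IFLA_XDP_FD, &fd, sizeof(fd));
	if (err < 0)
		return err;
	nlattr_end_nested(req, nest);			/* fix up nest length */
	return 0;
}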

1295
src/relo_core.c Normal file

File diff suppressed because it is too large Load Diff

100
src/relo_core.h Normal file
View File

@@ -0,0 +1,100 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2019 Facebook */
#ifndef __RELO_CORE_H
#define __RELO_CORE_H
/* bpf_core_relo_kind encodes which aspect of captured field/type/enum value
* has to be adjusted by relocations.
*/
enum bpf_core_relo_kind {
BPF_FIELD_BYTE_OFFSET = 0, /* field byte offset */
BPF_FIELD_BYTE_SIZE = 1, /* field size in bytes */
BPF_FIELD_EXISTS = 2, /* field existence in target kernel */
BPF_FIELD_SIGNED = 3, /* field signedness (0 - unsigned, 1 - signed) */
BPF_FIELD_LSHIFT_U64 = 4, /* bitfield-specific left bitshift */
BPF_FIELD_RSHIFT_U64 = 5, /* bitfield-specific right bitshift */
BPF_TYPE_ID_LOCAL = 6, /* type ID in local BPF object */
BPF_TYPE_ID_TARGET = 7, /* type ID in target kernel */
BPF_TYPE_EXISTS = 8, /* type existence in target kernel */
BPF_TYPE_SIZE = 9, /* type size in bytes */
BPF_ENUMVAL_EXISTS = 10, /* enum value existence in target kernel */
BPF_ENUMVAL_VALUE = 11, /* enum value integer value */
};
/* The minimum bpf_core_relo checked by the loader
*
* CO-RE relocation captures the following data:
* - insn_off - instruction offset (in bytes) within a BPF program that needs
* its insn->imm field to be relocated with actual field info;
* - type_id - BTF type ID of the "root" (containing) entity of a relocatable
* type or field;
* - access_str_off - offset into corresponding .BTF string section. String
* interpretation depends on specific relocation kind:
* - for field-based relocations, string encodes an accessed field using
* a sequence of field and array indices, separated by colon (:). It's
* conceptually very close to LLVM's getelementptr ([0]) instruction's
* arguments for identifying offset to a field.
* - for type-based relocations, the string is expected to be just "0";
* - for enum value-based relocations, string contains an index of enum
* value within its enum type;
*
* Example to provide a better feel.
*
* struct sample {
* int a;
* struct {
* int b[10];
* };
* };
*
* struct sample *s = ...;
* int *x = &s->a;     // encoded as "0:0" (a is field #0)
* int *y = &s->b[5];  // encoded as "0:1:0:5" (anon struct is field #1,
*                     // b is field #0 inside anon struct, accessing elem #5)
* int *z = &s[10]->b; // encoded as "10:1" (ptr is used as an array)
*
* type_id for all relocs in this example will capture BTF type id of
* `struct sample`.
*
* Such a relocation is emitted when using the __builtin_preserve_access_index()
* Clang built-in, passing an expression that captures a field address, e.g.:
*
* bpf_probe_read(&dst, sizeof(dst),
* __builtin_preserve_access_index(&src->a.b.c));
*
* In this case Clang will emit field relocation recording necessary data to
* be able to find offset of embedded `a.b.c` field within `src` struct.
*
* [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction
*/
struct bpf_core_relo {
__u32 insn_off;
__u32 type_id;
__u32 access_str_off;
enum bpf_core_relo_kind kind;
};
struct bpf_core_cand {
const struct btf *btf;
const struct btf_type *t;
const char *name;
__u32 id;
};
/* dynamically sized list of type IDs and its associated struct btf */
struct bpf_core_cand_list {
struct bpf_core_cand *cands;
int len;
};
int bpf_core_apply_relo_insn(const char *prog_name,
struct bpf_insn *insn, int insn_idx,
const struct bpf_core_relo *relo, int relo_idx,
const struct btf *local_btf,
struct bpf_core_cand_list *cands);
int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id,
const struct btf *targ_btf, __u32 targ_id);
size_t bpf_core_essential_name_len(const char *name);
#endif

302
src/ringbuf.c Normal file
View File

@@ -0,0 +1,302 @@
// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/*
* Ring buffer operations.
*
* Copyright (C) 2020 Facebook, Inc.
*/
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <linux/err.h>
#include <linux/bpf.h>
#include <asm/barrier.h>
#include <sys/mman.h>
#include <sys/epoll.h>
#include "libbpf.h"
#include "libbpf_internal.h"
#include "bpf.h"
struct ring {
ring_buffer_sample_fn sample_cb;
void *ctx;
void *data;
unsigned long *consumer_pos;
unsigned long *producer_pos;
unsigned long mask;
int map_fd;
};
struct ring_buffer {
struct epoll_event *events;
struct ring *rings;
size_t page_size;
int epoll_fd;
int ring_cnt;
};
static void ringbuf_unmap_ring(struct ring_buffer *rb, struct ring *r)
{
if (r->consumer_pos) {
munmap(r->consumer_pos, rb->page_size);
r->consumer_pos = NULL;
}
if (r->producer_pos) {
munmap(r->producer_pos, rb->page_size + 2 * (r->mask + 1));
r->producer_pos = NULL;
}
}
/* Add extra RINGBUF maps to this ring buffer manager */
int ring_buffer__add(struct ring_buffer *rb, int map_fd,
ring_buffer_sample_fn sample_cb, void *ctx)
{
struct bpf_map_info info;
__u32 len = sizeof(info);
struct epoll_event *e;
struct ring *r;
void *tmp;
int err;
memset(&info, 0, sizeof(info));
err = bpf_obj_get_info_by_fd(map_fd, &info, &len);
if (err) {
err = -errno;
pr_warn("ringbuf: failed to get map info for fd=%d: %d\n",
map_fd, err);
return libbpf_err(err);
}
if (info.type != BPF_MAP_TYPE_RINGBUF) {
pr_warn("ringbuf: map fd=%d is not BPF_MAP_TYPE_RINGBUF\n",
map_fd);
return libbpf_err(-EINVAL);
}
tmp = libbpf_reallocarray(rb->rings, rb->ring_cnt + 1, sizeof(*rb->rings));
if (!tmp)
return libbpf_err(-ENOMEM);
rb->rings = tmp;
tmp = libbpf_reallocarray(rb->events, rb->ring_cnt + 1, sizeof(*rb->events));
if (!tmp)
return libbpf_err(-ENOMEM);
rb->events = tmp;
r = &rb->rings[rb->ring_cnt];
memset(r, 0, sizeof(*r));
r->map_fd = map_fd;
r->sample_cb = sample_cb;
r->ctx = ctx;
r->mask = info.max_entries - 1;
/* Map writable consumer page */
tmp = mmap(NULL, rb->page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
map_fd, 0);
if (tmp == MAP_FAILED) {
err = -errno;
pr_warn("ringbuf: failed to mmap consumer page for map fd=%d: %d\n",
map_fd, err);
return libbpf_err(err);
}
r->consumer_pos = tmp;
/* Map read-only producer page and data pages. We map twice the
* data size to allow simple reading of samples that wrap around the
* end of the ring buffer. See the kernel implementation for details.
*/
tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
MAP_SHARED, map_fd, rb->page_size);
if (tmp == MAP_FAILED) {
err = -errno;
ringbuf_unmap_ring(rb, r);
pr_warn("ringbuf: failed to mmap data pages for map fd=%d: %d\n",
map_fd, err);
return libbpf_err(err);
}
r->producer_pos = tmp;
r->data = tmp + rb->page_size;
e = &rb->events[rb->ring_cnt];
memset(e, 0, sizeof(*e));
e->events = EPOLLIN;
e->data.fd = rb->ring_cnt;
if (epoll_ctl(rb->epoll_fd, EPOLL_CTL_ADD, map_fd, e) < 0) {
err = -errno;
ringbuf_unmap_ring(rb, r);
pr_warn("ringbuf: failed to epoll add map fd=%d: %d\n",
map_fd, err);
return libbpf_err(err);
}
rb->ring_cnt++;
return 0;
}
void ring_buffer__free(struct ring_buffer *rb)
{
int i;
if (!rb)
return;
for (i = 0; i < rb->ring_cnt; ++i)
ringbuf_unmap_ring(rb, &rb->rings[i]);
if (rb->epoll_fd >= 0)
close(rb->epoll_fd);
free(rb->events);
free(rb->rings);
free(rb);
}
struct ring_buffer *
ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, void *ctx,
const struct ring_buffer_opts *opts)
{
struct ring_buffer *rb;
int err;
if (!OPTS_VALID(opts, ring_buffer_opts))
return errno = EINVAL, NULL;
rb = calloc(1, sizeof(*rb));
if (!rb)
return errno = ENOMEM, NULL;
rb->page_size = getpagesize();
rb->epoll_fd = epoll_create1(EPOLL_CLOEXEC);
if (rb->epoll_fd < 0) {
err = -errno;
pr_warn("ringbuf: failed to create epoll instance: %d\n", err);
goto err_out;
}
err = ring_buffer__add(rb, map_fd, sample_cb, ctx);
if (err)
goto err_out;
return rb;
err_out:
ring_buffer__free(rb);
return errno = -err, NULL;
}
static inline int roundup_len(__u32 len)
{
/* clear out top 2 bits (discard and busy, if set) */
len <<= 2;
len >>= 2;
/* add length prefix */
len += BPF_RINGBUF_HDR_SZ;
/* round up to 8 byte alignment */
return (len + 7) / 8 * 8;
}
static int64_t ringbuf_process_ring(struct ring* r)
{
int *len_ptr, len, err;
/* 64-bit to avoid overflow in case of extreme application behavior */
int64_t cnt = 0;
unsigned long cons_pos, prod_pos;
bool got_new_data;
void *sample;
cons_pos = smp_load_acquire(r->consumer_pos);
do {
got_new_data = false;
prod_pos = smp_load_acquire(r->producer_pos);
while (cons_pos < prod_pos) {
len_ptr = r->data + (cons_pos & r->mask);
len = smp_load_acquire(len_ptr);
/* sample not committed yet, bail out for now */
if (len & BPF_RINGBUF_BUSY_BIT)
goto done;
got_new_data = true;
cons_pos += roundup_len(len);
if ((len & BPF_RINGBUF_DISCARD_BIT) == 0) {
sample = (void *)len_ptr + BPF_RINGBUF_HDR_SZ;
err = r->sample_cb(r->ctx, sample, len);
if (err < 0) {
/* update consumer pos and bail out */
smp_store_release(r->consumer_pos,
cons_pos);
return err;
}
cnt++;
}
smp_store_release(r->consumer_pos, cons_pos);
}
} while (got_new_data);
done:
return cnt;
}
/* Consume available ring buffer(s) data without event polling.
* Returns number of records consumed across all registered ring buffers (or
* INT_MAX, whichever is less), or a negative number if any of the callbacks
* returns an error.
*/
int ring_buffer__consume(struct ring_buffer *rb)
{
int64_t err, res = 0;
int i;
for (i = 0; i < rb->ring_cnt; i++) {
struct ring *ring = &rb->rings[i];
err = ringbuf_process_ring(ring);
if (err < 0)
return libbpf_err(err);
res += err;
}
if (res > INT_MAX)
return INT_MAX;
return res;
}
/* Poll for available data and consume records, if any are available.
* Returns number of records consumed (or INT_MAX, whichever is less), or
* a negative number if any of the registered callbacks returned an error.
*/
int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms)
{
int i, cnt;
int64_t err, res = 0;
cnt = epoll_wait(rb->epoll_fd, rb->events, rb->ring_cnt, timeout_ms);
if (cnt < 0)
return libbpf_err(-errno);
for (i = 0; i < cnt; i++) {
__u32 ring_id = rb->events[i].data.fd;
struct ring *ring = &rb->rings[ring_id];
err = ringbuf_process_ring(ring);
if (err < 0)
return libbpf_err(err);
res += err;
}
if (res > INT_MAX)
return INT_MAX;
return res;
}
/* Get an fd that can be used to sleep until data is available in the ring(s) */
int ring_buffer__epoll_fd(const struct ring_buffer *rb)
{
return rb->epoll_fd;
}
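Putting the pieces together, a hedged consumer sketch for the ring buffer API above; map_fd is assumed to refer to a BPF_MAP_TYPE_RINGBUF map created elsewhere:

#include <errno.h>
#include <stdio.h>
#include <bpf/libbpf.h>

static int handle_sample(void *ctx, void *data, size_t size)
{
	printf("got %zu-byte sample\n", size);
	return 0;	/* a negative return would abort consumption */
}

static int consume_ringbuf(int map_fd)
{
	struct ring_buffer *rb;
	int err;

	rb = ring_buffer__new(map_fd, handle_sample, NULL, NULL);
	if (!rb)
		return -errno;

	/* poll with a 100ms timeout until a callback or poll error occurs */
	while ((err = ring_buffer__poll(rb, 100)) >= 0)
		;
	ring_buffer__free(rb);
	return err;
}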

125
src/skel_internal.h Normal file
View File

@@ -0,0 +1,125 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2021 Facebook */
#ifndef __SKEL_INTERNAL_H
#define __SKEL_INTERNAL_H
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/mman.h>
/* This file is a base header for auto-generated *.lskel.h files.
* Its contents will change and may become part of auto-generation in the future.
*
* The layout of bpf_[map|prog]_desc and bpf_loader_ctx is feature dependent
* and will change from one version of libbpf to another, depending on the
* features requested during loader program generation.
*/
struct bpf_map_desc {
union {
/* input for the loader prog */
struct {
__aligned_u64 initial_value;
__u32 max_entries;
};
/* output of the loader prog */
struct {
int map_fd;
};
};
};
struct bpf_prog_desc {
int prog_fd;
};
struct bpf_loader_ctx {
size_t sz;
__u32 log_level;
__u32 log_size;
__u64 log_buf;
};
struct bpf_load_and_run_opts {
struct bpf_loader_ctx *ctx;
const void *data;
const void *insns;
__u32 data_sz;
__u32 insns_sz;
const char *errstr;
};
static inline int skel_sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
unsigned int size)
{
return syscall(__NR_bpf, cmd, attr, size);
}
static inline int skel_closenz(int fd)
{
if (fd > 0)
return close(fd);
return -EINVAL;
}
static inline int bpf_load_and_run(struct bpf_load_and_run_opts *opts)
{
int map_fd = -1, prog_fd = -1, key = 0, err;
union bpf_attr attr;
map_fd = bpf_create_map_name(BPF_MAP_TYPE_ARRAY, "__loader.map", 4,
opts->data_sz, 1, 0);
if (map_fd < 0) {
opts->errstr = "failed to create loader map";
err = -errno;
goto out;
}
err = bpf_map_update_elem(map_fd, &key, opts->data, 0);
if (err < 0) {
opts->errstr = "failed to update loader map";
err = -errno;
goto out;
}
memset(&attr, 0, sizeof(attr));
attr.prog_type = BPF_PROG_TYPE_SYSCALL;
attr.insns = (long) opts->insns;
attr.insn_cnt = opts->insns_sz / sizeof(struct bpf_insn);
attr.license = (long) "Dual BSD/GPL";
memcpy(attr.prog_name, "__loader.prog", sizeof("__loader.prog"));
attr.fd_array = (long) &map_fd;
attr.log_level = opts->ctx->log_level;
attr.log_size = opts->ctx->log_size;
attr.log_buf = opts->ctx->log_buf;
attr.prog_flags = BPF_F_SLEEPABLE;
prog_fd = skel_sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
if (prog_fd < 0) {
opts->errstr = "failed to load loader prog";
err = -errno;
goto out;
}
memset(&attr, 0, sizeof(attr));
attr.test.prog_fd = prog_fd;
attr.test.ctx_in = (long) opts->ctx;
attr.test.ctx_size_in = opts->ctx->sz;
err = skel_sys_bpf(BPF_PROG_RUN, &attr, sizeof(attr));
if (err < 0 || (int)attr.test.retval < 0) {
opts->errstr = "failed to execute loader prog";
if (err < 0) {
err = -errno;
} else {
err = (int)attr.test.retval;
errno = -err;
}
goto out;
}
err = 0;
out:
if (map_fd >= 0)
close(map_fd);
if (prog_fd >= 0)
close(prog_fd);
return err;
}
#endif
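A hypothetical caller, loosely modeled on what a generated *.lskel.h would do; the loader-program instructions and its map data blob are assumed to be embedded in the skeleton as byte arrays, and the context layout is simplified to just the bpf_loader_ctx shown above:

#include <stdio.h>

static int run_loader(struct bpf_loader_ctx *ctx,
		      const void *insns, __u32 insns_sz,
		      const void *data, __u32 data_sz)
{
	struct bpf_load_and_run_opts opts = {
		.ctx = ctx,
		.insns = insns,
		.insns_sz = insns_sz,
		.data = data,
		.data_sz = data_sz,
	};
	int err;

	ctx->sz = sizeof(*ctx);	/* real skeletons size the embedding struct */
	err = bpf_load_and_run(&opts);
	if (err)
		fprintf(stderr, "light skeleton load failed: %s (%d)\n",
			opts.errstr, err);
	return err;
}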

View File

@@ -4,6 +4,9 @@
#include <stdio.h>
#include "str_error.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
/*
* Wrapper to allow for building in non-GNU systems such as Alpine Linux's musl
* libc, while checking strerror_r() return to avoid having to check this in

177
src/strset.c Normal file
View File

@@ -0,0 +1,177 @@
// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/* Copyright (c) 2021 Facebook */
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <linux/err.h>
#include "hashmap.h"
#include "libbpf_internal.h"
#include "strset.h"
struct strset {
void *strs_data;
size_t strs_data_len;
size_t strs_data_cap;
size_t strs_data_max_len;
/* lookup index for each unique string in strings set */
struct hashmap *strs_hash;
};
static size_t strset_hash_fn(const void *key, void *ctx)
{
const struct strset *s = ctx;
const char *str = s->strs_data + (long)key;
return str_hash(str);
}
static bool strset_equal_fn(const void *key1, const void *key2, void *ctx)
{
const struct strset *s = ctx;
const char *str1 = s->strs_data + (long)key1;
const char *str2 = s->strs_data + (long)key2;
return strcmp(str1, str2) == 0;
}
struct strset *strset__new(size_t max_data_sz, const char *init_data, size_t init_data_sz)
{
struct strset *set = calloc(1, sizeof(*set));
struct hashmap *hash;
int err = -ENOMEM;
if (!set)
return ERR_PTR(-ENOMEM);
hash = hashmap__new(strset_hash_fn, strset_equal_fn, set);
if (IS_ERR(hash))
goto err_out;
set->strs_data_max_len = max_data_sz;
set->strs_hash = hash;
if (init_data) {
long off;
set->strs_data = malloc(init_data_sz);
if (!set->strs_data)
goto err_out;
memcpy(set->strs_data, init_data, init_data_sz);
set->strs_data_len = init_data_sz;
set->strs_data_cap = init_data_sz;
for (off = 0; off < set->strs_data_len; off += strlen(set->strs_data + off) + 1) {
/* hashmap__add() returns -EEXIST if a string with the same
* content is already in the hash map
*/
err = hashmap__add(hash, (void *)off, (void *)off);
if (err == -EEXIST)
continue; /* duplicate */
if (err)
goto err_out;
}
}
return set;
err_out:
strset__free(set);
return ERR_PTR(err);
}
void strset__free(struct strset *set)
{
if (IS_ERR_OR_NULL(set))
return;
hashmap__free(set->strs_hash);
free(set->strs_data);
free(set);
}
size_t strset__data_size(const struct strset *set)
{
return set->strs_data_len;
}
const char *strset__data(const struct strset *set)
{
return set->strs_data;
}
static void *strset_add_str_mem(struct strset *set, size_t add_sz)
{
return libbpf_add_mem(&set->strs_data, &set->strs_data_cap, 1,
set->strs_data_len, set->strs_data_max_len, add_sz);
}
/* Find string offset that corresponds to a given string *s*.
* Returns:
* - >0 offset into string data, if string is found;
* - -ENOENT, if string is not in the string data;
* - <0, on any other error.
*/
int strset__find_str(struct strset *set, const char *s)
{
long old_off, new_off, len;
void *p;
/* see strset__add_str() for why we do this */
len = strlen(s) + 1;
p = strset_add_str_mem(set, len);
if (!p)
return -ENOMEM;
new_off = set->strs_data_len;
memcpy(p, s, len);
if (hashmap__find(set->strs_hash, (void *)new_off, (void **)&old_off))
return old_off;
return -ENOENT;
}
/* Add a string s to the string data. If the string already exists, return its
* offset within string data.
* Returns:
* - > 0 offset into string data, on success;
* - < 0, on error.
*/
int strset__add_str(struct strset *set, const char *s)
{
long old_off, new_off, len;
void *p;
int err;
/* Hashmap keys are always offsets within set->strs_data, so to even
* look up some string from the "outside", we need to first append it
* at the end, so that it can be addressed with an offset. Luckily,
* until set->strs_data_len is incremented, that string is just a piece
* of garbage for the rest of the code, so no harm, no foul. On the
* other hand, if the string is unique, it's already appended and
* ready to be used, only a simple set->strs_data_len increment away.
*/
len = strlen(s) + 1;
p = strset_add_str_mem(set, len);
if (!p)
return -ENOMEM;
new_off = set->strs_data_len;
memcpy(p, s, len);
/* Now attempt to add the string, but only if the string with the same
* contents doesn't exist already (HASHMAP_ADD strategy). If such
* string exists, we'll get its offset in old_off (that's old_key).
*/
err = hashmap__insert(set->strs_hash, (void *)new_off, (void *)new_off,
HASHMAP_ADD, (const void **)&old_off, NULL);
if (err == -EEXIST)
return old_off; /* duplicated string, return existing offset */
if (err)
return err;
set->strs_data_len += len; /* new unique string, adjust data length */
return new_off;
}

21
src/strset.h Normal file
View File

@@ -0,0 +1,21 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2021 Facebook */
#ifndef __LIBBPF_STRSET_H
#define __LIBBPF_STRSET_H
#include <stdbool.h>
#include <stddef.h>
struct strset;
struct strset *strset__new(size_t max_data_sz, const char *init_data, size_t init_data_sz);
void strset__free(struct strset *set);
const char *strset__data(const struct strset *set);
size_t strset__data_size(const struct strset *set);
int strset__find_str(struct strset *set, const char *s);
int strset__add_str(struct strset *set, const char *s);
#endif /* __LIBBPF_STRSET_H */
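An internal-usage sketch of the string set; it would live next to strset.c, reuse its includes, and the printed offsets are illustrative only:

#include <stdio.h>
#include <linux/err.h>
#include "strset.h"

static int strset_demo(void)
{
	struct strset *set = strset__new(64 * 1024, NULL, 0);
	int off1, off2;

	if (IS_ERR(set))
		return PTR_ERR(set);

	off1 = strset__add_str(set, "foo");
	off2 = strset__add_str(set, "foo");	/* duplicate: same offset back */
	printf("off1=%d off2=%d data=%zu bytes\n",
	       off1, off2, strset__data_size(set));

	strset__free(set);
	return 0;
}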

950
src/xsk.c

File diff suppressed because it is too large Load Diff

118
src/xsk.h
View File

@@ -3,7 +3,8 @@
/*
* AF_XDP user-space access library.
*
* Copyright(c) 2018 - 2019 Intel Corporation.
* Copyright (c) 2018 - 2019 Intel Corporation.
* Copyright (c) 2019 Facebook
*
* Author(s): Magnus Karlsson <magnus.karlsson@intel.com>
*/
@@ -13,15 +14,80 @@
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <linux/if_xdp.h>
#include "libbpf.h"
#include "libbpf_util.h"
#ifdef __cplusplus
extern "C" {
#endif
/* Load-Acquire Store-Release barriers used by the XDP socket
* library. The following macros should *NOT* be considered part of
* the xsk.h API, and are subject to change at any time.
*
* LIBRARY INTERNAL
*/
#define __XSK_READ_ONCE(x) (*(volatile typeof(x) *)&x)
#define __XSK_WRITE_ONCE(x, v) (*(volatile typeof(x) *)&x) = (v)
#if defined(__i386__) || defined(__x86_64__)
# define libbpf_smp_store_release(p, v) \
do { \
asm volatile("" : : : "memory"); \
__XSK_WRITE_ONCE(*p, v); \
} while (0)
# define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
asm volatile("" : : : "memory"); \
___p1; \
})
#elif defined(__aarch64__)
# define libbpf_smp_store_release(p, v) \
asm volatile ("stlr %w1, %0" : "=Q" (*p) : "r" (v) : "memory")
# define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1; \
asm volatile ("ldar %w0, %1" \
: "=r" (___p1) : "Q" (*p) : "memory"); \
___p1; \
})
#elif defined(__riscv)
# define libbpf_smp_store_release(p, v) \
do { \
asm volatile ("fence rw,w" : : : "memory"); \
__XSK_WRITE_ONCE(*p, v); \
} while (0)
# define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
asm volatile ("fence r,rw" : : : "memory"); \
___p1; \
})
#endif
#ifndef libbpf_smp_store_release
#define libbpf_smp_store_release(p, v) \
do { \
__sync_synchronize(); \
__XSK_WRITE_ONCE(*p, v); \
} while (0)
#endif
#ifndef libbpf_smp_load_acquire
#define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
__sync_synchronize(); \
___p1; \
})
#endif
/* LIBRARY INTERNAL -- END */
/* Do not access these members directly. Use the functions below. */
#define DEFINE_XSK_RING(name) \
struct name { \
@@ -96,7 +162,8 @@ static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb)
* this function. Without this optimization it would have been
* free_entries = r->cached_prod - r->cached_cons + r->size.
*/
r->cached_cons = *r->consumer + r->size;
r->cached_cons = libbpf_smp_load_acquire(r->consumer);
r->cached_cons += r->size;
return r->cached_cons - r->cached_prod;
}
@@ -106,15 +173,14 @@ static inline __u32 xsk_cons_nb_avail(struct xsk_ring_cons *r, __u32 nb)
__u32 entries = r->cached_prod - r->cached_cons;
if (entries == 0) {
r->cached_prod = *r->producer;
r->cached_prod = libbpf_smp_load_acquire(r->producer);
entries = r->cached_prod - r->cached_cons;
}
return (entries > nb) ? nb : entries;
}
static inline size_t xsk_ring_prod__reserve(struct xsk_ring_prod *prod,
size_t nb, __u32 *idx)
static inline __u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx)
{
if (xsk_prod_nb_free(prod, nb) < nb)
return 0;
@@ -125,27 +191,19 @@ static inline size_t xsk_ring_prod__reserve(struct xsk_ring_prod *prod,
return nb;
}
static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, size_t nb)
static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb)
{
/* Make sure everything has been written to the ring before indicating
* this to the kernel by writing the producer pointer.
*/
libbpf_smp_wmb();
*prod->producer += nb;
libbpf_smp_store_release(prod->producer, *prod->producer + nb);
}
static inline size_t xsk_ring_cons__peek(struct xsk_ring_cons *cons,
size_t nb, __u32 *idx)
static inline __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx)
{
size_t entries = xsk_cons_nb_avail(cons, nb);
__u32 entries = xsk_cons_nb_avail(cons, nb);
if (entries > 0) {
/* Make sure we do not speculatively read the data before
* we have received the packet buffers from the ring.
*/
libbpf_smp_rmb();
*idx = cons->cached_cons;
cons->cached_cons += entries;
}
@@ -153,14 +211,18 @@ static inline size_t xsk_ring_cons__peek(struct xsk_ring_cons *cons,
return entries;
}
static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons, size_t nb)
static inline void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb)
{
cons->cached_cons -= nb;
}
static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb)
{
/* Make sure data has been read before indicating we are done
* with the entries by updating the consumer pointer.
*/
libbpf_smp_rwmb();
libbpf_smp_store_release(cons->consumer, *cons->consumer + nb);
*cons->consumer += nb;
}
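A hedged sketch of the consumer side built on the accessors above; the rx ring is assumed to come from xsk_socket__create(), the batch size of 64 is arbitrary, and xsk_ring_cons__rx_desc() from the same header is assumed for descriptor access:

static void drain_rx(struct xsk_ring_cons *rx)
{
	__u32 idx = 0, i;
	__u32 n = xsk_ring_cons__peek(rx, 64, &idx);

	for (i = 0; i < n; i++) {
		const struct xdp_desc *desc = xsk_ring_cons__rx_desc(rx, idx + i);

		/* process desc->addr / desc->len here */
		(void)desc;
	}
	if (n)
		xsk_ring_cons__release(rx, n);
}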
static inline void *xsk_umem__get_data(void *umem_area, __u64 addr)
@@ -201,6 +263,11 @@ struct xsk_umem_config {
__u32 flags;
};
LIBBPF_API int xsk_setup_xdp_prog(int ifindex,
int *xsks_map_fd);
LIBBPF_API int xsk_socket__update_xskmap(struct xsk_socket *xsk,
int xsks_map_fd);
/* Flags for the libbpf_flags field. */
#define XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD (1 << 0)
@@ -234,6 +301,15 @@ LIBBPF_API int xsk_socket__create(struct xsk_socket **xsk,
struct xsk_ring_cons *rx,
struct xsk_ring_prod *tx,
const struct xsk_socket_config *config);
LIBBPF_API int
xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
const char *ifname,
__u32 queue_id, struct xsk_umem *umem,
struct xsk_ring_cons *rx,
struct xsk_ring_prod *tx,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_socket_config *config);
/* Returns 0 for success and -EBUSY if the umem is still in use. */
LIBBPF_API int xsk_umem__delete(struct xsk_umem *umem);

View File

@@ -2,11 +2,11 @@
PHASES=(${@:-SETUP RUN RUN_ASAN CLEANUP})
DEBIAN_RELEASE="${DEBIAN_RELEASE:-testing}"
CONT_NAME="${CONT_NAME:-debian-$DEBIAN_RELEASE-$RANDOM}"
CONT_NAME="${CONT_NAME:-libbpf-debian-$DEBIAN_RELEASE}"
ENV_VARS="${ENV_VARS:-}"
DOCKER_RUN="${DOCKER_RUN:-docker run}"
REPO_ROOT="${REPO_ROOT:-$PWD}"
ADDITIONAL_DEPS=(clang pkg-config gcc-8)
ADDITIONAL_DEPS=(clang pkg-config gcc-10)
CFLAGS="-g -O2 -Werror -Wall"
function info() {
@@ -18,10 +18,10 @@ function error() {
}
function docker_exec() {
docker exec $ENV_VARS -it $CONT_NAME "$@"
docker exec $ENV_VARS $CONT_NAME "$@"
}
set -e
set -eu
source "$(dirname $0)/travis_wait.bash"
@@ -30,32 +30,42 @@ for phase in "${PHASES[@]}"; do
SETUP)
info "Setup phase"
info "Using Debian $DEBIAN_RELEASE"
docker --version
docker pull debian:$DEBIAN_RELEASE
info "Starting container $CONT_NAME"
$DOCKER_RUN -v $REPO_ROOT:/build:rw \
-w /build --privileged=true --name $CONT_NAME \
-dit --net=host debian:$DEBIAN_RELEASE /bin/bash
echo -e "::group::Build Env Setup"
docker_exec bash -c "echo deb-src http://deb.debian.org/debian $DEBIAN_RELEASE main >>/etc/apt/sources.list"
docker_exec apt-get -y update
docker_exec apt-get -y build-dep libelf-dev
docker_exec apt-get -y install libelf-dev
docker_exec apt-get -y install "${ADDITIONAL_DEPS[@]}"
docker_exec apt-get -y install aptitude
docker_exec aptitude -y build-dep libelf-dev
docker_exec aptitude -y install libelf-dev
docker_exec aptitude -y install "${ADDITIONAL_DEPS[@]}"
echo -e "::endgroup::"
;;
RUN|RUN_CLANG|RUN_GCC8|RUN_ASAN|RUN_CLANG_ASAN|RUN_GCC8_ASAN)
RUN|RUN_CLANG|RUN_GCC10|RUN_ASAN|RUN_CLANG_ASAN|RUN_GCC10_ASAN)
CC="cc"
if [[ "$phase" = *"CLANG"* ]]; then
ENV_VARS="-e CC=clang -e CXX=clang++"
CC="clang"
elif [[ "$phase" = *"GCC8"* ]]; then
ENV_VARS="-e CC=gcc-8 -e CXX=g++-8"
CC="gcc-8"
elif [[ "$phase" = *"GCC10"* ]]; then
ENV_VARS="-e CC=gcc-10 -e CXX=g++-10"
CC="gcc-10"
CFLAGS="${CFLAGS} -Wno-stringop-truncation"
else
CFLAGS="${CFLAGS} -Wno-stringop-truncation"
fi
if [[ "$phase" = *"ASAN"* ]]; then
CFLAGS="${CFLAGS} -fsanitize=address,undefined"
fi
docker_exec mkdir build install
docker_exec ${CC:-cc} --version
docker_exec ${CC} --version
info "build"
docker_exec make CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
docker_exec make -j$((4*$(nproc))) CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
info "ldd build/libbpf.so:"
docker_exec ldd build/libbpf.so
if ! docker_exec ldd build/libbpf.so | grep -q libelf; then
@@ -63,8 +73,9 @@ for phase in "${PHASES[@]}"; do
exit 1
fi
info "install"
docker_exec make -C src OBJDIR=../build DESTDIR=../install install
docker_exec rm -rf build install
docker_exec make -j$((4*$(nproc))) -C src OBJDIR=../build DESTDIR=../install install
info "link binary"
docker_exec bash -c "CFLAGS=\"${CFLAGS}\" ./travis-ci/managers/test_compile.sh"
;;
CLEANUP)
info "Cleanup phase"

View File

@@ -0,0 +1,14 @@
#!/bin/bash
set -euox pipefail
CFLAGS=${CFLAGS:-}
cat << EOF > main.c
#include <bpf/libbpf.h>
int main() {
return bpf_object__open(0) < 0;
}
EOF
# static linking
${CC:-cc} ${CFLAGS} -o main -I./install/usr/include main.c ./build/libbpf.a -lelf -lz

23
travis-ci/managers/ubuntu.sh Executable file
View File

@@ -0,0 +1,23 @@
#!/bin/bash
set -eux
RELEASE="focal"
apt-get update
apt-get install -y pkg-config
source "$(dirname $0)/travis_wait.bash"
cd $REPO_ROOT
CFLAGS="-g -O2 -Werror -Wall -fsanitize=address,undefined -Wno-stringop-truncation"
mkdir build install
cc --version
make -j$((4*$(nproc))) CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
ldd build/libbpf.so
if ! ldd build/libbpf.so | grep -q libelf; then
echo "FAIL: No reference to libelf.so in libbpf.so!"
exit 1
fi
make -j$((4*$(nproc))) -C src OBJDIR=../build DESTDIR=../install install
CFLAGS=${CFLAGS} $(dirname $0)/test_compile.sh

View File

@@ -1,23 +0,0 @@
#!/bin/bash
set -e
set -x
apt-get update
apt-get -y build-dep libelf-dev
apt-get install -y libelf-dev
source "$(dirname $0)/travis_wait.bash"
cd $REPO_ROOT
CFLAGS="-g -O2 -Werror -Wall -fsanitize=address,undefined"
mkdir build install
cc --version
make CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
ldd build/libbpf.so
if ! ldd build/libbpf.so | grep -q libelf; then
echo "FAIL: No reference to libelf.so in libbpf.so!"
exit 1
fi
make -C src OBJDIR=../build DESTDIR=../install install
rm -rf build install

View File

@@ -0,0 +1,34 @@
#!/bin/bash
set -eu
source $(cd $(dirname $0) && pwd)/helpers.sh
CWD=$(pwd)
REPO_PATH=$1
PAHOLE_ORIGIN=${PAHOLE_ORIGIN:-https://git.kernel.org/pub/scm/devel/pahole/pahole.git}
PAHOLE_BRANCH=${PAHOLE_BRANCH:-master}
travis_fold start build_pahole "Building pahole ${PAHOLE_ORIGIN} ${PAHOLE_BRANCH}"
mkdir -p ${REPO_PATH}
cd ${REPO_PATH}
git init
git remote add origin ${PAHOLE_ORIGIN}
git fetch origin
git checkout ${PAHOLE_BRANCH}
# temporary work-around to bump pahole to 1.22 before it is officially released
sed -i 's/DDWARVES_MINOR_VERSION=21/DDWARVES_MINOR_VERSION=22/' CMakeLists.txt
mkdir -p build
cd build
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -D__LIB=lib ..
make -j$((4*$(nproc))) all
sudo make install
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}:/usr/local/lib
ldd $(which pahole)
pahole --version
travis_fold end build_pahole


@@ -0,0 +1,42 @@
#!/bin/bash
set -euo pipefail
source $(cd $(dirname $0) && pwd)/helpers.sh
travis_fold start prepare_selftests "Building selftests"
sudo apt-get -y install python-docutils # for rst2man
LLVM_VER=14
LIBBPF_PATH="${REPO_ROOT}"
PREPARE_SELFTESTS_SCRIPT=${VMTEST_ROOT}/prepare_selftests-${KERNEL}.sh
if [ -f "${PREPARE_SELFTESTS_SCRIPT}" ]; then
(cd "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" && ${PREPARE_SELFTESTS_SCRIPT})
fi
if [[ "${KERNEL}" = 'LATEST' ]]; then
VMLINUX_H=
else
VMLINUX_H=${VMTEST_ROOT}/vmlinux.h
fi
make \
CLANG=clang-${LLVM_VER} \
LLC=llc-${LLVM_VER} \
LLVM_STRIP=llvm-strip-${LLVM_VER} \
VMLINUX_BTF="${VMLINUX_BTF}" \
VMLINUX_H=${VMLINUX_H} \
-C "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" \
-j $((4*$(nproc))) >/dev/null
mkdir ${LIBBPF_PATH}/selftests
cp -R "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" \
${LIBBPF_PATH}/selftests
cd ${LIBBPF_PATH}
rm selftests/bpf/.gitignore
git add selftests
git add "${VMTEST_ROOT}/configs/blacklist"
travis_fold end prepare_selftests


@@ -0,0 +1,54 @@
#!/bin/bash
set -eu
source $(cd $(dirname $0) && pwd)/helpers.sh
CWD=$(pwd)
LIBBPF_PATH=$(pwd)
REPO_PATH=$1
KERNEL_ORIGIN=${KERNEL_ORIGIN:-https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git}
KERNEL_BRANCH=${KERNEL_BRANCH:-CHECKPOINT}
if [[ "${KERNEL_BRANCH}" = 'CHECKPOINT' ]]; then
echo "using CHECKPOINT sha1"
LINUX_SHA=$(cat ${LIBBPF_PATH}/CHECKPOINT-COMMIT)
else
echo "using ${KERNEL_BRANCH} sha1"
LINUX_SHA=$(git ls-remote ${KERNEL_ORIGIN} ${KERNEL_BRANCH} | awk '{print $1}')
fi
SNAPSHOT_URL=${KERNEL_ORIGIN}/snapshot/bpf-next-${LINUX_SHA}.tar.gz
echo REPO_PATH = ${REPO_PATH}
echo KERNEL_ORIGIN = ${KERNEL_ORIGIN}
echo LINUX_SHA = ${LINUX_SHA}
echo SNAPSHOT_URL = ${SNAPSHOT_URL}
if [ ! -d "${REPO_PATH}" ]; then
echo
travis_fold start pull_kernel_srcs "Fetching kernel sources"
mkdir -p $(dirname "${REPO_PATH}")
cd $(dirname "${REPO_PATH}")
# attempt to fetch desired bpf-next repo snapshot
if wget -nv ${SNAPSHOT_URL} && tar xf bpf-next-${LINUX_SHA}.tar.gz --totals ; then
mv bpf-next-${LINUX_SHA} $(basename ${REPO_PATH})
else
# but fallback to git fetch approach if that fails
mkdir -p $(basename ${REPO_PATH})
cd $(basename ${REPO_PATH})
git init
git remote add bpf-next ${KERNEL_ORIGIN}
# try shallow clone first
git fetch --depth 32 bpf-next
# check if desired SHA exists
if ! git cat-file -e ${LINUX_SHA}^{commit} ; then
# if not, fetch all of bpf-next; slow and painful
git fetch bpf-next
fi
git reset --hard ${LINUX_SHA}
fi
travis_fold end pull_kernel_srcs
fi


@@ -0,0 +1,8 @@
INDEX https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/INDEX
libbpf-vmtest-rootfs-2020.09.27.tar.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/libbpf-vmtest-rootfs-2020.09.27.tar.zst
vmlinux-4.9.0.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinux-4.9.0.zst
vmlinux-5.5.0-rc6.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinux-5.5.0-rc6.zst
vmlinux-5.5.0.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinux-5.5.0.zst
vmlinuz-5.5.0-rc6 https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-5.5.0-rc6
vmlinuz-5.5.0 https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-5.5.0
vmlinuz-4.9.0 https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-4.9.0
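This INDEX maps artifact names to download URLs; run.sh (shown further below) reads it as tab-separated name/URL pairs into an associative array. A hedged example of fetching one entry by hand (the awk/curl pipeline is illustrative and not part of the CI scripts; the path assumes the file lives under configs/):

  url=$(awk -v n=vmlinuz-5.5.0 '$1 == n { print $2 }' configs/INDEX)
  curl -Lf -o vmlinuz-5.5.0 "$url"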


@@ -0,0 +1,114 @@
# PERMANENTLY DISABLED
align # verifier output format changed
atomics # new atomic operations (v5.12+)
atomic_bounds # new atomic operations (v5.12+)
bind_perm # changed semantics of return values (v5.12+)
bpf_cookie # 5.15+
bpf_iter # bpf_iter support is missing
bpf_obj_id # bpf_link support missing for GET_OBJ_INFO, GET_FD_BY_ID, etc
bpf_tcp_ca # STRUCT_OPS is missing
btf_map_in_map # inner map leak fixed in 5.8
btf_skc_cls_ingress # v5.10+ functionality
cg_storage_multi # v5.9+ functionality
cgroup_attach_multi # BPF_F_REPLACE_PROG missing
cgroup_link # LINK_CREATE is missing
cgroup_skb_sk_lookup # bpf_sk_lookup_tcp() helper is missing
check_mtu # missing BPF helper (v5.12+)
cls_redirect # bpf_csum_level() helper is missing
connect_force_port # cgroup/get{peer,sock}name{4,6} support is missing
d_path # v5.10+ feature
enable_stats # BPF_ENABLE_STATS support is missing
fentry_fexit # bpf_prog_test_tracing missing
fentry_test # bpf_prog_test_tracing missing
fexit_bpf2bpf # freplace is missing
fexit_sleep # relies on bpf_trampoline fix in 5.12+
fexit_test # bpf_prog_test_tracing missing
flow_dissector # bpf_link-based flow dissector is in 5.8+
flow_dissector_reattach
for_each # v5.12+
get_func_ip_test # v5.15+
get_stack_raw_tp # exercising BPF verifier bug causing infinite loop
hash_large_key # v5.11+
ima # v5.11+
kfree_skb # 32-bit pointer arith in test_pkt_access
ksyms # __start_BTF has different name
kfunc_call # v5.13+
link_pinning # bpf_link is missing
linked_vars # v5.13+
load_bytes_relative # new functionality in 5.8
lookup_and_delete # v5.14+
map_init # per-CPU LRU missing
map_ptr # test uses BPF_MAP_TYPE_RINGBUF, added in 5.8
metadata # v5.10+
migrate_reuseport # v5.14+
mmap # 5.5 kernel is too permissive with re-mmaping
modify_return # fmod_ret support is missing
module_attach # module BTF support missing (v5.11+)
netcnt
netns_cookie # v5.15+
ns_current_pid_tgid # bpf_get_ns_current_pid_tgid() helper is missing
pe_preserve_elems # v5.10+
perf_branches # bpf_read_branch_records() helper is missing
perf_link # v5.15+
pkt_access # 32-bit pointer arith in test_pkt_access
probe_read_user_str # kernel bug with garbage bytes at the end
prog_run_xattr # 32-bit pointer arith in test_pkt_access
raw_tp_test_run # v5.10+
recursion # v5.12+
ringbuf # BPF_MAP_TYPE_RINGBUF is supported in 5.8+
# bug in verifier w/ tracking references
#reference_tracking/classifier/sk_lookup_success
reference_tracking
select_reuseport # UDP support is missing
send_signal # bpf_send_signal_thread() helper is missing
sk_assign # bpf_sk_assign helper missing
sk_lookup # v5.9+
sk_storage_tracing # missing bpf_sk_storage_get() helper
skb_ctx # ctx_{size, }_{in, out} in BPF_PROG_TEST_RUN is missing
skb_helpers # helpers added in 5.8+
snprintf # v5.13+
snprintf_btf # v5.10+
sock_fields # v5.10+
socket_cookie # v5.12+
sockmap_basic # uses new socket fields, 5.8+
sockmap_listen # no listen socket support in SOCKMAP
sockopt_sk
sockopt_qos_to_cc # v5.15+
stacktrace_build_id # v5.9+
stack_var_off # v5.12+
syscall # v5.14+
task_local_storage # v5.12+
task_pt_regs # v5.15+
tcp_hdr_options # v5.10+, new TCP header options feature in BPF
tcpbpf_user # LINK_CREATE is missing
tc_redirect # v5.14+
test_bpffs # v5.10+, new CONFIG_BPF_PRELOAD=y and CONFIG_BPF_PRELOAD_UMD=y|m
test_bprm_opts # v5.11+
test_global_funcs # kernel doesn't support BTF linkage=global on FUNCs
test_local_storage # v5.10+ feature
test_lsm # no BPF_LSM support
test_overhead # no fmod_ret support
test_profiler # needs verifier logic improvements from v5.10+
test_skb_pkt_end # v5.11+
timer # v5.15+
timer_mim # v5.15+
trace_ext # v5.10+
trace_printk # v5.14+
trampoline_count # v5.12+ have lower allowed limits
udp_limit # no cgroup/sock_release BPF program type (5.9+)
varlen # verifier bug fixed in later kernels
vmlinux # hrtimer_nanosleep() signature changed incompatibly
xdp_adjust_tail # new XDP functionality added in 5.8
xdp_attach # IFLA_XDP_EXPECTED_FD support is missing
xdp_bonding # v5.15+
xdp_bpf2bpf # freplace is missing
xdp_context_test_run # v5.15+
xdp_cpumap_attach # v5.9+
xdp_devmap_attach # new feature in 5.8
xdp_link # v5.9+
# SUBTESTS FAILING (block entire test until blocking subtests works properly)
btf # "size check test", "func (Non zero vlen)"
tailcalls # tailcall_bpf2bpf_1, tailcall_bpf2bpf_2, tailcall_bpf2bpf_3


@@ -0,0 +1,5 @@
# TEMPORARY
get_stack_raw_tp # spams with kernel warnings until next bpf -> bpf-next merge
stacktrace_build_id_nmi
stacktrace_build_id
task_fd_query_rawtp

File diff suppressed because it is too large


@@ -0,0 +1,7 @@
# btf_dump -- need to disable data dump sub-tests
core_retro
cpu_mask
hashmap
perf_buffer
section_names


@@ -0,0 +1,55 @@
attach_probe
autoload
bpf_verif_scale
cgroup_attach_autodetach
cgroup_attach_override
core_autosize
core_extern
core_read_macros
core_reloc
core_retro
cpu_mask
endian
fexit_stress
get_branch_snapshot
get_stackid_cannot_attach
global_data
global_data_init
global_func_args
hashmap
l4lb_all
linked_funcs
linked_maps
map_lock
obj_name
perf_buffer
perf_event_stackmap
pinning
pkt_md_access
probe_user
queue_stack_map
raw_tp_writable_reject_nbd_invalid
raw_tp_writable_test_run
rdonly_maps
section_names
signal_pending
skeleton
sockmap_ktls
sockopt
sockopt_inherit
sockopt_multi
spinlock
stacktrace_map
stacktrace_map_raw_tp
static_linked
subprogs
task_fd_query_rawtp
task_fd_query_tp
tc_bpf
tcp_estats
tcp_rtt
tp_attach_query
xdp
xdp_info
xdp_noinline
xdp_perf


@@ -0,0 +1,24 @@
# $1 - start or end
# $2 - fold identifier, no spaces
# $3 - fold section description
travis_fold() {
local YELLOW='\033[1;33m'
local NOCOLOR='\033[0m'
if [ -z ${GITHUB_WORKFLOW+x} ]; then
echo travis_fold:$1:$2
if [ ! -z "${3:-}" ]; then
echo -e "${YELLOW}$3${NOCOLOR}"
fi
echo
else
if [ $1 = "start" ]; then
line="::group::$2"
if [ ! -z "${3:-}" ]; then
line="$line - ${YELLOW}$3${NOCOLOR}"
fi
else
line="::endgroup::"
fi
echo -e "$line"
fi
}
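For reference, a minimal sketch of how the other scripts in this change use this helper (the group name and command are placeholders):

  travis_fold start build_example "Building example"
  make -j"$(nproc)"
  travis_fold end build_example

Under Travis this prints travis_fold:start/end markers; when GITHUB_WORKFLOW is set it prints ::group::/::endgroup:: annotations instead.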

travis-ci/vmtest/mkrootfs.sh (executable file, 158 lines)

@@ -0,0 +1,158 @@
#!/bin/bash
# This script is based on drgn script for generating Arch Linux bootstrap
# images.
# https://github.com/osandov/drgn/blob/master/scripts/vmtest/mkrootfs.sh
set -euo pipefail
usage () {
USAGE_STRING="usage: $0 [NAME]
$0 -h
Build an Arch Linux root filesystem image for testing libbpf in a virtual
machine.
The image is generated as a zstd-compressed tarball.
This must be run as root, as most of the installation is done in a chroot.
Arguments:
NAME name of generated image file (default:
libbpf-vmtest-rootfs-\$DATE.tar.zst)
Options:
-h display this help message and exit"
case "$1" in
out)
echo "$USAGE_STRING"
exit 0
;;
err)
echo "$USAGE_STRING" >&2
exit 1
;;
esac
}
while getopts "h" OPT; do
case "$OPT" in
h)
usage out
;;
*)
usage err
;;
esac
done
if [[ $OPTIND -eq $# ]]; then
NAME="${!OPTIND}"
elif [[ $OPTIND -gt $# ]]; then
NAME="libbpf-vmtest-rootfs-$(date +%Y.%m.%d).tar.zst"
else
usage err
fi
pacman_conf=
root=
trap 'rm -rf "$pacman_conf" "$root"' EXIT
pacman_conf="$(mktemp -p "$PWD")"
cat > "$pacman_conf" << "EOF"
[options]
Architecture = x86_64
CheckSpace
SigLevel = Required DatabaseOptional
[core]
Include = /etc/pacman.d/mirrorlist
[extra]
Include = /etc/pacman.d/mirrorlist
[community]
Include = /etc/pacman.d/mirrorlist
EOF
root="$(mktemp -d -p "$PWD")"
packages=(
busybox
# libbpf dependencies.
libelf
zlib
# selftests test_progs dependencies.
binutils
elfutils
glibc
iproute2
# selftests test_verifier dependencies.
libcap
)
pacstrap -C "$pacman_conf" -cGM "$root" "${packages[@]}"
# Remove unnecessary files from the chroot.
# We don't need the pacman databases anymore.
rm -rf "$root/var/lib/pacman/sync/"
# We don't need D, Fortran, or Go.
rm -f "$root/usr/lib/libgdruntime."* \
"$root/usr/lib/libgphobos."* \
"$root/usr/lib/libgfortran."* \
"$root/usr/lib/libgo."*
# We don't need any documentation.
rm -rf "$root/usr/share/{doc,help,man,texinfo}"
chroot "${root}" /bin/busybox --install
cat > "$root/etc/inittab" << "EOF"
::sysinit:/etc/init.d/rcS
::ctrlaltdel:/sbin/reboot
::shutdown:/sbin/swapoff -a
::shutdown:/bin/umount -a -r
::restart:/sbin/init
EOF
chmod 644 "$root/etc/inittab"
mkdir -m 755 "$root/etc/init.d" "$root/etc/rcS.d"
cat > "$root/etc/rcS.d/S10-mount" << "EOF"
#!/bin/sh
set -eux
/bin/mount proc /proc -t proc
# Mount devtmpfs if not mounted
if [[ -z $(/bin/mount -l -t devtmpfs) ]]; then
/bin/mount devtmpfs /dev -t devtmpfs
fi
/bin/mount sysfs /sys -t sysfs
/bin/mount bpffs /sys/fs/bpf -t bpf
/bin/mount debugfs /sys/kernel/debug -t debugfs
echo 'Listing currently mounted file systems'
/bin/mount
EOF
chmod 755 "$root/etc/rcS.d/S10-mount"
cat > "$root/etc/rcS.d/S40-network" << "EOF"
#!/bin/sh
set -eux
ip link set lo up
EOF
chmod 755 "$root/etc/rcS.d/S40-network"
cat > "$root/etc/init.d/rcS" << "EOF"
#!/bin/sh
set -eux
for path in /etc/rcS.d/S*; do
[ -x "$path" ] && "$path"
done
EOF
chmod 755 "$root/etc/init.d/rcS"
chmod 755 "$root"
tar -C "$root" -c . | zstd -T0 -19 -o "$NAME"
chmod 644 "$NAME"
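A usage sketch (the script must run as root; the output name below is just an example of the default naming scheme):

  sudo ./travis-ci/vmtest/mkrootfs.sh libbpf-vmtest-rootfs-2021.10.19.tar.zst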


@@ -0,0 +1,3 @@
#!/bin/bash
printf "all:\n\ttouch bpf_testmod.ko\n\nclean:\n" > bpf_testmod/Makefile


@@ -0,0 +1,3 @@
#!/bin/bash
printf "all:\n\ttouch bpf_testmod.ko\n\nclean:\n" > bpf_testmod/Makefile


@@ -0,0 +1,21 @@
#!/bin/bash
set -eu
source $(cd $(dirname $0) && pwd)/helpers.sh
REPO_PATH=${1:-}
if [[ ! -z "$REPO_PATH" ]]; then
${VMTEST_ROOT}/checkout_latest_kernel.sh ${REPO_PATH}
cd ${REPO_PATH}
fi
if [[ "${KERNEL}" = 'LATEST' ]]; then
travis_fold start build_kernel "Kernel build"
cp ${VMTEST_ROOT}/configs/latest.config .config
make -j $((4*$(nproc))) olddefconfig all >/dev/null
travis_fold end build_kernel
fi

travis-ci/vmtest/run.sh (executable file, 477 lines)

@@ -0,0 +1,477 @@
#!/bin/bash
set -uo pipefail
trap 'exit 2' ERR
source $(cd $(dirname $0) && pwd)/helpers.sh
usage () {
USAGE_STRING="usage: $0 [-k KERNELRELEASE|-b DIR] [[-r ROOTFSVERSION] [-fo]|-I] [-Si] [-d DIR] IMG
$0 [-k KERNELRELEASE] -l
$0 -h
Run "${PROJECT_NAME}" tests in a virtual machine.
This exits with status 0 on success, 1 if the virtual machine ran successfully
but tests failed, and 2 if we encountered a fatal error.
This script uses sudo to mount and modify the disk image.
Arguments:
IMG path of virtual machine disk image to create
Versions:
-k, --kernel=KERNELRELEASE
kernel release to test. This is a glob pattern; the
newest (sorted by version number) release that matches
the pattern is used (default: newest available release)
-b, --build DIR use the kernel built in the given directory. This option
cannot be combined with -k
-r, --rootfs=ROOTFSVERSION
version of root filesystem to use (default: newest
available version)
Setup:
-f, --force overwrite IMG if it already exists
-o, --one-shot one-shot mode. By default, this script saves a clean copy
of the downloaded root filesystem image and vmlinux and
makes a copy (reflinked, when possible) for executing the
virtual machine. This allows subsequent runs to skip
downloading these files. If this option is given, the
root filesystem image and vmlinux are always
re-downloaded and are not saved. This option implies -f
-s, --setup-cmd setup commands run on VM boot. Whitespace characters
should be escaped with preceding '\'.
-I, --skip-image skip creating the disk image; use the existing one at
IMG. This option cannot be combined with -r, -f, or -o
-S, --skip-source skip copying the source files and init scripts
Miscellaneous:
-i, --interactive interactive mode. Boot the virtual machine into an
interactive shell instead of automatically running tests
-d, --dir=DIR working directory to use for downloading and caching
files (default: current working directory)
-l, --list list available kernel releases instead of running tests.
The list may be filtered with -k
-h, --help display this help message and exit"
case "$1" in
out)
echo "$USAGE_STRING"
exit 0
;;
err)
echo "$USAGE_STRING" >&2
exit 2
;;
esac
}
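For illustration, two hedged invocations based on the options documented above (the kernel pattern, build directory, and image path are examples, not values used by the CI):

  ./travis-ci/vmtest/run.sh -k '5.5*' -d ~ ~/root.img
  ./travis-ci/vmtest/run.sh -b ~/linux-build -o -d ~ ~/root.img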
TEMP=$(getopt -o 'k:b:r:fos:ISid:lh' --long 'kernel:,build:,rootfs:,force,one-shot,setup-cmd:,skip-image,skip-source,interactive,dir:,list,help' -n "$0" -- "$@")
eval set -- "$TEMP"
unset TEMP
unset KERNELRELEASE
unset BUILDDIR
unset ROOTFSVERSION
unset IMG
unset SETUPCMD
FORCE=0
ONESHOT=0
SKIPIMG=0
SKIPSOURCE=0
APPEND=""
DIR="$PWD"
LIST=0
# By default, copy all files that aren't listed in git exclusions.
# This doesn't work well for an entire kernel tree,
# so for a full kernel tree you may need to set SOURCE_FULLCOPY=0.
SOURCE_FULLCOPY=${SOURCE_FULLCOPY:-1}
while true; do
case "$1" in
-k|--kernel)
KERNELRELEASE="$2"
shift 2
;;
-b|--build)
BUILDDIR="$2"
shift 2
;;
-r|--rootfs)
ROOTFSVERSION="$2"
shift 2
;;
-f|--force)
FORCE=1
shift
;;
-o|--one-shot)
ONESHOT=1
FORCE=1
shift
;;
-s|--setup-cmd)
SETUPCMD="$2"
shift 2
;;
-I|--skip-image)
SKIPIMG=1
shift
;;
-S|--skip-source)
SKIPSOURCE=1
shift
;;
-i|--interactive)
APPEND=" single"
shift
;;
-d|--dir)
DIR="$2"
shift 2
;;
-l|--list)
LIST=1
shift
;;
-h|--help)
usage out
;;
--)
shift
break
;;
*)
usage err
;;
esac
done
if [[ -v BUILDDIR ]]; then
if [[ -v KERNELRELEASE ]]; then
usage err
fi
elif [[ ! -v KERNELRELEASE ]]; then
KERNELRELEASE='*'
fi
if [[ $SKIPIMG -ne 0 && ( -v ROOTFSVERSION || $FORCE -ne 0 ) ]]; then
usage err
fi
if (( LIST )); then
if [[ $# -ne 0 || -v BUILDDIR || -v ROOTFSVERSION || $FORCE -ne 0 ||
$SKIPIMG -ne 0 || $SKIPSOURCE -ne 0 || -n $APPEND ]]; then
usage err
fi
else
if [[ $# -ne 1 ]]; then
usage err
fi
IMG="${!OPTIND}"
fi
unset URLS
cache_urls() {
if ! declare -p URLS &> /dev/null; then
# This URL contains a mapping from file names to URLs where
# those files can be downloaded.
declare -gA URLS
while IFS=$'\t' read -r name url; do
URLS["$name"]="$url"
done < <(cat "${VMTEST_ROOT}/configs/INDEX")
fi
}
matching_kernel_releases() {
local pattern="$1"
{
for file in "${!URLS[@]}"; do
if [[ $file =~ ^vmlinux-(.*).zst$ ]]; then
release="${BASH_REMATCH[1]}"
case "$release" in
$pattern)
# sort -V handles rc versions properly
# if we use "~" instead of "-".
echo "${release//-rc/~rc}"
;;
esac
fi
done
} | sort -rV | sed 's/~rc/-rc/g'
}
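A quick illustration of the ~rc trick mentioned in the comment above (versions are hypothetical):

  printf '5.5.0-rc6\n5.5.0\n' | sort -rV   # lists 5.5.0-rc6 first, i.e. treats the rc as newest
  printf '5.5.0~rc6\n5.5.0\n' | sort -rV   # lists 5.5.0 first, as intended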
newest_rootfs_version() {
{
for file in "${!URLS[@]}"; do
if [[ $file =~ ^${PROJECT_NAME}-vmtest-rootfs-(.*)\.tar\.zst$ ]]; then
echo "${BASH_REMATCH[1]}"
fi
done
} | sort -rV | head -1
}
download() {
local file="$1"
cache_urls
if [[ ! -v URLS[$file] ]]; then
echo "$file not found" >&2
return 1
fi
echo "Downloading $file..." >&2
curl -Lf "${URLS[$file]}" "${@:2}"
}
set_nocow() {
touch "$@"
chattr +C "$@" >/dev/null 2>&1 || true
}
cp_img() {
set_nocow "$2"
cp --reflink=auto "$1" "$2"
}
create_rootfs_img() {
local path="$1"
set_nocow "$path"
truncate -s 2G "$path"
mkfs.ext4 -q "$path"
}
download_rootfs() {
local rootfsversion="$1"
local dir="$2"
download "${PROJECT_NAME}-vmtest-rootfs-$rootfsversion.tar.zst" |
zstd -d | sudo tar -C "$dir" -x
}
if (( LIST )); then
cache_urls
matching_kernel_releases "$KERNELRELEASE"
exit 0
fi
if [[ $FORCE -eq 0 && $SKIPIMG -eq 0 && -e $IMG ]]; then
echo "$IMG already exists; use -f to overwrite it or -I to reuse it" >&2
exit 1
fi
# Only go to the network if it's actually a glob pattern.
if [[ -v BUILDDIR ]]; then
KERNELRELEASE="$(make -C "$BUILDDIR" -s kernelrelease)"
elif [[ ! $KERNELRELEASE =~ ^([^\\*?[]|\\[*?[])*\\?$ ]]; then
# We need to cache the list of URLs outside of the command
# substitution, which happens in a subshell.
cache_urls
KERNELRELEASE="$(matching_kernel_releases "$KERNELRELEASE" | head -1)"
if [[ -z $KERNELRELEASE ]]; then
echo "No matching kernel release found" >&2
exit 1
fi
fi
if [[ $SKIPIMG -eq 0 && ! -v ROOTFSVERSION ]]; then
cache_urls
ROOTFSVERSION="$(newest_rootfs_version)"
fi
echo "Kernel release: $KERNELRELEASE" >&2
echo
travis_fold start vmlinux_setup "Preparing Linux image"
if (( SKIPIMG )); then
echo "Not extracting root filesystem" >&2
else
echo "Root filesystem version: $ROOTFSVERSION" >&2
fi
echo "Disk image: $IMG" >&2
tmp=
ARCH_DIR="$DIR/x86_64"
mkdir -p "$ARCH_DIR"
mnt="$(mktemp -d -p "$DIR" mnt.XXXXXXXXXX)"
cleanup() {
if [[ -n $tmp ]]; then
rm -f "$tmp" || true
fi
if mountpoint -q "$mnt"; then
sudo umount "$mnt" || true
fi
if [[ -d "$mnt" ]]; then
rmdir "$mnt" || true
fi
}
trap cleanup EXIT
if [[ -v BUILDDIR ]]; then
vmlinuz="$BUILDDIR/$(make -C "$BUILDDIR" -s image_name)"
else
vmlinuz="${ARCH_DIR}/vmlinuz-${KERNELRELEASE}"
if [[ ! -e $vmlinuz ]]; then
tmp="$(mktemp "$vmlinuz.XXX.part")"
download "vmlinuz-${KERNELRELEASE}" -o "$tmp"
mv "$tmp" "$vmlinuz"
tmp=
fi
fi
# Mount and set up the rootfs image.
if (( ONESHOT )); then
rm -f "$IMG"
create_rootfs_img "$IMG"
sudo mount -o loop "$IMG" "$mnt"
download_rootfs "$ROOTFSVERSION" "$mnt"
else
if (( ! SKIPIMG )); then
rootfs_img="${ARCH_DIR}/${PROJECT_NAME}-vmtest-rootfs-${ROOTFSVERSION}.img"
if [[ ! -e $rootfs_img ]]; then
tmp="$(mktemp "$rootfs_img.XXX.part")"
set_nocow "$tmp"
truncate -s 2G "$tmp"
mkfs.ext4 -q "$tmp"
sudo mount -o loop "$tmp" "$mnt"
download_rootfs "$ROOTFSVERSION" "$mnt"
sudo umount "$mnt"
mv "$tmp" "$rootfs_img"
tmp=
fi
rm -f "$IMG"
cp_img "$rootfs_img" "$IMG"
fi
sudo mount -o loop "$IMG" "$mnt"
fi
# Install vmlinux.
vmlinux="$mnt/boot/vmlinux-${KERNELRELEASE}"
if [[ -v BUILDDIR || $ONESHOT -eq 0 ]]; then
if [[ -v BUILDDIR ]]; then
source_vmlinux="${BUILDDIR}/vmlinux"
else
source_vmlinux="${ARCH_DIR}/vmlinux-${KERNELRELEASE}"
if [[ ! -e $source_vmlinux ]]; then
tmp="$(mktemp "$source_vmlinux.XXX.part")"
download "vmlinux-${KERNELRELEASE}.zst" | zstd -dfo "$tmp"
mv "$tmp" "$source_vmlinux"
tmp=
fi
fi
echo "Copying vmlinux..." >&2
sudo rsync -cp --chmod 0644 "$source_vmlinux" "$vmlinux"
else
# We could use "sudo zstd -o", but let's not run zstd as root with
# input from the internet.
download "vmlinux-${KERNELRELEASE}.zst" |
zstd -d | sudo tee "$vmlinux" > /dev/null
sudo chmod 644 "$vmlinux"
fi
travis_fold end vmlinux_setup
REPO_PATH="${SELFTEST_REPO_PATH:-travis-ci/vmtest/bpf-next}"
LIBBPF_PATH="${REPO_ROOT}" \
VMTEST_ROOT="${VMTEST_ROOT}" \
REPO_PATH="${REPO_PATH}" \
VMLINUX_BTF=${vmlinux} ${VMTEST_ROOT}/build_selftests.sh
travis_fold start bpftool_checks "Running bpftool checks..."
if [[ "${KERNEL}" = 'LATEST' ]]; then
"${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf/test_bpftool_synctypes.py" && \
echo "Consistency checks passed successfully."
else
echo "Consistency checks skipped."
fi
travis_fold end bpftool_checks
travis_fold start vm_init "Starting virtual machine..."
if (( SKIPSOURCE )); then
echo "Not copying source files..." >&2
else
echo "Copying source files..." >&2
# Copy the source files in.
sudo mkdir -p -m 0755 "$mnt/${PROJECT_NAME}"
if [[ "${SOURCE_FULLCOPY}" == "1" ]]; then
git ls-files -z | sudo rsync --files-from=- -0cpt . "$mnt/${PROJECT_NAME}"
else
sudo mkdir -p -m 0755 ${mnt}/${PROJECT_NAME}/{selftests,travis-ci}
tree --du -shaC "${REPO_ROOT}/selftests/bpf"
sudo rsync -avm "${REPO_ROOT}/selftests/bpf" "$mnt/${PROJECT_NAME}/selftests/"
sudo rsync -avm "${REPO_ROOT}/travis-ci/vmtest" "$mnt/${PROJECT_NAME}/travis-ci/"
fi
fi
setup_script="#!/bin/sh
echo 'Skipping setup commands'
echo 0 > /exitstatus
chmod 644 /exitstatus"
# Create the init scripts.
if [[ ! -z "${SETUPCMD:-}" ]]; then
# Unescape whitespace characters.
setup_cmd=$(sed 's/\(\\\)\([[:space:]]\)/\2/g' <<< "${SETUPCMD}")
kernel="${KERNELRELEASE}"
if [[ -v BUILDDIR ]]; then kernel='latest'; fi
setup_envvars="export KERNEL=${kernel}"
setup_script=$(printf "#!/bin/sh
set -eux
echo 'Running setup commands'
%s
set +e; %s; exitstatus=\$?; set -e
echo \$exitstatus > /exitstatus
chmod 644 /exitstatus" "${setup_envvars}" "${setup_cmd}")
fi
echo "${setup_script}" | sudo tee "$mnt/etc/rcS.d/S50-run-tests" > /dev/null
sudo chmod 755 "$mnt/etc/rcS.d/S50-run-tests"
fold_shutdown="$(travis_fold start shutdown)"
poweroff_script="#!/bin/sh
echo ${fold_shutdown}
echo -e '\033[1;33mShutdown\033[0m\n'
poweroff"
echo "${poweroff_script}" | sudo tee "$mnt/etc/rcS.d/S99-poweroff" > /dev/null
sudo chmod 755 "$mnt/etc/rcS.d/S99-poweroff"
sudo umount "$mnt"
echo "Starting VM with $(nproc) CPUs..."
if kvm-ok ; then
accel="-cpu kvm64 -enable-kvm"
else
accel="-cpu qemu64 -machine accel=tcg"
fi
qemu-system-x86_64 -nodefaults -display none -serial mon:stdio -no-reboot \
${accel} -smp "$(nproc)" -m 4G \
-drive file="$IMG",format=raw,index=1,media=disk,if=virtio,cache=none \
-kernel "$vmlinuz" -append "root=/dev/vda rw console=ttyS0,115200 kernel.panic=-1 $APPEND"
sudo mount -o loop "$IMG" "$mnt"
if exitstatus="$(cat "$mnt/exitstatus" 2>/dev/null)"; then
printf '\nTests exit status: %s\n' "$exitstatus" >&2
else
printf '\nCould not read tests exit status\n' >&2
exitstatus=1
fi
sudo umount "$mnt"
travis_fold end shutdown
exit "$exitstatus"


@@ -0,0 +1,51 @@
#!/bin/bash
set -euo pipefail
source $(cd $(dirname $0) && pwd)/helpers.sh
test_progs() {
if [[ "${KERNEL}" != '4.9.0' ]]; then
travis_fold start test_progs "Testing test_progs"
./test_progs ${BLACKLIST:+-d$BLACKLIST} ${WHITELIST:+-a$WHITELIST}
travis_fold end test_progs
fi
travis_fold start test_progs-no_alu32 "Testing test_progs-no_alu32"
./test_progs-no_alu32 ${BLACKLIST:+-d$BLACKLIST} ${WHITELIST:+-a$WHITELIST}
travis_fold end test_progs-no_alu32
}
test_maps() {
travis_fold start test_maps "Testing test_maps"
./test_maps
travis_fold end test_maps
}
test_verifier() {
travis_fold start test_verifier "Testing test_verifier"
./test_verifier
travis_fold end test_verifier
}
travis_fold end vm_init
configs_path='libbpf/travis-ci/vmtest/configs'
blacklist_path="$configs_path/blacklist/BLACKLIST-${KERNEL}"
if [[ -s "${blacklist_path}" ]]; then
BLACKLIST=$(cat "${blacklist_path}" | cut -d'#' -f1 | tr -s '[:space:]' ',')
fi
whitelist_path="$configs_path/whitelist/WHITELIST-${KERNEL}"
if [[ -s "${whitelist_path}" ]]; then
WHITELIST=$(cat "${whitelist_path}" | cut -d'#' -f1 | tr -s '[:space:]' ',')
fi
cd libbpf/selftests/bpf
test_progs
if [[ "${KERNEL}" == 'latest' ]]; then
# test_maps
test_verifier
fi
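To illustrate the BLACKLIST/WHITELIST handling above with hypothetical file contents:

  printf 'atomics # v5.12+\nbpf_cookie # v5.15+\n' | cut -d'#' -f1 | tr -s '[:space:]' ','
  # prints: atomics,bpf_cookie,

so test_progs ends up invoked roughly as ./test_progs -datomics,bpf_cookie,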

travis-ci/vmtest/run_vmtest.sh (executable file, 51 lines)

@@ -0,0 +1,51 @@
#!/bin/bash
set -eu
source $(cd $(dirname $0) && pwd)/helpers.sh
VMTEST_SETUPCMD="GITHUB_WORKFLOW=${GITHUB_WORKFLOW:-} PROJECT_NAME=${PROJECT_NAME} ./${PROJECT_NAME}/travis-ci/vmtest/run_selftests.sh"
# If CHECKOUT_KERNEL is 1, the kernel sources are checked out separately;
# if 0, REPO_ROOT is assumed to already be a kernel tree.
CHECKOUT_KERNEL=${CHECKOUT_KERNEL:-1}
echo "KERNEL: $KERNEL"
echo
# Build latest pahole
${VMTEST_ROOT}/build_pahole.sh travis-ci/vmtest/pahole
travis_fold start install_clang "Installing Clang/LLVM"
wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
sudo add-apt-repository "deb http://apt.llvm.org/focal/ llvm-toolchain-focal main"
sudo apt-get update
sudo apt-get install --allow-downgrades -y libc6=2.31-0ubuntu9.2
sudo aptitude install -y g++ libelf-dev
sudo aptitude install -y clang-14 lld-14 llvm-14
travis_fold end install_clang
# Build selftests (and latest kernel, if necessary)
if [[ "$CHECKOUT_KERNEL" == "1" ]]; then
${VMTEST_ROOT}/prepare_selftests.sh travis-ci/vmtest/bpf-next
else
${VMTEST_ROOT}/prepare_selftests.sh
fi
# Escape whitespace characters.
setup_cmd=$(sed 's/\([[:space:]]\)/\\\1/g' <<< "${VMTEST_SETUPCMD}")
sudo adduser "${USER}" kvm
if [[ "${KERNEL}" = 'LATEST' ]]; then
if [[ "$CHECKOUT_KERNEL" == "1" ]]; then
sudo -E sudo -E -u "${USER}" "${VMTEST_ROOT}/run.sh" -b travis-ci/vmtest/bpf-next -o -d ~ -s "${setup_cmd}" ~/root.img
else
sudo -E sudo -E -u "${USER}" "${VMTEST_ROOT}/run.sh" -b "${REPO_ROOT}" -o -d ~ -s "${setup_cmd}" ~/root.img
fi
else
sudo -E sudo -E -u "${USER}" "${VMTEST_ROOT}/run.sh" -k "${KERNEL}*" -o -d ~ -s "${setup_cmd}" ~/root.img
fi

travis-ci/vmtest/vmlinux.h (normal file, 87632 lines)

File diff suppressed because it is too large