Compare commits


323 Commits

Author SHA1 Message Date
Andrii Nakryiko
e8547bd4f7 vmtests: fix selftests checkout script
Fix the script.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-18 11:37:43 -07:00
Andrii Nakryiko
93959e4e43 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   bfdd5aaa54b0a44d9df550fe4c9db7e1470a11b8
Checkpoint bpf-next commit: 06a4ec1d9dc652e17ee3ac2ceb6c7cf6c2b75cdd
Baseline bpf commit:        929e54a989680c6f134b02293732030b897475dc
Checkpoint bpf commit:      3fb1a96a91120877488071a167d26d76be4be977

Andrii Nakryiko (4):
  libbpf: Fix BTF-defined map-in-map initialization on 32-bit host
    arches
  libbpf: Handle BTF pointer sizes more carefully
  libbpf: Enforce 64-bitness of BTF for BPF object files
  libbpf: Fix build on ppc64le architecture

Jean-Philippe Brucker (1):
  libbpf: Handle GCC built-in types for Arm NEON

Toke Høiland-Jørgensen (1):
  libbpf: Prevent overriding errno when logging errors

Yonghong Song (1):
  libbpf: Do not use __builtin_offsetof for offsetof

 src/bpf_helpers.h |  2 +-
 src/btf.c         | 83 +++++++++++++++++++++++++++++++++++++++++++++--
 src/btf.h         |  2 ++
 src/btf_dump.c    | 39 ++++++++++++++++++++--
 src/libbpf.c      | 32 +++++++++++-------
 src/libbpf.map    |  2 ++
 6 files changed, 143 insertions(+), 17 deletions(-)

--
2.24.1
2020-08-18 11:37:43 -07:00
Andrii Nakryiko
7ee1f12f94 libbpf: Fix build on ppc64le architecture
On ppc64le we get the following warning:

  In file included from btf_dump.c:16:0:
  btf_dump.c: In function ‘btf_dump_emit_struct_def’:
  ../include/linux/kernel.h:20:17: error: comparison of distinct pointer types lacks a cast [-Werror]
    (void) (&_max1 == &_max2);  \
                   ^
  btf_dump.c:882:11: note: in expansion of macro ‘max’
      m_sz = max(0LL, btf__resolve_size(d->btf, m->type));
             ^~~

Fix this by explicitly casting to __s64, which is the return type of
btf__resolve_size().
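
Per the description, the fixed expression in btf_dump.c presumably becomes:

    m_sz = max((__s64)0, btf__resolve_size(d->btf, m->type));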

Fixes: 702eddc77a90 ("libbpf: Handle GCC built-in types for Arm NEON")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200818164456.1181661-1-andriin@fb.com
2020-08-18 11:37:43 -07:00
Andrii Nakryiko
ff09ad9dac libbpf: Enforce 64-bitness of BTF for BPF object files
BPF object files always target the 64-bit BPF architecture, so enforce
that at the BTF level as well.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200813204945.1020225-7-andriin@fb.com
2020-08-18 11:37:43 -07:00
Andrii Nakryiko
025fcdc306 libbpf: Handle BTF pointer sizes more carefully
With libbpf and BTF it is pretty common to have libbpf built for one
architecture, while BTF information was generated for a different architecture
(typically, but not always, BPF). In such a case, the size of a pointer might
differ between architectures. libbpf previously always assumed that the BTF
pointer size is the same as the native architecture pointer size, but that
breaks when libbpf is built as a 32-bit library while the BTF is for a 64-bit
architecture.

To solve this, add a heuristic that determines the pointer size by searching
for a `long` or `unsigned long` integer type and using its size as the
pointer size. Also allow overriding the pointer size with a new API,
btf__set_pointer_size(), for cases where the application knows which pointer
size should be used. A user application can check what libbpf "guessed" by
looking at the result of btf__pointer_size(). If it's not 0, then libbpf
successfully determined the pointer size; otherwise the native arch pointer
size will be used.

For cases where BTF is parsed from an ELF file, use the ELF class (32-bit or
64-bit) to determine the pointer size.
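
A minimal usage sketch (the object file name is hypothetical);
btf__pointer_size() returns 0 when libbpf could not determine the size:

    struct btf *btf = btf__parse_elf("prog.bpf.o", NULL);

    if (!libbpf_get_error(btf) && btf__pointer_size(btf) == 0)
        btf__set_pointer_size(btf, 8); /* assume 64-bit BPF pointers */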

Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf")
Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200813204945.1020225-5-andriin@fb.com
2020-08-18 11:37:43 -07:00
Andrii Nakryiko
b3405fcb08 libbpf: Fix BTF-defined map-in-map initialization on 32-bit host arches
Libbpf built in 32-bit mode should be careful not to conflate 64-bit BPF
pointers in the BPF ELF file with host architecture pointers. This patch
fixes incorrect initialization of map-in-map inner map slots caused by that
difference.

Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200813204945.1020225-4-andriin@fb.com
2020-08-18 11:37:43 -07:00
Toke Høiland-Jørgensen
1194953749 libbpf: Prevent overriding errno when logging errors
Turns out there were a few more instances where libbpf didn't save the
errno before writing an error message, causing errno to be overridden by
the printf() return and the error disappearing if logging is enabled.
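
The general pattern of the fix (a sketch, not the exact diff) is to capture
errno before any logging call can clobber it:

    err = -errno;  /* save before pr_warn() overwrites errno */
    pr_warn("failed to create map: %d\n", err);
    return err;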

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200813142905.160381-1-toke@redhat.com
2020-08-18 11:37:43 -07:00
Jean-Philippe Brucker
1d76180057 libbpf: Handle GCC built-in types for Arm NEON
When building Arm NEON (SIMD) code from lib/raid6/neon.uc, GCC emits
DWARF information using a base type "__Poly8_t", which is internal to
GCC and not recognized by Clang. This causes build failures when
building with Clang a vmlinux.h generated from an arm64 kernel that was
built with GCC.

	vmlinux.h:47284:9: error: unknown type name '__Poly8_t'
	typedef __Poly8_t poly8x16_t[16];
	        ^~~~~~~~~

The polyX_t types are defined as unsigned integers in the "Arm C
Language Extension" document (101028_Q220_00_en). Emit typedefs based on
standard integer types for the GCC internal types, similar to those
emitted by Clang.

Including linux/kernel.h to use ARRAY_SIZE() incidentally redefined
max(), causing a build bug due to different types, hence the seemingly
unrelated change.

Reported-by: Jakov Petrina <jakov.petrina@sartura.hr>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200812143909.3293280-1-jean-philippe@linaro.org
2020-08-18 11:37:43 -07:00
Yonghong Song
048bf21dac libbpf: Do not use __builtin_offsetof for offsetof
Commit 5fbc220862fc ("tools/libpf: Add offsetof/container_of macro
in bpf_helpers.h") added a macro offsetof() to get the offset of a
structure member:

   #define offsetof(TYPE, MEMBER)  ((size_t)&((TYPE *)0)->MEMBER)

In certain use cases, the size_t type may not be available, so commit
da7a35062bcc ("libbpf bpf_helpers: Use __builtin_offsetof
for offsetof") changed it to use __builtin_offsetof, which removed
the dependency on size_t (a change I suggested).

But using __builtin_offsetof prevents CO-RE relocation
generation in cases where, e.g., TYPE is annotated with "preserve_access_index"
and a relocation is desirable because the member offset may change
in a different kernel version. So this patch reverts back to
the original macro, but uses "unsigned long" instead of "size_t".
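
Per the description above, the restored macro in bpf_helpers.h effectively
becomes:

   #define offsetof(TYPE, MEMBER)  ((unsigned long)&((TYPE *)0)->MEMBER)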

Fixes: da7a35062bcc ("libbpf bpf_helpers: Use __builtin_offsetof for offsetof")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/bpf/20200811030852.3396929-1-yhs@fb.com
2020-08-18 11:37:43 -07:00
Andrii Nakryiko
e954437a76 travis-ci: flatten build stages to gain more speed ups
Do both builds and selftest runs as part of a single build step. This allows
CI testing to complete faster, as builds happen in parallel with the
"Kernel LATEST + selftests" run.

Also re-enable s390x build.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-10 22:38:26 -07:00
Andrii Nakryiko
c57be0b4d6 vmtests: speed up fetching of bpf-next sources
First attempt to fetch the bpf-next tree from a snapshot, falling back to a
shallow clone, and, if that is not enough, to a full bpf-next clone. This
should both improve speed and (thanks to the full clone fallback) improve
test reliability if libbpf hasn't been synced in a while.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-10 22:31:52 -07:00
Andrii Nakryiko
bf3ab4b0d8 travis-ci: remove s390x build as it fails to be queued by Travis CI
It's been failing for a few days. Comment it out for now.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-09 13:19:59 -07:00
Andrii Nakryiko
663f66decf vmtests: blacklist problematic tests
Blacklist btf_map_in_map permanently for 5.5. bpf_verif_scale is broken on
LATEST due to Clang issues. Do not run the ALU32 flavor of test_progs on
4.9.0, which doesn't support ALU32 yet.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-07 16:37:09 -07:00
Andrii Nakryiko
ed187d0400 vmtest: bump LLVM_VER to 12
Bump LLVM_VER variable used in selftest build to 12.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-07 16:37:09 -07:00
Andrii Nakryiko
80453d4b2d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   3c4f850e8441ac8b3b6dbaa6107604c4199ef01f
Checkpoint bpf-next commit: bfdd5aaa54b0a44d9df550fe4c9db7e1470a11b8
Baseline bpf commit:        5b801dfb7feb2738975d80223efc2fc193e55573
Checkpoint bpf commit:      929e54a989680c6f134b02293732030b897475dc

Andrii Nakryiko (3):
  libbpf: Make destructors more robust by handling ERR_PTR(err) cases
  libbpf: Add bpf_link detach APIs
  libbpf: Add btf__parse_raw() and generic btf__parse() APIs

Daniel T. Lee (1):
  libbf: Fix uninitialized pointer at btf__parse_raw()

Jerry Crunchtime (1):
  libbpf: Fix register in PT_REGS MIPS macros

Yonghong Song (1):
  tools/bpf: Support new uapi for map element bpf iterator

 include/uapi/linux/bpf.h |  20 ++++---
 src/bpf.c                |  13 +++++
 src/bpf.h                |   7 ++-
 src/bpf_tracing.h        |   4 +-
 src/btf.c                | 118 ++++++++++++++++++++++++++-------------
 src/btf.h                |   5 +-
 src/btf_dump.c           |   2 +-
 src/libbpf.c             |  20 ++++---
 src/libbpf.h             |   6 +-
 src/libbpf.map           |   4 ++
 10 files changed, 137 insertions(+), 62 deletions(-)

--
2.24.1
2020-08-07 16:37:09 -07:00
Daniel T. Lee
7f96c4b1d2 libbf: Fix uninitialized pointer at btf__parse_raw()
Recently, commit 94a1fedd63ed ("libbpf: Add btf__parse_raw() and
generic btf__parse() APIs") added a new API to libbpf that allows
parsing BTF from a raw data file (btf__parse_raw()).

That commit introduced a build failure in samples/bpf due to access
of an uninitialized pointer in btf__parse_raw():

    btf.c: In function btf__parse_raw:
    btf.c:625:28: error: btf may be used uninitialized in this function
      625 |  return err ? ERR_PTR(err) : btf;
          |         ~~~~~~~~~~~~~~~~~~~^~~~~

This commit fixes the samples/bpf build failure by initializing the btf
pointer to NULL.

Fixes: 94a1fedd63ed ("libbpf: Add btf__parse_raw() and generic btf__parse() APIs")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200805223359.32109-1-danieltimlee@gmail.com
2020-08-07 16:37:09 -07:00
Yonghong Song
2be293cb4a tools/bpf: Support new uapi for map element bpf iterator
The previous commit adjusted the kernel uapi for the map element bpf
iterator. This patch adjusts the libbpf API for that uapi change. bpftool
and the bpf_iter selftests are changed accordingly.
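
A minimal user-space sketch of the adjusted API (prog and map_fd are assumed
to come from an already-loaded BPF object):

    union bpf_iter_link_info linfo = { .map.map_fd = map_fd };
    DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts,
        .link_info = &linfo,
        .link_info_len = sizeof(linfo));
    struct bpf_link *link = bpf_program__attach_iter(prog, &opts);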

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200805055058.1457623-1-yhs@fb.com
2020-08-07 16:37:09 -07:00
Andrii Nakryiko
a0334e97aa libbpf: Add btf__parse_raw() and generic btf__parse() APIs
Add public APIs to parse BTF from a raw data file (e.g.,
/sys/kernel/btf/vmlinux), as well as a generic btf__parse(), which will try
to determine the correct format, currently either raw or ELF.
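
For example, loading kernel BTF now takes a single call (in this libbpf
version a failed parse returns an ERR_PTR-encoded pointer):

    struct btf *vmlinux_btf = btf__parse("/sys/kernel/btf/vmlinux", NULL);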

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200802013219.864880-2-andriin@fb.com
2020-08-07 16:37:09 -07:00
Andrii Nakryiko
2d97d4097f libbpf: Add bpf_link detach APIs
Add the low-level bpf_link_detach() API, as well as the higher-level
bpf_link__detach() counterpart.
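
A minimal sketch of the intended use (prog is assumed to be set up
elsewhere); detach breaks the attachment while keeping the link object and
its FD alive:

    struct bpf_link *link = bpf_program__attach(prog);

    /* ... */
    bpf_link__detach(link);   /* force-detach; link FD stays valid */
    bpf_link__destroy(link);  /* free link resources */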

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200731182830.286260-3-andriin@fb.com
2020-08-07 16:37:09 -07:00
Jerry Crunchtime
80a52e3252 libbpf: Fix register in PT_REGS MIPS macros
The o32, n32 and n64 calling conventions require the return
value to be stored in $v0, which maps to the $2 register, i.e.,
register 2.
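
Per the description, the fixed return-value macro in bpf_tracing.h
presumably reads:

    #define PT_REGS_RC(x) ((x)->regs[2])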

Fixes: c1932cd ("bpf: Add MIPS support to samples/bpf.")
Signed-off-by: Jerry Crunchtime <jerry.c.t@web.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/43707d31-0210-e8f0-9226-1af140907641@web.de
2020-08-07 16:37:09 -07:00
Andrii Nakryiko
2dc7cbd893 libbpf: Make destructors more robust by handling ERR_PTR(err) cases
Most of libbpf "constructors" on failure return ERR_PTR(err) result encoded as
a pointer. It's a common mistake to eventually pass such malformed pointers
into xxx__destroy()/xxx__free() "destructors". So instead of fixing up
clean up code in selftests and user programs, handle such error pointers in
destructors themselves. This works beautifully for NULL pointers passed to
destructors, so might as well just work for error pointers.
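
For example, the following is now safe even when the constructor fails and
returns an ERR_PTR-encoded error (the file name is hypothetical):

    struct btf *btf = btf__parse_elf("prog.bpf.o", NULL);

    /* btf may be ERR_PTR(err) here; btf__free() now handles that */
    btf__free(btf);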

Suggested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200729232148.896125-1-andriin@fb.com
2020-08-07 16:37:09 -07:00
Thomas Hebb
0466b9833b README: Add Arch to list of downstream distros
Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
2020-08-06 21:21:02 -07:00
Andrii Nakryiko
ba8d45968b vmtests: specify v12 of clang/llvm for now
Whatever happened, clang-11 and llvm-11, to which the clang and llvm packages
resolve, respectively, are not there anymore. It seems clang-12/llvm-12 are
the latest now, but for whatever reason clang/llvm don't resolve to them yet.
Hard-code version 12 for now.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-08-06 17:32:33 -07:00
Thomas Hebb
734b3f0afe check-reallocarray.sh: Use the same compiler Make does
Currently we hardcode "gcc", which means we get a bogus result any time
a non-default CC is passed to Make. In fact, it's bogus even when CC is
not explicitly set, since Make's default is "cc", which isn't
necessarily the same as "gcc".

Fix the issue by passing the compiler to use to check-reallocarray.sh.

Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
2020-07-28 14:05:35 -07:00
Andrii Nakryiko
f56874ba8a vmtests: blacklist sk_lookup on LATEST and cg_storage_multi on 5.5
Blacklist two failing tests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
3f26bf1adf sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9a97c9d2af5ca798377342debf7f0f44281d050e
Checkpoint bpf-next commit: 3c4f850e8441ac8b3b6dbaa6107604c4199ef01f
Baseline bpf commit:        5b801dfb7feb2738975d80223efc2fc193e55573
Checkpoint bpf commit:      5b801dfb7feb2738975d80223efc2fc193e55573

Andrii Nakryiko (1):
  bpf: Fix bpf_ringbuf_output() signature to return long

 include/uapi/linux/bpf.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
2.24.1
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
ab01213b35 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   5c3320d7fece4612d4a413aa3c8e82cdb5b49fcb
Checkpoint bpf-next commit: 9a97c9d2af5ca798377342debf7f0f44281d050e
Baseline bpf commit:        b2f9f1535bb93ee5fa2ea30ac1c26fa0d676154c
Checkpoint bpf commit:      5b801dfb7feb2738975d80223efc2fc193e55573

Andrii Nakryiko (3):
  libbpf: Support stripping modifiers for btf_dump
  tools/bpftool: Strip away modifiers from global variables
  libbpf: Add support for BPF XDP link

Ciara Loftus (1):
  xsk: Add new statistics

Horatiu Vultur (1):
  net: bridge: Add port attribute IFLA_BRPORT_MRP_IN_OPEN

Ian Rogers (1):
  libbpf bpf_helpers: Use __builtin_offsetof for offsetof

Jakub Sitnicki (2):
  bpf: Sync linux/bpf.h to tools/
  libbpf: Add support for SK_LOOKUP program type

Lorenzo Bianconi (3):
  cpumap: Formalize map value as a named struct
  bpf: cpumap: Add the possibility to attach an eBPF program to cpumap
  libbpf: Add SEC name for xdp programs attached to CPUMAP

Quentin Monnet (1):
  bpf: Fix formatting in documentation for BPF helpers

Randy Dunlap (1):
  bpf: Drop duplicated words in uapi helper comments

Song Liu (1):
  libbpf: Print hint when PERF_EVENT_IOC_SET_BPF returns -EPROTO

Yonghong Song (2):
  bpf: Implement bpf iterator for map elements
  tools/libbpf: Add support for bpf map element iterator

 include/uapi/linux/bpf.h     | 155 +++++++++++++++++++++++++++++------
 include/uapi/linux/if_link.h |   1 +
 include/uapi/linux/if_xdp.h  |   5 +-
 src/bpf.c                    |   1 +
 src/bpf.h                    |   3 +-
 src/bpf_helpers.h            |   2 +-
 src/btf.h                    |   4 +-
 src/btf_dump.c               |  10 ++-
 src/libbpf.c                 |  27 +++++-
 src/libbpf.h                 |   7 +-
 src/libbpf.map               |   3 +
 src/libbpf_probes.c          |   3 +
 12 files changed, 188 insertions(+), 33 deletions(-)

--
2.24.1
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
8af35e73a2 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
a290d45322 libbpf: Add support for BPF XDP link
Sync the UAPI header and add support for bpf_link-based XDP attachment.
Make the xdp/ prog type set the expected attach type. The kernel didn't
enforce attach_type for XDP programs before, so there are no backwards
compatibility issues there.

Also fix the section_names selftest to recognize that xdp prog types now
have an expected attach type.
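
A minimal sketch of bpf_link-based XDP attachment (prog and ifindex are
assumed):

    struct bpf_link *link = bpf_program__attach_xdp(prog, ifindex);

    if (libbpf_get_error(link))
        return -1; /* attach failed */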

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200722064603.3350758-8-andriin@fb.com
2020-07-28 14:03:17 -07:00
Song Liu
3f6b428909 libbpf: Print hint when PERF_EVENT_IOC_SET_BPF returns -EPROTO
The kernel prevents potential unwinder warnings and crashes by blocking
BPF program with bpf_get_[stack|stackid] on perf_event without
PERF_SAMPLE_CALLCHAIN, or with exclude_callchain_[kernel|user]. Print a
hint message in libbpf to help the user debug such issues.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200723180648.1429892-4-songliubraving@fb.com
2020-07-28 14:03:17 -07:00
Yonghong Song
5efd8395ef tools/libbpf: Add support for bpf map element iterator
Add map_fd to bpf_iter_attach_opts and flags to
bpf_link_create_opts. Later on, bpftool or a selftest
will be able to create a bpf map element iterator
by passing map_fd to the kernel at link
creation time.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200723184117.590673-1-yhs@fb.com
2020-07-28 14:03:17 -07:00
Yonghong Song
b1720407ff bpf: Implement bpf iterator for map elements
The bpf iterator for map elements is implemented.
The bpf program will receive four parameters:
  bpf_iter_meta *meta: the meta data
  bpf_map *map:        the bpf_map whose elements are traversed
  void *key:           the key of one element
  void *value:         the value of the same element

Here, meta and map pointers are always valid, and
key has register type PTR_TO_RDONLY_BUF_OR_NULL and
value has register type PTR_TO_RDWR_BUF_OR_NULL.
The kernel will track the access range of key and value
during verification time. Later, these values will be compared
against the values in the actual map to ensure all accesses
are within range.

A new field iter_seq_info is added to bpf_map_ops which
is used to add map type specific information, i.e., seq_ops,
init/fini seq_file func and seq_file private data size.
Subsequent patches will have actual implementation
for bpf_map_ops->iter_seq_info.

In user space, BPF_ITER_LINK_MAP_FD needs to be
specified in prog attr->link_create.flags, which indicates
that attr->link_create.target_fd is a map_fd.
The reason for such an explicit flag is for possible
future cases where one bpf iterator may allow more than
one possible customization, e.g., pid and cgroup id for
task_file.

The current kernel-internal implementation only allows
the target to register at most one required bpf_iter_link_info.
To support the above case, optional bpf_iter_link_info's
are needed; the target can be extended to register such link
infos, and the user-provided link_info needs to match one of
the target-supported ones.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200723184112.590360-1-yhs@fb.com
2020-07-28 14:03:17 -07:00
Ian Rogers
698820a9d9 libbpf bpf_helpers: Use __builtin_offsetof for offsetof
The non-builtin route for offsetof has a dependency on size_t from
stdlib.h/stdint.h that is undeclared and may break targets.
The offsetof macro in bpf_helpers may disable the same macro in other
headers that have an #ifdef offsetof guard. Rather than add additional
dependencies, improve the offsetof macro declared here to use the
builtin, which has been available since llvm 3.7 (the first release with
a BPF backend).

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200720061741.1514673-1-irogers@google.com
2020-07-28 14:03:17 -07:00
Jakub Sitnicki
6d92249be0 libbpf: Add support for SK_LOOKUP program type
Make libbpf aware of the newly added program type, and assign it a
section name.
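
A hypothetical program using the new section prefix:

    SEC("sk_lookup/echo")
    int echo_dispatch(struct bpf_sk_lookup *ctx)
    {
        return SK_PASS;
    }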

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-13-jakub@cloudflare.com
2020-07-28 14:03:17 -07:00
Jakub Sitnicki
1736996279 bpf: Sync linux/bpf.h to tools/
The newly added program type, context type and helper are used by tests in
a subsequent patch. Synchronize the header file.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-12-jakub@cloudflare.com
2020-07-28 14:03:17 -07:00
Randy Dunlap
f9f5f054d2 bpf: Drop duplicated words in uapi helper comments
Drop doubled words "will" and "attach".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/6b9f71ae-4f8e-0259-2c5d-187ddaefe6eb@infradead.org
2020-07-28 14:03:17 -07:00
Lorenzo Bianconi
4a5aecf034 libbpf: Add SEC name for xdp programs attached to CPUMAP
As for DEVMAP, support SEC("xdp_cpumap/") as a shortcut for loading
the program with type BPF_PROG_TYPE_XDP and expected attach type
BPF_XDP_CPUMAP.
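
A hypothetical program using the new shortcut:

    SEC("xdp_cpumap/on_cpu")
    int handle_on_cpu(struct xdp_md *ctx)
    {
        return XDP_PASS;
    }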

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/33174c41993a6d860d9c7c1f280a2477ee39ed11.1594734381.git.lorenzo@kernel.org
2020-07-28 14:03:17 -07:00
Lorenzo Bianconi
77f11b3674 bpf: cpumap: Add the possibility to attach an eBPF program to cpumap
Introduce the capability to attach an eBPF program to cpumap entries.
The idea behind this feature is to make it possible to define on
which CPU the eBPF program runs if the underlying hw does not support
RSS. Currently supported verdicts are XDP_DROP and XDP_PASS.

This patch has been tested on Marvell ESPRESSObin using xdp_redirect_cpu
sample available in the kernel tree to identify possible performance
regressions. Results show there are no observable differences in
packet-per-second:

$./xdp_redirect_cpu --progname xdp_cpu_map0 --dev eth0 --cpu 1
rx: 354.8 Kpps
rx: 356.0 Kpps
rx: 356.8 Kpps
rx: 356.3 Kpps
rx: 356.6 Kpps
rx: 356.6 Kpps
rx: 356.7 Kpps
rx: 355.8 Kpps
rx: 356.8 Kpps
rx: 356.8 Kpps

Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/5c9febdf903d810b3415732e5cd98491d7d9067a.1594734381.git.lorenzo@kernel.org
2020-07-28 14:03:17 -07:00
Lorenzo Bianconi
cd46c9d67e cpumap: Formalize map value as a named struct
As has already been done for devmap, introduce 'struct bpf_cpumap_val'
to formalize the expected values that can be passed in for a CPUMAP.
Update the cpumap code to use the struct.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/754f950674665dae6139c061d28c1d982aaf4170.1594734381.git.lorenzo@kernel.org
2020-07-28 14:03:17 -07:00
Horatiu Vultur
41054a32df net: bridge: Add port attribute IFLA_BRPORT_MRP_IN_OPEN
This patch adds a new port attribute, IFLA_BRPORT_MRP_IN_OPEN, which
allows notifying userspace when the node loses continuity of
MRP_InTest frames.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
852b4c8e73 tools/bpftool: Strip away modifiers from global variables
Reliably remove all the type modifiers from read-only (.rodata) global
variable definitions, including cases of inner field const modifiers and
arrays of const values.

Also modify one of selftests to ensure that const volatile struct doesn't
prevent user-space from modifying .rodata variable.

Fixes: 985ead416df3 ("bpftool: Add skeleton codegen command")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200713232409.3062144-3-andriin@fb.com
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
de60a31eba libbpf: Support stripping modifiers for btf_dump
One important use case where emitting const/volatile/restrict is undesirable
is BPF skeleton generation of the DATASEC layout. These sections are further
memory-mapped and can be written/read from user-space directly.

For the important case of .rodata variables, bpftool strips away first-level
modifiers to make their use on the user-space side simple, without requiring
extra type casts to silence compiler complaints about writing to const
variables.

This logic works mostly fine, but breaks in some more complicated cases. E.g.:

    const volatile int params[10];

Because in BTF it's a chain of ARRAY -> CONST -> VOLATILE -> INT, bpftool
stops at ARRAY and doesn't strip CONST and VOLATILE. In the skeleton this
variable will be emitted as is, so when used from user-space, the compiler
will complain about writing to a const array. This is problematic, as also
mentioned in [0].

To solve this for arrays and other non-trivial cases (e.g., inner
const/volatile fields inside a struct), teach btf_dump to strip away any
modifiers when requested. This is exposed as an extra option on the
btf_dump__emit_type_decl() API.
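
A minimal sketch of the new option (d and var_type_id are assumed to come
from an existing btf_dump session):

    DECLARE_LIBBPF_OPTS(btf_dump_emit_type_decl_opts, opts,
        .field_name = "params",
        .strip_mods = true);

    btf_dump__emit_type_decl(d, var_type_id, &opts);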

Reported-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200713232409.3062144-2-andriin@fb.com
2020-07-28 14:03:17 -07:00
Ciara Loftus
8ec7d86efe xsk: Add new statistics
It can be useful for the user to know the reason behind a dropped packet.
Introduce new counters which track drops on the receive path caused by:
1. rx ring being full
2. fill ring being empty

Also, on the tx path introduce a counter which tracks the number of times
we attempt to pull from the tx ring when it is empty.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200708072835.4427-2-ciara.loftus@intel.com
2020-07-28 14:03:17 -07:00
Andrii Nakryiko
c3984343bc sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2977282b63c3b6f112145ecf0bcefff0c65bd3ac
Checkpoint bpf-next commit: 5c3320d7fece4612d4a413aa3c8e82cdb5b49fcb
Baseline bpf commit:        b2f9f1535bb93ee5fa2ea30ac1c26fa0d676154c
Checkpoint bpf commit:      b2f9f1535bb93ee5fa2ea30ac1c26fa0d676154c

Andrii Nakryiko (1):
  libbpf: Fix memory leak and optimize BTF sanitization

 src/btf.c    |  2 +-
 src/btf.h    |  2 +-
 src/libbpf.c | 11 +++--------
 3 files changed, 5 insertions(+), 10 deletions(-)

--
2.24.1
2020-07-10 09:11:41 -07:00
Andrii Nakryiko
5255eb2799 libbpf: Fix memory leak and optimize BTF sanitization
Coverity's static analysis helpfully reported a memory leak introduced by
0f0e55d8247c ("libbpf: Improve BTF sanitization handling"). While fixing it,
I realized that btf__new() already creates a memory copy, so there is no
need to do this manually. So this patch also fixes the misleading btf__new()
signature, making data a `const void *` input parameter, and avoids the
unnecessary memory allocation and copy in the BTF sanitization code
altogether.
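
The corrected signature, per the description:

    struct btf *btf__new(const void *data, __u32 size);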

Fixes: 0f0e55d8247c ("libbpf: Improve BTF sanitization handling")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200710011023.1655008-1-andriin@fb.com
2020-07-10 09:11:41 -07:00
Andrii Nakryiko
8b5e81a17a sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2977282b63c3b6f112145ecf0bcefff0c65bd3ac
Checkpoint bpf-next commit: 2977282b63c3b6f112145ecf0bcefff0c65bd3ac
Baseline bpf commit:        0f57a1e522f413e87852e632f55de4723e511939
Checkpoint bpf commit:      b2f9f1535bb93ee5fa2ea30ac1c26fa0d676154c

Jakub Bogusz (1):
  libbpf: Fix libbpf hashmap on (I)LP32 architectures

 src/hashmap.h | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

--
2.24.1
2020-07-09 22:00:15 -07:00
Jakub Bogusz
cd016d93f7 libbpf: Fix libbpf hashmap on (I)LP32 architectures
On ILP32, the 64-bit result was shifted by a value calculated for the 32-bit
long type, and the returned value fell far outside the hashmap capacity.
As advised by Andrii Nakryiko, this patch uses a different hashing variant
for architectures where size_t is shorter than long long.

Fixes: e3b924224028 ("libbpf: add resizable non-thread safe internal hashmap")
Signed-off-by: Jakub Bogusz <qboosh@pld-linux.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200709225723.1069937-1-andriin@fb.com
2020-07-09 22:00:15 -07:00
Andrii Nakryiko
deaee9541d vmtests: update blacklist for 5.5
Add two tests (sockopt_sk and udp_limit) to the blacklist for the 5.5 kernel.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
daa2c7f851 ci: re-arrange tests to prioritize higher-signal tests
Put selftests in the first stage. Put the long-running LATEST build & test
case first, so that it can be better parallelized with 4.9 and 5.5.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
006904d416 vmtests: whitelist core_retro for 4.9 tests
Add core_retro to the whitelist for 4.9, as it is supposed to work on old kernels.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
e47ebc895d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   6b207d66aa9fad0deed13d5f824e1ea193b0a777
Checkpoint bpf-next commit: 2977282b63c3b6f112145ecf0bcefff0c65bd3ac
Baseline bpf commit:        e708e2bd55c921f5bb554fa5837d132a878951cf
Checkpoint bpf commit:      0f57a1e522f413e87852e632f55de4723e511939

Andrii Nakryiko (4):
  libbpf: Make BTF finalization strict
  libbpf: Add btf__set_fd() for more control over loaded BTF FD
  libbpf: Improve BTF sanitization handling
  libbpf: Handle missing BPF_OBJ_GET_INFO_BY_FD gracefully in
    perf_buffer

Stanislav Fomichev (1):
  libbpf: Add support for BPF_CGROUP_INET_SOCK_RELEASE

 include/uapi/linux/bpf.h |   1 +
 src/btf.c                |   7 +-
 src/btf.h                |   1 +
 src/libbpf.c             | 154 ++++++++++++++++++++++-----------------
 src/libbpf.map           |   1 +
 5 files changed, 95 insertions(+), 69 deletions(-)

--
2.24.1
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
3b2837e296 libbpf: Handle missing BPF_OBJ_GET_INFO_BY_FD gracefully in perf_buffer
perf_buffer__new() relies on BPF_OBJ_GET_INFO_BY_FD availability for a few
sanity checks. OBJ_GET_INFO for maps is actually a much more recent feature
than perf_buffer support itself, so this causes unnecessary problems on old
kernels from before BPF_OBJ_GET_INFO_BY_FD was added.

This patch makes those sanity checks optional and just assumes the best if
the command is not supported. If the user specified something incorrectly
(e.g., a wrong map type), the kernel will reject it later anyway, except the
user won't get a nice explanation as to why it failed. This seems like a
good trade-off for supporting perf_buffer on old kernels.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200708015318.3827358-6-andriin@fb.com
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
90716e9e14 libbpf: Improve BTF sanitization handling
Change the sanitization process to preserve the original BTF, which might be
used by libbpf itself for Kconfig externs, CO-RE relocs, etc., even if the
kernel is old and doesn't support BTF. To achieve that, if libbpf detects
the need for BTF sanitization, it will clone the original BTF, sanitize it
in-place, attempt to load it into the kernel, and, if successful, preserve
the loaded BTF FD in the original `struct btf`, while freeing the sanitized
local copy.

If the kernel doesn't support any BTF, the original btf and btf_ext will
still be preserved to be used later for CO-RE relocations and other
BTF-dependent libbpf features, which don't depend on kernel BTF support.

The patch takes care not to specify BTF and BTF.ext features when loading
BPF programs and/or maps if it was detected that the kernel doesn't support
BTF features.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200708015318.3827358-4-andriin@fb.com
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
d5a36e2070 libbpf: Add btf__set_fd() for more control over loaded BTF FD
Add a setter for the BTF FD to allow applications more fine-grained control
in advanced scenarios. Storing the BTF FD inside `struct btf` provides
little benefit and would probably be better done differently (e.g.,
btf__load() could just return the FD on success), but we are stuck with this
due to backwards compatibility. The main problem is that it's impossible to
load BTF and then free the user-space memory while keeping the FD intact,
because `struct btf` assumes ownership of that FD upon successful load and
will attempt to close it during btf__free(). To give callers (e.g., libbpf
itself for BTF sanitization) more control over this, add btf__set_fd() to
allow resetting the FD arbitrarily, if necessary.
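
A sketch of the use case described above: load BTF into the kernel, take
over the FD, and free the user-space memory (btf is assumed to be parsed
already):

    int fd;

    if (btf__load(btf))    /* load into kernel; struct btf owns the FD */
        return -1;
    fd = btf__fd(btf);
    btf__set_fd(btf, -1);  /* relinquish FD ownership */
    btf__free(btf);        /* frees memory; fd stays open */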

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200708015318.3827358-3-andriin@fb.com
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
133543c202 libbpf: Make BTF finalization strict
With a valid ELF and valid BTF, there is no reason (apart from bugs) why BTF
finalization should fail. So make it strict and return an error if it fails.
This makes CO-RE relocations more reliable, as they are no longer silently
skipped when BTF finalization fails.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200708015318.3827358-2-andriin@fb.com
2020-07-08 17:12:53 -07:00
Stanislav Fomichev
abb82202da libbpf: Add support for BPF_CGROUP_INET_SOCK_RELEASE
Add auto-detection for the cgroup/sock_release programs.
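
A hypothetical program using the newly auto-detected section:

    SEC("cgroup/sock_release")
    int sock_release_hook(struct bpf_sock *ctx)
    {
        return 1; /* allow */
    }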

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200706230128.4073544-3-sdf@google.com
2020-07-08 17:12:53 -07:00
Andrii Nakryiko
5020fdf8fc vmtests: fix 4.9 build
Drop the blacklist and instead use a small whitelist of tests that are still
supposed to work on the old 4.9 kernel.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-07 11:10:16 -07:00
Andrii Nakryiko
a846caca79 vmtests: test no-alu32 variant of test_progs
Add testing of no-alu32 flavor of test_progs.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-07 10:41:57 -07:00
Julia Kartseva
1b42b15b5e travis_ci: run tests for 4.9 kernel
Make sure that libbpf sanitizes BTF properly for older kernels.
Add a stage for the 4.9.0 kernel in Travis CI.
For now, make test failures non-blocking by adding 4.9.0 to the
`allow_failures` section.
The blacklist is copy-pasted from the 5.5.0 kernel blacklist.
2020-07-01 15:38:31 -07:00
Andrii Nakryiko
a2b27a1b62 vmtests: remove custom 5.5 selftest preparation actions
Now that a pre-generated vmlinux.h is used for compilation of non-latest
tests, we don't need custom adjustments for the 5.5 kernel selftests. Adjust
the blacklist now that those new selftests are built into test_progs.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-07-01 15:19:18 -07:00
Andrii Nakryiko
7b9d71b21d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   ca4db6389d611eee2eb7c1dfe710b62d8ea06772
Checkpoint bpf-next commit: 6b207d66aa9fad0deed13d5f824e1ea193b0a777
Baseline bpf commit:        2bdeb3ed547d8822b2566797afa6c2584abdb119
Checkpoint bpf commit:      e708e2bd55c921f5bb554fa5837d132a878951cf

Andrii Nakryiko (1):
  libbpf: Make bpf_endian co-exist with vmlinux.h

Song Liu (1):
  bpf: Introduce helper bpf_get_task_stack()

 include/uapi/linux/bpf.h | 37 +++++++++++++++++++++++++++++++++-
 src/bpf_endian.h         | 43 ++++++++++++++++++++++++++++++++--------
 2 files changed, 71 insertions(+), 9 deletions(-)

--
2.24.1
2020-07-01 14:36:55 -07:00
Andrii Nakryiko
89f7f0796a sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-07-01 14:36:55 -07:00
Song Liu
c054d91247 bpf: Introduce helper bpf_get_task_stack()
Introduce the helper bpf_get_task_stack(), which dumps the stack trace of a
given task. This is different from bpf_get_stack(), which gets the stack
trace of the current task. One potential use case of bpf_get_task_stack()
is to call it from bpf_iter__task and dump all /proc/<pid>/stack to a
seq_file.

bpf_get_task_stack() uses stack_trace_save_tsk() instead of
get_perf_callchain() for kernel stack. The benefit of this choice is that
stack_trace_save_tsk() doesn't require changes in arch/. The downside of
using stack_trace_save_tsk() is that stack_trace_save_tsk() dumps the
stack trace to unsigned long array. For 32-bit systems, we need to
translate it to u64 array.
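
A hypothetical tracing-side use (task is assumed valid, e.g., taken from a
task iterator context):

    __u64 entries[32];
    long n = bpf_get_task_stack(task, entries, sizeof(entries), 0);

    if (n < 0)
        return 0; /* failed to capture the stack */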

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-3-songliubraving@fb.com
2020-07-01 14:36:55 -07:00
Andrii Nakryiko
9c104b1637 libbpf: Make bpf_endian co-exist with vmlinux.h
Make bpf_endian.h compatible with vmlinux.h. It is a frequent request from
users wanting to use bpf_endian.h in their BPF applications using CO-RE and
vmlinux.h.

To achieve that, re-implement the byte swap macros and drop all the header
includes. This way it can be used both with linux header includes and with
vmlinux.h.
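
With this change, a BPF program can mix both (a sketch; the packet-parsing
context is assumed):

    #include "vmlinux.h"           /* kernel types from BTF */
    #include <bpf/bpf_endian.h>    /* now safe to include alongside */

    /* e.g., compare a network-order port against host-order 443: */
    /* if (tcph->dest == bpf_htons(443)) { ... } */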

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630152125.3631920-2-andriin@fb.com
2020-07-01 14:36:55 -07:00
Andrii Nakryiko
d08d57cd91 vmtests: check in vmlinux.h and use it for non-latest builds
Manually generate vmlinux.h based on latest.config to be used for non-latest
selftest builds. This keeps bpftool and the newest selftest builds
succeeding, while at runtime the blacklist will skip them.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-30 18:09:33 -07:00
Andrii Nakryiko
803243cc33 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b3eece09e2e69f528a1ab6104861550dec149083
Checkpoint bpf-next commit: afa12644c877d3f627281bb6493d7ca8f9976e3d
Baseline bpf commit:        4e15507fea70c0c312d79610efa46b6853ccf8e0
Checkpoint bpf commit:      2bdeb3ed547d8822b2566797afa6c2584abdb119

Andrii Nakryiko (4):
  bpf: Switch most helper return values from 32-bit int to 64-bit long
  libbpf: Prevent loading vmlinux BTF twice
  libbpf: Support disabling auto-loading BPF programs
  libbpf: Fix CO-RE relocs against .text section

Colin Ian King (1):
  libbpf: Fix spelling mistake "kallasyms" -> "kallsyms"

Dmitry Yakunin (1):
  bpf: Add SO_KEEPALIVE and related options to bpf_setsockopt

Jesper Dangaard Brouer (1):
  libbpf: Adjust SEC short cut for expected attach type BPF_XDP_DEVMAP

Quentin Monnet (1):
  bpf: Fix formatting in documentation for BPF helpers

Yonghong Song (3):
  bpf: Add bpf_skc_to_tcp6_sock() helper
  bpf: Add bpf_skc_to_{tcp, tcp_timewait, tcp_request}_sock() helpers
  bpf: Add bpf_skc_to_udp6_sock() helper

 include/uapi/linux/bpf.h | 277 ++++++++++++++++++++++-----------------
 src/libbpf.c             |  93 +++++++++----
 src/libbpf.h             |   2 +
 src/libbpf.map           |   2 +
 4 files changed, 233 insertions(+), 141 deletions(-)

--
2.24.1
2020-06-29 13:46:33 -07:00
Andrii Nakryiko
d707f8027b sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-06-29 13:46:33 -07:00
Jesper Dangaard Brouer
652f2c0a40 libbpf: Adjust SEC short cut for expected attach type BPF_XDP_DEVMAP
Adjust the SEC("xdp_devmap/") prog type prefix to contain a
slash "/" for expected attach type BPF_XDP_DEVMAP.  This is consistent
with other prog types like tracing.

Fixes: 2778797037a6 ("libbpf: Add SEC name for xdp programs attached to device map")
Suggested-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159309521882.821855.6873145686353617509.stgit@firesoul
2020-06-29 13:46:33 -07:00
Quentin Monnet
2fcd394505 bpf: Fix formatting in documentation for BPF helpers
When producing the bpf-helpers.7 man page from the documentation in
the BPF user space header file, rst2man complains:

    <stdin>:2636: (ERROR/3) Unexpected indentation.
    <stdin>:2640: (WARNING/2) Block quote ends without a blank line; unexpected unindent.

Let's fix formatting for the relevant chunk (item list in
bpf_ringbuf_query()'s description), and for a couple other functions.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200623153935.6215-1-quentin@isovalent.com
2020-06-29 13:46:33 -07:00
Andrii Nakryiko
af3c9f9fc4 libbpf: Fix CO-RE relocs against .text section
bpf_object__find_program_by_title(), used by the CO-RE relocation code,
doesn't return the .text "BPF program" if it is just function storage for
sub-programs. Because of that, any CO-RE relocation in non-inlined helper
functions will fail. Fix this by searching for the .text-corresponding BPF
program manually.

Adjust one of the bpf_iter selftests to exhibit this pattern.

Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Reported-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200619230423.691274-1-andriin@fb.com
2020-06-29 13:46:33 -07:00
Andrii Nakryiko
a62b08dd0c libbpf: Support disabling auto-loading BPF programs
Currently, bpf_object__load() (and, by extension, a skeleton's load) will
always attempt to prepare, relocate, and load into the kernel every single
BPF program found inside the BPF object file. This is often convenient, the
right thing to do, and what users expect.

But there are plenty of cases (especially with BPF development constantly
picking up the pace) where a BPF application is intended to work with old
kernels, with a potentially reduced set of features. On kernels supporting
extra features, though, it would like to take full advantage of them by
employing extra BPF programs. This could be a choice of using fentry/fexit
over kprobe/kretprobe, if the kernel is recent enough and built with BTF. Or
a BPF program might provide an optimized bpf_iter-based solution that
user-space wants to use whenever available. And so on.

With libbpf and BPF CO-RE in particular, it's advantageous not to have to
maintain two separate BPF object files to achieve this. So to enable such
use cases, this patch adds the ability to request that chosen BPF programs
not be auto-loaded. In that case, libbpf won't attempt to perform
relocations (which might fail due to an old kernel), won't try to resolve
BTF types for BTF-aware (tp_btf/fentry/fexit/etc.) program types (because
BTF might not be present), and so on. The skeleton will also automatically
skip the auto-attachment step for such non-loaded BPF programs.

Overall, this feature simplifies development and deployment of real-world
BPF applications with complicated compatibility requirements.
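
A minimal sketch of opting a program out before load (the skeleton, program
name, and feature probe are hypothetical):

    struct my_bpf *skel = my_bpf__open();

    if (!kernel_has_btf) /* hypothetical feature probe */
        bpf_program__set_autoload(skel->progs.handle_fentry, false);
    if (my_bpf__load(skel))
        return -1; /* handle load error */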

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200625232629.3444003-2-andriin@fb.com
2020-06-29 13:46:33 -07:00
Yonghong Song
318ed9d544 bpf: Add bpf_skc_to_udp6_sock() helper
The helper is used in tracing programs to cast a socket
pointer to a udp6_sock pointer.
The return value could be NULL if the casting is illegal.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/bpf/20200623230815.3988481-1-yhs@fb.com
2020-06-29 13:46:33 -07:00
Yonghong Song
47370741be bpf: Add bpf_skc_to_{tcp, tcp_timewait, tcp_request}_sock() helpers
Three more helpers are added to cast a sock_common pointer to
a tcp_sock, tcp_timewait_sock or tcp_request_sock for
tracing programs.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230811.3988277-1-yhs@fb.com
2020-06-29 13:46:33 -07:00
Yonghong Song
26e5e7dcb0 bpf: Add bpf_skc_to_tcp6_sock() helper
The helper is used in tracing programs to cast a socket
pointer to a tcp6_sock pointer.
The return value could be NULL if the casting is illegal.

A new helper return type RET_PTR_TO_BTF_ID_OR_NULL is added
so the verifier is able to deduce proper return types for the helper.

Unlike the previous BTF_ID based helpers, the
bpf_skc_to_tcp6_sock() argument can have several possible
btf_ids; more specifically, all possible socket data structures
with sock_common appearing first in the memory layout.
This patch only adds socket types related to tcp and udp.

All possible argument btf_ids and return value btf_ids
for the helper bpf_skc_to_tcp6_sock() are pre-calculated and
cached. In the future, it is even possible to precompute
these btf_id's at kernel build time.
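
A hypothetical tracing snippet (sk is assumed to be a socket pointer
obtained from the traced context):

    struct tcp6_sock *tp6 = bpf_skc_to_tcp6_sock(sk);

    if (!tp6)
        return 0; /* the cast was illegal for this socket */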

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230809.3988195-1-yhs@fb.com
2020-06-29 13:46:33 -07:00
Dmitry Yakunin
cd469e21e8 bpf: Add SO_KEEPALIVE and related options to bpf_setsockopt
This patch adds support for the SO_KEEPALIVE flag and TCP-related options
to the bpf_setsockopt() routine. This is helpful if we want to enable or
tune TCP keepalive for applications which don't do it in their userspace
code.

v3:
  - update kernel-doc in uapi (Nikita Vetoshkin <nekto0n@yandex-team.ru>)

v4:
  - update kernel-doc in tools too (Alexei Starovoitov)
  - add test to selftests (Alexei Starovoitov)
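
A hypothetical sockops program using the new option (the
SOL_SOCKET/SO_KEEPALIVE constants come from the usual socket headers):

    SEC("sockops")
    int enable_keepalive(struct bpf_sock_ops *skops)
    {
        int one = 1;

        bpf_setsockopt(skops, SOL_SOCKET, SO_KEEPALIVE, &one, sizeof(one));
        return 1;
    }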

Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200620153052.9439-3-zeil@yandex-team.ru
2020-06-29 13:46:33 -07:00
Andrii Nakryiko
18bfe12dc1 libbpf: Prevent loading vmlinux BTF twice
Prevent loading/parsing vmlinux BTF twice in some cases: for CO-RE relocations
and for BTF-aware hooks (tp_btf, fentry/fexit, etc).

Fixes: a6ed02cac690 ("libbpf: Load btf_vmlinux only once per object.")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200624043805.1794620-1-andriin@fb.com
2020-06-29 13:46:33 -07:00
Colin Ian King
fef856084a libbpf: Fix spelling mistake "kallasyms" -> "kallsyms"
There is a spelling mistake in a pr_warn message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200623084207.149253-1-colin.king@canonical.com
2020-06-29 13:46:33 -07:00
Andrii Nakryiko
6f8e021c3c bpf: Switch most helper return values from 32-bit int to 64-bit long
Switch most of BPF helper definitions from returning int to long. These
definitions are coming from comments in BPF UAPI header and are used to
generate bpf_helper_defs.h (under libbpf) to be later included and used from
BPF programs.

In the actual in-kernel implementation, all the helpers are defined as
returning u64, but for historical reasons, most of them are actually defined
as returning int in the UAPI (usually, to return 0 on success and a negative
value on error).

This actually causes Clang to quite often generate sub-optimal code, because
the compiler believes that the return value is 32-bit and in a lot of cases
it has to be up-converted (usually with a pair of 32-bit bit shifts) to a
64-bit value before it can be used further in BPF code.

Besides just "polluting" the code, these 32-bit shifts quite often cause
problems for cases in which return value matters. This is especially the case
for the family of bpf_probe_read_str() functions. There are few other similar
helpers (e.g., bpf_read_branch_records()), in which return value is used by
BPF program logic to record variable-length data and process it. For such
cases, BPF program logic carefully manages offsets within some array or map to
read variable-length data. For such uses, it's crucial for BPF verifier to
track possible range of register values to prove that all the accesses happen
within given memory bounds. Those extraneous zero-extending bit shifts,
inserted by Clang (and quite often interleaved with other code, which makes
the issues even more challenging and sometimes requires employing extra
per-variable compiler barriers), throw off the verifier logic and make it
mark registers as having an unknown variable offset. We'll study this
pattern a bit
later below.

Another common pattern is to check return of BPF helper for non-zero state to
detect error conditions and attempt alternative actions in such case. Even in
this simple and straightforward case, this 32-bit vs BPF's native 64-bit mode
quite often leads to sub-optimal and unnecessary extra code. We'll look at
this pattern as well.

Clang's BPF target supports two modes of code generation: ALU32, in which it
is capable of using lower 32-bit parts of registers, and no-ALU32, in which
only full 64-bit registers are being used. ALU32 mode somewhat mitigates the
above described problems, but not in all cases.

This patch switches all the cases in which BPF helpers return 0 or negative
error from returning int to returning long. It is shown below that such a
change in definition leads to equivalent or better code. No-ALU32 mode
benefits more, but ALU32 mode doesn't degrade either and often still gets
improved code generation.

Another class of cases switched from int to long are bpf_probe_read_str()-like
helpers, which encode successful case as non-negative values, while still
returning negative value for errors.

In all of such cases, correctness is preserved due to two's complement
encoding of negative values and the fact that all helpers return values with
32-bit absolute value. Two's complement ensures that for negative values
higher 32 bits are all ones and when truncated, leave valid negative 32-bit
value with the same value. Non-negative values have upper 32 bits set to zero
and similarly preserve value when high 32 bits are truncated. This means that
just casting to int/u32 is correct and efficient (and in ALU32 mode doesn't
require any extra shifts).

To minimize the chances of regressions, two code patterns were investigated,
as mentioned above. For both patterns, BPF assembly was analyzed in
ALU32/NO-ALU32 compiler modes, both with current 32-bit int return type and
new 64-bit long return type.

Case 1. Variable-length data reading and concatenation. This is quite
ubiquitous pattern in tracing/monitoring applications, reading data like
process's environment variables, file path, etc. In such case, many pieces of
string-like variable-length data are read into a single big buffer, and at the
end of the process, only a part of array containing actual data is sent to
user-space for further processing. This case is tested in test_varlen.c
selftest (in the next patch). Code flow is roughly as follows:

  void *payload = &sample->payload;
  u64 len;

  len = bpf_probe_read_kernel_str(payload, MAX_SZ1, &source_data1);
  if (len <= MAX_SZ1) {
      payload += len;
      sample->len1 = len;
  }
  len = bpf_probe_read_kernel_str(payload, MAX_SZ2, &source_data2);
  if (len <= MAX_SZ2) {
      payload += len;
      sample->len2 = len;
  }
  /* and so on */
  sample->total_len = payload - &sample->payload;
  /* send over, e.g., perf buffer */

There could be two variations with slightly different code generated: when len
is 64-bit integer and when it is 32-bit integer. Both variations were analysed.
BPF assembly instructions between two successive invocations of
bpf_probe_read_kernel_str() were used to check code regressions. Results are
below, followed by short analysis. Left side is using helpers with int return
type, the right one is after the switch to long.

ALU32 + INT                                ALU32 + LONG
===========                                ============

64-BIT (13 insns):                         64-BIT (10 insns):
------------------------------------       ------------------------------------
  17:   call 115                             17:   call 115
  18:   if w0 > 256 goto +9 <LBB0_4>         18:   if r0 > 256 goto +6 <LBB0_4>
  19:   w1 = w0                              19:   r1 = 0 ll
  20:   r1 <<= 32                            21:   *(u64 *)(r1 + 0) = r0
  21:   r1 s>>= 32                           22:   r6 = 0 ll
  22:   r2 = 0 ll                            24:   r6 += r0
  24:   *(u64 *)(r2 + 0) = r1              00000000000000c8 <LBB0_4>:
  25:   r6 = 0 ll                            25:   r1 = r6
  27:   r6 += r1                             26:   w2 = 256
00000000000000e0 <LBB0_4>:                   27:   r3 = 0 ll
  28:   r1 = r6                              29:   call 115
  29:   w2 = 256
  30:   r3 = 0 ll
  32:   call 115

32-BIT (11 insns):                         32-BIT (12 insns):
------------------------------------       ------------------------------------
  17:   call 115                             17:   call 115
  18:   if w0 > 256 goto +7 <LBB1_4>         18:   if w0 > 256 goto +8 <LBB1_4>
  19:   r1 = 0 ll                            19:   r1 = 0 ll
  21:   *(u32 *)(r1 + 0) = r0                21:   *(u32 *)(r1 + 0) = r0
  22:   w1 = w0                              22:   r0 <<= 32
  23:   r6 = 0 ll                            23:   r0 >>= 32
  25:   r6 += r1                             24:   r6 = 0 ll
00000000000000d0 <LBB1_4>:                   26:   r6 += r0
  26:   r1 = r6                            00000000000000d8 <LBB1_4>:
  27:   w2 = 256                             27:   r1 = r6
  28:   r3 = 0 ll                            28:   w2 = 256
  30:   call 115                             29:   r3 = 0 ll
                                             31:   call 115

In ALU32 mode, the variant using a 64-bit length variable clearly wins and
avoids unnecessary zero-extension bit shifts. In practice, this is even more
important, because BPF code won't need to do extra checks to "prove" to the
verifier that payload/len are within valid bounds.

The 32-bit len variant is one instruction longer. Clang decided to do the
64-to-32 cast with two bit shifts, instead of an equivalent `w1 = w0`
assignment. The former uses an extra register. The latter might potentially
lose some range information, but not for a 32-bit value. So in this case, the
verifier infers that r0 is [0, 256] after the check at 18:, and shifting 32
bits left/right keeps that range intact. We should probably look into Clang's
logic and see why it chooses bit shifts over sub-register assignments for
this.

NO-ALU32 + INT                             NO-ALU32 + LONG
==============                             ===============

64-BIT (14 insns):                         64-BIT (10 insns):
------------------------------------       ------------------------------------
  17:   call 115                             17:   call 115
  18:   r0 <<= 32                            18:   if r0 > 256 goto +6 <LBB0_4>
  19:   r1 = r0                              19:   r1 = 0 ll
  20:   r1 >>= 32                            21:   *(u64 *)(r1 + 0) = r0
  21:   if r1 > 256 goto +7 <LBB0_4>         22:   r6 = 0 ll
  22:   r0 s>>= 32                           24:   r6 += r0
  23:   r1 = 0 ll                          00000000000000c8 <LBB0_4>:
  25:   *(u64 *)(r1 + 0) = r0                25:   r1 = r6
  26:   r6 = 0 ll                            26:   r2 = 256
  28:   r6 += r0                             27:   r3 = 0 ll
00000000000000e8 <LBB0_4>:                   29:   call 115
  29:   r1 = r6
  30:   r2 = 256
  31:   r3 = 0 ll
  33:   call 115

32-BIT (13 insns):                         32-BIT (13 insns):
------------------------------------       ------------------------------------
  17:   call 115                             17:   call 115
  18:   r1 = r0                              18:   r1 = r0
  19:   r1 <<= 32                            19:   r1 <<= 32
  20:   r1 >>= 32                            20:   r1 >>= 32
  21:   if r1 > 256 goto +6 <LBB1_4>         21:   if r1 > 256 goto +6 <LBB1_4>
  22:   r2 = 0 ll                            22:   r2 = 0 ll
  24:   *(u32 *)(r2 + 0) = r0                24:   *(u32 *)(r2 + 0) = r0
  25:   r6 = 0 ll                            25:   r6 = 0 ll
  27:   r6 += r1                             27:   r6 += r1
00000000000000e0 <LBB1_4>:                 00000000000000e0 <LBB1_4>:
  28:   r1 = r6                              28:   r1 = r6
  29:   r2 = 256                             29:   r2 = 256
  30:   r3 = 0 ll                            30:   r3 = 0 ll
  32:   call 115                             32:   call 115

In NO-ALU32 mode, for the case of a 64-bit len variable, Clang generates much
superior code, as expected, eliminating unnecessary bit shifts. For 32-bit
len, the code is identical.

So overall, only the ALU32 32-bit len case is more-or-less equivalent, and
the difference stems from an internal Clang decision, rather than the compiler
lacking enough information about types.

Case 2. Let's look at the simpler case of checking the return result of a BPF
helper for errors. The code is very simple:

  long bla;
  if (bpf_probe_read_kernel(&bla, sizeof(bla), 0))
      return 1;
  else
      return 0;

ALU32 + CHECK (9 insns)                    ALU32 + CHECK (9 insns)
====================================       ====================================
  0:    r1 = r10                             0:    r1 = r10
  1:    r1 += -8                             1:    r1 += -8
  2:    w2 = 8                               2:    w2 = 8
  3:    r3 = 0                               3:    r3 = 0
  4:    call 113                             4:    call 113
  5:    w1 = w0                              5:    r1 = r0
  6:    w0 = 1                               6:    w0 = 1
  7:    if w1 != 0 goto +1 <LBB2_2>          7:    if r1 != 0 goto +1 <LBB2_2>
  8:    w0 = 0                               8:    w0 = 0
0000000000000048 <LBB2_2>:                 0000000000000048 <LBB2_2>:
  9:    exit                                 9:    exit

Almost identical code; the only difference is the use of a full-register
assignment (r1 = r0) vs a half-register one (w1 = w0) in instruction #5. On
32-bit architectures, the new BPF assembly might be slightly less optimal, in
theory. But one can argue that's not a big issue, given that the use of full
registers is still prevalent (e.g., for parameter passing).

NO-ALU32 + CHECK (11 insns)                NO-ALU32 + CHECK (9 insns)
====================================       ====================================
  0:    r1 = r10                             0:    r1 = r10
  1:    r1 += -8                             1:    r1 += -8
  2:    r2 = 8                               2:    r2 = 8
  3:    r3 = 0                               3:    r3 = 0
  4:    call 113                             4:    call 113
  5:    r1 = r0                              5:    r1 = r0
  6:    r1 <<= 32                            6:    r0 = 1
  7:    r1 >>= 32                            7:    if r1 != 0 goto +1 <LBB2_2>
  8:    r0 = 1                               8:    r0 = 0
  9:    if r1 != 0 goto +1 <LBB2_2>        0000000000000048 <LBB2_2>:
 10:    r0 = 0                               9:    exit
0000000000000058 <LBB2_2>:
 11:    exit

NO-ALU32 is a clear improvement, getting rid of unnecessary zero-extension bit
shifts.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200623032224.4020118-1-andriin@fb.com
2020-06-29 13:46:33 -07:00
Andrii Nakryiko
143213eb82 README: info on routing general BPF/libbpf questions
We keep getting more and more questions about BPF/libbpf usage.
This repo is not the right place to ask them, as not that many people
monitor it. Re-route folks to bpf@vger.kernel.org.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-22 20:52:42 -07:00
Andrii Nakryiko
ac74ee188d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   1bdb6c9a1c43fdf9b83b2331dfc6229bd2e71d9b
Checkpoint bpf-next commit: b3eece09e2e69f528a1ab6104861550dec149083
Baseline bpf commit:        4e15507fea70c0c312d79610efa46b6853ccf8e0
Checkpoint bpf commit:      4e15507fea70c0c312d79610efa46b6853ccf8e0

Andrii Nakryiko (3):
  libbpf: Generalize libbpf externs support
  libbpf: Add support for extracting kernel symbol addresses
  libbpf: Wrap source argument of BPF_CORE_READ macro in parentheses

 src/bpf_core_read.h |   8 +-
 src/bpf_helpers.h   |   1 +
 src/btf.h           |   5 +
 src/libbpf.c        | 482 +++++++++++++++++++++++++++++++-------------
 4 files changed, 350 insertions(+), 146 deletions(-)

--
2.24.1
2020-06-22 20:31:52 -07:00
Andrii Nakryiko
15943906dc libbpf: Wrap source argument of BPF_CORE_READ macro in parentheses
Wrap the source argument of the BPF_CORE_READ family of macros in parentheses
to allow uses like this:

BPF_CORE_READ((struct cast_struct *)src, a, b, c);

Fixes: 7db3822ab991 ("libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200619231703.738941-8-andriin@fb.com
2020-06-22 20:31:52 -07:00
Andrii Nakryiko
85749135a6 libbpf: Add support for extracting kernel symbol addresses
Add support for another (in addition to existing Kconfig) special kind of
externs in BPF code: kernel symbol externs. Such externs allow BPF code to
"know" a kernel symbol's address and either use it for comparisons with kernel
data structures (e.g., struct file's f_op pointer, to distinguish different
kinds of file), or, with the help of bpf_probe_read_kernel(), to follow
pointers and read data from global variables. Kernel symbol addresses are
found through /proc/kallsyms, which should be present in the system.

Currently, such kernel symbol variables are typeless: they have to be defined
as `extern const void <symbol>` and the only operation you can do (in C code)
with them is to take their address. Such externs should reside in a special
section '.ksyms'. The bpf_helpers.h header provides the __ksym macro for this.
Strong vs weak semantics stay the same as with Kconfig externs. If a symbol is
not found in /proc/kallsyms, this is a failure for a strong (non-weak) extern,
but weak externs default to 0.
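
For illustration, a minimal sketch of declaring and using such an extern (the
symbol name and program section are examples, not part of the patch):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  /* typeless ksym extern, resolved via /proc/kallsyms at load time */
  extern const void bpf_fentry_test1 __ksym;

  SEC("raw_tracepoint/sys_enter")
  int print_ksym_addr(void *ctx)
  {
      /* taking the address is the only valid C operation on it */
      bpf_printk("bpf_fentry_test1 is at %llx",
                 (unsigned long long)&bpf_fentry_test1);
      return 0;
  }

  char LICENSE[] SEC("license") = "GPL";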

If the same symbol is defined multiple times in /proc/kallsyms, then it is an
error if any of the associated addresses differ. In that case, the address is
ambiguous, so libbpf errs on the side of caution, rather than confusing the
user with a randomly chosen address.

In the future, once the kernel is extended with BTF information for
variables, such ksym externs will be supported in a typed version, which will
allow a BPF program to read the variable's contents directly, similarly to how
it's done for fentry/fexit input arguments.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20200619231703.738941-3-andriin@fb.com
2020-06-22 20:31:52 -07:00
Andrii Nakryiko
3b320677cd libbpf: Generalize libbpf externs support
Switch existing Kconfig externs to be just one of few possible kinds of more
generic externs. This refactoring is in preparation for ksymbol extern
support, added in the follow up patch. There are no functional changes
intended.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20200619231703.738941-2-andriin@fb.com
2020-06-22 20:31:52 -07:00
Andrii Nakryiko
15fee53503 vmtests: blacklist test using RINGBUF
The test was updated to use BPF_MAP_TYPE_RINGBUF, which is only available
starting from kernel version 5.8.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-22 17:11:04 -07:00
Andrii Nakryiko
169d35c746 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   69119673bd50b176ded34032fadd41530fb5af21
Checkpoint bpf-next commit: 1bdb6c9a1c43fdf9b83b2331dfc6229bd2e71d9b
Baseline bpf commit:        4e15507fea70c0c312d79610efa46b6853ccf8e0
Checkpoint bpf commit:      4e15507fea70c0c312d79610efa46b6853ccf8e0

Andrii Nakryiko (2):
  libbpf: Bump version to 0.1.0
  libbpf: Add a bunch of attribute getters/setters for map definitions

 src/libbpf.c   | 100 +++++++++++++++++++++++++++++++++++++++++++++----
 src/libbpf.h   |  30 +++++++++++++--
 src/libbpf.map |  17 +++++++++
 3 files changed, 137 insertions(+), 10 deletions(-)

--
2.24.1
2020-06-22 17:11:04 -07:00
Andrii Nakryiko
d8d4713476 libbpf: Add a bunch of attribute getters/setters for map definitions
Add a bunch of getters for various aspects of a BPF map. Some of these
attributes (e.g., key_size, value_size, type, etc.) are available right now in
struct bpf_map_def, but this patch adds getters allowing to fetch them
individually. The bpf_map_def approach isn't very scalable when ABI stability
requirements are taken into account. It's much easier to extend libbpf and add
support for new features when each aspect of a BPF map has a separate
getter/setter.

Getters follow the common naming convention of not explicitly having "get" in
their name: bpf_map__type() returns the map type, bpf_map__key_size() returns
the key_size. Setters, though, explicitly have "set" in their name:
bpf_map__set_type(), bpf_map__set_key_size().

This patch ensures we now have a getter and a setter for the following
map attributes:
  - type;
  - max_entries;
  - map_flags;
  - numa_node;
  - key_size;
  - value_size;
  - ifindex.

bpf_map__resize() enforces an unnecessary restriction of max_entries > 0. It
is unnecessary, because libbpf actually supports zero max_entries for some
cases (e.g., for a PERF_EVENT_ARRAY map) and treats it specially at map
creation time. To allow setting max_entries=0, a new bpf_map__set_max_entries()
setter is added. bpf_map__resize()'s behavior is preserved for backwards
compatibility reasons.
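
A user-space sketch of the new getters/setters (object and map names are
illustrative assumptions; error handling elided):

  struct bpf_object *obj = bpf_object__open_file("prog.bpf.o", NULL);
  struct bpf_map *map = bpf_object__find_map_by_name(obj, "events");

  /* setters must be called before load; on a loaded map they fail */
  bpf_map__set_max_entries(map, 0); /* 0 is allowed, unlike bpf_map__resize() */
  bpf_map__set_numa_node(map, 0);

  bpf_object__load(obj);
  printf("type=%d key=%u value=%u\n",
         bpf_map__type(map), bpf_map__key_size(map), bpf_map__value_size(map));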

A map ifindex getter is added as well. There is a setter already, but no
corresponding getter; fix this asymmetry as well. bpf_map__set_ifindex()
itself is converted from a void function into an error-returning one, similar
to other setters. The only error returned right now is -EBUSY, if the BPF map
is already loaded and has a corresponding FD.

One attribute previously lacking any ability to get/set it, or even to
specify it declaratively, is numa_node. This patch fixes this gap by adding
both a programmatic getter/setter and support for the numa_node field in
BTF-defined maps.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200621062112.3006313-1-andriin@fb.com
2020-06-22 17:11:04 -07:00
Andrii Nakryiko
ef26b4c37f libbpf: Bump version to 0.1.0
Bump libbpf version to 0.1.0, as new development cycle starts.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200617183132.1970836-1-andriin@fb.com
2020-06-22 17:11:04 -07:00
Andrii Nakryiko
d7b2934cf9 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   69119673bd50b176ded34032fadd41530fb5af21
Checkpoint bpf-next commit: 4e15507fea70c0c312d79610efa46b6853ccf8e0
Baseline bpf commit:        6903cdae9f9f08d61e49c16cbef11c293e33a615
Checkpoint bpf commit:      4e15507fea70c0c312d79610efa46b6853ccf8e0

Andrii Nakryiko (1):
  libbpf: Forward-declare bpf_stats_type for systems with outdated UAPI
    headers

 src/bpf.h | 2 ++
 1 file changed, 2 insertions(+)

--
2.24.1
2020-06-22 15:43:44 -07:00
Andrii Nakryiko
c83d2166e8 libbpf: Forward-declare bpf_stats_type for systems with outdated UAPI headers
On systems that don't yet have the very latest linux/bpf.h header, enum
bpf_stats_type will be undefined, causing compilation warnings. Prevent this
by forward-declaring the enum.
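
The gist of the change, as described (such an enum forward declaration is
accepted by GCC and Clang as an extension):

  enum bpf_stats_type; /* defined in up-to-date versions of linux/bpf.h */

  LIBBPF_API int bpf_enable_stats(enum bpf_stats_type type);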

Fixes: 0bee106716cf ("libbpf: Add support for command BPF_ENABLE_STATS")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200621031159.2279101-1-andriin@fb.com
2020-06-22 15:43:44 -07:00
Andrii Nakryiko
fb27968bf1 vmtests: blacklist 5.5 test and temporary blacklist core_reloc test
Permanently blacklist load_bytes_relative test on 5.5 due to missing
functionality.

Also temporarily blacklist core_reloc test due to failure on latest kernel.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-17 11:48:22 -07:00
Andrii Nakryiko
d6ae406429 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   cb8e59cc87201af93dfbb6c3dccc8fcad72a09c2
Checkpoint bpf-next commit: 69119673bd50b176ded34032fadd41530fb5af21
Baseline bpf commit:        47f6bc4ce1ff70d7ba0924c2f1c218c96cd585fb
Checkpoint bpf commit:      6903cdae9f9f08d61e49c16cbef11c293e33a615

Andrii Nakryiko (2):
  libbpf: Support pre-initializing .bss global variables
  bpf: Fix definition of bpf_ringbuf_output() helper in UAPI comments

 include/uapi/linux/bpf.h | 2 +-
 src/libbpf.c             | 4 ----
 2 files changed, 1 insertion(+), 5 deletions(-)

--
2.24.1
2020-06-17 11:48:22 -07:00
Andrii Nakryiko
cb174c5b8d sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-06-17 11:48:22 -07:00
Andrii Nakryiko
17f747ed38 bpf: Fix definition of bpf_ringbuf_output() helper in UAPI comments
Fix definition of bpf_ringbuf_output() in UAPI header comments, which is used
to generate libbpf's bpf_helper_defs.h header. Return value is a number (error
code), not a pointer.

Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200615214926.3638836-1-andriin@fb.com
2020-06-17 11:48:22 -07:00
Andrii Nakryiko
bf34234885 libbpf: Support pre-initializing .bss global variables
Remove an invalid assumption in libbpf that the .bss map doesn't have to be
updated in the kernel. With the addition of skeletons and the memory-mapped
initialization image, .bss doesn't have to be all zeroes when the BPF map is
created, because user code might have initialized those variables from
user-space.
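
For example, with a skeleton, pre-load initialization along these lines
(skeleton and variable names are hypothetical) now works as expected:

  struct prog_bpf *skel = prog_bpf__open();

  /* a global variable living in the BPF program's .bss */
  skel->bss->debug_level = 2;

  /* the .bss image, including debug_level, is now synced to the kernel */
  prog_bpf__load(skel);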

Fixes: eba9c5f498a1 ("libbpf: Refactor global data map initialization")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200612194504.557844-1-andriin@fb.com
2020-06-17 11:48:22 -07:00
Andrii Nakryiko
46c272f9b4 sync: don't check and warn about non-empty merges anymore
Initial versions of the sync script couldn't handle non-empty merges. But
since then, the script has become smarter, more interactive, and thus more
powerful, and can handle some complicated situations easily on its own, while
falling back to human intervention for even more complicated ones. This
non-empty merge check has outlived its purpose and is just an annoying bump
in the sync process. Drop it.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-10 13:59:07 -07:00
Andrii Nakryiko
40e69c9538 vmtests: un-blacklist ringbuf and cls_redirect selftests
Both tests should be fixed now.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-10 13:58:45 -07:00
Andrii Nakryiko
a975d8ea28 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9bc499befeef07a4d79f4924bfca05634ad8fc97
Checkpoint bpf-next commit: cb8e59cc87201af93dfbb6c3dccc8fcad72a09c2
Baseline bpf commit:        bdc48fa11e46f867ea4d75fa59ee87a7f48be144
Checkpoint bpf commit:      47f6bc4ce1ff70d7ba0924c2f1c218c96cd585fb

Andrii Nakryiko (1):
  libbpf: Handle GCC noreturn-turned-volatile quirk

Arnaldo Carvalho de Melo (1):
  libbpf: Define __WORDSIZE if not available

Jesper Dangaard Brouer (1):
  bpf: Selftests and tools use struct bpf_devmap_val from uapi

 include/uapi/linux/bpf.h | 13 +++++++++++++
 src/btf_dump.c           | 33 ++++++++++++++++++++++++---------
 src/hashmap.h            |  7 +++----
 3 files changed, 40 insertions(+), 13 deletions(-)

--
2.24.1
2020-06-10 13:58:45 -07:00
Andrii Nakryiko
45f7113925 libbpf: Handle GCC noreturn-turned-volatile quirk
Handle a GCC quirk of emitting an extra volatile modifier in DWARF (which is
subsequently preserved in BTF by pahole) for function pointers marked as
__attribute__((noreturn)). This was the way to mark such functions before GCC
2.5 added the noreturn attribute. Drop such func_proto modifiers, similarly to
how it's done for arrays (also to handle a GCC quirk/bug).

Such a volatile attribute is emitted by GCC only, so existing selftests can't
express such a test. A simple repro is like this (compiled with GCC + BTF
generated by pahole):

  struct my_struct {
      void __attribute__((noreturn)) (*fn)(int);
  };
  struct my_struct a;

Without this fix, the output will be:

struct my_struct {
    voidvolatile  (*fn)(int);
};

With the fix:

struct my_struct {
    void (*fn)(int);
};

Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Reported-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/bpf/20200610052335.2862559-1-andriin@fb.com
2020-06-10 13:58:45 -07:00
Arnaldo Carvalho de Melo
6816734203 libbpf: Define __WORDSIZE if not available
Some systems, such as Android, don't have a define for __WORDSIZE; define it
in terms of __SIZEOF_LONG__, as done in perf since 2012:

   http://git.kernel.org/torvalds/c/3f34f6c0233ae055b5

For reference: https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html
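
The fix amounts to a guard along these lines (a sketch based on the
description; a long has 8 * __SIZEOF_LONG__ bits):

  #ifndef __WORDSIZE
  #define __WORDSIZE (__SIZEOF_LONG__ * 8)
  #endif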

I build tested it here and Andrii did some Travis CI build tests too.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200608161150.GA3073@kernel.org
2020-06-10 13:58:45 -07:00
Jesper Dangaard Brouer
11d2a59689 bpf: Selftests and tools use struct bpf_devmap_val from uapi
Sync tools uapi bpf.h header file and update selftests that use
struct bpf_devmap_val.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/159170951195.2102545.1833108712124273987.stgit@firesoul
2020-06-10 13:58:45 -07:00
Andrii Nakryiko
8c7527ea88 travis-ci: fix travis_terminate invocation
travis_terminate expects an integer argument for the exit code. Add it.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-10 12:12:01 -07:00
Toke Høiland-Jørgensen
c569e03985 README: Add BTF and Clang information for Arch Linux
Arch recently added BTF to their distribution kernels - see
https://bugs.archlinux.org/task/66260

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
2020-06-08 09:33:59 -07:00
Andrii Nakryiko
1862741fb0 vmtest: disable ringbuf test on latest for now
ringbuf selftest is flaky, disable it for now.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-04 10:48:08 -07:00
Andrii Nakryiko
6a269cf458 README: add OpenSUSE BTF availability info
Add a note about OpenSUSE Tumbleweed and BTF.
2020-06-04 10:42:40 -07:00
Andrii Nakryiko
6e15a022db README: add BTF and CO-RE info
Add a list of Linux distributions with kernel BTF built in.
Give a few useful links to BPF CO-RE-related material to help users get
started.
2020-06-03 11:26:00 -07:00
Andrii Nakryiko
20d9816471 vmtest: temporary blacklist changes to make CI green
Coarse-grained blacklisting until test_progs blacklisting w/ subtests works
better.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-02 18:09:36 -07:00
Andrii Nakryiko
538b3f4ce7 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9a25c1df24a6fea9dc79eec950453c4e00f707fd
Checkpoint bpf-next commit: 9bc499befeef07a4d79f4924bfca05634ad8fc97
Baseline bpf commit:        bdc48fa11e46f867ea4d75fa59ee87a7f48be144
Checkpoint bpf commit:      bdc48fa11e46f867ea4d75fa59ee87a7f48be144

Daniel Borkmann (2):
  bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
  bpf: Add csum_level helper for fixing up csum levels

 include/uapi/linux/bpf.h | 51 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

--
2.24.1
2020-06-02 18:09:36 -07:00
Andrii Nakryiko
f2610ca9cf sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-06-02 18:09:36 -07:00
Daniel Borkmann
adb5dd203c bpf: Add csum_level helper for fixing up csum levels
Add a bpf_csum_level() helper which BPF programs can use in combination
with bpf_skb_adjust_room() when they pass in the BPF_F_ADJ_ROOM_NO_CSUM_RESET
flag to the latter to avoid falling back to CHECKSUM_NONE.

bpf_csum_level() allows adjusting the CHECKSUM_UNNECESSARY skb->csum_level
via BPF_CSUM_LEVEL_{INC,DEC}, which calls __skb_{incr,decr}_checksum_unnecessary()
on the skb. The helper also allows BPF_CSUM_LEVEL_RESET, which sets the skb's
csum to CHECKSUM_NONE, as well as BPF_CSUM_LEVEL_QUERY to just return the
current level. Without this helper, there is no way to otherwise adjust
skb->csum_level. I did not add extra dummy flags as there is plenty of free
bit space in the level argument itself if ever needed in the future.
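
A sketch of the intended usage in a decapsulating TC program (ENCAP_LEN and
the program body are illustrative assumptions, not from the patch):

  #include <linux/bpf.h>
  #include <linux/pkt_cls.h>
  #include <bpf/bpf_helpers.h>

  #define ENCAP_LEN 36 /* assumed size of the outer IP + UDP + GUE headers */

  SEC("classifier")
  int decap(struct __sk_buff *skb)
  {
      /* shrink at the MAC layer, but opt out of the csum reset */
      if (bpf_skb_adjust_room(skb, -ENCAP_LEN, BPF_ADJ_ROOM_MAC,
                              BPF_F_ADJ_ROOM_FIXED_GSO |
                              BPF_F_ADJ_ROOM_NO_CSUM_RESET))
          return TC_ACT_SHOT;

      /* one encapsulation layer was removed, so decrement csum_level */
      if (bpf_csum_level(skb, BPF_CSUM_LEVEL_DEC))
          return TC_ACT_SHOT;

      return bpf_redirect(skb->ifindex, BPF_F_INGRESS);
  }

  char LICENSE[] SEC("license") = "GPL";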

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Lorenz Bauer <lmb@cloudflare.com>
Link: https://lore.kernel.org/bpf/279ae3717cb3d03c0ffeb511493c93c450a01e1a.1591108731.git.daniel@iogearbox.net
2020-06-02 18:09:36 -07:00
Daniel Borkmann
3aadd91e97 bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
Lorenz recently reported:

  In our TC classifier cls_redirect [0], we use the following sequence of
  helper calls to decapsulate a GUE (basically IP + UDP + custom header)
  encapsulated packet:

    bpf_skb_adjust_room(skb, -encap_len, BPF_ADJ_ROOM_MAC, BPF_F_ADJ_ROOM_FIXED_GSO)
    bpf_redirect(skb->ifindex, BPF_F_INGRESS)

  It seems like some checksums of the inner headers are not validated in
  this case. For example, a TCP SYN packet with invalid TCP checksum is
  still accepted by the network stack and elicits a SYN ACK. [...]

  That is, we receive the following packet from the driver:

    | ETH | IP | UDP | GUE | IP | TCP |
    skb->ip_summed == CHECKSUM_UNNECESSARY

  ip_summed is CHECKSUM_UNNECESSARY because our NICs do rx checksum offloading.
  On this packet we run skb_adjust_room_mac(-encap_len), and get the following:

    | ETH | IP | TCP |
    skb->ip_summed == CHECKSUM_UNNECESSARY

  Note that ip_summed is still CHECKSUM_UNNECESSARY. After bpf_redirect()'ing
  into the ingress, we end up in tcp_v4_rcv(). There, skb_checksum_init() is
  turned into a no-op due to CHECKSUM_UNNECESSARY.

The bpf_skb_adjust_room() helper is not aware of protocol specifics. Internally,
it handles the CHECKSUM_COMPLETE case via skb_postpull_rcsum(), but that does
not cover CHECKSUM_UNNECESSARY. In this case skb->csum_level of the original
skb prior to bpf_skb_adjust_room() call was 0, that is, covering UDP. Right now
there is no way to adjust the skb->csum_level. NICs that have checksum offload
disabled (CHECKSUM_NONE) or that support CHECKSUM_COMPLETE are not affected.

Use a safe default for CHECKSUM_UNNECESSARY by resetting to CHECKSUM_NONE and
add a flag to the helper called BPF_F_ADJ_ROOM_NO_CSUM_RESET that allows users
to opt out. Opting out is useful for the case where we don't remove/add full
protocol headers, or for the case where a user wants to adjust the csum level
manually, e.g., through the bpf_csum_level() helper that is added in a
subsequent patch.

The bpf_skb_proto_{4_to_6,6_to_4}() pair for NAT64/46 translation from the
BPF bpf_skb_change_proto() helper uses bpf_skb_net_hdr_{push,pop}() internally
as well, but doesn't change layers, only transitions between v4 and v6,
therefore no adaptation is required there.

  [0] https://lore.kernel.org/bpf/20200424185556.7358-1-lmb@cloudflare.com/

Fixes: 2be7e212d541 ("bpf: add bpf_skb_adjust_room helper")
Reported-by: Lorenz Bauer <lmb@cloudflare.com>
Reported-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/CACAyw9-uU_52esMd1JjuA80fRPHJv5vsSg8GnfW3t_qDU4aVKQ@mail.gmail.com/
Link: https://lore.kernel.org/bpf/11a90472e7cce83e76ddbfce81fdfce7bfc68808.1591108731.git.daniel@iogearbox.net
2020-06-02 18:09:36 -07:00
Andrii Nakryiko
1206ab0e75 vmtest: optionally adjust selftest files depending on kernel version
Some selftests can't be compiled on older kernels. This allows fixing these
problems, if necessary.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
70eac9941d Makefile: add ringbuf.o to the list of object files
Add the newly added ringbuf.o to the list of OBJS.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
2fdbf42f98 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   dda18a5c0b75461d1ed228f80b59c67434b8d601
Checkpoint bpf-next commit: 9a25c1df24a6fea9dc79eec950453c4e00f707fd
Baseline bpf commit:        f85c1598ddfe83f61d0656bd1d2025fa3b148b99
Checkpoint bpf commit:      bdc48fa11e46f867ea4d75fa59ee87a7f48be144

Alexei Starovoitov (1):
  tools/bpf: sync bpf.h

Andrii Nakryiko (3):
  bpf: Implement BPF ring buffer and verifier support for it
  libbpf: Add BPF ring buffer support
  libbpf: Add _GNU_SOURCE for reallocarray to ringbuf.c

David Ahern (3):
  bpf: Add support to attach bpf program to a devmap entry
  xdp: Add xdp_txq_info to xdp_buff
  libbpf: Add SEC name for xdp programs attached to device map

Eelco Chaudron (2):
  libbpf: Add API to consume the perf ring buffer content
  libbpf: Fix perf_buffer__free() API for sparse allocs

Jakub Sitnicki (2):
  bpf: Add link-based BPF program attachment to network namespace
  libbpf: Add support for bpf_link-based netns attachment

John Fastabend (1):
  bpf, sk_msg: Add get socket storage helpers

 include/uapi/linux/bpf.h |  95 ++++++++++++-
 src/libbpf.c             |  49 ++++++-
 src/libbpf.h             |  24 ++++
 src/libbpf.map           |   7 +
 src/libbpf_probes.c      |   5 +
 src/ringbuf.c            | 288 +++++++++++++++++++++++++++++++++++++++
 6 files changed, 461 insertions(+), 7 deletions(-)
 create mode 100644 src/ringbuf.c

--
2.24.1
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
365e4805a1 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-06-01 22:22:32 -07:00
Jakub Sitnicki
890f25520a libbpf: Add support for bpf_link-based netns attachment
Add bpf_program__attach_netns(), which uses the LINK_CREATE subcommand to
create an FD-based kernel bpf_link, for attach types tied to a network
namespace, that is, BPF_FLOW_DISSECTOR for the moment.
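
A user-space sketch of the new API (prog is assumed to be a loaded flow
dissector program; error handling mostly elided):

  int netns_fd = open("/proc/self/ns/net", O_RDONLY);
  struct bpf_link *link = bpf_program__attach_netns(prog, netns_fd);

  if (libbpf_get_error(link))
      return -1;  /* e.g., a prog may already be attached directly */

  /* ... later ... */
  bpf_link__destroy(link);
  close(netns_fd);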

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200531082846.2117903-7-jakub@cloudflare.com
2020-06-01 22:22:32 -07:00
Jakub Sitnicki
fbdee96fa1 bpf: Add link-based BPF program attachment to network namespace
Extend bpf() syscall subcommands that operate on bpf_link, that is
LINK_CREATE, LINK_UPDATE, OBJ_GET_INFO, to accept attach types tied to
network namespaces (only flow dissector at the moment).

Link-based and prog-based attachment can be used interchangeably, but only
one can exist at a time. Attempts to attach a link when a prog is already
attached directly, and the other way around, will be met with -EEXIST.
Attempts to detach a program when link exists result in -EINVAL.

Attachment of multiple links of the same attach type to one netns is not
supported, with the intention to lift the restriction when a use-case
presents itself. Because of that, link create returns -E2BIG when trying to
create another netns link when one already exists.

Link-based attachments to netns don't keep a netns alive by holding a ref
to it. Instead links get auto-detached from netns when the latter is being
destroyed, using a pernet pre_exit callback.

When auto-detached, the link lives in a defunct state as long as there are
open FDs for it. -ENOLINK is returned if a user tries to update a defunct
link.

Because bpf_link to netns doesn't hold a ref to struct net, special care is
taken when releasing, updating, or filling link info. The netns might be
getting torn down when any of these link operations are in progress. That
is why auto-detach and update/release/fill_info are synchronized by the
same mutex. Also, link ops have to always check if auto-detach has not
happened yet and if netns is still alive (refcnt > 0).

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200531082846.2117903-5-jakub@cloudflare.com
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
f54c56be0d libbpf: Add _GNU_SOURCE for reallocarray to ringbuf.c
On systems with a recent enough glibc, the reallocarray compat won't kick in,
so reallocarray() itself has to come from the stdlib.h include. But
_GNU_SOURCE is necessary to enable it. So add it.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200601202601.2139477-1-andriin@fb.com
2020-06-01 22:22:32 -07:00
Alexei Starovoitov
8dc4b38871 tools/bpf: sync bpf.h
Sync bpf.h into tool/include/uapi/

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
David Ahern
ed023acd35 libbpf: Add SEC name for xdp programs attached to device map
Support SEC("xdp_devmap*") as a shortcut for loading the program with
type BPF_PROG_TYPE_XDP and expected attach type BPF_XDP_DEVMAP.
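
A sketch of what such a program might look like (the section suffix and the
program body are illustrative assumptions):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  SEC("xdp_devmap/egress")
  int xdp_devmap_prog(struct xdp_md *ctx)
  {
      /* runs on XDP_REDIRECT into a devmap entry; egress_ifindex is
       * readable thanks to the BPF_XDP_DEVMAP expected attach type
       */
      if (ctx->egress_ifindex == 0)
          return XDP_DROP;
      return XDP_PASS;
  }

  char LICENSE[] SEC("license") = "GPL";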

Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200529220716.75383-5-dsahern@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
David Ahern
ff3116bfcb xdp: Add xdp_txq_info to xdp_buff
Add xdp_txq_info as the Tx counterpart to xdp_rxq_info. At the
moment only the device is added. Other fields (queue_index)
can be added as use cases arise.

From a UAPI perspective, add egress_ifindex to xdp context for
bpf programs to see the Tx device.

Update the verifier to only allow accesses to egress_ifindex by
XDP programs with BPF_XDP_DEVMAP expected attach type.

Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200529220716.75383-4-dsahern@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
David Ahern
65f4b3ba4c bpf: Add support to attach bpf program to a devmap entry
Add BPF_XDP_DEVMAP attach type for use with programs associated with a
DEVMAP entry.

Allow DEVMAPs to associate a program with a device entry by adding
a bpf_prog.fd to 'struct bpf_devmap_val'. Values read show the program
id, so the fd and id are a union. bpf programs can get access to the
struct via vmlinux.h.

The program associated with the fd must have type XDP with expected
attach type BPF_XDP_DEVMAP. When a program is associated with a device
index, the program is run on an XDP_REDIRECT and before the buffer is
added to the per-cpu queue. At this point rxq data is still valid; the
next patch adds tx device information allowing the program to see both
ingress and egress device indices.

XDP generic is skb based and XDP programs do not work with skb's. Block
the use case by walking maps used by a program that is to be attached
via xdpgeneric and fail if any of them are DEVMAP / DEVMAP_HASH maps
with programs in them.

Block attach of BPF_XDP_DEVMAP programs to devices.

Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200529220716.75383-3-dsahern@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
e1bf7a787e libbpf: Add BPF ring buffer support
Declaring and instantiating a BPF ring buffer doesn't require any changes to
libbpf, as it's just another type of map. So using the existing BTF-defined
maps syntax with __uint(type, BPF_MAP_TYPE_RINGBUF) and __uint(max_entries,
<size-of-ring-buf>) is all that's necessary to create and use a BPF ring
buffer.

This patch adds a BPF ring buffer consumer to libbpf. It is very similar to
the perf_buffer implementation in terms of API, but also attempts to fix some
minor problems and inconveniences with the existing perf_buffer API.

ring_buffer supports both the single ring buffer use case (just using
ring_buffer__new()), as well as adding more ring buffers, each with its own
callback and context. This allows efficiently polling and consuming multiple,
potentially completely independent, ring buffers, using a single epoll
instance.

The latter is actually a problem in practice for applications that use
multiple sets of perf buffers. They have to create multiple instances of
struct perf_buffer and poll them independently or in a loop, each approach
having its own problems (e.g., the inability to use a common poll timeout).
struct ring_buffer eliminates this problem by aggregating many independent
ring buffer instances under a single "ring buffer manager".

Second, perf_buffer's callback can't return an error, so applications that
need to stop polling due to an error in the data, or the data signalling the
end, have to use extra mechanisms to signal that polling has to stop.
ring_buffer's callback can return an error, which will be passed through back
to user code and can be acted upon appropriately.

Two APIs allow consuming ring buffer data:
  - ring_buffer__poll(), which will wait for a data availability notification
    and will consume data only from the reported ring buffer(s); this API
    allows using resources efficiently by reading data only when it becomes
    available;
  - ring_buffer__consume(), which will attempt to read new records regardless
    of the data availability notification sub-system. This API is useful for
    cases when the lowest latency is required, at the expense of burning CPU
    resources.
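
A minimal user-space sketch (map_fd and the callback are assumptions for
illustration; error handling elided):

  static int handle_event(void *ctx, void *data, size_t size)
  {
      /* a negative return value stops polling and is propagated back */
      return 0;
  }

  struct ring_buffer *rb = ring_buffer__new(map_fd, handle_event, NULL, NULL);

  while (!exiting)
      ring_buffer__poll(rb, 100 /* timeout, ms */);

  ring_buffer__free(rb);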

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200529075424.3139988-3-andriin@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
17a6d61898 bpf: Implement BPF ring buffer and verifier support for it
This commit adds a new MPSC ring buffer implementation to the BPF ecosystem,
which allows multiple CPUs to submit data to a single shared ring buffer. On
the consumption side, only a single consumer is assumed.

Motivation
----------
There are two distinct motivators for this work that are not satisfied by the
existing perf buffer and that prompted the creation of a new ring buffer
implementation:
  - more efficient memory utilization by sharing the ring buffer across CPUs;
  - preserving the ordering of events that happen sequentially in time, even
    across multiple CPUs (e.g., fork/exec/exit events for a task).

These two problems are independent, but perf buffer fails to satisfy both.
Both are a result of the choice to have a per-CPU perf ring buffer. Both can
also be solved by having an MPSC ring buffer implementation. The ordering
problem could technically be solved for perf buffer with some in-kernel
counting, but given that the first one requires an MPSC buffer, the same
solution would solve the second problem automatically.

Semantics and APIs
------------------
A single ring buffer is presented to BPF programs as an instance of a BPF map
of type BPF_MAP_TYPE_RINGBUF. Two other alternatives were considered, but
ultimately rejected.

One way would be, similar to BPF_MAP_TYPE_PERF_EVENT_ARRAY, to make
BPF_MAP_TYPE_RINGBUF represent an array of ring buffers, but not enforce the
"same CPU only" rule. This would be a more familiar interface compatible with
existing perf buffer use in BPF, but would fail if an application needed more
advanced logic to look up a ring buffer by an arbitrary key. HASH_OF_MAPS
addresses this with the current approach. Additionally, given the performance
of BPF ringbuf, many use cases would just opt into a simple single ring
buffer shared among all CPUs, for which the current approach would be
overkill.

Another approach could introduce a new concept, alongside BPF map, to
represent a generic "container" object, which doesn't necessarily have a
key/value interface with lookup/update/delete operations. This approach would
add a lot of extra infrastructure that has to be built for observability and
verifier support. It would also add another concept that BPF developers would
have to familiarize themselves with, new syntax in libbpf, etc. But then it
would really provide no additional benefits over the approach of using a map.
BPF_MAP_TYPE_RINGBUF doesn't support lookup/update/delete operations, but
neither do a few other map types (e.g., queue and stack; array doesn't
support delete, etc.).

The approach chosen has the advantage of re-using existing BPF map
infrastructure (introspection APIs in the kernel, libbpf support, etc.),
being a familiar concept (no need to teach users a new type of object in a
BPF program), and utilizing existing tooling (bpftool). For the common
scenario of using a single ring buffer for all CPUs, it's as simple and
straightforward as it would be with a dedicated "container" object. On the
other hand, by being a map, it can be combined with ARRAY_OF_MAPS and
HASH_OF_MAPS map-in-maps to implement a wide variety of topologies, from one
ring buffer for each CPU (e.g., as a replacement for perf buffer use cases),
to complicated application-level hashing/sharding of ring buffers (e.g.,
having a small pool of ring buffers with a hashed task's tgid being the
lookup key, to preserve order but reduce contention).

Key and value sizes are enforced to be zero. max_entries is used to specify
the size of the ring buffer and has to be a power-of-2 value.

There are a bunch of similarities between perf buffer
(BPF_MAP_TYPE_PERF_EVENT_ARRAY) and new BPF ring buffer semantics:
  - variable-length records;
  - if there is no more space left in ring buffer, reservation fails, no
    blocking;
  - memory-mappable data area for user-space applications for ease of
    consumption and high performance;
  - epoll notifications for new incoming data;
  - but still the ability to do busy polling for new data to achieve the
    lowest latency, if necessary.

BPF ringbuf provides two sets of APIs to BPF programs:
  - bpf_ringbuf_output() allows to *copy* data from one place to the ring
    buffer, similarly to bpf_perf_event_output();
  - the bpf_ringbuf_reserve()/bpf_ringbuf_commit()/bpf_ringbuf_discard() APIs
    split the whole process into two steps. First, a fixed amount of space is
    reserved. If successful, a pointer to data inside the ring buffer data
    area is returned, which BPF programs can use similarly to data inside
    array/hash maps. Once ready, this piece of memory is either committed or
    discarded. Discard is similar to commit, but makes the consumer ignore
    the record.

bpf_ringbuf_output() has the disadvantage of incurring an extra memory copy,
because the record has to be prepared in some other place first. But it
allows submitting records of a length that isn't known to the verifier
beforehand. It also closely matches bpf_perf_event_output(), so it will
simplify migration significantly.

bpf_ringbuf_reserve() avoids the extra copy of memory by providing a memory
pointer directly into the ring buffer memory. In a lot of cases records are
larger than BPF stack space allows, so many programs have to use an extra
per-CPU array as a temporary heap for preparing the sample.
bpf_ringbuf_reserve() avoids this need completely. But in exchange, it only
allows a known, constant size of memory to be reserved, so that the verifier
can verify that the BPF program can't access memory outside its reserved
record space. bpf_ringbuf_output(), while slightly slower due to the extra
memory copy, covers some use cases that are not suitable for
bpf_ringbuf_reserve().
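
A sketch of the reserve/commit flow on the BPF side (the commit step is the
bpf_ringbuf_submit() helper; the event layout and section name are
assumptions):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  struct {
      __uint(type, BPF_MAP_TYPE_RINGBUF);
      __uint(max_entries, 256 * 1024);  /* size in bytes, power of 2 */
  } rb SEC(".maps");

  struct event { int pid; };

  SEC("tracepoint/sched/sched_process_exit")
  int handle_exit(void *ctx)
  {
      /* reserve a fixed-size record; returns NULL if the buffer is full */
      struct event *e = bpf_ringbuf_reserve(&rb, sizeof(*e), 0);

      if (!e)
          return 0;
      e->pid = bpf_get_current_pid_tgid() >> 32;
      bpf_ringbuf_submit(e, 0);  /* commit: record becomes consumable */
      return 0;
  }

  char LICENSE[] SEC("license") = "GPL";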

The difference between commit and discard is very small. Discard just marks
a record as discarded, and such records are supposed to be ignored by
consumer code. Discard is useful for some advanced use cases, such as
ensuring all-or-nothing multi-record submission, or emulating temporary
malloc()/free() within a single BPF program invocation.

Each reserved record is tracked by verifier through existing
reference-tracking logic, similar to socket ref-tracking. It is thus
impossible to reserve a record, but forget to submit (or discard) it.

The bpf_ringbuf_query() helper allows querying various properties of the ring
buffer. Currently 4 are supported:
  - BPF_RB_AVAIL_DATA returns the amount of unconsumed data in the ring
    buffer;
  - BPF_RB_RING_SIZE returns the size of the ring buffer;
  - BPF_RB_CONS_POS/BPF_RB_PROD_POS return the current logical position of
    the consumer/producer, respectively.
The returned values are momentary snapshots of the ring buffer state and
could be off by the time the helper returns, so this should be used only for
debugging/reporting reasons or for implementing various heuristics that take
into account the highly-changeable nature of some of those characteristics.

One such heuristic might involve more fine-grained control over poll/epoll
notifications about new data availability in the ring buffer. Together with
the BPF_RB_NO_WAKEUP/BPF_RB_FORCE_WAKEUP flags for the output/commit/discard
helpers, it allows a BPF program a high degree of control and, e.g., more
efficient batched notifications. The default self-balancing strategy, though,
should be adequate for most applications and will work reliably and
efficiently already.

Design and implementation
-------------------------
This reserve/commit schema allows a natural way for multiple producers,
either on different CPUs or even on the same CPU/in the same BPF program, to
reserve independent records and work with them without blocking other
producers. This means that if a BPF program was interrupted by another BPF
program sharing the same ring buffer, they will both get a record reserved
(provided there is enough space left) and can work with it and submit it
independently. This applies to NMI context as well, except that due to the
use of a spinlock during reservation, in NMI context bpf_ringbuf_reserve()
might fail to get the lock, in which case the reservation will fail even if
the ring buffer is not full.

The ring buffer itself internally is implemented as a power-of-2 sized
circular buffer, with two logical and ever-increasing counters (which might
wrap around on 32-bit architectures, that's not a problem):
  - consumer counter shows up to which logical position consumer consumed the
    data;
  - producer counter denotes amount of data reserved by all producers.

Each time a record is reserved, the producer that "owns" the record will
successfully advance the producer counter. At that point, the data is still
not yet ready to be consumed, though. Each record has an 8-byte header, which
contains the length of the reserved record, as well as two extra bits: a busy
bit to denote that the record is still being worked on, and a discard bit,
which might be set at commit time if the record is discarded. In the latter
case, the consumer is supposed to skip the record and move on to the next
one. The record header also encodes the record's relative offset from the
beginning of the ring buffer data area (in pages). This allows
bpf_ringbuf_commit()/bpf_ringbuf_discard() to accept only the pointer to the
record itself, without also requiring a pointer to the ring buffer. The ring
buffer memory location will be restored from the record metadata header. This
significantly simplifies the verifier, as well as improving API usability.

Producer counter increments are serialized under the spinlock, so there is
a strict ordering between reservations. Commits, on the other hand, are
completely lockless and independent. All records become available to the
consumer in the order of reservations, but only after all previous records
were already committed. It is thus possible for slow producers to temporarily
hold off submitted records that were reserved later.

Reservation/commit/consumer protocol is verified by litmus tests in
Documentation/litmus-test/bpf-rb.

One interesting implementation bit that significantly simplifies (and thus
speeds up as well) the implementation of both producers and consumers is how
the data area is mapped twice contiguously back-to-back in virtual memory.
This allows not taking any special measures for samples that have to wrap
around at the end of the circular buffer data area, because the next page
after the last data page would be the first data page again, and thus the
sample will still appear completely contiguous in virtual memory. See the
comment and a simple ASCII diagram showing this visually in
bpf_ringbuf_area_alloc().

Another feature that distinguishes BPF ringbuf from perf ring buffer is
self-pacing notifications of new data being available. The
bpf_ringbuf_commit() implementation will send a notification of a new record
being available after commit only if the consumer has already caught up right
up to the record being committed. If not, the consumer still has to catch up
and thus will see the new data anyway, without needing an extra poll
notification. Benchmarks (see tools/testing/selftests/bpf/benchs/bench_ringbuf.c)
show that this allows achieving a very high throughput without having to
resort to tricks like "notify only every Nth sample", which are necessary
with perf buffer. For extreme cases, when a BPF program wants more manual
control of notifications, the commit/discard/output helpers accept
BPF_RB_NO_WAKEUP and BPF_RB_FORCE_WAKEUP flags, which give full control over
notifications of data availability, but require extra caution and diligence
in using this API.

Comparison to alternatives
--------------------------
Before considering implementing a BPF ring buffer from scratch, existing
alternatives in the kernel were evaluated, but they didn't seem to meet the
needs. They largely fell into a few categories:
  - per-CPU buffers (perf, ftrace, etc), which don't satisfy two motivations
    outlined above (ordering and memory consumption);
  - linked list-based implementations; while some were multi-producer designs,
    consuming these from user-space would be very complicated and most
    probably not performant; memory-mapping contiguous piece of memory is
    simpler and more performant for user-space consumers;
  - io_uring is SPSC, but also requires fixed-sized elements. Naively turning
    an SPSC queue into MPSC w/ a lock would have subpar performance compared
    to locked reserve + lockless commit, as with the BPF ring buffer.
    Fixed-sized elements would be too limiting for BPF programs, given that
    existing BPF programs already rely heavily on the variable-sized perf
    buffer;
  - specialized implementations (like a new printk ring buffer, [0]) with lots
    of printk-specific limitations and implications, that didn't seem to fit
    well for intended use with BPF programs.

  [0] https://lwn.net/Articles/779550/

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200529075424.3139988-2-andriin@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
Eelco Chaudron
ff2322b879 libbpf: Fix perf_buffer__free() API for sparse allocs
In case the cpu_bufs are sparsely allocated, they are not all
freed. These changes fix this.

Fixes: fb84b8224655 ("libbpf: add perf buffer API")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159056888305.330763.9684536967379110349.stgit@ebuild
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
John Fastabend
ab1b4f3844 bpf, sk_msg: Add get socket storage helpers
Add helpers to use local socket storage.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/159033907577.12355.14740125020572756560.stgit@john-Precision-5820-Tower
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
Eelco Chaudron
df9a526f99 libbpf: Add API to consume the perf ring buffer content
This new API, perf_buffer__consume(), can be used as follows:

- When you have a perf ring where wakeup_events is higher than 1,
  and you have remaining data in the rings that you would like to pull
  out on exit (or maybe based on a timeout).

- For low-latency cases where you burn a CPU that constantly polls
  the queues.
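
A sketch of the drain-on-exit usage (pb is an already-created struct
perf_buffer *; error handling elided):

  /* normal event loop */
  while (!exiting)
      perf_buffer__poll(pb, 100 /* timeout, ms */);

  /* pull out whatever is still sitting in the rings before exiting */
  perf_buffer__consume(pb);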

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159048487929.89441.7465713173442594608.stgit@ebuild
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 22:22:32 -07:00
Andrii Nakryiko
3b23942542 ci: blacklist bpf_iter tests
Disable a bunch of new kernel selftests that can't succeed on a 5.5 kernel.
Flatten Travis tests into a single stage to parallelize and speed them up.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-05-20 01:00:06 -07:00
Andrii Nakryiko
90941cde5f sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   c321022244708aec4675de4f032ef1ba9ff0c640
Checkpoint bpf-next commit: dda18a5c0b75461d1ed228f80b59c67434b8d601
Baseline bpf commit:        7f645462ca01d01abb94d75e6768c8b3ed3a188b
Checkpoint bpf commit:      f85c1598ddfe83f61d0656bd1d2025fa3b148b99

Alexei Starovoitov (1):
  tools/bpf: sync bpf.h

Andrey Ignatov (2):
  bpf: Support narrow loads from bpf_sock_addr.user_port
  bpf: Introduce bpf_sk_{, ancestor_}cgroup_id helpers

Daniel Borkmann (2):
  bpf: Add get{peer, sock}name attach types for sock_addr
  bpf, libbpf: Enable get{peer, sock}name attach types

Eelco Chaudron (1):
  libbpf: Fix probe code to return EPERM if encountered

Gustavo A. R. Silva (1):
  bpf, libbpf: Replace zero-length array with flexible-array

Horatiu Vultur (1):
  net: bridge: Add port attribute IFLA_BRPORT_MRP_RING_OPEN

Ian Rogers (2):
  libbpf, hashmap: Remove unused #include
  libbpf, hashmap: Fix signedness warnings

Quentin Monnet (1):
  tools, bpf: Synchronise BPF UAPI header with tools

Song Liu (2):
  bpf: Sharing bpf runtime stats with BPF_ENABLE_STATS
  libbpf: Add support for command BPF_ENABLE_STATS

Stanislav Fomichev (2):
  bpf: Bpf_{g,s}etsockopt for struct bpf_sock_addr
  bpf: Allow any port in bpf_bind helper

Sumanth Korikkar (1):
  libbpf: Fix register naming in PT_REGS s390 macros

Yonghong Song (7):
  bpf: Allow loading of a bpf_iter program
  bpf: Support bpf tracing/iter programs for BPF_LINK_CREATE
  bpf: Create anonymous bpf iterator
  bpf: Add bpf_seq_printf and bpf_seq_write helpers
  tools/libbpf: Add bpf_iter support
  tools/libbpf: Add offsetof/container_of macro in bpf_helpers.h
  bpf: Change btf_iter func proto prefix to "bpf_iter_"

 include/uapi/linux/bpf.h     | 208 +++++++++++++++++++++++++++--------
 include/uapi/linux/if_link.h |   1 +
 src/bpf.c                    |  20 ++++
 src/bpf.h                    |   3 +
 src/bpf_helpers.h            |  14 +++
 src/bpf_tracing.h            |  20 +++-
 src/hashmap.c                |   5 +-
 src/hashmap.h                |   1 -
 src/libbpf.c                 |  98 +++++++++++++++--
 src/libbpf.h                 |   9 ++
 src/libbpf.map               |   3 +
 src/libbpf_internal.h        |   2 +-
 12 files changed, 322 insertions(+), 62 deletions(-)

--
2.24.1
2020-05-20 01:00:06 -07:00
Andrii Nakryiko
97a0d1e7b5 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-05-20 01:00:06 -07:00
Alexei Starovoitov
d650751a9b tools/bpf: sync bpf.h
Sync tools/include/uapi/linux/bpf.h from include/uapi.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-05-20 01:00:06 -07:00
Daniel Borkmann
dcb0c5ac44 bpf, libbpf: Enable get{peer, sock}name attach types
Trivial patch to add the new get{peer,sock}name attach types to the section
definitions in order to hook them up to sock_addr cgroup program type.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/7fcd4b1e41a8ebb364754a5975c75a7795051bd2.1589841594.git.daniel@iogearbox.net
2020-05-20 01:00:06 -07:00
Daniel Borkmann
2c892f1aa1 bpf: Add get{peer, sock}name attach types for sock_addr
As stated in 983695fa6765 ("bpf: fix unconnected udp hooks"), the objective
for the existing cgroup connect/sendmsg/recvmsg/bind BPF hooks is to be
transparent to applications. In Cilium we make use of these hooks [0] in
order to enable E-W load balancing for existing Kubernetes service types
for all Cilium managed nodes in the cluster. Those backends can be local
or remote. The main advantage of this approach is that it operates as close
as possible to the socket and therefore avoids packet-based NAT: in the
connect/sendmsg/recvmsg hooks we only need to xlate sock addresses.

This also allows exposing NodePort services on loopback addresses in the
host namespace, for example. As another advantage, it also efficiently
blocks bind requests for exposed ports from applications in the host
namespace. However, one missing item is reverse xlation for the
inet{,6}_getname() hooks, such that we can return the service IP/port
tuple back to the application instead of the remote peer address.

The vast majority of applications do not bother about getpeername(), but
on a few occasions we've seen breakage when validating the peer's address,
since it unexpectedly returns the backend tuple instead of the service one.
Therefore, this trivial patch adds a getpeername() as well as a getsockname()
BPF cgroup hook for both IPv4 and IPv6, allowing the returned addresses to
be customised and addressing this situation.

Simple example:

  # ./cilium/cilium service list
  ID   Frontend     Service Type   Backend
  1    1.2.3.4:80   ClusterIP      1 => 10.0.0.10:80

Before; curl's verbose output example, no getpeername() reverse xlation:

  # curl --verbose 1.2.3.4
  * Rebuilt URL to: 1.2.3.4/
  *   Trying 1.2.3.4...
  * TCP_NODELAY set
  * Connected to 1.2.3.4 (10.0.0.10) port 80 (#0)
  > GET / HTTP/1.1
  > Host: 1.2.3.4
  > User-Agent: curl/7.58.0
  > Accept: */*
  [...]

After; with getpeername() reverse xlation:

  # curl --verbose 1.2.3.4
  * Rebuilt URL to: 1.2.3.4/
  *   Trying 1.2.3.4...
  * TCP_NODELAY set
  * Connected to 1.2.3.4 (1.2.3.4) port 80 (#0)
  > GET / HTTP/1.1
  > Host: 1.2.3.4
  > User-Agent: curl/7.58.0
  > Accept: */*
  [...]

Originally, I had both under a BPF_CGROUP_INET{4,6}_GETNAME type and exposed
the peer to the context, similarly to the inet{,6}_getname() fashion, but
API-wise this is suboptimal as it always forces programs to test for
ctx->peer, which can easily be missed; hence the
BPF_CGROUP_INET{4,6}_GET{PEER,SOCK}NAME split. Similarly, the checked return
code is on tnum_range(1, 1), but if a use case comes up in the future, it can
easily be changed to return an error code instead. Helper and ctx member
access is the same as with connect/sendmsg/etc hooks.

  [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/61a479d759b2482ae3efb45546490bacd796a220.1589841594.git.daniel@iogearbox.net
2020-05-20 01:00:06 -07:00
Ian Rogers
46407182c7 libbpf, hashmap: Fix signedness warnings
Fixes the following warnings:

  hashmap.c: In function ‘hashmap__clear’:
  hashmap.h:150:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
    150 |  for (bkt = 0; bkt < map->cap; bkt++)        \

  hashmap.c: In function ‘hashmap_grow’:
  hashmap.h:150:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
    150 |  for (bkt = 0; bkt < map->cap; bkt++)        \

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200515165007.217120-4-irogers@google.com
2020-05-20 01:00:06 -07:00
Ian Rogers
a00d463bb9 libbpf, hashmap: Remove unused #include
Remove the unused #include of libbpf_internal.h.

Discussed in this thread:
https://lore.kernel.org/lkml/CAEf4BzZRmiEds_8R8g4vaAeWvJzPb4xYLnpF0X2VNY8oTzkphQ@mail.gmail.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200515165007.217120-3-irogers@google.com
2020-05-20 01:00:06 -07:00
Sumanth Korikkar
d8fdd1e848 libbpf: Fix register naming in PT_REGS s390 macros
Fix register naming in PT_REGS s390 macros

Fixes: b8ebce86ffe6 ("libbpf: Provide CO-RE variants of PT_REGS macros")
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200513154414.29972-1-sumanthk@linux.ibm.com
2020-05-20 01:00:06 -07:00
Andrey Ignatov
b8482d74a1 bpf: Introduce bpf_sk_{, ancestor_}cgroup_id helpers
With the ability to look up sockets in cgroup skb programs, it becomes
useful to access the cgroup id of a retrieved socket so that policies
can be implemented based on its origin cgroup.

For example, a container running in a cgroup can have cgroup skb ingress
program that can lookup peer socket that is sending packets to a process
inside the container and decide whether those packets should be allowed
or denied based on cgroup id of the peer.

More specifically, such an ingress program can implement an intra-host
policy of "allow incoming packets only from this same container and not
from any other container on the same host" without relying on source IP
addresses, since quite often containers share the same IP address on
the host.

Introduce two new helpers for this use-case: bpf_sk_cgroup_id() and
bpf_sk_ancestor_cgroup_id().

These helpers are similar to the existing bpf_skb_{,ancestor_}cgroup_id
helpers and share code with them; the only difference is that the
cgroup id is obtained from an sk instead of an skb.

See documentation in UAPI for more details.
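
As a rough illustration, a fragment of such an ingress policy (hedged
sketch; the socket lookup and the allowed_cgid value are assumed to
exist elsewhere):

  /* sk obtained earlier via bpf_sk_lookup_tcp()/bpf_skc_lookup_tcp();
   * allowed_cgid is an illustrative, pre-configured cgroup id */
  __u64 peer_cgid = bpf_sk_cgroup_id(sk);

  if (peer_cgid != allowed_cgid)
          return 0; /* drop: peer is not in the expected cgroup */
  return 1;         /* allow */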

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/f5884981249ce911f63e9b57ecd5d7d19154ff39.1589486450.git.rdna@fb.com
2020-05-20 01:00:06 -07:00
Andrey Ignatov
3cd9cac8fb bpf: Support narrow loads from bpf_sock_addr.user_port
bpf_sock_addr.user_port supports only 4-byte loads, which leads to ugly
code in BPF programs, like:

	volatile __u32 user_port = ctx->user_port;
	__u16 port = bpf_ntohs(user_port);

This dance is needed because otherwise clang may optimize the load down
to 2 bytes, which is rejected by the verifier.

Add support for 1- and 2-byte loads, the same way as it's supported for
other fields in bpf_sock_addr like user_ip4, msg_src_ip4, etc.
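
With narrow loads supported, the workaround becomes unnecessary. A
sketch of the simplified pattern (section name and port check are
illustrative; assumes bpf_endian.h):

  SEC("cgroup/connect4")
  int connect_v4(struct bpf_sock_addr *ctx)
  {
          /* clang may emit a 2-byte load here; the verifier now accepts it */
          __u16 port = bpf_ntohs(ctx->user_port);

          return port != 80; /* e.g. deny connects to port 80 */
  }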

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/c1e983f4c17573032601d0b2b1f9d1274f24bc16.1589420814.git.rdna@fb.com
2020-05-20 01:00:06 -07:00
Yonghong Song
70e6075d1d bpf: Change btf_iter func proto prefix to "bpf_iter_"
This is to be consistent with tracing and lsm programs
which have prefix "bpf_trace_" and "bpf_lsm_" respectively.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200513180216.2949387-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Eelco Chaudron
d71e9baa8b libbpf: Fix probe code to return EPERM if encountered
When the probe code failed for any reason, ENOTSUP was returned, even
if the failure was due to not having enough lock space. This patch fixes
that by returning EPERM to the user application, so it can respond and
increase the RLIMIT_MEMLOCK size.
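
A sketch of how a caller might react to the new error code (illustrative
fragment; assumes <errno.h> and <sys/resource.h>):

  if (errno == EPERM) {
          /* assumption: EPERM here means RLIMIT_MEMLOCK is too low */
          struct rlimit r = { RLIM_INFINITY, RLIM_INFINITY };

          setrlimit(RLIMIT_MEMLOCK, &r); /* then retry the probe */
  }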

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/158927424896.2342.10402475603585742943.stgit@ebuild
2020-05-20 01:00:06 -07:00
Quentin Monnet
b41c6d34a4 tools, bpf: Synchronise BPF UAPI header with tools
Synchronise the bpf.h header under tools, to report the fixes recently
brought to the documentation for the BPF helpers.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200511161536.29853-5-quentin@isovalent.com
2020-05-20 01:00:06 -07:00
Gustavo A. R. Silva
9029d18d9b bpf, libbpf: Replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to completely eliminate those sorts of issues.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200507185057.GA13981@embeddedor
2020-05-20 01:00:06 -07:00
Yonghong Song
f81f504e12 tools/libbpf: Add offsetof/container_of macro in bpf_helpers.h
These two macros will be used later in the bpf_iter bpf program
bpf_iter_netlink.c. Put them in bpf_helpers.h since they could
be useful in other cases.
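
For reference, a sketch of what such definitions typically look like (the
exact bpf_helpers.h versions may differ in detail):

  #ifndef offsetof
  #define offsetof(TYPE, MEMBER) ((unsigned long)&((TYPE *)0)->MEMBER)
  #endif

  #ifndef container_of
  /* cast a member pointer back to a pointer to the containing struct */
  #define container_of(ptr, type, member)                         \
          ({                                                      \
                  void *__mptr = (void *)(ptr);                   \
                  ((type *)(__mptr - offsetof(type, member)));    \
          })
  #endif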

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175919.2477104-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Yonghong Song
021e35fba2 tools/libbpf: Add bpf_iter support
Two new libbpf APIs are added to support bpf_iter:
  - bpf_program__attach_iter
    Given a bpf program and additional parameters (none for
    now), returns a bpf_link.
  - bpf_iter_create
    syscall level API to create a bpf iterator.

The macro BPF_SEQ_PRINTF is also introduced. Its format
looks like:
  BPF_SEQ_PRINTF(seq, "task id %d\n", pid);

This macro gives bpf program writers a nicer
bpf_seq_printf syntax, similar to the kernel one.
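
A rough user-space sketch tying the two new APIs together (error handling
omitted; prog is assumed to be an already-loaded bpf_iter program):

  struct bpf_link *link = bpf_program__attach_iter(prog, NULL);
  int iter_fd = bpf_iter_create(bpf_link__fd(link));
  char buf[4096];
  ssize_t len;

  /* each read() invokes the bpf program for the next objects */
  while ((len = read(iter_fd, buf, sizeof(buf))) > 0)
          fwrite(buf, 1, len, stdout);

  close(iter_fd);
  bpf_link__destroy(link);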

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175917.2476936-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Yonghong Song
7112841ade bpf: Add bpf_seq_printf and bpf_seq_write helpers
Two helpers, bpf_seq_printf and bpf_seq_write, are added for
writing data to the seq_file buffer.

bpf_seq_printf supports common format string flag/width/type
fields so at least I can get identical results for
netlink and ipv6_route targets.

For bpf_seq_printf and bpf_seq_write, the return value -EOVERFLOW
specifically indicates a write failure due to overflow, which
means the object will be repeated in the next bpf invocation
if the object collection stays the same. Note that if the object
collection changes, then depending on how collection traversal is
done, an object may not be visited even if it is still in the
collection.

For bpf_seq_printf, format %s, %p{i,I}{4,6} needs to
read kernel memory. Reading kernel memory may fail in
the following two cases:
  - invalid kernel address, or
  - valid kernel address but requiring a major fault
If reading kernel memory fails, the %s string will be
an empty string and %p{i,I}{4,6} will be all 0.
Not returning an error to the bpf program is consistent with
what bpf_trace_printk() does for now.

bpf_seq_printf may return -EBUSY, meaning that the internal percpu
buffer for copying strings or other pointees is not available.
The bpf program can return 1 to indicate it wants the same object
to be repeated. Right now, this should not happen on non-RT kernels,
since migrate_disable(), which guards the bpf prog call, calls
preempt_disable().

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175914.2476661-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Yonghong Song
940f4df57b bpf: Create anonymous bpf iterator
A new bpf command BPF_ITER_CREATE is added.

The anonymous bpf iterator is seq_file based.
The seq_file private data are referenced by targets.
The bpf_iter infrastructure allocates additional space
at seq_file->private, before the space used by targets,
to store some metadata, e.g.,
  prog:       prog to run
  session_id: a unique id for each opened seq_file
  seq_num:    how many times bpf programs are queried in this session
  done_stop:  an internal state to decide whether bpf program
              should be called in seq_ops->stop() or not

The seq_num will start from 0 for valid objects.
The bpf program may see the same seq_num more than once if
 - seq_file buffer overflow happens and the same object
   is retried by bpf_seq_read(), or
 - the bpf program explicitly requests a retry of the
   same object

Since modules are not supported for bpf_iter, all target
registration happens at __init time, so there is no need to
change bpf_iter_unreg_target(); it is used mostly in the error
path of the init function, at which time no bpf iterators
have been created yet.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175905.2475770-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Yonghong Song
46c906b6d1 bpf: Support bpf tracing/iter programs for BPF_LINK_CREATE
Given a bpf program, the step to create an anonymous bpf iterator is:
  - create a bpf_iter_link, which combines bpf program and the target.
    In the future, there could be more information recorded in the link.
    A link_fd will be returned to the user space.
  - create an anonymous bpf iterator with the given link_fd.

The bpf_iter_link can be pinned to bpffs mount file system to
create a file based bpf iterator as well.

The benefits of using bpf_iter_link:
  - using bpf link simplifies design and implementation as bpf link
    is used for other tracing bpf programs.
  - for file based bpf iterator, bpf_iter_link provides a standard
    way to replace underlying bpf programs.
  - for both anonymous and file based iterators, bpf link query
    capability can be leveraged.

This patch adds support for tracing/iter programs to BPF_LINK_CREATE.
A new link type BPF_LINK_TYPE_ITER is added to facilitate link
querying. Currently, only prog_id is needed, so no additional
in-kernel show_fdinfo() and fill_link_info() hooks are needed
for the BPF_LINK_TYPE_ITER link.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175901.2475084-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Yonghong Song
9dc3736a7f bpf: Allow loading of a bpf_iter program
A bpf_iter program is a tracing program with attach type
BPF_TRACE_ITER. The load attribute
  attach_btf_id
is checked by the verifier against a particular kernel function,
which represents a target, e.g., __bpf_iter__bpf_map for the
bpf_map target, which is implemented later.

The program return value must be 0 or 1 for now.
  0 : successful, except potential seq_file buffer overflow
      which is handled by seq_file reader.
  1 : request to restart the same object

In the future, other return values may be used for filtering or
terminating the iterator.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175900.2474947-1-yhs@fb.com
2020-05-20 01:00:06 -07:00
Stanislav Fomichev
8b3cbf12a2 bpf: Allow any port in bpf_bind helper
We want to have a tighter control on what ports we bind to in
the BPF_CGROUP_INET{4,6}_CONNECT hooks even if it means
connect() becomes slightly more expensive. The expensive part
comes from the fact that we now need to call inet_csk_get_port()
that verifies that the port is not used and allocates an entry
in the hash table for it.

Since we can't rely on "snum || !bind_address_no_port" to prevent
us from calling the POST_BIND hook anymore, let's add another bind flag
to indicate that the call site is a BPF program.

v5:
* fix wrong AF_INET (should be AF_INET6) in the bpf program for v6

v3:
* More bpf_bind documentation refinements (Martin KaFai Lau)
* Add UDP tests as well (Martin KaFai Lau)
* Don't start the thread, just do socket+bind+listen (Martin KaFai Lau)

v2:
* Update documentation (Andrey Ignatov)
* Pass BIND_FORCE_ADDRESS_NO_PORT conditionally (Andrey Ignatov)

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200508174611.228805-5-sdf@google.com
2020-05-20 01:00:06 -07:00
Stanislav Fomichev
dfa07417ff bpf: Bpf_{g,s}etsockopt for struct bpf_sock_addr
Currently, bpf_getsockopt and bpf_setsockopt helpers operate on the
'struct bpf_sock_ops' context in BPF_PROG_TYPE_SOCK_OPS program.
Let's generalize them and make them available for 'struct bpf_sock_addr'.
That way, in the future, we can allow those helpers in more places.

As an example, let's expose those 'struct bpf_sock_addr' based helpers to
BPF_CGROUP_INET{4,6}_CONNECT hooks. That way we can override CC before the
connection is made.

v3:
* Expose custom helpers for bpf_sock_addr context instead of doing
  generic bpf_sock argument (as suggested by Daniel). Even with
  try_socket_lock that doesn't sleep we have a problem where context sk
  is already locked and socket lock is non-nestable.

v2:
* s/BPF_PROG_TYPE_CGROUP_SOCKOPT/BPF_PROG_TYPE_SOCK_OPS/

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200430233152.199403-1-sdf@google.com
2020-05-20 01:00:06 -07:00
Song Liu
5c1c96c579 libbpf: Add support for command BPF_ENABLE_STATS
bpf_enable_stats() is added to enable the given type of stats.
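
A minimal usage sketch (error handling omitted):

  int fd = bpf_enable_stats(BPF_STATS_RUN_TIME);

  /* ... stats stay enabled; read each prog's run_time_ns via
   * bpf_obj_get_info_by_fd() during the monitoring period ... */

  close(fd); /* stats are disabled once the last such fd is closed */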

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200430071506.1408910-3-songliubraving@fb.com
2020-05-20 01:00:06 -07:00
Song Liu
83f269b088 bpf: Sharing bpf runtime stats with BPF_ENABLE_STATS
Currently, sysctl kernel.bpf_stats_enabled controls BPF runtime stats.
Typical userspace tools use kernel.bpf_stats_enabled as follows:

  1. Enable kernel.bpf_stats_enabled;
  2. Check program run_time_ns;
  3. Sleep for the monitoring period;
  4. Check program run_time_ns again, calculate the difference;
  5. Disable kernel.bpf_stats_enabled.

The problem with this approach is that only one userspace tool can toggle
this sysctl. If multiple tools toggle the sysctl at the same time, the
measurement may be inaccurate.

To fix this problem while keeping backward compatibility, introduce a new
bpf command BPF_ENABLE_STATS. On success, this command enables stats and
returns a valid fd. BPF_ENABLE_STATS takes argument "type". Currently,
only one type, BPF_STATS_RUN_TIME, is supported. We can extend the
command to support other types of stats in the future.

With BPF_ENABLE_STATS, a user-space tool would have the following flow:

  1. Get a fd with BPF_ENABLE_STATS, and make sure it is valid;
  2. Check program run_time_ns;
  3. Sleep for the monitoring period;
  4. Check program run_time_ns again, calculate the difference;
  5. Close the fd.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200430071506.1408910-2-songliubraving@fb.com
2020-05-20 01:00:06 -07:00
Horatiu Vultur
597d350e4a net: bridge: Add port attribute IFLA_BRPORT_MRP_RING_OPEN
This patch adds a new port attribute, IFLA_BRPORT_MRP_RING_OPEN, which allows
notifying userspace when the port has lost the continuity of MRP frames.

This attribute is set by the kernel whenever the SW or HW detects that the
ring is open or closed.

Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-20 01:00:06 -07:00
Andrii Nakryiko
7fc4d5025b vmtest: add bpf_obj_id to 5.5.0 blacklist
The bpf_obj_id selftest added testing of bpf_link-related operations, which
are not implemented in 5.5.0. Blacklist it.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
bd9e2feb2a sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2fcd80144b93ff90836a44f2054b4d82133d3a85
Checkpoint bpf-next commit: c321022244708aec4675de4f032ef1ba9ff0c640
Baseline bpf commit:        edadedf1c5b4e4404192a0a4c3c0c05e3b7672ab
Checkpoint bpf commit:      7f645462ca01d01abb94d75e6768c8b3ed3a188b

Andrii Nakryiko (8):
  bpf: Add support for BPF_OBJ_GET_INFO_BY_FD for bpf_link
  libbpf: Add low-level APIs for new bpf_link commands
  libbpf: Refactor BTF-defined map definition parsing logic
  libbpf: Refactor map creation logic and fix cleanup leak
  libbpf: Add BTF-defined map-in-map support
  libbpf: Fix memory leak and possible double-free in hashmap__clear
  libbpf: Fix huge memory leak in libbpf_find_vmlinux_btf_id()
  libbpf: Fix false uninitialized variable warning

David Ahern (1):
  libbpf: Only check mode flags in get_xdp_id

Jakub Wilk (1):
  bpf: Fix reStructuredText markup

Maciej Żenczykowski (1):
  bpf: add bpf_ktime_get_boot_ns()

Mao Wenan (1):
  libbpf: Return err if bpf_object__load failed

Yoshiki Komachi (1):
  bpf_helpers.h: Add note for building with vmlinux.h or linux/types.h

Zou Wei (1):
  libbpf: Remove unneeded semicolon in btf_dump_emit_type

 include/uapi/linux/bpf.h |  46 ++-
 src/bpf.c                |  19 +-
 src/bpf.h                |   4 +-
 src/bpf_helpers.h        |   7 +
 src/btf_dump.c           |   2 +-
 src/hashmap.c            |   7 +
 src/libbpf.c             | 705 +++++++++++++++++++++++++++------------
 src/libbpf.map           |   6 +
 src/netlink.c            |   2 +
 9 files changed, 572 insertions(+), 226 deletions(-)

--
2.24.1
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
814ed5011f sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
f8faf2b33d libbpf: Fix false uninitialized variable warning
Some versions of GCC falsely detect that vi might not be initialized. That's
not true, but let's silence it with NULL initialization.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200430021436.1522502-1-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
3cb0b3fd52 libbpf: Fix huge memory leak in libbpf_find_vmlinux_btf_id()
BTF object wasn't freed.

Fixes: a6ed02cac690 ("libbpf: Load btf_vmlinux only once per object.")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200429012111.277390-9-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
edb1aaa8dc libbpf: Fix memory leak and possible double-free in hashmap__clear
Fix memory leak in hashmap_clear() not freeing hashmap_entry structs for each
of the remaining entries. Also NULL-out bucket list to prevent possible
double-free between hashmap__clear() and hashmap__free().

Running test_progs-asan flavor clearly showed this problem.

Reported-by: Alston Tang <alston64@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200429012111.277390-5-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
f3271942dd libbpf: Add BTF-defined map-in-map support
As discussed at LPC 2019 ([0]), this patch brings (quite belated) declarative
BTF-defined map-in-map support to libbpf. It allows defining ARRAY_OF_MAPS
and HASH_OF_MAPS BPF maps without any user-space initialization code
involved.

Additionally, it allows initializing an outer map's slots with references to
the respective inner maps at load time, also completely declaratively.

Despite C's weak type system, the way BTF-defined map-in-map definitions work
makes it actually quite hard to accidentally initialize an outer map with
incompatible inner maps. This being C, it's of course still possible, but
even that would be caught at load time, with an error returned and a helpful
debug log pointing exactly to the slot that failed to be initialized.

As an example, here's a rather advanced HASH_OF_MAPS declaration and
initialization example, filling slots #0 and #4 with two inner maps:

  #include <bpf/bpf_helpers.h>

  struct inner_map {
          __uint(type, BPF_MAP_TYPE_ARRAY);
          __uint(max_entries, 1);
          __type(key, int);
          __type(value, int);
  } inner_map1 SEC(".maps"),
    inner_map2 SEC(".maps");

  struct outer_hash {
          __uint(type, BPF_MAP_TYPE_HASH_OF_MAPS);
          __uint(max_entries, 5);
          __uint(key_size, sizeof(int));
          __array(values, struct inner_map);
  } outer_hash SEC(".maps") = {
          .values = {
                  [0] = &inner_map2,
                  [4] = &inner_map1,
          },
  };

Here's the relevant part of the libbpf debug log, showing pretty clearly
what's going on with map-in-map initialization:

  libbpf: .maps relo #0: for 6 value 0 rel.r_offset 96 name 260 ('inner_map1')
  libbpf: .maps relo #0: map 'outer_arr' slot [0] points to map 'inner_map1'
  libbpf: .maps relo #1: for 7 value 32 rel.r_offset 112 name 249 ('inner_map2')
  libbpf: .maps relo #1: map 'outer_arr' slot [2] points to map 'inner_map2'
  libbpf: .maps relo #2: for 7 value 32 rel.r_offset 144 name 249 ('inner_map2')
  libbpf: .maps relo #2: map 'outer_hash' slot [0] points to map 'inner_map2'
  libbpf: .maps relo #3: for 6 value 0 rel.r_offset 176 name 260 ('inner_map1')
  libbpf: .maps relo #3: map 'outer_hash' slot [4] points to map 'inner_map1'
  libbpf: map 'inner_map1': created successfully, fd=4
  libbpf: map 'inner_map2': created successfully, fd=5
  libbpf: map 'outer_hash': created successfully, fd=7
  libbpf: map 'outer_hash': slot [0] set to map 'inner_map2' fd=5
  libbpf: map 'outer_hash': slot [4] set to map 'inner_map1' fd=4

Notice from the log above that fd=6 (not logged explicitly) is used for the
inner "prototype" map, necessary for the creation of the outer map. It is
destroyed immediately after the outer map is created.

See also the included selftest, with some extra comments explaining details
of usage. Additionally, similar initialization syntax and libbpf
functionality can be used to initialize a BPF_PROG_ARRAY with references to
BPF sub-programs. This can be done in follow-up patches, if there is a
demand for it.

  [0] https://linuxplumbersconf.org/event/4/contributions/448/

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200429002739.48006-4-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
040f73a7c7 libbpf: Refactor map creation logic and fix cleanup leak
Factor out map creation and destruction logic to simplify code and especially
error handling. Also fix map FD leak in case of partially successful map
creation during bpf_object load operation.

Fixes: 57a00f41644f ("libbpf: Add auto-pinning of maps when loading BPF objects")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200429002739.48006-3-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
35283f89c6 libbpf: Refactor BTF-defined map definition parsing logic
Factor out the BTF map definition parsing logic into a stand-alone routine
for easier reuse in the map-in-map case.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200429002739.48006-2-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
1c4c845e79 libbpf: Add low-level APIs for new bpf_link commands
Add low-level API calls for bpf_link_get_next_id() and
bpf_link_get_fd_by_id().

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200429001614.1544-6-andriin@fb.com
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
2a374b5df0 bpf: Add support for BPF_OBJ_GET_INFO_BY_FD for bpf_link
Add ability to fetch bpf_link details through BPF_OBJ_GET_INFO_BY_FD command.
Also enhance show_fdinfo to potentially include bpf_link type-specific
information (similarly to obj_info).

Also introduce enum bpf_link_type stored in bpf_link itself and expose it in
UAPI. bpf_link_tracing also now will store and return bpf_attach_type.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200429001614.1544-5-andriin@fb.com
2020-05-01 18:58:47 -07:00
Zou Wei
7878754030 libbpf: Remove unneeded semicolon in btf_dump_emit_type
Fixes the following coccicheck warning:

 tools/lib/bpf/btf_dump.c:661:4-5: Unneeded semicolon

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1588064829-70613-1-git-send-email-zou_wei@huawei.com
2020-05-01 18:58:47 -07:00
Mao Wenan
da5aa114e2 libbpf: Return err if bpf_object__load failed
bpf_object__load() has various return codes; when it fails to load an
object, it must return err instead of -EINVAL.

Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200426063635.130680-3-maowenan@huawei.com
2020-05-01 18:58:47 -07:00
Maciej Żenczykowski
625f64a126 bpf: add bpf_ktime_get_boot_ns()
On a device like a cellphone, which is constantly suspending
and resuming, CLOCK_MONOTONIC is not particularly useful for
keeping track of or reacting to external network events.
Instead you want to use CLOCK_BOOTTIME.

Hence add bpf_ktime_get_boot_ns() as a mirror of bpf_ktime_get_ns()
based around CLOCK_BOOTTIME instead of CLOCK_MONOTONIC.
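
A minimal fragment contrasting the two clocks inside a BPF program
(illustrative):

  __u64 mono = bpf_ktime_get_ns();      /* CLOCK_MONOTONIC: stops across suspend */
  __u64 boot = bpf_ktime_get_boot_ns(); /* CLOCK_BOOTTIME: keeps counting */

  /* boot - mono roughly equals the total time spent suspended */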

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-05-01 18:58:47 -07:00
Yoshiki Komachi
ba344d9494 bpf_helpers.h: Add note for building with vmlinux.h or linux/types.h
The following error was shown when a bpf program was compiled without
vmlinux.h auto-generated from BTF:

 # clang -I./linux/tools/lib/ -I/lib/modules/$(uname -r)/build/include/ \
   -O2 -Wall -target bpf -emit-llvm -c bpf_prog.c -o bpf_prog.bc
 ...
 In file included from linux/tools/lib/bpf/bpf_helpers.h:5:
 linux/tools/lib/bpf/bpf_helper_defs.h:56:82: error: unknown type name '__u64'
 ...

It seems that bpf programs are intended to be built together with
vmlinux.h (which will have all the __u64 and other typedefs). But
users may mistakenly think "#include <linux/types.h>" is missing
because vmlinux.h is not common knowledge for non-bpf developers. IMO,
an explicit comment should therefore be added to bpf_helpers.h, as
this patch does.

Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/1587427527-29399-1-git-send-email-komachi.yoshiki@gmail.com
2020-05-01 18:58:47 -07:00
Jakub Wilk
976e29343d bpf: Fix reStructuredText markup
The patch fixes:
$ scripts/bpf_helpers_doc.py > bpf-helpers.rst
$ rst2man bpf-helpers.rst > bpf-helpers.7
bpf-helpers.rst:1105: (WARNING/2) Inline strong start-string without end-string.

Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200422082324.2030-1-jwilk@jwilk.net
2020-05-01 18:58:47 -07:00
David Ahern
b3da63d59d libbpf: Only check mode flags in get_xdp_id
The commit in the Fixes tag changed get_xdp_id to only return prog_id
if flags is 0, but there are XDP flags other than the modes, e.g.,
XDP_FLAGS_UPDATE_IF_NOEXIST. Since the intention was only to look at
the MODE flags, clear the other ones before checking whether flags is 0.

Fixes: f07cbad29741 ("libbpf: Fix bpf_get_link_xdp_id flags handling")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrey Ignatov <rdna@fb.com>
2020-05-01 18:58:47 -07:00
Andrii Nakryiko
902ba3fd33 README: add Debian libbpf package link
Debian is now packaging libbpf from this repo. Add link to the package to README.
2020-05-01 18:20:43 -07:00
Andrii Nakryiko
cf3fc46ea8 sync: squelch annoying warning from filter-branch git command
Newer git started emitting a warning about the dangers of filter-branch.
Squelch it with the FILTER_BRANCH_SQUELCH_WARNING=1 envvar.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-04-29 23:01:56 -07:00
Andrii Nakryiko
6a1615c263 vmtests: blacklist mmap test on 5.5
The 5.5 kernel has a bug that allows violating read-only access to an
mmap()-ed map. Disable the selftest that is now failing.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-04-17 15:31:03 -07:00
Andrii Nakryiko
e66d297441 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   1a323ea5356edbb3073dc59d51b9e6b86908857d
Checkpoint bpf-next commit: 2fcd80144b93ff90836a44f2054b4d82133d3a85
Baseline bpf commit:        94b18a87efdd1626a1e6aef87271af4a7c616d36
Checkpoint bpf commit:      edadedf1c5b4e4404192a0a4c3c0c05e3b7672ab

Andrey Ignatov (1):
  libbpf: Fix bpf_get_link_xdp_id flags handling

Andrii Nakryiko (1):
  libbpf: Always specify expected_attach_type on program load if
    supported

Jeremy Cline (1):
  libbpf: Initialize *nl_pid so gcc 10 is happy

Toke Høiland-Jørgensen (1):
  libbpf: Fix type of old_fd in bpf_xdp_set_link_opts

 src/libbpf.c  | 126 ++++++++++++++++++++++++++++++++------------------
 src/libbpf.h  |   2 +-
 src/netlink.c |   6 +--
 3 files changed, 86 insertions(+), 48 deletions(-)

--
2.24.1
2020-04-17 15:31:03 -07:00
Toke Høiland-Jørgensen
632afdff45 libbpf: Fix type of old_fd in bpf_xdp_set_link_opts
The 'old_fd' parameter used for atomic replacement of XDP programs is
supposed to be an FD, but was left as a u32 from an earlier iteration of
the patch that added it. It was converted to an int when read, so things
worked correctly even with negative values, but it's better to change the
definition to correctly reflect the intention.

Fixes: bd5ca3ef93cd ("libbpf: Add function to set link XDP fd while specifying old program")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200414145025.182163-1-toke@redhat.com
2020-04-17 15:31:03 -07:00
Andrii Nakryiko
6e706b38bd libbpf: Always specify expected_attach_type on program load if supported
For some types of BPF programs that utilize expected_attach_type, libbpf
won't set load_attr.expected_attach_type, even if expected_attach_type is
known from the section definition. This was done to preserve backwards
compatibility with old kernels that didn't recognize the
expected_attach_type attribute yet (it was added in 5e43f899b03a ("bpf:
Check attach type at prog load time")). But this is problematic for some BPF
programs that utilize newer features requiring the kernel to know the
specific expected_attach_type (e.g., the extended set of return codes for
cgroup_skb/egress programs).

This patch makes libbpf specify expected_attach_type by default, but also
detect kernel support for this field and omit it during program load if it
is not supported. This allows having good metadata for bpf_program
(e.g., bpf_program__get_expected_attach_type()) while still working with old
kernels (for cases where it can work at all).

Additionally, since expected_attach_type is now always set for recognized
program types, bpf_program__attach_cgroup doesn't have to do extra checks to
determine the correct attach type, so remove that additional logic.

Also adjust section_names selftest to account for this change.

More detailed discussion can be found in [0].

  [0] https://lore.kernel.org/bpf/20200412003604.GA15986@rdna-mbp.dhcp.thefacebook.com/

Fixes: 5cf1e9145630 ("bpf: cgroup inet skb programs can return 0 to 3")
Fixes: 5e43f899b03a ("bpf: Check attach type at prog load time")
Reported-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/20200414182645.1368174-1-andriin@fb.com
2020-04-17 15:31:03 -07:00
Andrey Ignatov
850293ba1c libbpf: Fix bpf_get_link_xdp_id flags handling
Currently, if one of the XDP_FLAGS_{DRV,HW,SKB}_MODE flags is passed to
bpf_get_link_xdp_id() and there is a single XDP program attached to
ifindex, that program's id will be returned by bpf_get_link_xdp_id() in
the prog_id argument no matter what mode the program is attached in, i.e.
the flags argument is not taken into account.

For example, if there is a single program attached with
XDP_FLAGS_SKB_MODE but user calls bpf_get_link_xdp_id() with flags =
XDP_FLAGS_DRV_MODE, that skb program will be returned.

Fix it by returning info->prog_id only if the user didn't specify flags. If
flags is specified, then return the corresponding mode-specific field from
struct xdp_link_info.

The initial error was introduced in commit 50db9f073188 ("libbpf: Add a
support for getting xdp prog id on ifindex") and then refactored in
473f4e133a12 so 473f4e133a12 is used in the Fixes tag.

Fixes: 473f4e133a12 ("libbpf: Add bpf_get_link_xdp_info() function to get more XDP information")
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/0e9e30490b44b447bb2bebc69c7135e7fe7e4e40.1586236080.git.rdna@fb.com
2020-04-17 15:31:03 -07:00
Jeremy Cline
fb528063b2 libbpf: Initialize *nl_pid so gcc 10 is happy
Builds of Fedora's kernel-tools package started to fail with "may be
used uninitialized" warnings for nl_pid in bpf_set_link_xdp_fd() and
bpf_get_link_xdp_info() on the s390 architecture.

Although libbpf_netlink_open() always returns a negative number when it
does not set *nl_pid, the compiler does not determine this and thus
believes the variable might be used uninitialized. Assuage gcc's fears
by explicitly initializing nl_pid.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1807781

Signed-off-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200404051430.698058-1-jcline@redhat.com
2020-04-17 15:31:03 -07:00
Andrii Nakryiko
97ada10bd8 ci: update blacklists and Kconfig
Disable some of the newest selftests on 5.5.0, and turn on BPF_LSM on the latest kernel.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
9a35753b42 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   483d7a30f538e2f8addd32aa9a3d2e94ae55fa65
Checkpoint bpf-next commit: 1a323ea5356edbb3073dc59d51b9e6b86908857d
Baseline bpf commit:        94b18a87efdd1626a1e6aef87271af4a7c616d36
Checkpoint bpf commit:      94b18a87efdd1626a1e6aef87271af4a7c616d36

Andrii Nakryiko (2):
  bpf: Implement bpf_link-based cgroup BPF program attachment
  libbpf: Add support for bpf_link-based cgroup attachment

Antoine Tenart (1):
  net: macsec: add support for offloading to the MAC

Daniel Borkmann (2):
  bpf: Add netns cookie and enable it for bpf cgroup hooks
  bpf: Enable bpf cgroup hooks to retrieve cgroup v2 and ancestor id

Fletcher Dunn (1):
  libbpf, xsk: Init all ring members in xsk_umem__create and
    xsk_socket__create

Joe Stringer (1):
  bpf: Add socket assign support

KP Singh (2):
  bpf: Introduce BPF_PROG_TYPE_LSM
  tools/libbpf: Add support for BPF_PROG_TYPE_LSM

Mark Starovoytov (1):
  net: macsec: add support for specifying offload upon link creation

Stanislav Fomichev (1):
  libbpf: Don't allocate 16M for log buffer by default

Tobias Klauser (1):
  libbpf: Remove unused parameter `def` to get_map_field_int

Toke Høiland-Jørgensen (3):
  tools: Add EXPECTED_FD-related definitions in if_link.h
  libbpf: Add function to set link XDP fd while specifying old program
  libbpf: Add setter for initial value for internal maps

 include/uapi/linux/bpf.h     |  82 ++++++++++++++++++++-
 include/uapi/linux/if_link.h |   6 +-
 src/bpf.c                    |  37 +++++++++-
 src/bpf.h                    |  19 +++++
 src/btf.c                    |  20 ++++--
 src/libbpf.c                 | 134 +++++++++++++++++++++++++++++------
 src/libbpf.h                 |  22 +++++-
 src/libbpf.map               |   9 +++
 src/libbpf_probes.c          |   1 +
 src/netlink.c                |  34 ++++++++-
 src/xsk.c                    |  16 ++++-
 11 files changed, 345 insertions(+), 35 deletions(-)

--
2.24.1
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
c4af2093cc sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
1543a19f36 libbpf: Add support for bpf_link-based cgroup attachment
Add bpf_program__attach_cgroup(), which uses BPF_LINK_CREATE subcommand to
create an FD-based kernel bpf_link. Also add low-level bpf_link_create() API.

If expected_attach_type is not specified explicitly with
bpf_program__set_expected_attach_type(), libbpf will try to determine proper
attach type from BPF program's section definition.

Also add support for bpf_link's underlying BPF program replacement:
  - unconditional through high-level bpf_link__update_program() API;
  - cmpxchg-like with specifying expected current BPF program through
    low-level bpf_link_update() API.
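
A rough sketch of these APIs in use (assuming prog, new_prog, and cgroup_fd
are set up elsewhere; error handling omitted):

  struct bpf_link *link = bpf_program__attach_cgroup(prog, cgroup_fd);

  /* later: atomically swap in a new program behind the same link */
  bpf_link__update_program(link, new_prog);

  bpf_link__destroy(link); /* detaches from the cgroup */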

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200330030001.2312810-4-andriin@fb.com
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
8b41602694 bpf: Implement bpf_link-based cgroup BPF program attachment
Implement new sub-command to attach cgroup BPF programs and return FD-based
bpf_link back on success. bpf_link, once attached to cgroup, cannot be
replaced, except by owner having its FD. Cgroup bpf_link supports only
BPF_F_ALLOW_MULTI semantics. Both link-based and prog-based BPF_F_ALLOW_MULTI
attachments can be freely intermixed.

To prevent bpf_cgroup_link from keeping cgroup alive past the point when no
BPF program can be executed, implement auto-detachment of link. When
cgroup_bpf_release() is called, all attached bpf_links are forced to release
cgroup refcounts, but they leave bpf_link otherwise active and allocated, as
well as still owning underlying bpf_prog. This is because user-space might
still have FDs open and active, so bpf_link as a user-referenced object can't
be freed yet. Once last active FD is closed, bpf_link will be freed and
underlying bpf_prog refcount will be dropped. But cgroup refcount won't be
touched, because cgroup is released already.

The inherent race between bpf_cgroup_link release (from closing last FD) and
cgroup_bpf_release() is resolved by both operations taking cgroup_mutex. So
the only additional check required is when bpf_cgroup_link attempts to detach
itself from cgroup. At that time we need to check whether there is still
cgroup associated with that link. And if not, exit with success, because
bpf_cgroup_link was already successfully detached.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Link: https://lore.kernel.org/bpf/20200330030001.2312810-2-andriin@fb.com
2020-04-02 00:02:25 -07:00
Joe Stringer
cecb299ac4 bpf: Add socket assign support
Add support for TPROXY via a new bpf helper, bpf_sk_assign().

This helper requires the BPF program to discover the socket via a call
to bpf_sk*_lookup_*(), then pass this socket to the new helper. The
helper takes its own reference to the socket in addition to any existing
reference that may or may not currently be obtained for the duration of
BPF processing. For the destination socket to receive the traffic, the
traffic must be routed towards that socket via a local route. The
simplest example route is below, but in practice you may want to route
traffic more narrowly (e.g. by CIDR):

  $ ip route add local default dev lo

This patch avoids trying to introduce an extra bit into the skb->sk, as
that would require more invasive changes to all code interacting with
the socket to ensure that the bit is handled correctly, such as all
error-handling cases along the path from the helper in BPF through to
the orphan path in the input. Instead, we opt to use the destructor
variable to switch on the prefetch of the socket.
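
A hedged fragment of the intended usage from a TC ingress program (tuple
setup and error handling omitted):

  struct bpf_sock *sk;

  sk = bpf_skc_lookup_tcp(skb, &tuple, tuple_len, BPF_F_CURRENT_NETNS, 0);
  if (sk) {
          bpf_sk_assign(skb, sk, 0); /* steer this skb to the found socket */
          bpf_sk_release(sk);
  }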

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200329225342.16317-2-joe@wand.net.nz
2020-04-02 00:02:25 -07:00
KP Singh
90e89264b9 tools/libbpf: Add support for BPF_PROG_TYPE_LSM
Since BPF_PROG_TYPE_LSM uses the same attaching mechanism as
BPF_PROG_TYPE_TRACING, the common logic is refactored into a static
function bpf_program__attach_btf_id.

A new API call bpf_program__attach_lsm is still added to avoid userspace
conflicts if this ever changes in the future.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Florent Revest <revest@google.com>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200329004356.27286-7-kpsingh@chromium.org
2020-04-02 00:02:25 -07:00
KP Singh
f69cc97272 bpf: Introduce BPF_PROG_TYPE_LSM
Introduce types and configs for bpf programs that can be attached to
LSM hooks. The programs can be enabled by the config option
CONFIG_BPF_LSM.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Florent Revest <revest@google.com>
Reviewed-by: Thomas Garnier <thgarnie@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Link: https://lore.kernel.org/bpf/20200329004356.27286-2-kpsingh@chromium.org
2020-04-02 00:02:25 -07:00
Toke Høiland-Jørgensen
a6e9750c8a libbpf: Add setter for initial value for internal maps
For internal maps (most notably the maps backing global variables), libbpf
uses an internal mmaped area to store the data after opening the object.
This data is subsequently copied into the kernel map when the object is
loaded.

This adds a function to set a new value for that data, which can be used
before the object is loaded into the kernel. This is especially relevant
for RODATA maps, since those are frozen on load.
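
A sketch of the new setter in use, between open and load (map and struct
names are illustrative):

  struct { int debug_level; } init = { .debug_level = 2 };
  struct bpf_map *map = bpf_object__find_map_by_name(obj, "my_prog.rodata");

  /* overrides the mmaped initial data before it is frozen on load */
  bpf_map__set_initial_value(map, &init, sizeof(init));
  bpf_object__load(obj);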

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200329132253.232541-1-toke@redhat.com
2020-04-02 00:02:25 -07:00
Toke Høiland-Jørgensen
60bade6674 libbpf: Add function to set link XDP fd while specifying old program
This adds a new function to set the XDP fd while specifying the FD of the
program to replace, using the newly added IFLA_XDP_EXPECTED_FD netlink
parameter. The new function uses the opts struct mechanism to be extendable
in the future.
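
A sketch of the new opts-based call (ifindex and the FDs are assumed to be
set up elsewhere):

  DECLARE_LIBBPF_OPTS(bpf_xdp_set_link_opts, opts,
          .old_fd = old_prog_fd, /* program expected to be attached now */
  );

  /* the kernel rejects the update if a different program is attached */
  bpf_set_link_xdp_fd_opts(ifindex, new_prog_fd, XDP_FLAGS_REPLACE, &opts);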

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/158515700857.92963.7052131201257841700.stgit@toke.dk
2020-04-02 00:02:25 -07:00
Toke Høiland-Jørgensen
e13c1b7b85 tools: Add EXPECTED_FD-related definitions in if_link.h
This adds the IFLA_XDP_EXPECTED_FD netlink attribute definition and the
XDP_FLAGS_REPLACE flag to if_link.h in tools/include.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/158515700747.92963.8615391897417388586.stgit@toke.dk
2020-04-02 00:02:25 -07:00
Fletcher Dunn
1d8451ccaf libbpf, xsk: Init all ring members in xsk_umem__create and xsk_socket__create
Fix a sharp edge in xsk_umem__create and xsk_socket__create.  Almost all of
the members of the ring buffer structs are initialized, but the "cached_xxx"
variables are not all initialized.  The caller is required to zero them.
This is needlessly dangerous.  The results if you don't do it can be very bad.
For example, they can cause xsk_prod_nb_free and xsk_cons_nb_avail to return
values greater than the size of the queue.  xsk_ring_cons__peek can return an
index that does not refer to an item that has been queued.

I have confirmed that without this change, my program misbehaves unless I
memset the ring buffers to zero before calling the function.  Afterwards,
my program works without (or with) the memset.

Signed-off-by: Fletcher Dunn <fletcherd@valvesoftware.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/85f12913cde94b19bfcb598344701c38@valvesoftware.com
2020-04-02 00:02:25 -07:00
Daniel Borkmann
fad6e249ea bpf: Enable bpf cgroup hooks to retrieve cgroup v2 and ancestor id
Enable the bpf_get_current_cgroup_id() helper for connect(), sendmsg(),
recvmsg() and bind-related hooks in order to retrieve the cgroup v2
context which can then be used as part of the key for BPF map lookups,
for example. Given these hooks operate in process context, 'current' is
always valid, pointing to the app that is performing the mentioned
syscalls if it's subject to a v2 cgroup. Also, with the same motivation as
commit 7723628101aa ("bpf: Introduce bpf_skb_ancestor_cgroup_id helper"),
enable retrieval of the ancestor from current so the cgroup id can be used
for policy lookups, which can then forbid connect() / bind(), for example.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/d2a7ef42530ad299e3cbb245e6c12374b72145ef.1585323121.git.daniel@iogearbox.net
2020-04-02 00:02:25 -07:00
Daniel Borkmann
64f7fa917c bpf: Add netns cookie and enable it for bpf cgroup hooks
In Cilium we're mainly using BPF cgroup hooks today in order to implement
kube-proxy free Kubernetes service translation for ClusterIP, NodePort (*),
ExternalIP, and LoadBalancer as well as HostPort mapping [0] for all traffic
between Cilium managed nodes. While this works in its current shape and avoids
packet-level NAT for inter Cilium managed node traffic, there is one major
limitation we're facing today, that is, lack of netns awareness.

In Kubernetes, the concept of Pods (which hold one or multiple containers)
has been built around network namespaces, so while we can use the global scope
of attaching to root BPF cgroup hooks also to our advantage (e.g. for exposing
NodePort ports on loopback addresses), we also need to differentiate
between the initial network namespace and non-initial ones. For example,
ExternalIP services mandate that non-local service IPs are not to be
translated from the host (initial) network namespace. Right now, we have an ugly
work-around in place where non-local service IPs for ExternalIP services are
not xlated from connect() and friends BPF hooks but instead via less efficient
packet-level NAT on the veth tc ingress hook for Pod traffic.

On top of determining whether we're in initial or non-initial network namespace
we also have a need for a socket-cookie like mechanism for network namespaces
scope. Socket cookies have the nice property that they can be combined as part
of the key structure e.g. for BPF LRU maps without having to worry that the
cookie could be recycled. We are planning to use this for our sessionAffinity
implementation for services. Therefore, add a new bpf_get_netns_cookie() helper
which would resolve both use cases at once: bpf_get_netns_cookie(NULL) would
provide the cookie for the initial network namespace while passing the context
instead of NULL would provide the cookie from the application's network namespace.
We're using a hole, so no size increase; the assignment happens only once.
Therefore this allows for a comparison on initial namespace as well as regular
cookie usage as we have today with socket cookies. We could later enable
this helper for other program types as well, as we see the need.

  (*) Both externalTrafficPolicy={Local|Cluster} types
  [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/c47d2346982693a9cf9da0e12690453aded4c788.1585323121.git.daniel@iogearbox.net
2020-04-02 00:02:25 -07:00
Stanislav Fomichev
240b8fa098 libbpf: Don't allocate 16M for log buffer by default
For each prog/btf load we allocate and free 16 megs of verifier buffer.
On production systems this doesn't really make sense because the
programs/btf have gone through extensive testing and are (mostly)
guaranteed to load successfully.

Let's assume successful case by default and skip buffer allocation
on the first try. If there is an error, start with BPF_LOG_BUF_SIZE
and double it on each ENOSPC iteration.
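
The retry strategy, roughly (illustrative pseudocode, not the exact libbpf
internals; try_load is a hypothetical wrapper):

  size_t log_size = 0; /* first attempt: no log buffer at all */
  char *log_buf = NULL;
  int fd;

  for (;;) {
          fd = try_load(attr, log_buf, log_size); /* hypothetical */
          if (fd >= 0 || errno != ENOSPC)
                  break;
          log_size = log_size ? log_size * 2 : BPF_LOG_BUF_SIZE;
          log_buf = realloc(log_buf, log_size);
  }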

v3:
* Return -ENOMEM when can't allocate log buffer (Andrii Nakryiko)

v2:
* Don't allocate the buffer at all on the first try (Andrii Nakryiko)

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200325195521.112210-1-sdf@google.com
2020-04-02 00:02:25 -07:00
Tobias Klauser
3756d20499 libbpf: Remove unused parameter def to get_map_field_int
Has been unused since commit ef99b02b23ef ("libbpf: capture value in BTF
type info for BTF-defined map defs").

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200325113655.19341-1-tklauser@distanz.ch
2020-04-02 00:02:25 -07:00
Mark Starovoytov
9e8b23289f net: macsec: add support for specifying offload upon link creation
This patch adds a new netlink attribute to allow a user to (optionally)
specify the desired offload mode immediately upon MACsec link creation.

A separate iproute patch will be required to support this from user space.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-02 00:02:25 -07:00
Antoine Tenart
902eca48e5 net: macsec: add support for offloading to the MAC
This patch adds a new MACsec offloading option, MACSEC_OFFLOAD_MAC,
allowing a user to select a MAC as a provider for MACsec offloading
operations.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
9f0d55c24a vmtests: organize blacklists, enable sockmap_listen tests
Enable now-fixed sockmap_listen tests. Disable the vmlinux test on 5.5, on
which the hrtimer_nanosleep() signature is incompatible. Fill out the
reasons for the remaining permanently disabled tests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
e53dd1c436 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9b79c0be350d3825ef26ed9eebac6ae50df506bc
Checkpoint bpf-next commit: 483d7a30f538e2f8addd32aa9a3d2e94ae55fa65
Baseline bpf commit:        90db6d772f749e38171d04619a5e3cd8804a6d02
Checkpoint bpf commit:      94b18a87efdd1626a1e6aef87271af4a7c616d36

Andrii Nakryiko (2):
  libbpf: Ignore incompatible types with matching name during CO-RE
    relocation
  libbpf: Provide CO-RE variants of PT_REGS macros

Wenbo Zhang (1):
  bpf, libbpf: Fix ___bpf_kretprobe_args1(x) macro definition

 src/bpf_tracing.h | 105 +++++++++++++++++++++++++++++++++++++++++++++-
 src/libbpf.c      |   4 ++
 2 files changed, 108 insertions(+), 1 deletion(-)

--
2.17.1
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
da790d6014 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-03-17 14:56:36 -07:00
Wenbo Zhang
3d81b13b36 bpf, libbpf: Fix ___bpf_kretprobe_args1(x) macro definition
Use PT_REGS_RC instead of PT_REGS_RET to get ret correctly.

Fixes: df8ff35311c8 ("libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's bpf_tracing.h")
Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200315083252.22274-1-ethercflow@gmail.com
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
64bd9e074b libbpf: Provide CO-RE variants of PT_REGS macros
Syscall raw tracepoints have a struct pt_regs pointer as the tracepoint's
first argument. Reading any pt_regs field then requires bpf_probe_read(),
even for tp_btf programs. Due to that, the PT_REGS_PARMx macros are not
usable as
is. This patch adds CO-RE variants of those macros that use BPF_CORE_READ() to
read necessary fields. This provides relocatable architecture-agnostic pt_regs
field accesses.
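
For illustration (not from the patch; assumes a vmlinux.h setup and a target
arch define such as -D__TARGET_ARCH_x86), a tp_btf program reading the first
syscall argument could look like:

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_core_read.h>
  #include <bpf/bpf_tracing.h>

  SEC("tp_btf/sys_enter")
  int BPF_PROG(handle_sys_enter, struct pt_regs *regs, long id)
  {
          /* PT_REGS_PARM1_CORE() goes through BPF_CORE_READ(), so the
           * pt_regs field access is relocated against kernel BTF */
          unsigned long arg1 = PT_REGS_PARM1_CORE(regs);

          bpf_printk("syscall %ld arg1 %lx", id, arg1);
          return 0;
  }

  char _license[] SEC("license") = "GPL";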

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200313172336.1879637-4-andriin@fb.com
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
53d473dd8e libbpf: Ignore incompatible types with matching name during CO-RE relocation
When finding target type candidates, ignore forward declarations, functions,
and other named types of incompatible kind. Not doing this can cause false
errors.  See [0] for one such case (due to struct pt_regs forward
declaration).

  [0] https://github.com/iovisor/bcc/pull/2806#issuecomment-598543645

Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Reported-by: Wenbo Zhang <ethercflow@gmail.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200313172336.1879637-3-andriin@fb.com
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
6d64d927a2 vmtests: enable previously failing kprobe selftests
With fixes in selftests, these tests should now pass.
Also add the ability to add comments to the blacklist/whitelist to explain
why a certain test is disabled.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
cd87f1568e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   abbc61a5f26d52a5d3abbbe552b275360b2c6631
Checkpoint bpf-next commit: 9b79c0be350d3825ef26ed9eebac6ae50df506bc
Baseline bpf commit:        542bf38f11d11bf98c69b2f83f3519ada8a76e95
Checkpoint bpf commit:      90db6d772f749e38171d04619a5e3cd8804a6d02

Andrii Nakryiko (4):
  libbpf: Fix handling of optional field_name in
    btf_dump__emit_type_decl
  bpf: Switch BPF UAPI #define constants used from BPF program side to
    enums
  libbpf: Assume unsigned values for BTF_KIND_ENUM
  libbpf: Split BTF presence checks into libbpf- and kernel-specific
    parts

Carlos Neira (1):
  bpf: Added new helper bpf_get_ns_current_pid_tgid

Eelco Chaudron (1):
  bpf: Add bpf_xdp_output() helper

KP Singh (2):
  bpf: Introduce BPF_MODIFY_RETURN
  tools/libbpf: Add support for BPF_MODIFY_RETURN

Willem de Bruijn (1):
  bpf: Sync uapi bpf.h to tools/

 include/uapi/linux/bpf.h | 223 +++++++++++++++++++++++++++------------
 src/btf_dump.c           |  10 +-
 src/libbpf.c             |  21 +++-
 3 files changed, 176 insertions(+), 78 deletions(-)

--
2.17.1
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
c417a4cb6f sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-03-12 22:57:51 -07:00
Eelco Chaudron
fa21d33fff bpf: Add bpf_xdp_output() helper
Introduce a new helper that reuses the existing xdp perf_event output
implementation, but can be called from raw_tracepoint programs that
receive 'struct xdp_buff *' as a tracepoint argument.
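
A sketch modeled on the accompanying selftest usage (names illustrative; the
usual vmlinux.h/bpf_helpers.h/bpf_tracing.h includes are assumed): an fentry
program attached to an XDP program forwards packet metadata through the new
helper:

  struct meta {
          int ifindex;
          int pkt_len;
  };

  struct {
          __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
          __uint(key_size, sizeof(int));
          __uint(value_size, sizeof(int));
  } perf_buf_map SEC(".maps");

  SEC("fentry/myfunc") /* entry of the traced XDP program */
  int BPF_PROG(trace_on_entry, struct xdp_buff *xdp)
  {
          void *data_end = (void *)(long)xdp->data_end;
          void *data = (void *)(long)xdp->data;
          struct meta meta;

          meta.ifindex = xdp->rxq->dev->ifindex;
          meta.pkt_len = data_end - data;
          bpf_xdp_output(xdp, &perf_buf_map, BPF_F_CURRENT_CPU,
                         &meta, sizeof(meta));
          return 0;
  }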

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158348514556.2239.11050972434793741444.stgit@xdp-tutorial
2020-03-12 22:57:51 -07:00
Carlos Neira
84cf76de9c bpf: Added new helper bpf_get_ns_current_pid_tgid
Add a new bpf helper, bpf_get_ns_current_pid_tgid. This helper returns
the pid and tgid of the current task in the namespace matching the
provided dev_t and inode number; this allows us to instrument a process
inside a container.
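
A sketch of how this could be used (the helper and struct are as added by
this patch; the attach point and globals are illustrative):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  /* filled in by user space, e.g. from stat() on /proc/PID/ns/pid */
  const volatile __u64 pidns_dev = 0;
  const volatile __u64 pidns_ino = 0;

  SEC("tracepoint/syscalls/sys_enter_write")
  int trace_in_ns(void *ctx)
  {
          struct bpf_pidns_info ns = {};

          /* non-zero: current task is not in that pid namespace */
          if (bpf_get_ns_current_pid_tgid(pidns_dev, pidns_ino,
                                          &ns, sizeof(ns)))
                  return 0;

          /* ns.pid/ns.tgid as seen inside the container */
          bpf_printk("in-ns pid %u tgid %u", ns.pid, ns.tgid);
          return 0;
  }

  char _license[] SEC("license") = "GPL";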

Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200304204157.58695-3-cneirabustos@gmail.com
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
2ef4fdac6c libbpf: Split BTF presence checks into libbpf- and kernel-specific parts
The need for application BTF to be present differs between user-space
libbpf and the kernel. Currently, BTF is mandatory in the kernel only when
the BPF application is using STRUCT_OPS, while libbpf itself relies more
heavily on the presence of BTF:
  - for BTF-defined maps;
  - for Kconfig externs;
  - for STRUCT_OPS as well.

Thus, checks for the presence and validity of a bpf_object's BTF need to
be performed separately, which is what this patch does.

Fixes: 5327644614a1 ("libbpf: Relax check whether BTF is mandatory")
Reported-by: Michal Rostecki <mrostecki@opensuse.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200312185033.736911-1-andriin@fb.com
2020-03-12 22:57:51 -07:00
KP Singh
1d72c9c382 tools/libbpf: Add support for BPF_MODIFY_RETURN
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200304191853.1529-6-kpsingh@chromium.org
2020-03-12 22:57:51 -07:00
KP Singh
7930230b43 bpf: Introduce BPF_MODIFY_RETURN
When multiple programs are attached, each program receives the return
value from the previous program on the stack and the last program
provides the return value to the attached function.

The fmod_ret bpf programs are run after the fentry programs and before
the fexit programs. The original function is only called if all the
fmod_ret programs return 0 to avoid any unintended side-effects. The
success value, i.e. 0, is not currently configurable, but could be made
so, with user space specifying it at load time.

For example:

int func_to_be_attached(int a, int b)
{  <--- do_fentry

do_fmod_ret:
   <update ret by calling fmod_ret>
   if (ret != 0)
        goto do_fexit;

original_function:

    <side_effects_happen_here>

}  <--- do_fexit

The fmod_ret program attached to this function can be defined as:

SEC("fmod_ret/func_to_be_attached")
int BPF_PROG(func_name, int a, int b, int ret)
{
        // This will skip the original function logic.
        return 1;
}

The first fmod_ret program is passed 0 in its return argument.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200304191853.1529-4-kpsingh@chromium.org
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
483a8c238f libbpf: Assume unsigned values for BTF_KIND_ENUM
Currently, BTF_KIND_ENUM type doesn't record whether enum values should be
interpreted as signed or unsigned. In Linux, most enums are unsigned, though,
so interpreting them as unsigned matches the real world better.

Change btf_dump test case to test maximum 32-bit value, instead of negative
value.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200303003233.3496043-3-andriin@fb.com
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
26cbe2384c bpf: Switch BPF UAPI #define constants used from BPF program side to enums
Switch BPF UAPI constants, previously defined as #define macros, to
anonymous enum values. This preserves the constants' values and behavior in
expressions, but has the added advantage of being captured as part of DWARF
and, subsequently, BTF type info. This, in turn, greatly improves the
usefulness of the generated vmlinux.h for BPF applications, as it will not
require BPF users to copy/paste various flags and constants, which are
frequently used with BPF helpers. Only those constants that are used/useful
from the BPF program side are converted.
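
Schematically (the flag name here is made up), the conversion looks like:

  /* before: invisible to DWARF/BTF, never makes it into vmlinux.h */
  #define BPF_F_SOME_FLAG (1U << 4)

  /* after: same value and usage in expressions, but captured in BTF */
  enum {
          BPF_F_SOME_FLAG = (1U << 4),
  };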

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200303003233.3496043-2-andriin@fb.com
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
cb4a430c8a libbpf: Fix handling of optional field_name in btf_dump__emit_type_decl
Internal functions, used by btf_dump__emit_type_decl(), assume field_name is
never going to be NULL. Ensure that's always the case.

Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200303180800.3303471-1-andriin@fb.com
2020-03-12 22:57:51 -07:00
Willem de Bruijn
f67d535cdb bpf: Sync uapi bpf.h to tools/
sync tools/include/uapi/linux/bpf.h to match include/uapi/linux/bpf.h

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200303200503.226217-3-willemdebruijn.kernel@gmail.com
2020-03-12 22:57:51 -07:00
Julia Kartseva
ef4785f065 vmtest: libbpf#137 follow-ups
- Run test_{maps|verifier} only with the latest kernel
- Mount run control script
- Style

Signed-off-by: Julia Kartseva <hex@fb.com>
2020-03-12 21:36:30 -07:00
Andrii Nakryiko
9a424bea42 vmtests: add few missing Kconfig settings
Add a few missing Kconfig settings that selftests might rely on.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-03-11 14:44:28 -07:00
Julia Kartseva
10e4311ad7 vmtest: add mkrootfs.sh to build Arch Linux disk image
Generate a disk image for libbpf testing in compressed *.zst format

The mkrootfs.sh has the following stages:
- run pacstrap to install libbpf and selftests dependencies.
- create /etc/fstab w/ bpffs and debugfs filesystems
- create /etc/init.d/rcS to mount them at boot time
- create /etc/inittab to invoke /etc/init.d/rcS
- compress an image

In addition, ./travis-ci/vmtest/run.sh sets up an ext4 fs and mounts
it as a loop device:
mkfs.ext4 -q "$tmp"
mount -o loop "$tmp" "$mnt"

Signed-off-by: Julia Kartseva <hex@fb.com>
2020-03-11 08:31:13 -07:00
Julia Kartseva
50febacba1 vmtest: disk image update; run test_{maps|verifier}; blacklist update
The disk image is updated to 2020-03-11.

blacklist for LATEST kernel:
attach_probe (needs root cause)
perf_buffer (needs root cause)
send_signal (flaky)
sockmap_listen (flaky)

Run test_maps and test_verifier.
test_maps is not expected to pass for kernels other than LATEST.

Signed-off-by: Julia Kartseva <hex@fb.com>
2020-03-11 08:31:13 -07:00
Andrii Nakryiko
ef7d57fcec vmtest: blacklist link_pinning selftest on 5.5.0
Link pinning is not supported by 5.5.0 and older kernels.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-03-03 00:05:56 -08:00
Andrii Nakryiko
7e7a15321e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   503d539a6e417b018616bf3060e0b5814fafce47
Checkpoint bpf-next commit: abbc61a5f26d52a5d3abbbe552b275360b2c6631
Baseline bpf commit:        41f57cfde186dba6e357f9db25eafbed017e4487
Checkpoint bpf commit:      542bf38f11d11bf98c69b2f83f3519ada8a76e95

Andrii Nakryiko (3):
  libbpf: Fix use of PT_REGS_PARM macros with vmlinux.h
  libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's
    bpf_tracing.h
  libbpf: Add bpf_link pinning/unpinning

 src/bpf_tracing.h | 120 +++++++++++++++++++++++++++++++++++++++++-
 src/libbpf.c      | 131 ++++++++++++++++++++++++++++++++++++----------
 src/libbpf.h      |   5 ++
 src/libbpf.map    |   5 ++
 4 files changed, 233 insertions(+), 28 deletions(-)

--
2.17.1
2020-03-03 00:05:56 -08:00
Andrii Nakryiko
77ac09c3eb libbpf: Add bpf_link pinning/unpinning
With the bpf_link abstraction explicitly supported by the kernel, add a
pinning/unpinning API for links. Also allow creating (opening) a bpf_link
from a BPF FS file.

This API allows "ephemeral" FD-based BPF links (like raw tracepoint or
fexit/freplace attachments) to survive user process exit by pinning them in
a BPF FS, which is an important use case for long-running BPF programs.

As part of this, expose the underlying FD for bpf_link. While legacy
bpf_links might not have an FD associated with them (which will be
expressed as a bpf_link with fd=-1), the kernel's abstraction is based
around FD-based usage, so match it closely. This, subsequently, allows a
generic pinning/unpinning API for the generalized bpf_link. For some types
of bpf_link the kernel might not support pinning, in which case
bpf_link__pin() will return an error.

With the FD being part of the generic bpf_link, also get rid of
bpf_link_fd in favor of using vanilla bpf_link.
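
A short usage sketch (assuming prog is an already-loaded bpf_program and a
BPF FS is mounted at /sys/fs/bpf):

  struct bpf_link *link;

  link = bpf_program__attach(prog);
  if (libbpf_get_error(link))
          return -1;
  /* keep the attachment alive past this process's exit */
  if (bpf_link__pin(link, "/sys/fs/bpf/my_link"))
          return -1;

  /* later, possibly from another process */
  link = bpf_link__open("/sys/fs/bpf/my_link");
  if (!libbpf_get_error(link)) {
          bpf_link__unpin(link);
          bpf_link__destroy(link);
  }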

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200303043159.323675-3-andriin@fb.com
2020-03-03 00:05:56 -08:00
Andrii Nakryiko
40a08ef216 libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's bpf_tracing.h
Move BPF_PROG, BPF_KPROBE, and BPF_KRETPROBE macro into libbpf's bpf_tracing.h
header to make it available for non-selftests users.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200229231112.1240137-5-andriin@fb.com
2020-03-03 00:05:56 -08:00
Andrii Nakryiko
b6683d1aeb libbpf: Fix use of PT_REGS_PARM macros with vmlinux.h
Add detection of vmlinux.h to bpf_tracing.h header for PT_REGS macro.
Currently, BPF applications have to define __KERNEL__ symbol to use correct
definition of struct pt_regs on x86 arch. This is due to different field names
under internal kernel vs UAPI conditions. To make this more transparent for
users, detect vmlinux.h by checking __VMLINUX_H__ symbol.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200229231112.1240137-3-andriin@fb.com
2020-03-03 00:05:56 -08:00
Julia Kartseva
5247b0b0dc vmtest: enable more networking kernel selftests
Set up loopback to enable more tests:
- bpf_tcp_ca
- cgroup_attach_autodetach
- cgroup_attach_multi
- cgroup_attach_override
- select_reuseport
- sockmap_ktls

Signed-off-by: Julia Kartseva <hex@fb.com>
2020-02-26 14:02:34 -08:00
Andrii Nakryiko
c2b01ad4f3 vmtest: trim down kernel config to minimize build time
Remove unnecessary drivers and features to speed up kernel compilation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-26 12:08:31 -08:00
Andrii Nakryiko
c4468dec74 sync: bump kernel commit to latest to pull in latest selftests
Manually bump the sync commit from the kernel repo. There are no libbpf
changes, but we need the latest selftest patches to try to debug more of
the crashing selftests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-25 20:01:06 -08:00
Andrii Nakryiko
40229b3ffd ci: enable more test_progs tests
Trim tests blacklist.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-22 09:20:41 -08:00
Andrii Nakryiko
7f2d538c27 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   5327644614a18f5d0ff845844a4e9976210b3d8d
Checkpoint bpf-next commit: 8eece07c011f88da0ccf4127fca9a4e4faaf58ae
Baseline bpf commit:        41f57cfde186dba6e357f9db25eafbed017e4487
Checkpoint bpf commit:      41f57cfde186dba6e357f9db25eafbed017e4487

Eelco Chaudron (2):
  libbpf: Bump libbpf current version to v0.0.8
  libbpf: Add support for dynamic program attach target

 src/libbpf.c   | 34 ++++++++++++++++++++++++++++++----
 src/libbpf.h   |  4 ++++
 src/libbpf.map |  5 +++++
 3 files changed, 39 insertions(+), 4 deletions(-)

--
2.17.1
2020-02-22 09:20:41 -08:00
Eelco Chaudron
b7c162a433 libbpf: Add support for dynamic program attach target
Currently, when you want to attach a trace program to a bpf program,
the section name needs to match the tracepoint/function semantics.

However, the addition of the bpf_program__set_attach_target() API
allows you to specify the tracepoint/function dynamically.

The call flow would look something like this:

  xdp_fd = bpf_prog_get_fd_by_id(id);
  trace_obj = bpf_object__open_file("func.o", NULL);
  prog = bpf_object__find_program_by_title(trace_obj,
                                           "fentry/myfunc");
  bpf_program__set_expected_attach_type(prog, BPF_TRACE_FENTRY);
  bpf_program__set_attach_target(prog, xdp_fd,
                                 "xdpfilt_blk_all");
  bpf_object__load(trace_obj);

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158220519486.127661.7964708960649051384.stgit@xdp-tutorial
2020-02-22 09:20:41 -08:00
Eelco Chaudron
36c26f12f1 libbpf: Bump libbpf current version to v0.0.8
New development cycles starts, bump to v0.0.8.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158220518424.127661.8278643006567775528.stgit@xdp-tutorial
2020-02-22 09:20:41 -08:00
Andrii Nakryiko
22d5d40493 ci: fetch and build latest pahole
Build the latest pahole from source instead of relying on the hacky
Ubuntu repository approach.
Also enable tests for the latest kernel that rely on pahole 1.16.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-21 20:39:35 -08:00
Andrii Nakryiko
17c26b7da6 ci: clean up .travis.yaml
Clean up Travis CI config, extract multi-step initializations into scripts.
Also, move kernel-building tests to happen last to not block lightweight
Debian and Ubuntu tests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-21 20:39:35 -08:00
Andrii Nakryiko
e287979374 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   35b9211c0a2427e8f39e534f442f43804fc8d5ca
Checkpoint bpf-next commit: 5327644614a18f5d0ff845844a4e9976210b3d8d
Baseline bpf commit:        08dc225d8868d5094ada62f471ebdfcce9dbc298
Checkpoint bpf commit:      41f57cfde186dba6e357f9db25eafbed017e4487

Andrii Nakryiko (1):
  libbpf: Relax check whether BTF is mandatory

Daniel Xu (1):
  selftests/bpf: Add bpf_read_branch_records() selftest

Toke Høiland-Jørgensen (2):
  bpf, uapi: Remove text about bpf_redirect_map() giving higher
    performance
  libbpf: Sanitise internal map names so they are not rejected by the
    kernel

 include/uapi/linux/bpf.h | 41 ++++++++++++++++++++++++++++++----------
 src/libbpf.c             | 12 ++++++++----
 2 files changed, 39 insertions(+), 14 deletions(-)

--
2.17.1
2020-02-20 17:56:42 -08:00
Andrii Nakryiko
552af3d963 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-02-20 17:56:42 -08:00
Toke Høiland-Jørgensen
c772c9cbde libbpf: Sanitise internal map names so they are not rejected by the kernel
The kernel only accepts map names with alphanumeric characters,
underscores and periods in their name. However, the auto-generated
internal map names used by libbpf take their prefix from the
user-supplied BPF object name, which has no such restriction. This can
lead to "Invalid argument" errors when trying to load a BPF program
using global variables.

Fix this by sanitising the map names, replacing any non-allowed characters
with underscores.
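
A sketch of the sanitising rule described above (not libbpf's exact code):

  #include <ctype.h>

  static void sanitize_map_name(char *name)
  {
          for (char *p = name; *p; p++) {
                  if (!isalnum((unsigned char)*p) &&
                      *p != '_' && *p != '.')
                          *p = '_';
          }
  }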

Fixes: d859900c4c56 ("bpf, libbpf: support global data/bss/rodata sections")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200217171701.215215-1-toke@redhat.com
2020-02-20 17:56:42 -08:00
Toke Høiland-Jørgensen
031a38cceb bpf, uapi: Remove text about bpf_redirect_map() giving higher performance
The performance of bpf_redirect() is now roughly the same as that of
bpf_redirect_map(). However, David Ahern pointed out that the header file
has not been updated to reflect this, and still says that a significant
performance increase is possible when using bpf_redirect_map(). Remove this
text from the bpf_redirect_map() description, and reword the description in
bpf_redirect() slightly. Also fix the 'Return' section of the
bpf_redirect_map() documentation.

Fixes: 1d233886dd90 ("xdp: Use bulking for non-map XDP_REDIRECT and consolidate code paths")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200218130334.29889-1-toke@redhat.com
2020-02-20 17:56:42 -08:00
Andrii Nakryiko
6ff5062480 libbpf: Relax check whether BTF is mandatory
If a BPF program is using BTF-defined maps, BTF is required only for
libbpf itself to process the map definitions. If, after that, BTF fails to
be loaded into the kernel (e.g., if the kernel doesn't support BTF at
all), this shouldn't prevent a valid BPF program from loading. The
existing retry-without-BTF logic for creating maps will successfully
create such maps without any problems. So, the presence of a .maps section
shouldn't make BTF required for the kernel. Update the check accordingly.

Validated by ensuring simple BPF program with BTF-defined maps is still
loaded on old kernel without BTF support and map is correctly parsed and
created.

Fixes: abd29c931459 ("libbpf: allow specifying map definitions using BTF")
Reported-by: Julia Kartseva <hex@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200220062635.1497872-1-andriin@fb.com
2020-02-20 17:56:42 -08:00
Daniel Xu
fdff85e63e selftests/bpf: Add bpf_read_branch_records() selftest
Add a selftest to test:

* default bpf_read_branch_records() behavior
* BPF_F_GET_BRANCH_RECORDS_SIZE flag behavior
* error path on non branch record perf events
* using helper to write to stack
* using helper to write to global

On host with hardware counter support:

    # ./test_progs -t perf_branches
    #27/1 perf_branches_hw:OK
    #27/2 perf_branches_no_hw:OK
    #27 perf_branches:OK
    Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED

On host without hardware counter support (VM):

    # ./test_progs -t perf_branches
    #27/1 perf_branches_hw:OK
    #27/2 perf_branches_no_hw:OK
    #27 perf_branches:OK
    Summary: 1/2 PASSED, 1 SKIPPED, 0 FAILED

Also sync tools/include/uapi/linux/bpf.h.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200218030432.4600-3-dxu@dxuuu.xyz
2020-02-20 17:56:42 -08:00
Andrii Nakryiko
5c7661fd5e vmtest: update and sort blacklists
Update blacklists to omit some of the newest selftests. Also ensure that
blacklist is sorted alphabetically.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-20 17:56:42 -08:00
Andrii Nakryiko
1feb21b081 vmtest: remove temporary runqslower fix
It's now in bpf-next, and this workaround is not needed anymore.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-20 11:48:22 -08:00
Andrii Nakryiko
fa8cb316fb sync: fix commit signature determination in sync script
The commit signature, used to determine already-synced commits, includes
short stats for each relevant file. Fix the script to include only files
that are actually synced (i.e., exclude Makefile, Build, etc).

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-02-20 11:12:59 -08:00
Julia Kartseva
f72fe00e70 vmtest: #121 follow-ups. Loop increase bpf-next git fetch depth
- The previously introduced git fetch depth of the bpf-next tree is not
sufficient when bpf-next is far ahead of the libbpf checkpoint commit, so
increase the depth up to a maximum of 128. Since 128 may be overkill for
the general case, increase it exponentially in a loop until the max is
reached.

- Do not fetch bpf-next twice
- Remove setup_example.sh
2020-02-19 15:01:47 -08:00
Julia Kartseva
583bddce6b vmtest: build and run bpf kernel selftests against various kernels
Run kernel selftests in vmtest with the goal of testing libbpf backward
compatibility with older kernels.
The list of kernels should be specified in the .travis.yml config in the
`jobs` section, e.g. KERNEL=5.5.0.
Currently listed kernel releases:
- 5.5.0 # built from main
- 5.5.0-rc6 # built from bpf-next
- LATEST

The kernel specified as 'LATEST' in .travis.yml is built from the bpf-next
kernel tree; the rest of the kernels are downloaded from the links
specified in the INDEX file.
The kernel sources from bpf-next are manually patched with [1] from the
bpf tree to fix the runqslower build. This workaround should be removed
after the patch is merged from bpf into bpf-next.
Due to the kernel sources being checked out, the duration of the LATEST
kernel test is ~30m.

bpf selftests are built from tools/testing/selftests/bpf/ of the bpf-next
tree with the HEAD revision set to the CHECKPOINT-COMMIT specified in
libbpf, so selftests and libbpf are in sync.
Currently only programs are tested, with the test_progs binary; test_maps
and test_verifier should follow.
test_progs is run with a blacklist, required due to:
- some features, e.g. fentry/fexit, not being supported in older kernels
- environment limitations, e.g. the absence of a recent pahole in Debian
- an incomplete disk image

The blacklist is passed to test_progs with the -b option, as specified in
the [2] patch set.

Most of the preceding tests are disabled due to the incomplete disk image,
which currently lacks proper networking settings.
For the LATEST kernel, some fentry/fexit tests are disabled because pahole
v1.16 is not available in Debian yet.

Next steps are resolving issues with blacklisted tests, enabling maps and
verifier testing, and expanding the list of tested kernels.

[1] https://lore.kernel.org/bpf/908498f794661c44dca54da9e09dc0c382df6fcb.1580425879.git.hex@fb.com/t.mbox.gz
[2] https://www.spinics.net/lists/netdev/msg625192.html
2020-02-17 22:12:17 -08:00
Julia Kartseva
a52fb86a96 vmtest: add configs for bpf kernel selftests
vmtest is run as a TravisCI job in order to test libbpf backward
compatibility with older kernels.

Add config files required to build and run bpf kernel selftests in vmtest:
- latest.config: latest kernel config
- INDEX: links to binaries (kernels, disk image) to download
- blacklist/BLACKLIST-${kernel}: blacklisted bpf program tests for ${kernel}
2020-02-17 22:12:17 -08:00
Andrii Nakryiko
e5dbc1a96f sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   a6ed02cac690b635dbb938690e795812ce1e14ca
Checkpoint bpf-next commit: 35b9211c0a2427e8f39e534f442f43804fc8d5ca
Baseline bpf commit:        1712b2fff8c682d145c7889d2290696647d82dab
Checkpoint bpf commit:      08dc225d8868d5094ada62f471ebdfcce9dbc298

Alexei Starovoitov (1):
  libbpf: Add support for program extensions

Andrii Nakryiko (2):
  libbpf: Improve handling of failed CO-RE relocations
  libbpf: Fix realloc usage in bpf_core_find_cands

Antoine Tenart (1):
  net: macsec: introduce the macsec_context structure

Martin KaFai Lau (1):
  bpf: Sync uapi bpf.h to tools/

 include/uapi/linux/bpf.h     |  10 +++-
 include/uapi/linux/if_link.h |   7 +++
 src/bpf.c                    |   3 +-
 src/libbpf.c                 | 112 +++++++++++++++++++++--------------
 src/libbpf.h                 |   8 ++-
 src/libbpf.map               |   2 +
 src/libbpf_probes.c          |   1 +
 7 files changed, 97 insertions(+), 46 deletions(-)

--
2.17.1
2020-01-24 14:08:27 -08:00
Andrii Nakryiko
96333403ca sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-01-24 14:08:27 -08:00
Andrii Nakryiko
928f2fc146 libbpf: Fix realloc usage in bpf_core_find_cands
Fix a bug requesting an invalid size for the reallocated array when
constructing the CO-RE relocation candidate list. This can cause problems
if there are many potential candidates and very fine-grained memory
allocator bucket sizes are used.

Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Reported-by: William Smith <williampsmith@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200124201847.212528-1-andriin@fb.com
2020-01-24 14:08:27 -08:00
Andrii Nakryiko
8fd8b5bb46 libbpf: Improve handling of failed CO-RE relocations
Previously, if libbpf failed to resolve a CO-RE relocation for some
instruction, it would either return an error immediately, or, if the
.relaxed_core_relocs option was set, would replace the relocatable
offset/imm part of an instruction with a bogus value (-1). Neither
approach is good, because there are many possible scenarios where
relocation is expected to fail (e.g., when some field knowingly can be
missing on specific kernel versions). On the other hand, replacing the
offset with an invalid one can hide programmer errors, if the relocation
failure wasn't anticipated.

This patch deprecates the .relaxed_core_relocs option and changes the
approach to always replacing the instruction, for which relocation failed,
with an invalid BPF helper call instruction. For cases where this is
expected, the BPF program should already ensure that the instruction is
unreachable, in which case this invalid instruction is silently ignored.
But if the instruction wasn't guarded, the BPF program will be rejected at
the verification step, with the verifier log pointing precisely to the
place in the assembly where the problem is.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200124053837.2434679-1-andriin@fb.com
2020-01-24 14:08:27 -08:00
Martin KaFai Lau
b999e8f2c1 bpf: Sync uapi bpf.h to tools/
This patch syncs uapi bpf.h to tools/.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200122233652.903348-1-kafai@fb.com
2020-01-24 14:08:27 -08:00
Alexei Starovoitov
c6c86a53f2 libbpf: Add support for program extensions
Add minimal support for program extensions. bpf_object_open_opts needs to
be set up with attach_prog_fd = target_prog_fd, and the BPF program
extension needs a section definition in its .c file like
SEC("freplace/func_to_be_replaced"). libbpf will search for
"func_to_be_replaced" in the target_prog_fd's BTF and will pass it as
attach_btf_id to the kernel. This approach works for tests, but more
complex use cases may need the function name (and the attach_btf_id the
kernel sees) to be more dynamic. Such an API will be added in future
patches.
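
A sketch of the flow (object file and function names illustrative):

  /* BPF side: the extension's signature must match the replaced func */
  SEC("freplace/func_to_be_replaced")
  int new_impl(struct xdp_md *ctx)
  {
          return XDP_PASS;
  }

  /* user-space side */
  DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
          .attach_prog_fd = target_prog_fd,
  );
  struct bpf_object *obj = bpf_object__open_file("ext_prog.o", &opts);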

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200121005348.2769920-3-ast@kernel.org
2020-01-24 14:08:27 -08:00
Antoine Tenart
c69f0d12f3 net: macsec: introduce the macsec_context structure
This patch introduces the macsec_context structure. It will be used
in the kernel to exchange information between the common MACsec
implementation (macsec.c) and the MACsec hardware offloading
implementations. This structure contains pointers to MACsec specific
structures which contain the actual MACsec configuration, and to the
underlying device (phydev for now).

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24 14:08:27 -08:00
Andrii Nakryiko
033ad7ee78 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   20f21d98cf12b8ecd69e8defc93fae9e3b353b13
Checkpoint bpf-next commit: a6ed02cac690b635dbb938690e795812ce1e14ca
Baseline bpf commit:        1712b2fff8c682d145c7889d2290696647d82dab
Checkpoint bpf commit:      1712b2fff8c682d145c7889d2290696647d82dab

Andrii Nakryiko (3):
  libbpf: Fix error handling bug in btf_dump__new
  libbpf: Simplify BTF initialization logic
  libbpf: Fix potential multiplication overflow in mmap() size
    calculation

KP Singh (1):
  libbpf: Load btf_vmlinux only once per object.

 src/btf_dump.c |   1 +
 src/libbpf.c   | 174 ++++++++++++++++++++++++++++++-------------------
 2 files changed, 109 insertions(+), 66 deletions(-)

--
2.17.1
2020-01-17 20:33:22 -08:00
KP Singh
397db2175d libbpf: Load btf_vmlinux only once per object.
As more programs (TRACING, STRUCT_OPS, and upcoming LSM) use vmlinux
BTF information, loading the BTF vmlinux information for every program
in an object is sub-optimal. The fix was originally proposed in:

   https://lore.kernel.org/bpf/CAEf4BzZodr3LKJuM7QwD38BiEH02Cc1UbtnGpVkCJ00Mf+V_Qg@mail.gmail.com/

btf_vmlinux is populated in the object, just before the programs are
loaded, if any of the programs in the object requires it, and is freed
after the programs finish loading.

Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Brendan Jackman <jackmanb@chromium.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200117212825.11755-1-kpsingh@chromium.org
2020-01-17 20:33:22 -08:00
Andrii Nakryiko
6756bdc96e libbpf: Fix potential multiplication overflow in mmap() size calculation
Prevent a potential overflow performed in 32-bit integers before the
result is assigned to a size_t. Reported by LGTM static analysis.

Fixes: eba9c5f498a1 ("libbpf: Refactor global data map initialization")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200117060801.1311525-4-andriin@fb.com
2020-01-17 20:33:22 -08:00
Andrii Nakryiko
9c1ae55dbd libbpf: Simplify BTF initialization logic
Current implementation of bpf_object's BTF initialization is very convoluted
and thus prone to errors. It doesn't have to be like that. This patch
simplifies it significantly.

This code also triggered static analysis issues over logically dead code due
to redundant error checks. This simplification should fix that as well.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200117060801.1311525-3-andriin@fb.com
2020-01-17 20:33:22 -08:00
Andrii Nakryiko
091f073ff0 libbpf: Fix error handling bug in btf_dump__new
Fix missing jump to error handling in btf_dump__new, found by Coverity static
code analysis.

Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200117060801.1311525-2-andriin@fb.com
2020-01-17 20:33:22 -08:00
Andrii Nakryiko
ad51a528dc travis-ci: make sure before_script override is non-empty
Travis CI seems to be ignoring an empty before_script override. Let's
make sure it's a non-empty no-op.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-01-17 20:00:47 -08:00
Andrii Nakryiko
fa29cc01ff sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   858e284f0ec18bff2620d9a6afe764dc683f8ba1
Checkpoint bpf-next commit: 20f21d98cf12b8ecd69e8defc93fae9e3b353b13
Baseline bpf commit:        1712b2fff8c682d145c7889d2290696647d82dab
Checkpoint bpf commit:      1712b2fff8c682d145c7889d2290696647d82dab

Andrii Nakryiko (1):
  libbpf: Revert bpf_helper_defs.h inclusion regression

 src/bpf_helpers.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
2.17.1
2020-01-16 20:06:41 -08:00
Andrii Nakryiko
a2bec08412 libbpf: Revert bpf_helper_defs.h inclusion regression
Revert bpf_helpers.h's change to include the auto-generated
bpf_helper_defs.h through <> instead of "", which causes it to be searched
for in the include path. This can break existing applications that don't
have their include path pointing directly to where libbpf installs its
headers.

There is ongoing work to make all (not just bpf_helper_defs.h) includes more
consistent across libbpf and its consumers, but this unbreaks user code as is
right now without any regressions. Selftests still behave sub-optimally
(taking bpf_helper_defs.h from libbpf's source directory, if it's present
there), which will be fixed in subsequent patches.

Fixes: 6910d7d3867a ("selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir")
Reported-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200117004103.148068-1-andriin@fb.com
2020-01-16 20:06:41 -08:00
Andrii Nakryiko
080fd68e9c sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   1d1a3bcffe360a56fd8cc287ed74d4c3066daf42
Checkpoint bpf-next commit: 858e284f0ec18bff2620d9a6afe764dc683f8ba1
Baseline bpf commit:        e7a5f1f1cd0008e5ad379270a8657e121eedb669
Checkpoint bpf commit:      1712b2fff8c682d145c7889d2290696647d82dab

Andrii Nakryiko (2):
  tools: Sync uapi/linux/if_link.h
  libbpf: Support .text sub-calls relocations

Brian Vazquez (1):
  libbpf: Fix unneeded extra initialization in bpf_map_batch_common

Martin KaFai Lau (1):
  libbpf: Expose bpf_find_kernel_btf as a LIBBPF_API

Yonghong Song (3):
  bpf: Add bpf_send_signal_thread() helper
  tools/bpf: Sync uapi header bpf.h
  libbpf: Add libbpf support to batch ops

 include/uapi/linux/bpf.h     |  40 +++++++++++-
 include/uapi/linux/if_link.h |   1 +
 src/bpf.c                    |  58 +++++++++++++++++
 src/bpf.h                    |  22 +++++++
 src/btf.c                    | 102 +++++++++++++++++++++++++++--
 src/btf.h                    |   2 +
 src/libbpf.c                 | 122 +++++++----------------------------
 src/libbpf.map               |   5 ++
 8 files changed, 247 insertions(+), 105 deletions(-)

--
2.17.1
2020-01-16 10:38:48 -08:00
Andrii Nakryiko
437f57042c sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-01-16 10:38:48 -08:00
Brian Vazquez
f4f271b068 libbpf: Fix unneeded extra initialization in bpf_map_batch_common
bpf_attr doesn't need to be declared with '= {}' since memset() is used
in the code.

Fixes: 2ab3d86ea1859 ("libbpf: Add libbpf support to batch ops")
Reported-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200116045918.75597-1-brianvv@google.com
2020-01-16 10:38:48 -08:00
Martin KaFai Lau
bd35a43bb3 libbpf: Expose bpf_find_kernel_btf as a LIBBPF_API
This patch exposes bpf_find_kernel_btf() as a LIBBPF_API.
It will be used in 'bpftool map dump' in a following patch
to dump a map with btf_vmlinux_value_type_id set.

bpf_find_kernel_btf() is renamed to libbpf_find_kernel_btf()
and moved to btf.c.  As <linux/kernel.h> is included,
some of the max/min type casting needs to be fixed.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200115230031.1102305-1-kafai@fb.com
2020-01-16 10:38:48 -08:00
Yonghong Song
d91f681d3b libbpf: Add libbpf support to batch ops
Added four libbpf API functions to support map batch operations:
  . int bpf_map_delete_batch( ... )
  . int bpf_map_lookup_batch( ... )
  . int bpf_map_lookup_and_delete_batch( ... )
  . int bpf_map_update_batch( ... )
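
For example, a batched lookup over a map with __u32 keys/values could look
like this sketch (NULL in_batch starts iteration from the beginning; pass
&batch on subsequent calls, and ENOENT signals the end of the map):

  __u32 keys[64], vals[64], batch;
  __u32 count = 64; /* in: capacity, out: entries returned */
  int err;

  err = bpf_map_lookup_batch(map_fd, NULL, &batch,
                             keys, vals, &count, NULL);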

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-8-brianvv@google.com
2020-01-16 10:38:48 -08:00
Yonghong Song
1e51491d05 tools/bpf: Sync uapi header bpf.h
sync uapi header include/uapi/linux/bpf.h to
tools/include/uapi/linux/bpf.h

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-7-brianvv@google.com
2020-01-16 10:38:48 -08:00
Andrii Nakryiko
37440e95d1 libbpf: Support .text sub-calls relocations
The LLVM patch https://reviews.llvm.org/D72197 makes LLVM emit function call
relocations within the same section. This includes a default .text section,
which contains any BPF sub-programs. This wasn't the case before, and so
libbpf was able to get away with slightly simpler handling of subprogram
call relocations.

This patch adds support for .text section relocations. It needs to ensure
correct order of relocations, so does two passes:
- first, relocate .text instructions, if there are any relocations in it;
- then process all the other programs and copy over patched .text instructions
for all sub-program calls.

v1->v2:
- break early once .text program is processed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115190856.2391325-1-andriin@fb.com
2020-01-16 10:38:48 -08:00
Yonghong Song
681f2f9291 bpf: Add bpf_send_signal_thread() helper
Commit 8b401f9ed244 ("bpf: implement bpf_send_signal() helper")
added the helper bpf_send_signal(), which permits a bpf program to
send a signal to the current process. The signal may be
delivered to any thread in the process.

We found a use case where sending the signal to the current
thread is preferable.
  - A bpf program will collect the stack trace and then
    send a signal to the user application.
  - The user application will add some thread specific
    information to the just collected stack trace for
    later analysis.

If bpf_send_signal() is used, the user application will need
to check whether the thread receiving the signal matches
the thread that collected the stack, by comparing thread ids.
If not, it will need to send the signal to another thread
through pthread_kill().

This patch proposes a new helper, bpf_send_signal_thread(),
which sends the signal to the thread corresponding to
the current kernel task. This way, user space is guaranteed that
the bpf program execution context and the user-space signal
handling context are the same thread.
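
A sketch of the intended pattern (attach point illustrative):

  SEC("perf_event")
  int collect_and_signal(void *ctx)
  {
          /* ... collect the stack trace into a map (elided) ... */

          /* deliver the signal to this very thread, so the user-space
           * handler runs in the thread whose stack was captured */
          bpf_send_signal_thread(10 /* SIGUSR1 */);
          return 0;
  }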

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115035002.602336-1-yhs@fb.com
2020-01-16 10:38:48 -08:00
Andrii Nakryiko
0e4638ec14 tools: Sync uapi/linux/if_link.h
Sync uapi/linux/if_link.h into tools to avoid out of sync warnings during
libbpf build.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200113073143.1779940-2-andriin@fb.com
2020-01-16 10:38:48 -08:00
hex
234a45a128 Update README with Distribution section
- List of current distros having libbpf packaged from GH
- Rationale of having libbpf packaged from GH
- List of package dependencies
2020-01-14 16:12:42 -08:00
Andrii Nakryiko
8687395198 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2bbc078f812d45b8decb55935dab21199bd21489
Checkpoint bpf-next commit: 1d1a3bcffe360a56fd8cc287ed74d4c3066daf42
Baseline bpf commit:        3c2f450e553ce47fcb0d6141807a8858e3213c9c
Checkpoint bpf commit:      e7a5f1f1cd0008e5ad379270a8657e121eedb669

Alexei Starovoitov (1):
  libbpf: Sanitize global functions

Andrey Ignatov (1):
  bpf: Document BPF_F_QUERY_EFFECTIVE flag

Andrii Nakryiko (3):
  libbpf: Make bpf_map order and indices stable
  selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir
  libbpf: Poison kernel-only integer types

Martin KaFai Lau (2):
  bpf: Synch uapi bpf.h to tools/
  bpf: libbpf: Add STRUCT_OPS support

Michal Rostecki (1):
  libbpf: Add probe for large INSN limit

 include/uapi/linux/bpf.h |  26 +-
 include/uapi/linux/btf.h |   6 +
 src/bpf.c                |  13 +-
 src/bpf.h                |   5 +-
 src/bpf_helpers.h        |   2 +-
 src/bpf_prog_linfo.c     |   3 +
 src/btf.c                |   3 +
 src/btf_dump.c           |   3 +
 src/hashmap.c            |   3 +
 src/libbpf.c             | 701 +++++++++++++++++++++++++++++++++++++--
 src/libbpf.h             |   6 +-
 src/libbpf.map           |   4 +
 src/libbpf_errno.c       |   3 +
 src/libbpf_probes.c      |  26 ++
 src/netlink.c            |   3 +
 src/nlattr.c             |   3 +
 src/str_error.c          |   3 +
 src/xsk.c                |   3 +
 18 files changed, 784 insertions(+), 32 deletions(-)

--
2.17.1
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
0cccc9ff28 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
b50eb28758 libbpf: Poison kernel-only integer types
It's been a recurring issue that types like u32 slip into libbpf source
code accidentally. This is not detected during builds inside the kernel
source tree, but becomes a compilation error in libbpf's Github repo.
Libbpf is supposed to use only __{s,u}{8,16,32,64} typedefs, so poison
{s,u}{8,16,32,64} explicitly in every .c file. Doing that in a more
centralized way, e.g. inside libbpf_internal.h, breaks selftests, which
use both kernel u32 and libbpf_internal.h.

This patch also fixes a new u32 occurrence in libbpf.c, added recently.
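
Concretely, each .c file gains a line along these lines:

  /* make sure libbpf doesn't use kernel-only integer typedefs */
  #pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64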

Fixes: 590a00888250 ("bpf: libbpf: Add STRUCT_OPS support")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200110181916.271446-1-andriin@fb.com
2020-01-10 11:15:12 -08:00
Alexei Starovoitov
8d936a1570 libbpf: Sanitize global functions
In case the kernel doesn't support BTF_FUNC_GLOBAL, sanitize the BTF
produced by the compiler for global functions.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200110064124.1760511-2-ast@kernel.org
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
2c8602eb54 selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir
Reorder the include search path to ensure $(OUTPUT) and $(CURDIR) go before
libbpf's directory. Also fix bpf_helpers.h to include bpf_helper_defs.h in
such a way as to leverage includes search path. This allows selftests to not
use libbpf's local and potentially stale bpf_helper_defs.h. It's important
because selftests/bpf's Makefile only re-generates bpf_helper_defs.h in
seltests' output directory, not the one in libbpf's directory.

Also force regeneration of bpf_helper_defs.h when libbpf.a is updated to
reduce staleness.

Fixes: fa633a0f8919 ("libbpf: Fix build on read-only filesystems")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200110051716.1591485-3-andriin@fb.com
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
1ef23426e7 libbpf: Make bpf_map order and indices stable
Currently, libbpf re-sorts bpf_map structs after all the maps are added and
initialized, which might change their relative order and invalidate any
bpf_map pointer or index taken before that. This is inconvenient and
error-prone. For instance, it can cause the .kconfig map index to point to
the wrong map.

Furthermore, libbpf itself doesn't rely on any specific ordering of bpf_maps,
so it's just an unnecessary complication right now. This patch drops sorting
of maps and makes their relative positions fixed. If an efficient index is ever
needed, it's better to have a separate array of pointers as a search index,
instead of reordering bpf_map struct in-place. This will be less error-prone
and will allow multiple independent orderings, if necessary (e.g., either by
section index or by name).

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Reported-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200110034247.1220142-1-andriin@fb.com
2020-01-10 11:15:12 -08:00
Andrey Ignatov
d2100072b9 bpf: Document BPF_F_QUERY_EFFECTIVE flag
Document the BPF_F_QUERY_EFFECTIVE flag, mostly to clarify how it affects
attach_flags, which may not be obvious and may lead to confusion.

Specifically, attach_flags is returned only for target_fd, but if programs
are inherited from an ancestor cgroup, then the returned attach_flags for
the current cgroup may be confusing. For example, two effective programs
of the same attach_type can be returned, but without BPF_F_ALLOW_MULTI in
attach_flags.

Simple repro:
  # bpftool c s /sys/fs/cgroup/path/to/task
  ID       AttachType      AttachFlags     Name
  # bpftool c s /sys/fs/cgroup/path/to/task effective
  ID       AttachType      AttachFlags     Name
  95043    ingress                         tw_ipt_ingress
  95048    ingress                         tw_ingress

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200108014006.938363-1-rdna@fb.com
2020-01-10 11:15:12 -08:00
Martin KaFai Lau
f3edca46e5 bpf: libbpf: Add STRUCT_OPS support
This patch adds BPF STRUCT_OPS support to libbpf.

The only sec_name convention is SEC(".struct_ops") to identify the
struct_ops implemented in BPF,
e.g. to implement a tcp_congestion_ops:

SEC(".struct_ops")
struct tcp_congestion_ops dctcp = {
	.init           = (void *)dctcp_init,  /* <-- a bpf_prog */
	/* ... some more func ptrs ... */
	.name           = "bpf_dctcp",
};

Each struct_ops is defined as a global variable under SEC(".struct_ops")
as above.  libbpf creates a map for each variable and the variable name
is the map's name.  Multiple struct_ops are supported under
SEC(".struct_ops").

In the bpf_object__open phase, libbpf will look for the SEC(".struct_ops")
section and find out which btf-type the struct_ops is
implementing.  Note that the btf-type here is referring to
a type in the bpf_prog.o's btf.  A "struct bpf_map" is added
by bpf_object__add_map() as for other maps.  It will then
collect (through SHT_REL) which bpf progs the
func ptrs refer to.  No btf_vmlinux is needed in
the open phase.

In the bpf_object__load phase, the map-fields, which depend
on the btf_vmlinux, are initialized (in bpf_map__init_kern_struct_ops()).
It will also set the prog->type, prog->attach_btf_id, and
prog->expected_attach_type.  Thus, the prog's properties do
not rely on its section name.
[ Currently, the bpf_prog's btf-type ==> btf_vmlinux's btf-type matching
  process is as simple as: member-name match + btf-kind match + size match.
  If these matching conditions fail, libbpf will reject it.
  The currently supported target is "struct tcp_congestion_ops", most of
  whose members are function pointers.
  The member ordering of the bpf_prog's btf-type can be different from
  the btf_vmlinux's btf-type. ]

Then, all obj->maps are created as usual (in bpf_object__create_maps()).

Once the maps are created and prog's properties are all set,
the libbpf will proceed to load all the progs.

bpf_map__attach_struct_ops() is added to register a struct_ops
map to a kernel subsystem.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200109003514.3856730-1-kafai@fb.com
2020-01-10 11:15:12 -08:00
Martin KaFai Lau
cabb077325 bpf: Synch uapi bpf.h to tools/
This patch syncs uapi bpf.h to tools/.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200109003512.3856559-1-kafai@fb.com
2020-01-10 11:15:12 -08:00
Michal Rostecki
b95b281039 libbpf: Add probe for large INSN limit
Introduce a new probe which checks whether the kernel has the larger
maximum program size that was increased in the following commit:

c04c0d2b968a ("bpf: increase complexity limit and maximum program size")

Based on the similar check in Cilium[0], authored by Daniel Borkmann.

  [0] 657d0f585a

Co-authored-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Link: https://lore.kernel.org/bpf/20200108162428.25014-2-mrostecki@opensuse.org
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
5033d7177e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   7745ff9842617323adbe24e71c495d5ebd9aa352
Checkpoint bpf-next commit: 2bbc078f812d45b8decb55935dab21199bd21489
Baseline bpf commit:        1148f9adbe71415836a18a36c1b4ece999ab0973
Checkpoint bpf commit:      3c2f450e553ce47fcb0d6141807a8858e3213c9c

Andrey Ignatov (2):
  bpf: Support replacing cgroup-bpf program in MULTI mode
  libbpf: Introduce bpf_prog_attach_xattr

Andrii Nakryiko (1):
  libbpf: Support CO-RE relocations for LDX/ST/STX instructions

 include/uapi/linux/bpf.h | 10 ++++++++++
 src/bpf.c                | 17 ++++++++++++++++-
 src/bpf.h                | 11 +++++++++++
 src/libbpf.c             | 31 ++++++++++++++++++++++++++++---
 src/libbpf.map           |  1 +
 5 files changed, 66 insertions(+), 4 deletions(-)

--
2.17.1
2019-12-30 12:05:19 -08:00
Andrii Nakryiko
49058f8c6f libbpf: Support CO-RE relocations for LDX/ST/STX instructions
Clang patch [0] enables emitting relocatable generic ALU/ALU64 instructions
(i.e., shifts and arithmetic operations), as well as generic load/store
instructions. The former ones are already supported by libbpf as is. This
patch adds further support for load/store instructions. The relocatable
field offset is encoded in the BPF instruction's 16-bit offset field and
is adjusted by libbpf based on the target kernel's BTF.

These Clang changes and corresponding libbpf changes allow for more succinct
generated BPF code by encoding relocatable field reads as a single
ST/LDX/STX instruction. It also enables relocatable access to BPF context.
Previously, if context struct (e.g., __sk_buff) was accessed with CO-RE
relocations (e.g., due to preserve_access_index attribute), it would be
rejected by BPF verifier due to modified context pointer dereference. With
Clang patch, such context accesses are both relocatable and have a fixed
offset from the point of view of BPF verifier.

  [0] https://reviews.llvm.org/D71790

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191223180305.86417-1-andriin@fb.com
2019-12-30 12:05:19 -08:00
Andrey Ignatov
8b20ffa4b9 libbpf: Introduce bpf_prog_attach_xattr
Introduce a new bpf_prog_attach_xattr function that, in addition to
program fd, target fd and attach type, accepts an extendable struct
bpf_prog_attach_opts.

bpf_prog_attach_opts relies on the DECLARE_LIBBPF_OPTS macro to maintain
backward and forward compatibility and has the following "optional"
attach attributes:

* the existing attach_flags, since it's not required when attaching in NONE
  mode. Even though it's quite often used in MULTI and OVERRIDE modes, it
  seems to be a good idea to reduce the number of arguments to
  bpf_prog_attach_xattr;

* a newly introduced attribute of the BPF_PROG_ATTACH command:
  replace_prog_fd, which is the fd of a previously attached cgroup-bpf
  program to replace when the BPF_F_REPLACE flag is used.

The new function is named to be consistent with other xattr-functions
(bpf_prog_test_run_xattr, bpf_create_map_xattr, bpf_load_program_xattr).

The struct bpf_prog_attach_opts is supposed to be used with the
DECLARE_LIBBPF_OPTS macro.
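
A sketch of the intended usage, replacing an already attached egress program
in MULTI mode (all fds are assumed to be valid, and the attach type is just
an example):

```c
#include <linux/bpf.h>
#include <bpf/bpf.h>

int replace_egress_prog(int cgroup_fd, int old_prog_fd, int new_prog_fd)
{
	DECLARE_LIBBPF_OPTS(bpf_prog_attach_opts, opts,
		.flags = BPF_F_ALLOW_MULTI | BPF_F_REPLACE,
		.replace_prog_fd = old_prog_fd,
	);

	/* attaches new_prog_fd in place of old_prog_fd, preserving order */
	return bpf_prog_attach_xattr(new_prog_fd, cgroup_fd,
				     BPF_CGROUP_INET_EGRESS, &opts);
}
```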

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/bd6e0732303eb14e4b79cb128268d9e9ad6db208.1576741281.git.rdna@fb.com
2019-12-30 12:05:19 -08:00
Andrey Ignatov
1b1e30679f bpf: Support replacing cgroup-bpf program in MULTI mode
The common use-case in production is to have multiple cgroup-bpf
programs per attach type that cover multiple use-cases. Such programs
are attached with BPF_F_ALLOW_MULTI and can be maintained by different
people.

The order of programs usually matters; for example, imagine two egress
programs: the first one drops packets and the second one counts packets.
If they're swapped, the result of the counting program will be different.

This brings operational challenges when updating cgroup-bpf program(s)
attached with BPF_F_ALLOW_MULTI, since there is no way to replace a
program:

* One way to update is to detach all programs first and then attach the
  new version(s) again in the right order. This introduces an
  interruption in the work a program is doing and may not be acceptable
  (e.g., if it's an egress firewall);

* Another way is to attach the new version of a program first and only then
  detach the old version. This introduces a time interval when two
  versions of the same program are working, which may not be acceptable if a
  program is not idempotent. It also imposes an additional burden on
  program developers to make sure that two versions of their program can
  co-exist.

Solve the problem by introducing a "replace" mode in the BPF_PROG_ATTACH
command for cgroup-bpf programs attached with the BPF_F_ALLOW_MULTI
flag. This mode is enabled by the newly introduced BPF_F_REPLACE attach flag
and the bpf_attr.replace_bpf_fd attribute, used to pass the fd of the old
program to replace.

That way, a user can replace any program among those attached with the
BPF_F_ALLOW_MULTI flag without the problems described above.

Details of the new API:

* If BPF_F_REPLACE is set but replace_bpf_fd doesn't hold a valid
  descriptor of a BPF program, BPF_PROG_ATTACH will return the corresponding
  error (EINVAL or EBADF).

* If replace_bpf_fd holds a valid descriptor of a BPF program but such a
  program is not attached to the specified cgroup, BPF_PROG_ATTACH will
  return ENOENT.

BPF_F_REPLACE is introduced to make the user intent clear, since
replace_bpf_fd alone can't be used for this (its default value, 0, is a
valid fd). BPF_F_REPLACE also makes it possible to extend the API in the
future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed).
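
At the raw syscall level, the libbpf sketch shown for bpf_prog_attach_xattr
above maps to roughly this (a sketch; the field names are the ones this patch
adds to or uses in union bpf_attr):

```c
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

int replace_attached_prog(int cgroup_fd, int old_prog_fd, int new_prog_fd)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.target_fd = cgroup_fd;
	attr.attach_bpf_fd = new_prog_fd;
	attr.replace_bpf_fd = old_prog_fd;          /* program to swap out */
	attr.attach_type = BPF_CGROUP_INET_EGRESS;
	attr.attach_flags = BPF_F_ALLOW_MULTI | BPF_F_REPLACE;

	return syscall(__NR_bpf, BPF_PROG_ATTACH, &attr, sizeof(attr));
}
```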

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com
2019-12-30 12:05:19 -08:00
Andrii Nakryiko
e7a82fc033 sync: add zlib dependency and libbpf_common.h to list of installed headers
zlib is now a direct dependency of libbpf (previously zlib was only a
dependency of libelf, on which libbpf depends as well). For the
non-pkg-config case, specify the `-lz` linker flag explicitly.

The recent sync also added another public header to libbpf. Include it in the
list of headers that are installed on the target system.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
7c5583ab2d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   679152d3a32e305c213f83160c328c37566ae8bc
Checkpoint bpf-next commit: 7745ff9842617323adbe24e71c495d5ebd9aa352
Baseline bpf commit:        fe3300897cbfd76c6cb825776e5ac0ca50a91ca4
Checkpoint bpf commit:      1148f9adbe71415836a18a36c1b4ece999ab0973

Andrii Nakryiko (26):
  libbpf: Extract and generalize CPU mask parsing logic
  libbpf: Don't attach perf_buffer to offline/missing CPUs
  libbpf: Don't require root for bpf_object__open()
  libbpf: Add generic bpf_program__attach()
  libbpf: Move non-public APIs from libbpf.h to libbpf_internal.h
  libbpf: Add BPF_EMBED_OBJ macro for embedding BPF .o files
  libbpf: Extract common user-facing helpers
  libbpf: Expose btf__align_of() API
  libbpf: Expose BTF-to-C type declaration emitting API
  libbpf: Expose BPF program's function name
  libbpf: Refactor global data map initialization
  libbpf: Postpone BTF ID finding for TRACING programs to load phase
  libbpf: Reduce log level of supported section names dump
  libbpf: Add BPF object skeleton support
  libbpf: Extract internal map names into constants
  libbpf: Support libbpf-provided extern variables
  bpftool: Generate externs datasec in BPF skeleton
  libbpf: Support flexible arrays in CO-RE
  libbpf: Add zlib as a dependency in pkg-config template
  libbpf: Reduce log level for custom section names
  libbpf: Remove BPF_EMBED_OBJ macro from libbpf.h
  libbpf: Add bpf_link__disconnect() API to preserve underlying BPF
    resource
  libbpf: Put Kconfig externs into .kconfig section
  libbpf: Allow to augment system Kconfig through extra optional config
  libbpf: BTF is required when externs are present
  libbpf: Fix another __u64 printf warning

Jakub Sitnicki (1):
  libbpf: Recognize SK_REUSEPORT programs from section name

Prashant Bhole (1):
  libbpf: Fix build by renaming variables

Toke Høiland-Jørgensen (4):
  libbpf: Print hint about ulimit when getting permission denied error
  libbpf: Fix libbpf_common.h when installing libbpf through 'make
    install'
  libbpf: Add missing newline in opts validation macro
  libbpf: Fix printing of ulimit value

 include/uapi/linux/btf.h |    7 +-
 src/bpf.h                |    6 +-
 src/bpf_helpers.h        |   11 +
 src/btf.c                |   48 +-
 src/btf.h                |   29 +-
 src/btf_dump.c           |  115 ++-
 src/libbpf.c             | 1673 ++++++++++++++++++++++++++++++++------
 src/libbpf.h             |  107 +--
 src/libbpf.map           |   12 +
 src/libbpf.pc.template   |    2 +-
 src/libbpf_common.h      |   40 +
 src/libbpf_internal.h    |   21 +-
 12 files changed, 1678 insertions(+), 393 deletions(-)
 create mode 100644 src/libbpf_common.h

--
2.17.1
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
0d9d85e345 libbpf: Fix another __u64 printf warning
Fix yet another printf warning for the %llu specifier on ppc64le. This time
size_t casting won't work, so cast to the verbose `unsigned long long`.

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191219052103.3515-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen
c8c4edf4c9 libbpf: Fix printing of ulimit value
Naresh pointed out that libbpf builds fail on 32-bit architectures because
rlimit.rlim_cur is defined as 'unsigned long long' on those architectures.
Fix this by using %zu in printf and casting to size_t.

Fixes: dc3a2d254782 ("libbpf: Print hint about ulimit when getting permission denied error")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191219090236.905059-1-toke@redhat.com
2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen
23983fd75b libbpf: Add missing newline in opts validation macro
The error log output in the opts validation macro was missing a newline.

Fixes: 2ce8450ef5a3 ("libbpf: add bpf_object__open_{file, mem} w/ extensible opts")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191219120714.928380-1-toke@redhat.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
4bbdefdce1 libbpf: BTF is required when externs are present
BTF is required to get type information about extern variables.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191219002837.3074619-4-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
5bc09e54fa libbpf: Allow to augment system Kconfig through extra optional config
Instead of an all-or-nothing approach to overriding the Kconfig file location,
allow extending it with extra values and overriding a chosen subset of values
through an optional user-provided extra config, passed as a string through the
open options' .kconfig option. If the same config key is present in both the
user-supplied config and Kconfig, the user-supplied one wins. This allows
applications to more easily test various conditions regardless of the host
kernel's real configuration. If all of a BPF object's __kconfig externs are
satisfied from the user-supplied config, the system Kconfig won't be read at
all.

Simplify selftests by not needing to create temporary Kconfig files.
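
A sketch of the new option ( .kconfig is the open option this patch adds; the
object file name and config value are placeholders):

```c
#include <bpf/libbpf.h>

struct bpf_object *open_with_overrides(void)
{
	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
		/* overrides/extends whatever the system Kconfig provides */
		.kconfig = "CONFIG_BPF_JIT=y",
	);

	/* "prog.bpf.o" is a placeholder object file name */
	return bpf_object__open_file("prog.bpf.o", &opts);
}
```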

Suggested-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191219002837.3074619-3-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
9f61b5b95c libbpf: Put Kconfig externs into .kconfig section
Move Kconfig-provided externs into a custom .kconfig section. Add __kconfig to
bpf_helpers.h for user convenience. Update selftests accordingly.
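
A minimal BPF-program-side sketch of the convention after this patch: Kconfig
externs carry the __kconfig attribute from bpf_helpers.h and land in the
.kconfig section (the kprobe target is just an example):

```c
#include <linux/version.h>
#include <bpf/bpf_helpers.h>

extern int LINUX_KERNEL_VERSION __kconfig;

SEC("kprobe/do_sys_open")
int trace_open(void *ctx)
{
	if (LINUX_KERNEL_VERSION >= KERNEL_VERSION(5, 2, 0)) {
		/* safe to rely on a 5.2+ kernel feature here */
	}
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```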

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191219002837.3074619-2-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
baa3268b13 libbpf: Add bpf_link__disconnect() API to preserve underlying BPF resource
There are cases in which a BPF resource (program, map, etc.) has to outlive
the userspace program that "installed" it in the system in the first place.
When a BPF program is attached, libbpf returns a bpf_link object, which
is supposed to be destroyed once no longer necessary, through the
bpf_link__destroy() API. Currently, bpf_link destruction causes both automatic
detachment and frees up any resources allocated for the bpf_link in-memory
representation. This is inconvenient for the case described above because of
the coupling of detachment and resource freeing.

This patch introduces the bpf_link__disconnect() API call, which marks a
bpf_link as disconnected from its underlying BPF resource. This means that
when the bpf_link is destroyed later, all its memory resources will be freed,
but the BPF resource itself won't be detached.

This design allows following a strict, resource-leak-free design by default,
while giving user code an easy and straightforward way to opt into keeping a
BPF resource attached beyond the lifetime of a bpf_link. For some BPF programs
(i.e., FS-based tracepoints, kprobes, raw tracepoints, etc.), the user has to
make sure to pin the BPF program to prevent the kernel from automatically
detaching it on process exit. This should typically be achieved by pinning the
BPF program (or map, in some cases) in BPF FS.
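
A minimal sketch of the resulting pattern (prog is assumed to be a loaded
bpf_program from an open+loaded object):

```c
#include <bpf/libbpf.h>

int attach_and_keep(struct bpf_program *prog)
{
	struct bpf_link *link = bpf_program__attach(prog);

	if (libbpf_get_error(link))
		return -1;

	bpf_link__disconnect(link); /* detachment is now decoupled... */
	bpf_link__destroy(link);    /* ...so this only frees memory */
	return 0;
}
```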

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191218225039.2668205-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
f5599ef856 libbpf: Remove BPF_EMBED_OBJ macro from libbpf.h
Drop BPF_EMBED_OBJ and struct bpf_embed_data now that the skeleton
automatically embeds the contents of its source object file. While
BPF_EMBED_OBJ is useful independently of the skeleton, we currently don't
have any use cases utilizing it, so let's remove it until/if we need it.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191218052552.2915188-3-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
99c65fed78 libbpf: Reduce log level for custom section names
Libbpf tries to recognize the BPF program type based on its section name
during the bpf_object__open() phase. This is not strictly enforced, and user
code has the ability to specify/override the correct BPF program type after
open. But if a BPF program uses a custom section name, libbpf will still emit
warnings, which can be quite annoying to users. This patch reduces the log
level of informational messages emitted by libbpf when a section name is not
canonical. Users can still get a list of all supported section names as a
debug-level message.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191217234228.1739308-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen
e9adfa851f libbpf: Fix libbpf_common.h when installing libbpf through 'make install'
This fixes two issues with the newly introduced libbpf_common.h file:

- The header failed to include <string.h> for the definition of memset()
- The new file was not included in the install_headers rule in the Makefile

Both of these issues cause breakage when installing libbpf with 'make
install' and trying to use it in applications.

Fixes: 544402d4b493 ("libbpf: Extract common user-facing helpers")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191217112810.768078-1-toke@redhat.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
8363b8d4e6 libbpf: Add zlib as a dependency in pkg-config template
List zlib as another dependency of libbpf in the pkg-config template.
Verified it is correctly resolved to the proper -lz flag:

$ make DESTDIR=/tmp/libbpf-install install
$ pkg-config --libs /tmp/libbpf-install/usr/local/lib64/pkgconfig/libbpf.pc
-L/usr/local/lib64 -lbpf
$ pkg-config --libs --static /tmp/libbpf-install/usr/local/lib64/pkgconfig/libbpf.pc
-L/usr/local/lib64 -lbpf -lelf -lz

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Luca Boccassi <bluca@debian.org>
Link: https://lore.kernel.org/bpf/20191216183830.3972964-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen
f892b464d0 libbpf: Print hint about ulimit when getting permission denied error
Probably the single most common error newcomers to XDP are stumped by is
the 'permission denied' error they get when trying to load their program
and 'ulimit -l' is set too low. For examples, see [0], [1].

Since the error code is UAPI, we can't change that. Instead, this patch
adds a few heuristics in libbpf and outputs an additional hint if they are
met: If an EPERM is returned on map create or program load, and geteuid()
shows we are root, and the current RLIMIT_MEMLOCK is not infinity, we
output a hint about raising 'ulimit -l' as an additional log line.

[0] https://marc.info/?l=xdp-newbies&m=157043612505624&w=2
[1] https://github.com/xdp-project/xdp-tutorial/issues/86
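
The usual userspace workaround that the hint points at is to bump
RLIMIT_MEMLOCK before creating maps or loading programs; a sketch using
standard POSIX calls (not part of this patch):

```c
#include <sys/resource.h>

static int bump_memlock_rlimit(void)
{
	struct rlimit rlim = {
		.rlim_cur = RLIM_INFINITY,
		.rlim_max = RLIM_INFINITY,
	};

	/* equivalent of 'ulimit -l unlimited' for this process */
	return setrlimit(RLIMIT_MEMLOCK, &rlim);
}
```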

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191216181204.724953-1-toke@redhat.com
2019-12-19 15:34:27 -08:00
Prashant Bhole
a4132d1590 libbpf: Fix build by renaming variables
In btf__align_of(), the variable name 't' is shadowed by an inner-block
declaration of another variable with the same name. This patch renames the
variables in order to fix it.

  CC       sharedobjs/btf.o
btf.c: In function ‘btf__align_of’:
btf.c:303:21: error: declaration of ‘t’ shadows a previous local [-Werror=shadow]
  303 |   int i, align = 1, t;
      |                     ^
btf.c:283:25: note: shadowed declaration is here
  283 |  const struct btf_type *t = btf__type_by_id(btf, id);
      |

Fixes: 3d208f4ca111 ("libbpf: Expose btf__align_of() API")
Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20191216082738.28421-1-prashantbhole.linux@gmail.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
303916a126 libbpf: Support flexible arrays in CO-RE
Some data structures in the kernel are defined with either a zero-sized array
or a flexible (dimensionless) array at the end of a struct. The actual data of
such an array follows in memory immediately after the end of that struct,
forming its variable-sized "body" of elements. Support such an access pattern
in CO-RE relocation handling.
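
An illustrative BPF-side shape of the newly supported pattern (a sketch with
hypothetical struct and field names; preserve_access_index makes the accesses
relocatable, and bpf_core_read() is from bpf_core_read.h):

```c
#include <bpf/bpf_core_read.h>

struct pkt_record {
	int cnt;
	int data[]; /* flexible array: elements follow the struct in memory */
} __attribute__((preserve_access_index));

static int read_third_elem(struct pkt_record *rec)
{
	int v = 0;

	/* CO-RE-relocatable read of rec->data[2], past the fixed part */
	bpf_core_read(&v, sizeof(v), &rec->data[2]);
	return v;
}
```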

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191215070844.1014385-2-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
8eea7ed8e8 bpftool: Generate externs datasec in BPF skeleton
Add support for generating an mmap()-ed read-only view of libbpf-provided
extern variables. As externs are not supposed to be provided by user code
(that's what .data, .bss, and .rodata are for), don't mmap() the section
initially. Only after the skeleton load is performed, map the .extern
contents as read-only memory.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-4-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
fa030ffd20 libbpf: Support libbpf-provided extern variables
Add support for extern variables, provided to the BPF program by libbpf.
Currently the following extern variables are supported:
  - LINUX_KERNEL_VERSION: the version of the kernel in which the BPF program
    is executing; follows the KERNEL_VERSION() macro convention and can be
    4- or 8-byte long;
  - CONFIG_xxx values: a set of values from the actual kernel config.
    Tristate, boolean, string, and integer values are supported.
The set of possible values is determined by the declared type of the extern
variable. Supported types of variables are:
- Tristate values. Represented as `enum libbpf_tristate`. Accepted values
  are **strictly** 'y', 'n', or 'm', which are represented as TRI_YES, TRI_NO,
  or TRI_MODULE, respectively.
- Boolean values. Represented as bool (_Bool) types. Accepted values are
  'y' and 'n' only, turning into true/false values, respectively.
- Single-character values. Can be used both as a substitute for
  bool/tristate, or as a small-range integer:
  - 'y'/'n'/'m' are represented as is, as characters 'y', 'n', or 'm';
  - integers in the range [-128, 127] or [0, 255] (depending on signedness of
    char on the target architecture) are recognized and represented with the
    respective values of char type.
- Strings. String values are declared as fixed-length char arrays. A string
  of up to that length will be accepted and put in the first N bytes of the
  char array, with the rest of the bytes zeroed out. If the config string
  value is longer than the space allotted, it will be truncated and a warning
  message emitted. The char array is always zero-terminated. String literals
  in config have to be enclosed in double quotes, just like C-style string
  literals.
- Integers. 8-, 16-, 32-, and 64-bit integers are supported, both signed and
  unsigned variants. Libbpf enforces that the parsed config value is in the
  supported range of the corresponding integer type. Integer values in config
  can be:
  - decimal integers, with optional + and - signs;
  - hexadecimal integers, prefixed with 0x or 0X;
  - octal integers, starting with 0.

The config file itself is searched for at /boot/config-$(uname -r), with a
fallback to /proc/config.gz, unless the config path is specified explicitly
through bpf_object_open_opts' kernel_config_path option. Both gzipped and
plain-text formats are supported. Libbpf adds an explicit dependency on zlib
because of this, but this shouldn't be a problem, given libelf already
depends on zlib.

All detected extern variables are put into a separate .extern internal map.
Similarly to the .rodata map, it is marked as read-only from the BPF program
side and is frozen on load. This allows the BPF verifier to track extern
values as constants and perform enhanced branch prediction and dead code
elimination. This can be relied upon for doing kernel version/feature
detection and using potentially unsupported field relocations or BPF helpers
in a CO-RE-based BPF program, while still having a single version of the BPF
program running on old and new kernels. Selftests validate this explicitly
for a nonexistent BPF helper.
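
Illustrative BPF-side declarations for the extern kinds described above (a
sketch with placeholder config names; at this point in the series externs are
declared plainly, as the __kconfig section attribute lands in a later commit,
shown above):

```c
#include <bpf/bpf_helpers.h> /* for enum libbpf_tristate */

extern int LINUX_KERNEL_VERSION;            /* 4-byte variant */
extern enum libbpf_tristate CONFIG_BPF_JIT; /* 'y' / 'n' / 'm' */
extern _Bool CONFIG_BPF_SYSCALL;            /* 'y' / 'n' */
extern char CONFIG_DEFAULT_HOSTNAME[8];     /* fixed-length string */
```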

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-3-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
ea06bc30fa libbpf: Extract internal map names into constants
Instead of duplicating string literals, keep them in one place and consistent.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-2-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
531ac0e65f libbpf: Add BPF object skeleton support
Add a new set of APIs that allow opening/loading/attaching a BPF object
through a BPF object skeleton, generated by bpftool for a specific BPF object
file. All the xxx_skeleton() APIs wrap the corresponding bpf_object_xxx()
APIs, but additionally automate map/program lookups by name, global data
initialization and mmap()-ing, etc. All this greatly improves and simplifies
the userspace experience of working with BPF programs. See follow-up patches
for examples.
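
A hedged sketch of what using a generated skeleton looks like, for a
hypothetical object named "myprog" (all names come from the generated
myprog.skel.h header):

```c
#include "myprog.skel.h"

int main(void)
{
	struct myprog_bpf *skel;

	skel = myprog_bpf__open_and_load();
	if (!skel)
		return 1;

	if (myprog_bpf__attach(skel)) {
		myprog_bpf__destroy(skel);
		return 1;
	}

	/* ... programs run; maps are reachable as skel->maps.<name> ... */

	myprog_bpf__destroy(skel);
	return 0;
}
```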

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-13-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
e35cb347ce libbpf: Reduce log level of supported section names dump
It's quite spammy. And now that bpf_object__open() is trying to determine the
program type from its section name, we are getting these verbose messages all
the time. Reduce their log level to DEBUG.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-12-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
68fa3f0b57 libbpf: Postpone BTF ID finding for TRACING programs to load phase
Move BTF ID determination for BPF_PROG_TYPE_TRACING programs to the load
phase. Performing it at the open step is inconvenient, because it prevents
BPF skeleton generation on an older host kernel, whose vmlinux BTF doesn't
contain BTF_KIND_FUNC information. This is a common setup, though: e.g.,
selftests are compiled on an older host kernel, but the test program itself
is executed in a qemu VM with a bleeding-edge kernel. Having this BTF search
performed at load time allows bpf_object__open() to be used successfully for
codegen and inspection of a BPF object file.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-11-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
1c145f0fda libbpf: Refactor global data map initialization
Refactor global data map initialization to use anonymous mmap()-ed memory
instead of malloc()-ed memory. This allows transparently re-mmap()-ing an
already existing memory address to point to the BPF map's memory after the
bpf_object__load() step (done in a follow-up patch). This choreographed setup
gives the user a nice and unsurprising way to pre-initialize read-only (and
r/w as well) maps and, after BPF map creation, keep working with the
mmap()-ed contents of the map. All in a way that doesn't require user code to
update any pointers: the illusion of working with memory contents is
preserved before and after the actual BPF map instantiation.

Selftests and runqslower example demonstrate this feature in follow up patches.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-10-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
418c07226a libbpf: Expose BPF program's function name
Add APIs to get a BPF program's function name, as opposed to
bpf_program__title(), which returns the BPF program's section name. The
function name has the benefit of being a valid C identifier and uniquely
identifies a specific BPF program, while a section name can be duplicated
across multiple independent BPF programs.

Also add bpf_object__find_program_by_name(), similar to
bpf_object__find_program_by_title(), to facilitate looking up BPF programs by
their C function names.

Convert one of the selftests to the new lookup API.
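
A sketch of the new lookup, using a hypothetical BPF function name
"handle_exec":

```c
#include <stdio.h>
#include <bpf/libbpf.h>

struct bpf_program *find_prog(struct bpf_object *obj)
{
	struct bpf_program *prog;

	/* matches the C function name, not the SEC() name */
	prog = bpf_object__find_program_by_name(obj, "handle_exec");
	if (prog)
		printf("found %s\n", bpf_program__name(prog));
	return prog;
}
```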

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-9-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
5ec0ba6530 libbpf: Expose BTF-to-C type declaration emitting API
Expose an API that allows emitting a type declaration and field/variable
definition (if an optional field name is specified) in valid C syntax for any
provided BTF type. This is going to be used by bpftool when emitting a data
section layout as a struct. As part of making this API useful in a
stand-alone fashion, move initialization of some of the internal btf_dump
state to an earlier phase.
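
A sketch of the new call (assuming the opts struct follows the usual
DECLARE_LIBBPF_OPTS pattern; d and type_id are assumed to come from an
earlier btf_dump__new() setup):

```c
#include <bpf/btf.h>

int emit_decl(struct btf_dump *d, __u32 type_id)
{
	DECLARE_LIBBPF_OPTS(btf_dump_emit_type_decl_opts, opts,
		.field_name = "my_var", /* omit for a bare type declaration */
	);

	/* emits e.g. "struct foo my_var" through d's printf callback */
	return btf_dump__emit_type_decl(d, type_id, &opts);
}
```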

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-8-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
600ba1c5e1 libbpf: Expose btf__align_of() API
Expose a BTF API that calculates type alignment requirements.
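
A minimal sketch of the new API (btf and id are assumed to be valid):

```c
#include <stdio.h>
#include <bpf/btf.h>

int print_align(const struct btf *btf, __u32 id)
{
	int align = btf__align_of(btf, id);

	if (align <= 0)
		return -1; /* error or unknown alignment */
	printf("type %u requires %d-byte alignment\n", id, align);
	return 0;
}
```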

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-7-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
aa73e35dc3 libbpf: Extract common user-facing helpers
LIBBPF_API and DECLARE_LIBBPF_OPTS are needed in many public libbpf API
headers. Extract them into libbpf_common.h to avoid unnecessary
interdependency between btf.h, libbpf.h, and bpf.h or code duplication.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-6-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
dca6176410 libbpf: Add BPF_EMBED_OBJ macro for embedding BPF .o files
Add a convenience macro BPF_EMBED_OBJ, which allows embedding other files
(typically BPF .o files) into a hosting userspace program. To the C program
it is exposed as struct bpf_embed_data, containing a pointer to the raw data
and its size in bytes.
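
A hypothetical usage sketch (assuming the macro exposes a struct
bpf_embed_data named <NAME>_embed; "probe.bpf.o" is a placeholder path):

```c
#include <bpf/libbpf.h>

BPF_EMBED_OBJ(probe, "probe.bpf.o");

struct bpf_object *open_embedded(void)
{
	/* the embedded bytes can be opened straight from memory */
	return bpf_object__open_mem(probe_embed.data, probe_embed.size, NULL);
}
```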

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-5-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
6f88f26945 libbpf: Move non-public APIs from libbpf.h to libbpf_internal.h
A few libbpf APIs are not public but are currently exposed through libbpf.h
to be used by bpftool. Move them to libbpf_internal.h, where the intent of
being non-stable and non-public is much more obvious.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-4-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
f7af143516 libbpf: Add generic bpf_program__attach()
Generalize BPF program attaching and allow libbpf to auto-detect the type
(and extra parameters, where applicable) and attach supported BPF program
types based on program sections. Currently this is supported for:
- kprobe/kretprobe;
- tracepoint;
- raw tracepoint;
- tracing programs (typed raw TP/fentry/fexit).

Support for more types can be trivially added within this framework; see the
sketch below.
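
As a hedged illustration, auto-detection removes the need for per-type attach
calls for the section types listed above:

```c
#include <bpf/libbpf.h>

int attach_all(struct bpf_object *obj)
{
	struct bpf_program *prog;

	bpf_object__for_each_program(prog, obj) {
		/* type and target are inferred from the SEC() name,
		 * e.g. SEC("kprobe/do_unlinkat") or SEC("tracepoint/...") */
		struct bpf_link *link = bpf_program__attach(prog);

		if (libbpf_get_error(link))
			return -1;
	}
	return 0;
}
```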

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-3-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
4f3c7b3e13 libbpf: Don't require root for bpf_object__open()
Reorganize the bpf_object__open and bpf_object__load steps such that
bpf_object__open doesn't need root access. Root was previously needed on open
for feature probing and BTF sanitization. These don't have to happen on open,
though, so move all those steps into the load phase.

This is important because it makes it possible for tools like bpftool to just
open a BPF object file and inspect its contents: programs, maps, BTF, etc.
For such operations, requiring root access is prohibitive. On the other hand,
there is a lot of custom libbpf logic in those steps, so it's best for tools
not to reimplement all that on their own.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-2-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
b85e83f6cb libbpf: Don't attach perf_buffer to offline/missing CPUs
It's quite common on some systems to have more CPUs listed as "possible" than
there are (and could ever be) present/online CPUs. In such cases, perf_buffer
creation will fail due to the inability to create a perf event on a missing
CPU, with an error like this:

libbpf: failed to open perf buffer event on cpu #16: No such device

This patch fixes the logic of perf_buffer__new() to ignore CPUs that are
missing or currently offline. In the rare cases where the user explicitly
listed specific CPUs to connect to, the behavior is unchanged: libbpf will
try to open the perf event buffer on the specified CPU(s) anyway.

Fixes: fb84b8224655 ("libbpf: add perf buffer API")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212013609.1691168-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
33d1fbea57 libbpf: Extract and generalize CPU mask parsing logic
This logic is re-used for parsing a set of online CPUs. Having it as an
isolated piece of code working with an input string makes it convenient to
test this logic as well. While refactoring, also improve the robustness of
the original implementation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212013548.1690564-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Jakub Sitnicki
b234d12c97 libbpf: Recognize SK_REUSEPORT programs from section name
Allow loading BPF object files that contain SK_REUSEPORT programs without
having to manually set the program type before load, if the section name is
set to "sk_reuseport".

This makes the user-space code needed to load an SK_REUSEPORT BPF program
more concise.
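
A BPF-side sketch: with this change, naming the section "sk_reuseport" is
enough for libbpf to set BPF_PROG_TYPE_SK_REUSEPORT on load (the selection
logic is illustrative only):

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("sk_reuseport")
int select_socket(struct sk_reuseport_md *md)
{
	/* real logic would pick a socket, e.g. via bpf_sk_select_reuseport() */
	return SK_PASS;
}

char LICENSE[] SEC("license") = "GPL";
```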

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212102259.418536-2-jakub@cloudflare.com
2019-12-19 15:34:27 -08:00
hex
7a1d185108 libbpf: fix Coverity scan CI
A follow-up to [1].
Travis CI stages use default phases when no override is provided.
This leads to the Coverity scan stage failing due to executing the default
before_script: phase of VMTEST.
Fix this with an explicit override with an empty value.

[1] https://github.com/libbpf/libbpf/pull/108
2019-12-17 16:46:57 -08:00
hex
76d5bb6a13 libbpf: Add VMTEST to CI
Extend continuous integration tests by adding testing against various kernel
versions.
The code is based on vmtest CI scripts implemented by osandov@
for drgn [1] with the following modifications:
- The downloadables are stored in the Amazon S3 cloud, indexed in [2]
- A `--setup-cmd` command line option is added to vmtest/run.sh so that
  setup commands run on VM boot can be set in e.g. `.travis.yml`
- A Travis build matrix [3] is introduced for VM tests, so VM tests are
  followed by the existing CI tests. The matrix has `KERNEL` and
  `VMTEST_SETUPCMD` dimensions.
- Minor style fixes.

The vmtest extension code is located in travis-ci/vmtest and contains
`run.sh` and `setup_example.sh`:
- `run.sh` is responsible for the vmtest workflow: downloading vmlinux
  and the rootfs image from the cloud, fs mounting, syncing libbpf sources
  to the image, setting up scripts run on VM boot, and starting the VM
  using QEMU.
  `run.sh` covers more use cases than a script for a job run in Travis CI;
  e.g., it can build a kernel with the `--build` option.

- `setup_example.sh` is an example of a script run in the VM, which can be
  modified to e.g. run actual libbpf tests. A setup script should have
  executable permission.

To set up a new kernel version for a test:
1) upload vmlinuz.* and vmlinux.*\.zst to the Amazon S3 store
   located at [4];
2) modify the INDEX file [2].

[1] https://github.com/osandov/drgn
[2] https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/INDEX
[3] https://docs.travis-ci.com/user/build-matrix
[4] https://libbpf-vmtest.s3-us-west-1.amazonaws.com/
2019-12-16 21:04:03 -08:00
Frantisek Sumsal
c42bfcbf0e travis: build on ppc64le as well 2019-12-13 01:04:46 -08:00
Andrii Nakryiko
c2fc7c15a3 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   e7096c131e5161fa3b8e52a650d7719d2857adfd
Checkpoint bpf-next commit: 679152d3a32e305c213f83160c328c37566ae8bc
Baseline bpf commit:        e42617b825f8073569da76dc4510bfa019b1c35a
Checkpoint bpf commit:      fe3300897cbfd76c6cb825776e5ac0ca50a91ca4

Andrii Nakryiko (2):
  libbpf: Bump libbpf current version to v0.0.7
  libbpf: Fix printf compilation warnings on ppc64le arch

 src/libbpf.c   | 37 +++++++++++++++++++------------------
 src/libbpf.map |  3 +++
 2 files changed, 22 insertions(+), 18 deletions(-)

--
2.17.1
2019-12-12 14:40:26 -08:00
Andrii Nakryiko
4060a65222 libbpf: Fix printf compilation warnings on ppc64le arch
On ppc64le, __u64 and __s64 are defined as unsigned long int and long int,
respectively. This causes the compiler to emit warnings when %llu/%lld are
used to printf 64-bit numbers. Fix this by casting to size_t/ssize_t, with
the %zu and %zd format specifiers, respectively.

v1->v2:
- use size_t/ssize_t instead of custom typedefs (Martin).

Fixes: 1f8e2bcb2cd5 ("libbpf: Refactor relocation handling")
Fixes: abd29c931459 ("libbpf: allow specifying map definitions using BTF")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191212171918.638010-1-andriin@fb.com
2019-12-12 14:40:26 -08:00
Andrii Nakryiko
a26f6b1375 libbpf: Bump libbpf current version to v0.0.7
A new development cycle starts; bump to v0.0.7 proactively.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191209224022.3544519-1-andriin@fb.com
2019-12-12 14:40:26 -08:00
Toke Høiland-Jørgensen
6e686c26fa Makefile: Add cscope and tags rules
These were added to the kernel repo, but not to the GitHub mirror. However,
they are useful for browsing the source on GitHub while prototyping new
features and compiling them into userspace utilities.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
2019-12-11 10:38:48 -08:00
53 changed files with 93133 additions and 1525 deletions


@@ -1,13 +1,27 @@
sudo: required
language: bash
dist: bionic
services:
- docker
env:
global:
- PROJECT_NAME='libbpf'
- AUTHOR_EMAIL="$(git log -1 --pretty=\"%aE\")"
- CI_MANAGERS="$TRAVIS_BUILD_DIR/travis-ci/managers"
- REPO_ROOT="$TRAVIS_BUILD_DIR"
- CI_ROOT="$REPO_ROOT/travis-ci"
- VMTEST_ROOT="$CI_ROOT/vmtest"
addons:
apt:
packages:
- qemu-kvm
- zstd
- binutils-dev
- elfutils
- libcap-dev
- libelf-dev
- libdw-dev
stages:
# Run Coverity periodically instead of for each PR for following reasons:
@@ -21,113 +35,76 @@ stages:
jobs:
include:
- stage: Build & test
name: Debian Testing
- stage: Builds & Tests
name: Kernel LATEST + selftests
language: bash
env:
- DEBIAN_RELEASE="testing"
- CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"
before_install:
- sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
script:
- $CI_MANAGERS/debian.sh RUN || travis_terminate
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
env: KERNEL=LATEST
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
- name: Debian Testing (ASan+UBSan)
- name: Kernel 4.9.0 + selftests
language: bash
env:
- DEBIAN_RELEASE="testing"
- CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"
before_install:
- sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
script:
- $CI_MANAGERS/debian.sh RUN_ASAN || travis_terminate
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
env: KERNEL=4.9.0
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
- name: Debian Testing (clang)
- name: Kernel 5.5.0 + selftests
language: bash
env:
- DEBIAN_RELEASE="testing"
- CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"
before_install:
- sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
script:
- $CI_MANAGERS/debian.sh RUN_CLANG || travis_terminate
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
env: KERNEL=5.5.0
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
- name: Debian Testing (clang ASan+UBSan)
- name: Debian Build
language: bash
env:
- DEBIAN_RELEASE="testing"
- CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"
before_install:
- sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
script:
- $CI_MANAGERS/debian.sh RUN_CLANG_ASAN || travis_terminate
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Testing (gcc-8)
- name: Debian Build (ASan+UBSan)
language: bash
env:
- DEBIAN_RELEASE="testing"
- CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"
before_install:
- sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
script:
- $CI_MANAGERS/debian.sh RUN_GCC8 || travis_terminate
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_ASAN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Testing (gcc-8 ASan+UBSan)
- name: Debian Build (clang)
language: bash
env:
- DEBIAN_RELEASE="testing"
- CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"
before_install:
- sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
script:
- $CI_MANAGERS/debian.sh RUN_GCC8_ASAN || travis_terminate
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_CLANG || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Ubuntu Bionic
- name: Debian Build (clang ASan+UBSan)
language: bash
script:
- sudo $CI_MANAGERS/ubuntu.sh || travis_terminate
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_CLANG_ASAN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Ubuntu Bionic (arm)
- name: Debian Build (gcc-8)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_GCC8 || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (gcc-8 ASan+UBSan)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_GCC8_ASAN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Ubuntu Bionic Build
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- name: Ubuntu Bionic Build (arm)
arch: arm64
language: bash
script:
- sudo $CI_MANAGERS/ubuntu.sh || travis_terminate
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- name: Ubuntu Bionic (s390x)
- name: Ubuntu Bionic Build (s390x)
arch: s390x
language: bash
script:
- sudo $CI_MANAGERS/ubuntu.sh || travis_terminate
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- name: Ubuntu Bionic Build (ppc64le)
arch: ppc64le
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- stage: Coverity
language: bash
@@ -148,4 +125,6 @@ jobs:
- sudo apt-get -y build-dep libelf-dev
- sudo apt-get install -y libelf-dev pkg-config
script:
- scripts/coverity.sh || travis_terminate
- scripts/coverity.sh || travis_terminate 1
allow_failures:
- env: KERNEL=x.x.x


@@ -1 +1 @@
e42617b825f8073569da76dc4510bfa019b1c35a
3fb1a96a91120877488071a167d26d76be4be977


@@ -1 +1 @@
e7096c131e5161fa3b8e52a650d7719d2857adfd
06a4ec1d9dc652e17ee3ac2ceb6c7cf6c2b75cdd

README.md

@@ -1,20 +1,29 @@
This is a mirror of [bpf-next linux tree](https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next)'s
This is a mirror of [bpf-next Linux source
tree](https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next)'s
`tools/lib/bpf` directory plus its supporting header files.
The following files will by sync'ed with bpf-next repo:
- `src/` <-> `bpf-next/tools/lib/bpf/`
- `include/uapi/linux/bpf_common.h` <-> `bpf-next/tools/include/uapi/linux/bpf_common.h`
- `include/uapi/linux/bpf.h` <-> `bpf-next/tools/include/uapi/linux/bpf.h`
- `include/uapi/linux/btf.h` <-> `bpf-next/tools/include/uapi/linux/btf.h`
- `include/uapi/linux/if_link.h` <-> `bpf-next/tools/include/uapi/linux/if_link.h`
- `include/uapi/linux/if_xdp.h` <-> `bpf-next/tools/include/uapi/linux/if_xdp.h`
- `include/uapi/linux/netlink.h` <-> `bpf-next/tools/include/uapi/linux/netlink.h`
- `include/tools/libc_compat.h` <-> `bpf-next/tools/include/tools/libc_compat.h`
All the gory details of syncing can be found in `scripts/sync-kernel.sh`
script.
Other header files at this repo (`include/linux/*.h`) are reduced versions of
their counterpart files at bpf-next's `tools/include/linux/*.h` to make compilation
successful.
Some header files in this repo (`include/linux/*.h`) are reduced versions of
their counterpart files at
[bpf-next](https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/)'s
`tools/include/linux/*.h` to make compilation successful.
BPF questions
=============
All general BPF questions, including kernel functionality, libbpf APIs and
their application, should be sent to bpf@vger.kernel.org mailing list. You can
subscribe to it [here](http://vger.kernel.org/vger-lists.html#bpf) and search
its archive [here](https://lore.kernel.org/bpf/). Please search the archive
before asking new questions. It very well might be that this was already
addressed or answered before.
bpf@vger.kernel.org is monitored by many more people and they will happily try
to help you with whatever issue you have. This repository's PRs and issues
should be opened only for dealing with issues pertaining to specific way this
libbpf mirror repo is set up and organized.
Build
[![Build Status](https://travis-ci.org/libbpf/libbpf.svg?branch=master)](https://travis-ci.org/libbpf/libbpf)
@@ -25,8 +34,9 @@ libelf is an internal dependency of libbpf and thus it is required to link
against and must be installed on the system for applications to work.
pkg-config is used by default to find libelf, and the program called can be
overridden with `PKG_CONFIG`.
If using `pkg-config` at build time is not desired, it can be disabled by setting
`NO_PKG_CONFIG=1` when calling make.
If using `pkg-config` at build time is not desired, it can be disabled by
setting `NO_PKG_CONFIG=1` when calling make.
To build both static libbpf.a and shared libbpf.so:
```bash
@@ -51,8 +61,80 @@ $ cd src
$ PKG_CONFIG_PATH=/build/root/lib64/pkgconfig DESTDIR=/build/root make install
```
Distributions
=============
Distributions packaging libbpf from this mirror:
- [Fedora](https://src.fedoraproject.org/rpms/libbpf)
- [Gentoo](https://packages.gentoo.org/packages/dev-libs/libbpf)
- [Debian](https://packages.debian.org/sid/libbpf-dev)
- [Arch](https://www.archlinux.org/packages/extra/x86_64/libbpf/)
Benefits of packaging from the mirror over packaging from kernel sources:
- Consistent versioning across distributions.
- No ties to any specific kernel, transparent handling of older kernels.
Libbpf is designed to be kernel-agnostic and work across multitude of
kernel versions. It has built-in mechanisms to gracefully handle older
kernels, that are missing some of the features, by working around or
gracefully degrading functionality. Thus libbpf is not tied to a specific
kernel version and can/should be packaged and versioned independently.
- Continuous integration testing via
[TravisCI](https://travis-ci.org/libbpf/libbpf).
- Static code analysis via [LGTM](https://lgtm.com/projects/g/libbpf/libbpf)
and [Coverity](https://scan.coverity.com/projects/libbpf).
Package dependencies of libbpf, package names may vary across distros:
- zlib
- libelf
BPF CO-RE (Compile Once Run Everywhere)
=========================================
Libbpf supports building BPF CO-RE-enabled applications, which, in contrast to
[BCC](https://github.com/iovisor/bcc/), do not require Clang/LLVM runtime
being deployed to target servers and doesn't rely on kernel-devel headers
being available.
It does rely on kernel to be built with [BTF type
information](https://www.kernel.org/doc/html/latest/bpf/btf.html), though.
Some major Linux distributions come with kernel BTF already built in:
- Fedora 31+
- RHEL 8.2+
- OpenSUSE Tumbleweed (in the next release, as of 2020-06-04)
- Arch Linux (from kernel 5.7.1.arch1-1)
If your kernel doesn't come with BTF built-in, you'll need to build custom
kernel. You'll need:
- `pahole` 1.16+ tool (part of `dwarves` package), which performs DWARF to
BTF conversion;
- kernel built with `CONFIG_DEBUG_INFO_BTF=y` option;
- you can check if your kernel has BTF built-in by looking for
`/sys/kernel/btf/vmlinux` file:
```shell
$ ls -la /sys/kernel/btf/vmlinux
-r--r--r--. 1 root root 3541561 Jun 2 18:16 /sys/kernel/btf/vmlinux
```
To develop and build BPF programs, you'll need Clang/LLVM 10+. The following
distributions have Clang/LLVM 10+ packaged by default:
- Fedora 32+
- Ubuntu 20.04+
- Arch Linux
Otherwise, please make sure to update it on your system.
The following resources are useful to understand what BPF CO-RE is and how to
use it:
- [BPF Portability and CO-RE](https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf-portability-and-co-re.html)
- [HOWTO: BCC to libbpf conversion](https://facebookmicrosites.github.io/bpf/blog/2020/02/20/bcc-to-libbpf-howto-guide.html)
- [libbpf-tools in BCC repo](https://github.com/iovisor/bcc/tree/master/libbpf-tools)
contain lots of real-world tools converted from BCC to BPF CO-RE. Consider
converting some more to both contribute to the BPF community and gain some
more experience with it.
License
=====
=======
This work is dual-licensed under BSD 2-clause license and GNU LGPL v2.1 license.
You can choose between one of them if you use this work.

File diff suppressed because it is too large.


@@ -22,9 +22,9 @@ struct btf_header {
};
/* Max # of type identifier */
#define BTF_MAX_TYPE 0x0000ffff
#define BTF_MAX_TYPE 0x000fffff
/* Max offset into the string section */
#define BTF_MAX_NAME_OFFSET 0x0000ffff
#define BTF_MAX_NAME_OFFSET 0x00ffffff
/* Max # of struct/union/enum members or func args */
#define BTF_MAX_VLEN 0xffff
@@ -142,7 +142,14 @@ struct btf_param {
enum {
BTF_VAR_STATIC = 0,
BTF_VAR_GLOBAL_ALLOCATED,
BTF_VAR_GLOBAL_ALLOCATED = 1,
BTF_VAR_GLOBAL_EXTERN = 2,
};
enum btf_func_linkage {
BTF_FUNC_STATIC = 0,
BTF_FUNC_GLOBAL = 1,
BTF_FUNC_EXTERN = 2,
};
/* BTF_KIND_VAR is followed by a single "struct btf_var" to describe


@@ -169,6 +169,7 @@ enum {
IFLA_MAX_MTU,
IFLA_PROP_LIST,
IFLA_ALT_IFNAME, /* Alternative ifname */
IFLA_PERM_ADDRESS,
__IFLA_MAX
};
@@ -342,6 +343,8 @@ enum {
IFLA_BRPORT_NEIGH_SUPPRESS,
IFLA_BRPORT_ISOLATED,
IFLA_BRPORT_BACKUP_PORT,
IFLA_BRPORT_MRP_RING_OPEN,
IFLA_BRPORT_MRP_IN_OPEN,
__IFLA_BRPORT_MAX
};
#define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
@@ -462,6 +465,7 @@ enum {
IFLA_MACSEC_REPLAY_PROTECT,
IFLA_MACSEC_VALIDATION,
IFLA_MACSEC_PAD,
IFLA_MACSEC_OFFLOAD,
__IFLA_MACSEC_MAX,
};
@@ -485,6 +489,14 @@ enum macsec_validation_type {
MACSEC_VALIDATE_MAX = __MACSEC_VALIDATE_END - 1,
};
enum macsec_offload {
MACSEC_OFFLOAD_OFF = 0,
MACSEC_OFFLOAD_PHY = 1,
MACSEC_OFFLOAD_MAC = 2,
__MACSEC_OFFLOAD_END,
MACSEC_OFFLOAD_MAX = __MACSEC_OFFLOAD_END - 1,
};
/* IPVLAN section */
enum {
IFLA_IPVLAN_UNSPEC,
@@ -952,11 +964,12 @@ enum {
#define XDP_FLAGS_SKB_MODE (1U << 1)
#define XDP_FLAGS_DRV_MODE (1U << 2)
#define XDP_FLAGS_HW_MODE (1U << 3)
#define XDP_FLAGS_REPLACE (1U << 4)
#define XDP_FLAGS_MODES (XDP_FLAGS_SKB_MODE | \
XDP_FLAGS_DRV_MODE | \
XDP_FLAGS_HW_MODE)
#define XDP_FLAGS_MASK (XDP_FLAGS_UPDATE_IF_NOEXIST | \
XDP_FLAGS_MODES)
XDP_FLAGS_MODES | XDP_FLAGS_REPLACE)
/* These are stored into IFLA_XDP_ATTACHED on dump. */
enum {
@@ -976,6 +989,7 @@ enum {
IFLA_XDP_DRV_PROG_ID,
IFLA_XDP_SKB_PROG_ID,
IFLA_XDP_HW_PROG_ID,
IFLA_XDP_EXPECTED_FD,
__IFLA_XDP_MAX,
};


@@ -73,9 +73,12 @@ struct xdp_umem_reg {
};
struct xdp_statistics {
__u64 rx_dropped; /* Dropped for reasons other than invalid desc */
__u64 rx_dropped; /* Dropped for other reasons */
__u64 rx_invalid_descs; /* Dropped due to invalid descriptor */
__u64 tx_invalid_descs; /* Dropped due to invalid descriptor */
__u64 rx_ring_full; /* Dropped due to rx ring being full */
__u64 rx_fill_ring_empty_descs; /* Failed to retrieve item from fill ring */
__u64 tx_ring_empty_descs; /* Failed to retrieve item from tx ring */
};
struct xdp_options {


@@ -1,4 +1,5 @@
#!/bin/sh
# Usage: check-reallocarray.sh cc_path [cc_args...]
tfile=$(mktemp /tmp/test_reallocarray_XXXXXXXX.c)
ofile=${tfile%.c}.o
@@ -13,6 +14,6 @@ int main(void)
}
EOL
gcc $tfile -o $ofile >/dev/null 2>&1
"$@" $tfile -o $ofile >/dev/null 2>&1
if [ $? -ne 0 ]; then echo "FAIL"; fi
/bin/rm -f $tfile $ofile


@@ -49,7 +49,7 @@ PATH_MAP=( \
[tools/include/tools/libc_compat.h]=include/tools/libc_compat.h \
)
LIBBPF_PATHS="${!PATH_MAP[@]}"
LIBBPF_PATHS="${!PATH_MAP[@]} :^tools/lib/bpf/Makefile :^tools/lib/bpf/Build :^tools/lib/bpf/.gitignore"
LIBBPF_VIEW_PATHS="${PATH_MAP[@]}"
LIBBPF_VIEW_EXCLUDE_REGEX='^src/(Makefile|Build|test_libbpf\.c|bpf_helper_defs\.h|\.gitignore)$'
@@ -79,39 +79,10 @@ commit_desc()
# The idea is that this single-line signature is good enough to make final
# decision about whether two commits are the same, across different repos.
# $1 - commit ref
# $2 - paths filter
commit_signature()
{
git log -n1 --pretty='("%s")|%aI|%b' --shortstat $1 | tr '\n' '|'
}
# Validate there are no non-empty merges (we can't handle them)
# $1 - baseline tag
# $2 - tip tag
validate_merges()
{
local baseline_tag=$1
local tip_tag=$2
local new_merges
local merge_change_cnt
local ignore_merge_resolutions
local desc
new_merges=$(git rev-list --merges --topo-order --reverse ${baseline_tag}..${tip_tag} ${LIBBPF_PATHS[@]})
for new_merge in ${new_merges}; do
desc=$(commit_desc ${new_merge})
echo "MERGE: ${desc}"
merge_change_cnt=$(git show --format='' ${new_merge} | wc -l)
if ((${merge_change_cnt} > 0)); then
read -p "Merge '${desc}' is non-empty, which will cause conflicts! Do you want to proceed? [y/N]: " ignore_merge_resolutions
case "${ignore_merge_resolutions}" in
"y" | "Y")
echo "Skipping '${desc}'..."
continue
;;
esac
exit 3
fi
done
git show --pretty='("%s")|%aI|%b' --shortstat $1 -- ${2-.} | tr '\n' '|'
}
# Cherry-pick commits touching libbpf-related files
@@ -133,7 +104,7 @@ cherry_pick_commits()
new_commits=$(git rev-list --no-merges --topo-order --reverse ${baseline_tag}..${tip_tag} ${LIBBPF_PATHS[@]})
for new_commit in ${new_commits}; do
desc="$(commit_desc ${new_commit})"
signature="$(commit_signature ${new_commit})"
signature="$(commit_signature ${new_commit} "${LIBBPF_PATHS[@]}")"
synced_cnt=$(grep -F "${signature}" ${TMP_DIR}/libbpf_commits.txt | wc -l)
manual_check=0
if ((${synced_cnt} > 0)); then
@@ -242,18 +213,14 @@ git branch ${BPF_TIP_TAG} ${BPF_TIP_COMMIT}
git branch ${SQUASH_BASE_TAG} ${SQUASH_COMMIT}
git checkout -b ${SQUASH_TIP_TAG} ${SQUASH_COMMIT}
# Validate there are no non-empty merges in bpf-next and bpf trees
validate_merges ${BASELINE_TAG} ${TIP_TAG}
validate_merges ${BPF_BASELINE_TAG} ${BPF_TIP_TAG}
# Cherry-pick new commits onto squashed baseline commit
cherry_pick_commits ${BASELINE_TAG} ${TIP_TAG}
cherry_pick_commits ${BPF_BASELINE_TAG} ${BPF_TIP_TAG}
# Move all libbpf files into __libbpf directory.
git filter-branch --prune-empty -f --tree-filter "${LIBBPF_TREE_FILTER}" ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --prune-empty -f --tree-filter "${LIBBPF_TREE_FILTER}" ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}
# Make __libbpf a new root directory
git filter-branch --prune-empty -f --subdirectory-filter __libbpf ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --prune-empty -f --subdirectory-filter __libbpf ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}
# If there are no new commits with libbpf-related changes, bail out
COMMIT_CNT=$(git rev-list --count ${SQUASH_BASE_TAG}..${SQUASH_TIP_TAG})
@@ -317,8 +284,8 @@ echo "Verifying Linux's and Github's libbpf state"
cd_to ${LINUX_REPO}
git checkout -b ${VIEW_TAG} ${TIP_COMMIT}
git filter-branch -f --tree-filter "${LIBBPF_TREE_FILTER}" ${VIEW_TAG}^..${VIEW_TAG}
git filter-branch -f --subdirectory-filter __libbpf ${VIEW_TAG}^..${VIEW_TAG}
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --tree-filter "${LIBBPF_TREE_FILTER}" ${VIEW_TAG}^..${VIEW_TAG}
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --subdirectory-filter __libbpf ${VIEW_TAG}^..${VIEW_TAG}
git ls-files -- ${LIBBPF_VIEW_PATHS[@]} > ${TMP_DIR}/linux-view.ls
cd_to ${LIBBPF_REPO}


@@ -10,7 +10,7 @@ TOPDIR = ..
INCLUDES := -I. -I$(TOPDIR)/include -I$(TOPDIR)/include/uapi
ALL_CFLAGS := $(INCLUDES)
FEATURE_REALLOCARRAY := $(shell $(TOPDIR)/scripts/check-reallocarray.sh)
FEATURE_REALLOCARRAY := $(shell $(TOPDIR)/scripts/check-reallocarray.sh $(CC))
ifneq ($(FEATURE_REALLOCARRAY),)
ALL_CFLAGS += -DCOMPAT_NEED_REALLOCARRAY
endif
@@ -21,7 +21,7 @@ CFLAGS ?= -g -O2 -Werror -Wall
ALL_CFLAGS += $(CFLAGS) -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64
ALL_LDFLAGS += $(LDFLAGS)
ifdef NO_PKG_CONFIG
ALL_LDFLAGS += -lelf
ALL_LDFLAGS += -lelf -lz
else
PKG_CONFIG ?= pkg-config
ALL_CFLAGS += $(shell $(PKG_CONFIG) --cflags libelf)
@@ -33,7 +33,7 @@ SHARED_OBJDIR := $(OBJDIR)/sharedobjs
STATIC_OBJDIR := $(OBJDIR)/staticobjs
OBJS := bpf.o btf.o libbpf.o libbpf_errno.o netlink.o \
nlattr.o str_error.o libbpf_probes.o bpf_prog_linfo.o xsk.o \
btf_dump.o hashmap.o
btf_dump.o hashmap.o ringbuf.o
SHARED_OBJS := $(addprefix $(SHARED_OBJDIR)/,$(OBJS))
STATIC_OBJS := $(addprefix $(STATIC_OBJDIR)/,$(OBJS))
@@ -47,7 +47,7 @@ endif
HEADERS := bpf.h libbpf.h btf.h xsk.h libbpf_util.h \
bpf_helpers.h bpf_helper_defs.h bpf_tracing.h \
bpf_endian.h bpf_core_read.h
bpf_endian.h bpf_core_read.h libbpf_common.h
UAPI_HEADERS := $(addprefix $(TOPDIR)/include/uapi/linux/,\
bpf.h bpf_common.h btf.h)
@@ -68,6 +68,8 @@ LIBDIR ?= $(PREFIX)/$(LIBSUBDIR)
INCLUDEDIR ?= $(PREFIX)/include
UAPIDIR ?= $(PREFIX)/include
TAGS_PROG := $(if $(shell which etags 2>/dev/null),etags,ctags)
all: $(STATIC_LIBS) $(SHARED_LIBS) $(PC_FILE)
$(OBJDIR)/libbpf.a: $(STATIC_OBJS)
@@ -133,3 +135,12 @@ install_pkgconfig: $(PC_FILE)
clean:
rm -rf *.o *.a *.so *.so.* *.pc $(SHARED_OBJDIR) $(STATIC_OBJDIR)
.PHONY: cscope tags
cscope:
ls *.c *.h > cscope.files
cscope -b -q -f cscope.out
tags:
rm -f TAGS tags
ls *.c *.h | xargs $(TAGS_PROG) -a

src/bpf.c

@@ -32,6 +32,9 @@
#include "libbpf.h"
#include "libbpf_internal.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
/*
* When building perf, unistd.h is overridden. __NR_bpf is
* required to be defined explicitly.
@@ -95,7 +98,11 @@ int bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr)
attr.btf_key_type_id = create_attr->btf_key_type_id;
attr.btf_value_type_id = create_attr->btf_value_type_id;
attr.map_ifindex = create_attr->map_ifindex;
attr.inner_map_fd = create_attr->inner_map_fd;
if (attr.map_type == BPF_MAP_TYPE_STRUCT_OPS)
attr.btf_vmlinux_value_type_id =
create_attr->btf_vmlinux_value_type_id;
else
attr.inner_map_fd = create_attr->inner_map_fd;
return sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
}
@@ -228,7 +235,11 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
memset(&attr, 0, sizeof(attr));
attr.prog_type = load_attr->prog_type;
attr.expected_attach_type = load_attr->expected_attach_type;
if (attr.prog_type == BPF_PROG_TYPE_TRACING) {
if (attr.prog_type == BPF_PROG_TYPE_STRUCT_OPS ||
attr.prog_type == BPF_PROG_TYPE_LSM) {
attr.attach_btf_id = load_attr->attach_btf_id;
} else if (attr.prog_type == BPF_PROG_TYPE_TRACING ||
attr.prog_type == BPF_PROG_TYPE_EXT) {
attr.attach_btf_id = load_attr->attach_btf_id;
attr.attach_prog_fd = load_attr->attach_prog_fd;
} else {
@@ -443,6 +454,64 @@ int bpf_map_freeze(int fd)
return sys_bpf(BPF_MAP_FREEZE, &attr, sizeof(attr));
}
static int bpf_map_batch_common(int cmd, int fd, void *in_batch,
void *out_batch, void *keys, void *values,
__u32 *count,
const struct bpf_map_batch_opts *opts)
{
union bpf_attr attr;
int ret;
if (!OPTS_VALID(opts, bpf_map_batch_opts))
return -EINVAL;
memset(&attr, 0, sizeof(attr));
attr.batch.map_fd = fd;
attr.batch.in_batch = ptr_to_u64(in_batch);
attr.batch.out_batch = ptr_to_u64(out_batch);
attr.batch.keys = ptr_to_u64(keys);
attr.batch.values = ptr_to_u64(values);
attr.batch.count = *count;
attr.batch.elem_flags = OPTS_GET(opts, elem_flags, 0);
attr.batch.flags = OPTS_GET(opts, flags, 0);
ret = sys_bpf(cmd, &attr, sizeof(attr));
*count = attr.batch.count;
return ret;
}
int bpf_map_delete_batch(int fd, void *keys, __u32 *count,
const struct bpf_map_batch_opts *opts)
{
return bpf_map_batch_common(BPF_MAP_DELETE_BATCH, fd, NULL,
NULL, keys, NULL, count, opts);
}
int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch, void *keys,
void *values, __u32 *count,
const struct bpf_map_batch_opts *opts)
{
return bpf_map_batch_common(BPF_MAP_LOOKUP_BATCH, fd, in_batch,
out_batch, keys, values, count, opts);
}
int bpf_map_lookup_and_delete_batch(int fd, void *in_batch, void *out_batch,
void *keys, void *values, __u32 *count,
const struct bpf_map_batch_opts *opts)
{
return bpf_map_batch_common(BPF_MAP_LOOKUP_AND_DELETE_BATCH,
fd, in_batch, out_batch, keys, values,
count, opts);
}
int bpf_map_update_batch(int fd, void *keys, void *values, __u32 *count,
const struct bpf_map_batch_opts *opts)
{
return bpf_map_batch_common(BPF_MAP_UPDATE_BATCH, fd, NULL, NULL,
keys, values, count, opts);
}
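All four batch wrappers funnel into bpf_map_batch_common(), which packs the user-supplied pointers into bpf_attr and writes back how many elements the kernel actually processed. For illustration, a minimal sketch of a batched update; the map fd, the __u32-key/__u64-value layout, and the helper name update_many are assumptions:
int update_many(int map_fd, __u32 *keys, __u64 *values)
{
	__u32 count = 64;
	DECLARE_LIBBPF_OPTS(bpf_map_batch_opts, opts,
		.elem_flags = BPF_ANY,
	);

	/* on return, count holds the number of elements actually processed */
	return bpf_map_update_batch(map_fd, keys, values, &count, &opts);
}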
int bpf_obj_pin(int fd, const char *pathname)
{
union bpf_attr attr;
@@ -466,14 +535,29 @@ int bpf_obj_get(const char *pathname)
int bpf_prog_attach(int prog_fd, int target_fd, enum bpf_attach_type type,
unsigned int flags)
{
DECLARE_LIBBPF_OPTS(bpf_prog_attach_opts, opts,
.flags = flags,
);
return bpf_prog_attach_xattr(prog_fd, target_fd, type, &opts);
}
int bpf_prog_attach_xattr(int prog_fd, int target_fd,
enum bpf_attach_type type,
const struct bpf_prog_attach_opts *opts)
{
union bpf_attr attr;
if (!OPTS_VALID(opts, bpf_prog_attach_opts))
return -EINVAL;
memset(&attr, 0, sizeof(attr));
attr.target_fd = target_fd;
attr.attach_bpf_fd = prog_fd;
attr.attach_type = type;
attr.attach_flags = flags;
attr.attach_flags = OPTS_GET(opts, flags, 0);
attr.replace_bpf_fd = OPTS_GET(opts, replace_prog_fd, 0);
return sys_bpf(BPF_PROG_ATTACH, &attr, sizeof(attr));
}
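The opts-based variant enables atomic replacement: passing BPF_F_REPLACE in flags together with replace_prog_fd swaps an already-attached cgroup program in one step. A hedged sketch, where the three fds and the attach point are assumptions:
int replace_cgroup_prog(int cgroup_fd, int old_prog_fd, int new_prog_fd)
{
	DECLARE_LIBBPF_OPTS(bpf_prog_attach_opts, opts,
		.flags = BPF_F_ALLOW_MULTI | BPF_F_REPLACE,
		.replace_prog_fd = old_prog_fd,
	);

	/* kernel replaces old_prog_fd's attachment, atomically */
	return bpf_prog_attach_xattr(new_prog_fd, cgroup_fd,
				     BPF_CGROUP_INET_EGRESS, &opts);
}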
@@ -501,6 +585,64 @@ int bpf_prog_detach2(int prog_fd, int target_fd, enum bpf_attach_type type)
return sys_bpf(BPF_PROG_DETACH, &attr, sizeof(attr));
}
int bpf_link_create(int prog_fd, int target_fd,
enum bpf_attach_type attach_type,
const struct bpf_link_create_opts *opts)
{
union bpf_attr attr;
if (!OPTS_VALID(opts, bpf_link_create_opts))
return -EINVAL;
memset(&attr, 0, sizeof(attr));
attr.link_create.prog_fd = prog_fd;
attr.link_create.target_fd = target_fd;
attr.link_create.attach_type = attach_type;
attr.link_create.flags = OPTS_GET(opts, flags, 0);
attr.link_create.iter_info =
ptr_to_u64(OPTS_GET(opts, iter_info, (void *)0));
attr.link_create.iter_info_len = OPTS_GET(opts, iter_info_len, 0);
return sys_bpf(BPF_LINK_CREATE, &attr, sizeof(attr));
}
int bpf_link_detach(int link_fd)
{
union bpf_attr attr;
memset(&attr, 0, sizeof(attr));
attr.link_detach.link_fd = link_fd;
return sys_bpf(BPF_LINK_DETACH, &attr, sizeof(attr));
}
int bpf_link_update(int link_fd, int new_prog_fd,
const struct bpf_link_update_opts *opts)
{
union bpf_attr attr;
if (!OPTS_VALID(opts, bpf_link_update_opts))
return -EINVAL;
memset(&attr, 0, sizeof(attr));
attr.link_update.link_fd = link_fd;
attr.link_update.new_prog_fd = new_prog_fd;
attr.link_update.flags = OPTS_GET(opts, flags, 0);
attr.link_update.old_prog_fd = OPTS_GET(opts, old_prog_fd, 0);
return sys_bpf(BPF_LINK_UPDATE, &attr, sizeof(attr));
}
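bpf_link_update() supports the same conditional pattern: with BPF_F_REPLACE set, the kernel swaps the program only if the link still points at old_prog_fd. An illustrative sketch (link_fd and both program fds are assumptions):
int swap_link_prog(int link_fd, int old_prog_fd, int new_prog_fd)
{
	DECLARE_LIBBPF_OPTS(bpf_link_update_opts, opts,
		.flags = BPF_F_REPLACE,
		.old_prog_fd = old_prog_fd,
	);

	return bpf_link_update(link_fd, new_prog_fd, &opts);
}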
int bpf_iter_create(int link_fd)
{
union bpf_attr attr;
memset(&attr, 0, sizeof(attr));
attr.iter_create.link_fd = link_fd;
return sys_bpf(BPF_ITER_CREATE, &attr, sizeof(attr));
}
int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
__u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
{
@@ -603,6 +745,11 @@ int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id)
return bpf_obj_get_next_id(start_id, next_id, BPF_BTF_GET_NEXT_ID);
}
int bpf_link_get_next_id(__u32 start_id, __u32 *next_id)
{
return bpf_obj_get_next_id(start_id, next_id, BPF_LINK_GET_NEXT_ID);
}
int bpf_prog_get_fd_by_id(__u32 id)
{
union bpf_attr attr;
@@ -633,13 +780,23 @@ int bpf_btf_get_fd_by_id(__u32 id)
return sys_bpf(BPF_BTF_GET_FD_BY_ID, &attr, sizeof(attr));
}
int bpf_obj_get_info_by_fd(int prog_fd, void *info, __u32 *info_len)
int bpf_link_get_fd_by_id(__u32 id)
{
union bpf_attr attr;
memset(&attr, 0, sizeof(attr));
attr.link_id = id;
return sys_bpf(BPF_LINK_GET_FD_BY_ID, &attr, sizeof(attr));
}
int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len)
{
union bpf_attr attr;
int err;
memset(&attr, 0, sizeof(attr));
attr.info.bpf_fd = prog_fd;
attr.info.bpf_fd = bpf_fd;
attr.info.info_len = *info_len;
attr.info.info = ptr_to_u64(info);
@@ -708,3 +865,13 @@ int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf, __u32 *buf_len,
return err;
}
int bpf_enable_stats(enum bpf_stats_type type)
{
union bpf_attr attr;
memset(&attr, 0, sizeof(attr));
attr.enable_stats.type = type;
return sys_bpf(BPF_ENABLE_STATS, &attr, sizeof(attr));
}


@@ -28,14 +28,12 @@
#include <stddef.h>
#include <stdint.h>
#include "libbpf_common.h"
#ifdef __cplusplus
extern "C" {
#endif
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
#endif
struct bpf_create_map_attr {
const char *name;
enum bpf_map_type map_type;
@@ -48,7 +46,10 @@ struct bpf_create_map_attr {
__u32 btf_key_type_id;
__u32 btf_value_type_id;
__u32 map_ifindex;
__u32 inner_map_fd;
union {
__u32 inner_map_fd;
__u32 btf_vmlinux_value_type_id;
};
};
LIBBPF_API int
@@ -126,14 +127,74 @@ LIBBPF_API int bpf_map_lookup_and_delete_elem(int fd, const void *key,
LIBBPF_API int bpf_map_delete_elem(int fd, const void *key);
LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key);
LIBBPF_API int bpf_map_freeze(int fd);
struct bpf_map_batch_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u64 elem_flags;
__u64 flags;
};
#define bpf_map_batch_opts__last_field flags
LIBBPF_API int bpf_map_delete_batch(int fd, void *keys,
__u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch,
void *keys, void *values, __u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_map_lookup_and_delete_batch(int fd, void *in_batch,
void *out_batch, void *keys,
void *values, __u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_map_update_batch(int fd, void *keys, void *values,
__u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_obj_pin(int fd, const char *pathname);
LIBBPF_API int bpf_obj_get(const char *pathname);
struct bpf_prog_attach_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
unsigned int flags;
int replace_prog_fd;
};
#define bpf_prog_attach_opts__last_field replace_prog_fd
LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd,
enum bpf_attach_type type, unsigned int flags);
LIBBPF_API int bpf_prog_attach_xattr(int prog_fd, int attachable_fd,
enum bpf_attach_type type,
const struct bpf_prog_attach_opts *opts);
LIBBPF_API int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);
LIBBPF_API int bpf_prog_detach2(int prog_fd, int attachable_fd,
enum bpf_attach_type type);
union bpf_iter_link_info; /* defined in up-to-date linux/bpf.h */
struct bpf_link_create_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u32 flags;
union bpf_iter_link_info *iter_info;
__u32 iter_info_len;
};
#define bpf_link_create_opts__last_field iter_info_len
LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
enum bpf_attach_type attach_type,
const struct bpf_link_create_opts *opts);
LIBBPF_API int bpf_link_detach(int link_fd);
struct bpf_link_update_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u32 flags; /* extra flags */
__u32 old_prog_fd; /* expected old program FD */
};
#define bpf_link_update_opts__last_field old_prog_fd
LIBBPF_API int bpf_link_update(int link_fd, int new_prog_fd,
const struct bpf_link_update_opts *opts);
LIBBPF_API int bpf_iter_create(int link_fd);
struct bpf_prog_test_run_attr {
int prog_fd;
int repeat;
@@ -163,10 +224,12 @@ LIBBPF_API int bpf_prog_test_run(int prog_fd, int repeat, void *data,
LIBBPF_API int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_map_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_link_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_prog_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_map_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_obj_get_info_by_fd(int prog_fd, void *info, __u32 *info_len);
LIBBPF_API int bpf_link_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len);
LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type,
__u32 query_flags, __u32 *attach_flags,
__u32 *prog_ids, __u32 *prog_cnt);
@@ -177,6 +240,9 @@ LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
__u32 *buf_len, __u32 *prog_id, __u32 *fd_type,
__u64 *probe_offset, __u64 *probe_addr);
enum bpf_stats_type; /* defined in up-to-date linux/bpf.h */
LIBBPF_API int bpf_enable_stats(enum bpf_stats_type type);
#ifdef __cplusplus
} /* extern "C" */
#endif


@@ -217,7 +217,7 @@ enum bpf_field_info_kind {
*/
#define BPF_CORE_READ_INTO(dst, src, a, ...) \
({ \
___core_read(bpf_core_read, dst, src, a, ##__VA_ARGS__) \
___core_read(bpf_core_read, dst, (src), a, ##__VA_ARGS__) \
})
/*
@@ -227,7 +227,7 @@ enum bpf_field_info_kind {
*/
#define BPF_CORE_READ_STR_INTO(dst, src, a, ...) \
({ \
___core_read(bpf_core_read_str, dst, src, a, ##__VA_ARGS__) \
___core_read(bpf_core_read_str, dst, (src), a, ##__VA_ARGS__)\
})
/*
@@ -254,8 +254,8 @@ enum bpf_field_info_kind {
*/
#define BPF_CORE_READ(src, a, ...) \
({ \
___type(src, a, ##__VA_ARGS__) __r; \
BPF_CORE_READ_INTO(&__r, src, a, ##__VA_ARGS__); \
___type((src), a, ##__VA_ARGS__) __r; \
BPF_CORE_READ_INTO(&__r, (src), a, ##__VA_ARGS__); \
__r; \
})


@@ -2,8 +2,35 @@
#ifndef __BPF_ENDIAN__
#define __BPF_ENDIAN__
#include <linux/stddef.h>
#include <linux/swab.h>
/*
* Isolate byte #n and put it into byte #m, for __u##b type.
* E.g., moving byte #6 (nnnnnnnn) into byte #1 (mmmmmmmm) for __u64:
* 1) xxxxxxxx nnnnnnnn xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx mmmmmmmm xxxxxxxx
* 2) nnnnnnnn xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx mmmmmmmm xxxxxxxx 00000000
* 3) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 nnnnnnnn
* 4) 00000000 00000000 00000000 00000000 00000000 00000000 nnnnnnnn 00000000
*/
#define ___bpf_mvb(x, b, n, m) ((__u##b)(x) << (b-(n+1)*8) >> (b-8) << (m*8))
#define ___bpf_swab16(x) ((__u16)( \
___bpf_mvb(x, 16, 0, 1) | \
___bpf_mvb(x, 16, 1, 0)))
#define ___bpf_swab32(x) ((__u32)( \
___bpf_mvb(x, 32, 0, 3) | \
___bpf_mvb(x, 32, 1, 2) | \
___bpf_mvb(x, 32, 2, 1) | \
___bpf_mvb(x, 32, 3, 0)))
#define ___bpf_swab64(x) ((__u64)( \
___bpf_mvb(x, 64, 0, 7) | \
___bpf_mvb(x, 64, 1, 6) | \
___bpf_mvb(x, 64, 2, 5) | \
___bpf_mvb(x, 64, 3, 4) | \
___bpf_mvb(x, 64, 4, 3) | \
___bpf_mvb(x, 64, 5, 2) | \
___bpf_mvb(x, 64, 6, 1) | \
___bpf_mvb(x, 64, 7, 0)))
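To see how the ___bpf_mvb() building block composes, here is a worked expansion of the 16-bit case (illustrative arithmetic only, tracking just the low 16 bits; the final __u16 cast discards anything above them):
/*
 * ___bpf_swab16(0x1234):
 *   ___bpf_mvb(0x1234, 16, 0, 1): isolate byte #0 (0x34), place it in
 *                                 byte #1 -> 0x3400
 *   ___bpf_mvb(0x1234, 16, 1, 0): isolate byte #1 (0x12), place it in
 *                                 byte #0 -> 0x0012
 * OR-ed together and truncated to __u16 this yields 0x3412, the
 * byte-swapped value, computed entirely at compile time so it remains
 * usable where a constant expression is required.
 */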
/* LLVM's BPF target selects the endianness of the CPU
* it compiles on, or the user specifies (bpfel/bpfeb),
@@ -23,16 +50,16 @@
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define __bpf_ntohs(x) __builtin_bswap16(x)
# define __bpf_htons(x) __builtin_bswap16(x)
# define __bpf_constant_ntohs(x) ___constant_swab16(x)
# define __bpf_constant_htons(x) ___constant_swab16(x)
# define __bpf_constant_ntohs(x) ___bpf_swab16(x)
# define __bpf_constant_htons(x) ___bpf_swab16(x)
# define __bpf_ntohl(x) __builtin_bswap32(x)
# define __bpf_htonl(x) __builtin_bswap32(x)
# define __bpf_constant_ntohl(x) ___constant_swab32(x)
# define __bpf_constant_htonl(x) ___constant_swab32(x)
# define __bpf_constant_ntohl(x) ___bpf_swab32(x)
# define __bpf_constant_htonl(x) ___bpf_swab32(x)
# define __bpf_be64_to_cpu(x) __builtin_bswap64(x)
# define __bpf_cpu_to_be64(x) __builtin_bswap64(x)
# define __bpf_constant_be64_to_cpu(x) ___constant_swab64(x)
# define __bpf_constant_cpu_to_be64(x) ___constant_swab64(x)
# define __bpf_constant_be64_to_cpu(x) ___bpf_swab64(x)
# define __bpf_constant_cpu_to_be64(x) ___bpf_swab64(x)
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
# define __bpf_ntohs(x) (x)
# define __bpf_htons(x) (x)

File diff suppressed because it is too large.


@@ -2,10 +2,17 @@
#ifndef __BPF_HELPERS__
#define __BPF_HELPERS__
/*
* Note that bpf programs need to include either
* vmlinux.h (auto-generated from BTF) or linux/types.h
* in advance since bpf_helper_defs.h uses such types
* as __u64.
*/
#include "bpf_helper_defs.h"
#define __uint(name, val) int (*name)[val]
#define __type(name, val) typeof(val) *name
#define __array(name, val) typeof(val) *name[]
/* Helper macro to print out debug messages */
#define bpf_printk(fmt, ...) \
@@ -25,6 +32,23 @@
#ifndef __always_inline
#define __always_inline __attribute__((always_inline))
#endif
#ifndef __weak
#define __weak __attribute__((weak))
#endif
/*
* Helper macro to manipulate data structures
*/
#ifndef offsetof
#define offsetof(TYPE, MEMBER) ((unsigned long)&((TYPE *)0)->MEMBER)
#endif
#ifndef container_of
#define container_of(ptr, type, member) \
({ \
void *__mptr = (void *)(ptr); \
((type *)(__mptr - offsetof(type, member))); \
})
#endif
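For illustration, container_of() walks back from a member pointer to its enclosing struct by subtracting the member's offset. The struct request and list_head names below are assumptions (e.g., types that would come from vmlinux.h):
struct request {
	int id;
	struct list_head list;
};

static __always_inline struct request *req_of(struct list_head *lh)
{
	/* subtract offsetof(struct request, list) from the member pointer */
	return container_of(lh, struct request, list);
}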
/*
* Helper structure used by eBPF C program
@@ -44,4 +68,13 @@ enum libbpf_pin_type {
LIBBPF_PIN_BY_NAME,
};
enum libbpf_tristate {
TRI_NO = 0,
TRI_YES = 1,
TRI_MODULE = 2,
};
#define __kconfig __attribute__((section(".kconfig")))
#define __ksym __attribute__((section(".ksyms")))
#endif


@@ -8,6 +8,9 @@
#include "libbpf.h"
#include "libbpf_internal.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
struct bpf_prog_linfo {
void *raw_linfo;
void *raw_jited_linfo;


@@ -49,7 +49,8 @@
#if defined(bpf_target_x86)
#ifdef __KERNEL__
#if defined(__KERNEL__) || defined(__VMLINUX_H__)
#define PT_REGS_PARM1(x) ((x)->di)
#define PT_REGS_PARM2(x) ((x)->si)
#define PT_REGS_PARM3(x) ((x)->dx)
@@ -60,7 +61,20 @@
#define PT_REGS_RC(x) ((x)->ax)
#define PT_REGS_SP(x) ((x)->sp)
#define PT_REGS_IP(x) ((x)->ip)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), di)
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), si)
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), dx)
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), cx)
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), r8)
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), sp)
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), bp)
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), ax)
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), sp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), ip)
#else
#ifdef __i386__
/* i386 kernel is built with -mregparm=3 */
#define PT_REGS_PARM1(x) ((x)->eax)
@@ -73,7 +87,20 @@
#define PT_REGS_RC(x) ((x)->eax)
#define PT_REGS_SP(x) ((x)->esp)
#define PT_REGS_IP(x) ((x)->eip)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), eax)
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), edx)
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), ecx)
#define PT_REGS_PARM4_CORE(x) 0
#define PT_REGS_PARM5_CORE(x) 0
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), esp)
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), ebp)
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), eax)
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), esp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), eip)
#else
#define PT_REGS_PARM1(x) ((x)->rdi)
#define PT_REGS_PARM2(x) ((x)->rsi)
#define PT_REGS_PARM3(x) ((x)->rdx)
@@ -84,6 +111,18 @@
#define PT_REGS_RC(x) ((x)->rax)
#define PT_REGS_SP(x) ((x)->rsp)
#define PT_REGS_IP(x) ((x)->rip)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), rdi)
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), rsi)
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), rdx)
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), rcx)
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), r8)
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), rsp)
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), rbp)
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), rax)
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), rsp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), rip)
#endif
#endif
@@ -104,6 +143,17 @@ struct pt_regs;
#define PT_REGS_SP(x) (((PT_REGS_S390 *)(x))->gprs[15])
#define PT_REGS_IP(x) (((PT_REGS_S390 *)(x))->psw.addr)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[2])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[3])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[4])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[5])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[6])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[14])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[11])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[2])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[15])
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), psw.addr)
#elif defined(bpf_target_arm)
#define PT_REGS_PARM1(x) ((x)->uregs[0])
@@ -117,6 +167,17 @@ struct pt_regs;
#define PT_REGS_SP(x) ((x)->uregs[13])
#define PT_REGS_IP(x) ((x)->uregs[12])
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), uregs[0])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), uregs[1])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), uregs[2])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), uregs[3])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), uregs[4])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), uregs[14])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), uregs[11])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), uregs[0])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), uregs[13])
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), uregs[12])
#elif defined(bpf_target_arm64)
/* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */
@@ -134,6 +195,17 @@ struct pt_regs;
#define PT_REGS_SP(x) (((PT_REGS_ARM64 *)(x))->sp)
#define PT_REGS_IP(x) (((PT_REGS_ARM64 *)(x))->pc)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[0])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[1])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[2])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[3])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[4])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[30])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[29])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[0])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), sp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), pc)
#elif defined(bpf_target_mips)
#define PT_REGS_PARM1(x) ((x)->regs[4])
@@ -143,10 +215,21 @@ struct pt_regs;
#define PT_REGS_PARM5(x) ((x)->regs[8])
#define PT_REGS_RET(x) ((x)->regs[31])
#define PT_REGS_FP(x) ((x)->regs[30]) /* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_RC(x) ((x)->regs[1])
#define PT_REGS_RC(x) ((x)->regs[2])
#define PT_REGS_SP(x) ((x)->regs[29])
#define PT_REGS_IP(x) ((x)->cp0_epc)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), regs[4])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), regs[5])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), regs[6])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), regs[7])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), regs[8])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), regs[31])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), regs[30])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), regs[2])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), regs[29])
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), cp0_epc)
#elif defined(bpf_target_powerpc)
#define PT_REGS_PARM1(x) ((x)->gpr[3])
@@ -158,6 +241,15 @@ struct pt_regs;
#define PT_REGS_SP(x) ((x)->sp)
#define PT_REGS_IP(x) ((x)->nip)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), gpr[3])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), gpr[4])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), gpr[5])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), gpr[6])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), gpr[7])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), gpr[3])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), sp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), nip)
#elif defined(bpf_target_sparc)
#define PT_REGS_PARM1(x) ((x)->u_regs[UREG_I0])
@@ -169,11 +261,22 @@ struct pt_regs;
#define PT_REGS_RC(x) ((x)->u_regs[UREG_I0])
#define PT_REGS_SP(x) ((x)->u_regs[UREG_FP])
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I0])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I1])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I2])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I3])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I4])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I7])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I0])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), u_regs[UREG_FP])
/* Should this also be a bpf_target check for the sparc case? */
#if defined(__arch64__)
#define PT_REGS_IP(x) ((x)->tpc)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), tpc)
#else
#define PT_REGS_IP(x) ((x)->pc)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), pc)
#endif
#endif
@@ -192,4 +295,138 @@ struct pt_regs;
(void *)(PT_REGS_FP(ctx) + sizeof(ip))); })
#endif
#define ___bpf_concat(a, b) a ## b
#define ___bpf_apply(fn, n) ___bpf_concat(fn, n)
#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N
#define ___bpf_narg(...) \
___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#define ___bpf_empty(...) \
___bpf_nth(_, ##__VA_ARGS__, N, N, N, N, N, N, N, N, N, N, 0)
#define ___bpf_ctx_cast0() ctx
#define ___bpf_ctx_cast1(x) ___bpf_ctx_cast0(), (void *)ctx[0]
#define ___bpf_ctx_cast2(x, args...) ___bpf_ctx_cast1(args), (void *)ctx[1]
#define ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), (void *)ctx[2]
#define ___bpf_ctx_cast4(x, args...) ___bpf_ctx_cast3(args), (void *)ctx[3]
#define ___bpf_ctx_cast5(x, args...) ___bpf_ctx_cast4(args), (void *)ctx[4]
#define ___bpf_ctx_cast6(x, args...) ___bpf_ctx_cast5(args), (void *)ctx[5]
#define ___bpf_ctx_cast7(x, args...) ___bpf_ctx_cast6(args), (void *)ctx[6]
#define ___bpf_ctx_cast8(x, args...) ___bpf_ctx_cast7(args), (void *)ctx[7]
#define ___bpf_ctx_cast9(x, args...) ___bpf_ctx_cast8(args), (void *)ctx[8]
#define ___bpf_ctx_cast10(x, args...) ___bpf_ctx_cast9(args), (void *)ctx[9]
#define ___bpf_ctx_cast11(x, args...) ___bpf_ctx_cast10(args), (void *)ctx[10]
#define ___bpf_ctx_cast12(x, args...) ___bpf_ctx_cast11(args), (void *)ctx[11]
#define ___bpf_ctx_cast(args...) \
___bpf_apply(___bpf_ctx_cast, ___bpf_narg(args))(args)
/*
* BPF_PROG is a convenience wrapper for generic tp_btf/fentry/fexit and
* similar kinds of BPF programs, that accept input arguments as a single
* pointer to untyped u64 array, where each u64 can actually be a typed
* pointer or integer of different size. Instead of requiring the user to write
* manual casts and work with array elements by index, BPF_PROG macro
* allows user to declare a list of named and typed input arguments in the
* same syntax as for normal C function. All the casting is hidden and
* performed transparently, while user code can just assume working with
* function arguments of specified type and name.
*
* Original raw context argument is preserved as well as 'ctx' argument.
* This is useful when using BPF helpers that expect original context
* as one of the parameters (e.g., for bpf_perf_event_output()).
*/
#define BPF_PROG(name, args...) \
name(unsigned long long *ctx); \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(unsigned long long *ctx, ##args); \
typeof(name(0)) name(unsigned long long *ctx) \
{ \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
return ____##name(___bpf_ctx_cast(args)); \
_Pragma("GCC diagnostic pop") \
} \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(unsigned long long *ctx, ##args)
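A hedged example of BPF_PROG in action; the fentry target do_unlinkat and struct filename (obtained via vmlinux.h) are assumptions:
SEC("fentry/do_unlinkat")
int BPF_PROG(trace_unlink, int dfd, struct filename *name)
{
	/* typed arguments above; the raw 'ctx' is still in scope for helpers */
	bpf_printk("unlinkat dfd=%d\n", dfd);
	return 0;
}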
struct pt_regs;
#define ___bpf_kprobe_args0() ctx
#define ___bpf_kprobe_args1(x) \
___bpf_kprobe_args0(), (void *)PT_REGS_PARM1(ctx)
#define ___bpf_kprobe_args2(x, args...) \
___bpf_kprobe_args1(args), (void *)PT_REGS_PARM2(ctx)
#define ___bpf_kprobe_args3(x, args...) \
___bpf_kprobe_args2(args), (void *)PT_REGS_PARM3(ctx)
#define ___bpf_kprobe_args4(x, args...) \
___bpf_kprobe_args3(args), (void *)PT_REGS_PARM4(ctx)
#define ___bpf_kprobe_args5(x, args...) \
___bpf_kprobe_args4(args), (void *)PT_REGS_PARM5(ctx)
#define ___bpf_kprobe_args(args...) \
___bpf_apply(___bpf_kprobe_args, ___bpf_narg(args))(args)
/*
* BPF_KPROBE serves the same purpose for kprobes as BPF_PROG for
* tp_btf/fentry/fexit BPF programs. It hides the underlying platform-specific
* low-level way of getting kprobe input arguments from struct pt_regs, and
* provides a familiar typed and named function arguments syntax and
* semantics of accessing kprobe input parameters.
*
* Original struct pt_regs* context is preserved as 'ctx' argument. This might
* be necessary when using BPF helpers like bpf_perf_event_output().
*/
#define BPF_KPROBE(name, args...) \
name(struct pt_regs *ctx); \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(struct pt_regs *ctx, ##args); \
typeof(name(0)) name(struct pt_regs *ctx) \
{ \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
return ____##name(___bpf_kprobe_args(args)); \
_Pragma("GCC diagnostic pop") \
} \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(struct pt_regs *ctx, ##args)
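The kprobe flavor reads analogously; the probed function is an assumption, and each typed argument is pulled from the corresponding PT_REGS_PARMn(ctx) defined above:
SEC("kprobe/do_unlinkat")
int BPF_KPROBE(kprobe_unlink, int dfd, struct filename *name)
{
	bpf_printk("unlinkat dfd=%d\n", dfd);
	return 0;
}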
#define ___bpf_kretprobe_args0() ctx
#define ___bpf_kretprobe_args1(x) \
___bpf_kretprobe_args0(), (void *)PT_REGS_RC(ctx)
#define ___bpf_kretprobe_args(args...) \
___bpf_apply(___bpf_kretprobe_args, ___bpf_narg(args))(args)
/*
* BPF_KRETPROBE is similar to BPF_KPROBE, except, it only provides optional
* return value (in addition to `struct pt_regs *ctx`), but no input
* arguments, because they will be clobbered by the time probed function
* returns.
*/
#define BPF_KRETPROBE(name, args...) \
name(struct pt_regs *ctx); \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(struct pt_regs *ctx, ##args); \
typeof(name(0)) name(struct pt_regs *ctx) \
{ \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
return ____##name(___bpf_kretprobe_args(args)); \
_Pragma("GCC diagnostic pop") \
} \
static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
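And the return-probe counterpart, where the single optional argument is taken from PT_REGS_RC(ctx); the probe target is again an assumption:
SEC("kretprobe/do_unlinkat")
int BPF_KRETPROBE(kretprobe_unlink, long ret)
{
	bpf_printk("unlinkat ret=%ld\n", ret);
	return 0;
}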
/*
* BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
* in a structure.
*/
#define BPF_SEQ_PRINTF(seq, fmt, args...) \
({ \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
static const char ___fmt[] = fmt; \
unsigned long long ___param[] = { args }; \
_Pragma("GCC diagnostic pop") \
int ___ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt), \
___param, sizeof(___param)); \
___ret; \
})
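A sketch of BPF_SEQ_PRINTF inside a task iterator program; the bpf_iter__task context layout is assumed to come from vmlinux.h:
SEC("iter/task")
int dump_task(struct bpf_iter__task *ctx)
{
	struct seq_file *seq = ctx->meta->seq;
	struct task_struct *task = ctx->task;

	if (!task)
		return 0;
	BPF_SEQ_PRINTF(seq, "%d %s\n", task->pid, task->comm);
	return 0;
}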
#endif

src/btf.c

@@ -8,6 +8,10 @@
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <sys/utsname.h>
#include <sys/param.h>
#include <sys/stat.h>
#include <linux/kernel.h>
#include <linux/err.h>
#include <linux/btf.h>
#include <gelf.h>
@@ -17,8 +21,11 @@
#include "libbpf_internal.h"
#include "hashmap.h"
#define BTF_MAX_NR_TYPES 0x7fffffff
#define BTF_MAX_STR_OFFSET 0x7fffffff
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
#define BTF_MAX_NR_TYPES 0x7fffffffU
#define BTF_MAX_STR_OFFSET 0x7fffffffU
static struct btf_type btf_void;
@@ -34,6 +41,7 @@ struct btf {
__u32 types_size;
__u32 data_size;
int fd;
int ptr_sz;
};
static inline __u64 ptr_to_u64(const void *ptr)
@@ -50,7 +58,7 @@ static int btf_add_type(struct btf *btf, struct btf_type *t)
if (btf->types_size == BTF_MAX_NR_TYPES)
return -E2BIG;
expand_by = max(btf->types_size >> 2, 16);
expand_by = max(btf->types_size >> 2, 16U);
new_size = min(BTF_MAX_NR_TYPES, btf->types_size + expand_by);
new_types = realloc(btf->types, sizeof(*new_types) * new_size);
@@ -214,6 +222,70 @@ const struct btf_type *btf__type_by_id(const struct btf *btf, __u32 type_id)
return btf->types[type_id];
}
static int determine_ptr_size(const struct btf *btf)
{
const struct btf_type *t;
const char *name;
int i;
for (i = 1; i <= btf->nr_types; i++) {
t = btf__type_by_id(btf, i);
if (!btf_is_int(t))
continue;
name = btf__name_by_offset(btf, t->name_off);
if (!name)
continue;
if (strcmp(name, "long int") == 0 ||
strcmp(name, "long unsigned int") == 0) {
if (t->size != 4 && t->size != 8)
continue;
return t->size;
}
}
return -1;
}
static size_t btf_ptr_sz(const struct btf *btf)
{
if (!btf->ptr_sz)
((struct btf *)btf)->ptr_sz = determine_ptr_size(btf);
return btf->ptr_sz < 0 ? sizeof(void *) : btf->ptr_sz;
}
/* Return pointer size this BTF instance assumes. The size is heuristically
* determined by looking for 'long' or 'unsigned long' integer type and
* recording its size in bytes. If BTF type information doesn't have any such
* type, this function returns 0. In the latter case, native architecture's
* pointer size is assumed, so it will be either 4 or 8, depending on the
* architecture that libbpf was compiled for. It's possible to override the
* guessed value by using the btf__set_pointer_size() API.
*/
size_t btf__pointer_size(const struct btf *btf)
{
if (!btf->ptr_sz)
((struct btf *)btf)->ptr_sz = determine_ptr_size(btf);
if (btf->ptr_sz < 0)
/* not enough BTF type info to guess */
return 0;
return btf->ptr_sz;
}
/* Override or set pointer size in bytes. Only values of 4 and 8 are
* supported.
*/
int btf__set_pointer_size(struct btf *btf, size_t ptr_sz)
{
if (ptr_sz != 4 && ptr_sz != 8)
return -EINVAL;
btf->ptr_sz = ptr_sz;
return 0;
}
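From the application side, the pair of APIs might be combined like this (the wrapper name is an assumption; 8 is the sensible fallback since BPF object files are 64-bit):
struct btf *parse_obj_btf(const char *path)
{
	struct btf *btf = btf__parse_elf(path, NULL);

	if (!libbpf_get_error(btf) && btf__pointer_size(btf) == 0)
		btf__set_pointer_size(btf, 8); /* no 'long' in BTF to guess from */
	return btf;
}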
static bool btf_type_is_void(const struct btf_type *t)
{
return t == &btf_void || btf_is_fwd(t);
@@ -246,7 +318,7 @@ __s64 btf__resolve_size(const struct btf *btf, __u32 type_id)
size = t->size;
goto done;
case BTF_KIND_PTR:
size = sizeof(void *);
size = btf_ptr_sz(btf);
goto done;
case BTF_KIND_TYPEDEF:
case BTF_KIND_VOLATILE:
@@ -278,6 +350,45 @@ done:
return nelems * size;
}
int btf__align_of(const struct btf *btf, __u32 id)
{
const struct btf_type *t = btf__type_by_id(btf, id);
__u16 kind = btf_kind(t);
switch (kind) {
case BTF_KIND_INT:
case BTF_KIND_ENUM:
return min(btf_ptr_sz(btf), (size_t)t->size);
case BTF_KIND_PTR:
return btf_ptr_sz(btf);
case BTF_KIND_TYPEDEF:
case BTF_KIND_VOLATILE:
case BTF_KIND_CONST:
case BTF_KIND_RESTRICT:
return btf__align_of(btf, t->type);
case BTF_KIND_ARRAY:
return btf__align_of(btf, btf_array(t)->type);
case BTF_KIND_STRUCT:
case BTF_KIND_UNION: {
const struct btf_member *m = btf_members(t);
__u16 vlen = btf_vlen(t);
int i, max_align = 1, align;
for (i = 0; i < vlen; i++, m++) {
align = btf__align_of(btf, m->type);
if (align <= 0)
return align;
max_align = max(max_align, align);
}
return max_align;
}
default:
pr_warn("unsupported BTF_KIND:%u\n", btf_kind(t));
return 0;
}
}
int btf__resolve_type(const struct btf *btf, __u32 type_id)
{
const struct btf_type *t;
@@ -340,10 +451,10 @@ __s32 btf__find_by_name_kind(const struct btf *btf, const char *type_name,
void btf__free(struct btf *btf)
{
if (!btf)
if (IS_ERR_OR_NULL(btf))
return;
if (btf->fd != -1)
if (btf->fd >= 0)
close(btf->fd);
free(btf->data);
@@ -351,7 +462,7 @@ void btf__free(struct btf *btf)
free(btf);
}
struct btf *btf__new(__u8 *data, __u32 size)
struct btf *btf__new(const void *data, __u32 size)
{
struct btf *btf;
int err;
@@ -487,6 +598,18 @@ struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext)
if (IS_ERR(btf))
goto done;
switch (gelf_getclass(elf)) {
case ELFCLASS32:
btf__set_pointer_size(btf, 4);
break;
case ELFCLASS64:
btf__set_pointer_size(btf, 8);
break;
default:
pr_warn("failed to get ELF class (bitness) for %s\n", path);
break;
}
if (btf_ext && btf_ext_data) {
*btf_ext = btf_ext__new(btf_ext_data->d_buf,
btf_ext_data->d_size);
@@ -516,6 +639,83 @@ done:
return btf;
}
struct btf *btf__parse_raw(const char *path)
{
struct btf *btf = NULL;
void *data = NULL;
FILE *f = NULL;
__u16 magic;
int err = 0;
long sz;
f = fopen(path, "rb");
if (!f) {
err = -errno;
goto err_out;
}
/* check BTF magic */
if (fread(&magic, 1, sizeof(magic), f) < sizeof(magic)) {
err = -EIO;
goto err_out;
}
if (magic != BTF_MAGIC) {
/* definitely not a raw BTF */
err = -EPROTO;
goto err_out;
}
/* get file size */
if (fseek(f, 0, SEEK_END)) {
err = -errno;
goto err_out;
}
sz = ftell(f);
if (sz < 0) {
err = -errno;
goto err_out;
}
/* rewind to the start */
if (fseek(f, 0, SEEK_SET)) {
err = -errno;
goto err_out;
}
/* pre-alloc memory and read all of BTF data */
data = malloc(sz);
if (!data) {
err = -ENOMEM;
goto err_out;
}
if (fread(data, 1, sz, f) < sz) {
err = -EIO;
goto err_out;
}
/* finally parse BTF data */
btf = btf__new(data, sz);
err_out:
free(data);
if (f)
fclose(f);
return err ? ERR_PTR(err) : btf;
}
struct btf *btf__parse(const char *path, struct btf_ext **btf_ext)
{
struct btf *btf;
if (btf_ext)
*btf_ext = NULL;
btf = btf__parse_raw(path);
if (!IS_ERR(btf) || PTR_ERR(btf) != -EPROTO)
return btf;
return btf__parse_elf(path, btf_ext);
}
static int compare_vsi_off(const void *_a, const void *_b)
{
const struct btf_var_secinfo *a = _a;
@@ -539,6 +739,12 @@ static int btf_fixup_datasec(struct bpf_object *obj, struct btf *btf,
return -ENOENT;
}
/* .extern datasec size and var offsets were set correctly during
* extern collection step, so just skip straight to sorting variables
*/
if (t->size)
goto sort_vars;
ret = bpf_object__section_size(obj, name, &size);
if (ret || !size || (t->size && t->size != size)) {
pr_debug("Invalid size for section %s: %u bytes\n", name, size);
@@ -575,7 +781,8 @@ static int btf_fixup_datasec(struct bpf_object *obj, struct btf *btf,
vsi->offset = off;
}
qsort(t + 1, vars, sizeof(*vsi), compare_vsi_off);
sort_vars:
qsort(btf_var_secinfos(t), vars, sizeof(*vsi), compare_vsi_off);
return 0;
}
@@ -604,22 +811,32 @@ int btf__finalize_data(struct bpf_object *obj, struct btf *btf)
int btf__load(struct btf *btf)
{
__u32 log_buf_size = BPF_LOG_BUF_SIZE;
__u32 log_buf_size = 0;
char *log_buf = NULL;
int err = 0;
if (btf->fd >= 0)
return -EEXIST;
log_buf = malloc(log_buf_size);
if (!log_buf)
return -ENOMEM;
retry_load:
if (log_buf_size) {
log_buf = malloc(log_buf_size);
if (!log_buf)
return -ENOMEM;
*log_buf = 0;
*log_buf = 0;
}
btf->fd = bpf_load_btf(btf->data, btf->data_size,
log_buf, log_buf_size, false);
if (btf->fd < 0) {
if (!log_buf || errno == ENOSPC) {
log_buf_size = max((__u32)BPF_LOG_BUF_SIZE,
log_buf_size << 1);
free(log_buf);
goto retry_load;
}
err = -errno;
pr_warn("Error loading BTF: %s(%d)\n", strerror(errno), errno);
if (*log_buf)
@@ -637,6 +854,11 @@ int btf__fd(const struct btf *btf)
return btf->fd;
}
void btf__set_fd(struct btf *btf, int fd)
{
btf->fd = fd;
}
const void *btf__get_raw_data(const struct btf *btf, __u32 *size)
{
*size = btf->data_size;
@@ -957,7 +1179,7 @@ static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
void btf_ext__free(struct btf_ext *btf_ext)
{
if (!btf_ext)
if (IS_ERR_OR_NULL(btf_ext))
return;
free(btf_ext->data);
free(btf_ext);
@@ -1352,7 +1574,7 @@ static int btf_dedup_hypot_map_add(struct btf_dedup *d,
if (d->hypot_cnt == d->hypot_cap) {
__u32 *new_list;
d->hypot_cap += max(16, d->hypot_cap / 2);
d->hypot_cap += max((size_t)16, d->hypot_cap / 2);
new_list = realloc(d->hypot_list, sizeof(__u32) * d->hypot_cap);
if (!new_list)
return -ENOMEM;
@@ -1648,7 +1870,7 @@ static int btf_dedup_strings(struct btf_dedup *d)
if (strs.cnt + 1 > strs.cap) {
struct btf_str_ptr *new_ptrs;
strs.cap += max(strs.cnt / 2, 16);
strs.cap += max(strs.cnt / 2, 16U);
new_ptrs = realloc(strs.ptrs,
sizeof(strs.ptrs[0]) * strs.cap);
if (!new_ptrs) {
@@ -2882,3 +3104,54 @@ static int btf_dedup_remap_types(struct btf_dedup *d)
}
return 0;
}
/*
* Probe a few well-known locations for the vmlinux kernel image and try to load
* data out of it to use for target BTF.
*/
struct btf *libbpf_find_kernel_btf(void)
{
struct {
const char *path_fmt;
bool raw_btf;
} locations[] = {
/* try canonical vmlinux BTF through sysfs first */
{ "/sys/kernel/btf/vmlinux", true /* raw BTF */ },
/* fall back to trying to find vmlinux ELF on disk otherwise */
{ "/boot/vmlinux-%1$s" },
{ "/lib/modules/%1$s/vmlinux-%1$s" },
{ "/lib/modules/%1$s/build/vmlinux" },
{ "/usr/lib/modules/%1$s/kernel/vmlinux" },
{ "/usr/lib/debug/boot/vmlinux-%1$s" },
{ "/usr/lib/debug/boot/vmlinux-%1$s.debug" },
{ "/usr/lib/debug/lib/modules/%1$s/vmlinux" },
};
char path[PATH_MAX + 1];
struct utsname buf;
struct btf *btf;
int i;
uname(&buf);
for (i = 0; i < ARRAY_SIZE(locations); i++) {
snprintf(path, PATH_MAX, locations[i].path_fmt, buf.release);
if (access(path, R_OK))
continue;
if (locations[i].raw_btf)
btf = btf__parse_raw(path);
else
btf = btf__parse_elf(path, NULL);
pr_debug("loading kernel BTF '%s': %ld\n",
path, IS_ERR(btf) ? PTR_ERR(btf) : 0);
if (IS_ERR(btf))
continue;
return btf;
}
pr_warn("failed to find valid kernel BTF\n");
return ERR_PTR(-ESRCH);
}
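Since the function returns either a loaded struct btf or an ERR_PTR, a typical lookup might go as follows (the wrapper name and looked-up type are assumptions):
__s32 find_kernel_struct_id(const char *name)
{
	struct btf *vmlinux_btf = libbpf_find_kernel_btf();
	__s32 id;

	if (libbpf_get_error(vmlinux_btf))
		return -ESRCH;
	id = btf__find_by_name_kind(vmlinux_btf, name, BTF_KIND_STRUCT);
	btf__free(vmlinux_btf);
	return id;
}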


@@ -8,14 +8,12 @@
#include <linux/btf.h>
#include <linux/types.h>
#include "libbpf_common.h"
#ifdef __cplusplus
extern "C" {
#endif
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
#endif
#define BTF_ELF_SEC ".BTF"
#define BTF_EXT_ELF_SEC ".BTF.ext"
#define MAPS_ELF_SEC ".maps"
@@ -65,9 +63,10 @@ struct btf_ext_header {
};
LIBBPF_API void btf__free(struct btf *btf);
LIBBPF_API struct btf *btf__new(__u8 *data, __u32 size);
LIBBPF_API struct btf *btf__parse_elf(const char *path,
struct btf_ext **btf_ext);
LIBBPF_API struct btf *btf__new(const void *data, __u32 size);
LIBBPF_API struct btf *btf__parse(const char *path, struct btf_ext **btf_ext);
LIBBPF_API struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext);
LIBBPF_API struct btf *btf__parse_raw(const char *path);
LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf);
LIBBPF_API int btf__load(struct btf *btf);
LIBBPF_API __s32 btf__find_by_name(const struct btf *btf,
@@ -77,9 +76,13 @@ LIBBPF_API __s32 btf__find_by_name_kind(const struct btf *btf,
LIBBPF_API __u32 btf__get_nr_types(const struct btf *btf);
LIBBPF_API const struct btf_type *btf__type_by_id(const struct btf *btf,
__u32 id);
LIBBPF_API size_t btf__pointer_size(const struct btf *btf);
LIBBPF_API int btf__set_pointer_size(struct btf *btf, size_t ptr_sz);
LIBBPF_API __s64 btf__resolve_size(const struct btf *btf, __u32 type_id);
LIBBPF_API int btf__resolve_type(const struct btf *btf, __u32 type_id);
LIBBPF_API int btf__align_of(const struct btf *btf, __u32 id);
LIBBPF_API int btf__fd(const struct btf *btf);
LIBBPF_API void btf__set_fd(struct btf *btf, int fd);
LIBBPF_API const void *btf__get_raw_data(const struct btf *btf, __u32 *size);
LIBBPF_API const char *btf__name_by_offset(const struct btf *btf, __u32 offset);
LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);
@@ -103,6 +106,8 @@ LIBBPF_API int btf_ext__reloc_line_info(const struct btf *btf,
LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
LIBBPF_API struct btf *libbpf_find_kernel_btf(void);
struct btf_dedup_opts {
unsigned int dedup_table_size;
bool dont_resolve_fwds;
@@ -127,6 +132,30 @@ LIBBPF_API void btf_dump__free(struct btf_dump *d);
LIBBPF_API int btf_dump__dump_type(struct btf_dump *d, __u32 id);
struct btf_dump_emit_type_decl_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
/* optional field name for type declaration, e.g.:
* - struct my_struct <FNAME>
* - void (*<FNAME>)(int)
* - char (*<FNAME>)[123]
*/
const char *field_name;
/* extra indentation level (in number of tabs) to emit for multi-line
* type declarations (e.g., anonymous struct); applies for lines
* starting from the second one (first line is assumed to have
* necessary indentation already
*/
int indent_level;
/* strip all the const/volatile/restrict mods */
bool strip_mods;
};
#define btf_dump_emit_type_decl_opts__last_field strip_mods
LIBBPF_API int
btf_dump__emit_type_decl(struct btf_dump *d, __u32 id,
const struct btf_dump_emit_type_decl_opts *opts);
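Putting the opts struct to work, a caller might render a named declaration such as "struct foo *my_var" like so (the wrapper name, field name, and type_id are assumptions):
int emit_decl(struct btf_dump *d, __u32 type_id)
{
	DECLARE_LIBBPF_OPTS(btf_dump_emit_type_decl_opts, opts,
		.field_name = "my_var",
		.indent_level = 1,
	);

	return btf_dump__emit_type_decl(d, type_id, &opts);
}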
/*
* A set of helpers for easier BTF types handling
*/
@@ -145,6 +174,11 @@ static inline bool btf_kflag(const struct btf_type *t)
return BTF_INFO_KFLAG(t->info);
}
static inline bool btf_is_void(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_UNKN;
}
static inline bool btf_is_int(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_INT;


@@ -13,11 +13,15 @@
#include <errno.h>
#include <linux/err.h>
#include <linux/btf.h>
#include <linux/kernel.h>
#include "btf.h"
#include "hashmap.h"
#include "libbpf.h"
#include "libbpf_internal.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
static const char PREFIXES[] = "\t\t\t\t\t\t\t\t\t\t\t\t\t";
static const size_t PREFIX_CNT = sizeof(PREFIXES) - 1;
@@ -57,6 +61,8 @@ struct btf_dump {
const struct btf_ext *btf_ext;
btf_dump_printf_fn_t printf_fn;
struct btf_dump_opts opts;
int ptr_sz;
bool strip_mods;
/* per-type auxiliary state */
struct btf_dump_type_aux_state *type_states;
@@ -116,6 +122,8 @@ static void btf_dump_printf(const struct btf_dump *d, const char *fmt, ...)
va_end(args);
}
static int btf_dump_mark_referenced(struct btf_dump *d);
struct btf_dump *btf_dump__new(const struct btf *btf,
const struct btf_ext *btf_ext,
const struct btf_dump_opts *opts,
@@ -132,30 +140,53 @@ struct btf_dump *btf_dump__new(const struct btf *btf,
d->btf_ext = btf_ext;
d->printf_fn = printf_fn;
d->opts.ctx = opts ? opts->ctx : NULL;
d->ptr_sz = btf__pointer_size(btf) ? : sizeof(void *);
d->type_names = hashmap__new(str_hash_fn, str_equal_fn, NULL);
if (IS_ERR(d->type_names)) {
err = PTR_ERR(d->type_names);
d->type_names = NULL;
btf_dump__free(d);
return ERR_PTR(err);
goto err;
}
d->ident_names = hashmap__new(str_hash_fn, str_equal_fn, NULL);
if (IS_ERR(d->ident_names)) {
err = PTR_ERR(d->ident_names);
d->ident_names = NULL;
btf_dump__free(d);
return ERR_PTR(err);
goto err;
}
d->type_states = calloc(1 + btf__get_nr_types(d->btf),
sizeof(d->type_states[0]));
if (!d->type_states) {
err = -ENOMEM;
goto err;
}
d->cached_names = calloc(1 + btf__get_nr_types(d->btf),
sizeof(d->cached_names[0]));
if (!d->cached_names) {
err = -ENOMEM;
goto err;
}
/* VOID is special */
d->type_states[0].order_state = ORDERED;
d->type_states[0].emit_state = EMITTED;
/* eagerly determine referenced types for anon enums */
err = btf_dump_mark_referenced(d);
if (err)
goto err;
return d;
err:
btf_dump__free(d);
return ERR_PTR(err);
}
void btf_dump__free(struct btf_dump *d)
{
int i, cnt;
if (!d)
if (IS_ERR_OR_NULL(d))
return;
free(d->type_states);
@@ -175,7 +206,6 @@ void btf_dump__free(struct btf_dump *d)
free(d);
}
static int btf_dump_mark_referenced(struct btf_dump *d);
static int btf_dump_order_type(struct btf_dump *d, __u32 id, bool through_ptr);
static void btf_dump_emit_type(struct btf_dump *d, __u32 id, __u32 cont_id);
@@ -202,27 +232,6 @@ int btf_dump__dump_type(struct btf_dump *d, __u32 id)
if (id > btf__get_nr_types(d->btf))
return -EINVAL;
/* type states are lazily allocated, as they might not be needed */
if (!d->type_states) {
d->type_states = calloc(1 + btf__get_nr_types(d->btf),
sizeof(d->type_states[0]));
if (!d->type_states)
return -ENOMEM;
d->cached_names = calloc(1 + btf__get_nr_types(d->btf),
sizeof(d->cached_names[0]));
if (!d->cached_names)
return -ENOMEM;
/* VOID is special */
d->type_states[0].order_state = ORDERED;
d->type_states[0].emit_state = EMITTED;
/* eagerly determine referenced types for anon enums */
err = btf_dump_mark_referenced(d);
if (err)
return err;
}
d->emit_queue_cnt = 0;
err = btf_dump_order_type(d, id, false);
if (err < 0)
@@ -543,6 +552,9 @@ static int btf_dump_order_type(struct btf_dump *d, __u32 id, bool through_ptr)
}
}
static void btf_dump_emit_missing_aliases(struct btf_dump *d, __u32 id,
const struct btf_type *t);
static void btf_dump_emit_struct_fwd(struct btf_dump *d, __u32 id,
const struct btf_type *t);
static void btf_dump_emit_struct_def(struct btf_dump *d, __u32 id,
@@ -653,7 +665,7 @@ static void btf_dump_emit_type(struct btf_dump *d, __u32 id, __u32 cont_id)
if (!btf_dump_is_blacklisted(d, id)) {
btf_dump_emit_typedef_def(d, id, t, 0);
btf_dump_printf(d, ";\n\n");
};
}
tstate->fwd_emitted = 1;
break;
default:
@@ -665,6 +677,9 @@ static void btf_dump_emit_type(struct btf_dump *d, __u32 id, __u32 cont_id)
switch (kind) {
case BTF_KIND_INT:
/* Emit type alias definitions if necessary */
btf_dump_emit_missing_aliases(d, id, t);
tstate->emit_state = EMITTED;
break;
case BTF_KIND_ENUM:
@@ -752,41 +767,6 @@ static void btf_dump_emit_type(struct btf_dump *d, __u32 id, __u32 cont_id)
}
}
static int btf_align_of(const struct btf *btf, __u32 id)
{
const struct btf_type *t = btf__type_by_id(btf, id);
__u16 kind = btf_kind(t);
switch (kind) {
case BTF_KIND_INT:
case BTF_KIND_ENUM:
return min(sizeof(void *), t->size);
case BTF_KIND_PTR:
return sizeof(void *);
case BTF_KIND_TYPEDEF:
case BTF_KIND_VOLATILE:
case BTF_KIND_CONST:
case BTF_KIND_RESTRICT:
return btf_align_of(btf, t->type);
case BTF_KIND_ARRAY:
return btf_align_of(btf, btf_array(t)->type);
case BTF_KIND_STRUCT:
case BTF_KIND_UNION: {
const struct btf_member *m = btf_members(t);
__u16 vlen = btf_vlen(t);
int i, align = 1;
for (i = 0; i < vlen; i++, m++)
align = max(align, btf_align_of(btf, m->type));
return align;
}
default:
pr_warn("unsupported BTF_KIND:%u\n", btf_kind(t));
return 1;
}
}
static bool btf_is_struct_packed(const struct btf *btf, __u32 id,
const struct btf_type *t)
{
@@ -794,18 +774,18 @@ static bool btf_is_struct_packed(const struct btf *btf, __u32 id,
int align, i, bit_sz;
__u16 vlen;
align = btf_align_of(btf, id);
align = btf__align_of(btf, id);
/* size of a non-packed struct has to be a multiple of its alignment */
if (t->size % align)
if (align && t->size % align)
return true;
m = btf_members(t);
vlen = btf_vlen(t);
/* all non-bitfield fields have to be naturally aligned */
for (i = 0; i < vlen; i++, m++) {
align = btf_align_of(btf, m->type);
align = btf__align_of(btf, m->type);
bit_sz = btf_member_bitfield_size(t, i);
if (bit_sz == 0 && m->offset % (8 * align) != 0)
if (align && bit_sz == 0 && m->offset % (8 * align) != 0)
return true;
}
@@ -826,7 +806,7 @@ static void btf_dump_emit_bit_padding(const struct btf_dump *d,
int align, int lvl)
{
int off_diff = m_off - cur_off;
int ptr_bits = sizeof(void *) * 8;
int ptr_bits = d->ptr_sz * 8;
if (off_diff <= 0)
/* no gap */
@@ -889,7 +869,7 @@ static void btf_dump_emit_struct_def(struct btf_dump *d,
fname = btf_name_of(d, m->name_off);
m_sz = btf_member_bitfield_size(t, i);
m_off = btf_member_bit_offset(t, i);
align = packed ? 1 : btf_align_of(d->btf, m->type);
align = packed ? 1 : btf__align_of(d->btf, m->type);
btf_dump_emit_bit_padding(d, off, m_off, m_sz, align, lvl + 1);
btf_dump_printf(d, "\n%s", pfx(lvl + 1));
@@ -899,7 +879,7 @@ static void btf_dump_emit_struct_def(struct btf_dump *d,
btf_dump_printf(d, ": %d", m_sz);
off = m_off + m_sz;
} else {
m_sz = max(0, btf__resolve_size(d->btf, m->type));
m_sz = max((__s64)0, btf__resolve_size(d->btf, m->type));
off = m_off + m_sz * 8;
}
btf_dump_printf(d, ";");
@@ -907,7 +887,7 @@ static void btf_dump_emit_struct_def(struct btf_dump *d,
/* pad at the end, if necessary */
if (is_struct) {
align = packed ? 1 : btf_align_of(d->btf, id);
align = packed ? 1 : btf__align_of(d->btf, id);
btf_dump_emit_bit_padding(d, off, t->size * 8, 0, align,
lvl + 1);
}
@@ -919,6 +899,32 @@ static void btf_dump_emit_struct_def(struct btf_dump *d,
btf_dump_printf(d, " __attribute__((packed))");
}
static const char *missing_base_types[][2] = {
/*
* GCC emits typedefs to its internal __PolyX_t types when compiling Arm
* SIMD intrinsics. Alias them to standard base types.
*/
{ "__Poly8_t", "unsigned char" },
{ "__Poly16_t", "unsigned short" },
{ "__Poly64_t", "unsigned long long" },
{ "__Poly128_t", "unsigned __int128" },
};
static void btf_dump_emit_missing_aliases(struct btf_dump *d, __u32 id,
const struct btf_type *t)
{
const char *name = btf_dump_type_name(d, id);
int i;
for (i = 0; i < ARRAY_SIZE(missing_base_types); i++) {
if (strcmp(name, missing_base_types[i][0]) == 0) {
btf_dump_printf(d, "typedef %s %s;\n\n",
missing_base_types[i][1], name);
break;
}
}
}
static void btf_dump_emit_enum_fwd(struct btf_dump *d, __u32 id,
const struct btf_type *t)
{
@@ -946,13 +952,13 @@ static void btf_dump_emit_enum_def(struct btf_dump *d, __u32 id,
/* enumerators share namespace with typedef idents */
dup_cnt = btf_dump_name_dups(d, d->ident_names, name);
if (dup_cnt > 1) {
btf_dump_printf(d, "\n%s%s___%zu = %d,",
btf_dump_printf(d, "\n%s%s___%zu = %u,",
pfx(lvl + 1), name, dup_cnt,
(__s32)v->val);
(__u32)v->val);
} else {
btf_dump_printf(d, "\n%s%s = %d,",
btf_dump_printf(d, "\n%s%s = %u,",
pfx(lvl + 1), name,
(__s32)v->val);
(__u32)v->val);
}
}
btf_dump_printf(d, "\n%s}", pfx(lvl));
@@ -1051,6 +1057,23 @@ static int btf_dump_push_decl_stack_id(struct btf_dump *d, __u32 id)
* of a stack frame. Some care is required to "pop" stack frames after
* processing type declaration chain.
*/
int btf_dump__emit_type_decl(struct btf_dump *d, __u32 id,
const struct btf_dump_emit_type_decl_opts *opts)
{
const char *fname;
int lvl;
if (!OPTS_VALID(opts, btf_dump_emit_type_decl_opts))
return -EINVAL;
fname = OPTS_GET(opts, field_name, "");
lvl = OPTS_GET(opts, indent_level, 0);
d->strip_mods = OPTS_GET(opts, strip_mods, false);
btf_dump_emit_type_decl(d, id, fname, lvl);
d->strip_mods = false;
return 0;
}
static void btf_dump_emit_type_decl(struct btf_dump *d, __u32 id,
const char *fname, int lvl)
{
@@ -1060,6 +1083,10 @@ static void btf_dump_emit_type_decl(struct btf_dump *d, __u32 id,
stack_start = d->decl_stack_cnt;
for (;;) {
t = btf__type_by_id(d->btf, id);
if (d->strip_mods && btf_is_mod(t))
goto skip_mod;
err = btf_dump_push_decl_stack_id(d, id);
if (err < 0) {
/*
@@ -1071,12 +1098,11 @@ static void btf_dump_emit_type_decl(struct btf_dump *d, __u32 id,
d->decl_stack_cnt = stack_start;
return;
}
skip_mod:
/* VOID */
if (id == 0)
break;
t = btf__type_by_id(d->btf, id);
switch (btf_kind(t)) {
case BTF_KIND_PTR:
case BTF_KIND_VOLATILE:
@@ -1152,6 +1178,20 @@ static void btf_dump_emit_mods(struct btf_dump *d, struct id_stack *decl_stack)
}
}
static void btf_dump_drop_mods(struct btf_dump *d, struct id_stack *decl_stack)
{
const struct btf_type *t;
__u32 id;
while (decl_stack->cnt) {
id = decl_stack->ids[decl_stack->cnt - 1];
t = btf__type_by_id(d->btf, id);
if (!btf_is_mod(t))
return;
decl_stack->cnt--;
}
}
static void btf_dump_emit_name(const struct btf_dump *d,
const char *name, bool last_was_ptr)
{
@@ -1250,14 +1290,7 @@ static void btf_dump_emit_type_chain(struct btf_dump *d,
* a const/volatile modifier for array, so we are
* going to silently skip them here.
*/
while (decls->cnt) {
next_id = decls->ids[decls->cnt - 1];
next_t = btf__type_by_id(d->btf, next_id);
if (btf_is_mod(next_t))
decls->cnt--;
else
break;
}
btf_dump_drop_mods(d, decls);
if (decls->cnt == 0) {
btf_dump_emit_name(d, fname, last_was_ptr);
@@ -1285,7 +1318,15 @@ static void btf_dump_emit_type_chain(struct btf_dump *d,
__u16 vlen = btf_vlen(t);
int i;
btf_dump_emit_mods(d, decls);
/*
* GCC emits extra volatile qualifier for
* __attribute__((noreturn)) function pointers. Clang
* doesn't do it. It's a GCC quirk for backwards
* compatibility with code written for GCC <2.5. So,
* similarly to extra qualifiers for array, just drop
* them, instead of handling them.
*/
btf_dump_drop_mods(d, decls);
if (decls->cnt) {
btf_dump_printf(d, " (");
btf_dump_emit_type_chain(d, decls, fname, lvl);


@@ -12,6 +12,9 @@
#include <linux/err.h>
#include "hashmap.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
/* start with 4 buckets */
#define HASHMAP_MIN_CAP_BITS 2
@@ -56,7 +59,14 @@ struct hashmap *hashmap__new(hashmap_hash_fn hash_fn,
void hashmap__clear(struct hashmap *map)
{
struct hashmap_entry *cur, *tmp;
size_t bkt;
hashmap__for_each_entry_safe(map, cur, tmp, bkt) {
free(cur);
}
free(map->buckets);
map->buckets = NULL;
map->cap = map->cap_bits = map->sz = 0;
}
@@ -90,8 +100,7 @@ static int hashmap_grow(struct hashmap *map)
struct hashmap_entry **new_buckets;
struct hashmap_entry *cur, *tmp;
size_t new_cap_bits, new_cap;
size_t h;
int bkt;
size_t h, bkt;
new_cap_bits = map->cap_bits + 1;
if (new_cap_bits < HASHMAP_MIN_CAP_BITS)


@@ -10,17 +10,19 @@
#include <stdbool.h>
#include <stddef.h>
#ifdef __GLIBC__
#include <bits/wordsize.h>
#else
#include <bits/reg.h>
#endif
#include "libbpf_internal.h"
#include <limits.h>
static inline size_t hash_bits(size_t h, int bits)
{
/* shuffle bits and return requested number of upper bits */
return (h * 11400714819323198485llu) >> (__WORDSIZE - bits);
#if (__SIZEOF_SIZE_T__ == __SIZEOF_LONG_LONG__)
/* LP64 case */
return (h * 11400714819323198485llu) >> (__SIZEOF_LONG_LONG__ * 8 - bits);
#elif (__SIZEOF_SIZE_T__ <= __SIZEOF_LONG__)
return (h * 2654435769lu) >> (__SIZEOF_LONG__ * 8 - bits);
#else
# error "Unsupported size_t size"
#endif
}
typedef size_t (*hashmap_hash_fn)(const void *key, void *ctx);

File diff suppressed because it is too large.


@@ -17,14 +17,12 @@
#include <sys/types.h> // for size_t
#include <linux/bpf.h>
#include "libbpf_common.h"
#ifdef __cplusplus
extern "C" {
#endif
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
#endif
enum libbpf_errno {
__LIBBPF_ERRNO__START = 4000,
@@ -67,28 +65,6 @@ struct bpf_object_open_attr {
enum bpf_prog_type prog_type;
};
/* Helper macro to declare and initialize libbpf options struct
*
* This dance with an uninitialized declaration, followed by a memset to
* zero, followed by assignment using compound literal syntax, is done to
* preserve the ability to use a nice struct field initialization syntax and
* **hopefully** have all the padding bytes initialized to zero. It's not
* guaranteed, though, that when copying the literal the compiler won't copy
* garbage into the literal's padding bytes, but that's the best way I've
* found and it seems to work in practice.
*
* The macro declares an opts struct of the given type and name,
* zero-initializes it (including any extra padding) with memset(), and then
* assigns the initial values provided by the user as varargs in
* struct-initializer syntax.
*/
#define DECLARE_LIBBPF_OPTS(TYPE, NAME, ...) \
struct TYPE NAME = ({ \
memset(&NAME, 0, sizeof(struct TYPE)); \
(struct TYPE) { \
.sz = sizeof(struct TYPE), \
__VA_ARGS__ \
}; \
})
struct bpf_object_open_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
@@ -101,7 +77,11 @@ struct bpf_object_open_opts {
const char *object_name;
/* parse map definitions non-strictly, allowing extra attributes/data */
bool relaxed_maps;
/* process CO-RE relocations non-strictly, allowing them to fail */
/* DEPRECATED: handle CO-RE relocations non-strictly, allowing failures.
* Value is ignored. Relocations are always processed non-strictly.
* Non-relocatable instructions are replaced with invalid ones to
* prevent accidental errors.
*/
bool relaxed_core_relocs;
/* maps that set the 'pinning' attribute in their definition will have
* their pin_path attribute set to a file in this directory, and be
@@ -109,15 +89,19 @@ struct bpf_object_open_opts {
*/
const char *pin_root_path;
__u32 attach_prog_fd;
/* Additional kernel config content that augments and overrides
* system Kconfig for CONFIG_xxx externs.
*/
const char *kconfig;
};
#define bpf_object_open_opts__last_field attach_prog_fd
#define bpf_object_open_opts__last_field kconfig
LIBBPF_API struct bpf_object *bpf_object__open(const char *path);
LIBBPF_API struct bpf_object *
bpf_object__open_file(const char *path, struct bpf_object_open_opts *opts);
bpf_object__open_file(const char *path, const struct bpf_object_open_opts *opts);
LIBBPF_API struct bpf_object *
bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz,
struct bpf_object_open_opts *opts);
const struct bpf_object_open_opts *opts);
/* deprecated bpf_object__open variants */
LIBBPF_API struct bpf_object *
@@ -126,11 +110,6 @@ bpf_object__open_buffer(const void *obj_buf, size_t obj_buf_sz,
LIBBPF_API struct bpf_object *
bpf_object__open_xattr(struct bpf_object_open_attr *attr);
int bpf_object__section_size(const struct bpf_object *obj, const char *name,
__u32 *size);
int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
__u32 *off);
enum libbpf_pin_type {
LIBBPF_PIN_NONE,
/* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
@@ -161,6 +140,7 @@ struct bpf_object_load_attr {
LIBBPF_API int bpf_object__load(struct bpf_object *obj);
LIBBPF_API int bpf_object__load_xattr(struct bpf_object_load_attr *attr);
LIBBPF_API int bpf_object__unload(struct bpf_object *obj);
LIBBPF_API const char *bpf_object__name(const struct bpf_object *obj);
LIBBPF_API unsigned int bpf_object__kversion(const struct bpf_object *obj);
@@ -171,6 +151,9 @@ LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj);
LIBBPF_API struct bpf_program *
bpf_object__find_program_by_title(const struct bpf_object *obj,
const char *title);
LIBBPF_API struct bpf_program *
bpf_object__find_program_by_name(const struct bpf_object *obj,
const char *name);
LIBBPF_API struct bpf_object *bpf_object__next(struct bpf_object *prev);
#define bpf_object__for_each_safe(pos, tmp) \
@@ -214,8 +197,11 @@ LIBBPF_API void *bpf_program__priv(const struct bpf_program *prog);
LIBBPF_API void bpf_program__set_ifindex(struct bpf_program *prog,
__u32 ifindex);
LIBBPF_API const char *bpf_program__name(const struct bpf_program *prog);
LIBBPF_API const char *bpf_program__title(const struct bpf_program *prog,
bool needs_copy);
LIBBPF_API bool bpf_program__autoload(const struct bpf_program *prog);
LIBBPF_API int bpf_program__set_autoload(struct bpf_program *prog, bool autoload);
/* returns program size in bytes */
LIBBPF_API size_t bpf_program__size(const struct bpf_program *prog);
@@ -235,8 +221,19 @@ LIBBPF_API void bpf_program__unload(struct bpf_program *prog);
struct bpf_link;
LIBBPF_API struct bpf_link *bpf_link__open(const char *path);
LIBBPF_API int bpf_link__fd(const struct bpf_link *link);
LIBBPF_API const char *bpf_link__pin_path(const struct bpf_link *link);
LIBBPF_API int bpf_link__pin(struct bpf_link *link, const char *path);
LIBBPF_API int bpf_link__unpin(struct bpf_link *link);
LIBBPF_API int bpf_link__update_program(struct bpf_link *link,
struct bpf_program *prog);
LIBBPF_API void bpf_link__disconnect(struct bpf_link *link);
LIBBPF_API int bpf_link__detach(struct bpf_link *link);
LIBBPF_API int bpf_link__destroy(struct bpf_link *link);
LIBBPF_API struct bpf_link *
bpf_program__attach(struct bpf_program *prog);
LIBBPF_API struct bpf_link *
bpf_program__attach_perf_event(struct bpf_program *prog, int pfd);
LIBBPF_API struct bpf_link *
@@ -253,9 +250,32 @@ bpf_program__attach_tracepoint(struct bpf_program *prog,
LIBBPF_API struct bpf_link *
bpf_program__attach_raw_tracepoint(struct bpf_program *prog,
const char *tp_name);
LIBBPF_API struct bpf_link *
bpf_program__attach_trace(struct bpf_program *prog);
LIBBPF_API struct bpf_link *
bpf_program__attach_lsm(struct bpf_program *prog);
LIBBPF_API struct bpf_link *
bpf_program__attach_cgroup(struct bpf_program *prog, int cgroup_fd);
LIBBPF_API struct bpf_link *
bpf_program__attach_netns(struct bpf_program *prog, int netns_fd);
LIBBPF_API struct bpf_link *
bpf_program__attach_xdp(struct bpf_program *prog, int ifindex);
struct bpf_map;
LIBBPF_API struct bpf_link *bpf_map__attach_struct_ops(struct bpf_map *map);
struct bpf_iter_attach_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
union bpf_iter_link_info *link_info;
__u32 link_info_len;
};
#define bpf_iter_attach_opts__last_field link_info_len
LIBBPF_API struct bpf_link *
bpf_program__attach_iter(struct bpf_program *prog,
const struct bpf_iter_attach_opts *opts);
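/* A hedged usage sketch for the iterator API above; "prog" is assumed to
 * come from an already-loaded bpf_object, and error handling is minimal. */
static inline int attach_iter_example(struct bpf_program *prog)
{
	struct bpf_link *link;

	link = bpf_program__attach_iter(prog, NULL); /* NULL: default opts */
	if (libbpf_get_error(link))
		return -1;
	/* bpf_iter_create() instantiates a seq_file-like fd to read() from */
	return bpf_iter_create(bpf_link__fd(link));
}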
struct bpf_insn;
/*
@@ -327,11 +347,15 @@ LIBBPF_API int bpf_program__set_socket_filter(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_tracepoint(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_raw_tracepoint(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_kprobe(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_lsm(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_sched_cls(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_sched_act(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_xdp(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_perf_event(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_tracing(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_struct_ops(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_extension(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_sk_lookup(struct bpf_program *prog);
LIBBPF_API enum bpf_prog_type bpf_program__get_type(struct bpf_program *prog);
LIBBPF_API void bpf_program__set_type(struct bpf_program *prog,
@@ -343,15 +367,23 @@ LIBBPF_API void
bpf_program__set_expected_attach_type(struct bpf_program *prog,
enum bpf_attach_type type);
LIBBPF_API int
bpf_program__set_attach_target(struct bpf_program *prog, int attach_prog_fd,
const char *attach_func_name);
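/* A hedged sketch of retargeting a tracing program before load; passing 0 as
 * attach_prog_fd directs the lookup at kernel (vmlinux) BTF, and the kernel
 * function name is purely illustrative. */
static inline int retarget_example(struct bpf_program *prog)
{
	return bpf_program__set_attach_target(prog, 0 /* vmlinux */,
					      "tcp_v4_connect");
}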
LIBBPF_API bool bpf_program__is_socket_filter(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_tracepoint(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_raw_tracepoint(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_kprobe(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_lsm(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sched_cls(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sched_act(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_xdp(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_perf_event(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_tracing(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_struct_ops(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_extension(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sk_lookup(const struct bpf_program *prog);
/*
* No need for __attribute__((packed)), all members of 'bpf_map_def'
@@ -371,7 +403,6 @@ struct bpf_map_def {
* The 'struct bpf_map' in include/linux/bpf.h is internal to the kernel,
* so no need to worry about a name clash.
*/
struct bpf_map;
LIBBPF_API struct bpf_map *
bpf_object__find_map_by_name(const struct bpf_object *obj, const char *name);
@@ -396,21 +427,47 @@ bpf_map__next(const struct bpf_map *map, const struct bpf_object *obj);
LIBBPF_API struct bpf_map *
bpf_map__prev(const struct bpf_map *map, const struct bpf_object *obj);
/* get/set map FD */
LIBBPF_API int bpf_map__fd(const struct bpf_map *map);
LIBBPF_API int bpf_map__reuse_fd(struct bpf_map *map, int fd);
/* get map definition */
LIBBPF_API const struct bpf_map_def *bpf_map__def(const struct bpf_map *map);
/* get map name */
LIBBPF_API const char *bpf_map__name(const struct bpf_map *map);
/* get/set map type */
LIBBPF_API enum bpf_map_type bpf_map__type(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_type(struct bpf_map *map, enum bpf_map_type type);
/* get/set map size (max_entries) */
LIBBPF_API __u32 bpf_map__max_entries(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_max_entries(struct bpf_map *map, __u32 max_entries);
LIBBPF_API int bpf_map__resize(struct bpf_map *map, __u32 max_entries);
/* get/set map flags */
LIBBPF_API __u32 bpf_map__map_flags(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_map_flags(struct bpf_map *map, __u32 flags);
/* get/set map NUMA node */
LIBBPF_API __u32 bpf_map__numa_node(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_numa_node(struct bpf_map *map, __u32 numa_node);
/* get/set map key size */
LIBBPF_API __u32 bpf_map__key_size(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_key_size(struct bpf_map *map, __u32 size);
/* get/set map value size */
LIBBPF_API __u32 bpf_map__value_size(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_value_size(struct bpf_map *map, __u32 size);
/* get map key/value BTF type IDs */
LIBBPF_API __u32 bpf_map__btf_key_type_id(const struct bpf_map *map);
LIBBPF_API __u32 bpf_map__btf_value_type_id(const struct bpf_map *map);
/* get/set map if_index */
LIBBPF_API __u32 bpf_map__ifindex(const struct bpf_map *map);
LIBBPF_API int bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex);
typedef void (*bpf_map_clear_priv_t)(struct bpf_map *, void *);
LIBBPF_API int bpf_map__set_priv(struct bpf_map *map, void *priv,
bpf_map_clear_priv_t clear_priv);
LIBBPF_API void *bpf_map__priv(const struct bpf_map *map);
LIBBPF_API int bpf_map__reuse_fd(struct bpf_map *map, int fd);
LIBBPF_API int bpf_map__resize(struct bpf_map *map, __u32 max_entries);
LIBBPF_API int bpf_map__set_initial_value(struct bpf_map *map,
const void *data, size_t size);
LIBBPF_API bool bpf_map__is_offload_neutral(const struct bpf_map *map);
LIBBPF_API bool bpf_map__is_internal(const struct bpf_map *map);
LIBBPF_API void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex);
LIBBPF_API int bpf_map__set_pin_path(struct bpf_map *map, const char *path);
LIBBPF_API const char *bpf_map__get_pin_path(const struct bpf_map *map);
LIBBPF_API bool bpf_map__is_pinned(const struct bpf_map *map);
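/* A hedged sketch of the intended flow for the getters/setters above: tweak
 * a map between bpf_object__open() and bpf_object__load(); "obj" and the map
 * name "events" are assumptions. */
static inline int tune_map_example(struct bpf_object *obj)
{
	struct bpf_map *map = bpf_object__find_map_by_name(obj, "events");

	if (!map)
		return -1;
	bpf_map__set_max_entries(map, 1 << 20); /* illustrative size */
	return bpf_map__set_numa_node(map, 0);  /* illustrative NUMA node */
}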
@@ -443,11 +500,40 @@ struct xdp_link_info {
__u8 attach_mode;
};
struct bpf_xdp_set_link_opts {
size_t sz;
int old_fd;
};
#define bpf_xdp_set_link_opts__last_field old_fd
LIBBPF_API int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags);
LIBBPF_API int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags,
const struct bpf_xdp_set_link_opts *opts);
LIBBPF_API int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags);
LIBBPF_API int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info,
size_t info_size, __u32 flags);
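/* A hedged sketch of the opts-based attach: replace the XDP program on
 * ifindex only if old_fd is still the one attached (the implementation sets
 * XDP_FLAGS_REPLACE internally when old_fd is provided); all three
 * parameters are assumptions. */
static inline int xdp_swap_example(int ifindex, int prog_fd, int old_fd)
{
	DECLARE_LIBBPF_OPTS(bpf_xdp_set_link_opts, opts,
		.old_fd = old_fd,
	);

	return bpf_set_link_xdp_fd_opts(ifindex, prog_fd, 0, &opts);
}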
/* Ring buffer APIs */
struct ring_buffer;
typedef int (*ring_buffer_sample_fn)(void *ctx, void *data, size_t size);
struct ring_buffer_opts {
size_t sz; /* size of this struct, for forward/backward compatibility */
};
#define ring_buffer_opts__last_field sz
LIBBPF_API struct ring_buffer *
ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, void *ctx,
const struct ring_buffer_opts *opts);
LIBBPF_API void ring_buffer__free(struct ring_buffer *rb);
LIBBPF_API int ring_buffer__add(struct ring_buffer *rb, int map_fd,
ring_buffer_sample_fn sample_cb, void *ctx);
LIBBPF_API int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms);
LIBBPF_API int ring_buffer__consume(struct ring_buffer *rb);
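/* A hedged usage sketch tying the ring buffer API together; handle_event and
 * map_fd are assumptions, and error handling is minimal. */
static int handle_event(void *ctx, void *data, size_t size)
{
	return 0; /* non-zero would abort consumption and be propagated */
}

static inline void ringbuf_example(int map_fd)
{
	struct ring_buffer *rb;

	rb = ring_buffer__new(map_fd, handle_event, NULL, NULL /* opts */);
	if (!rb)
		return;
	while (ring_buffer__poll(rb, 100 /* timeout, ms */) >= 0)
		; /* each committed sample invokes handle_event() once */
	ring_buffer__free(rb);
}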
/* Perf buffer APIs */
struct perf_buffer;
typedef void (*perf_buffer_sample_fn)(void *ctx, int cpu,
@@ -503,6 +589,7 @@ perf_buffer__new_raw(int map_fd, size_t page_cnt,
LIBBPF_API void perf_buffer__free(struct perf_buffer *pb);
LIBBPF_API int perf_buffer__poll(struct perf_buffer *pb, int timeout_ms);
LIBBPF_API int perf_buffer__consume(struct perf_buffer *pb);
typedef enum bpf_perf_event_ret
(*bpf_perf_event_print_t)(struct perf_event_header *hdr,
@@ -512,18 +599,6 @@ bpf_perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size,
void **copy_mem, size_t *copy_size,
bpf_perf_event_print_t fn, void *private_data);
struct nlattr;
typedef int (*libbpf_dump_nlmsg_t)(void *cookie, void *msg, struct nlattr **tb);
int libbpf_netlink_open(unsigned int *nl_pid);
int libbpf_nl_get_link(int sock, unsigned int nl_pid,
libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie);
int libbpf_nl_get_class(int sock, unsigned int nl_pid, int ifindex,
libbpf_dump_nlmsg_t dump_class_nlmsg, void *cookie);
int libbpf_nl_get_qdisc(int sock, unsigned int nl_pid, int ifindex,
libbpf_dump_nlmsg_t dump_qdisc_nlmsg, void *cookie);
int libbpf_nl_get_filter(int sock, unsigned int nl_pid, int ifindex, int handle,
libbpf_dump_nlmsg_t dump_filter_nlmsg, void *cookie);
struct bpf_prog_linfo;
struct bpf_prog_info;
@@ -550,6 +625,7 @@ LIBBPF_API bool bpf_probe_prog_type(enum bpf_prog_type prog_type,
LIBBPF_API bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex);
LIBBPF_API bool bpf_probe_helper(enum bpf_func_id id,
enum bpf_prog_type prog_type, __u32 ifindex);
LIBBPF_API bool bpf_probe_large_insn_limit(__u32 ifindex);
/*
* Get bpf_prog_info in continuous memory
@@ -630,6 +706,50 @@ bpf_program__bpil_offs_to_addr(struct bpf_prog_info_linear *info_linear);
*/
LIBBPF_API int libbpf_num_possible_cpus(void);
struct bpf_map_skeleton {
const char *name;
struct bpf_map **map;
void **mmaped;
};
struct bpf_prog_skeleton {
const char *name;
struct bpf_program **prog;
struct bpf_link **link;
};
struct bpf_object_skeleton {
size_t sz; /* size of this struct, for forward/backward compatibility */
const char *name;
void *data;
size_t data_sz;
struct bpf_object **obj;
int map_cnt;
int map_skel_sz; /* sizeof(struct bpf_map_skeleton) */
struct bpf_map_skeleton *maps;
int prog_cnt;
int prog_skel_sz; /* sizeof(struct bpf_prog_skeleton) */
struct bpf_prog_skeleton *progs;
};
LIBBPF_API int
bpf_object__open_skeleton(struct bpf_object_skeleton *s,
const struct bpf_object_open_opts *opts);
LIBBPF_API int bpf_object__load_skeleton(struct bpf_object_skeleton *s);
LIBBPF_API int bpf_object__attach_skeleton(struct bpf_object_skeleton *s);
LIBBPF_API void bpf_object__detach_skeleton(struct bpf_object_skeleton *s);
LIBBPF_API void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s);
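/* These entry points are normally driven by code generated with
 * "bpftool gen skeleton" rather than called by hand; a hedged sketch,
 * assuming a hypothetical skeleton for an object named "myprog":
 *
 *	struct myprog *skel = myprog__open();
 *	if (!skel)
 *		return -1;
 *	if (myprog__load(skel) || myprog__attach(skel)) {
 *		myprog__destroy(skel);	// wraps bpf_object__destroy_skeleton()
 *		return -1;
 *	}
 */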
enum libbpf_tristate {
TRI_NO = 0,
TRI_YES = 1,
TRI_MODULE = 2,
};
#ifdef __cplusplus
} /* extern "C" */
#endif


@@ -208,3 +208,94 @@ LIBBPF_0.0.6 {
btf__find_by_name_kind;
libbpf_find_vmlinux_btf_id;
} LIBBPF_0.0.5;
LIBBPF_0.0.7 {
global:
btf_dump__emit_type_decl;
bpf_link__disconnect;
bpf_map__attach_struct_ops;
bpf_map_delete_batch;
bpf_map_lookup_and_delete_batch;
bpf_map_lookup_batch;
bpf_map_update_batch;
bpf_object__find_program_by_name;
bpf_object__attach_skeleton;
bpf_object__destroy_skeleton;
bpf_object__detach_skeleton;
bpf_object__load_skeleton;
bpf_object__open_skeleton;
bpf_probe_large_insn_limit;
bpf_prog_attach_xattr;
bpf_program__attach;
bpf_program__name;
bpf_program__is_extension;
bpf_program__is_struct_ops;
bpf_program__set_extension;
bpf_program__set_struct_ops;
btf__align_of;
libbpf_find_kernel_btf;
} LIBBPF_0.0.6;
LIBBPF_0.0.8 {
global:
bpf_link__fd;
bpf_link__open;
bpf_link__pin;
bpf_link__pin_path;
bpf_link__unpin;
bpf_link__update_program;
bpf_link_create;
bpf_link_update;
bpf_map__set_initial_value;
bpf_program__attach_cgroup;
bpf_program__attach_lsm;
bpf_program__is_lsm;
bpf_program__set_attach_target;
bpf_program__set_lsm;
bpf_set_link_xdp_fd_opts;
} LIBBPF_0.0.7;
LIBBPF_0.0.9 {
global:
bpf_enable_stats;
bpf_iter_create;
bpf_link_get_fd_by_id;
bpf_link_get_next_id;
bpf_program__attach_iter;
bpf_program__attach_netns;
perf_buffer__consume;
ring_buffer__add;
ring_buffer__consume;
ring_buffer__free;
ring_buffer__new;
ring_buffer__poll;
} LIBBPF_0.0.8;
LIBBPF_0.1.0 {
global:
bpf_link__detach;
bpf_link_detach;
bpf_map__ifindex;
bpf_map__key_size;
bpf_map__map_flags;
bpf_map__max_entries;
bpf_map__numa_node;
bpf_map__set_key_size;
bpf_map__set_map_flags;
bpf_map__set_max_entries;
bpf_map__set_numa_node;
bpf_map__set_type;
bpf_map__set_value_size;
bpf_map__type;
bpf_map__value_size;
bpf_program__attach_xdp;
bpf_program__autoload;
bpf_program__is_sk_lookup;
bpf_program__set_autoload;
bpf_program__set_sk_lookup;
btf__parse;
btf__parse_raw;
btf__pointer_size;
btf__set_fd;
btf__set_pointer_size;
} LIBBPF_0.0.9;
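Among the LIBBPF_0.1.0 additions above, btf__pointer_size() and btf__set_pointer_size() let an application override the pointer size libbpf inferred for a given BTF. A hedged C sketch; the object path is illustrative:

	struct btf *btf = btf__parse("prog.bpf.o", NULL /* no .BTF.ext needed */);

	if (!libbpf_get_error(btf) && btf__pointer_size(btf) != 8)
		btf__set_pointer_size(btf, 8); /* force 64-bit (BPF) pointers */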


@@ -8,5 +8,5 @@ Name: libbpf
Description: BPF library
Version: @VERSION@
Libs: -L${libdir} -lbpf
Requires.private: libelf
Requires.private: libelf zlib
Cflags: -I${includedir}

src/libbpf_common.h (new file, 40 lines)

@@ -0,0 +1,40 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/*
* Common user-facing libbpf helpers.
*
* Copyright (c) 2019 Facebook
*/
#ifndef __LIBBPF_LIBBPF_COMMON_H
#define __LIBBPF_LIBBPF_COMMON_H
#include <string.h>
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
#endif
/* Helper macro to declare and initialize libbpf options struct
*
* This dance with an uninitialized declaration, followed by a memset to
* zero, followed by assignment using compound literal syntax, is done to
* preserve the ability to use a nice struct field initialization syntax and
* **hopefully** have all the padding bytes initialized to zero. It's not
* guaranteed, though, that when copying the literal the compiler won't copy
* garbage into the literal's padding bytes, but that's the best way I've
* found and it seems to work in practice.
*
* The macro declares an opts struct of the given type and name,
* zero-initializes it (including any extra padding) with memset(), and then
* assigns the initial values provided by the user as varargs in
* struct-initializer syntax.
*/
#define DECLARE_LIBBPF_OPTS(TYPE, NAME, ...) \
struct TYPE NAME = ({ \
memset(&NAME, 0, sizeof(struct TYPE)); \
(struct TYPE) { \
.sz = sizeof(struct TYPE), \
__VA_ARGS__ \
}; \
})
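/* A hedged usage sketch of DECLARE_LIBBPF_OPTS(); the field values below are
 * illustrative only:
 *
 *	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
 *		.object_name = "my_obj",
 *		.kconfig = "CONFIG_HZ=1000",
 *	);
 *	obj = bpf_object__open_file("prog.bpf.o", &opts);
 */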
#endif /* __LIBBPF_LIBBPF_COMMON_H */


@@ -13,6 +13,9 @@
#include "libbpf.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
#define ERRNO_OFFSET(e) ((e) - __LIBBPF_ERRNO__START)
#define ERRCODE_OFFSET(c) ERRNO_OFFSET(LIBBPF_ERRNO__##c)
#define NR_ERRNO (__LIBBPF_ERRNO__END - __LIBBPF_ERRNO__START)


@@ -76,7 +76,7 @@ static inline bool libbpf_validate_opts(const char *opts,
for (i = opts_sz; i < user_sz; i++) {
if (opts[i]) {
pr_warn("%s has non-zero extra bytes",
pr_warn("%s has non-zero extra bytes\n",
type_name);
return false;
}
@@ -95,9 +95,28 @@ static inline bool libbpf_validate_opts(const char *opts,
#define OPTS_GET(opts, field, fallback_value) \
(OPTS_HAS(opts, field) ? (opts)->field : fallback_value)
int parse_cpu_mask_str(const char *s, bool **mask, int *mask_sz);
int parse_cpu_mask_file(const char *fcpu, bool **mask, int *mask_sz);
int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len);
int bpf_object__section_size(const struct bpf_object *obj, const char *name,
__u32 *size);
int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
__u32 *off);
struct nlattr;
typedef int (*libbpf_dump_nlmsg_t)(void *cookie, void *msg, struct nlattr **tb);
int libbpf_netlink_open(unsigned int *nl_pid);
int libbpf_nl_get_link(int sock, unsigned int nl_pid,
libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie);
int libbpf_nl_get_class(int sock, unsigned int nl_pid, int ifindex,
libbpf_dump_nlmsg_t dump_class_nlmsg, void *cookie);
int libbpf_nl_get_qdisc(int sock, unsigned int nl_pid, int ifindex,
libbpf_dump_nlmsg_t dump_qdisc_nlmsg, void *cookie);
int libbpf_nl_get_filter(int sock, unsigned int nl_pid, int ifindex, int handle,
libbpf_dump_nlmsg_t dump_filter_nlmsg, void *cookie);
struct btf_ext_info {
/*
* info points to the individual info section (e.g. func_info and
@@ -134,7 +153,7 @@ struct btf_ext_info_sec {
__u32 sec_name_off;
__u32 num_info;
/* Followed by num_info * record_size number of bytes */
__u8 data[0];
__u8 data[];
};
/* The minimum bpf_func_info checked by the loader */


@@ -17,6 +17,9 @@
#include "libbpf.h"
#include "libbpf_internal.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
static bool grep(const char *buffer, const char *pattern)
{
return !!strstr(buffer, pattern);
@@ -75,6 +78,9 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
xattr.expected_attach_type = BPF_CGROUP_INET4_CONNECT;
break;
case BPF_PROG_TYPE_SK_LOOKUP:
xattr.expected_attach_type = BPF_SK_LOOKUP;
break;
case BPF_PROG_TYPE_KPROBE:
xattr.kern_version = get_kernel_version();
break;
@@ -103,6 +109,9 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
case BPF_PROG_TYPE_CGROUP_SYSCTL:
case BPF_PROG_TYPE_CGROUP_SOCKOPT:
case BPF_PROG_TYPE_TRACING:
case BPF_PROG_TYPE_STRUCT_OPS:
case BPF_PROG_TYPE_EXT:
case BPF_PROG_TYPE_LSM:
default:
break;
}
@@ -232,6 +241,11 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
if (btf_fd < 0)
return false;
break;
case BPF_MAP_TYPE_RINGBUF:
key_size = 0;
value_size = 0;
max_entries = 4096;
break;
case BPF_MAP_TYPE_UNSPEC:
case BPF_MAP_TYPE_HASH:
case BPF_MAP_TYPE_ARRAY:
@@ -251,6 +265,7 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
case BPF_MAP_TYPE_XSKMAP:
case BPF_MAP_TYPE_SOCKHASH:
case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
case BPF_MAP_TYPE_STRUCT_OPS:
default:
break;
}
@@ -321,3 +336,24 @@ bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type,
return res;
}
/*
* Probe for availability of kernel commit (5.3):
*
* c04c0d2b968a ("bpf: increase complexity limit and maximum program size")
*/
bool bpf_probe_large_insn_limit(__u32 ifindex)
{
struct bpf_insn insns[BPF_MAXINSNS + 1];
int i;
for (i = 0; i < BPF_MAXINSNS; i++)
insns[i] = BPF_MOV64_IMM(BPF_REG_0, 1);
insns[BPF_MAXINSNS] = BPF_EXIT_INSN();
errno = 0;
probe_load(BPF_PROG_TYPE_SCHED_CLS, insns, ARRAY_SIZE(insns), NULL, 0,
ifindex);
return errno != E2BIG && errno != EINVAL;
}
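/* A hedged sketch of acting on the probe above; whether the large limit is
 * usable would gate loading of a bigger program variant. */
static inline bool can_load_large_prog(void)
{
	/* true if the kernel accepts programs beyond the old 4096-insn cap */
	return bpf_probe_large_insn_limit(0 /* no offload ifindex */);
}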


@@ -15,6 +15,9 @@
#include "libbpf_internal.h"
#include "nlattr.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
@@ -129,7 +132,8 @@ done:
return ret;
}
int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
static int __bpf_set_link_xdp_fd_replace(int ifindex, int fd, int old_fd,
__u32 flags)
{
int sock, seq = 0, ret;
struct nlattr *nla, *nla_xdp;
@@ -138,7 +142,7 @@ int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
struct ifinfomsg ifinfo;
char attrbuf[64];
} req;
__u32 nl_pid;
__u32 nl_pid = 0;
sock = libbpf_netlink_open(&nl_pid);
if (sock < 0)
@@ -175,6 +179,14 @@ int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
nla->nla_len += nla_xdp->nla_len;
}
if (flags & XDP_FLAGS_REPLACE) {
nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
nla_xdp->nla_type = IFLA_XDP_EXPECTED_FD;
nla_xdp->nla_len = NLA_HDRLEN + sizeof(old_fd);
memcpy((char *)nla_xdp + NLA_HDRLEN, &old_fd, sizeof(old_fd));
nla->nla_len += nla_xdp->nla_len;
}
req.nh.nlmsg_len += NLA_ALIGN(nla->nla_len);
if (send(sock, &req, req.nh.nlmsg_len, 0) < 0) {
@@ -188,6 +200,29 @@ cleanup:
return ret;
}
int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags,
const struct bpf_xdp_set_link_opts *opts)
{
int old_fd = -1;
if (!OPTS_VALID(opts, bpf_xdp_set_link_opts))
return -EINVAL;
if (OPTS_HAS(opts, old_fd)) {
old_fd = OPTS_GET(opts, old_fd, -1);
flags |= XDP_FLAGS_REPLACE;
}
return __bpf_set_link_xdp_fd_replace(ifindex, fd,
old_fd,
flags);
}
int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
{
return __bpf_set_link_xdp_fd_replace(ifindex, fd, 0, flags);
}
static int __dump_link_nlmsg(struct nlmsghdr *nlh,
libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie)
{
@@ -253,7 +288,7 @@ int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info,
{
struct xdp_id_md xdp_id = {};
int sock, ret;
__u32 nl_pid;
__u32 nl_pid = 0;
__u32 mask;
if (flags & ~XDP_FLAGS_MASK || !info_size)
@@ -286,7 +321,9 @@ int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info,
static __u32 get_xdp_id(struct xdp_link_info *info, __u32 flags)
{
if (info->attach_mode != XDP_ATTACHED_MULTI)
flags &= XDP_FLAGS_MODES;
if (info->attach_mode != XDP_ATTACHED_MULTI && !flags)
return info->prog_id;
if (flags & XDP_FLAGS_DRV_MODE)
return info->drv_prog_id;


@@ -13,6 +13,9 @@
#include <string.h>
#include <stdio.h>
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
static uint16_t nla_attr_minlen[LIBBPF_NLA_TYPE_MAX+1] = {
[LIBBPF_NLA_U8] = sizeof(uint8_t),
[LIBBPF_NLA_U16] = sizeof(uint16_t),

src/ringbuf.c (new file, 288 lines)

@@ -0,0 +1,288 @@
// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/*
* Ring buffer operations.
*
* Copyright (C) 2020 Facebook, Inc.
*/
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <linux/err.h>
#include <linux/bpf.h>
#include <asm/barrier.h>
#include <sys/mman.h>
#include <sys/epoll.h>
#include <tools/libc_compat.h>
#include "libbpf.h"
#include "libbpf_internal.h"
#include "bpf.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
struct ring {
ring_buffer_sample_fn sample_cb;
void *ctx;
void *data;
unsigned long *consumer_pos;
unsigned long *producer_pos;
unsigned long mask;
int map_fd;
};
struct ring_buffer {
struct epoll_event *events;
struct ring *rings;
size_t page_size;
int epoll_fd;
int ring_cnt;
};
static void ringbuf_unmap_ring(struct ring_buffer *rb, struct ring *r)
{
if (r->consumer_pos) {
munmap(r->consumer_pos, rb->page_size);
r->consumer_pos = NULL;
}
if (r->producer_pos) {
munmap(r->producer_pos, rb->page_size + 2 * (r->mask + 1));
r->producer_pos = NULL;
}
}
/* Add extra RINGBUF maps to this ring buffer manager */
int ring_buffer__add(struct ring_buffer *rb, int map_fd,
ring_buffer_sample_fn sample_cb, void *ctx)
{
struct bpf_map_info info;
__u32 len = sizeof(info);
struct epoll_event *e;
struct ring *r;
void *tmp;
int err;
memset(&info, 0, sizeof(info));
err = bpf_obj_get_info_by_fd(map_fd, &info, &len);
if (err) {
err = -errno;
pr_warn("ringbuf: failed to get map info for fd=%d: %d\n",
map_fd, err);
return err;
}
if (info.type != BPF_MAP_TYPE_RINGBUF) {
pr_warn("ringbuf: map fd=%d is not BPF_MAP_TYPE_RINGBUF\n",
map_fd);
return -EINVAL;
}
tmp = reallocarray(rb->rings, rb->ring_cnt + 1, sizeof(*rb->rings));
if (!tmp)
return -ENOMEM;
rb->rings = tmp;
tmp = reallocarray(rb->events, rb->ring_cnt + 1, sizeof(*rb->events));
if (!tmp)
return -ENOMEM;
rb->events = tmp;
r = &rb->rings[rb->ring_cnt];
memset(r, 0, sizeof(*r));
r->map_fd = map_fd;
r->sample_cb = sample_cb;
r->ctx = ctx;
r->mask = info.max_entries - 1;
/* Map writable consumer page */
tmp = mmap(NULL, rb->page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
map_fd, 0);
if (tmp == MAP_FAILED) {
err = -errno;
pr_warn("ringbuf: failed to mmap consumer page for map fd=%d: %d\n",
map_fd, err);
return err;
}
r->consumer_pos = tmp;
/* Map read-only producer page and data pages. We map the data area at
* twice its size to allow simple reading of samples that wrap around the
* end of the ring buffer. See the kernel implementation for details.
*/
tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
MAP_SHARED, map_fd, rb->page_size);
if (tmp == MAP_FAILED) {
err = -errno;
ringbuf_unmap_ring(rb, r);
pr_warn("ringbuf: failed to mmap data pages for map fd=%d: %d\n",
map_fd, err);
return err;
}
r->producer_pos = tmp;
r->data = tmp + rb->page_size;
e = &rb->events[rb->ring_cnt];
memset(e, 0, sizeof(*e));
e->events = EPOLLIN;
e->data.fd = rb->ring_cnt;
if (epoll_ctl(rb->epoll_fd, EPOLL_CTL_ADD, map_fd, e) < 0) {
err = -errno;
ringbuf_unmap_ring(rb, r);
pr_warn("ringbuf: failed to epoll add map fd=%d: %d\n",
map_fd, err);
return err;
}
rb->ring_cnt++;
return 0;
}
void ring_buffer__free(struct ring_buffer *rb)
{
int i;
if (!rb)
return;
for (i = 0; i < rb->ring_cnt; ++i)
ringbuf_unmap_ring(rb, &rb->rings[i]);
if (rb->epoll_fd >= 0)
close(rb->epoll_fd);
free(rb->events);
free(rb->rings);
free(rb);
}
struct ring_buffer *
ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, void *ctx,
const struct ring_buffer_opts *opts)
{
struct ring_buffer *rb;
int err;
if (!OPTS_VALID(opts, ring_buffer_opts))
return NULL;
rb = calloc(1, sizeof(*rb));
if (!rb)
return NULL;
rb->page_size = getpagesize();
rb->epoll_fd = epoll_create1(EPOLL_CLOEXEC);
if (rb->epoll_fd < 0) {
err = -errno;
pr_warn("ringbuf: failed to create epoll instance: %d\n", err);
goto err_out;
}
err = ring_buffer__add(rb, map_fd, sample_cb, ctx);
if (err)
goto err_out;
return rb;
err_out:
ring_buffer__free(rb);
return NULL;
}
static inline int roundup_len(__u32 len)
{
/* clear out top 2 bits (discard and busy, if set) */
len <<= 2;
len >>= 2;
/* add length prefix */
len += BPF_RINGBUF_HDR_SZ;
/* round up to 8 byte alignment */
return (len + 7) / 8 * 8;
}
static int ringbuf_process_ring(struct ring *r)
{
int *len_ptr, len, err, cnt = 0;
unsigned long cons_pos, prod_pos;
bool got_new_data;
void *sample;
cons_pos = smp_load_acquire(r->consumer_pos);
do {
got_new_data = false;
prod_pos = smp_load_acquire(r->producer_pos);
while (cons_pos < prod_pos) {
len_ptr = r->data + (cons_pos & r->mask);
len = smp_load_acquire(len_ptr);
/* sample not committed yet, bail out for now */
if (len & BPF_RINGBUF_BUSY_BIT)
goto done;
got_new_data = true;
cons_pos += roundup_len(len);
if ((len & BPF_RINGBUF_DISCARD_BIT) == 0) {
sample = (void *)len_ptr + BPF_RINGBUF_HDR_SZ;
err = r->sample_cb(r->ctx, sample, len);
if (err) {
/* update consumer pos and bail out */
smp_store_release(r->consumer_pos,
cons_pos);
return err;
}
cnt++;
}
smp_store_release(r->consumer_pos, cons_pos);
}
} while (got_new_data);
done:
return cnt;
}
/* Consume available data from all registered ring buffers without event
* polling. Returns the number of records consumed across all ring buffers,
* or a negative number if any of the callbacks returns an error.
*/
int ring_buffer__consume(struct ring_buffer *rb)
{
int i, err, res = 0;
for (i = 0; i < rb->ring_cnt; i++) {
struct ring *ring = &rb->rings[i];
err = ringbuf_process_ring(ring);
if (err < 0)
return err;
res += err;
}
return res;
}
/* Poll for available data and consume records, if any are available.
* Returns the number of records consumed, or a negative number if any of
* the registered callbacks returned an error.
*/
int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms)
{
int i, cnt, err, res = 0;
cnt = epoll_wait(rb->epoll_fd, rb->events, rb->ring_cnt, timeout_ms);
for (i = 0; i < cnt; i++) {
__u32 ring_id = rb->events[i].data.fd;
struct ring *ring = &rb->rings[ring_id];
err = ringbuf_process_ring(ring);
if (err < 0)
return err;
res += err;
}
return cnt < 0 ? -errno : res;
}


@@ -4,6 +4,9 @@
#include <stdio.h>
#include "str_error.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
/*
* Wrapper to allow for building in non-GNU systems such as Alpine Linux's musl
* libc, while checking strerror_r() return to avoid having to check this in


@@ -32,6 +32,9 @@
#include "libbpf_internal.h"
#include "xsk.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
#ifndef SOL_XDP
#define SOL_XDP 283
#endif
@@ -277,7 +280,11 @@ int xsk_umem__create_v0_0_4(struct xsk_umem **umem_ptr, void *umem_area,
fill->consumer = map + off.fr.consumer;
fill->flags = map + off.fr.flags;
fill->ring = map + off.fr.desc;
fill->cached_cons = umem->config.fill_size;
fill->cached_prod = *fill->producer;
/* cached_cons is "size" bigger than the real consumer pointer
* See xsk_prod_nb_free
*/
fill->cached_cons = *fill->consumer + umem->config.fill_size;
map = mmap(NULL, off.cr.desc + umem->config.comp_size * sizeof(__u64),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, umem->fd,
@@ -294,6 +301,8 @@ int xsk_umem__create_v0_0_4(struct xsk_umem **umem_ptr, void *umem_area,
comp->consumer = map + off.cr.consumer;
comp->flags = map + off.cr.flags;
comp->ring = map + off.cr.desc;
comp->cached_prod = *comp->producer;
comp->cached_cons = *comp->consumer;
*umem_ptr = umem;
return 0;
@@ -669,6 +678,8 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
rx->consumer = rx_map + off.rx.consumer;
rx->flags = rx_map + off.rx.flags;
rx->ring = rx_map + off.rx.desc;
rx->cached_prod = *rx->producer;
rx->cached_cons = *rx->consumer;
}
xsk->rx = rx;
@@ -688,7 +699,11 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
tx->consumer = tx_map + off.tx.consumer;
tx->flags = tx_map + off.tx.flags;
tx->ring = tx_map + off.tx.desc;
tx->cached_cons = xsk->config.tx_size;
tx->cached_prod = *tx->producer;
/* cached_cons is r->size bigger than the real consumer pointer.
* See xsk_prod_nb_free().
*/
tx->cached_cons = *tx->consumer + xsk->config.tx_size;
}
xsk->tx = tx;


@@ -2,7 +2,7 @@
PHASES=(${@:-SETUP RUN RUN_ASAN CLEANUP})
DEBIAN_RELEASE="${DEBIAN_RELEASE:-testing}"
CONT_NAME="${CONT_NAME:-debian-$DEBIAN_RELEASE-$RANDOM}"
CONT_NAME="${CONT_NAME:-libbpf-debian-$DEBIAN_RELEASE}"
ENV_VARS="${ENV_VARS:-}"
DOCKER_RUN="${DOCKER_RUN:-docker run}"
REPO_ROOT="${REPO_ROOT:-$PWD}"
@@ -30,6 +30,10 @@ for phase in "${PHASES[@]}"; do
SETUP)
info "Setup phase"
info "Using Debian $DEBIAN_RELEASE"
sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
docker --version
docker pull debian:$DEBIAN_RELEASE
info "Starting container $CONT_NAME"
$DOCKER_RUN -v $REPO_ROOT:/build:rw \
@@ -57,7 +61,7 @@ for phase in "${PHASES[@]}"; do
docker_exec mkdir build install
docker_exec ${CC:-cc} --version
info "build"
docker_exec make CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
docker_exec make -j$((4*$(nproc))) CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
info "ldd build/libbpf.so:"
docker_exec ldd build/libbpf.so
if ! docker_exec ldd build/libbpf.so | grep -q libelf; then
@@ -65,7 +69,7 @@ for phase in "${PHASES[@]}"; do
exit 1
fi
info "install"
docker_exec make -C src OBJDIR=../build DESTDIR=../install install
docker_exec make -j$((4*$(nproc))) -C src OBJDIR=../build DESTDIR=../install install
docker_exec rm -rf build install
;;
CLEANUP)


@@ -17,11 +17,11 @@ cd $REPO_ROOT
CFLAGS="-g -O2 -Werror -Wall -fsanitize=address,undefined"
mkdir build install
cc --version
make CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
make -j$((4*$(nproc))) CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
ldd build/libbpf.so
if ! ldd build/libbpf.so | grep -q libelf; then
echo "FAIL: No reference to libelf.so in libbpf.so!"
exit 1
fi
make -C src OBJDIR=../build DESTDIR=../install install
make -j$((4*$(nproc))) -C src OBJDIR=../build DESTDIR=../install install
rm -rf build install


@@ -0,0 +1,25 @@
#!/bin/bash
set -eux
CWD=$(pwd)
REPO_PATH=$1
PAHOLE_ORIGIN=https://git.kernel.org/pub/scm/devel/pahole/pahole.git
mkdir -p ${REPO_PATH}
cd ${REPO_PATH}
git init
git remote add origin ${PAHOLE_ORIGIN}
git fetch origin
git checkout master
mkdir -p build
cd build
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -D__LIB=lib ..
make -j$((4*$(nproc))) all
sudo make install
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}:/usr/local/lib
ldd $(which pahole)
pahole --version


@@ -0,0 +1,35 @@
#!/bin/bash
set -euxo pipefail
LLVM_VER=12
LIBBPF_PATH="${REPO_ROOT}"
REPO_PATH="travis-ci/vmtest/bpf-next"
PREPARE_SELFTESTS_SCRIPT=${VMTEST_ROOT}/prepare_selftests-${KERNEL}.sh
if [ -f "${PREPARE_SELFTESTS_SCRIPT}" ]; then
(cd "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" && ${PREPARE_SELFTESTS_SCRIPT})
fi
if [[ "${KERNEL}" = 'LATEST' ]]; then
VMLINUX_H=
else
VMLINUX_H=${VMTEST_ROOT}/vmlinux.h
fi
make \
CLANG=clang-${LLVM_VER} \
LLC=llc-${LLVM_VER} \
LLVM_STRIP=llvm-strip-${LLVM_VER} \
VMLINUX_BTF="${VMLINUX_BTF}" \
VMLINUX_H=${VMLINUX_H} \
-C "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" \
-j $((2*$(nproc)))
mkdir ${LIBBPF_PATH}/selftests
cp -R "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" \
${LIBBPF_PATH}/selftests
cd ${LIBBPF_PATH}
rm selftests/bpf/.gitignore
git add selftests
git add "${VMTEST_ROOT}/configs/blacklist"


@@ -0,0 +1,38 @@
#!/bin/bash
set -eux
CWD=$(pwd)
LIBBPF_PATH=$(pwd)
REPO_PATH=$1
BPF_NEXT_ORIGIN=https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
LINUX_SHA=$(cat ${LIBBPF_PATH}/CHECKPOINT-COMMIT)
SNAPSHOT_URL=https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/snapshot/bpf-next-${LINUX_SHA}.tar.gz
echo REPO_PATH = ${REPO_PATH}
echo LINUX_SHA = ${LINUX_SHA}
if [ ! -d "${REPO_PATH}" ]; then
mkdir -p $(dirname "${REPO_PATH}")
cd $(dirname "${REPO_PATH}")
# attempt to fetch desired bpf-next repo snapshot
if wget ${SNAPSHOT_URL} ; then
tar xf bpf-next-${LINUX_SHA}.tar.gz
mv bpf-next-${LINUX_SHA} $(basename ${REPO_PATH})
else
# but fallback to git fetch approach if that fails
mkdir -p $(basename ${REPO_PATH})
cd $(basename ${REPO_PATH})
git init
git remote add bpf-next ${BPF_NEXT_ORIGIN}
# try shallow clone first
git fetch --depth 32 bpf-next
# check if desired SHA exists
if ! git cat-file -e ${LINUX_SHA}^{commit} ; then
# if not, fetch all of bpf-next; slow and painful
git fetch bpf-next
fi
git reset --hard ${LINUX_SHA}
fi
fi


@@ -0,0 +1,8 @@
INDEX https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/INDEX
libbpf-vmtest-rootfs-2020.03.11.tar.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/libbpf-vmtest-rootfs-2020.03.11.tar.zst
vmlinux-4.9.0.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinux-4.9.0.zst
vmlinux-5.5.0-rc6.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinux-5.5.0-rc6.zst
vmlinux-5.5.0.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinux-5.5.0.zst
vmlinuz-5.5.0-rc6 https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-5.5.0-rc6
vmlinuz-5.5.0 https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-5.5.0
vmlinuz-4.9.0 https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-4.9.0


@@ -0,0 +1,65 @@
# PERMANENTLY DISABLED
align # verifier output format changed
bpf_iter # bpf_iter support is missing
bpf_obj_id # bpf_link support missing for GET_OBJ_INFO, GET_FD_BY_ID, etc
bpf_tcp_ca # STRUCT_OPS is missing
# latest Clang generates code that fails to verify
bpf_verif_scale
#bpf_verif_scale/strobemeta.o
#bpf_verif_scale/strobemeta_nounroll1.o
#bpf_verif_scale/strobemeta_nounroll2.o
btf_map_in_map # inner map leak fixed in 5.8
cg_storage_multi # v5.9+ functionality
cgroup_attach_multi # BPF_F_REPLACE_PROG missing
cgroup_link # LINK_CREATE is missing
cgroup_skb_sk_lookup # bpf_sk_lookup_tcp() helper is missing
connect_force_port # cgroup/get{peer,sock}name{4,6} support is missing
enable_stats # BPF_ENABLE_STATS support is missing
fentry_fexit # bpf_prog_test_tracing missing
fentry_test # bpf_prog_test_tracing missing
fexit_bpf2bpf # freplace is missing
fexit_test # bpf_prog_test_tracing missing
flow_dissector # bpf_link-based flow dissector is in 5.8+
flow_dissector_reattach
get_stack_raw_tp # exercising BPF verifier bug causing infinite loop
ksyms # __start_BTF has different name
link_pinning # bpf_link is missing
load_bytes_relative # new functionality in 5.8
map_ptr # test uses BPF_MAP_TYPE_RINGBUF, added in 5.8
mmap # 5.5 kernel is too permissive with re-mmapping
modify_return # fmod_ret support is missing
ns_current_pid_tgid # bpf_get_ns_current_pid_tgid() helper is missing
perf_branches # bpf_read_branch_records() helper is missing
ringbuf # BPF_MAP_TYPE_RINGBUF is supported in 5.8+
# bug in verifier w/ tracking references
#reference_tracking/classifier/sk_lookup_success
reference_tracking
select_reuseport # UDP support is missing
sk_assign # bpf_sk_assign helper missing
skb_helpers # helpers added in 5.8+
sockmap_basic # uses new socket fields, 5.8+
sockmap_listen # no listen socket support in SOCKMAP
sockopt_sk
sk_lookup # v5.9+
skb_ctx # ctx_{size, }_{in, out} in BPF_PROG_TEST_RUN is missing
test_global_funcs # kernel doesn't support BTF linkage=global on FUNCs
test_lsm # no BPF_LSM support
test_overhead # no fmod_ret support
udp_limit # no cgroup/sock_release BPF program type (5.9+)
varlen # verifier bug fixed in later kernels
vmlinux # hrtimer_nanosleep() signature changed incompatibly
xdp_adjust_tail # new XDP functionality added in 5.8
xdp_attach # IFLA_XDP_EXPECTED_FD support is missing
xdp_bpf2bpf # freplace is missing
xdp_cpumap_attach # v5.9+
xdp_devmap_attach # new feature in 5.8
xdp_link # v5.9+
# TEMPORARILY DISABLED
send_signal # flaky
cls_redirect # latest Clang breaks BPF verification


@@ -0,0 +1,7 @@
# TEMPORARILY DISABLED
send_signal # flaky
test_lsm # semi-working
sk_assign # needs better setup in Travis CI
sk_lookup
core_reloc # temporary test breakage
bpf_verif_scale # clang regression

File diff suppressed because it is too large.


@@ -0,0 +1,7 @@
btf_dump
core_retro
cpu_mask
hashmap
perf_buffer
section_names

travis-ci/vmtest/mkrootfs.sh (new executable file, 148 lines)

@@ -0,0 +1,148 @@
#!/bin/bash
# This script is based on drgn script for generating Arch Linux bootstrap
# images.
# https://github.com/osandov/drgn/blob/master/scripts/vmtest/mkrootfs.sh
set -euo pipefail
usage () {
USAGE_STRING="usage: $0 [NAME]
$0 -h
Build an Arch Linux root filesystem image for testing libbpf in a virtual
machine.
The image is generated as a zstd-compressed tarball.
This must be run as root, as most of the installation is done in a chroot.
Arguments:
NAME name of generated image file (default:
libbpf-vmtest-rootfs-\$DATE.tar.zst)
Options:
-h display this help message and exit"
case "$1" in
out)
echo "$USAGE_STRING"
exit 0
;;
err)
echo "$USAGE_STRING" >&2
exit 1
;;
esac
}
while getopts "h" OPT; do
case "$OPT" in
h)
usage out
;;
*)
usage err
;;
esac
done
if [[ $OPTIND -eq $# ]]; then
NAME="${!OPTIND}"
elif [[ $OPTIND -gt $# ]]; then
NAME="libbpf-vmtest-rootfs-$(date +%Y.%m.%d).tar.zst"
else
usage err
fi
pacman_conf=
root=
trap 'rm -rf "$pacman_conf" "$root"' EXIT
pacman_conf="$(mktemp -p "$PWD")"
cat > "$pacman_conf" << "EOF"
[options]
Architecture = x86_64
CheckSpace
SigLevel = Required DatabaseOptional
[core]
Include = /etc/pacman.d/mirrorlist
[extra]
Include = /etc/pacman.d/mirrorlist
[community]
Include = /etc/pacman.d/mirrorlist
EOF
root="$(mktemp -d -p "$PWD")"
packages=(
busybox
# libbpf dependencies.
libelf
zlib
# selftests test_progs dependencies.
binutils
elfutils
glibc
# selftests test_verifier dependencies.
libcap
)
pacstrap -C "$pacman_conf" -cGM "$root" "${packages[@]}"
# Remove unnecessary files from the chroot.
# We don't need the pacman databases anymore.
rm -rf "$root/var/lib/pacman/sync/"
# We don't need D, Fortran, or Go.
rm -f "$root/usr/lib/libgdruntime."* \
"$root/usr/lib/libgphobos."* \
"$root/usr/lib/libgfortran."* \
"$root/usr/lib/libgo."*
# We don't need any documentation.
rm -rf "$root/usr/share/{doc,help,man,texinfo}"
chroot "${root}" /bin/busybox --install
cat > "$root/etc/fstab" << "EOF"
dev /dev devtmpfs rw,nosuid 0 0
proc /proc proc rw,nosuid,nodev,noexec 0 0
sys /sys sysfs rw,nosuid,nodev,noexec 0 0
debugfs /sys/kernel/debug debugfs mode=755,relatime 0 0
bpffs /sys/fs/bpf bpf relatime 0 0
EOF
chmod 644 "$root/etc/fstab"
cat > "$root/etc/inittab" << "EOF"
::sysinit:/etc/init.d/rcS
::ctrlaltdel:/sbin/reboot
::shutdown:/sbin/swapoff -a
::shutdown:/bin/umount -a -r
::restart:/sbin/init
EOF
chmod 644 "$root/etc/inittab"
mkdir -m 755 "$root/etc/init.d" "$root/etc/rcS.d"
cat > "$root/etc/rcS.d/S10-mount" << "EOF"
#!/bin/sh
/bin/mount -a
EOF
chmod 755 "$root/etc/rcS.d/S10-mount"
cat > "$root/etc/rcS.d/S40-network" << "EOF"
#!/bin/sh
ip link set lo up
EOF
chmod 755 "$root/etc/rcS.d/S40-network"
cat > "$root/etc/init.d/rcS" << "EOF"
#!/bin/sh
for path in /etc/rcS.d/S*; do
[ -x "$path" ] && "$path"
done
EOF
chmod 755 "$root/etc/init.d/rcS"
chmod 755 "$root"
tar -C "$root" -c . | zstd -T0 -19 -o "$NAME"
chmod 644 "$NAME"


@@ -0,0 +1,14 @@
#!/bin/bash
set -eux
REPO_PATH=$1
${VMTEST_ROOT}/checkout_latest_kernel.sh ${REPO_PATH}
cd ${REPO_PATH}
if [[ "${KERNEL}" = 'LATEST' ]]; then
cp ${VMTEST_ROOT}/configs/latest.config .config
make -j $((4*$(nproc))) olddefconfig all
fi

travis-ci/vmtest/run.sh (new executable file, 438 lines)

@@ -0,0 +1,438 @@
#!/bin/bash
set -uo pipefail
trap 'exit 2' ERR
usage () {
USAGE_STRING="usage: $0 [-k KERNELRELEASE|-b DIR] [[-r ROOTFSVERSION] [-fo]|-I] [-Si] [-d DIR] IMG
$0 [-k KERNELRELEASE] -l
$0 -h
Run "${PROJECT_NAME}" tests in a virtual machine.
This exits with status 0 on success, 1 if the virtual machine ran successfully
but tests failed, and 2 if we encountered a fatal error.
This script uses sudo to mount and modify the disk image.
Arguments:
IMG path of virtual machine disk image to create
Versions:
-k, --kernel=KERNELRELEASE
kernel release to test. This is a glob pattern; the
newest (sorted by version number) release that matches
the pattern is used (default: newest available release)
-b, --build DIR use the kernel built in the given directory. This option
cannot be combined with -k
-r, --rootfs=ROOTFSVERSION
version of root filesystem to use (default: newest
available version)
Setup:
-f, --force overwrite IMG if it already exists
-o, --one-shot one-shot mode. By default, this script saves a clean copy
of the downloaded root filesystem image and vmlinux and
makes a copy (reflinked, when possible) for executing the
virtual machine. This allows subsequent runs to skip
downloading these files. If this option is given, the
root filesystem image and vmlinux are always
re-downloaded and are not saved. This option implies -f
-s, --setup-cmd setup commands run on VM boot. Whitespace characters
should be escaped with preceding '\'.
-I, --skip-image skip creating the disk image; use the existing one at
IMG. This option cannot be combined with -r, -f, or -o
-S, --skip-source skip copying the source files and init scripts
Miscellaneous:
-i, --interactive interactive mode. Boot the virtual machine into an
interactive shell instead of automatically running tests
-d, --dir=DIR working directory to use for downloading and caching
files (default: current working directory)
-l, --list list available kernel releases instead of running tests.
The list may be filtered with -k
-h, --help display this help message and exit"
case "$1" in
out)
echo "$USAGE_STRING"
exit 0
;;
err)
echo "$USAGE_STRING" >&2
exit 2
;;
esac
}
TEMP=$(getopt -o 'k:b:r:fos:ISid:lh' --long 'kernel:,build:,rootfs:,force,one-shot,setup-cmd:,skip-image,skip-source,interactive,dir:,list,help' -n "$0" -- "$@")
eval set -- "$TEMP"
unset TEMP
unset KERNELRELEASE
unset BUILDDIR
unset ROOTFSVERSION
unset IMG
unset SETUPCMD
FORCE=0
ONESHOT=0
SKIPIMG=0
SKIPSOURCE=0
APPEND=""
DIR="$PWD"
LIST=0
while true; do
case "$1" in
-k|--kernel)
KERNELRELEASE="$2"
shift 2
;;
-b|--build)
BUILDDIR="$2"
shift 2
;;
-r|--rootfs)
ROOTFSVERSION="$2"
shift 2
;;
-f|--force)
FORCE=1
shift
;;
-o|--one-shot)
ONESHOT=1
FORCE=1
shift
;;
-s|--setup-cmd)
SETUPCMD="$2"
shift 2
;;
-I|--skip-image)
SKIPIMG=1
shift
;;
-S|--skip-source)
SKIPSOURCE=1
shift
;;
-i|--interactive)
APPEND=" single"
shift
;;
-d|--dir)
DIR="$2"
shift 2
;;
-l|--list)
LIST=1
;;
-h|--help)
usage out
;;
--)
shift
break
;;
*)
usage err
;;
esac
done
if [[ -v BUILDDIR ]]; then
if [[ -v KERNELRELEASE ]]; then
usage err
fi
elif [[ ! -v KERNELRELEASE ]]; then
KERNELRELEASE='*'
fi
if [[ $SKIPIMG -ne 0 && ( -v ROOTFSVERSION || $FORCE -ne 0 ) ]]; then
usage err
fi
if (( LIST )); then
if [[ $# -ne 0 || -v BUILDDIR || -v ROOTFSVERSION || $FORCE -ne 0 ||
$SKIPIMG -ne 0 || $SKIPSOURCE -ne 0 || -n $APPEND ]]; then
usage err
fi
else
if [[ $# -ne 1 ]]; then
usage err
fi
IMG="${!OPTIND}"
fi
unset URLS
cache_urls() {
if ! declare -p URLS &> /dev/null; then
# This URL contains a mapping from file names to URLs where
# those files can be downloaded.
declare -gA URLS
while IFS=$'\t' read -r name url; do
URLS["$name"]="$url"
done < <(cat "${VMTEST_ROOT}/configs/INDEX")
fi
}
matching_kernel_releases() {
local pattern="$1"
{
for file in "${!URLS[@]}"; do
if [[ $file =~ ^vmlinux-(.*).zst$ ]]; then
release="${BASH_REMATCH[1]}"
case "$release" in
$pattern)
# sort -V handles rc versions properly
# if we use "~" instead of "-".
echo "${release//-rc/~rc}"
;;
esac
fi
done
} | sort -rV | sed 's/~rc/-rc/g'
}
newest_rootfs_version() {
{
for file in "${!URLS[@]}"; do
if [[ $file =~ ^${PROJECT_NAME}-vmtest-rootfs-(.*)\.tar\.zst$ ]]; then
echo "${BASH_REMATCH[1]}"
fi
done
} | sort -rV | head -1
}
download() {
local file="$1"
cache_urls
if [[ ! -v URLS[$file] ]]; then
echo "$file not found" >&2
return 1
fi
echo "Downloading $file..." >&2
curl -Lf "${URLS[$file]}" "${@:2}"
}
set_nocow() {
touch "$@"
chattr +C "$@" >/dev/null 2>&1 || true
}
cp_img() {
set_nocow "$2"
cp --reflink=auto "$1" "$2"
}
create_rootfs_img() {
local path="$1"
set_nocow "$path"
truncate -s 2G "$path"
mkfs.ext4 -q "$path"
}
download_rootfs() {
local rootfsversion="$1"
local dir="$2"
download "${PROJECT_NAME}-vmtest-rootfs-$rootfsversion.tar.zst" |
zstd -d | sudo tar -C "$dir" -x
}
if (( LIST )); then
cache_urls
matching_kernel_releases "$KERNELRELEASE"
exit 0
fi
if [[ $FORCE -eq 0 && $SKIPIMG -eq 0 && -e $IMG ]]; then
echo "$IMG already exists; use -f to overwrite it or -I to reuse it" >&2
exit 1
fi
# Only go to the network if it's actually a glob pattern.
if [[ -v BUILDDIR ]]; then
KERNELRELEASE="$(make -C "$BUILDDIR" -s kernelrelease)"
elif [[ ! $KERNELRELEASE =~ ^([^\\*?[]|\\[*?[])*\\?$ ]]; then
# We need to cache the list of URLs outside of the command
# substitution, which happens in a subshell.
cache_urls
KERNELRELEASE="$(matching_kernel_releases "$KERNELRELEASE" | head -1)"
if [[ -z $KERNELRELEASE ]]; then
echo "No matching kernel release found" >&2
exit 1
fi
fi
if [[ $SKIPIMG -eq 0 && ! -v ROOTFSVERSION ]]; then
cache_urls
ROOTFSVERSION="$(newest_rootfs_version)"
fi
echo "Kernel release: $KERNELRELEASE" >&2
if (( SKIPIMG )); then
echo "Not extracting root filesystem" >&2
else
echo "Root filesystem version: $ROOTFSVERSION" >&2
fi
echo "Disk image: $IMG" >&2
tmp=
ARCH_DIR="$DIR/x86_64"
mkdir -p "$ARCH_DIR"
mnt="$(mktemp -d -p "$DIR" mnt.XXXXXXXXXX)"
cleanup() {
if [[ -n $tmp ]]; then
rm -f "$tmp" || true
fi
if mountpoint -q "$mnt"; then
sudo umount "$mnt" || true
fi
if [[ -d "$mnt" ]]; then
rmdir "$mnt" || true
fi
}
trap cleanup EXIT
if [[ -v BUILDDIR ]]; then
vmlinuz="$BUILDDIR/$(make -C "$BUILDDIR" -s image_name)"
else
vmlinuz="${ARCH_DIR}/vmlinuz-${KERNELRELEASE}"
if [[ ! -e $vmlinuz ]]; then
tmp="$(mktemp "$vmlinuz.XXX.part")"
download "vmlinuz-${KERNELRELEASE}" -o "$tmp"
mv "$tmp" "$vmlinuz"
tmp=
fi
fi
# Mount and set up the rootfs image.
if (( ONESHOT )); then
rm -f "$IMG"
create_rootfs_img "$IMG"
sudo mount -o loop "$IMG" "$mnt"
download_rootfs "$ROOTFSVERSION" "$mnt"
else
if (( ! SKIPIMG )); then
rootfs_img="${ARCH_DIR}/${PROJECT_NAME}-vmtest-rootfs-${ROOTFSVERSION}.img"
if [[ ! -e $rootfs_img ]]; then
tmp="$(mktemp "$rootfs_img.XXX.part")"
set_nocow "$tmp"
truncate -s 2G "$tmp"
mkfs.ext4 -q "$tmp"
sudo mount -o loop "$tmp" "$mnt"
download_rootfs "$ROOTFSVERSION" "$mnt"
sudo umount "$mnt"
mv "$tmp" "$rootfs_img"
tmp=
fi
rm -f "$IMG"
cp_img "$rootfs_img" "$IMG"
fi
sudo mount -o loop "$IMG" "$mnt"
fi
# Install vmlinux.
vmlinux="$mnt/boot/vmlinux-${KERNELRELEASE}"
if [[ -v BUILDDIR || $ONESHOT -eq 0 ]]; then
if [[ -v BUILDDIR ]]; then
source_vmlinux="${BUILDDIR}/vmlinux"
else
source_vmlinux="${ARCH_DIR}/vmlinux-${KERNELRELEASE}"
if [[ ! -e $source_vmlinux ]]; then
tmp="$(mktemp "$source_vmlinux.XXX.part")"
download "vmlinux-${KERNELRELEASE}.zst" | zstd -dfo "$tmp"
mv "$tmp" "$source_vmlinux"
tmp=
fi
fi
echo "Copying vmlinux..." >&2
sudo rsync -cp --chmod 0644 "$source_vmlinux" "$vmlinux"
else
# We could use "sudo zstd -o", but let's not run zstd as root with
# input from the internet.
download "vmlinux-${KERNELRELEASE}.zst" |
zstd -d | sudo tee "$vmlinux" > /dev/null
sudo chmod 644 "$vmlinux"
fi
LIBBPF_PATH="${REPO_ROOT}" \
REPO_PATH="travis-ci/vmtest/bpf-next" \
VMTEST_ROOT="${VMTEST_ROOT}" \
VMLINUX_BTF=${vmlinux} ${VMTEST_ROOT}/build_selftests.sh
if (( SKIPSOURCE )); then
echo "Not copying source files..." >&2
else
echo "Copying source files..." >&2
# Copy the source files in.
sudo mkdir -p -m 0755 "$mnt/${PROJECT_NAME}"
{
if [[ -e .git ]]; then
git ls-files -z
else
tr '\n' '\0' < "${PROJECT_NAME}.egg-info/SOURCES.txt"
fi
} | sudo rsync --files-from=- -0cpt . "$mnt/${PROJECT_NAME}"
fi
setup_script="#!/bin/sh
echo 'Skipping setup commands'
echo 0 > /exitstatus
chmod 644 /exitstatus"
# Create the init scripts.
if [[ -v SETUPCMD ]]; then
# Unescape whitespace characters.
setup_cmd=$(sed 's/\(\\\)\([[:space:]]\)/\2/g' <<< "${SETUPCMD}")
kernel="${KERNELRELEASE}"
if [[ -v BUILDDIR ]]; then kernel='latest'; fi
setup_envvars="export KERNEL=${kernel}"
setup_script=$(printf "#!/bin/sh
set -e
echo 'Running setup commands'
%s
%s
echo $? > /exitstatus
chmod 644 /exitstatus" "${setup_envvars}" "${setup_cmd}")
fi
echo "${setup_script}" | sudo tee "$mnt/etc/rcS.d/S50-run-tests" > /dev/null
sudo chmod 755 "$mnt/etc/rcS.d/S50-run-tests"
poweroff_script="#!/bin/sh
poweroff"
echo "${poweroff_script}" | sudo tee "$mnt/etc/rcS.d/S99-poweroff" > /dev/null
sudo chmod 755 "$mnt/etc/rcS.d/S99-poweroff"
sudo umount "$mnt"
echo "Starting virtual machine..." >&2
qemu-system-x86_64 -nodefaults -display none -serial mon:stdio \
-cpu kvm64 -enable-kvm -smp "$(nproc)" -m 2G \
-drive file="$IMG",format=raw,index=1,media=disk,if=virtio,cache=none \
-kernel "$vmlinuz" -append "root=/dev/vda rw console=ttyS0,115200$APPEND"
sudo mount -o loop "$IMG" "$mnt"
if exitstatus="$(cat "$mnt/exitstatus" 2>/dev/null)"; then
printf '\nTests exit status: %s\n' "$exitstatus" >&2
else
printf '\nCould not read tests exit status\n' >&2
exitstatus=1
fi
sudo umount "$mnt"
exit "$exitstatus"


@@ -0,0 +1,43 @@
#!/bin/bash
set -euxo pipefail
test_progs() {
if [[ "${KERNEL}" != '4.9.0' ]]; then
echo TEST_PROGS
./test_progs ${BLACKLIST:+-b$BLACKLIST} ${WHITELIST:+-t$WHITELIST}
fi
echo TEST_PROGS-NO_ALU32
./test_progs-no_alu32 ${BLACKLIST:+-b$BLACKLIST} ${WHITELIST:+-t$WHITELIST}
}
test_maps() {
echo TEST_MAPS
./test_maps
}
test_verifier() {
echo TEST_VERIFIER
./test_verifier
}
configs_path='libbpf/travis-ci/vmtest/configs'
blacklist_path="$configs_path/blacklist/BLACKLIST-${KERNEL}"
if [[ -s "${blacklist_path}" ]]; then
BLACKLIST=$(cat "${blacklist_path}" | cut -d'#' -f1 | tr -s '[:space:]' ',')
fi
whitelist_path="$configs_path/whitelist/WHITELIST-${KERNEL}"
if [[ -s "${whitelist_path}" ]]; then
WHITELIST=$(cat "${whitelist_path}" | cut -d'#' -f1 | tr -s '[:space:]' ',')
fi
cd libbpf/selftests/bpf
test_progs
if [[ "${KERNEL}" == 'latest' ]]; then
test_maps
test_verifier
fi

travis-ci/vmtest/run_vmtest.sh (new executable file, 30 lines)

@@ -0,0 +1,30 @@
#!/bin/bash
set -eux
VMTEST_SETUPCMD="PROJECT_NAME=${PROJECT_NAME} ./${PROJECT_NAME}/travis-ci/vmtest/run_selftests.sh"
echo "KERNEL: $KERNEL"
# Build latest pahole
${VMTEST_ROOT}/build_pahole.sh travis-ci/vmtest/pahole
# Install required packages
wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
echo "deb http://apt.llvm.org/bionic/ llvm-toolchain-bionic main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update
sudo apt-get -y install clang-12 lld-12 llvm-12
# Build selftests (and latest kernel, if necessary)
KERNEL="${KERNEL}" ${VMTEST_ROOT}/prepare_selftests.sh travis-ci/vmtest/bpf-next
# Escape whitespace characters.
setup_cmd=$(sed 's/\([[:space:]]\)/\\\1/g' <<< "${VMTEST_SETUPCMD}")
sudo adduser "${USER}" kvm
if [[ "${KERNEL}" = 'LATEST' ]]; then
sudo -E sudo -E -u "${USER}" "${VMTEST_ROOT}/run.sh" -b travis-ci/vmtest/bpf-next -o -d ~ -s "${setup_cmd}" ~/root.img;
else
sudo -E sudo -E -u "${USER}" "${VMTEST_ROOT}/run.sh" -k "${KERNEL}*" -o -d ~ -s "${setup_cmd}" ~/root.img;
fi

travis-ci/vmtest/vmlinux.h (new file, 82043 lines); file diff suppressed because it is too large.