Compare commits

...

323 Commits

Author SHA1 Message Date
Julia Kartseva
583bddce6b vmtest: build and run bpf kernel selftests against various kernels
Run kernel selftests in vmtest with the goal to test libbpf backward
compatibility with older kernels.
The list of kernels should be specified in .travis.yml config in
`jobs` section, e.g. KERNEL=5.5.0.
Enlisted kernel releases
- 5.5.0 # built from main
- 5.5.0-rc6 # built from bpf-next
- LATEST

The kernel specified as 'LATEST' in .travis.yml is built from bpf-next kernel
tree, the rest of the kernels are downloaded from the specified in INDEX file.
The kernel sources from bpf-next are manually patched with [1] from bpf tree to
fix ranqslower build. This workaround should be removed after the patch is merged
from bpf to bpf-next tree.
Due to kernel sources being checked out the duration of the LATEST kernel test is
~30m.

bpf selftests are built from tools/testing/selftests/bpf/ of bpf-next tree with
HEAD revision set to CHECKPOINT-COMMIT specified in libbpf so selftests and
libbpf are in sync.
Currently only programs are tested with test_progs program, test_maps and
test_verifier should follow.
test_progs are run with blacklist required due to:
- some features, e.g. fentry/fexit are not supported in older kernels
- environment limitations, e.g an absence of the recent pahole in Debian
- incomplete disk image

The blacklist is passed to test_progs with -b option as specified in [2]
patch set.

Most of the preceeding tests are disabled due to incomplete disk image currenly
lacking proper networking settings.
For the LATEST kernel fome fentry/fexit tests are disabled due to pahole v1.16
is not abailible in Debian yet.

Next steps are resolving issues with blacklisted tests, enabling maps and
verifier testing, expanding the list of tested kernels.

[1] https://lore.kernel.org/bpf/908498f794661c44dca54da9e09dc0c382df6fcb.1580425879.git.hex@fb.com/t.mbox.gz
[2] https://www.spinics.net/lists/netdev/msg625192.html
2020-02-17 22:12:17 -08:00
Julia Kartseva
a52fb86a96 vmtest: add configs for bpf kernel selftests
vmtest is run as a TravisCI job in order to test libbpf backward compatibility
with the older kernels

Add config files required to build and run bpf kernel selftests in vmtest:
- latest.config: latest kernel config
- INDEX: links to binaries (kernels, disk image) to download
- blacklist/BLACKLIST-${kernel}: blacklisted bpf program tests for ${kernel}
2020-02-17 22:12:17 -08:00
Andrii Nakryiko
e5dbc1a96f sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   a6ed02cac690b635dbb938690e795812ce1e14ca
Checkpoint bpf-next commit: 35b9211c0a2427e8f39e534f442f43804fc8d5ca
Baseline bpf commit:        1712b2fff8c682d145c7889d2290696647d82dab
Checkpoint bpf commit:      08dc225d8868d5094ada62f471ebdfcce9dbc298

Alexei Starovoitov (1):
  libbpf: Add support for program extensions

Andrii Nakryiko (2):
  libbpf: Improve handling of failed CO-RE relocations
  libbpf: Fix realloc usage in bpf_core_find_cands

Antoine Tenart (1):
  net: macsec: introduce the macsec_context structure

Martin KaFai Lau (1):
  bpf: Sync uapi bpf.h to tools/

 include/uapi/linux/bpf.h     |  10 +++-
 include/uapi/linux/if_link.h |   7 +++
 src/bpf.c                    |   3 +-
 src/libbpf.c                 | 112 +++++++++++++++++++++--------------
 src/libbpf.h                 |   8 ++-
 src/libbpf.map               |   2 +
 src/libbpf_probes.c          |   1 +
 7 files changed, 97 insertions(+), 46 deletions(-)

--
2.17.1
2020-01-24 14:08:27 -08:00
Andrii Nakryiko
96333403ca sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-01-24 14:08:27 -08:00
Andrii Nakryiko
928f2fc146 libbpf: Fix realloc usage in bpf_core_find_cands
Fix bug requesting invalid size of reallocated array when constructing CO-RE
relocation candidate list. This can cause problems if there are many potential
candidates and a very fine-grained memory allocator bucket sizes are used.

Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Reported-by: William Smith <williampsmith@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200124201847.212528-1-andriin@fb.com
2020-01-24 14:08:27 -08:00
Andrii Nakryiko
8fd8b5bb46 libbpf: Improve handling of failed CO-RE relocations
Previously, if libbpf failed to resolve CO-RE relocation for some
instructions, it would either return error immediately, or, if
.relaxed_core_relocs option was set, would replace relocatable offset/imm part
of an instruction with a bogus value (-1). Neither approach is good, because
there are many possible scenarios where relocation is expected to fail (e.g.,
when some field knowingly can be missing on specific kernel versions). On the
other hand, replacing offset with invalid one can hide programmer errors, if
this relocation failue wasn't anticipated.

This patch deprecates .relaxed_core_relocs option and changes the approach to
always replacing instruction, for which relocation failed, with invalid BPF
helper call instruction. For cases where this is expected, BPF program should
already ensure that that instruction is unreachable, in which case this
invalid instruction is going to be silently ignored. But if instruction wasn't
guarded, BPF program will be rejected at verification step with verifier log
pointing precisely to the place in assembly where the problem is.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200124053837.2434679-1-andriin@fb.com
2020-01-24 14:08:27 -08:00
Martin KaFai Lau
b999e8f2c1 bpf: Sync uapi bpf.h to tools/
This patch sync uapi bpf.h to tools/.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200122233652.903348-1-kafai@fb.com
2020-01-24 14:08:27 -08:00
Alexei Starovoitov
c6c86a53f2 libbpf: Add support for program extensions
Add minimal support for program extensions. bpf_object_open_opts() needs to be
called with attach_prog_fd = target_prog_fd and BPF program extension needs to
have in .c file section definition like SEC("freplace/func_to_be_replaced").
libbpf will search for "func_to_be_replaced" in the target_prog_fd's BTF and
will pass it in attach_btf_id to the kernel. This approach works for tests, but
more compex use case may need to request function name (and attach_btf_id that
kernel sees) to be more dynamic. Such API will be added in future patches.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200121005348.2769920-3-ast@kernel.org
2020-01-24 14:08:27 -08:00
Antoine Tenart
c69f0d12f3 net: macsec: introduce the macsec_context structure
This patch introduces the macsec_context structure. It will be used
in the kernel to exchange information between the common MACsec
implementation (macsec.c) and the MACsec hardware offloading
implementations. This structure contains pointers to MACsec specific
structures which contain the actual MACsec configuration, and to the
underlying device (phydev for now).

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24 14:08:27 -08:00
Andrii Nakryiko
033ad7ee78 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   20f21d98cf12b8ecd69e8defc93fae9e3b353b13
Checkpoint bpf-next commit: a6ed02cac690b635dbb938690e795812ce1e14ca
Baseline bpf commit:        1712b2fff8c682d145c7889d2290696647d82dab
Checkpoint bpf commit:      1712b2fff8c682d145c7889d2290696647d82dab

Andrii Nakryiko (3):
  libbpf: Fix error handling bug in btf_dump__new
  libbpf: Simplify BTF initialization logic
  libbpf: Fix potential multiplication overflow in mmap() size
    calculation

KP Singh (1):
  libbpf: Load btf_vmlinux only once per object.

 src/btf_dump.c |   1 +
 src/libbpf.c   | 174 ++++++++++++++++++++++++++++++-------------------
 2 files changed, 109 insertions(+), 66 deletions(-)

--
2.17.1
2020-01-17 20:33:22 -08:00
KP Singh
397db2175d libbpf: Load btf_vmlinux only once per object.
As more programs (TRACING, STRUCT_OPS, and upcoming LSM) use vmlinux
BTF information, loading the BTF vmlinux information for every program
in an object is sub-optimal. The fix was originally proposed in:

   https://lore.kernel.org/bpf/CAEf4BzZodr3LKJuM7QwD38BiEH02Cc1UbtnGpVkCJ00Mf+V_Qg@mail.gmail.com/

The btf_vmlinux is populated in the object if any of the programs in
the object requires it just before the programs are loaded and freed
after the programs finish loading.

Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Brendan Jackman <jackmanb@chromium.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200117212825.11755-1-kpsingh@chromium.org
2020-01-17 20:33:22 -08:00
Andrii Nakryiko
6756bdc96e libbpf: Fix potential multiplication overflow in mmap() size calculation
Prevent potential overflow performed in 32-bit integers, before assigning
result to size_t. Reported by LGTM static analysis.

Fixes: eba9c5f498a1 ("libbpf: Refactor global data map initialization")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200117060801.1311525-4-andriin@fb.com
2020-01-17 20:33:22 -08:00
Andrii Nakryiko
9c1ae55dbd libbpf: Simplify BTF initialization logic
Current implementation of bpf_object's BTF initialization is very convoluted
and thus prone to errors. It doesn't have to be like that. This patch
simplifies it significantly.

This code also triggered static analysis issues over logically dead code due
to redundant error checks. This simplification should fix that as well.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200117060801.1311525-3-andriin@fb.com
2020-01-17 20:33:22 -08:00
Andrii Nakryiko
091f073ff0 libbpf: Fix error handling bug in btf_dump__new
Fix missing jump to error handling in btf_dump__new, found by Coverity static
code analysis.

Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200117060801.1311525-2-andriin@fb.com
2020-01-17 20:33:22 -08:00
Andrii Nakryiko
ad51a528dc travis-ci: make sure before_script override is non-empty
Travis CI seems to be ignoring empty before_script override. Let's make sure
it's a non-empty no-op.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-01-17 20:00:47 -08:00
Andrii Nakryiko
fa29cc01ff sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   858e284f0ec18bff2620d9a6afe764dc683f8ba1
Checkpoint bpf-next commit: 20f21d98cf12b8ecd69e8defc93fae9e3b353b13
Baseline bpf commit:        1712b2fff8c682d145c7889d2290696647d82dab
Checkpoint bpf commit:      1712b2fff8c682d145c7889d2290696647d82dab

Andrii Nakryiko (1):
  libbpf: Revert bpf_helper_defs.h inclusion regression

 src/bpf_helpers.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
2.17.1
2020-01-16 20:06:41 -08:00
Andrii Nakryiko
a2bec08412 libbpf: Revert bpf_helper_defs.h inclusion regression
Revert bpf_helpers.h's change to include auto-generated bpf_helper_defs.h
through <> instead of "", which causes it to be searched in include path. This
can break existing applications that don't have their include path pointing
directly to where libbpf installs its headers.

There is ongoing work to make all (not just bpf_helper_defs.h) includes more
consistent across libbpf and its consumers, but this unbreaks user code as is
right now without any regressions. Selftests still behave sub-optimally
(taking bpf_helper_defs.h from libbpf's source directory, if it's present
there), which will be fixed in subsequent patches.

Fixes: 6910d7d3867a ("selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir")
Reported-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200117004103.148068-1-andriin@fb.com
2020-01-16 20:06:41 -08:00
Andrii Nakryiko
080fd68e9c sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   1d1a3bcffe360a56fd8cc287ed74d4c3066daf42
Checkpoint bpf-next commit: 858e284f0ec18bff2620d9a6afe764dc683f8ba1
Baseline bpf commit:        e7a5f1f1cd0008e5ad379270a8657e121eedb669
Checkpoint bpf commit:      1712b2fff8c682d145c7889d2290696647d82dab

Andrii Nakryiko (2):
  tools: Sync uapi/linux/if_link.h
  libbpf: Support .text sub-calls relocations

Brian Vazquez (1):
  libbpf: Fix unneeded extra initialization in bpf_map_batch_common

Martin KaFai Lau (1):
  libbpf: Expose bpf_find_kernel_btf as a LIBBPF_API

Yonghong Song (3):
  bpf: Add bpf_send_signal_thread() helper
  tools/bpf: Sync uapi header bpf.h
  libbpf: Add libbpf support to batch ops

 include/uapi/linux/bpf.h     |  40 +++++++++++-
 include/uapi/linux/if_link.h |   1 +
 src/bpf.c                    |  58 +++++++++++++++++
 src/bpf.h                    |  22 +++++++
 src/btf.c                    | 102 +++++++++++++++++++++++++++--
 src/btf.h                    |   2 +
 src/libbpf.c                 | 122 +++++++----------------------------
 src/libbpf.map               |   5 ++
 8 files changed, 247 insertions(+), 105 deletions(-)

--
2.17.1
2020-01-16 10:38:48 -08:00
Andrii Nakryiko
437f57042c sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-01-16 10:38:48 -08:00
Brian Vazquez
f4f271b068 libbpf: Fix unneeded extra initialization in bpf_map_batch_common
bpf_attr doesn't required to be declared with '= {}' as memset is used
in the code.

Fixes: 2ab3d86ea1859 ("libbpf: Add libbpf support to batch ops")
Reported-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200116045918.75597-1-brianvv@google.com
2020-01-16 10:38:48 -08:00
Martin KaFai Lau
bd35a43bb3 libbpf: Expose bpf_find_kernel_btf as a LIBBPF_API
This patch exposes bpf_find_kernel_btf() as a LIBBPF_API.
It will be used in 'bpftool map dump' in a following patch
to dump a map with btf_vmlinux_value_type_id set.

bpf_find_kernel_btf() is renamed to libbpf_find_kernel_btf()
and moved to btf.c.  As <linux/kernel.h> is included,
some of the max/min type casting needs to be fixed.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200115230031.1102305-1-kafai@fb.com
2020-01-16 10:38:48 -08:00
Yonghong Song
d91f681d3b libbpf: Add libbpf support to batch ops
Added four libbpf API functions to support map batch operations:
  . int bpf_map_delete_batch( ... )
  . int bpf_map_lookup_batch( ... )
  . int bpf_map_lookup_and_delete_batch( ... )
  . int bpf_map_update_batch( ... )

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-8-brianvv@google.com
2020-01-16 10:38:48 -08:00
Yonghong Song
1e51491d05 tools/bpf: Sync uapi header bpf.h
sync uapi header include/uapi/linux/bpf.h to
tools/include/uapi/linux/bpf.h

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-7-brianvv@google.com
2020-01-16 10:38:48 -08:00
Andrii Nakryiko
37440e95d1 libbpf: Support .text sub-calls relocations
The LLVM patch https://reviews.llvm.org/D72197 makes LLVM emit function call
relocations within the same section. This includes a default .text section,
which contains any BPF sub-programs. This wasn't the case before and so libbpf
was able to get a way with slightly simpler handling of subprogram call
relocations.

This patch adds support for .text section relocations. It needs to ensure
correct order of relocations, so does two passes:
- first, relocate .text instructions, if there are any relocations in it;
- then process all the other programs and copy over patched .text instructions
for all sub-program calls.

v1->v2:
- break early once .text program is processed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115190856.2391325-1-andriin@fb.com
2020-01-16 10:38:48 -08:00
Yonghong Song
681f2f9291 bpf: Add bpf_send_signal_thread() helper
Commit 8b401f9ed244 ("bpf: implement bpf_send_signal() helper")
added helper bpf_send_signal() which permits bpf program to
send a signal to the current process. The signal may be
delivered to any threads in the process.

We found a use case where sending the signal to the current
thread is more preferable.
  - A bpf program will collect the stack trace and then
    send signal to the user application.
  - The user application will add some thread specific
    information to the just collected stack trace for
    later analysis.

If bpf_send_signal() is used, user application will need
to check whether the thread receiving the signal matches
the thread collecting the stack by checking thread id.
If not, it will need to send signal to another thread
through pthread_kill().

This patch proposed a new helper bpf_send_signal_thread(),
which sends the signal to the thread corresponding to
the current kernel task. This way, user space is guaranteed that
bpf_program execution context and user space signal handling
context are the same thread.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115035002.602336-1-yhs@fb.com
2020-01-16 10:38:48 -08:00
Andrii Nakryiko
0e4638ec14 tools: Sync uapi/linux/if_link.h
Sync uapi/linux/if_link.h into tools to avoid out of sync warnings during
libbpf build.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200113073143.1779940-2-andriin@fb.com
2020-01-16 10:38:48 -08:00
hex
234a45a128 Update README with Distribution section
- List of current distros having libbpf packaged from GH
- Rationale of having libbpf packaged from GH
- List of package dependencies
2020-01-14 16:12:42 -08:00
Andrii Nakryiko
8687395198 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2bbc078f812d45b8decb55935dab21199bd21489
Checkpoint bpf-next commit: 1d1a3bcffe360a56fd8cc287ed74d4c3066daf42
Baseline bpf commit:        3c2f450e553ce47fcb0d6141807a8858e3213c9c
Checkpoint bpf commit:      e7a5f1f1cd0008e5ad379270a8657e121eedb669

Alexei Starovoitov (1):
  libbpf: Sanitize global functions

Andrey Ignatov (1):
  bpf: Document BPF_F_QUERY_EFFECTIVE flag

Andrii Nakryiko (3):
  libbpf: Make bpf_map order and indices stable
  selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir
  libbpf: Poison kernel-only integer types

Martin KaFai Lau (2):
  bpf: Synch uapi bpf.h to tools/
  bpf: libbpf: Add STRUCT_OPS support

Michal Rostecki (1):
  libbpf: Add probe for large INSN limit

 include/uapi/linux/bpf.h |  26 +-
 include/uapi/linux/btf.h |   6 +
 src/bpf.c                |  13 +-
 src/bpf.h                |   5 +-
 src/bpf_helpers.h        |   2 +-
 src/bpf_prog_linfo.c     |   3 +
 src/btf.c                |   3 +
 src/btf_dump.c           |   3 +
 src/hashmap.c            |   3 +
 src/libbpf.c             | 701 +++++++++++++++++++++++++++++++++++++--
 src/libbpf.h             |   6 +-
 src/libbpf.map           |   4 +
 src/libbpf_errno.c       |   3 +
 src/libbpf_probes.c      |  26 ++
 src/netlink.c            |   3 +
 src/nlattr.c             |   3 +
 src/str_error.c          |   3 +
 src/xsk.c                |   3 +
 18 files changed, 784 insertions(+), 32 deletions(-)

--
2.17.1
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
0cccc9ff28 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
b50eb28758 libbpf: Poison kernel-only integer types
It's been a recurring issue with types like u32 slipping into libbpf source
code accidentally. This is not detected during builds inside kernel source
tree, but becomes a compilation error in libbpf's Github repo. Libbpf is
supposed to use only __{s,u}{8,16,32,64} typedefs, so poison {s,u}{8,16,32,64}
explicitly in every .c file. Doing that in a bit more centralized way, e.g.,
inside libbpf_internal.h breaks selftests, which are both using kernel u32 and
libbpf_internal.h.

This patch also fixes a new u32 occurence in libbpf.c, added recently.

Fixes: 590a00888250 ("bpf: libbpf: Add STRUCT_OPS support")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200110181916.271446-1-andriin@fb.com
2020-01-10 11:15:12 -08:00
Alexei Starovoitov
8d936a1570 libbpf: Sanitize global functions
In case the kernel doesn't support BTF_FUNC_GLOBAL sanitize BTF produced by the
compiler for global functions.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200110064124.1760511-2-ast@kernel.org
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
2c8602eb54 selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir
Reorder includes search path to ensure $(OUTPUT) and $(CURDIR) go before
libbpf's directory. Also fix bpf_helpers.h to include bpf_helper_defs.h in
such a way as to leverage includes search path. This allows selftests to not
use libbpf's local and potentially stale bpf_helper_defs.h. It's important
because selftests/bpf's Makefile only re-generates bpf_helper_defs.h in
seltests' output directory, not the one in libbpf's directory.

Also force regeneration of bpf_helper_defs.h when libbpf.a is updated to
reduce staleness.

Fixes: fa633a0f8919 ("libbpf: Fix build on read-only filesystems")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200110051716.1591485-3-andriin@fb.com
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
1ef23426e7 libbpf: Make bpf_map order and indices stable
Currently, libbpf re-sorts bpf_map structs after all the maps are added and
initialized, which might change their relative order and invalidate any
bpf_map pointer or index taken before that. This is inconvenient and
error-prone. For instance, it can cause .kconfig map index to point to a wrong
map.

Furthermore, libbpf itself doesn't rely on any specific ordering of bpf_maps,
so it's just an unnecessary complication right now. This patch drops sorting
of maps and makes their relative positions fixed. If efficient index is ever
needed, it's better to have a separate array of pointers as a search index,
instead of reordering bpf_map struct in-place. This will be less error-prone
and will allow multiple independent orderings, if necessary (e.g., either by
section index or by name).

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Reported-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200110034247.1220142-1-andriin@fb.com
2020-01-10 11:15:12 -08:00
Andrey Ignatov
d2100072b9 bpf: Document BPF_F_QUERY_EFFECTIVE flag
Document BPF_F_QUERY_EFFECTIVE flag, mostly to clarify how it affects
attach_flags what may not be obvious and what may lead to confision.

Specifically attach_flags is returned only for target_fd but if programs
are inherited from an ancestor cgroup then returned attach_flags for
current cgroup may be confusing. For example, two effective programs of
same attach_type can be returned but w/o BPF_F_ALLOW_MULTI in
attach_flags.

Simple repro:
  # bpftool c s /sys/fs/cgroup/path/to/task
  ID       AttachType      AttachFlags     Name
  # bpftool c s /sys/fs/cgroup/path/to/task effective
  ID       AttachType      AttachFlags     Name
  95043    ingress                         tw_ipt_ingress
  95048    ingress                         tw_ingress

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200108014006.938363-1-rdna@fb.com
2020-01-10 11:15:12 -08:00
Martin KaFai Lau
f3edca46e5 bpf: libbpf: Add STRUCT_OPS support
This patch adds BPF STRUCT_OPS support to libbpf.

The only sec_name convention is SEC(".struct_ops") to identify the
struct_ops implemented in BPF,
e.g. To implement a tcp_congestion_ops:

SEC(".struct_ops")
struct tcp_congestion_ops dctcp = {
	.init           = (void *)dctcp_init,  /* <-- a bpf_prog */
	/* ... some more func prts ... */
	.name           = "bpf_dctcp",
};

Each struct_ops is defined as a global variable under SEC(".struct_ops")
as above.  libbpf creates a map for each variable and the variable name
is the map's name.  Multiple struct_ops is supported under
SEC(".struct_ops").

In the bpf_object__open phase, libbpf will look for the SEC(".struct_ops")
section and find out what is the btf-type the struct_ops is
implementing.  Note that the btf-type here is referring to
a type in the bpf_prog.o's btf.  A "struct bpf_map" is added
by bpf_object__add_map() as other maps do.  It will then
collect (through SHT_REL) where are the bpf progs that the
func ptrs are referring to.  No btf_vmlinux is needed in
the open phase.

In the bpf_object__load phase, the map-fields, which depend
on the btf_vmlinux, are initialized (in bpf_map__init_kern_struct_ops()).
It will also set the prog->type, prog->attach_btf_id, and
prog->expected_attach_type.  Thus, the prog's properties do
not rely on its section name.
[ Currently, the bpf_prog's btf-type ==> btf_vmlinux's btf-type matching
  process is as simple as: member-name match + btf-kind match + size match.
  If these matching conditions fail, libbpf will reject.
  The current targeting support is "struct tcp_congestion_ops" which
  most of its members are function pointers.
  The member ordering of the bpf_prog's btf-type can be different from
  the btf_vmlinux's btf-type. ]

Then, all obj->maps are created as usual (in bpf_object__create_maps()).

Once the maps are created and prog's properties are all set,
the libbpf will proceed to load all the progs.

bpf_map__attach_struct_ops() is added to register a struct_ops
map to a kernel subsystem.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200109003514.3856730-1-kafai@fb.com
2020-01-10 11:15:12 -08:00
Martin KaFai Lau
cabb077325 bpf: Synch uapi bpf.h to tools/
This patch sync uapi bpf.h to tools/

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200109003512.3856559-1-kafai@fb.com
2020-01-10 11:15:12 -08:00
Michal Rostecki
b95b281039 libbpf: Add probe for large INSN limit
Introduce a new probe which checks whether kernel has large maximum
program size which was increased in the following commit:

c04c0d2b968a ("bpf: increase complexity limit and maximum program size")

Based on the similar check in Cilium[0], authored by Daniel Borkmann.

  [0] 657d0f585a

Co-authored-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Link: https://lore.kernel.org/bpf/20200108162428.25014-2-mrostecki@opensuse.org
2020-01-10 11:15:12 -08:00
Andrii Nakryiko
5033d7177e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   7745ff9842617323adbe24e71c495d5ebd9aa352
Checkpoint bpf-next commit: 2bbc078f812d45b8decb55935dab21199bd21489
Baseline bpf commit:        1148f9adbe71415836a18a36c1b4ece999ab0973
Checkpoint bpf commit:      3c2f450e553ce47fcb0d6141807a8858e3213c9c

Andrey Ignatov (2):
  bpf: Support replacing cgroup-bpf program in MULTI mode
  libbpf: Introduce bpf_prog_attach_xattr

Andrii Nakryiko (1):
  libbpf: Support CO-RE relocations for LDX/ST/STX instructions

 include/uapi/linux/bpf.h | 10 ++++++++++
 src/bpf.c                | 17 ++++++++++++++++-
 src/bpf.h                | 11 +++++++++++
 src/libbpf.c             | 31 ++++++++++++++++++++++++++++---
 src/libbpf.map           |  1 +
 5 files changed, 66 insertions(+), 4 deletions(-)

--
2.17.1
2019-12-30 12:05:19 -08:00
Andrii Nakryiko
49058f8c6f libbpf: Support CO-RE relocations for LDX/ST/STX instructions
Clang patch [0] enables emitting relocatable generic ALU/ALU64 instructions
(i.e, shifts and arithmetic operations), as well as generic load/store
instructions. The former ones are already supported by libbpf as is. This
patch adds further support for load/store instructions. Relocatable field
offset is encoded in BPF instruction's 16-bit offset section and are adjusted
by libbpf based on target kernel BTF.

These Clang changes and corresponding libbpf changes allow for more succinct
generated BPF code by encoding relocatable field reads as a single
ST/LDX/STX instruction. It also enables relocatable access to BPF context.
Previously, if context struct (e.g., __sk_buff) was accessed with CO-RE
relocations (e.g., due to preserve_access_index attribute), it would be
rejected by BPF verifier due to modified context pointer dereference. With
Clang patch, such context accesses are both relocatable and have a fixed
offset from the point of view of BPF verifier.

  [0] https://reviews.llvm.org/D71790

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191223180305.86417-1-andriin@fb.com
2019-12-30 12:05:19 -08:00
Andrey Ignatov
8b20ffa4b9 libbpf: Introduce bpf_prog_attach_xattr
Introduce a new bpf_prog_attach_xattr function that, in addition to
program fd, target fd and attach type, accepts an extendable struct
bpf_prog_attach_opts.

bpf_prog_attach_opts relies on DECLARE_LIBBPF_OPTS macro to maintain
backward and forward compatibility and has the following "optional"
attach attributes:

* existing attach_flags, since it's not required when attaching in NONE
  mode. Even though it's quite often used in MULTI and OVERRIDE mode it
  seems to be a good idea to reduce number of arguments to
  bpf_prog_attach_xattr;

* newly introduced attribute of BPF_PROG_ATTACH command: replace_prog_fd
  that is fd of previously attached cgroup-bpf program to replace if
  BPF_F_REPLACE flag is used.

The new function is named to be consistent with other xattr-functions
(bpf_prog_test_run_xattr, bpf_create_map_xattr, bpf_load_program_xattr).

The struct bpf_prog_attach_opts is supposed to be used with
DECLARE_LIBBPF_OPTS macro.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/bd6e0732303eb14e4b79cb128268d9e9ad6db208.1576741281.git.rdna@fb.com
2019-12-30 12:05:19 -08:00
Andrey Ignatov
1b1e30679f bpf: Support replacing cgroup-bpf program in MULTI mode
The common use-case in production is to have multiple cgroup-bpf
programs per attach type that cover multiple use-cases. Such programs
are attached with BPF_F_ALLOW_MULTI and can be maintained by different
people.

Order of programs usually matters, for example imagine two egress
programs: the first one drops packets and the second one counts packets.
If they're swapped the result of counting program will be different.

It brings operational challenges with updating cgroup-bpf program(s)
attached with BPF_F_ALLOW_MULTI since there is no way to replace a
program:

* One way to update is to detach all programs first and then attach the
  new version(s) again in the right order. This introduces an
  interruption in the work a program is doing and may not be acceptable
  (e.g. if it's egress firewall);

* Another way is attach the new version of a program first and only then
  detach the old version. This introduces the time interval when two
  versions of same program are working, what may not be acceptable if a
  program is not idempotent. It also imposes additional burden on
  program developers to make sure that two versions of their program can
  co-exist.

Solve the problem by introducing a "replace" mode in BPF_PROG_ATTACH
command for cgroup-bpf programs being attached with BPF_F_ALLOW_MULTI
flag. This mode is enabled by newly introduced BPF_F_REPLACE attach flag
and bpf_attr.replace_bpf_fd attribute to pass fd of the old program to
replace

That way user can replace any program among those attached with
BPF_F_ALLOW_MULTI flag without the problems described above.

Details of the new API:

* If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid
  descriptor of BPF program, BPF_PROG_ATTACH will return corresponding
  error (EINVAL or EBADF).

* If replace_bpf_fd has valid descriptor of BPF program but such a
  program is not attached to specified cgroup, BPF_PROG_ATTACH will
  return ENOENT.

BPF_F_REPLACE is introduced to make the user intent clear, since
replace_bpf_fd alone can't be used for this (its default value, 0, is a
valid fd). BPF_F_REPLACE also makes it possible to extend the API in the
future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed).

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Narkyiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com
2019-12-30 12:05:19 -08:00
Andrii Nakryiko
e7a82fc033 sync: add zlib dependency and libbpf_common.h to list of installed headers
zlib is now a direct dependency of libbpf (previously zlib was only dependency
of libelf, on which libbpf depends as well). For non-pkg-config case, specify
`-lz` compiler flag explicitly.

Recent sync also added another public header to libbpf. Include it in a list
of headers that are installed on target system.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
7c5583ab2d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   679152d3a32e305c213f83160c328c37566ae8bc
Checkpoint bpf-next commit: 7745ff9842617323adbe24e71c495d5ebd9aa352
Baseline bpf commit:        fe3300897cbfd76c6cb825776e5ac0ca50a91ca4
Checkpoint bpf commit:      1148f9adbe71415836a18a36c1b4ece999ab0973

Andrii Nakryiko (26):
  libbpf: Extract and generalize CPU mask parsing logic
  libbpf: Don't attach perf_buffer to offline/missing CPUs
  libbpf: Don't require root for bpf_object__open()
  libbpf: Add generic bpf_program__attach()
  libbpf: Move non-public APIs from libbpf.h to libbpf_internal.h
  libbpf: Add BPF_EMBED_OBJ macro for embedding BPF .o files
  libbpf: Extract common user-facing helpers
  libbpf: Expose btf__align_of() API
  libbpf: Expose BTF-to-C type declaration emitting API
  libbpf: Expose BPF program's function name
  libbpf: Refactor global data map initialization
  libbpf: Postpone BTF ID finding for TRACING programs to load phase
  libbpf: Reduce log level of supported section names dump
  libbpf: Add BPF object skeleton support
  libbpf: Extract internal map names into constants
  libbpf: Support libbpf-provided extern variables
  bpftool: Generate externs datasec in BPF skeleton
  libbpf: Support flexible arrays in CO-RE
  libbpf: Add zlib as a dependency in pkg-config template
  libbpf: Reduce log level for custom section names
  libbpf: Remove BPF_EMBED_OBJ macro from libbpf.h
  libbpf: Add bpf_link__disconnect() API to preserve underlying BPF
    resource
  libbpf: Put Kconfig externs into .kconfig section
  libbpf: Allow to augment system Kconfig through extra optional config
  libbpf: BTF is required when externs are present
  libbpf: Fix another __u64 printf warning

Jakub Sitnicki (1):
  libbpf: Recognize SK_REUSEPORT programs from section name

Prashant Bhole (1):
  libbpf: Fix build by renaming variables

Toke Høiland-Jørgensen (4):
  libbpf: Print hint about ulimit when getting permission denied error
  libbpf: Fix libbpf_common.h when installing libbpf through 'make
    install'
  libbpf: Add missing newline in opts validation macro
  libbpf: Fix printing of ulimit value

 include/uapi/linux/btf.h |    7 +-
 src/bpf.h                |    6 +-
 src/bpf_helpers.h        |   11 +
 src/btf.c                |   48 +-
 src/btf.h                |   29 +-
 src/btf_dump.c           |  115 ++-
 src/libbpf.c             | 1673 ++++++++++++++++++++++++++++++++------
 src/libbpf.h             |  107 +--
 src/libbpf.map           |   12 +
 src/libbpf.pc.template   |    2 +-
 src/libbpf_common.h      |   40 +
 src/libbpf_internal.h    |   21 +-
 12 files changed, 1678 insertions(+), 393 deletions(-)
 create mode 100644 src/libbpf_common.h

--
2.17.1
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
0d9d85e345 libbpf: Fix another __u64 printf warning
Fix yet another printf warning for %llu specifier on ppc64le. This time size_t
casting won't work, so cast to verbose `unsigned long long`.

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191219052103.3515-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen
c8c4edf4c9 libbpf: Fix printing of ulimit value
Naresh pointed out that libbpf builds fail on 32-bit architectures because
rlimit.rlim_cur is defined as 'unsigned long long' on those architectures.
Fix this by using %zu in printf and casting to size_t.

Fixes: dc3a2d254782 ("libbpf: Print hint about ulimit when getting permission denied error")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191219090236.905059-1-toke@redhat.com
2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen
23983fd75b libbpf: Add missing newline in opts validation macro
The error log output in the opts validation macro was missing a newline.

Fixes: 2ce8450ef5a3 ("libbpf: add bpf_object__open_{file, mem} w/ extensible opts")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191219120714.928380-1-toke@redhat.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
4bbdefdce1 libbpf: BTF is required when externs are present
BTF is required to get type information about extern variables.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191219002837.3074619-4-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
5bc09e54fa libbpf: Allow to augment system Kconfig through extra optional config
Instead of all or nothing approach of overriding Kconfig file location, allow
to extend it with extra values and override chosen subset of values though
optional user-provided extra config, passed as a string through open options'
.kconfig option. If same config key is present in both user-supplied config
and Kconfig, user-supplied one wins. This allows applications to more easily
test various conditions despite host kernel's real configuration. If all of
BPF object's __kconfig externs are satisfied from user-supplied config, system
Kconfig won't be read at all.

Simplify selftests by not needing to create temporary Kconfig files.

Suggested-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191219002837.3074619-3-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
9f61b5b95c libbpf: Put Kconfig externs into .kconfig section
Move Kconfig-provided externs into custom .kconfig section. Add __kconfig into
bpf_helpers.h for user convenience. Update selftests accordingly.

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191219002837.3074619-2-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
baa3268b13 libbpf: Add bpf_link__disconnect() API to preserve underlying BPF resource
There are cases in which BPF resource (program, map, etc) has to outlive
userspace program that "installed" it in the system in the first place.
When BPF program is attached, libbpf returns bpf_link object, which
is supposed to be destroyed after no longer necessary through
bpf_link__destroy() API. Currently, bpf_link destruction causes both automatic
detachment and frees up any resources allocated to for bpf_link in-memory
representation. This is inconvenient for the case described above because of
coupling of detachment and resource freeing.

This patch introduces bpf_link__disconnect() API call, which marks bpf_link as
disconnected from its underlying BPF resouces. This means that when bpf_link
is destroyed later, all its memory resources will be freed, but BPF resource
itself won't be detached.

This design allows to follow strict and resource-leak-free design by default,
while giving easy and straightforward way for user code to opt for keeping BPF
resource attached beyond lifetime of a bpf_link. For some BPF programs (i.e.,
FS-based tracepoints, kprobes, raw tracepoint, etc), user has to make sure to
pin BPF program to prevent kernel to automatically detach it on process exit.
This should typically be achived by pinning BPF program (or map in some cases)
in BPF FS.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191218225039.2668205-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
f5599ef856 libbpf: Remove BPF_EMBED_OBJ macro from libbpf.h
Drop BPF_EMBED_OBJ and struct bpf_embed_data now that skeleton automatically
embeds contents of its source object file. While BPF_EMBED_OBJ is useful
independently of skeleton, we are currently don't have any use cases utilizing
it, so let's remove them until/if we need it.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191218052552.2915188-3-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
99c65fed78 libbpf: Reduce log level for custom section names
Libbpf is trying to recognize BPF program type based on its section name
during bpf_object__open() phase. This is not strictly enforced and user code
has ability to specify/override correct BPF program type after open.  But if
BPF program is using custom section name, libbpf will still emit warnings,
which can be quite annoying to users. This patch reduces log level of
information messages emitted by libbpf if section name is not canonical. User
can still get a list of all supported section names as debug-level message.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191217234228.1739308-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen
e9adfa851f libbpf: Fix libbpf_common.h when installing libbpf through 'make install'
This fixes two issues with the newly introduced libbpf_common.h file:

- The header failed to include <string.h> for the definition of memset()
- The new file was not included in the install_headers rule in the Makefile

Both of these issues cause breakage when installing libbpf with 'make
install' and trying to use it in applications.

Fixes: 544402d4b493 ("libbpf: Extract common user-facing helpers")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191217112810.768078-1-toke@redhat.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
8363b8d4e6 libbpf: Add zlib as a dependency in pkg-config template
List zlib as another dependency of libbpf in pkg-config template.
Verified it is correctly resolved to proper -lz flag:

$ make DESTDIR=/tmp/libbpf-install install
$ pkg-config --libs /tmp/libbpf-install/usr/local/lib64/pkgconfig/libbpf.pc
-L/usr/local/lib64 -lbpf
$ pkg-config --libs --static /tmp/libbpf-install/usr/local/lib64/pkgconfig/libbpf.pc
-L/usr/local/lib64 -lbpf -lelf -lz

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Luca Boccassi <bluca@debian.org>
Link: https://lore.kernel.org/bpf/20191216183830.3972964-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen
f892b464d0 libbpf: Print hint about ulimit when getting permission denied error
Probably the single most common error newcomers to XDP are stumped by is
the 'permission denied' error they get when trying to load their program
and 'ulimit -l' is set too low. For examples, see [0], [1].

Since the error code is UAPI, we can't change that. Instead, this patch
adds a few heuristics in libbpf and outputs an additional hint if they are
met: If an EPERM is returned on map create or program load, and geteuid()
shows we are root, and the current RLIMIT_MEMLOCK is not infinity, we
output a hint about raising 'ulimit -l' as an additional log line.

[0] https://marc.info/?l=xdp-newbies&m=157043612505624&w=2
[1] https://github.com/xdp-project/xdp-tutorial/issues/86

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191216181204.724953-1-toke@redhat.com
2019-12-19 15:34:27 -08:00
Prashant Bhole
a4132d1590 libbpf: Fix build by renaming variables
In btf__align_of() variable name 't' is shadowed by inner block
declaration of another variable with same name. Patch renames
variables in order to fix it.

  CC       sharedobjs/btf.o
btf.c: In function ‘btf__align_of’:
btf.c:303:21: error: declaration of ‘t’ shadows a previous local [-Werror=shadow]
  303 |   int i, align = 1, t;
      |                     ^
btf.c:283:25: note: shadowed declaration is here
  283 |  const struct btf_type *t = btf__type_by_id(btf, id);
      |

Fixes: 3d208f4ca111 ("libbpf: Expose btf__align_of() API")
Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20191216082738.28421-1-prashantbhole.linux@gmail.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
303916a126 libbpf: Support flexible arrays in CO-RE
Some data stuctures in kernel are defined with either zero-sized array or
flexible (dimensionless) array at the end of a struct. Actual data of such
array follows in memory immediately after the end of that struct, forming its
variable-sized "body" of elements. Support such access pattern in CO-RE
relocation handling.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191215070844.1014385-2-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
8eea7ed8e8 bpftool: Generate externs datasec in BPF skeleton
Add support for generation of mmap()-ed read-only view of libbpf-provided
extern variables. As externs are not supposed to be provided by user code
(that's what .data, .bss, and .rodata is for), don't mmap() it initially. Only
after skeleton load is performed, map .extern contents as read-only memory.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-4-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
fa030ffd20 libbpf: Support libbpf-provided extern variables
Add support for extern variables, provided to BPF program by libbpf. Currently
the following extern variables are supported:
  - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
    executing, follows KERNEL_VERSION() macro convention, can be 4- and 8-byte
    long;
  - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
    boolean, strings, and integer values are supported.

Set of possible values is determined by declared type of extern variable.
Supported types of variables are:
- Tristate values. Are represented as `enum libbpf_tristate`. Accepted values
  are **strictly** 'y', 'n', or 'm', which are represented as TRI_YES, TRI_NO,
  or TRI_MODULE, respectively.
- Boolean values. Are represented as bool (_Bool) types. Accepted values are
  'y' and 'n' only, turning into true/false values, respectively.
- Single-character values. Can be used both as a substritute for
  bool/tristate, or as a small-range integer:
  - 'y'/'n'/'m' are represented as is, as characters 'y', 'n', or 'm';
  - integers in a range [-128, 127] or [0, 255] (depending on signedness of
    char in target architecture) are recognized and represented with
    respective values of char type.
- Strings. String values are declared as fixed-length char arrays. String of
  up to that length will be accepted and put in first N bytes of char array,
  with the rest of bytes zeroed out. If config string value is longer than
  space alloted, it will be truncated and warning message emitted. Char array
  is always zero terminated. String literals in config have to be enclosed in
  double quotes, just like C-style string literals.
- Integers. 8-, 16-, 32-, and 64-bit integers are supported, both signed and
  unsigned variants. Libbpf enforces parsed config value to be in the
  supported range of corresponding integer type. Integers values in config can
  be:
  - decimal integers, with optional + and - signs;
  - hexadecimal integers, prefixed with 0x or 0X;
  - octal integers, starting with 0.

Config file itself is searched in /boot/config-$(uname -r) location with
fallback to /proc/config.gz, unless config path is specified explicitly
through bpf_object_open_opts' kernel_config_path option. Both gzipped and
plain text formats are supported. Libbpf adds explicit dependency on zlib
because of this, but this shouldn't be a problem, given libelf already depends
on zlib.

All detected extern variables, are put into a separate .extern internal map.
It, similarly to .rodata map, is marked as read-only from BPF program side, as
well as is frozen on load. This allows BPF verifier to track extern values as
constants and perform enhanced branch prediction and dead code elimination.
This can be relied upon for doing kernel version/feature detection and using
potentially unsupported field relocations or BPF helpers in a CO-RE-based BPF
program, while still having a single version of BPF program running on old and
new kernels. Selftests are validating this explicitly for unexisting BPF
helper.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-3-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
ea06bc30fa libbpf: Extract internal map names into constants
Instead of duplicating string literals, keep them in one place and consistent.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-2-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
531ac0e65f libbpf: Add BPF object skeleton support
Add new set of APIs, allowing to open/load/attach BPF object through BPF
object skeleton, generated by bpftool for a specific BPF object file. All the
xxx_skeleton() APIs wrap up corresponding bpf_object_xxx() APIs, but
additionally also automate map/program lookups by name, global data
initialization and mmap()-ing, etc.  All this greatly improves and simplifies
userspace usability of working with BPF programs. See follow up patches for
examples.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-13-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
e35cb347ce libbpf: Reduce log level of supported section names dump
It's quite spammy. And now that bpf_object__open() is trying to determine
program type from its section name, we are getting these verbose messages all
the time. Reduce their log level to DEBUG.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-12-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
68fa3f0b57 libbpf: Postpone BTF ID finding for TRACING programs to load phase
Move BTF ID determination for BPF_PROG_TYPE_TRACING programs to a load phase.
Performing it at open step is inconvenient, because it prevents BPF skeleton
generation on older host kernel, which doesn't contain BTF_KIND_FUNCs
information in vmlinux BTF. This is a common set up, though, when, e.g.,
selftests are compiled on older host kernel, but the test program itself is
executed in qemu VM with bleeding edge kernel. Having this BTF searching
performed at load time allows to successfully use bpf_object__open() for
codegen and inspection of BPF object file.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-11-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
1c145f0fda libbpf: Refactor global data map initialization
Refactor global data map initialization to use anonymous mmap()-ed memory
instead of malloc()-ed one. This allows to do a transparent re-mmap()-ing of
already existing memory address to point to BPF map's memory after
bpf_object__load() step (done in follow up patch). This choreographed setup
allows to have a nice and unsurprising way to pre-initialize read-only (and
r/w as well) maps by user and after BPF map creation keep working with
mmap()-ed contents of this map. All in a way that doesn't require user code to
update any pointers: the illusion of working with memory contents is preserved
before and after actual BPF map instantiation.

Selftests and runqslower example demonstrate this feature in follow up patches.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-10-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
418c07226a libbpf: Expose BPF program's function name
Add APIs to get BPF program function name, as opposed to bpf_program__title(),
which returns BPF program function's section name. Function name has a benefit
of being a valid C identifier and uniquely identifies a specific BPF program,
while section name can be duplicated across multiple independent BPF programs.

Add also bpf_object__find_program_by_name(), similar to
bpf_object__find_program_by_title(), to facilitate looking up BPF programs by
their C function names.

Convert one of selftests to new API for look up.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-9-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
5ec0ba6530 libbpf: Expose BTF-to-C type declaration emitting API
Expose API that allows to emit type declaration and field/variable definition
(if optional field name is specified) in valid C syntax for any provided BTF
type. This is going to be used by bpftool when emitting data section layout as
a struct. As part of making this API useful in a stand-alone fashion, move
initialization of some of the internal btf_dump state to earlier phase.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-8-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
600ba1c5e1 libbpf: Expose btf__align_of() API
Expose BTF API that calculates type alignment requirements.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-7-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
aa73e35dc3 libbpf: Extract common user-facing helpers
LIBBPF_API and DECLARE_LIBBPF_OPTS are needed in many public libbpf API
headers. Extract them into libbpf_common.h to avoid unnecessary
interdependency between btf.h, libbpf.h, and bpf.h or code duplication.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-6-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
dca6176410 libbpf: Add BPF_EMBED_OBJ macro for embedding BPF .o files
Add a convenience macro BPF_EMBED_OBJ, which allows to embed other files
(typically used to embed BPF .o files) into a hosting userspace programs. To
C program it is exposed as struct bpf_embed_data, containing a pointer to
raw data and its size in bytes.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-5-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
6f88f26945 libbpf: Move non-public APIs from libbpf.h to libbpf_internal.h
Few libbpf APIs are not public but currently exposed through libbpf.h to be
used by bpftool. Move them to libbpf_internal.h, where intent of being
non-stable and non-public is much more obvious.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-4-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
f7af143516 libbpf: Add generic bpf_program__attach()
Generalize BPF program attaching and allow libbpf to auto-detect type (and
extra parameters, where applicable) and attach supported BPF program types
based on program sections. Currently this is supported for:
- kprobe/kretprobe;
- tracepoint;
- raw tracepoint;
- tracing programs (typed raw TP/fentry/fexit).

More types support can be trivially added within this framework.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-3-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
4f3c7b3e13 libbpf: Don't require root for bpf_object__open()
Reorganize bpf_object__open and bpf_object__load steps such that
bpf_object__open doesn't need root access. This was previously done for
feature probing and BTF sanitization. This doesn't have to happen on open,
though, so move all those steps into the load phase.

This is important, because it makes it possible for tools like bpftool, to
just open BPF object file and inspect their contents: programs, maps, BTF,
etc. For such operations it is prohibitive to require root access. On the
other hand, there is a lot of custom libbpf logic in those steps, so its best
avoided for tools to reimplement all that on their own.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-2-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
b85e83f6cb libbpf: Don't attach perf_buffer to offline/missing CPUs
It's quite common on some systems to have more CPUs enlisted as "possible",
than there are (and could ever be) present/online CPUs. In such cases,
perf_buffer creationg will fail due to inability to create perf event on
missing CPU with error like this:

libbpf: failed to open perf buffer event on cpu #16: No such device

This patch fixes the logic of perf_buffer__new() to ignore CPUs that are
missing or currently offline. In rare cases where user explicitly listed
specific CPUs to connect to, behavior is unchanged: libbpf will try to open
perf event buffer on specified CPU(s) anyways.

Fixes: fb84b8224655 ("libbpf: add perf buffer API")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212013609.1691168-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Andrii Nakryiko
33d1fbea57 libbpf: Extract and generalize CPU mask parsing logic
This logic is re-used for parsing a set of online CPUs. Having it as an
isolated piece of code working with input string makes it conveninent to test
this logic as well. While refactoring, also improve the robustness of original
implementation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212013548.1690564-1-andriin@fb.com
2019-12-19 15:34:27 -08:00
Jakub Sitnicki
b234d12c97 libbpf: Recognize SK_REUSEPORT programs from section name
Allow loading BPF object files that contain SK_REUSEPORT programs without
having to manually set the program type before load if the the section name
is set to "sk_reuseport".

Makes user-space code needed to load SK_REUSEPORT BPF program more concise.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212102259.418536-2-jakub@cloudflare.com
2019-12-19 15:34:27 -08:00
hex
7a1d185108 libbpf: fix Coverity scan CI
A follow up of [1]
Travis CI stages use default phases when no override provided.
This leads to Coverity scan stage fail due to execuing the default
before_script: phase of VMTEST.
Fix this with an explicit override with empty value.

[1] https://github.com/libbpf/libbpf/pull/108
2019-12-17 16:46:57 -08:00
hex
76d5bb6a13 libbpf: Add VMTEST to CI
Extend continuous integration tests by adding testing against various kernel
versions.
The code is based on vmtest CI scripts implemented by osandov@
for drgn [1] with the following modifications:
- The downloadables are stored in Amazon S3 cloud indexed in [2]
- `--setup-cmd` command line option is added to vmtest/run.sh so
  setup commands run on VM boot can be set in e.g. `.travis.yml`
- Travis build matrix [2] is introduced for VM tests so VM tests are
  followed by the existing CI tests. The matrix has `KERNEL` and
  `VMTEST_SETUPCMD` dimensions.
- Minor style fixes.

The vmtest extention code is located in travis-ci/vmtest and contains
`run.sh` and `setup_example.sh`
- `run.sh` is responsible for the vmtest workflow: downloading vmlinux
  and rootfs image from the cloud, fs mounting, syncing libbpf sources
  to the image, setting up scripts run on VM boot, starting VM using
  QEMU.
  `run.sh` covers more use cases than a script for a job run in TravisCI,
  e.g. int can build a kernel w/ `--build` option.

- `setup_example.sh` is an example of a script run in VM which can be
  modified to e.g. run actual libbpf tests. A setup script should have
  executable permission.

To set up a new kernel version for a test:
1) upload vmlinuz.* and vmlinux.*\.zst to Amazon S3 store
located at [4];
2) modify INDEX [2] file.

[1] https://github.com/osandov/drgn
[2] https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/INDEX
[3] https://docs.travis-ci.com/user/build-matrix
[4] https://libbpf-vmtest.s3-us-west-1.amazonaws.com/
2019-12-16 21:04:03 -08:00
Frantisek Sumsal
c42bfcbf0e travis: build on ppc64le as well 2019-12-13 01:04:46 -08:00
Andrii Nakryiko
c2fc7c15a3 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   e7096c131e5161fa3b8e52a650d7719d2857adfd
Checkpoint bpf-next commit: 679152d3a32e305c213f83160c328c37566ae8bc
Baseline bpf commit:        e42617b825f8073569da76dc4510bfa019b1c35a
Checkpoint bpf commit:      fe3300897cbfd76c6cb825776e5ac0ca50a91ca4

Andrii Nakryiko (2):
  libbpf: Bump libpf current version to v0.0.7
  libbpf: Fix printf compilation warnings on ppc64le arch

 src/libbpf.c   | 37 +++++++++++++++++++------------------
 src/libbpf.map |  3 +++
 2 files changed, 22 insertions(+), 18 deletions(-)

--
2.17.1
2019-12-12 14:40:26 -08:00
Andrii Nakryiko
4060a65222 libbpf: Fix printf compilation warnings on ppc64le arch
On ppc64le __u64 and __s64 are defined as long int and unsigned long int,
respectively. This causes compiler to emit warning when %lld/%llu are used to
printf 64-bit numbers. Fix this by casting to size_t/ssize_t with %zu and %zd
format specifiers, respectively.

v1->v2:
- use size_t/ssize_t instead of custom typedefs (Martin).

Fixes: 1f8e2bcb2cd5 ("libbpf: Refactor relocation handling")
Fixes: abd29c931459 ("libbpf: allow specifying map definitions using BTF")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191212171918.638010-1-andriin@fb.com
2019-12-12 14:40:26 -08:00
Andrii Nakryiko
a26f6b1375 libbpf: Bump libpf current version to v0.0.7
New development cycles starts, bump to v0.0.7 proactively.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191209224022.3544519-1-andriin@fb.com
2019-12-12 14:40:26 -08:00
Toke Høiland-Jørgensen
6e686c26fa Makefile: Add cscope and tags rules
These were added to the kernel repo, but not in Github. However, they are
useful for browsing the source in Github while prototyping new features and
compiling them into userspace utilities.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
2019-12-11 10:38:48 -08:00
Andrii Nakryiko
ab067ed371 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b615e5a1e067dcb327482d1af7463268b89b1629
Checkpoint bpf-next commit: e7096c131e5161fa3b8e52a650d7719d2857adfd
Baseline bpf commit:        34e59836565e36fade1464e054a3551c1a0364be
Checkpoint bpf commit:      e42617b825f8073569da76dc4510bfa019b1c35a

Alexei Starovoitov (2):
  libbpf: Fix sym->st_value print on 32-bit arches
  selftests/bpf: Add test for BPF trampoline

Andrii Nakryiko (1):
  libbpf: Fix global variable relocation

Martin KaFai Lau (1):
  bpf: Introduce BPF_TRACE_x helper for the tracing tests

 src/libbpf.c | 45 ++++++++++++++++++++-------------------------
 1 file changed, 20 insertions(+), 25 deletions(-)

--
2.17.1
2019-12-09 09:44:20 -08:00
Martin KaFai Lau
9b69fbe4d1 bpf: Introduce BPF_TRACE_x helper for the tracing tests
For BPF_PROG_TYPE_TRACING, the bpf_prog's ctx is an array of u64.
This patch borrows the idea from BPF_CALL_x in filter.h to
convert a u64 to the arg type of the traced function.

The new BPF_TRACE_x has an arg to specify the return type of a bpf_prog.
It will be used in the future TCP-ops bpf_prog that may return "void".

The new macros are defined in the new header file "bpf_trace_helpers.h".
It is under selftests/bpf/ for now.  It could be moved to libbpf later
after seeing more upcoming non-tracing use cases.

The tests are changed to use these new macros also.  Hence,
the k[s]u8/16/32/64 are no longer needed and they are removed
from the bpf_helpers.h.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191123202504.1502696-1-kafai@fb.com
2019-12-09 09:44:20 -08:00
Alexei Starovoitov
04d8fc50ab selftests/bpf: Add test for BPF trampoline
Add sanity test for BPF trampoline that checks kernel functions
with up to 6 arguments of different sizes.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-10-ast@kernel.org
2019-12-09 09:44:20 -08:00
Alexei Starovoitov
ceff1e0363 libbpf: Fix sym->st_value print on 32-bit arches
The st_value field is a 64-bit value and causing this error on 32-bit arches:

In file included from libbpf.c:52:
libbpf.c: In function 'bpf_program__record_reloc':
libbpf_internal.h:59:22: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'Elf64_Addr' {aka 'const long long unsigned int'} [-Werror=format=]

Fix it with (__u64) cast.

Fixes: 1f8e2bcb2cd5 ("libbpf: Refactor relocation handling")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-12-09 09:44:20 -08:00
Andrii Nakryiko
d28acc595f libbpf: Fix global variable relocation
Similarly to a0d7da26ce86 ("libbpf: Fix call relocation offset calculation
bug"), relocations against global variables need to take into account
referenced symbol's st_value, which holds offset into a corresponding data
section (and, subsequently, offset into internal backing map). For static
variables this offset is always zero and data offset is completely described
by respective instruction's imm field.

Convert a bunch of selftests to global variables. Previously they were relying
on `static volatile` trick to ensure Clang doesn't inline static variables,
which with global variables is not necessary anymore.

Fixes: 393cdfbee809 ("libbpf: Support initialized global variables")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191127200651.1381348-1-andriin@fb.com
2019-12-09 09:44:20 -08:00
Andrii Nakryiko
9ef191ea7d license: add LICENSE with dual-license SPDX expression
Add LICENSE specifying dual-license expression.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-11-26 11:06:43 -08:00
Andrii Nakryiko
1add860402 license: add license note to README
Add mention of dual-licensing to README

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-11-26 11:02:19 -08:00
Andrii Nakryiko
c658f21738 libbpf: add BSD-2-Clause and LGPL-2.1 licenses
Libbpf is dual-licensed under BSD-2-Clause and LGPL-2.1 licenses. Include
their texts in the root of the repo.

Suggestes-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-11-26 09:54:43 -08:00
Andrii Nakryiko
9f519af7f4 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   e47a179997ceee6864fbae620eee09ea9c345a4d
Checkpoint bpf-next commit: b615e5a1e067dcb327482d1af7463268b89b1629
Baseline bpf commit:        d0fbb51dfaa612f960519b798387be436e8f83c5
Checkpoint bpf commit:      34e59836565e36fade1464e054a3551c1a0364be

Alexei Starovoitov (4):
  libbpf: Introduce btf__find_by_name_kind()
  libbpf: Add support to attach to fentry/fexit tracing progs
  selftests/bpf: Add test for BPF trampoline
  libbpf: Add support for attaching BPF programs to other BPF programs

Andrii Nakryiko (8):
  bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY
  libbpf: Make global data internal arrays mmap()-able, if possible
  libbpf: Fix call relocation offset calculation bug
  libbpf: Refactor relocation handling
  libbpf: Fix various errors and warning reported by checkpatch.pl
  libbpf: Support initialized global variables
  libbpf: Fix bpf_object name determination for bpf_object__open_file()
  libbpf: Fix usage of u32 in userspace code

Luigi Rizzo (1):
  net-af_xdp: Use correct number of channels from ethtool

Martin KaFai Lau (1):
  bpf: Introduce BPF_TRACE_x helper for the tracing tests

 include/uapi/linux/bpf.h |   6 +
 src/bpf.c                |   8 +-
 src/bpf.h                |   5 +-
 src/btf.c                |  22 ++
 src/btf.h                |   2 +
 src/libbpf.c             | 478 ++++++++++++++++++++++++++-------------
 src/libbpf.h             |   7 +-
 src/libbpf.map           |   3 +
 src/xsk.c                |  11 +-
 9 files changed, 371 insertions(+), 171 deletions(-)

--
2.17.1
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
b7bdc604ef libbpf: Fix usage of u32 in userspace code
u32 is not defined for libbpf when compiled outside of kernel sources (e.g.,
in Github projection). Use __u32 instead.

Fixes: b8c54ea455dc ("libbpf: Add support to attach to fentry/fexit tracing progs")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191125212948.1163343-1-andriin@fb.com
2019-11-25 16:55:44 -08:00
Martin KaFai Lau
354dd9844e bpf: Introduce BPF_TRACE_x helper for the tracing tests
For BPF_PROG_TYPE_TRACING, the bpf_prog's ctx is an array of u64.
This patch borrows the idea from BPF_CALL_x in filter.h to
convert a u64 to the arg type of the traced function.

The new BPF_TRACE_x has an arg to specify the return type of a bpf_prog.
It will be used in the future TCP-ops bpf_prog that may return "void".

The new macros are defined in the new header file "bpf_trace_helpers.h".
It is under selftests/bpf/ for now.  It could be moved to libbpf later
after seeing more upcoming non-tracing use cases.

The tests are changed to use these new macros also.  Hence,
the k[s]u8/16/32/64 are no longer needed and they are removed
from the bpf_helpers.h.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191123202504.1502696-1-kafai@fb.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
9b91dce691 libbpf: Fix bpf_object name determination for bpf_object__open_file()
If bpf_object__open_file() gets path like "some/dir/obj.o", it should derive
BPF object's name as "obj" (unless overriden through opts->object_name).
Instead, due to using `path` as a fallback value for opts->obj_name, path is
used as is for object name, so for above example BPF object's name will be
verbatim "some/dir/obj", which leads to all sorts of troubles, especially when
internal maps are concern (they are using up to 8 characters of object name).
Fix that by ensuring object_name stays NULL, unless overriden.

Fixes: 291ee02b5e40 ("libbpf: Refactor bpf_object__open APIs to use common opts")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191122003527.551556-1-andriin@fb.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
83535cb2bf libbpf: Support initialized global variables
Initialized global variables are no different in ELF from static variables,
and don't require any extra support from libbpf. But they are matching
semantics of global data (backed by BPF maps) more closely, preventing
LLVM/Clang from aggressively inlining constant values and not requiring
volatile incantations to prevent those. This patch enables global variables.
It still disables uninitialized variables, which will be put into special COM
(common) ELF section, because BPF doesn't allow uninitialized data to be
accessed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191121070743.1309473-5-andriin@fb.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
3f05b513d4 libbpf: Fix various errors and warning reported by checkpatch.pl
Fix a bunch of warnings and errors reported by checkpatch.pl, to make it
easier to spot new problems.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191121070743.1309473-4-andriin@fb.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
0d0d05de08 libbpf: Refactor relocation handling
Relocation handling code is convoluted and unnecessarily deeply nested. Split
out per-relocation logic into separate function. Also refactor the logic to be
more a sequence of per-relocation type checks and processing steps, making it
simpler to follow control flow. This makes it easier to further extends it to
new kinds of relocations (e.g., support for extern variables).

This patch also makes relocation's section verification more robust.
Previously relocations against not yet supported externs were silently ignored
because of obj->efile.text_shndx was zero, when all BPF programs had custom
section names and there was no .text section. Also, invalid LDIMM64 relocations
against non-map sections were passed through, if they were pointing to a .text
section (or 0, which is invalid section). All these bugs are fixed within this
refactoring and checks are made more appropriate for each type of relocation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191121070743.1309473-3-andriin@fb.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
44409068f7 libbpf: Fix call relocation offset calculation bug
When relocating subprogram call, libbpf doesn't take into account
relo->text_off, which comes from symbol's value. This generally works fine for
subprograms implemented as static functions, but breaks for global functions.

Taking a simplified test_pkt_access.c as an example:

__attribute__ ((noinline))
static int test_pkt_access_subprog1(volatile struct __sk_buff *skb)
{
        return skb->len * 2;
}

__attribute__ ((noinline))
static int test_pkt_access_subprog2(int val, volatile struct __sk_buff *skb)
{
        return skb->len + val;
}

SEC("classifier/test_pkt_access")
int test_pkt_access(struct __sk_buff *skb)
{
        if (test_pkt_access_subprog1(skb) != skb->len * 2)
                return TC_ACT_SHOT;
        if (test_pkt_access_subprog2(2, skb) != skb->len + 2)
                return TC_ACT_SHOT;
        return TC_ACT_UNSPEC;
}

When compiled, we get two relocations, pointing to '.text' symbol. .text has
st_value set to 0 (it points to the beginning of .text section):

0000000000000008  000000050000000a R_BPF_64_32            0000000000000000 .text
0000000000000040  000000050000000a R_BPF_64_32            0000000000000000 .text

test_pkt_access_subprog1 and test_pkt_access_subprog2 offsets (targets of two
calls) are encoded within call instruction's imm32 part as -1 and 2,
respectively:

0000000000000000 test_pkt_access_subprog1:
       0:       61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
       1:       64 00 00 00 01 00 00 00 w0 <<= 1
       2:       95 00 00 00 00 00 00 00 exit

0000000000000018 test_pkt_access_subprog2:
       3:       61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
       4:       04 00 00 00 02 00 00 00 w0 += 2
       5:       95 00 00 00 00 00 00 00 exit

0000000000000000 test_pkt_access:
       0:       bf 16 00 00 00 00 00 00 r6 = r1
===>   1:       85 10 00 00 ff ff ff ff call -1
       2:       bc 01 00 00 00 00 00 00 w1 = w0
       3:       b4 00 00 00 02 00 00 00 w0 = 2
       4:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
       5:       64 02 00 00 01 00 00 00 w2 <<= 1
       6:       5e 21 08 00 00 00 00 00 if w1 != w2 goto +8 <LBB0_3>
       7:       bf 61 00 00 00 00 00 00 r1 = r6
===>   8:       85 10 00 00 02 00 00 00 call 2
       9:       bc 01 00 00 00 00 00 00 w1 = w0
      10:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
      11:       04 02 00 00 02 00 00 00 w2 += 2
      12:       b4 00 00 00 ff ff ff ff w0 = -1
      13:       1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB0_3>
      14:       b4 00 00 00 02 00 00 00 w0 = 2
0000000000000078 LBB0_3:
      15:       95 00 00 00 00 00 00 00 exit

Now, if we compile example with global functions, the setup changes.
Relocations are now against specifically test_pkt_access_subprog1 and
test_pkt_access_subprog2 symbols, with test_pkt_access_subprog2 pointing 24
bytes into its respective section (.text), i.e., 3 instructions in:

0000000000000008  000000070000000a R_BPF_64_32            0000000000000000 test_pkt_access_subprog1
0000000000000048  000000080000000a R_BPF_64_32            0000000000000018 test_pkt_access_subprog2

Calls instructions now encode offsets relative to function symbols and are both
set ot -1:

0000000000000000 test_pkt_access_subprog1:
       0:       61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
       1:       64 00 00 00 01 00 00 00 w0 <<= 1
       2:       95 00 00 00 00 00 00 00 exit

0000000000000018 test_pkt_access_subprog2:
       3:       61 20 00 00 00 00 00 00 r0 = *(u32 *)(r2 + 0)
       4:       0c 10 00 00 00 00 00 00 w0 += w1
       5:       95 00 00 00 00 00 00 00 exit

0000000000000000 test_pkt_access:
       0:       bf 16 00 00 00 00 00 00 r6 = r1
===>   1:       85 10 00 00 ff ff ff ff call -1
       2:       bc 01 00 00 00 00 00 00 w1 = w0
       3:       b4 00 00 00 02 00 00 00 w0 = 2
       4:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
       5:       64 02 00 00 01 00 00 00 w2 <<= 1
       6:       5e 21 09 00 00 00 00 00 if w1 != w2 goto +9 <LBB2_3>
       7:       b4 01 00 00 02 00 00 00 w1 = 2
       8:       bf 62 00 00 00 00 00 00 r2 = r6
===>   9:       85 10 00 00 ff ff ff ff call -1
      10:       bc 01 00 00 00 00 00 00 w1 = w0
      11:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
      12:       04 02 00 00 02 00 00 00 w2 += 2
      13:       b4 00 00 00 ff ff ff ff w0 = -1
      14:       1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB2_3>
      15:       b4 00 00 00 02 00 00 00 w0 = 2
0000000000000080 LBB2_3:
      16:       95 00 00 00 00 00 00 00 exit

Thus the right formula to calculate target call offset after relocation should
take into account relocation's target symbol value (offset within section),
call instruction's imm32 offset, and (subtracting, to get relative instruction
offset) instruction index of call instruction itself. All that is shifted by
number of instructions in main program, given all sub-programs are copied over
after main program.

Convert few selftests relying on bpf-to-bpf calls to use global functions
instead of static ones.

Fixes: 48cca7e44f9f ("libbpf: add support for bpf_call")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191119224447.3781271-1-andriin@fb.com
2019-11-25 16:55:44 -08:00
Luigi Rizzo
16ecc53e73 net-af_xdp: Use correct number of channels from ethtool
Drivers use different fields to report the number of channels, so take
the maximum of all data channels (rx, tx, combined) when determining the
size of the xsk map. The current code used only 'combined' which was set
to 0 in some drivers e.g. mlx4.

Tested: compiled and run xdpsock -q 3 -r -S on mlx4

Signed-off-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20191119001951.92930-1-lrizzo@google.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
38f66776db libbpf: Make global data internal arrays mmap()-able, if possible
Add detection of BPF_F_MMAPABLE flag support for arrays and add it as an extra
flag to internal global data maps, if supported by kernel. This allows users
to memory-map global data and use it without BPF map operations, greatly
simplifying user experience.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-5-andriin@fb.com
2019-11-25 16:55:44 -08:00
Andrii Nakryiko
e9d33df74d bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY
Add ability to memory-map contents of BPF array map. This is extremely useful
for working with BPF global data from userspace programs. It allows to avoid
typical bpf_map_{lookup,update}_elem operations, improving both performance
and usability.

There had to be special considerations for map freezing, to avoid having
writable memory view into a frozen map. To solve this issue, map freezing and
mmap-ing is happening under mutex now:
  - if map is already frozen, no writable mapping is allowed;
  - if map has writable memory mappings active (accounted in map->writecnt),
    map freezing will keep failing with -EBUSY;
  - once number of writable memory mappings drops to zero, map freezing can be
    performed again.

Only non-per-CPU plain arrays are supported right now. Maps with spinlocks
can't be memory mapped either.

For BPF_F_MMAPABLE array, memory allocation has to be done through vmalloc()
to be mmap()'able. We also need to make sure that array data memory is
page-sized and page-aligned, so we over-allocate memory in such a way that
struct bpf_array is at the end of a single page of memory with array->value
being aligned with the start of the second page. On deallocation we need to
accomodate this memory arrangement to free vmalloc()'ed memory correctly.

One important consideration regarding how memory-mapping subsystem functions.
Memory-mapping subsystem provides few optional callbacks, among them open()
and close().  close() is called for each memory region that is unmapped, so
that users can decrease their reference counters and free up resources, if
necessary. open() is *almost* symmetrical: it's called for each memory region
that is being mapped, **except** the very first one. So bpf_map_mmap does
initial refcnt bump, while open() will do any extra ones after that. Thus
number of close() calls is equal to number of open() calls plus one more.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-4-andriin@fb.com
2019-11-25 16:55:44 -08:00
Alexei Starovoitov
05b515de7d libbpf: Add support for attaching BPF programs to other BPF programs
Extend libbpf api to pass attach_prog_fd into bpf_object__open.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-19-ast@kernel.org
2019-11-25 16:55:44 -08:00
Alexei Starovoitov
c2bbeaa900 selftests/bpf: Add test for BPF trampoline
Add sanity test for BPF trampoline that checks kernel functions
with up to 6 arguments of different sizes.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-10-ast@kernel.org
2019-11-25 16:55:44 -08:00
Alexei Starovoitov
799d153f41 libbpf: Add support to attach to fentry/fexit tracing progs
Teach libbpf to recognize tracing programs types and attach them to
fentry/fexit.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-7-ast@kernel.org
2019-11-25 16:55:44 -08:00
Alexei Starovoitov
69ff3960eb libbpf: Introduce btf__find_by_name_kind()
Introduce btf__find_by_name_kind() helper to search BTF by name and kind, since
name alone can be ambiguous.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-6-ast@kernel.org
2019-11-25 16:55:44 -08:00
Frantisek Sumsal
b91f53ec5f travis: use travis_terminate instead of set {+,-}e combo
Apart from that it looks a bit nicer, it also acts as a workaround for
https://travis-ci.community/t/exit-0-cannot-exit-successfully-on-arm/5731/4
2019-11-14 13:49:21 -08:00
Frantisek Sumsal
dd8f1bdd45 travis: bump the Ubuntu release to Bionic
The main reason why this is necessary is that gcc 5.x on Xenial doesn't
support ASan on s390x. Bumping the release to Bionic with gcc 7.x allows
us to build libbpf on s390x with ASan without issues.
2019-11-14 13:49:21 -08:00
Frantisek Sumsal
3720f31852 travis: add an s390x job
Travis now supports IBM Z and IBM Power architectures, so let's enable
them in our CI as well.

As libbpf won't compile on ppc64le right now (with current CFLAGS), let
skip it until the issue is resolved, see discussion in
https://github.com/libbpf/libbpf/pull/98#issuecomment-553873098

See: https://blog.travis-ci.com/2019-11-12-multi-cpu-architecture-ibm-power-ibm-z
2019-11-14 13:49:21 -08:00
Andrii Nakryiko
c51c492a65 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   ed578021210e14f15a654c825fba6a700c9a39a7
Checkpoint bpf-next commit: e47a179997ceee6864fbae620eee09ea9c345a4d
Baseline bpf commit:        7de086909365cd60a5619a45af3f4152516fd75c
Checkpoint bpf commit:      d0fbb51dfaa612f960519b798387be436e8f83c5

Andrii Nakryiko (6):
  libbpf: Fix negative FD close() in xsk_setup_xdp_prog()
  libbpf: Fix memory leak/double free issue
  libbpf: Fix potential overflow issue
  libbpf: Fix another potential overflow issue in bpf_prog_linfo
  libbpf: Make btf__resolve_size logic always check size error condition
  libbpf: Improve handling of corrupted ELF during map initialization

Magnus Karlsson (2):
  libbpf: Support XDP_SHARED_UMEM with external XDP program
  libbpf: Allow for creating Rx or Tx only AF_XDP sockets

Toke Høiland-Jørgensen (5):
  libbpf: Unpin auto-pinned maps if loading fails
  libbpf: Propagate EPERM to caller on program load
  libbpf: Use pr_warn() when printing netlink errors
  libbpf: Add bpf_get_link_xdp_info() function to get more XDP
    information
  libbpf: Add getter for program size

 src/bpf.c            |  2 +-
 src/bpf_prog_linfo.c | 14 +++----
 src/btf.c            |  3 +-
 src/libbpf.c         | 47 ++++++++++++++----------
 src/libbpf.h         | 13 +++++++
 src/libbpf.map       |  2 +
 src/netlink.c        | 87 +++++++++++++++++++++++++++++---------------
 src/nlattr.c         | 10 ++---
 src/xsk.c            | 34 +++++++++++------
 9 files changed, 136 insertions(+), 76 deletions(-)

--
2.17.1
2019-11-13 16:39:58 -08:00
Magnus Karlsson
d3e68e036e libbpf: Allow for creating Rx or Tx only AF_XDP sockets
The libbpf AF_XDP code is extended to allow for the creation of Rx
only or Tx only sockets. Previously it returned an error if the socket
was not initialized for both Rx and Tx.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: William Tu <u9012063@gmail.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1573148860-30254-4-git-send-email-magnus.karlsson@intel.com
2019-11-13 16:39:58 -08:00
Magnus Karlsson
6ce8910d4d libbpf: Support XDP_SHARED_UMEM with external XDP program
Add support in libbpf to create multiple sockets that share a single
umem. Note that an external XDP program need to be supplied that
routes the incoming traffic to the desired sockets. So you need to
supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load
your own XDP program.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: William Tu <u9012063@gmail.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1573148860-30254-2-git-send-email-magnus.karlsson@intel.com
2019-11-13 16:39:58 -08:00
Toke Høiland-Jørgensen
79b1d813f9 libbpf: Add getter for program size
This adds a new getter for the BPF program size (in bytes). This is useful
for a caller that is trying to predict how much memory will be locked by
loading a BPF object into the kernel.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333185272.88376.10996937115395724683.stgit@toke.dk
2019-11-13 16:39:58 -08:00
Toke Høiland-Jørgensen
26954e103d libbpf: Add bpf_get_link_xdp_info() function to get more XDP information
Currently, libbpf only provides a function to get a single ID for the XDP
program attached to the interface. However, it can be useful to get the
full set of program IDs attached, along with the attachment mode, in one
go. Add a new getter function to support this, using an extendible
structure to carry the information. Express the old bpf_get_link_id()
function in terms of the new function.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157333185164.88376.7520653040667637246.stgit@toke.dk
2019-11-13 16:39:58 -08:00
Toke Høiland-Jørgensen
c8c02fca3a libbpf: Use pr_warn() when printing netlink errors
The netlink functions were using fprintf(stderr, ) directly to print out
error messages, instead of going through the usual logging macros. This
makes it impossible for the calling application to silence or redirect
those error messages. Fix this by switching to pr_warn() in nlattr.c and
netlink.c.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333185055.88376.15999360127117901443.stgit@toke.dk
2019-11-13 16:39:58 -08:00
Toke Høiland-Jørgensen
0e2f5f9615 libbpf: Propagate EPERM to caller on program load
When loading an eBPF program, libbpf overrides the return code for EPERM
errors instead of returning it to the caller. This makes it hard to figure
out what went wrong on load.

In particular, EPERM is returned when the system rlimit is too low to lock
the memory required for the BPF program. Previously, this was somewhat
obscured because the rlimit error would be hit on map creation (which does
return it correctly). However, since maps can now be reused, object load
can proceed all the way to loading programs without hitting the error;
propagating it even in this case makes it possible for the caller to react
appropriately (and, e.g., attempt to raise the rlimit before retrying).

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333184946.88376.11768171652794234561.stgit@toke.dk
2019-11-13 16:39:58 -08:00
Toke Høiland-Jørgensen
b539321838 libbpf: Unpin auto-pinned maps if loading fails
Since the automatic map-pinning happens during load, it will leave pinned
maps around if the load fails at a later stage. Fix this by unpinning any
pinned maps on cleanup. To avoid unpinning pinned maps that were reused
rather than newly pinned, add a new boolean property on struct bpf_map to
keep track of whether that map was reused or not; and only unpin those maps
that were not reused.

Fixes: 57a00f41644f ("libbpf: Add auto-pinning of maps when loading BPF objects")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333184731.88376.9992935027056165873.stgit@toke.dk
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
0f15f88443 libbpf: Improve handling of corrupted ELF during map initialization
If we get ELF file with "maps" section, but no symbols pointing to it, we'll
end up with division by zero. Add check against this situation and exit early
with error. Found by Coverity scan against Github libbpf sources.

Fixes: bf82927125dd ("libbpf: refactor map initialization")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-6-andriin@fb.com
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
bada95a5f3 libbpf: Make btf__resolve_size logic always check size error condition
Perform size check always in btf__resolve_size. Makes the logic a bit more
robust against corrupted BTF and silences LGTM/Coverity complaining about
always true (size < 0) check.

Fixes: 69eaab04c675 ("btf: extract BTF type size calculation")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-5-andriin@fb.com
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
fb929625dc libbpf: Fix another potential overflow issue in bpf_prog_linfo
Fix few issues found by Coverity and LGTM.

Fixes: b053b439b72a ("bpf: libbpf: bpftool: Print bpf_line_info during prog dump")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-4-andriin@fb.com
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
1a828b3d58 libbpf: Fix potential overflow issue
Fix a potential overflow issue found by LGTM analysis, based on Github libbpf
source code.

Fixes: 3d65014146c6 ("bpf: libbpf: Add btf_line_info support to libbpf")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-3-andriin@fb.com
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
330f4683e2 libbpf: Fix memory leak/double free issue
Coverity scan against Github libbpf code found the issue of not freeing memory and
leaving already freed memory still referenced from bpf_program. Fix it by
re-assigning successfully reallocated memory sooner.

Fixes: 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-2-andriin@fb.com
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
2ef7f5607c libbpf: Fix negative FD close() in xsk_setup_xdp_prog()
Fix issue reported by static analysis (Coverity). If bpf_prog_get_fd_by_id()
fails, xsk_lookup_bpf_maps() will fail as well and clean-up code will attempt
close() with fd=-1. Fix by checking bpf_prog_get_fd_by_id() return result and
exiting early.

Fixes: 10a13bb40e54 ("libbpf: remove qidconf and better support external bpf programs.")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107054059.313884-1-andriin@fb.com
2019-11-13 16:39:58 -08:00
Andrii Nakryiko
4da243c179 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f23c7ce341c2dfd187d4e3712ba6c110969463a0
Checkpoint bpf-next commit: ed578021210e14f15a654c825fba6a700c9a39a7
Baseline bpf commit:        7de086909365cd60a5619a45af3f4152516fd75c
Checkpoint bpf commit:      7de086909365cd60a5619a45af3f4152516fd75c

Andrii Nakryiko (1):
  libbpf: Simplify BPF_CORE_READ_BITFIELD_PROBED usage

 src/bpf_core_read.h | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

--
2.17.1
2019-11-06 14:11:45 -08:00
Andrii Nakryiko
4d8fc6d438 libbpf: Simplify BPF_CORE_READ_BITFIELD_PROBED usage
Streamline BPF_CORE_READ_BITFIELD_PROBED interface to follow
BPF_CORE_READ_BITFIELD (direct) and BPF_CORE_READ, in general, i.e., just
return read result or 0, if underlying bpf_probe_read() failed.

In practice, real applications rarely check bpf_probe_read() result, because
it has to always work or otherwise it's a bug. So propagating internal
bpf_probe_read() error from this macro hurts usability without providing real
benefits in practice. This patch fixes the issue and simplifies usage,
noticeable even in selftest itself.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191106201500.2582438-1-andriin@fb.com
2019-11-06 14:11:45 -08:00
Andrii Nakryiko
6d4abdda08 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   a566e35f1e8b4b3be1e96a804d1cca38b578167c
Checkpoint bpf-next commit: f23c7ce341c2dfd187d4e3712ba6c110969463a0
Baseline bpf commit:        fc11078dd3514c65eabce166b8431a56d8a667cb
Checkpoint bpf commit:      7de086909365cd60a5619a45af3f4152516fd75c

Alexei Starovoitov (1):
  libbpf: Add support for prog_tracing

Andrii Nakryiko (2):
  libbpf: Add support for relocatable bitfields
  libbpf: Add support for field size relocations

Daniel Borkmann (1):
  bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str
    helpers

Toke Høiland-Jørgensen (4):
  libbpf: Fix error handling in bpf_map__reuse_fd()
  libbpf: Store map pin path and status in struct bpf_map
  libbpf: Move directory creation into _pin() functions
  libbpf: Add auto-pinning of maps when loading BPF objects

 include/uapi/linux/bpf.h | 124 ++++---
 src/bpf.c                |   8 +-
 src/bpf.h                |   5 +-
 src/bpf_core_read.h      |  79 +++++
 src/bpf_helpers.h        |   6 +
 src/libbpf.c             | 707 ++++++++++++++++++++++++++++++---------
 src/libbpf.h             |  23 +-
 src/libbpf.map           |   5 +
 src/libbpf_internal.h    |   4 +
 src/libbpf_probes.c      |   1 +
 10 files changed, 749 insertions(+), 213 deletions(-)

--
2.17.1
2019-11-05 16:00:11 -08:00
Andrii Nakryiko
67ab4c0f82 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2019-11-05 16:00:11 -08:00
Andrii Nakryiko
df45cf7a3e libbpf: Add support for field size relocations
Add bpf_core_field_size() macro, capturing a relocation against field size.
Adjust bits of internal libbpf relocation logic to allow capturing size
relocations of various field types: arrays, structs/unions, enums, etc.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-4-andriin@fb.com
2019-11-05 16:00:11 -08:00
Andrii Nakryiko
4438972ccc libbpf: Add support for relocatable bitfields
Add support for the new field relocation kinds, necessary to support
relocatable bitfield reads. Provide macro for abstracting necessary code doing
full relocatable bitfield extraction into u64 value. Two separate macros are
provided:
- BPF_CORE_READ_BITFIELD macro for direct memory read-enabled BPF programs
(e.g., typed raw tracepoints). It uses direct memory dereference to extract
bitfield backing integer value.
- BPF_CORE_READ_BITFIELD_PROBED macro for cases where bpf_probe_read() needs
to be used to extract same backing integer value.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-3-andriin@fb.com
2019-11-05 16:00:11 -08:00
Daniel Borkmann
09cd9ff2db bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers
The current bpf_probe_read() and bpf_probe_read_str() helpers are broken
in that they assume they can be used for probing memory access for kernel
space addresses /as well as/ user space addresses.

However, plain use of probe_kernel_read() for both cases will attempt to
always access kernel space address space given access is performed under
KERNEL_DS and some archs in-fact have overlapping address spaces where a
kernel pointer and user pointer would have the /same/ address value and
therefore accessing application memory via bpf_probe_read{,_str}() would
read garbage values.

Lets fix BPF side by making use of recently added 3d7081822f7f ("uaccess:
Add non-pagefault user-space read functions"). Unfortunately, the only way
to fix this status quo is to add dedicated bpf_probe_read_{user,kernel}()
and bpf_probe_read_{user,kernel}_str() helpers. The bpf_probe_read{,_str}()
helpers are kept as-is to retain their current behavior.

The two *_user() variants attempt the access always under USER_DS set, the
two *_kernel() variants will -EFAULT when accessing user memory if the
underlying architecture has non-overlapping address ranges, also avoiding
throwing the kernel warning via 00c42373d397 ("x86-64: add warning for
non-canonical user access address dereferences").

Fixes: a5e8c07059d0 ("bpf: add bpf_probe_read_str helper")
Fixes: 2541517c32be ("tracing, perf: Implement BPF programs attached to kprobes")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/796ee46e948bc808d54891a1108435f8652c6ca4.1572649915.git.daniel@iogearbox.net
2019-11-05 16:00:11 -08:00
Toke Høiland-Jørgensen
e7d860d2fc libbpf: Add auto-pinning of maps when loading BPF objects
This adds support to libbpf for setting map pinning information as part of
the BTF map declaration, to get automatic map pinning (and reuse) on load.
The pinning type currently only supports a single PIN_BY_NAME mode, where
each map will be pinned by its name in a path that can be overridden, but
defaults to /sys/fs/bpf.

Since auto-pinning only does something if any maps actually have a
'pinning' BTF attribute set, we default the new option to enabled, on the
assumption that seamless pinning is what most callers want.

When a map has a pin_path set at load time, libbpf will compare the map
pinned at that location (if any), and if the attributes match, will re-use
that map instead of creating a new one. If no existing map is found, the
newly created map will instead be pinned at the location.

Programs wanting to customise the pinning can override the pinning paths
using bpf_map__set_pin_path() before calling bpf_object__load() (including
setting it to NULL to disable pinning of a particular map).

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269298092.394725.3966306029218559681.stgit@toke.dk
2019-11-05 16:00:11 -08:00
Toke Høiland-Jørgensen
ff3d2702d8 libbpf: Move directory creation into _pin() functions
The existing pin_*() functions all try to create the parent directory
before pinning. Move this check into the per-object _pin() functions
instead. This ensures consistent behaviour when auto-pinning is
added (which doesn't go through the top-level pin_maps() function), at the
cost of a few more calls to mkdir().

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297985.394725.5882630952992598610.stgit@toke.dk
2019-11-05 16:00:11 -08:00
Toke Høiland-Jørgensen
44f9712f79 libbpf: Store map pin path and status in struct bpf_map
Support storing and setting a pin path in struct bpf_map, which can be used
for automatic pinning. Also store the pin status so we can avoid attempts
to re-pin a map that has already been pinned (or reused from a previous
pinning).

The behaviour of bpf_object__{un,}pin_maps() is changed so that if it is
called with a NULL path argument (which was previously illegal), it will
(un)pin only those maps that have a pin_path set.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297876.394725.14782206533681896279.stgit@toke.dk
2019-11-05 16:00:11 -08:00
Toke Høiland-Jørgensen
fe4cb796df libbpf: Fix error handling in bpf_map__reuse_fd()
bpf_map__reuse_fd() was calling close() in the error path before returning
an error value based on errno. However, close can change errno, so that can
lead to potentially misleading error messages. Instead, explicitly store
errno in the err variable before each goto.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297769.394725.12634985106772698611.stgit@toke.dk
2019-11-05 16:00:11 -08:00
Alexei Starovoitov
15de8ad80d libbpf: Add support for prog_tracing
Cleanup libbpf from expected_attach_type == attach_btf_id hack
and introduce BPF_PROG_TYPE_TRACING.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191030223212.953010-3-ast@kernel.org
2019-11-05 16:00:11 -08:00
Frantisek Sumsal
d7a137510a coverity: explicitly use bash instead of sh
On Ubuntu `/bin/sh` is a symlink to `/bin/dash`, which doesn't support
certain builtins used by the Coverity script (namely pushd/popd)
2019-11-05 13:28:13 -08:00
Frantisek Sumsal
91e4f27dd7 travis: use sudo during the 'install' phase 2019-11-04 15:08:38 -08:00
Frantisek Sumsal
1339ef70a3 README: add Coverity badge 2019-11-01 23:22:57 -07:00
Frantisek Sumsal
c204e3d610 travis: automate Coverity builds 2019-11-01 23:22:57 -07:00
Frantisek Sumsal
32d0a03332 README: add a LGTM badge 2019-10-29 15:45:36 -07:00
Andrii Nakryiko
05346cfd90 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   3820729160440158a014add69cc0d371061a96b2
Checkpoint bpf-next commit: a566e35f1e8b4b3be1e96a804d1cca38b578167c
Baseline bpf commit:        2afd23f78f39da84937006ecd24aa664a4ab052b
Checkpoint bpf commit:      fc11078dd3514c65eabce166b8431a56d8a667cb

Andrii Nakryiko (2):
  libbpf: Fix off-by-one error in ELF sanity check
  libbpf: Don't use kernel-side u32 type in xsk.c

Magnus Karlsson (1):
  libbpf: Fix compatibility for kernels without need_wakeup

 src/libbpf.c |  2 +-
 src/xsk.c    | 83 ++++++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 72 insertions(+), 13 deletions(-)

--
2.17.1
2019-10-29 09:25:36 -07:00
Andrii Nakryiko
a7a32b899c libbpf: Don't use kernel-side u32 type in xsk.c
u32 is a kernel-side typedef. User-space library is supposed to use __u32.
This breaks Github's projection of libbpf. Do u32 -> __u32 fix.

Fixes: 94ff9ebb49a5 ("libbpf: Fix compatibility for kernels without need_wakeup")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20191029055953.2461336-1-andriin@fb.com
2019-10-29 09:25:36 -07:00
Andrii Nakryiko
68a051f2d2 libbpf: Fix off-by-one error in ELF sanity check
libbpf's bpf_object__elf_collect() does simple sanity check after iterating
over all ELF sections, if checks that .strtab index is correct. Unfortunately,
due to section indices being 1-based, the check breaks for cases when .strtab
ends up being the very last section in ELF.

Fixes: 77ba9a5b48a7 ("tools lib bpf: Fetch map names from correct strtab")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191028233727.1286699-1-andriin@fb.com
2019-10-29 09:25:36 -07:00
Magnus Karlsson
8e80367637 libbpf: Fix compatibility for kernels without need_wakeup
When the need_wakeup flag was added to AF_XDP, the format of the
XDP_MMAP_OFFSETS getsockopt was extended. Code was added to the
kernel to take care of compatibility issues arrising from running
applications using any of the two formats. However, libbpf was
not extended to take care of the case when the application/libbpf
uses the new format but the kernel only supports the old
format. This patch adds support in libbpf for parsing the old
format, before the need_wakeup flag was added, and emulating a
set of static need_wakeup flags that will always work for the
application.

v2 -> v3:
* Incorporated code improvements suggested by Jonathan Lemon

v1 -> v2:
* Rebased to bpf-next
* Rewrote the code as the previous version made you blind

Fixes: a4500432c2587cb2a ("libbpf: add support for need_wakeup flag in AF_XDP part")
Reported-by: Eloy Degen <degeneloy@gmail.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1571995035-21889-1-git-send-email-magnus.karlsson@intel.com
2019-10-29 09:25:36 -07:00
Andrii Nakryiko
9a5adecc62 sync: ignore test_libbpf.c
Adjust sync script to ignore test_libbpf.c, not test_libbpf.cpp.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-29 09:25:36 -07:00
Frantisek Sumsal
b923d0e3c6 lgtm: fix the extraction process
As this project uses only Makefile, without any configuration step, and due to
a "non-standard" location of the source files, LGTM kept failing to find the
respective Makefile and build the sources. By tricking LGTM's build system
auto detection, that we use automake/configure, it correctly sets the source
dir, thus the compilation, extraction & analysis steps now work in the src/
subdirectory, as expected.
2019-10-28 15:15:47 -07:00
Andrii Nakryiko
f02e248ae1 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   5e5b03d163e15a40b0fa57c70b4e8edd549b0b98
Checkpoint bpf-next commit: 3820729160440158a014add69cc0d371061a96b2
Baseline bpf commit:        cd7455f1013ef96d5cbf5c05d2b7c06f273810a6
Checkpoint bpf commit:      2afd23f78f39da84937006ecd24aa664a4ab052b

Björn Töpel (1):
  libbpf: Use implicit XSKMAP lookup from AF_XDP XDP program

KP Singh (1):
  libbpf: Fix strncat bounds error in libbpf_prog_type_by_name

 src/libbpf.c |  2 +-
 src/xsk.c    | 42 ++++++++++++++++++++++++++++++++----------
 2 files changed, 33 insertions(+), 11 deletions(-)

--
2.17.1
2019-10-24 22:59:06 -07:00
KP Singh
e152510d72 libbpf: Fix strncat bounds error in libbpf_prog_type_by_name
On compiling samples with this change, one gets an error:

 error: ‘strncat’ specified bound 118 equals destination size
  [-Werror=stringop-truncation]

    strncat(dst, name + section_names[i].len,
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     sizeof(raw_tp_btf_name) - (dst - raw_tp_btf_name));
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

strncat requires the destination to have enough space for the
terminating null byte.

Fixes: f75a697e09137 ("libbpf: Auto-detect btf_id of BTF-based raw_tracepoint")
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191023154038.24075-1-kpsingh@chromium.org
2019-10-24 22:59:06 -07:00
Björn Töpel
59ac1946b0 libbpf: Use implicit XSKMAP lookup from AF_XDP XDP program
In commit 43e74c0267a3 ("bpf_xdp_redirect_map: Perform map lookup in
eBPF helper") the bpf_redirect_map() helper learned to do map lookup,
which means that the explicit lookup in the XDP program for AF_XDP is
not needed for post-5.3 kernels.

This commit adds the implicit map lookup with default action, which
improves the performance for the "rx_drop" [1] scenario with ~4%.

For pre-5.3 kernels, the bpf_redirect_map() returns XDP_ABORTED, and a
fallback path for backward compatibility is entered, where explicit
lookup is still performed. This means a slight regression for older
kernels (an additional bpf_redirect_map() call), but I consider that a
fair punishment for users not upgrading their kernels. ;-)

v1->v2: Backward compatibility (Toke) [2]
v2->v3: Avoid masking/zero-extension by using JMP32 [3]

[1] # xdpsock -i eth0 -z -r
[2] https://lore.kernel.org/bpf/87pnirb3dc.fsf@toke.dk/
[3] https://lore.kernel.org/bpf/87v9sip0i8.fsf@toke.dk/

Suggested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20191022072206.6318-1-bjorn.topel@gmail.com
2019-10-24 22:59:06 -07:00
Andrii Nakryiko
5150a4a0fb includes: add BPF_JMP32_IMM macro to fix build
Recent xsk change started using new BPF_JMP32_IMM macro. Add it to our
local copy of include/linux/filter.h to fix the build.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-24 22:59:06 -07:00
Frantisek Sumsal
2a25957df6 travis: add an aarch64 Xenial job 2019-10-23 10:13:54 -07:00
Andrii Nakryiko
e441f55089 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   da927466a152a9497c05926a95c6aebba6d3ad5b
Checkpoint bpf-next commit: 5e5b03d163e15a40b0fa57c70b4e8edd549b0b98
Baseline bpf commit:        9e8acd9c44a0dd52b2922eeb82398c04e356c058
Checkpoint bpf commit:      cd7455f1013ef96d5cbf5c05d2b7c06f273810a6

Alexei Starovoitov (3):
  bpf: Add attach_btf_id attribute to program load
  libbpf: Auto-detect btf_id of BTF-based raw_tracepoints
  bpf: Check types of arguments passed into helpers

Andrii Nakryiko (5):
  tools: Sync if_link.h
  libbpf: Add bpf_program__get_{type, expected_attach_type) APIs
  libbpf: Add uprobe/uretprobe and tp/raw_tp section suffixes
  libbpf: Teach bpf_object__open to guess program types
  libbpf: Make DECLARE_LIBBPF_OPTS macro strictly a variable declaration

John Fastabend (1):
  bpf, libbpf: Add kernel version section parsing back

Kefeng Wang (1):
  tools, bpf: Rename pr_warning to pr_warn to align with kernel logging

 include/uapi/linux/bpf.h     |  28 +-
 include/uapi/linux/if_link.h |   2 +
 src/bpf.c                    |   3 +
 src/btf.c                    |  56 +--
 src/btf_dump.c               |  18 +-
 src/libbpf.c                 | 830 +++++++++++++++++++----------------
 src/libbpf.h                 |  24 +-
 src/libbpf.map               |   2 +
 src/libbpf_internal.h        |   8 +-
 src/xsk.c                    |   4 +-
 10 files changed, 539 insertions(+), 436 deletions(-)

--
2.17.1
2019-10-22 16:15:55 -07:00
Andrii Nakryiko
beb9f88080 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2019-10-22 16:15:55 -07:00
Andrii Nakryiko
c7b5116f71 libbpf: Make DECLARE_LIBBPF_OPTS macro strictly a variable declaration
LIBBPF_OPTS is implemented as a mix of field declaration and memset
+ assignment. This makes it neither variable declaration nor purely
statements, which is a problem, because you can't mix it with either
other variable declarations nor other function statements, because C90
compiler mode emits warning on mixing all that together.

This patch changes LIBBPF_OPTS into a strictly declaration of variable
and solves this problem, as can be seen in case of bpftool, which
previously would emit compiler warning, if done this way (LIBBPF_OPTS as
part of function variables declaration block).

This patch also renames LIBBPF_OPTS into DECLARE_LIBBPF_OPTS to follow
kernel convention for similar macros more closely.

v1->v2:
- rename LIBBPF_OPTS into DECLARE_LIBBPF_OPTS (Jakub Sitnicki).

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20191022172100.3281465-1-andriin@fb.com
2019-10-22 16:15:55 -07:00
Andrii Nakryiko
2b0cd55bf5 libbpf: Teach bpf_object__open to guess program types
Teach bpf_object__open how to guess program type and expected attach
type from section names, similar to what bpf_prog_load() does. This
seems like a really useful features and an oversight to not have this
done during bpf_object_open(). To preserver backwards compatible
behavior of bpf_prog_load(), its attr->prog_type is treated as an
override of bpf_object__open() decisions, if attr->prog_type is not
UNSPECIFIED.

There is a slight difference in behavior for bpf_prog_load().
Previously, if bpf_prog_load() was loading BPF object with more than one
program, first program's guessed program type and expected attach type
would determine corresponding attributes of all the subsequent program
types, even if their sections names suggest otherwise. That seems like
a rather dubious behavior and with this change it will behave more
sanely: each program's type is determined individually, unless they are
forced to uniformity through attr->prog_type.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191021033902.3856966-5-andriin@fb.com
2019-10-22 16:15:55 -07:00
Andrii Nakryiko
188276ca5f libbpf: Add uprobe/uretprobe and tp/raw_tp section suffixes
Map uprobe/uretprobe into KPROBE program type. tp/raw_tp are just an
alias for more verbose tracepoint/raw_tracepoint, respectively.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191021033902.3856966-4-andriin@fb.com
2019-10-22 16:15:55 -07:00
Andrii Nakryiko
87c4984da8 libbpf: Add bpf_program__get_{type, expected_attach_type) APIs
There are bpf_program__set_type() and
bpf_program__set_expected_attach_type(), but no corresponding getters,
which seems rather incomplete. Fix this.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191021033902.3856966-3-andriin@fb.com
2019-10-22 16:15:55 -07:00
Andrii Nakryiko
a5611ba6e8 tools: Sync if_link.h
Sync if_link.h into tools/ and get rid of annoying libbpf Makefile warning.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191021033902.3856966-2-andriin@fb.com
2019-10-22 16:15:55 -07:00
Kefeng Wang
c6e01425b6 tools, bpf: Rename pr_warning to pr_warn to align with kernel logging
For kernel logging macros, pr_warning() is completely removed and
replaced by pr_warn(). By using pr_warn() in tools/lib/bpf/ for
symmetry to kernel logging macros, we could eventually drop the
use of pr_warning() in the whole kernel tree.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191021055532.185245-1-wangkefeng.wang@huawei.com
2019-10-22 16:15:55 -07:00
John Fastabend
58e3a8fac1 bpf, libbpf: Add kernel version section parsing back
With commit "libbpf: stop enforcing kern_version,..." we removed the
kernel version section parsing in favor of querying for the kernel
using uname() and populating the version using the result of the
query. After this any version sections were simply ignored.

Unfortunately, the world of kernels is not so friendly. I've found some
customized kernels where uname() does not match the in kernel version.
To fix this so programs can load in this environment this patch adds
back parsing the section and if it exists uses the user specified
kernel version to override the uname() result. However, keep most the
kernel uname() discovery bits so users are not required to insert the
version except in these odd cases.

Fixes: 5e61f27070292 ("libbpf: stop enforcing kern_version, populate it for users")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157140968634.9073.6407090804163937103.stgit@john-XPS-13-9370
2019-10-22 16:15:55 -07:00
Alexei Starovoitov
1b27702c14 bpf: Check types of arguments passed into helpers
Introduce new helper that reuses existing skb perf_event output
implementation, but can be called from raw_tracepoint programs
that receive 'struct sk_buff *' as tracepoint argument or
can walk other kernel data structures to skb pointer.

In order to do that teach verifier to resolve true C types
of bpf helpers into in-kernel BTF ids.
The type of kernel pointer passed by raw tracepoint into bpf
program will be tracked by the verifier all the way until
it's passed into helper function.
For example:
kfree_skb() kernel function calls trace_kfree_skb(skb, loc);
bpf programs receives that skb pointer and may eventually
pass it into bpf_skb_output() bpf helper which in-kernel is
implemented via bpf_skb_event_output() kernel function.
Its first argument in the kernel is 'struct sk_buff *'.
The verifier makes sure that types match all the way.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191016032505.2089704-11-ast@kernel.org
2019-10-22 16:15:55 -07:00
Alexei Starovoitov
39cf9fc90f libbpf: Auto-detect btf_id of BTF-based raw_tracepoints
It's a responsiblity of bpf program author to annotate the program
with SEC("tp_btf/name") where "name" is a valid raw tracepoint.
The libbpf will try to find "name" in vmlinux BTF and error out
in case vmlinux BTF is not available or "name" is not found.
If "name" is indeed a valid raw tracepoint then in-kernel BTF
will have "btf_trace_##name" typedef that points to function
prototype of that raw tracepoint. BTF description captures
exact argument the kernel C code is passing into raw tracepoint.
The kernel verifier will check the types while loading bpf program.

libbpf keeps BTF type id in expected_attach_type, but since
kernel ignores this attribute for tracing programs copy it
into attach_btf_id attribute before loading.

Later the kernel will use prog->attach_btf_id to select raw tracepoint
during bpf_raw_tracepoint_open syscall command.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191016032505.2089704-6-ast@kernel.org
2019-10-22 16:15:55 -07:00
Alexei Starovoitov
bc4a6e9709 bpf: Add attach_btf_id attribute to program load
Add attach_btf_id attribute to prog_load command.
It's similar to existing expected_attach_type attribute which is
used in several cgroup based program types.
Unfortunately expected_attach_type is ignored for
tracing programs and cannot be reused for new purpose.
Hence introduce attach_btf_id to verify bpf programs against
given in-kernel BTF type id at load time.
It is strictly checked to be valid for raw_tp programs only.
In a later patches it will become:
btf_id == 0 semantics of existing raw_tp progs.
btd_id > 0 raw_tp with BTF and additional type safety.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191016032505.2089704-5-ast@kernel.org
2019-10-22 16:15:55 -07:00
Andrii Nakryiko
4a50ceb043 Makefile: back-port _FILE_OFFSET_BITS=64 and _LARGEFILE64_SOURCE to Makefile
Upstream commit 71dd77fd4bf7 ("libbpf: use LFS (_FILE_OFFSET_BITS) instead
of direct mmap2 syscall") added _FILE_OFFSET_BITS=64 and
_LARGEFILE64_SOURCE CFLAGS. Back-port them to Github's mirror to avoid
compilation problems on ARM.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-22 14:50:23 -07:00
Andrii Nakryiko
4d86cae4f0 ci: disable GCC's -Wstringop-truncation noisy error
This error is usually a false positive for us. Disable it.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
33b374395f sync: adjust sync script for test_libbpf.c rename and bpf_helper_defs.h
Accomodate changes:
- test_libbpf.cpp was renamed to test_libbpf.c;
- bpf_helper_defs.h should be ignored for consistency check at the end,
  as it's not checked in on linux side;

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
ade4409352 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f05c2001ecc98629cecd47728e4db11e5a17e58d
Checkpoint bpf-next commit: da927466a152a9497c05926a95c6aebba6d3ad5b
Baseline bpf commit:        106c35dda32f8b63f88cad7433f1b8bb0056958a
Checkpoint bpf commit:      9e8acd9c44a0dd52b2922eeb82398c04e356c058

Andrii Nakryiko (7):
  libbpf: Fix struct end padding in btf_dump
  libbpf: Generate more efficient BPF_CORE_READ code
  libbpf: Handle invalid typedef emitted by old GCC
  libbpf: Update BTF reloc support to latest Clang format
  libbpf: Refactor bpf_object__open APIs to use common opts
  libbpf: Add support for field existance CO-RE relocation
  libbpf: Add BPF-side definitions of supported field relocation kinds

Ilya Maximets (1):
  libbpf: Fix passing uninitialized bytes to setsockopt

 src/bpf_core_read.h   |  28 ++++++-
 src/btf.c             |  16 ++--
 src/btf.h             |   4 +-
 src/btf_dump.c        |  19 ++++-
 src/libbpf.c          | 169 ++++++++++++++++++++++++++----------------
 src/libbpf.h          |   4 +-
 src/libbpf_internal.h |  25 +++++--
 src/xsk.c             |   1 +
 8 files changed, 180 insertions(+), 86 deletions(-)

--
2.17.1
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
2f9abb2a26 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
fca60960ea libbpf: Add BPF-side definitions of supported field relocation kinds
Add enum definition for Clang's __builtin_preserve_field_info()
second argument (info_kind). Currently only byte offset and existence
are supported. Corresponding Clang changes introducing this built-in can
be found at [0]

  [0] https://reviews.llvm.org/D67980

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191015182849.3922287-5-andriin@fb.com
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
0db22b01a1 libbpf: Add support for field existance CO-RE relocation
Add support for BPF_FRK_EXISTS relocation kind to detect existence of
captured field in a destination BTF, allowing conditional logic to
handle incompatible differences between kernels.

Also introduce opt-in relaxed CO-RE relocation handling option, which
makes libbpf emit warning for failed relocations, but proceed with other
relocations. Instruction, for which relocation failed, is patched with
(u32)-1 value.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191015182849.3922287-4-andriin@fb.com
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
807b9d7be1 libbpf: Refactor bpf_object__open APIs to use common opts
Refactor all the various bpf_object__open variations to ultimately
specify common bpf_object_open_opts struct. This makes it easy to keep
extending this common struct w/ extra parameters without having to
update all the legacy APIs.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191015182849.3922287-3-andriin@fb.com
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
a3d02f9ab4 libbpf: Update BTF reloc support to latest Clang format
BTF offset reloc was generalized in recent Clang into field relocation,
capturing extra u32 field, specifying what aspect of captured field
needs to be relocated. This changes .BTF.ext's record size for this
relocation from 12 bytes to 16 bytes. Given these format changes
happened in Clang before official released version, it's ok to not
support outdated 12-byte record size w/o breaking ABI.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191015182849.3922287-2-andriin@fb.com
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
54aac21f7e libbpf: Handle invalid typedef emitted by old GCC
Old GCC versions are producing invalid typedef for __gnuc_va_list
pointing to void. Special-case this and emit valid:

typedef __builtin_va_list __gnuc_va_list;

Reported-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20191011032901.452042-1-andriin@fb.com
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
d8dd0beb98 libbpf: Generate more efficient BPF_CORE_READ code
Existing BPF_CORE_READ() macro generates slightly suboptimal code. If
there are intermediate pointers to be read, initial source pointer is
going to be assigned into a temporary variable and then temporary
variable is going to be uniformly used as a "source" pointer for all
intermediate pointer reads. Schematically (ignoring all the type casts),
BPF_CORE_READ(s, a, b, c) is expanded into:
({
	const void *__t = src;
	bpf_probe_read(&__t, sizeof(*__t), &__t->a);
	bpf_probe_read(&__t, sizeof(*__t), &__t->b);

	typeof(s->a->b->c) __r;
	bpf_probe_read(&__r, sizeof(*__r), &__t->c);
})

This initial `__t = src` makes calls more uniform, but causes slightly
less optimal register usage sometimes when compiled with Clang. This can
cascase into, e.g., more register spills.

This patch fixes this issue by generating more optimal sequence:
({
	const void *__t;
	bpf_probe_read(&__t, sizeof(*__t), &src->a); /* <-- src here */
	bpf_probe_read(&__t, sizeof(*__t), &__t->b);

	typeof(s->a->b->c) __r;
	bpf_probe_read(&__r, sizeof(*__r), &__t->c);
})

Fixes: 7db3822ab991 ("libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191011023847.275936-1-andriin@fb.com
2019-10-15 19:43:48 -07:00
Ilya Maximets
e94f57a9ab libbpf: Fix passing uninitialized bytes to setsockopt
'struct xdp_umem_reg' has 4 bytes of padding at the end that makes
valgrind complain about passing uninitialized stack memory to the
syscall:

  Syscall param socketcall.setsockopt() points to uninitialised byte(s)
    at 0x4E7AB7E: setsockopt (in /usr/lib64/libc-2.29.so)
    by 0x4BDE035: xsk_umem__create@@LIBBPF_0.0.4 (xsk.c:172)
  Uninitialised value was created by a stack allocation
    at 0x4BDDEBA: xsk_umem__create@@LIBBPF_0.0.4 (xsk.c:140)

Padding bytes appeared after introducing of a new 'flags' field.
memset() is required to clear them.

Fixes: 10d30e301732 ("libbpf: add flags to umem config")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191009164929.17242-1-i.maximets@ovn.org
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
bda436be4a libbpf: Fix struct end padding in btf_dump
Fix a case where explicit padding at the end of a struct is necessary
due to non-standart alignment requirements of fields (which BTF doesn't
capture explicitly).

Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Reported-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20191008231009.2991130-2-andriin@fb.com
2019-10-15 19:43:48 -07:00
Andrii Nakryiko
a30df5c09f makefile: install new BPF-side headers along libbpf user-land ones
Install BPF-side helper headers:
- bpf_helpers.h
- bpf_helper_defs.h
- bpf_tracing.h
- bpf_endian.h
- bpf_core_read.h

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
e776bf7ec7 sync: teach sync script to generate bpf_helper_defs.h
Linux repo doesn't commit bpf_helper_defs.h, as it's re-generated on
build every time. For Github projection, though, it's much nicer to have
this header be pre-generated during sync and commited. This makes
integration story easier for all the users that use libbpf as
a submodule.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
46688687d5 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   02dc96ef6c25f990452c114c59d75c368a1f4c8f
Checkpoint bpf-next commit: f05c2001ecc98629cecd47728e4db11e5a17e58d
Baseline bpf commit:        1bd63524593b964934a33afd442df16b8f90e2b5
Checkpoint bpf commit:      106c35dda32f8b63f88cad7433f1b8bb0056958a

Andrii Nakryiko (7):
  libbpf: Bump current version to v0.0.6
  libbpf: stop enforcing kern_version, populate it for users
  libbpf: add bpf_object__open_{file, mem} w/ extensible opts
  libbpf: fix bpf_object__name() to actually return object name
  uapi/bpf: fix helper docs
  libbpf: Move bpf_{helpers, helper_defs, endian, tracing}.h into libbpf
  libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers

 include/uapi/linux/bpf.h |  32 +++----
 src/bpf_core_read.h      | 167 +++++++++++++++++++++++++++++++++
 src/bpf_endian.h         |  72 +++++++++++++++
 src/bpf_helpers.h        |  41 ++++++++
 src/bpf_tracing.h        | 195 +++++++++++++++++++++++++++++++++++++++
 src/libbpf.c             | 183 ++++++++++++++++++------------------
 src/libbpf.h             |  48 +++++++++-
 src/libbpf.map           |   6 ++
 src/libbpf_internal.h    |  32 +++++++
 9 files changed, 661 insertions(+), 115 deletions(-)
 create mode 100644 src/bpf_core_read.h
 create mode 100644 src/bpf_endian.h
 create mode 100644 src/bpf_helpers.h
 create mode 100644 src/bpf_tracing.h

--
2.17.1
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
19cbbd8f52 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
c87b3a6065 libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers
Add few macros simplifying BCC-like multi-level probe reads, while also
emitting CO-RE relocations for each read.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191008175942.1769476-7-andriin@fb.com
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
4c55ba2b19 libbpf: Move bpf_{helpers, helper_defs, endian, tracing}.h into libbpf
Move bpf_helpers.h, bpf_tracing.h, and bpf_endian.h into libbpf. Move
bpf_helper_defs.h generation into libbpf's Makefile. Ensure all those
headers are installed along the other libbpf headers. Also, adjust
selftests and samples include path to include libbpf now.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191008175942.1769476-6-andriin@fb.com
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
104006a054 uapi/bpf: fix helper docs
Various small fixes to BPF helper documentation comments, enabling
automatic header generation with a list of BPF helpers.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
bf83a95dee libbpf: fix bpf_object__name() to actually return object name
bpf_object__name() was returning file path, not name. Fix this.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
732f598282 libbpf: add bpf_object__open_{file, mem} w/ extensible opts
Add new set of bpf_object__open APIs using new approach to optional
parameters extensibility allowing simpler ABI compatibility approach.

This patch demonstrates an approach to implementing libbpf APIs that
makes it easy to extend existing APIs with extra optional parameters in
such a way, that ABI compatibility is preserved without having to do
symbol versioning and generating lots of boilerplate code to handle it.
To facilitate succinct code for working with options, add OPTS_VALID,
OPTS_HAS, and OPTS_GET macros that hide all the NULL, size, and zero
checks.

Additionally, newly added libbpf APIs are encouraged to follow similar
pattern of having all mandatory parameters as formal function parameters
and always have optional (NULL-able) xxx_opts struct, which should
always have real struct size as a first field and the rest would be
optional parameters added over time, which tune the behavior of existing
API, if specified by user.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
de3c5a17cb libbpf: stop enforcing kern_version, populate it for users
Kernel version enforcement for kprobes/kretprobes was removed from
5.0 kernel in 6c4fc209fcf9 ("bpf: remove useless version check for prog load").
Since then, BPF programs were specifying SEC("version") just to please
libbpf. We should stop enforcing this in libbpf, if even kernel doesn't
care. Furthermore, libbpf now will pre-populate current kernel version
of the host system, in case we are still running on old kernel.

This patch also removes __bpf_object__open_xattr from libbpf.h, as
nothing in libbpf is relying on having it in that header. That function
was never exported as LIBBPF_API and even name suggests its internal
version. So this should be safe to remove, as it doesn't break ABI.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
1a8a75037b libbpf: Bump current version to v0.0.6
New release cycle started, let's bump to v0.0.6 proactively.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20190930222503.519782-1-andriin@fb.com
2019-10-09 14:42:45 -07:00
Andrii Nakryiko
1a26b51b1c meson: kill meson.build as it's not used anymore
Meson.build was added to facilitate systemd integration, but systemd
integration went different direction and we don't need meson.build
anymore. So remove it and not maintain it anymore.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-01 16:51:32 -07:00
Andrii Nakryiko
2cc0829775 ci: execute install step in CI
Add simple execution of `make install` in Debian and Xenial build to
catch most obvious breakages.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-01 12:56:22 -07:00
Andrii Nakryiko
92cb475558 makefile: fix install target
After latest shared vs static libraries fixes, `make install` target
broke as it relied on now removed $(LIBS) variable. This patch fixes
issue by listing $(SHARED_LIBS) and $(STATIC_LIBS) explicitly.

Tested with and without BUILD_STATIC_ONLY.

Fixes: 8b2782a1f2 ("makefile: support libbpf symbol versioning in shared library mode")
Reported-by: Michal Rostecki <mrostecki@opensuse.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-10-01 12:56:22 -07:00
Andrii Nakryiko
8b2782a1f2 makefile: support libbpf symbol versioning in shared library mode
Similarly to Linux's 1bd63524593b ("libbpf: handle symbol versioning properly
for libbpf.a"), add necessary changes to build static and shared object
files separately with extra shared library flags. This allows to
properly handle symbol versioning in shared library mode, while still
having statically linkable library.

Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-09-30 21:42:42 -07:00
Andrii Nakryiko
886e8149a0 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b41dae061bbd722b9d7fa828f35d22035b218e18
Checkpoint bpf-next commit: 02dc96ef6c25f990452c114c59d75c368a1f4c8f
Baseline bpf commit:        e3439af4a339acd7fddbd6d59b8ecefaac07a611
Checkpoint bpf commit:      1bd63524593b964934a33afd442df16b8f90e2b5

Yonghong Song (1):
  libbpf: handle symbol versioning properly for libbpf.a

 src/libbpf_internal.h | 16 ++++++++++++++++
 src/xsk.c             |  4 ++--
 2 files changed, 18 insertions(+), 2 deletions(-)

--
2.17.1
2019-09-30 16:10:45 -07:00
Yonghong Song
d275397111 libbpf: handle symbol versioning properly for libbpf.a
bcc uses libbpf repo as a submodule. It brings in libbpf source
code and builds everything together to produce shared libraries.
With latest libbpf, I got the following errors:
  /bin/ld: libbcc_bpf.so.0.10.0: version node not found for symbol xsk_umem__create@LIBBPF_0.0.2
  /bin/ld: failed to set dynamic section sizes: Bad value
  collect2: error: ld returned 1 exit status
  make[2]: *** [src/cc/libbcc_bpf.so.0.10.0] Error 1

In xsk.c, we have
  asm(".symver xsk_umem__create_v0_0_2, xsk_umem__create@LIBBPF_0.0.2");
  asm(".symver xsk_umem__create_v0_0_4, xsk_umem__create@@LIBBPF_0.0.4");
The linker thinks the built is for LIBBPF but cannot find proper version
LIBBPF_0.0.2/4, so emit errors.

I also confirmed that using libbpf.a to produce a shared library also
has issues:
  -bash-4.4$ cat t.c
  extern void *xsk_umem__create;
  void * test() { return xsk_umem__create; }
  -bash-4.4$ gcc -c -fPIC t.c
  -bash-4.4$ gcc -shared t.o libbpf.a -o t.so
  /bin/ld: t.so: version node not found for symbol xsk_umem__create@LIBBPF_0.0.2
  /bin/ld: failed to set dynamic section sizes: Bad value
  collect2: error: ld returned 1 exit status
  -bash-4.4$

Symbol versioning does happens in commonly used libraries, e.g., elfutils
and glibc. For static libraries, for a versioned symbol, the old definitions
will be ignored, and the symbol will be an alias to the latest definition.
For example, glibc sched_setaffinity is versioned.
  -bash-4.4$ readelf -s /usr/lib64/libc.so.6 | grep sched_setaffinity
     756: 000000000013d3d0    13 FUNC    GLOBAL DEFAULT   13 sched_setaffinity@GLIBC_2.3.3
     757: 00000000000e2e70   455 FUNC    GLOBAL DEFAULT   13 sched_setaffinity@@GLIBC_2.3.4
    1800: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS sched_setaffinity.c
    4228: 00000000000e2e70   455 FUNC    LOCAL  DEFAULT   13 __sched_setaffinity_new
    4648: 000000000013d3d0    13 FUNC    LOCAL  DEFAULT   13 __sched_setaffinity_old
    7338: 000000000013d3d0    13 FUNC    GLOBAL DEFAULT   13 sched_setaffinity@GLIBC_2
    7380: 00000000000e2e70   455 FUNC    GLOBAL DEFAULT   13 sched_setaffinity@@GLIBC_
  -bash-4.4$
For static library, the definition of sched_setaffinity aliases to the new definition.
  -bash-4.4$ readelf -s /usr/lib64/libc.a | grep sched_setaffinity
  File: /usr/lib64/libc.a(sched_setaffinity.o)
     8: 0000000000000000   455 FUNC    GLOBAL DEFAULT    1 __sched_setaffinity_new
    12: 0000000000000000   455 FUNC    WEAK   DEFAULT    1 sched_setaffinity

For both elfutils and glibc, additional macros are used to control different handling
of symbol versioning w.r.t static and shared libraries.
For elfutils, the macro is SYMBOL_VERSIONING
(https://sourceware.org/git/?p=elfutils.git;a=blob;f=lib/eu-config.h).
For glibc, the macro is SHARED
(https://sourceware.org/git/?p=glibc.git;a=blob;f=include/shlib-compat.h;hb=refs/heads/master)

This patch used SHARED as the macro name. After this patch, the libbpf.a has
  -bash-4.4$ readelf -s libbpf.a | grep xsk_umem__create
     372: 0000000000017145  1190 FUNC    GLOBAL DEFAULT    1 xsk_umem__create_v0_0_4
     405: 0000000000017145  1190 FUNC    GLOBAL DEFAULT    1 xsk_umem__create
     499: 00000000000175eb   103 FUNC    GLOBAL DEFAULT    1 xsk_umem__create_v0_0_2
  -bash-4.4$
No versioned symbols for xsk_umem__create.
The libbpf.a can be used to build a shared library succesfully.
  -bash-4.4$ cat t.c
  extern void *xsk_umem__create;
  void * test() { return xsk_umem__create; }
  -bash-4.4$ gcc -c -fPIC t.c
  -bash-4.4$ gcc -shared t.o libbpf.a -o t.so
  -bash-4.4$

Fixes: 10d30e301732 ("libbpf: add flags to umem config")
Cc: Kevin Laatz <kevin.laatz@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-09-30 16:10:45 -07:00
Andrii Nakryiko
ede18f80d8 scripts: fix empty cherry-pick handling, fix IGNORE_CONSISTENCY check
Fix two issues I've encountered during latest bpf/bpf-next sync.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-09-30 15:02:52 -07:00
Andrii Nakryiko
07cd489681 libbpf: fix Github-only indentation issue
When appying urgent fix to Github mirror, before it was pushed to linux
repo, there were some indentation issues, which eventually got fixed
upstream, but are still in Github mirror. Fix it to prevent future merge
conflicts.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-09-30 15:02:43 -07:00
Andrii Nakryiko
d2f307c7f6 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   0bb52b0dfc88a155688f492aba8e686147600278
Checkpoint bpf-next commit: b41dae061bbd722b9d7fa828f35d22035b218e18
Baseline bpf commit:        2c238177bd7f4b14bdf7447cc1cd9bb791f147e6
Checkpoint bpf commit:      e3439af4a339acd7fddbd6d59b8ecefaac07a611

Alexei Starovoitov (1):
  tools/bpf: sync bpf.h

Andrii Nakryiko (2):
  libbpf: fix false uninitialized variable warning
  libbpf: Teach btf_dumper to emit stand-alone anonymous enum
    definitions

Daniel Borkmann (1):
  bpf: sync bpf.h to tools infrastructure

Kevin Laatz (1):
  libbpf: add flags to umem config

Toke Høiland-Jørgensen (1):
  libbpf: Remove getsockopt() check for XDP_OPTIONS

 include/uapi/linux/bpf.h    |  7 ++-
 include/uapi/linux/if_xdp.h |  9 ++++
 src/btf_dump.c              | 94 ++++++++++++++++++++++++++++++++++---
 src/libbpf.map              |  1 +
 src/xsk.c                   | 44 +++++++++++------
 src/xsk.h                   | 27 +++++++++++
 6 files changed, 160 insertions(+), 22 deletions(-)

--
2.17.1
2019-09-26 13:29:16 -07:00
Andrii Nakryiko
990cef2a0c libbpf: Teach btf_dumper to emit stand-alone anonymous enum definitions
BTF-to-C converter previously skipped anonymous enums in an assumption
that those are embedded in struct's field definitions. This is not
always the case and a lot of kernel constants are defined as part of
anonymous enums. This change fixes the logic by eagerly marking all
types as either referenced by any other type or not. This is enough to
distinguish two classes of anonymous enums and emit previously omitted
enum definitions.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20190925203745.3173184-1-andriin@fb.com
2019-09-26 13:29:16 -07:00
Andrii Nakryiko
4c2c521513 libbpf: fix false uninitialized variable warning
Some compilers emit warning for potential uninitialized next_id usage.
The code is correct, but control flow is too complicated for some
compilers to figure this out. Re-initialize next_id to satisfy
compiler.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-26 13:29:16 -07:00
Toke Høiland-Jørgensen
b1e911e9ba libbpf: Remove getsockopt() check for XDP_OPTIONS
The xsk_socket__create() function fails and returns an error if it cannot
get the XDP_OPTIONS through getsockopt(). However, support for XDP_OPTIONS
was not added until kernel 5.3, so this means that creating XSK sockets
always fails on older kernels.

Since the option is just used to set the zero-copy flag in the xsk struct,
and that flag is not really used for anything yet, just remove the
getsockopt() call until a proper use for it is introduced.

Suggested-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-26 13:29:16 -07:00
Kevin Laatz
ae673dc91f libbpf: add flags to umem config
This patch adds a 'flags' field to the umem_config and umem_reg structs.
This will allow for more options to be added for configuring umems.

The first use for the flags field is to add a flag for unaligned chunks
mode. These flags can either be user-provided or filled with a default.

Since we change the size of the xsk_umem_config struct, we need to version
the ABI. This patch includes the ABI versioning for xsk_umem__create. The
Makefile was also updated to handle multiple function versions in
check-abi.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-26 13:29:16 -07:00
Alexei Starovoitov
5a256d12bf tools/bpf: sync bpf.h
sync bpf.h from kernel/ to tools/

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-26 13:29:16 -07:00
Ondrej Mosnacek
ae8edc7624 libbpf: fix linker flags for shared library
The -lelf flag needs to be specified *after* the object files, otherwise
the output library produced by some compilers doesn't contain a link to
libelf.so:

(Example from Debian testing run on Travis.)
$ ldd libelf.so
	linux-vdso.so.1 (0x00007ffcbfda9000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f75f8d24000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f75f8f0f000)

Linking against such library then produces 'undefined reference to ...'
errors unless the target links against libelf as well. After this commit
the built library references the libelf library correctly:

$ ldd libbpf.so
	linux-vdso.so.1 (0x00007ffc821f1000)
	libelf.so.1 => /usr/lib/x86_64-linux-gnu/libelf.so.1 (0x00007f70ea3ec000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f70ea22c000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f70ea20f000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f70ea433000)

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
2019-09-26 00:41:27 -07:00
Ondrej Mosnacek
8f8b4a14fa Travis CI: add sanity check for libelf dependency
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
2019-09-26 00:41:27 -07:00
Yonghong Song
476e158b07 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b753c5a7f99f390fc100de18647ce0dcacdceafc
Checkpoint bpf-next commit: 0bb52b0dfc88a155688f492aba8e686147600278
Baseline bpf commit:        91b4db5313a2c793aabc2143efb8ed0cf0fdd097
Checkpoint bpf commit:      2c238177bd7f4b14bdf7447cc1cd9bb791f147e6

Andrii Nakryiko (1):
  libbpf: make libbpf.map source of truth for libbpf version

Ivan Khoronzhuk (1):
  libbpf: use LFS (_FILE_OFFSET_BITS) instead of direct mmap2 syscall

Magnus Karlsson (1):
  libbpf: add support for need_wakeup flag in AF_XDP part

Peter Wu (1):
  bpf: sync bpf.h to tools/

Quentin Monnet (3):
  tools: bpf: synchronise BPF UAPI header with tools
  libbpf: refactor bpf_*_get_next_id() functions
  libbpf: add bpf_btf_get_next_id() to cycle through BTF objects

Stanislav Fomichev (1):
  bpf: sync bpf.h to tools/

 include/uapi/linux/bpf.h    | 12 ++++++---
 include/uapi/linux/if_xdp.h | 13 +++++++++
 src/bpf.c                   | 24 ++++++++---------
 src/bpf.h                   |  1 +
 src/libbpf.map              |  5 ++++
 src/xsk.c                   | 53 +++++++++++++------------------------
 src/xsk.h                   |  6 +++++
 7 files changed, 64 insertions(+), 50 deletions(-)

--
2.17.1
2019-08-27 13:58:15 -07:00
Peter Wu
13e1ee420e bpf: sync bpf.h to tools/
Fix a 'struct pt_reg' typo and clarify when bpf_trace_printk discards
lines. Affects documentation only.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-27 13:58:15 -07:00
Ivan Khoronzhuk
3e2bab6d2c libbpf: use LFS (_FILE_OFFSET_BITS) instead of direct mmap2 syscall
Drop __NR_mmap2 fork in flavor of LFS, that is _FILE_OFFSET_BITS=64
(glibc & bionic) / LARGEFILE64_SOURCE (for musl) decision. It allows
mmap() to use 64bit offset that is passed to mmap2 syscall. As result
pgoff is not truncated and no need to use direct access to mmap2 for
32 bits systems.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-27 13:58:15 -07:00
Quentin Monnet
9084f4cd4d libbpf: add bpf_btf_get_next_id() to cycle through BTF objects
Add an API function taking a BTF object id and providing the id of the
next BTF object in the kernel. This can be used to list all BTF objects
loaded on the system.

v2:
- Rebase on top of Andrii's changes regarding libbpf versioning.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-27 13:58:15 -07:00
Quentin Monnet
66d20edaf0 libbpf: refactor bpf_*_get_next_id() functions
In preparation for the introduction of a similar function for retrieving
the id of the next BTF object, consolidate the code from
bpf_prog_get_next_id() and bpf_map_get_next_id() in libbpf.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-27 13:58:15 -07:00
Quentin Monnet
d8d6772ab8 tools: bpf: synchronise BPF UAPI header with tools
Synchronise the bpf.h header under tools, to report the addition of the
new BPF_BTF_GET_NEXT_ID syscall command for bpf().

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-27 13:58:15 -07:00
Stanislav Fomichev
4397d09cd8 bpf: sync bpf.h to tools/
Sync new sk storage clone flag.

Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-27 13:58:15 -07:00
Magnus Karlsson
5771dacd3d libbpf: add support for need_wakeup flag in AF_XDP part
This commit adds support for the new need_wakeup flag in AF_XDP. The
xsk_socket__create function is updated to handle this and a new
function is introduced called xsk_ring_prod__needs_wakeup(). This
function can be used by the application to check if Rx and/or Tx
processing needs to be explicitly woken up.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-27 13:58:15 -07:00
Andrii Nakryiko
d34efeeef1 libbpf: make libbpf.map source of truth for libbpf version
Currently libbpf version is specified in 2 places: libbpf.map and
Makefile. They easily get out of sync and it's very easy to update one,
but forget to update another one. In addition, Github projection of
libbpf has to maintain its own version which has to be remembered to be
kept in sync manually, which is very error-prone approach.

This patch makes libbpf.map a source of truth for libbpf version and
uses shell invocation to parse out correct full and major libbpf version
to use during build. Now we need to make sure that once new release
cycle starts, we need to add (initially) empty section to libbpf.map
with correct latest version.

This also will make it possible to keep Github projection consistent
with kernel sources version of libbpf by adopting similar parsing of
version from libbpf.map.

v2->v3:
- grep -o + sort -rV (Andrey);

v1->v2:
- eager version vars evaluation (Jakub);
- simplified version regex (Andrey);

Cc: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-27 13:58:15 -07:00
Petar Penkov
db63a5aa5d filter.h: fix BPF_LD_MAP_VALUE definition
The current definition calls BPF_LD_IMM64_RAW_FULL with
BPF_PSEUDO_MAP_FD but the original patch[0] invokes it with
BPF_PSEUDO_MAP_VALUE.

[0] https://patchwork.ozlabs.org/patch/1082785/
2019-08-22 16:33:59 -07:00
Andrii Nakryiko
d60f568961 Makefile: get libbpf version from libbpf.map
Similarly to kernel-side Makefile ([0]), get libbpf version by looking
at latest version in libbpf.map.

  [0] https://patchwork.ozlabs.org/patch/1147232/

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-08-14 22:41:40 -07:00
Andrii Nakryiko
e78a36f4b0 sync: fix non-empty merge detection/handling
Fix how non-empty merge detection is done. Allow to proceed despite
non-empty merges (they will typically will cause conflicts during
applying patches, but if conflicts were handled already, it should be ok
to ignore this problem).

Also ensure that diff's output is in unified diff format.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-08-14 09:53:04 -07:00
Andrii Nakryiko
c8a7eb06bd sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b707659213d3c70f2c704ec950df6263b4bffe84
Checkpoint bpf-next commit: b753c5a7f99f390fc100de18647ce0dcacdceafc
Baseline bpf commit:        f1fc7249dddc0e52d9e805e2e661caa118649509
Checkpoint bpf commit:      91b4db5313a2c793aabc2143efb8ed0cf0fdd097

Andrii Nakryiko (2):
  libbpf: fix missing __WORDSIZE definition
  libbpf: attempt to load kernel BTF from sysfs first

Arnaldo Carvalho de Melo (1):
  tools headers UAPI: Sync if_link.h with the kernel

Daniel Borkmann (1):
  bpf: sync bpf.h to tools infrastructure

 include/uapi/linux/bpf.h     |  4 +--
 include/uapi/linux/if_link.h |  5 +++
 src/hashmap.h                |  5 +++
 src/libbpf.c                 | 64 ++++++++++++++++++++++++++++++++----
 4 files changed, 69 insertions(+), 9 deletions(-)

--
2.17.1
2019-08-14 09:52:47 -07:00
Daniel Borkmann
b48c14807b bpf: sync bpf.h to tools infrastructure
Pull in updates in BPF helper function description.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-14 09:52:47 -07:00
Andrii Nakryiko
a3b4055ec7 libbpf: attempt to load kernel BTF from sysfs first
Add support for loading kernel BTF from sysfs (/sys/kernel/btf/vmlinux)
as a target BTF. Also extend the list of on disk search paths for
vmlinux ELF image with entries that perf is searching for.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-14 09:52:47 -07:00
Andrii Nakryiko
30603852f4 libbpf: fix missing __WORDSIZE definition
hashmap.h depends on __WORDSIZE being defined. It is defined by
glibc/musl in different headers. It's an explicit goal for musl to be
"non-detectable" at compilation time, so instead include glibc header if
glibc is explicitly detected and fall back to musl header otherwise.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Fixes: e3b924224028 ("libbpf: add resizable non-thread safe internal hashmap")
Link: https://lkml.kernel.org/r/20190718173021.2418606-1-andriin@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-14 09:52:47 -07:00
Arnaldo Carvalho de Melo
1a28fa5dac tools headers UAPI: Sync if_link.h with the kernel
To pick the changes in:

  07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications")

And silence this build warning:

  Kernel ABI header at 'tools/include/uapi/linux/if_link.h' differs from latest version at 'include/uapi/linux/if_link.h'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vincent Bernat <vincent@bernat.ch>
Link: https://lkml.kernel.org/n/tip-3liw4exxh8goc0rq9xryl2kv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-14 09:52:47 -07:00
Andrii Nakryiko
def5576b37 sync: pull patches from bpf tree as well
Add patches sync from both bpf and bpf-next trees at the same time.
Baseline checkpoint commits are tracked independently for both.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-08-09 08:41:26 -07:00
Andrii Nakryiko
3e45a16621 sync: make patch applying interactive, allow to ignore consistency
Give more control over patching process to allow manual intervention and
fix up, after which process will continue. Also allow an option to
ignore consistency check results (when having both bpf and bpf-next
changes).

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-08-09 08:41:26 -07:00
Andrii Nakryiko
2c0e53cb08 sync: attempt to auto-resolve non-libbpf conflicts
If cherry-picked commit contains non-libbpf files, chances are high that
this will result in conflict, because we are generally skipping commits
that didn't touch libbpf files, which means that our working copy will
not be up-to-date for non-libbpf files. This change checks if conflicts
are only in non-libbpf files and marks them as resolved. This will work
fine as long as we don't cherry-pick some more non-libbpf changes to
same set of files that happen to conflict with not-so-resolved version
of non-libbpf files. But anyways, this should help in a lot of cases.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-08-09 08:41:26 -07:00
Andrii Nakryiko
6227c6f8dd sync: add manual cherry-picking mode
This is sometimes necessary when we did ad-hoc urgent bug fixes, which
are not identical to the ones in kernel.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-08-09 08:41:26 -07:00
Andrii Nakryiko
00ad180d07 sync: extract cherry-picking logic for reuse
Extract non-empty merge validation and cherry-picking logic so that it
can be re-used for bpf and bpf-next commits.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-08-09 08:41:26 -07:00
Andrii Nakryiko
715a58d593 sync: improve and automate already synced patches detection
Make detection more precise and automate skip/sync decision, if
everything looks sane.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-08-09 08:41:26 -07:00
Andrii Nakryiko
11052fc1be sync: add commit_desc() function and move things around a bit
Just cleaning up a bunch of stuff before the next refactoring.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-08-09 08:41:26 -07:00
Andrii Nakryiko
97ecda3b25 sync: extract directory changing function
Extract the logic of handling relative paths within the script into
go_to() function.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-08-09 08:41:26 -07:00
Andrii Nakryiko
342bcfa319 sync: centralize kernel-to-github paths mapping
Use associative array (requires at least bash 4) to centralize mapping
of paths between kernel's libbpf layout and the one on Github. This
minimizes the chance of all those mappings getting out of sync (which
happened twice before).

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-08-09 08:41:26 -07:00
Andrii Nakryiko
c020432531 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   66b5f1c439843bcbab01cc7f3854ae2742f3d1e3
Checkpoint bpf-next commit: b707659213d3c70f2c704ec950df6263b4bffe84
Baseline bpf commit:        53db1cced401e4c65d49edf198e00daa9fc837e6
Checkpoint bpf commit:      f1fc7249dddc0e52d9e805e2e661caa118649509

Andrii Nakryiko (8):
  libbpf: provide more helpful message on uninitialized global var
  libbpf: return previous print callback from libbpf_set_print
  libbpf: add helpers for working with BTF types
  libbpf: convert libbpf code to use new btf helpers
  libbpf: add .BTF.ext offset relocation section loading
  libbpf: implement BPF CO-RE offset relocation algorithm
  libbpf: fix SIGSEGV when BTF loading fails, but .BTF.ext exists
  libbpf: sanitize VAR to conservative 1-byte INT

Petar Penkov (1):
  bpf: sync bpf.h to tools/

Stanislav Fomichev (2):
  tools/bpf: sync bpf_flow_keys flags
  bpf/flow_dissector: support ipv6 flow_label and
    BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL

Toke Høiland-Jørgensen (2):
  tools/include/uapi: Add devmap_hash BPF map type
  tools/libbpf_probes: Add new devmap_hash type

 include/uapi/linux/bpf.h |  44 +-
 src/btf.c                | 250 +++++-----
 src/btf.h                | 182 ++++++++
 src/btf_dump.c           | 138 ++----
 src/libbpf.c             | 972 ++++++++++++++++++++++++++++++++++++---
 src/libbpf.h             |   3 +-
 src/libbpf_internal.h    | 105 +++++
 src/libbpf_probes.c      |   1 +
 8 files changed, 1405 insertions(+), 290 deletions(-)

--
2.17.1
2019-08-09 08:40:44 -07:00
Andrii Nakryiko
99ce275b52 libbpf: implement BPF CO-RE offset relocation algorithm
This patch implements the core logic for BPF CO-RE offsets relocations.
Every instruction that needs to be relocated has corresponding
bpf_offset_reloc as part of BTF.ext. Relocations are performed by trying
to match recorded "local" relocation spec against potentially many
compatible "target" types, creating corresponding spec. Details of the
algorithm are noted in corresponding comments in the code.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-09 08:40:44 -07:00
Andrii Nakryiko
c0a5f7ee11 libbpf: add .BTF.ext offset relocation section loading
Add support for BPF CO-RE offset relocations. Add section/record
iteration macros for .BTF.ext. These macro are useful for iterating over
each .BTF.ext record, either for dumping out contents or later for BPF
CO-RE relocation handling.

To enable other parts of libbpf to work with .BTF.ext contents, moved
a bunch of type definitions into libbpf_internal.h.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-09 08:40:44 -07:00
Andrii Nakryiko
0da9ba439f libbpf: convert libbpf code to use new btf helpers
Simplify code by relying on newly added BTF helper functions.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-09 08:40:44 -07:00
Andrii Nakryiko
c4735d9e05 libbpf: add helpers for working with BTF types
Add lots of frequently used helpers that simplify working with BTF
types.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-09 08:40:44 -07:00
Petar Penkov
563f1d3fff bpf: sync bpf.h to tools/
Sync updated documentation for bpf_redirect_map.

Sync the bpf_tcp_gen_syncookie helper function definition with the one
in tools/uapi.

Signed-off-by: Petar Penkov <ppenkov@google.com>
Reviewed-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-09 08:40:44 -07:00
Toke Høiland-Jørgensen
c5d4295fc5 tools/libbpf_probes: Add new devmap_hash type
This adds the definition for BPF_MAP_TYPE_DEVMAP_HASH to libbpf_probes.c in
tools/lib/bpf.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-09 08:40:44 -07:00
Toke Høiland-Jørgensen
b606dc725e tools/include/uapi: Add devmap_hash BPF map type
This adds the devmap_hash BPF map type to the uapi headers in tools/.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-09 08:40:44 -07:00
Andrii Nakryiko
f615047aa0 libbpf: return previous print callback from libbpf_set_print
By returning previously set print callback from libbpf_set_print, it's
possible to restore it, eventually. This is useful when running many
independent test with one default print function, but overriding log
verbosity for particular subset of tests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-09 08:40:44 -07:00
Stanislav Fomichev
84a508a51f bpf/flow_dissector: support ipv6 flow_label and BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL
Add support for exporting ipv6 flow label via bpf_flow_keys.
Export flow label from bpf_flow.c and also return early when
BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL is passed.

Acked-by: Petar Penkov <ppenkov@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-09 08:40:44 -07:00
Stanislav Fomichev
c59016e100 tools/bpf: sync bpf_flow_keys flags
Export bpf_flow_keys flags to tools/libbpf/selftests.

Acked-by: Petar Penkov <ppenkov@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-09 08:40:44 -07:00
Andrii Nakryiko
509ef92905 libbpf: provide more helpful message on uninitialized global var
When BPF program defines uninitialized global variable, it's put into
a special COMMON section. Libbpf will reject such programs, but will
provide very unhelpful message with garbage-looking section index.

This patch detects special section cases and gives more explicit error
message.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-09 08:40:44 -07:00
Andrii Nakryiko
2c9394f2a3 libbpf: set BTF FD for prog only when there is supported .BTF.ext data
5d01ab7bac46 ("libbpf: fix erroneous multi-closing of BTF FD")
introduced backwards-compatibility issue, manifesting itself as -E2BIG
error returned on program load due to unknown non-zero btf_fd attribute
value for BPF_PROG_LOAD sys_bpf() sub-command.

This patch fixes bug by ensuring that we only ever associate BTF FD with
program if there is a BTF.ext data that was successfully loaded into
kernel, which automatically means kernel supports func_info/line_info
and associated BTF FD for progs (checked and ensured also by BTF
sanitization code).

Fixes: 5d01ab7bac46 ("libbpf: fix erroneous multi-closing of BTF FD")
Reported-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-01 11:01:38 -07:00
Takshak Chahande
0f4d83f3ab libbpf : make libbpf_num_possible_cpus function thread safe
Having static variable `cpus` in libbpf_num_possible_cpus function
without guarding it with mutex makes this function thread-unsafe.

If multiple threads accessing this function, in the current form; it
leads to incrementing the static variable value `cpus` in the multiple
of total available CPUs.

Used local stack variable to calculate the number of possible CPUs and
then updated the static variable using WRITE_ONCE().

Changes since v1:
 * added stack variable to calculate cpus
 * serialized static variable update using WRITE_ONCE()
 * fixed Fixes tag

Fixes: 6446b3155521 ("bpf: add a new API libbpf_num_possible_cpus()")
Signed-off-by: Takshak Chahande <ctakshak@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-01 11:01:38 -07:00
hex
6a7b28b6a1 libbpf: fix extraversion in Makefile
The current LIBBPF_VERSION (0.0.3) doesn't match with ABI version LIBBPF_0.0.4.
Fix the EXTRAVERSION portion.
2019-07-31 21:56:11 -07:00
Andrii Nakryiko
d76d264ac0 libbpf: fix erroneous multi-closing of BTF FD
Libbpf stores associated BTF FD per each instance of bpf_program. When
program is unloaded, that FD is closed. This is wrong, because leads to
a race and possibly closing of unrelated files, if application
simultaneously opens new files while bpf_programs are unloaded.

It's also unnecessary, because struct btf "owns" that FD, and
btf__free(), called from bpf_object__close() will close it. Thus the fix
is to never have per-program BTF FD and fetch it from obj->btf, when
necessary.

Fixes: 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections")
Reported-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-29 16:32:58 -07:00
Andrii Nakryiko
63a3bdf23a libbpf: silence GCC8 warning about string truncation
Despite a proper NULL-termination after strncpy(..., ..., IFNAMSIZ - 1),
GCC8 still complains about *expected* string truncation:

  xsk.c:330:2: error: 'strncpy' output may be truncated copying 15 bytes
  from a string of length 15 [-Werror=stringop-truncation]
    strncpy(ifr.ifr_name, xsk->ifname, IFNAMSIZ - 1);

This patch gets rid of the issue altogether by using memcpy instead.
There is no performance regression, as strncpy will still copy and fill
all of the bytes anyway.

v1->v2:
- rebase against bpf tree.

Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-29 16:32:58 -07:00
Ilya Maximets
12fa15e89a libbpf: fix using uninitialized ioctl results
'channels.max_combined' initialized only on ioctl success and
errno is only valid on ioctl failure.

The code doesn't produce any runtime issues, but makes memory
sanitizers angry:

 Conditional jump or move depends on uninitialised value(s)
    at 0x55C056F: xsk_get_max_queues (xsk.c:336)
    by 0x55C05B2: xsk_create_bpf_maps (xsk.c:354)
    by 0x55C089F: xsk_setup_xdp_prog (xsk.c:447)
    by 0x55C0E57: xsk_socket__create (xsk.c:601)
  Uninitialised value was created by a stack allocation
    at 0x55C04CD: xsk_get_max_queues (xsk.c:318)

Additionally fixed warning on uninitialized bytes in ioctl arguments:

 Syscall param ioctl(SIOCETHTOOL) points to uninitialised byte(s)
    at 0x648D45B: ioctl (in /usr/lib64/libc-2.28.so)
    by 0x55C0546: xsk_get_max_queues (xsk.c:330)
    by 0x55C05B2: xsk_create_bpf_maps (xsk.c:354)
    by 0x55C089F: xsk_setup_xdp_prog (xsk.c:447)
    by 0x55C0E57: xsk_socket__create (xsk.c:601)
  Address 0x1ffefff378 is on thread 1's stack
  in frame #1, created by xsk_get_max_queues (xsk.c:318)
  Uninitialised value was created by a stack allocation
    at 0x55C04CD: xsk_get_max_queues (xsk.c:318)

CC: Magnus Karlsson <magnus.karlsson@intel.com>
Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-29 16:32:58 -07:00
Arnaldo Carvalho de Melo
b987dcfecb libbpf: Avoid designated initializers for unnamed union members
As it fails to build in some systems with:

  libbpf.c: In function 'perf_buffer__new':
  libbpf.c:4515: error: unknown field 'sample_period' specified in initializer
  libbpf.c:4516: error: unknown field 'wakeup_events' specified in initializer

Doing as:

    attr.sample_period = 1;

I.e. not as a designated initializer makes it build everywhere.

Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: fb84b8224655 ("libbpf: add perf buffer API")
Link: https://lkml.kernel.org/n/tip-hnlmch8qit1ieksfppmr32si@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-29 16:32:58 -07:00
Arnaldo Carvalho de Melo
9c1ab4d070 libbpf: Fix endianness macro usage for some compilers
Using endian.h and its endianness macros makes this code build in a
wider range of compilers, as some don't have those macros
(__BYTE_ORDER__, __ORDER_LITTLE_ENDIAN__, __ORDER_BIG_ENDIAN__),
so use instead endian.h's macros (__BYTE_ORDER, __LITTLE_ENDIAN,
__BIG_ENDIAN) which makes this code even shorter :-)

Acked-by: Andrii Nakryiko <andriin@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 12ef5634a855 ("libbpf: simplify endianness check")
Fixes: e6c64855fd7a ("libbpf: add btf__parse_elf API to load .BTF and .BTF.ext")
Link: https://lkml.kernel.org/n/tip-eep5n8vgwcdphw3uc058k03u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-29 16:32:58 -07:00
Andrii Nakryiko
550aa56dd4 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline commit:   8fc9f8bedf1bdaea48382ae2e3dd558e2b939cee
Checkpoint commit: 66b5f1c439843bcbab01cc7f3854ae2742f3d1e3

Andrii Nakryiko (2):
  libbpf: fix ptr to u64 conversion warning on 32-bit platforms
  libbpf: fix another GCC8 warning for strncpy

Baruch Siach (1):
  bpf: fix uapi bpf_prog_info fields alignment

Mauro Carvalho Chehab (2):
  docs: cgroup-v1: convert docs to ReST and rename to *.rst
  docs: cgroup-v1: add it to the admin-guide book

Stanislav Fomichev (1):
  bpf: sync bpf.h to tools/

 include/uapi/linux/bpf.h | 7 ++++---
 src/libbpf.c             | 4 ++--
 src/xsk.c                | 3 ++-
 3 files changed, 8 insertions(+), 6 deletions(-)

--
2.17.1
2019-07-23 14:26:50 -07:00
Andrii Nakryiko
54facd3fce libbpf: fix another GCC8 warning for strncpy
Similar issue was fixed in cdfc7f888c2a ("libbpf: fix GCC8 warning for
strncpy") already. This one was missed. Fixing now.

Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-23 14:26:50 -07:00
Stanislav Fomichev
1346b5b538 bpf: sync bpf.h to tools/
Update bpf_sock_addr comments to indicate support for 8-byte reads
from user_ip6 and msg_src_ip6.

Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-23 14:26:50 -07:00
Andrii Nakryiko
78d3666065 libbpf: fix ptr to u64 conversion warning on 32-bit platforms
On 32-bit platforms compiler complains about conversion:

libbpf.c: In function ‘perf_event_open_probe’:
libbpf.c:4112:17: error: cast from pointer to integer of different
size [-Werror=pointer-to-int-cast]
  attr.config1 = (uint64_t)(void *)name; /* kprobe_func or uprobe_path */
                 ^

Reported-by: Matt Hart <matthew.hart@linaro.org>
Fixes: b26500274767 ("libbpf: add kprobe/uprobe attach API")
Tested-by: Matt Hart <matthew.hart@linaro.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-23 14:26:50 -07:00
Mauro Carvalho Chehab
ce2eb85588 docs: cgroup-v1: add it to the admin-guide book
Those files belong to the admin guide, so add them.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-23 14:26:50 -07:00
Baruch Siach
6d4104b077 bpf: fix uapi bpf_prog_info fields alignment
Merge commit 1c8c5a9d38f60 ("Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next") undid the
fix from commit 36f9814a494 ("bpf: fix uapi hole for 32 bit compat
applications") by taking the gpl_compatible 1-bit field definition from
commit b85fab0e67b162 ("bpf: Add gpl_compatible flag to struct
bpf_prog_info") as is. That breaks architectures with 16-bit alignment
like m68k. Add 31-bit pad after gpl_compatible to restore alignment of
following fields.

Thanks to Dmitry V. Levin his analysis of this bug history.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-23 14:26:50 -07:00
Mauro Carvalho Chehab
43c14e871c docs: cgroup-v1: convert docs to ReST and rename to *.rst
Convert the cgroup-v1 files to ReST format, in order to
allow a later addition to the admin-guide.

The conversion is actually:
  - add blank lines and identation in order to identify paragraphs;
  - fix tables markups;
  - add some lists markups;
  - mark literal blocks;
  - adjust title markups.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2019-07-23 14:26:50 -07:00
Andrii Nakryiko
e61f4b8269 scripts/sync-kernel.sh: add missing if_xdp.h in one of file lists
if_xdp.h wasn't added to one of few file lists, causing sync content
verification failure.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-07-19 15:12:56 -07:00
Andrii Nakryiko
80a0eca14b libbpf: sanitize VAR to conservative 1-byte INT
If VAR in non-sanitized BTF was size less than 4, converting such VAR
into an INT with size=4 will cause BTF validation failure due to
violationg of STRUCT (into which DATASEC was converted) member size.
Fix by conservatively using size=1.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-07-19 13:01:08 -07:00
Andrii Nakryiko
5a8c675d0a libbpf: fix SIGSEGV when BTF loading fails, but .BTF.ext exists
In case when BTF loading fails despite sanitization, but BPF object has
.BTF.ext loaded as well, we free and null obj->btf, but not
obj->btf_ext. This leads to an attempt to relocate .BTF.ext later on
during bpf_object__load(), which assumes obj->btf is present. This leads
to SIGSEGV on null pointer access. Fix bug by freeing and nulling
obj->btf_ext as well.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-07-19 13:01:08 -07:00
Andrii Nakryiko
45ad862601 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline commit:   2c6a577d719e96c89b1aaa631089cce69b4d0045
Checkpoint commit: 8fc9f8bedf1bdaea48382ae2e3dd558e2b939cee

Andrii Nakryiko (19):
  libbpf: add common min/max macro to libbpf_internal.h
  libbpf: extract BTF loading logic
  libbpf: streamline ELF parsing error-handling
  libbpf: refactor map initialization
  libbpf: identify maps by section index in addition to offset
  libbpf: split initialization and loading of BTF
  libbpf: allow specifying map definitions using BTF
  libbpf: constify getter APIs
  libbpf: fix GCC8 warning for strncpy
  libbpf: make libbpf_strerror_r agnostic to sign of error
  libbpf: introduce concept of bpf_link
  libbpf: add ability to attach/detach BPF program to perf event
  libbpf: add kprobe/uprobe attach API
  libbpf: add tracepoint attach API
  libbpf: add raw tracepoint attach API
  libbpf: capture value in BTF type info for BTF-defined map defs
  libbpf: add perf buffer API
  libbpf: auto-set PERF_EVENT_ARRAY size to number of CPUs
  libbpf: add perf_buffer_ prefix to README

Colin Ian King (1):
  libbpf: fix spelling mistake "conflictling" -> "conflicting"

Daniel Borkmann (2):
  bpf: sync tooling uapi header
  bpf, libbpf: enable recvmsg attach types

Ivan Khoronzhuk (1):
  libbpf: fix max() type mismatch for 32bit

Leo Yan (1):
  bpf, libbpf, smatch: Fix potential NULL pointer dereference

Martynas Pumputis (1):
  bpf: sync BPF_FIB_LOOKUP flag changes with BPF uapi

Maxim Mikityanskiy (3):
  xsk: Add getsockopt XDP_OPTIONS
  libbpf: Support getsockopt XDP_OPTIONS
  xsk: Change the default frame size to 4096 and allow controlling it

Michal Rostecki (1):
  libbpf: Return btf_fd for load_sk_storage_btf

Stanislav Fomichev (4):
  bpf: sync bpf.h to tools/
  libbpf: support sockopt hooks
  bpf/tools: sync bpf.h
  bpf: sync bpf.h to tools/

Vincent Bernat (1):
  bonding: add an option to specify a delay between peer notifications

 include/uapi/linux/bpf.h     |   38 +-
 include/uapi/linux/if_link.h |    1 +
 include/uapi/linux/if_xdp.h  |    8 +
 src/README.rst               |    3 +-
 src/bpf.c                    |    7 +-
 src/bpf_prog_linfo.c         |    5 +-
 src/btf.c                    |    3 -
 src/btf.h                    |    1 +
 src/btf_dump.c               |    3 -
 src/libbpf.c                 | 1670 ++++++++++++++++++++++++++++------
 src/libbpf.h                 |  132 ++-
 src/libbpf.map               |   12 +-
 src/libbpf_internal.h        |   11 +-
 src/libbpf_probes.c          |   14 +-
 src/str_error.c              |    2 +-
 src/xsk.c                    |   15 +-
 src/xsk.h                    |    2 +-
 17 files changed, 1601 insertions(+), 326 deletions(-)

--
2.17.1
2019-07-08 12:52:46 -07:00
Stanislav Fomichev
5e4da17d43 bpf: sync bpf.h to tools/
Sync user_ip6 & msg_src_ip6 comments.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
2d8ab5cf2c libbpf: add perf_buffer_ prefix to README
perf_buffer "object" is part of libbpf API now, add it to the list of
libbpf function prefixes.

Suggested-by: Daniel Borkman <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
fe1ce6bd74 libbpf: auto-set PERF_EVENT_ARRAY size to number of CPUs
For BPF_MAP_TYPE_PERF_EVENT_ARRAY typically correct size is number of
possible CPUs. This is impossible to specify at compilation time. This
change adds automatic setting of PERF_EVENT_ARRAY size to number of
system CPUs, unless non-zero size is specified explicitly. This allows
to adjust size for advanced specific cases, while providing convenient
and logical defaults.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
9007494e6c libbpf: add perf buffer API
BPF_MAP_TYPE_PERF_EVENT_ARRAY map is often used to send data from BPF program
to user space for additional processing. libbpf already has very low-level API
to read single CPU perf buffer, bpf_perf_event_read_simple(), but it's hard to
use and requires a lot of code to set everything up. This patch adds
perf_buffer abstraction on top of it, abstracting setting up and polling
per-CPU logic into simple and convenient API, similar to what BCC provides.

perf_buffer__new() sets up per-CPU ring buffers and updates corresponding BPF
map entries. It accepts two user-provided callbacks: one for handling raw
samples and one for get notifications of lost samples due to buffer overflow.

perf_buffer__new_raw() is similar, but provides more control over how
perf events are set up (by accepting user-provided perf_event_attr), how
they are handled (perf_event_header pointer is passed directly to
user-provided callback), and on which CPUs ring buffers are created
(it's possible to provide a list of CPUs and corresponding map keys to
update). This API allows advanced users fuller control.

perf_buffer__poll() is used to fetch ring buffer data across all CPUs,
utilizing epoll instance.

perf_buffer__free() does corresponding clean up and unsets FDs from BPF map.

All APIs are not thread-safe. User should ensure proper locking/coordination if
used in multi-threaded set up.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
8b82b9c82b libbpf: capture value in BTF type info for BTF-defined map defs
Change BTF-defined map definitions to capture compile-time integer
values as part of BTF type definition, to avoid split of key/value type
information and actual type/size/flags initialization for maps.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
9a361d2fdd libbpf: add raw tracepoint attach API
Add a wrapper utilizing bpf_link "infrastructure" to allow attaching BPF
programs to raw tracepoints.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
01272d3040 libbpf: add tracepoint attach API
Allow attaching BPF programs to kernel tracepoint BPF hooks specified by
category and name.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
0a216f37f8 libbpf: add kprobe/uprobe attach API
Add ability to attach to kernel and user probes and retprobes.
Implementation depends on perf event support for kprobes/uprobes.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
04f987a89a libbpf: add ability to attach/detach BPF program to perf event
bpf_program__attach_perf_event allows to attach BPF program to existing
perf event hook, providing most generic and most low-level way to attach BPF
programs. It returns struct bpf_link, which should be passed to
bpf_link__destroy to detach and free resources, associated with a link.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
86171433b7 libbpf: introduce concept of bpf_link
bpf_link is an abstraction of an association of a BPF program and one of
many possible BPF attachment points (hooks). This allows to have uniform
interface for detaching BPF programs regardless of the nature of link
and how it was created. Details of creation and setting up of a specific
bpf_link is handled by corresponding attachment methods
(bpf_program__attach_xxx) added in subsequent commits. Once successfully
created, bpf_link has to be eventually destroyed with
bpf_link__destroy(), at which point BPF program is disassociated from
a hook and all the relevant resources are freed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
3e8f8914cb libbpf: make libbpf_strerror_r agnostic to sign of error
It's often inconvenient to switch sign of error when passing it into
libbpf_strerror_r. It's better for it to handle that automatically.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Stanislav Fomichev
960ec9ace6 bpf/tools: sync bpf.h
Sync new bpf_tcp_sock fields and new BPF_PROG_TYPE_SOCK_OPS RTT callback.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Priyaranjan Jha <priyarjha@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Leo Yan
0cc3d9d332 bpf, libbpf, smatch: Fix potential NULL pointer dereference
Based on the following report from Smatch, fix the potential NULL
pointer dereference check:

  tools/lib/bpf/libbpf.c:3493
  bpf_prog_load_xattr() warn: variable dereferenced before check 'attr'
  (see line 3483)

  3479 int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
  3480                         struct bpf_object **pobj, int *prog_fd)
  3481 {
  3482         struct bpf_object_open_attr open_attr = {
  3483                 .file           = attr->file,
  3484                 .prog_type      = attr->prog_type,
                                         ^^^^^^
  3485         };

At the head of function, it directly access 'attr' without checking
if it's NULL pointer. This patch moves the values assignment after
validating 'attr' and 'attr->file'.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
50a63f31b6 libbpf: fix GCC8 warning for strncpy
GCC8 started emitting warning about using strncpy with number of bytes
exactly equal destination size, which is generally unsafe, as can lead
to non-zero terminated string being copied. Use IFNAMSIZ - 1 as number
of bytes to ensure name is always zero-terminated.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Stanislav Fomichev
32a605a9a6 libbpf: support sockopt hooks
Make libbpf aware of new sockopt hooks so it can derive prog type
and hook point from the section names.

Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-08 12:52:46 -07:00
Stanislav Fomichev
5efb454851 bpf: sync bpf.h to tools/
Export new prog type and hook points to the libbpf.

Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-08 12:52:46 -07:00
Maxim Mikityanskiy
e0ee1593fd xsk: Change the default frame size to 4096 and allow controlling it
The typical XDP memory scheme is one packet per page. Change the AF_XDP
frame size in libbpf to 4096, which is the page size on x86, to allow
libbpf to be used with the drivers with the packet-per-page scheme.

Add a command line option -f to xdpsock to allow to specify a custom
frame size.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Maxim Mikityanskiy
421ecf02c8 libbpf: Support getsockopt XDP_OPTIONS
Query XDP_OPTIONS in libbpf to determine if the zero-copy mode is active
or not.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Maxim Mikityanskiy
38b91c640f xsk: Add getsockopt XDP_OPTIONS
Make it possible for the application to determine whether the AF_XDP
socket is running in zero-copy mode. To achieve this, add a new
getsockopt option XDP_OPTIONS that returns flags. The only flag
supported for now is the zero-copy mode indicator.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Ivan Khoronzhuk
f925686015 libbpf: fix max() type mismatch for 32bit
It fixes build error for 32bit caused by type mismatch
size_t/unsigned long.

Fixes: bf82927125dd ("libbpf: refactor map initialization")
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Colin Ian King
be0f832d40 libbpf: fix spelling mistake "conflictling" -> "conflicting"
There are several spelling mistakes in pr_warning messages. Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-08 12:52:46 -07:00
Vincent Bernat
f29b6fd1da bonding: add an option to specify a delay between peer notifications
Currently, gratuitous ARP/ND packets are sent every `miimon'
milliseconds. This commit allows a user to specify a custom delay
through a new option, `peer_notif_delay'.

Like for `updelay' and `downdelay', this delay should be a multiple of
`miimon' to avoid managing an additional work queue. The configuration
logic is copied from `updelay' and `downdelay'. However, the default
value cannot be set using a module parameter: Netlink or sysfs should
be used to configure this feature.

When setting `miimon' to 100 and `peer_notif_delay' to 500, we can
observe the 500 ms delay is respected:

    20:30:19.354693 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
    20:30:19.874892 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
    20:30:20.394919 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
    20:30:20.914963 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28

In bond_mii_monitor(), I have tried to keep the lock logic readable.
The change is due to the fact we cannot rely on a notification to
lower the value of `bond->send_peer_notif' as `NETDEV_NOTIFY_PEERS' is
only triggered once every N times, while we need to decrement the
counter each time.

iproute2 also needs to be updated to be able to specify this new
attribute through `ip link'.

Signed-off-by: Vincent Bernat <vincent@bernat.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
1ed7b6ade1 libbpf: constify getter APIs
Add const qualifiers to bpf_object/bpf_program/bpf_map arguments for
getter APIs. There is no need for them to not be const pointers.

Verified that

make -C tools/lib/bpf
make -C tools/testing/selftests/bpf
make -C tools/perf

all build without warnings.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
ec13b30349 libbpf: allow specifying map definitions using BTF
This patch adds support for a new way to define BPF maps. It relies on
BTF to describe mandatory and optional attributes of a map, as well as
captures type information of key and value naturally. This eliminates
the need for BPF_ANNOTATE_KV_PAIR hack and ensures key/value sizes are
always in sync with the key/value type.

Relying on BTF, this approach allows for both forward and backward
compatibility w.r.t. extending supported map definition features. By
default, any unrecognized attributes are treated as an error, but it's
possible relax this using MAPS_RELAX_COMPAT flag. New attributes, added
in the future will need to be optional.

The outline of the new map definition (short, BTF-defined maps) is as follows:
1. All the maps should be defined in .maps ELF section. It's possible to
   have both "legacy" map definitions in `maps` sections and BTF-defined
   maps in .maps sections. Everything will still work transparently.
2. The map declaration and initialization is done through
   a global/static variable of a struct type with few mandatory and
   extra optional fields:
   - type field is mandatory and specified type of BPF map;
   - key/value fields are mandatory and capture key/value type/size information;
   - max_entries attribute is optional; if max_entries is not specified or
     initialized, it has to be provided in runtime through libbpf API
     before loading bpf_object;
   - map_flags is optional and if not defined, will be assumed to be 0.
3. Key/value fields should be **a pointer** to a type describing
   key/value. The pointee type is assumed (and will be recorded as such
   and used for size determination) to be a type describing key/value of
   the map. This is done to save excessive amounts of space allocated in
   corresponding ELF sections for key/value of big size.
4. As some maps disallow having BTF type ID associated with key/value,
   it's possible to specify key/value size explicitly without
   associating BTF type ID with it. Use key_size and value_size fields
   to do that (see example below).

Here's an example of simple ARRAY map defintion:

struct my_value { int x, y, z; };

struct {
	int type;
	int max_entries;
	int *key;
	struct my_value *value;
} btf_map SEC(".maps") = {
	.type = BPF_MAP_TYPE_ARRAY,
	.max_entries = 16,
};

This will define BPF ARRAY map 'btf_map' with 16 elements. The key will
be of type int and thus key size will be 4 bytes. The value is struct
my_value of size 12 bytes. This map can be used from C code exactly the
same as with existing maps defined through struct bpf_map_def.

Here's an example of STACKMAP definition (which currently disallows BTF type
IDs for key/value):

struct {
	__u32 type;
	__u32 max_entries;
	__u32 map_flags;
	__u32 key_size;
	__u32 value_size;
} stackmap SEC(".maps") = {
	.type = BPF_MAP_TYPE_STACK_TRACE,
	.max_entries = 128,
	.map_flags = BPF_F_STACK_BUILD_ID,
	.key_size = sizeof(__u32),
	.value_size = PERF_MAX_STACK_DEPTH * sizeof(struct bpf_stack_build_id),
};

This approach is naturally extended to support map-in-map, by making a value
field to be another struct that describes inner map. This feature is not
implemented yet. It's also possible to incrementally add features like pinning
with full backwards and forward compatibility. Support for static
initialization of BPF_MAP_TYPE_PROG_ARRAY using pointers to BPF programs
is also on the roadmap.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
e64e62d19f libbpf: split initialization and loading of BTF
Libbpf does sanitization of BTF before loading it into kernel, if kernel
doesn't support some of newer BTF features. This removes some of the
important information from BTF (e.g., DATASEC and VAR description),
which will be used for map construction. This patch splits BTF
processing into initialization step, in which BTF is initialized from
ELF and all the original data is still preserved; and
sanitization/loading step, which ensures that BTF is safe to load into
kernel. This allows to use full BTF information to construct maps, while
still loading valid BTF into older kernels.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
115c0e02cb libbpf: identify maps by section index in addition to offset
To support maps to be defined in multiple sections, it's important to
identify map not just by offset within its section, but section index as
well. This patch adds tracking of section index.

For global data, we record section index of corresponding
.data/.bss/.rodata ELF section for uniformity, and thus don't need
a special value of offset for those maps.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
67057c6b7d libbpf: refactor map initialization
User and global data maps initialization has gotten pretty complicated
and unnecessarily convoluted. This patch splits out the logic for global
data map and user-defined map initialization. It also removes the
restriction of pre-calculating how many maps will be initialized,
instead allowing to keep adding new maps as they are discovered, which
will be used later for BTF-defined map definitions.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
bdf65f9fea libbpf: streamline ELF parsing error-handling
Simplify ELF parsing logic by exiting early, as there is no common clean
up path to execute. That makes it unnecessary to track when err was set
and when it was cleared. It also reduces nesting in some places.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
0559e41969 libbpf: extract BTF loading logic
As a preparation for adding BTF-based BPF map loading, extract .BTF and
.BTF.ext loading logic.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
abc096b71d libbpf: add common min/max macro to libbpf_internal.h
Multiple files in libbpf redefine their own definitions for min/max.
Let's define them in libbpf_internal.h and use those everywhere.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Martynas Pumputis
7586f784e6 bpf: sync BPF_FIB_LOOKUP flag changes with BPF uapi
Sync the changes to the flags made in "bpf: simplify definition of
BPF_FIB_LOOKUP related flags" with the BPF UAPI headers.

Doing in a separate commit to ease syncing of github/libbpf.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Daniel Borkmann
76ff616fcd bpf, libbpf: enable recvmsg attach types
Another trivial patch to libbpf in order to enable identifying and
attaching programs to BPF_CGROUP_UDP{4,6}_RECVMSG by section name.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-08 12:52:46 -07:00
Daniel Borkmann
04a05786c3 bpf: sync tooling uapi header
Sync BPF uapi header in order to pull in BPF_CGROUP_UDP{4,6}_RECVMSG
attach types. This is done and preferred as an extra patch in order
to ease sync of libbpf.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-08 12:52:46 -07:00
Michal Rostecki
ddba4024c0 libbpf: Return btf_fd for load_sk_storage_btf
Before this change, function load_sk_storage_btf expected that
libbpf__probe_raw_btf was returning a BTF descriptor, but in fact it was
returning an information about whether the probe was successful (0 or
1). load_sk_storage_btf was using that value as an argument of the close
function, which was resulting in closing stdout and thus terminating the
process which called that function.

That bug was visible in bpftool. `bpftool feature` subcommand was always
exiting too early (because of closed stdout) and it didn't display all
requested probes. `bpftool -j feature` or `bpftool -p feature` were not
returning a valid json object.

This change renames the libbpf__probe_raw_btf function to
libbpf__load_raw_btf, which now returns a BTF descriptor, as expected in
load_sk_storage_btf.

v2:
- Fix typo in the commit message.

v3:
- Simplify BTF descriptor handling in bpf_object__probe_btf_* functions.
- Rename libbpf__probe_raw_btf function to libbpf__load_raw_btf and
return a BTF descriptor.

v4:
- Fix typo in the commit message.

Fixes: d7c4b3980c18 ("libbpf: detect supported kernel BTF features and sanitize BTF")
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
52ec16bce8 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline commit:   5e2ac390fbd08b2a462db66cef2663e4db0d5191
Checkpoint commit: 2c6a577d719e96c89b1aaa631089cce69b4d0045

Andrii Nakryiko (1):
  libbpf: fix check for presence of associated BTF for map creation

Stanislav Fomichev (1):
  bpf/tools: sync bpf.h

 include/uapi/linux/bpf.h | 2 ++
 src/libbpf.c             | 9 +++++----
 2 files changed, 7 insertions(+), 4 deletions(-)

--
2.17.1
2019-06-14 16:53:36 -07:00
Stanislav Fomichev
a4e4dbc35a bpf/tools: sync bpf.h
Add sk to struct bpf_sock_addr and struct bpf_sock_ops.

Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-14 16:53:36 -07:00
Andrii Nakryiko
21742bc952 libbpf: fix check for presence of associated BTF for map creation
Kernel internally checks that either key or value type ID is specified,
before using btf_fd. Do the same in libbpf's map creation code for
determining when to retry map creation w/o BTF.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: fba01a0689a9 ("libbpf: use negative fd to specify missing BTF")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-14 16:53:36 -07:00
Andrii Nakryiko
39de671179 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline commit:   e672db03ab0e43e41ab6f8b2156a10d6e40f243d
Checkpoint commit: 5e2ac390fbd08b2a462db66cef2663e4db0d5191

Andrii Nakryiko (9):
  libbpf: fix detection of corrupted BPF instructions section
  libbpf: preserve errno before calling into user callback
  libbpf: simplify endianness check
  libbpf: check map name retrieved from ELF
  libbpf: fix error code returned on corrupted ELF
  libbpf: use negative fd to specify missing BTF
  libbpf: simplify two pieces of logic
  libbpf: typo and formatting fixes
  libbpf: reduce unnecessary line wrapping

Hechao Li (1):
  bpf: add a new API libbpf_num_possible_cpus()

Jonathan Lemon (2):
  bpf/tools: sync bpf.h
  libbpf: remove qidconf and better support external bpf programs.

Quentin Monnet (1):
  libbpf: prevent overwriting of log_level in bpf_object__load_progs()

 include/uapi/linux/bpf.h |   4 +
 src/libbpf.c             | 207 ++++++++++++++++++++++-----------------
 src/libbpf.h             |  16 +++
 src/libbpf.map           |   1 +
 src/xsk.c                | 103 ++++++-------------
 5 files changed, 167 insertions(+), 164 deletions(-)

--
2.17.1
2019-06-11 10:18:48 -07:00
Hechao Li
49967c1f5a bpf: add a new API libbpf_num_possible_cpus()
Adding a new API libbpf_num_possible_cpus() that helps user with
per-CPU map operations.

Signed-off-by: Hechao Li <hechaol@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Jonathan Lemon
4f260bccf5 libbpf: remove qidconf and better support external bpf programs.
Use the recent change to XSKMAP bpf_map_lookup_elem() to test if
there is a xsk present in the map instead of duplicating the work
with qidconf.

Fix things so callers using XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD
bypass any internal bpf maps, so xsk_socket__{create|delete} works
properly.

Clean up error handling path.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-11 10:18:48 -07:00
Jonathan Lemon
80aeaa33e7 bpf/tools: sync bpf.h
Sync uapi/linux/bpf.h

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
fcbc2604c8 libbpf: reduce unnecessary line wrapping
There are a bunch of lines of code or comments that are unnecessary
wrapped into multi-lines. Fix that without violating any code
guidelines.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
896c231d8c libbpf: typo and formatting fixes
A bunch of typo and formatting fixes.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
e13049a667 libbpf: simplify two pieces of logic
Extra check for type is unnecessary in first case.

Extra zeroing is unnecessary, as snprintf guarantees that it will
zero-terminate string.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
8e01ffc179 libbpf: use negative fd to specify missing BTF
0 is a valid FD, so it's better to initialize it to -1, as is done in
other places. Also, technically, BTF type ID 0 is valid (it's a VOID
type), so it's more reliable to check btf_fd, instead of
btf_key_type_id, to determine if there is any BTF associated with a map.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
c7b8b2e3a5 libbpf: fix error code returned on corrupted ELF
All of libbpf errors are negative, except this one. Fix it.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
e977af8516 libbpf: check map name retrieved from ELF
Validate there was no error retrieving symbol name corresponding to
a BPF map.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
dd55058989 libbpf: simplify endianness check
Rewrite endianness check to use "more canonical" way, using
compiler-defined macros, similar to few other places in libbpf. It also
is more obvious and shorter.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
64852b2fb9 libbpf: preserve errno before calling into user callback
pr_warning ultimately may call into user-provided callback function,
which can clobber errno value, so we need to save it before that.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
1aedc35d5d libbpf: fix detection of corrupted BPF instructions section
Ensure that size of a section w/ BPF instruction is exactly a multiple
of BPF instruction size.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Quentin Monnet
d5013de6a5 libbpf: prevent overwriting of log_level in bpf_object__load_progs()
There are two functions in libbpf that support passing a log_level
parameter for the verifier for loading programs:
bpf_object__load_xattr() and bpf_prog_load_xattr(). Both accept an
attribute object containing the log_level, and apply it to the programs
to load.

It turns out that to effectively load the programs, the latter function
eventually relies on the former. This was not taken into account when
adding support for log_level in bpf_object__load_xattr(), and the
log_level passed to bpf_prog_load_xattr() later gets overwritten with a
zero value, thus disabling verifier logs for the program in all cases:

bpf_prog_load_xattr()             // prog->log_level = attr1->log_level;
-> bpf_object__load()             // attr2->log_level = 0;
   -> bpf_object__load_xattr()    // <pass prog and attr2>
      -> bpf_object__load_progs() // prog->log_level = attr2->log_level;

Fix this by OR-ing the log_level in bpf_object__load_progs(), instead of
overwriting it.

v2: Fix commit log description (confusion on function names in v1).

Fixes: 60276f984998 ("libbpf: add bpf_object__load_xattr() API function to pass log_level")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
0e37e0d03a build: add hashmap and btf_dump to OBJS list
Fix Makefile to include expected object files.
Fixes https://github.com/libbpf/libbpf/issues/52.

Reported-by: Robert McCabe <robert.mccabe@rockwellcollins.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-05-30 15:07:51 -07:00
Andrii Nakryiko
75db50f4a0 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline commit:   f49aa1de98363b6c5fba4637678d6b0ba3d18065
Checkpoint commit: e672db03ab0e43e41ab6f8b2156a10d6e40f243d

Andrii Nakryiko (5):
  libbpf: ensure libbpf.h is included along libbpf_internal.h
  libbpf: add btf__parse_elf API to load .BTF and .BTF.ext
  libbpf: add resizable non-thread safe internal hashmap
  libbpf: switch btf_dedup() to hashmap for dedup table
  libbpf: add btf_dump API for BTF-to-C conversion

Hariprasad Kelam (1):
  libbpf: fix warning that PTR_ERR_OR_ZERO can be used

Jiong Wang (2):
  tools: bpf: sync uapi header bpf.h
  libbpf: add "prog_flags" to
    bpf_program/bpf_prog_load_attr/bpf_load_program_attr

Quentin Monnet (1):
  libbpf: add bpf_object__load_xattr() API function to pass log_level

Yonghong Song (1):
  tools/bpf: sync bpf uapi header bpf.h to tools directory

 include/uapi/linux/bpf.h |   35 +-
 src/bpf.c                |    1 +
 src/bpf.h                |    1 +
 src/btf.c                |  329 ++++++----
 src/btf.h                |   19 +
 src/btf_dump.c           | 1336 ++++++++++++++++++++++++++++++++++++++
 src/hashmap.c            |  229 +++++++
 src/hashmap.h            |  173 +++++
 src/libbpf.c             |   27 +-
 src/libbpf.h             |    7 +
 src/libbpf.map           |    9 +
 src/libbpf_internal.h    |    2 +
 12 files changed, 2045 insertions(+), 123 deletions(-)
 create mode 100644 src/btf_dump.c
 create mode 100644 src/hashmap.c
 create mode 100644 src/hashmap.h

--
2.17.1
2019-05-29 10:27:27 -07:00
Quentin Monnet
d714245dd9 libbpf: add bpf_object__load_xattr() API function to pass log_level
libbpf was recently made aware of the log_level attribute for programs,
used to specify the level of information expected to be dumped by the
verifier. Function bpf_prog_load_xattr() got support for this log_level
parameter.

But some applications using libbpf rely on another function to load
programs, bpf_object__load(), which does accept any parameter for log
level. Create an API function based on bpf_object__load(), but accepting
an "attr" object as a parameter. Then add a log_level field to that
object, so that applications calling the new bpf_object__load_xattr()
can pick the desired log level.

v3:
- Rewrite commit log.

v2:
- We are in a new cycle, bump libbpf extraversion number.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-29 10:27:27 -07:00
Hariprasad Kelam
1d0ddcdbda libbpf: fix warning that PTR_ERR_OR_ZERO can be used
Fix below warning reported by coccicheck:

/tools/lib/bpf/libbpf.c:3461:1-3: WARNING: PTR_ERR_OR_ZERO can be used

Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-29 10:27:27 -07:00
Jiong Wang
edf3123942 libbpf: add "prog_flags" to bpf_program/bpf_prog_load_attr/bpf_load_program_attr
libbpf doesn't allow passing "prog_flags" during bpf program load in a
couple of load related APIs, "bpf_load_program_xattr", "load_program" and
"bpf_prog_load_xattr".

It makes sense to allow passing "prog_flags" which is useful for
customizing program loading.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Jiong Wang
1f8e2f7208 tools: bpf: sync uapi header bpf.h
Sync new bpf prog load flag "BPF_F_TEST_RND_HI32" to tools/.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Yonghong Song
ca76b123f1 tools/bpf: sync bpf uapi header bpf.h to tools directory
The bpf uapi header include/uapi/linux/bpf.h is sync'ed
to tools/include/uapi/linux/bpf.h.

Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-29 10:27:27 -07:00
Andrii Nakryiko
4fe8a54efe libbpf: add btf_dump API for BTF-to-C conversion
BTF contains enough type information to allow generating valid
compilable C header w/ correct layout of structs/unions and all the
typedef/enum definitions. This patch adds a new "object" - btf_dump to
facilitate dumping BTF as valid C. btf_dump__dump_type() is the main API
which takes care of dumping out (through user-provided printf-like
callback function) C definitions for given type ID and it's required
dependencies. This allows for not just dumping out entirety of BTF types,
but also selective filtering based on user-provided criterias w/ minimal
set of dependent types.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Andrii Nakryiko
5f3f74892a libbpf: switch btf_dedup() to hashmap for dedup table
Utilize libbpf's hashmap as a multimap fof dedup_table implementation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Andrii Nakryiko
553db8ba73 libbpf: add resizable non-thread safe internal hashmap
There is a need for fast point lookups inside libbpf for multiple use
cases (e.g., name resolution for BTF-to-C conversion, by-name lookups in
BTF for upcoming BPF CO-RE relocation support, etc). This patch
implements simple resizable non-thread safe hashmap using single linked
list chains.

Four different insert strategies are supported:
 - HASHMAP_ADD - only add key/value if key doesn't exist yet;
 - HASHMAP_SET - add key/value pair if key doesn't exist yet; otherwise,
   update value;
 - HASHMAP_UPDATE - update value, if key already exists; otherwise, do
   nothing and return -ENOENT;
 - HASHMAP_APPEND - always add key/value pair, even if key already exists.
   This turns hashmap into a multimap by allowing multiple values to be
   associated with the same key. Most useful read API for such hashmap is
   hashmap__for_each_key_entry() iteration. If hashmap__find() is still
   used, it will return last inserted key/value entry (first in a bucket
   chain).

For HASHMAP_SET and HASHMAP_UPDATE, old key/value pair is returned, so
that calling code can handle proper memory management, if necessary.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Andrii Nakryiko
939da1eb3f libbpf: add btf__parse_elf API to load .BTF and .BTF.ext
Loading BTF and BTF.ext from ELF file is a common need. Instead of
requiring every user to re-implement it, let's provide this API from
libbpf itself. It's mostly copy/paste from `bpftool btf dump`
implementation, which will be switched to libbpf's version in next
patch. btf__parse_elf allows to load BTF and optionally BTF.ext.
This is also useful for tests that need to load/work with BTF, loaded
from test ELF files.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Andrii Nakryiko
d557b32f71 libbpf: ensure libbpf.h is included along libbpf_internal.h
libbpf_internal.h expects a bunch of stuff defined in libbpf.h to be
defined. This patch makes sure that libbpf.h is always included.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Andrii Nakryiko
e60460f4e5 linux/err.h: add PTR_ERR_OR_ZERO
Add missing PTR_ERR_OR_ZERO implementation. Also add a bunch of
generated files to .gitignore.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-05-29 10:19:20 -07:00
61 changed files with 18918 additions and 1905 deletions

14
.lgtm.yml Normal file
View File

@@ -0,0 +1,14 @@
# vi: set ts=2 sw=2:
extraction:
cpp:
prepare:
packages:
- libelf-dev
- pkg-config
after_prepare:
# As the buildsystem detection by LGTM is performed _only_ during the
# 'configure' phase, we need to trick LGTM we use a supported build
# system (configure, meson, cmake, etc.). This way LGTM correctly detects
# that our sources are in the src/ subfolder.
- touch src/configure
- chmod +x src/configure

View File

@@ -1,13 +1,64 @@
sudo: required
dist: xenial
language: bash
dist: bionic
services:
- docker
env:
global:
- AUTHOR_EMAIL="$(git log -1 $TRAVIS_COMMIT --pretty=\"%aE\")"
- PROJECT_NAME='libbpf'
- AUTHOR_EMAIL="$(git log -1 --pretty=\"%aE\")"
- CI_MANAGERS="$TRAVIS_BUILD_DIR/travis-ci/managers"
- VMTEST_ROOT="$TRAVIS_BUILD_DIR/travis-ci/vmtest"
- REPO_ROOT="$TRAVIS_BUILD_DIR"
- GIT_FETCH_DEPTH=64
- VMTEST_SETUPCMD="PROJECT_NAME=${PROJECT_NAME} ./${PROJECT_NAME}/travis-ci/vmtest/run_selftests.sh"
jobs:
# Setup command override.
# 5.5.0-rc6 is built from bpf-next; TODO(hex@): remove when pahole v1.16 is available
- KERNEL=5.5.0-rc6
- KERNEL=5.5.0
- KERNEL=LATEST
addons:
apt:
packages:
- qemu-kvm
- zstd
- binutils-dev
- elfutils
- libcap-dev
- libelf-dev
install: sudo adduser "${USER}" kvm
before_script:
- wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
- echo "deb http://apt.llvm.org/bionic/ llvm-toolchain-bionic-10 main" | sudo tee -a /etc/apt/sources.list
- echo "deb http://archive.ubuntu.com/ubuntu eoan main restricted universe multiverse" | sudo tee -a /etc/apt/sources.list
- sudo apt-get -qq update
- sudo apt-get -y install dwarves=1.15-1
- sudo apt-get -qq -y install clang-10 lld-10 llvm-10
- if [[ "${KERNEL}" = 'LATEST' ]]; then ${VMTEST_ROOT}/build_latest_kernel.sh travis-ci/vmtest/bpf-next; fi
- ${VMTEST_ROOT}/prepare_selftests.sh travis-ci/vmtest/bpf-next
# Escape whitespace characters.
- setup_cmd=$(sed 's/\([[:space:]]\)/\\\1/g' <<< "${VMTEST_SETUPCMD}")
- if [[ "${KERNEL}" = 'LATEST' ]]; then
sudo -E sudo -E -u "${USER}" "${VMTEST_ROOT}/run.sh" -b travis-ci/vmtest/bpf-next -o -d ~ -s "${setup_cmd}" ~/root.img;
else
sudo -E sudo -E -u "${USER}" "${VMTEST_ROOT}/run.sh" -k "${KERNEL}*" -o -d ~ -s "${setup_cmd}" ~/root.img;
fi; exitstatus=$?
- test $exitstatus -le 1
script:
- test $exitstatus -eq 0
stages:
# Run Coverity periodically instead of for each PR for following reasons:
# 1) Coverity jobs are heavily rate-limited
# 2) Due to security restrictions of encrypted environment variables
# in Travis CI, pull requests made from forks can't access encrypted
# env variables, making Coverity unusable
# See: https://docs.travis-ci.com/user/pull-requests#pull-requests-and-security-restrictions
- name: Coverity
if: type = cron
jobs:
include:
@@ -22,10 +73,10 @@ jobs:
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
# Override before_script: so VMTEST before_install commands are not executed.
before_script: true
script:
- set -e
- $CI_MANAGERS/debian.sh RUN
- set +e
- $CI_MANAGERS/debian.sh RUN || travis_terminate
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
@@ -39,10 +90,9 @@ jobs:
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
before_script: true
script:
- set -e
- $CI_MANAGERS/debian.sh RUN_ASAN
- set +e
- $CI_MANAGERS/debian.sh RUN_ASAN || travis_terminate
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
@@ -56,10 +106,9 @@ jobs:
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
before_script: true
script:
- set -e
- $CI_MANAGERS/debian.sh RUN_CLANG
- set +e
- $CI_MANAGERS/debian.sh RUN_CLANG || travis_terminate
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
@@ -73,10 +122,9 @@ jobs:
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
before_script: true
script:
- set -e
- $CI_MANAGERS/debian.sh RUN_CLANG_ASAN
- set +e
- $CI_MANAGERS/debian.sh RUN_CLANG_ASAN || travis_terminate
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
@@ -90,10 +138,9 @@ jobs:
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
before_script: true
script:
- set -e
- $CI_MANAGERS/debian.sh RUN_GCC8
- set +e
- $CI_MANAGERS/debian.sh RUN_GCC8 || travis_terminate
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
@@ -107,16 +154,58 @@ jobs:
- docker --version
install:
- $CI_MANAGERS/debian.sh SETUP
before_script: true
script:
- set -e
- $CI_MANAGERS/debian.sh RUN_GCC8_ASAN
- set +e
- $CI_MANAGERS/debian.sh RUN_GCC8_ASAN || travis_terminate
after_script:
- $CI_MANAGERS/debian.sh CLEANUP
- name: Ubuntu Xenial
- name: Ubuntu Bionic
language: bash
before_script: true
script:
- set -e
- sudo $CI_MANAGERS/xenial.sh
- set +e
- sudo $CI_MANAGERS/ubuntu.sh || travis_terminate
- name: Ubuntu Bionic (arm)
arch: arm64
language: bash
before_script: true
script:
- sudo $CI_MANAGERS/ubuntu.sh || travis_terminate
- name: Ubuntu Bionic (s390x)
arch: s390x
language: bash
before_script: true
script:
- sudo $CI_MANAGERS/ubuntu.sh || travis_terminate
- name: Ubuntu Bionic (ppc64le)
arch: ppc64le
language: bash
before_script: true
script:
- sudo $CI_MANAGERS/ubuntu.sh || travis_terminate
- stage: Coverity
language: bash
env:
# Coverity configuration
# COVERITY_SCAN_TOKEN=xxx
# Encrypted using `travis encrypt --repo libbpf/libbpf COVERITY_SCAN_TOKEN=xxx`
- secure: "I9OsMRHbb82IUivDp+I+w/jEQFOJgBDAqYqf1ollqCM1QhocxMcS9bwIAgfPhdXi2hohV7sRrVMZstahY67FAvJLGxNopi4tAPDIAaIFxgO0yDxMhaTMx5xDfMwlIm2FOP/9gB9BQsd6M7CmoQZgXYwBIv7xd1ooxoQrh2rOK1YrRl7UQu3+c3zPTjDfIYZzR3bFttMqZ9/c4U0v8Ry5IFXrel3hCshndHA1TtttJrUSrILlZcmVc1ch7JIy6zCbCU/2lGv0B/7rWXfF8MT7O9jPtFOhJ1DEcd2zhw2n4j9YT3a8OhtnM61LA6ask632mwCOsxpFLTun7AzuR1Cb5mdPHsxhxnCHcXXARa2mJjem0QG1NhwxwJE8sbRDapojexxCvweYlEN40ofwMDSnj/qNt95XIcrk0tiIhGFx0gVNWvAdmZwx+N4mwGPMTAN0AEOFjpgI+ZdB89m+tL/CbEgE1flc8QxUxJhcp5OhH6yR0z9qYOp0nXIbHsIaCiRvt/7LqFRQfheifztWVz4mdQlCdKS9gcOQ09oKicPevKO1L0Ue3cb7Ug7jOpMs+cdh3XokJtUeYEr1NijMHT9+CTAhhO5RToWXIZRon719z3fwoUBNDREATwVFMlVxqSO/pbYgaKminigYbl785S89YYaZ6E5UvaKRHM6KHKMDszs="
- COVERITY_SCAN_PROJECT_NAME="libbpf"
- COVERITY_SCAN_NOTIFICATION_EMAIL="${AUTHOR_EMAIL}"
- COVERITY_SCAN_BRANCH_PATTERN="$TRAVIS_BRANCH"
# Note: `make -C src/` as a BUILD_COMMAND will not work here
- COVERITY_SCAN_BUILD_COMMAND_PREPEND="cd src/"
- COVERITY_SCAN_BUILD_COMMAND="make"
install:
- sudo echo 'deb-src http://archive.ubuntu.com/ubuntu/ bionic main restricted universe multiverse' >>/etc/apt/sources.list
- sudo apt-get update
- sudo apt-get -y build-dep libelf-dev
- sudo apt-get install -y libelf-dev pkg-config
# Override before_script: so VMTEST before_script commands are not executed.
before_script: true
script:
- scripts/coverity.sh || travis_terminate

1
BPF-CHECKPOINT-COMMIT Normal file
View File

@@ -0,0 +1 @@
08dc225d8868d5094ada62f471ebdfcce9dbc298

View File

@@ -1 +1 @@
f49aa1de98363b6c5fba4637678d6b0ba3d18065
35b9211c0a2427e8f39e534f442f43804fc8d5ca

1
LICENSE Normal file
View File

@@ -0,0 +1 @@
LGPL-2.1 OR BSD-2-Clause

32
LICENSE.BSD-2-Clause Normal file
View File

@@ -0,0 +1,32 @@
Valid-License-Identifier: BSD-2-Clause
SPDX-URL: https://spdx.org/licenses/BSD-2-Clause.html
Usage-Guide:
To use the BSD 2-clause "Simplified" License put the following SPDX
tag/value pair into a comment according to the placement guidelines in
the licensing rules documentation:
SPDX-License-Identifier: BSD-2-Clause
License-Text:
Copyright (c) <year> <owner> . All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

503
LICENSE.LPGL-2.1 Normal file
View File

@@ -0,0 +1,503 @@
Valid-License-Identifier: LGPL-2.1
Valid-License-Identifier: LGPL-2.1+
SPDX-URL: https://spdx.org/licenses/LGPL-2.1.html
Usage-Guide:
To use this license in source code, put one of the following SPDX
tag/value pairs into a comment according to the placement
guidelines in the licensing rules documentation.
For 'GNU Lesser General Public License (LGPL) version 2.1 only' use:
SPDX-License-Identifier: LGPL-2.1
For 'GNU Lesser General Public License (LGPL) version 2.1 or any later
version' use:
SPDX-License-Identifier: LGPL-2.1+
License-Text:
GNU LESSER GENERAL PUBLIC LICENSE
Version 2.1, February 1999
Copyright (C) 1991, 1999 Free Software Foundation, Inc.
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
[This is the first released version of the Lesser GPL. It also counts as
the successor of the GNU Library Public License, version 2, hence the
version number 2.1.]
Preamble
The licenses for most software are designed to take away your freedom to
share and change it. By contrast, the GNU General Public Licenses are
intended to guarantee your freedom to share and change free software--to
make sure the software is free for all its users.
This license, the Lesser General Public License, applies to some specially
designated software packages--typically libraries--of the Free Software
Foundation and other authors who decide to use it. You can use it too, but
we suggest you first think carefully about whether this license or the
ordinary General Public License is the better strategy to use in any
particular case, based on the explanations below.
When we speak of free software, we are referring to freedom of use, not
price. Our General Public Licenses are designed to make sure that you have
the freedom to distribute copies of free software (and charge for this
service if you wish); that you receive source code or can get it if you
want it; that you can change the software and use pieces of it in new free
programs; and that you are informed that you can do these things.
To protect your rights, we need to make restrictions that forbid
distributors to deny you these rights or to ask you to surrender these
rights. These restrictions translate to certain responsibilities for you if
you distribute copies of the library or if you modify it.
For example, if you distribute copies of the library, whether gratis or for
a fee, you must give the recipients all the rights that we gave you. You
must make sure that they, too, receive or can get the source code. If you
link other code with the library, you must provide complete object files to
the recipients, so that they can relink them with the library after making
changes to the library and recompiling it. And you must show them these
terms so they know their rights.
We protect your rights with a two-step method: (1) we copyright the
library, and (2) we offer you this license, which gives you legal
permission to copy, distribute and/or modify the library.
To protect each distributor, we want to make it very clear that there is no
warranty for the free library. Also, if the library is modified by someone
else and passed on, the recipients should know that what they have is not
the original version, so that the original author's reputation will not be
affected by problems that might be introduced by others.
Finally, software patents pose a constant threat to the existence of any
free program. We wish to make sure that a company cannot effectively
restrict the users of a free program by obtaining a restrictive license
from a patent holder. Therefore, we insist that any patent license obtained
for a version of the library must be consistent with the full freedom of
use specified in this license.
Most GNU software, including some libraries, is covered by the ordinary GNU
General Public License. This license, the GNU Lesser General Public
License, applies to certain designated libraries, and is quite different
from the ordinary General Public License. We use this license for certain
libraries in order to permit linking those libraries into non-free
programs.
When a program is linked with a library, whether statically or using a
shared library, the combination of the two is legally speaking a combined
work, a derivative of the original library. The ordinary General Public
License therefore permits such linking only if the entire combination fits
its criteria of freedom. The Lesser General Public License permits more lax
criteria for linking other code with the library.
We call this license the "Lesser" General Public License because it does
Less to protect the user's freedom than the ordinary General Public
License. It also provides other free software developers Less of an
advantage over competing non-free programs. These disadvantages are the
reason we use the ordinary General Public License for many
libraries. However, the Lesser license provides advantages in certain
special circumstances.
For example, on rare occasions, there may be a special need to encourage
the widest possible use of a certain library, so that it becomes a de-facto
standard. To achieve this, non-free programs must be allowed to use the
library. A more frequent case is that a free library does the same job as
widely used non-free libraries. In this case, there is little to gain by
limiting the free library to free software only, so we use the Lesser
General Public License.
In other cases, permission to use a particular library in non-free programs
enables a greater number of people to use a large body of free
software. For example, permission to use the GNU C Library in non-free
programs enables many more people to use the whole GNU operating system, as
well as its variant, the GNU/Linux operating system.
Although the Lesser General Public License is Less protective of the users'
freedom, it does ensure that the user of a program that is linked with the
Library has the freedom and the wherewithal to run that program using a
modified version of the Library.
The precise terms and conditions for copying, distribution and modification
follow. Pay close attention to the difference between a "work based on the
library" and a "work that uses the library". The former contains code
derived from the library, whereas the latter must be combined with the
library in order to run.
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License Agreement applies to any software library or other program
which contains a notice placed by the copyright holder or other
authorized party saying it may be distributed under the terms of this
Lesser General Public License (also called "this License"). Each
licensee is addressed as "you".
A "library" means a collection of software functions and/or data
prepared so as to be conveniently linked with application programs
(which use some of those functions and data) to form executables.
The "Library", below, refers to any such software library or work which
has been distributed under these terms. A "work based on the Library"
means either the Library or any derivative work under copyright law:
that is to say, a work containing the Library or a portion of it, either
verbatim or with modifications and/or translated straightforwardly into
another language. (Hereinafter, translation is included without
limitation in the term "modification".)
"Source code" for a work means the preferred form of the work for making
modifications to it. For a library, complete source code means all the
source code for all modules it contains, plus any associated interface
definition files, plus the scripts used to control compilation and
installation of the library.
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of running
a program using the Library is not restricted, and output from such a
program is covered only if its contents constitute a work based on the
Library (independent of the use of the Library in a tool for writing
it). Whether that is true depends on what the Library does and what the
program that uses the Library does.
1. You may copy and distribute verbatim copies of the Library's complete
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the notices
that refer to this License and to the absence of any warranty; and
distribute a copy of this License along with the Library.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Library or any portion of it,
thus forming a work based on the Library, and copy and distribute such
modifications or work under the terms of Section 1 above, provided that
you also meet all of these conditions:
a) The modified work must itself be a software library.
b) You must cause the files modified to carry prominent notices stating
that you changed the files and the date of any change.
c) You must cause the whole of the work to be licensed at no charge to
all third parties under the terms of this License.
d) If a facility in the modified Library refers to a function or a table
of data to be supplied by an application program that uses the
facility, other than as an argument passed when the facility is
invoked, then you must make a good faith effort to ensure that, in
the event an application does not supply such function or table, the
facility still operates, and performs whatever part of its purpose
remains meaningful.
(For example, a function in a library to compute square roots has a
purpose that is entirely well-defined independent of the
application. Therefore, Subsection 2d requires that any
application-supplied function or table used by this function must be
optional: if the application does not supply it, the square root
function must still compute square roots.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Library, and
can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based on
the Library, the distribution of the whole must be on the terms of this
License, whose permissions for other licensees extend to the entire
whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Library.
In addition, mere aggregation of another work not based on the Library
with the Library (or with a work based on the Library) on a volume of a
storage or distribution medium does not bring the other work under the
scope of this License.
3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library. To do
this, you must alter all the notices that refer to this License, so that
they refer to the ordinary GNU General Public License, version 2,
instead of to this License. (If a newer version than version 2 of the
ordinary GNU General Public License has appeared, then you can specify
that version instead if you wish.) Do not make any other change in these
notices.
Once this change is made in a given copy, it is irreversible for that
copy, so the ordinary GNU General Public License applies to all
subsequent copies and derivative works made from that copy.
This option is useful when you wish to copy part of the code of the
Library into a program that is not a library.
4. You may copy and distribute the Library (or a portion or derivative of
it, under Section 2) in object code or executable form under the terms
of Sections 1 and 2 above provided that you accompany it with the
complete corresponding machine-readable source code, which must be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange.
If distribution of object code is made by offering access to copy from a
designated place, then offering equivalent access to copy the source
code from the same place satisfies the requirement to distribute the
source code, even though third parties are not compelled to copy the
source along with the object code.
5. A program that contains no derivative of any portion of the Library, but
is designed to work with the Library by being compiled or linked with
it, is called a "work that uses the Library". Such a work, in isolation,
is not a derivative work of the Library, and therefore falls outside the
scope of this License.
However, linking a "work that uses the Library" with the Library creates
an executable that is a derivative of the Library (because it contains
portions of the Library), rather than a "work that uses the
library". The executable is therefore covered by this License. Section 6
states terms for distribution of such executables.
When a "work that uses the Library" uses material from a header file
that is part of the Library, the object code for the work may be a
derivative work of the Library even though the source code is
not. Whether this is true is especially significant if the work can be
linked without the Library, or if the work is itself a library. The
threshold for this to be true is not precisely defined by law.
If such an object file uses only numerical parameters, data structure
layouts and accessors, and small macros and small inline functions (ten
lines or less in length), then the use of the object file is
unrestricted, regardless of whether it is legally a derivative
work. (Executables containing this object code plus portions of the
Library will still fall under Section 6.)
Otherwise, if the work is a derivative of the Library, you may
distribute the object code for the work under the terms of Section
6. Any executables containing that work also fall under Section 6,
whether or not they are linked directly with the Library itself.
6. As an exception to the Sections above, you may also combine or link a
"work that uses the Library" with the Library to produce a work
containing portions of the Library, and distribute that work under terms
of your choice, provided that the terms permit modification of the work
for the customer's own use and reverse engineering for debugging such
modifications.
You must give prominent notice with each copy of the work that the
Library is used in it and that the Library and its use are covered by
this License. You must supply a copy of this License. If the work during
execution displays copyright notices, you must include the copyright
notice for the Library among them, as well as a reference directing the
user to the copy of this License. Also, you must do one of these things:
a) Accompany the work with the complete corresponding machine-readable
source code for the Library including whatever changes were used in
the work (which must be distributed under Sections 1 and 2 above);
and, if the work is an executable linked with the Library, with the
complete machine-readable "work that uses the Library", as object
code and/or source code, so that the user can modify the Library and
then relink to produce a modified executable containing the modified
Library. (It is understood that the user who changes the contents of
definitions files in the Library will not necessarily be able to
recompile the application to use the modified definitions.)
b) Use a suitable shared library mechanism for linking with the
Library. A suitable mechanism is one that (1) uses at run time a copy
of the library already present on the user's computer system, rather
than copying library functions into the executable, and (2) will
operate properly with a modified version of the library, if the user
installs one, as long as the modified version is interface-compatible
with the version that the work was made with.
c) Accompany the work with a written offer, valid for at least three
years, to give the same user the materials specified in Subsection
6a, above, for a charge no more than the cost of performing this
distribution.
d) If distribution of the work is made by offering access to copy from a
designated place, offer equivalent access to copy the above specified
materials from the same place.
e) Verify that the user has already received a copy of these materials
or that you have already sent this user a copy.
For an executable, the required form of the "work that uses the Library"
must include any data and utility programs needed for reproducing the
executable from it. However, as a special exception, the materials to be
distributed need not include anything that is normally distributed (in
either source or binary form) with the major components (compiler,
kernel, and so on) of the operating system on which the executable runs,
unless that component itself accompanies the executable.
It may happen that this requirement contradicts the license restrictions
of other proprietary libraries that do not normally accompany the
operating system. Such a contradiction means you cannot use both them
and the Library together in an executable that you distribute.
7. You may place library facilities that are a work based on the Library
side-by-side in a single library together with other library facilities
not covered by this License, and distribute such a combined library,
provided that the separate distribution of the work based on the Library
and of the other library facilities is otherwise permitted, and provided
that you do these two things:
a) Accompany the combined library with a copy of the same work based on
the Library, uncombined with any other library facilities. This must
be distributed under the terms of the Sections above.
b) Give prominent notice with the combined library of the fact that part
of it is a work based on the Library, and explaining where to find
the accompanying uncombined form of the same work.
8. You may not copy, modify, sublicense, link with, or distribute the
Library except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense, link with, or distribute the
Library is void, and will automatically terminate your rights under this
License. However, parties who have received copies, or rights, from you
under this License will not have their licenses terminated so long as
such parties remain in full compliance.
9. You are not required to accept this License, since you have not signed
it. However, nothing else grants you permission to modify or distribute
the Library or its derivative works. These actions are prohibited by law
if you do not accept this License. Therefore, by modifying or
distributing the Library (or any work based on the Library), you
indicate your acceptance of this License to do so, and all its terms and
conditions for copying, distributing or modifying the Library or works
based on it.
10. Each time you redistribute the Library (or any work based on the
Library), the recipient automatically receives a license from the
original licensor to copy, distribute, link with or modify the Library
subject to these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted
herein. You are not responsible for enforcing compliance by third
parties with this License.
11. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Library at all. For example, if a patent license
would not permit royalty-free redistribution of the Library by all
those who receive copies directly or indirectly through you, then the
only way you could satisfy both it and this License would be to refrain
entirely from distribution of the Library.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply, and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system which is implemented
by public license practices. Many people have made generous
contributions to the wide range of software distributed through that
system in reliance on consistent application of that system; it is up
to the author/donor to decide if he or she is willing to distribute
software through any other system and a licensee cannot impose that
choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
12. If the distribution and/or use of the Library is restricted in certain
countries either by patents or by copyrighted interfaces, the original
copyright holder who places the Library under this License may add an
explicit geographical distribution limitation excluding those
countries, so that distribution is permitted only in or among countries
not thus excluded. In such case, this License incorporates the
limitation as if written in the body of this License.
13. The Free Software Foundation may publish revised and/or new versions of
the Lesser General Public License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in
detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Library
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and
conditions either of that version or of any later version published by
the Free Software Foundation. If the Library does not specify a license
version number, you may choose any version ever published by the Free
Software Foundation.
14. If you wish to incorporate parts of the Library into other free
programs whose distribution conditions are incompatible with these,
write to the author to ask for permission. For software which is
copyrighted by the Free Software Foundation, write to the Free Software
Foundation; we sometimes make exceptions for this. Our decision will be
guided by the two goals of preserving the free status of all
derivatives of our free software and of promoting the sharing and reuse
of software generally.
NO WARRANTY
15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE
ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH
YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
NECESSARY SERVICING, REPAIR OR CORRECTION.
16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR
DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL
DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY
(INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED
INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF
THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR
OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Libraries
If you develop a new library, and you want it to be of the greatest
possible use to the public, we recommend making it free software that
everyone can redistribute and change. You can do so by permitting
redistribution under these terms (or, alternatively, under the terms of the
ordinary General Public License).
To apply these terms, attach the following notices to the library. It is
safest to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least the
"copyright" line and a pointer to where the full notice is found.
one line to give the library's name and an idea of what it does.
Copyright (C) year name of author
This library is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1 of the License, or (at
your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
for more details.
You should have received a copy of the GNU Lesser General Public License
along with this library; if not, write to the Free Software Foundation,
Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Also add
information on how to contact you by electronic and paper mail.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the library, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in
the library `Frob' (a library for tweaking knobs) written
by James Random Hacker.
signature of Ty Coon, 1 April 1990
Ty Coon, President of Vice
That's all there is to it!

View File

@@ -16,7 +16,10 @@ Other header files at this repo (`include/linux/*.h`) are reduced versions of
their counterpart files at bpf-next's `tools/include/linux/*.h` to make compilation
successful.
Build [![Build Status](https://travis-ci.org/libbpf/libbpf.svg?branch=master)](https://travis-ci.org/libbpf/libbpf)
Build
[![Build Status](https://travis-ci.org/libbpf/libbpf.svg?branch=master)](https://travis-ci.org/libbpf/libbpf)
[![Total alerts](https://img.shields.io/lgtm/alerts/g/libbpf/libbpf.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/libbpf/libbpf/alerts/)
[![Coverity](https://img.shields.io/coverity/scan/18195.svg)](https://scan.coverity.com/projects/libbpf)
=====
libelf is an internal dependency of libbpf and thus it is required to link
against and must be installed on the system for applications to work.
@@ -48,23 +51,32 @@ $ cd src
$ PKG_CONFIG_PATH=/build/root/lib64/pkgconfig DESTDIR=/build/root make install
```
To integrate libbpf into a project which uses Meson building system define
`[wrap-git]` file in `subprojects` folder.
To add libbpf dependency to the parent parent project, e.g. for
libbpf_static_dep:
```
libbpf_obj = subproject('libbpf', required : true)
libbpf_static_dep = libbpf_proj.get_variable('libbpf_static_dep')
```
Distributions
=====
To validate changes to meson.build
```bash
$ python3 meson.py build
$ ninja -C build/
```
Distributions packaging libbpf from this mirror:
- [Fedora](https://src.fedoraproject.org/rpms/libbpf)
- [Gentoo](https://packages.gentoo.org/packages/dev-libs/libbpf)
To install headers, libs and pkgconfig
```bash
$ cd build
$ ninja install
```
Benefits of packaging from the mirror over packaging from kernel sources:
- Consistent versioning across distributions.
- No ties to any specific kernel, transparent handling of older kernels.
Libbpf is designed to be kernel-agnostic and work across multitude of kernel
versions. It has built-in mechanisms to gracefully handle older kernels,
that are missing some of the features, by working around or gracefully
degrading functionality. Thus libbpf is not tied to a specific kernel
version and can/should be packaged and versioned independently.
- Continuous integration testing via [TravisCI](https://travis-ci.org/libbpf/libbpf).
- Static code analysis via [LGTM](https://lgtm.com/projects/g/libbpf/libbpf) and [Coverity](https://scan.coverity.com/projects/libbpf).
Package dependencies of libbpf, package names may vary across distros:
- zlib
- libelf
License
=====
This work is dual-licensed under BSD 2-clause license and GNU LGPL v2.1 license.
You can choose between one of them if you use this work.
`SPDX-License-Identifier: BSD-2-Clause OR LGPL-2.1`

View File

@@ -30,4 +30,9 @@ static inline bool IS_ERR_OR_NULL(const void *ptr)
return (!ptr) || IS_ERR_VALUE((unsigned long)ptr);
}
static inline long PTR_ERR_OR_ZERO(const void *ptr)
{
return IS_ERR(ptr) ? PTR_ERR(ptr) : 0;
}
#endif

View File

@@ -96,7 +96,7 @@
MAP_FD, 0)
#define BPF_LD_MAP_VALUE(DST, MAP_FD, VALUE_OFF) \
BPF_LD_IMM64_RAW_FULL(DST, BPF_PSEUDO_MAP_FD, 0, 0, \
BPF_LD_IMM64_RAW_FULL(DST, BPF_PSEUDO_MAP_VALUE, 0, 0, \
MAP_FD, VALUE_OFF)
#define BPF_JMP_IMM(OP, DST, IMM, OFF) \
@@ -107,5 +107,12 @@
.off = OFF, \
.imm = IMM })
#define BPF_JMP32_IMM(OP, DST, IMM, OFF) \
((struct bpf_insn) { \
.code = BPF_JMP32 | BPF_OP(OP) | BPF_K, \
.dst_reg = DST, \
.src_reg = 0, \
.off = OFF, \
.imm = IMM })
#endif

View File

@@ -106,6 +106,11 @@ enum bpf_cmd {
BPF_TASK_FD_QUERY,
BPF_MAP_LOOKUP_AND_DELETE_ELEM,
BPF_MAP_FREEZE,
BPF_BTF_GET_NEXT_ID,
BPF_MAP_LOOKUP_BATCH,
BPF_MAP_LOOKUP_AND_DELETE_BATCH,
BPF_MAP_UPDATE_BATCH,
BPF_MAP_DELETE_BATCH,
};
enum bpf_map_type {
@@ -134,6 +139,8 @@ enum bpf_map_type {
BPF_MAP_TYPE_QUEUE,
BPF_MAP_TYPE_STACK,
BPF_MAP_TYPE_SK_STORAGE,
BPF_MAP_TYPE_DEVMAP_HASH,
BPF_MAP_TYPE_STRUCT_OPS,
};
/* Note that tracing related programs such as
@@ -170,6 +177,10 @@ enum bpf_prog_type {
BPF_PROG_TYPE_FLOW_DISSECTOR,
BPF_PROG_TYPE_CGROUP_SYSCTL,
BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
BPF_PROG_TYPE_CGROUP_SOCKOPT,
BPF_PROG_TYPE_TRACING,
BPF_PROG_TYPE_STRUCT_OPS,
BPF_PROG_TYPE_EXT,
};
enum bpf_attach_type {
@@ -192,6 +203,13 @@ enum bpf_attach_type {
BPF_LIRC_MODE2,
BPF_FLOW_DISSECTOR,
BPF_CGROUP_SYSCTL,
BPF_CGROUP_UDP4_RECVMSG,
BPF_CGROUP_UDP6_RECVMSG,
BPF_CGROUP_GETSOCKOPT,
BPF_CGROUP_SETSOCKOPT,
BPF_TRACE_RAW_TP,
BPF_TRACE_FENTRY,
BPF_TRACE_FEXIT,
__MAX_BPF_ATTACH_TYPE
};
@@ -220,6 +238,11 @@ enum bpf_attach_type {
* When children program makes decision (like picking TCP CA or sock bind)
* parent program has a chance to override it.
*
* With BPF_F_ALLOW_MULTI a new program is added to the end of the list of
* programs for a cgroup. Though it's possible to replace an old program at
* any position by also specifying BPF_F_REPLACE flag and position itself in
* replace_bpf_fd attribute. Old program at this position will be released.
*
* A cgroup with MULTI or OVERRIDE flag allows any attach flags in sub-cgroups.
* A cgroup with NONE doesn't allow any programs in sub-cgroups.
* Ex1:
@@ -238,6 +261,7 @@ enum bpf_attach_type {
*/
#define BPF_F_ALLOW_OVERRIDE (1U << 0)
#define BPF_F_ALLOW_MULTI (1U << 1)
#define BPF_F_REPLACE (1U << 2)
/* If BPF_F_STRICT_ALIGNMENT is used in BPF_PROG_LOAD command, the
* verifier will perform strict alignment checking as if the kernel
@@ -260,6 +284,27 @@ enum bpf_attach_type {
*/
#define BPF_F_ANY_ALIGNMENT (1U << 1)
/* BPF_F_TEST_RND_HI32 is used in BPF_PROG_LOAD command for testing purpose.
* Verifier does sub-register def/use analysis and identifies instructions whose
* def only matters for low 32-bit, high 32-bit is never referenced later
* through implicit zero extension. Therefore verifier notifies JIT back-ends
* that it is safe to ignore clearing high 32-bit for these instructions. This
* saves some back-ends a lot of code-gen. However such optimization is not
* necessary on some arches, for example x86_64, arm64 etc, whose JIT back-ends
* hence hasn't used verifier's analysis result. But, we really want to have a
* way to be able to verify the correctness of the described optimization on
* x86_64 on which testsuites are frequently exercised.
*
* So, this flag is introduced. Once it is set, verifier will randomize high
* 32-bit for those instructions who has been identified as safe to ignore them.
* Then, if verifier is not doing correct analysis, such randomization will
* regress tests to expose bugs.
*/
#define BPF_F_TEST_RND_HI32 (1U << 2)
/* The verifier internal test flag. Behavior is undefined */
#define BPF_F_TEST_STATE_FREQ (1U << 3)
/* When BPF ldimm64's insn[0].src_reg != 0 then this can have
* two extensions:
*
@@ -313,7 +358,18 @@ enum bpf_attach_type {
#define BPF_F_RDONLY_PROG (1U << 7)
#define BPF_F_WRONLY_PROG (1U << 8)
/* flags for BPF_PROG_QUERY */
/* Clone map from listener for newly accepted socket */
#define BPF_F_CLONE (1U << 9)
/* Enable memory-mapping BPF map */
#define BPF_F_MMAPABLE (1U << 10)
/* Flags for BPF_PROG_QUERY. */
/* Query effective (directly attached + inherited from ancestor cgroups)
* programs that will be executed for events within a cgroup.
* attach_flags with this flag are returned only for directly attached programs.
*/
#define BPF_F_QUERY_EFFECTIVE (1U << 0)
enum bpf_stack_build_id_status {
@@ -353,6 +409,10 @@ union bpf_attr {
__u32 btf_fd; /* fd pointing to a BTF type data */
__u32 btf_key_type_id; /* BTF type_id of the key */
__u32 btf_value_type_id; /* BTF type_id of the value */
__u32 btf_vmlinux_value_type_id;/* BTF type_id of a kernel-
* struct stored as the
* map value
*/
};
struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */
@@ -365,6 +425,23 @@ union bpf_attr {
__u64 flags;
};
struct { /* struct used by BPF_MAP_*_BATCH commands */
__aligned_u64 in_batch; /* start batch,
* NULL to start from beginning
*/
__aligned_u64 out_batch; /* output: next start batch */
__aligned_u64 keys;
__aligned_u64 values;
__u32 count; /* input/output:
* input: # of key/value
* elements
* output: # of filled elements
*/
__u32 map_fd;
__u64 elem_flags;
__u64 flags;
} batch;
struct { /* anonymous struct used by BPF_PROG_LOAD command */
__u32 prog_type; /* one of enum bpf_prog_type */
__u32 insn_cnt;
@@ -389,6 +466,8 @@ union bpf_attr {
__u32 line_info_rec_size; /* userspace bpf_line_info size */
__aligned_u64 line_info; /* line info */
__u32 line_info_cnt; /* number of bpf_line_info records */
__u32 attach_btf_id; /* in-kernel BTF type id to attach to */
__u32 attach_prog_fd; /* 0 to attach to vmlinux */
};
struct { /* anonymous struct used by BPF_OBJ_* commands */
@@ -402,6 +481,10 @@ union bpf_attr {
__u32 attach_bpf_fd; /* eBPF program to attach */
__u32 attach_type;
__u32 attach_flags;
__u32 replace_bpf_fd; /* previously attached eBPF
* program to replace if
* BPF_F_REPLACE is used
*/
};
struct { /* anonymous struct used by BPF_PROG_TEST_RUN command */
@@ -529,10 +612,13 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_probe_read(void *dst, u32 size, const void *src)
* int bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr)
* Description
* For tracing programs, safely attempt to read *size* bytes from
* address *src* and store the data in *dst*.
* kernel space address *unsafe_ptr* and store the data in *dst*.
*
* Generally, use bpf_probe_read_user() or bpf_probe_read_kernel()
* instead.
* Return
* 0 on success, or a negative error in case of failure.
*
@@ -552,6 +638,8 @@ union bpf_attr {
* limited to five).
*
* Each time the helper is called, it appends a line to the trace.
* Lines are discarded while *\/sys/kernel/debug/tracing/trace* is
* open, use *\/sys/kernel/debug/tracing/trace_pipe* to avoid this.
* The format of the trace is customizable, and the exact output
* one will get depends on the options set in
* *\/sys/kernel/debug/tracing/trace_options* (see also the
@@ -761,7 +849,7 @@ union bpf_attr {
* A 64-bit integer containing the current GID and UID, and
* created as such: *current_gid* **<< 32 \|** *current_uid*.
*
* int bpf_get_current_comm(char *buf, u32 size_of_buf)
* int bpf_get_current_comm(void *buf, u32 size_of_buf)
* Description
* Copy the **comm** attribute of the current task into *buf* of
* *size_of_buf*. The **comm** attribute contains the name of
@@ -783,7 +871,7 @@ union bpf_attr {
* based on a user-provided identifier for all traffic coming from
* the tasks belonging to the related cgroup. See also the related
* kernel documentation, available from the Linux sources in file
* *Documentation/cgroup-v1/net_cls.txt*.
* *Documentation/admin-guide/cgroup-v1/net_cls.rst*.
*
* The Linux kernel has two versions for cgroups: there are
* cgroups v1 and cgroups v2. Both are available to users, who can
@@ -990,7 +1078,7 @@ union bpf_attr {
* The realm of the route for the packet associated to *skb*, or 0
* if none was found.
*
* int bpf_perf_event_output(struct pt_reg *ctx, struct bpf_map *map, u64 flags, void *data, u64 size)
* int bpf_perf_event_output(void *ctx, struct bpf_map *map, u64 flags, void *data, u64 size)
* Description
* Write raw *data* blob into a special BPF perf event held by
* *map* of type **BPF_MAP_TYPE_PERF_EVENT_ARRAY**. This perf
@@ -1035,7 +1123,7 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_skb_load_bytes(const struct sk_buff *skb, u32 offset, void *to, u32 len)
* int bpf_skb_load_bytes(const void *skb, u32 offset, void *to, u32 len)
* Description
* This helper was provided as an easy way to load data from a
* packet. It can be used to load *len* bytes from *offset* from
@@ -1052,7 +1140,7 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_get_stackid(struct pt_reg *ctx, struct bpf_map *map, u64 flags)
* int bpf_get_stackid(void *ctx, struct bpf_map *map, u64 flags)
* Description
* Walk a user or a kernel stack and return its id. To achieve
* this, the helper needs *ctx*, which is a pointer to the context
@@ -1121,7 +1209,7 @@ union bpf_attr {
* The checksum result, or a negative error code in case of
* failure.
*
* int bpf_skb_get_tunnel_opt(struct sk_buff *skb, u8 *opt, u32 size)
* int bpf_skb_get_tunnel_opt(struct sk_buff *skb, void *opt, u32 size)
* Description
* Retrieve tunnel options metadata for the packet associated to
* *skb*, and store the raw tunnel option data to the buffer *opt*
@@ -1139,7 +1227,7 @@ union bpf_attr {
* Return
* The size of the option data retrieved.
*
* int bpf_skb_set_tunnel_opt(struct sk_buff *skb, u8 *opt, u32 size)
* int bpf_skb_set_tunnel_opt(struct sk_buff *skb, void *opt, u32 size)
* Description
* Set tunnel options metadata for the packet associated to *skb*
* to the option data contained in the raw buffer *opt* of *size*.
@@ -1392,45 +1480,14 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr)
* int bpf_probe_read_str(void *dst, u32 size, const void *unsafe_ptr)
* Description
* Copy a NUL terminated string from an unsafe address
* *unsafe_ptr* to *dst*. The *size* should include the
* terminating NUL byte. In case the string length is smaller than
* *size*, the target is not padded with further NUL bytes. If the
* string length is larger than *size*, just *size*-1 bytes are
* copied and the last byte is set to NUL.
* Copy a NUL terminated string from an unsafe kernel address
* *unsafe_ptr* to *dst*. See bpf_probe_read_kernel_str() for
* more details.
*
* On success, the length of the copied string is returned. This
* makes this helper useful in tracing programs for reading
* strings, and more importantly to get its length at runtime. See
* the following snippet:
*
* ::
*
* SEC("kprobe/sys_open")
* void bpf_sys_open(struct pt_regs *ctx)
* {
* char buf[PATHLEN]; // PATHLEN is defined to 256
* int res = bpf_probe_read_str(buf, sizeof(buf),
* ctx->di);
*
* // Consume buf, for example push it to
* // userspace via bpf_perf_event_output(); we
* // can use res (the string length) as event
* // size, after checking its boundaries.
* }
*
* In comparison, using **bpf_probe_read()** helper here instead
* to read the string would require to estimate the length at
* compile time, and would often result in copying more memory
* than necessary.
*
* Another useful use case is when parsing individual process
* arguments or individual environment variables navigating
* *current*\ **->mm->arg_start** and *current*\
* **->mm->env_start**: using this helper and the return value,
* one can quickly iterate at the right offset of the memory area.
* Generally, use bpf_probe_read_user_str() or bpf_probe_read_kernel_str()
* instead.
* Return
* On success, the strictly positive length of the string,
* including the trailing NUL character. On error, a negative
@@ -1443,8 +1500,8 @@ union bpf_attr {
* If no cookie has been set yet, generate a new cookie. Once
* generated, the socket cookie remains stable for the life of the
* socket. This helper can be useful for monitoring per socket
* networking traffic statistics as it provides a unique socket
* identifier per namespace.
* networking traffic statistics as it provides a global socket
* identifier that can be assumed unique.
* Return
* A 8-byte long non-decreasing number on success, or 0 if the
* socket field is missing inside *skb*.
@@ -1478,7 +1535,7 @@ union bpf_attr {
* Return
* 0
*
* int bpf_setsockopt(struct bpf_sock_ops *bpf_socket, int level, int optname, char *optval, int optlen)
* int bpf_setsockopt(struct bpf_sock_ops *bpf_socket, int level, int optname, void *optval, int optlen)
* Description
* Emulate a call to **setsockopt()** on the socket associated to
* *bpf_socket*, which must be a full socket. The *level* at
@@ -1548,8 +1605,11 @@ union bpf_attr {
* but this is only implemented for native XDP (with driver
* support) as of this writing).
*
* All values for *flags* are reserved for future usage, and must
* be left at zero.
* The lower two bits of *flags* are used as the return code if
* the map lookup fails. This is so that the return value can be
* one of the XDP program return codes up to XDP_TX, as chosen by
* the caller. Any higher bits in the *flags* argument must be
* unset.
*
* When used to redirect packets to net devices, this helper
* provides a high performance increase over **bpf_redirect**\ ().
@@ -1559,7 +1619,7 @@ union bpf_attr {
* Return
* **XDP_REDIRECT** on success, or **XDP_ABORTED** on error.
*
* int bpf_sk_redirect_map(struct bpf_map *map, u32 key, u64 flags)
* int bpf_sk_redirect_map(struct sk_buff *skb, struct bpf_map *map, u32 key, u64 flags)
* Description
* Redirect the packet to the socket referenced by *map* (of type
* **BPF_MAP_TYPE_SOCKMAP**) at index *key*. Both ingress and
@@ -1679,7 +1739,7 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_getsockopt(struct bpf_sock_ops *bpf_socket, int level, int optname, char *optval, int optlen)
* int bpf_getsockopt(struct bpf_sock_ops *bpf_socket, int level, int optname, void *optval, int optlen)
* Description
* Emulate a call to **getsockopt()** on the socket associated to
* *bpf_socket*, which must be a full socket. The *level* at
@@ -1698,7 +1758,7 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_override_return(struct pt_reg *regs, u64 rc)
* int bpf_override_return(struct pt_regs *regs, u64 rc)
* Description
* Used for error injection, this helper uses kprobes to override
* the return value of the probed function, and to set it to *rc*.
@@ -1744,6 +1804,7 @@ union bpf_attr {
* * **BPF_SOCK_OPS_RTO_CB_FLAG** (retransmission time out)
* * **BPF_SOCK_OPS_RETRANS_CB_FLAG** (retransmission)
* * **BPF_SOCK_OPS_STATE_CB_FLAG** (TCP state change)
* * **BPF_SOCK_OPS_RTT_CB_FLAG** (every RTT)
*
* Therefore, this function can be used to clear a callback flag by
* setting the appropriate bit to zero. e.g. to disable the RTO
@@ -1910,7 +1971,7 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_get_stack(struct pt_regs *regs, void *buf, u32 size, u64 flags)
* int bpf_get_stack(void *ctx, void *buf, u32 size, u64 flags)
* Description
* Return a user or a kernel stack in bpf program provided buffer.
* To achieve this, the helper needs *ctx*, which is a pointer
@@ -1943,7 +2004,7 @@ union bpf_attr {
* A non-negative value equal to or less than *size* on success,
* or a negative error in case of failure.
*
* int bpf_skb_load_bytes_relative(const struct sk_buff *skb, u32 offset, void *to, u32 len, u32 start_header)
* int bpf_skb_load_bytes_relative(const void *skb, u32 offset, void *to, u32 len, u32 start_header)
* Description
* This helper is similar to **bpf_skb_load_bytes**\ () in that
* it provides an easy way to load *len* bytes from *offset*
@@ -1996,7 +2057,7 @@ union bpf_attr {
* * > 0 one of **BPF_FIB_LKUP_RET_** codes explaining why the
* packet is not forwarded or needs assist from full stack
*
* int bpf_sock_hash_update(struct bpf_sock_ops_kern *skops, struct bpf_map *map, void *key, u64 flags)
* int bpf_sock_hash_update(struct bpf_sock_ops *skops, struct bpf_map *map, void *key, u64 flags)
* Description
* Add an entry to, or update a sockhash *map* referencing sockets.
* The *skops* is used as a new value for the entry associated to
@@ -2355,7 +2416,7 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_msg_push_data(struct sk_buff *skb, u32 start, u32 len, u64 flags)
* int bpf_msg_push_data(struct sk_msg_buff *msg, u32 start, u32 len, u64 flags)
* Description
* For socket policies, insert *len* bytes into *msg* at offset
* *start*.
@@ -2371,9 +2432,9 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_msg_pop_data(struct sk_msg_buff *msg, u32 start, u32 pop, u64 flags)
* int bpf_msg_pop_data(struct sk_msg_buff *msg, u32 start, u32 len, u64 flags)
* Description
* Will remove *pop* bytes from a *msg* starting at byte *start*.
* Will remove *len* bytes from a *msg* starting at byte *start*.
* This may result in **ENOMEM** errors under certain situations if
* an allocation and copy are required due to a full ring buffer.
* However, the helper will try to avoid doing the allocation
@@ -2468,7 +2529,7 @@ union bpf_attr {
* A **struct bpf_tcp_sock** pointer on success, or **NULL** in
* case of failure.
*
* int bpf_skb_ecn_set_ce(struct sk_buf *skb)
* int bpf_skb_ecn_set_ce(struct sk_buff *skb)
* Description
* Set ECN (Explicit Congestion Notification) field of IP header
* to **CE** (Congestion Encountered) if current value is **ECT**
@@ -2672,6 +2733,165 @@ union bpf_attr {
* 0 on success.
*
* **-ENOENT** if the bpf-local-storage cannot be found.
*
* int bpf_send_signal(u32 sig)
* Description
* Send signal *sig* to the process of the current task.
* The signal may be delivered to any of this process's threads.
* Return
* 0 on success or successfully queued.
*
* **-EBUSY** if work queue under nmi is full.
*
* **-EINVAL** if *sig* is invalid.
*
* **-EPERM** if no permission to send the *sig*.
*
* **-EAGAIN** if bpf program can try again.
*
* s64 bpf_tcp_gen_syncookie(struct bpf_sock *sk, void *iph, u32 iph_len, struct tcphdr *th, u32 th_len)
* Description
* Try to issue a SYN cookie for the packet with corresponding
* IP/TCP headers, *iph* and *th*, on the listening socket in *sk*.
*
* *iph* points to the start of the IPv4 or IPv6 header, while
* *iph_len* contains **sizeof**\ (**struct iphdr**) or
* **sizeof**\ (**struct ip6hdr**).
*
* *th* points to the start of the TCP header, while *th_len*
* contains the length of the TCP header.
*
* Return
* On success, lower 32 bits hold the generated SYN cookie in
* followed by 16 bits which hold the MSS value for that cookie,
* and the top 16 bits are unused.
*
* On failure, the returned value is one of the following:
*
* **-EINVAL** SYN cookie cannot be issued due to error
*
* **-ENOENT** SYN cookie should not be issued (no SYN flood)
*
* **-EOPNOTSUPP** kernel configuration does not enable SYN cookies
*
* **-EPROTONOSUPPORT** IP packet version is not 4 or 6
*
* int bpf_skb_output(void *ctx, struct bpf_map *map, u64 flags, void *data, u64 size)
* Description
* Write raw *data* blob into a special BPF perf event held by
* *map* of type **BPF_MAP_TYPE_PERF_EVENT_ARRAY**. This perf
* event must have the following attributes: **PERF_SAMPLE_RAW**
* as **sample_type**, **PERF_TYPE_SOFTWARE** as **type**, and
* **PERF_COUNT_SW_BPF_OUTPUT** as **config**.
*
* The *flags* are used to indicate the index in *map* for which
* the value must be put, masked with **BPF_F_INDEX_MASK**.
* Alternatively, *flags* can be set to **BPF_F_CURRENT_CPU**
* to indicate that the index of the current CPU core should be
* used.
*
* The value to write, of *size*, is passed through eBPF stack and
* pointed by *data*.
*
* *ctx* is a pointer to in-kernel struct sk_buff.
*
* This helper is similar to **bpf_perf_event_output**\ () but
* restricted to raw_tracepoint bpf programs.
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_probe_read_user(void *dst, u32 size, const void *unsafe_ptr)
* Description
* Safely attempt to read *size* bytes from user space address
* *unsafe_ptr* and store the data in *dst*.
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
* Description
* Safely attempt to read *size* bytes from kernel space address
* *unsafe_ptr* and store the data in *dst*.
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_probe_read_user_str(void *dst, u32 size, const void *unsafe_ptr)
* Description
* Copy a NUL terminated string from an unsafe user address
* *unsafe_ptr* to *dst*. The *size* should include the
* terminating NUL byte. In case the string length is smaller than
* *size*, the target is not padded with further NUL bytes. If the
* string length is larger than *size*, just *size*-1 bytes are
* copied and the last byte is set to NUL.
*
* On success, the length of the copied string is returned. This
* makes this helper useful in tracing programs for reading
* strings, and more importantly to get its length at runtime. See
* the following snippet:
*
* ::
*
* SEC("kprobe/sys_open")
* void bpf_sys_open(struct pt_regs *ctx)
* {
* char buf[PATHLEN]; // PATHLEN is defined to 256
* int res = bpf_probe_read_user_str(buf, sizeof(buf),
* ctx->di);
*
* // Consume buf, for example push it to
* // userspace via bpf_perf_event_output(); we
* // can use res (the string length) as event
* // size, after checking its boundaries.
* }
*
* In comparison, using **bpf_probe_read_user()** helper here
* instead to read the string would require to estimate the length
* at compile time, and would often result in copying more memory
* than necessary.
*
* Another useful use case is when parsing individual process
* arguments or individual environment variables navigating
* *current*\ **->mm->arg_start** and *current*\
* **->mm->env_start**: using this helper and the return value,
* one can quickly iterate at the right offset of the memory area.
* Return
* On success, the strictly positive length of the string,
* including the trailing NUL character. On error, a negative
* value.
*
* int bpf_probe_read_kernel_str(void *dst, u32 size, const void *unsafe_ptr)
* Description
* Copy a NUL terminated string from an unsafe kernel address *unsafe_ptr*
* to *dst*. Same semantics as with bpf_probe_read_user_str() apply.
* Return
* On success, the strictly positive length of the string, including
* the trailing NUL character. On error, a negative value.
*
* int bpf_tcp_send_ack(void *tp, u32 rcv_nxt)
* Description
* Send out a tcp-ack. *tp* is the in-kernel struct tcp_sock.
* *rcv_nxt* is the ack_seq to be sent out.
* Return
* 0 on success, or a negative error in case of failure.
*
* int bpf_send_signal_thread(u32 sig)
* Description
* Send signal *sig* to the thread corresponding to the current task.
* Return
* 0 on success or successfully queued.
*
* **-EBUSY** if work queue under nmi is full.
*
* **-EINVAL** if *sig* is invalid.
*
* **-EPERM** if no permission to send the *sig*.
*
* **-EAGAIN** if bpf program can try again.
*
* u64 bpf_jiffies64(void)
* Description
* Obtain the 64bit jiffies
* Return
* The 64 bit jiffies
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -2782,7 +3002,17 @@ union bpf_attr {
FN(strtol), \
FN(strtoul), \
FN(sk_storage_get), \
FN(sk_storage_delete),
FN(sk_storage_delete), \
FN(send_signal), \
FN(tcp_gen_syncookie), \
FN(skb_output), \
FN(probe_read_user), \
FN(probe_read_kernel), \
FN(probe_read_user_str), \
FN(probe_read_kernel_str), \
FN(tcp_send_ack), \
FN(send_signal_thread), \
FN(jiffies64),
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
* function eBPF program intends to call
@@ -3031,6 +3261,12 @@ struct bpf_tcp_sock {
* sum(delta(snd_una)), or how many bytes
* were acked.
*/
__u32 dsack_dups; /* RFC4898 tcpEStatsStackDSACKDups
* total number of DSACK blocks received
*/
__u32 delivered; /* Total data packets delivered incl. rexmits */
__u32 delivered_ce; /* Like the above but only ECE marked packets */
__u32 icsk_retransmits; /* Number of unrecovered [RTO] timeouts */
};
struct bpf_sock_tuple {
@@ -3050,6 +3286,10 @@ struct bpf_sock_tuple {
};
};
struct bpf_xdp_sock {
__u32 queue_id;
};
#define XDP_PACKET_HEADROOM 256
/* User return codes for XDP prog type.
@@ -3141,6 +3381,7 @@ struct bpf_prog_info {
char name[BPF_OBJ_NAME_LEN];
__u32 ifindex;
__u32 gpl_compatible:1;
__u32 :31; /* alignment pad */
__u64 netns_dev;
__u64 netns_ino;
__u32 nr_jited_ksyms;
@@ -3172,7 +3413,7 @@ struct bpf_map_info {
__u32 map_flags;
char name[BPF_OBJ_NAME_LEN];
__u32 ifindex;
__u32 :32;
__u32 btf_vmlinux_value_type_id;
__u64 netns_dev;
__u64 netns_ino;
__u32 btf_id;
@@ -3195,7 +3436,7 @@ struct bpf_sock_addr {
__u32 user_ip4; /* Allows 1,2,4-byte read and 4-byte write.
* Stored in network byte order.
*/
__u32 user_ip6[4]; /* Allows 1,2,4-byte read an 4-byte write.
__u32 user_ip6[4]; /* Allows 1,2,4,8-byte read and 4,8-byte write.
* Stored in network byte order.
*/
__u32 user_port; /* Allows 4-byte read and write.
@@ -3204,12 +3445,13 @@ struct bpf_sock_addr {
__u32 family; /* Allows 4-byte read, but no write */
__u32 type; /* Allows 4-byte read, but no write */
__u32 protocol; /* Allows 4-byte read, but no write */
__u32 msg_src_ip4; /* Allows 1,2,4-byte read an 4-byte write.
__u32 msg_src_ip4; /* Allows 1,2,4-byte read and 4-byte write.
* Stored in network byte order.
*/
__u32 msg_src_ip6[4]; /* Allows 1,2,4-byte read an 4-byte write.
__u32 msg_src_ip6[4]; /* Allows 1,2,4,8-byte read and 4,8-byte write.
* Stored in network byte order.
*/
__bpf_md_ptr(struct bpf_sock *, sk);
};
/* User bpf_sock_ops struct to access socket values and specify request ops
@@ -3261,13 +3503,15 @@ struct bpf_sock_ops {
__u32 sk_txhash;
__u64 bytes_received;
__u64 bytes_acked;
__bpf_md_ptr(struct bpf_sock *, sk);
};
/* Definitions for bpf_sock_ops_cb_flags */
#define BPF_SOCK_OPS_RTO_CB_FLAG (1<<0)
#define BPF_SOCK_OPS_RETRANS_CB_FLAG (1<<1)
#define BPF_SOCK_OPS_STATE_CB_FLAG (1<<2)
#define BPF_SOCK_OPS_ALL_CB_FLAGS 0x7 /* Mask of all currently
#define BPF_SOCK_OPS_RTT_CB_FLAG (1<<3)
#define BPF_SOCK_OPS_ALL_CB_FLAGS 0xF /* Mask of all currently
* supported cb flags
*/
@@ -3322,6 +3566,8 @@ enum {
BPF_SOCK_OPS_TCP_LISTEN_CB, /* Called on listen(2), right after
* socket transition to LISTEN state.
*/
BPF_SOCK_OPS_RTT_CB, /* Called on every RTT.
*/
};
/* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
@@ -3376,8 +3622,8 @@ struct bpf_raw_tracepoint_args {
/* DIRECT: Skip the FIB rules and go to FIB table associated with device
* OUTPUT: Do lookup from egress perspective; default is ingress
*/
#define BPF_FIB_LOOKUP_DIRECT BIT(0)
#define BPF_FIB_LOOKUP_OUTPUT BIT(1)
#define BPF_FIB_LOOKUP_DIRECT (1U << 0)
#define BPF_FIB_LOOKUP_OUTPUT (1U << 1)
enum {
BPF_FIB_LKUP_RET_SUCCESS, /* lookup successful */
@@ -3449,6 +3695,10 @@ enum bpf_task_fd_type {
BPF_FD_TYPE_URETPROBE, /* filename + offset */
};
#define BPF_FLOW_DISSECTOR_F_PARSE_1ST_FRAG (1U << 0)
#define BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL (1U << 1)
#define BPF_FLOW_DISSECTOR_F_STOP_AT_ENCAP (1U << 2)
struct bpf_flow_keys {
__u16 nhoff;
__u16 thoff;
@@ -3470,6 +3720,8 @@ struct bpf_flow_keys {
__u32 ipv6_dst[4]; /* in6_addr; network order */
};
};
__u32 flags;
__be32 flow_label;
};
struct bpf_func_info {
@@ -3500,4 +3752,15 @@ struct bpf_sysctl {
*/
};
struct bpf_sockopt {
__bpf_md_ptr(struct bpf_sock *, sk);
__bpf_md_ptr(void *, optval);
__bpf_md_ptr(void *, optval_end);
__s32 level;
__s32 optname;
__s32 optlen;
__s32 retval;
};
#endif /* _UAPI__LINUX_BPF_H__ */

View File

@@ -22,9 +22,9 @@ struct btf_header {
};
/* Max # of type identifier */
#define BTF_MAX_TYPE 0x0000ffff
#define BTF_MAX_TYPE 0x000fffff
/* Max offset into the string section */
#define BTF_MAX_NAME_OFFSET 0x0000ffff
#define BTF_MAX_NAME_OFFSET 0x00ffffff
/* Max # of struct/union/enum members or func args */
#define BTF_MAX_VLEN 0xffff
@@ -142,7 +142,14 @@ struct btf_param {
enum {
BTF_VAR_STATIC = 0,
BTF_VAR_GLOBAL_ALLOCATED,
BTF_VAR_GLOBAL_ALLOCATED = 1,
BTF_VAR_GLOBAL_EXTERN = 2,
};
enum btf_func_linkage {
BTF_FUNC_STATIC = 0,
BTF_FUNC_GLOBAL = 1,
BTF_FUNC_EXTERN = 2,
};
/* BTF_KIND_VAR is followed by a single "struct btf_var" to describe

View File

@@ -167,6 +167,9 @@ enum {
IFLA_NEW_IFINDEX,
IFLA_MIN_MTU,
IFLA_MAX_MTU,
IFLA_PROP_LIST,
IFLA_ALT_IFNAME, /* Alternative ifname */
IFLA_PERM_ADDRESS,
__IFLA_MAX
};
@@ -483,6 +486,13 @@ enum macsec_validation_type {
MACSEC_VALIDATE_MAX = __MACSEC_VALIDATE_END - 1,
};
enum macsec_offload {
MACSEC_OFFLOAD_OFF = 0,
MACSEC_OFFLOAD_PHY = 1,
__MACSEC_OFFLOAD_END,
MACSEC_OFFLOAD_MAX = __MACSEC_OFFLOAD_END - 1,
};
/* IPVLAN section */
enum {
IFLA_IPVLAN_UNSPEC,
@@ -636,6 +646,7 @@ enum {
IFLA_BOND_AD_USER_PORT_KEY,
IFLA_BOND_AD_ACTOR_SYSTEM,
IFLA_BOND_TLB_DYNAMIC_LB,
IFLA_BOND_PEER_NOTIF_DELAY,
__IFLA_BOND_MAX,
};
@@ -694,6 +705,7 @@ enum {
IFLA_VF_IB_NODE_GUID, /* VF Infiniband node GUID */
IFLA_VF_IB_PORT_GUID, /* VF Infiniband port GUID */
IFLA_VF_VLAN_LIST, /* nested list of vlans, option for QinQ */
IFLA_VF_BROADCAST, /* VF broadcast */
__IFLA_VF_MAX,
};
@@ -704,6 +716,10 @@ struct ifla_vf_mac {
__u8 mac[32]; /* MAX_ADDR_LEN */
};
struct ifla_vf_broadcast {
__u8 broadcast[32];
};
struct ifla_vf_vlan {
__u32 vf;
__u32 vlan; /* 0 - 4095, 0 disables VLAN filter */

View File

@@ -16,6 +16,18 @@
#define XDP_SHARED_UMEM (1 << 0)
#define XDP_COPY (1 << 1) /* Force copy-mode */
#define XDP_ZEROCOPY (1 << 2) /* Force zero-copy mode */
/* If this option is set, the driver might go sleep and in that case
* the XDP_RING_NEED_WAKEUP flag in the fill and/or Tx rings will be
* set. If it is set, the application need to explicitly wake up the
* driver with a poll() (Rx and Tx) or sendto() (Tx only). If you are
* running the driver and the application on the same core, you should
* use this option so that the kernel will yield to the user space
* application.
*/
#define XDP_USE_NEED_WAKEUP (1 << 3)
/* Flags for xsk_umem_config flags */
#define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0)
struct sockaddr_xdp {
__u16 sxdp_family;
@@ -25,10 +37,14 @@ struct sockaddr_xdp {
__u32 sxdp_shared_umem_fd;
};
/* XDP_RING flags */
#define XDP_RING_NEED_WAKEUP (1 << 0)
struct xdp_ring_offset {
__u64 producer;
__u64 consumer;
__u64 desc;
__u64 flags;
};
struct xdp_mmap_offsets {
@@ -46,12 +62,14 @@ struct xdp_mmap_offsets {
#define XDP_UMEM_FILL_RING 5
#define XDP_UMEM_COMPLETION_RING 6
#define XDP_STATISTICS 7
#define XDP_OPTIONS 8
struct xdp_umem_reg {
__u64 addr; /* Start of packet data area */
__u64 len; /* Length of packet data area */
__u32 chunk_size;
__u32 headroom;
__u32 flags;
};
struct xdp_statistics {
@@ -60,12 +78,24 @@ struct xdp_statistics {
__u64 tx_invalid_descs; /* Dropped due to invalid descriptor */
};
struct xdp_options {
__u32 flags;
};
/* Flags for the flags field of struct xdp_options */
#define XDP_OPTIONS_ZEROCOPY (1 << 0)
/* Pgoff for mmaping the rings */
#define XDP_PGOFF_RX_RING 0
#define XDP_PGOFF_TX_RING 0x80000000
#define XDP_UMEM_PGOFF_FILL_RING 0x100000000ULL
#define XDP_UMEM_PGOFF_COMPLETION_RING 0x180000000ULL
/* Masks for unaligned chunks mode */
#define XSK_UNALIGNED_BUF_OFFSET_SHIFT 48
#define XSK_UNALIGNED_BUF_ADDR_MASK \
((1ULL << XSK_UNALIGNED_BUF_OFFSET_SHIFT) - 1)
/* Rx/Tx descriptor */
struct xdp_desc {
__u64 addr;

View File

@@ -1,95 +0,0 @@
# SPDX-License-Identifier: LGPL-2.1 OR BSD-2-Clause
project('libbpf', 'c',
version : '0.0.3',
license : 'LGPL-2.1 OR BSD-2-Clause',
default_options : [
'prefix=/usr',
],
meson_version : '>= 0.46',
)
patchlevel = meson.project_version().split('.')[1]
libbpf_source_dir = './'
libbpf_sources = files(run_command('find',
[
'@0@/src'.format(libbpf_source_dir),
'-type',
'f',
'-name',
'*.[h|c]']).stdout().split())
libbpf_headers = files(
join_paths(libbpf_source_dir, 'src/bpf.h'),
join_paths(libbpf_source_dir, 'src/btf.h'),
join_paths(libbpf_source_dir, 'src/libbpf.h'))
feature_rellocarray = run_command(join_paths(libbpf_source_dir, 'scripts/check-reallocarray.sh'))
libbpf_c_args = ['-g',
'-O2',
'-Werror',
'-Wall',
]
if feature_rellocarray.stdout().strip() != ''
libbpf_c_args += '-DCOMPAT_NEED_REALLOCARRAY'
endif
# bpf_includes are required to include bpf.h, btf.h, libbpf.h
bpf_includes = include_directories(
join_paths(libbpf_source_dir, 'src'))
libbpf_includes = include_directories(
join_paths(libbpf_source_dir, 'include'),
join_paths(libbpf_source_dir, 'include/uapi'))
libelf = dependency('libelf')
libelf = dependency('libelf', required: false)
if not libelf.found()
libelf = cc.find_library('elf', required: true)
endif
deps = [libelf]
libbpf_static = static_library(
'bpf',
libbpf_sources,
c_args : libbpf_c_args,
dependencies : deps,
include_directories : libbpf_includes,
install : true)
libbpf_static_dep = declare_dependency(link_with : libbpf_static)
libbpf_map_source_path = join_paths(libbpf_source_dir, 'src/libbpf.map')
libbpf_map_abs_path = join_paths(meson.current_source_dir(), libbpf_map_source_path)
libbpf_c_args += ['-fPIC', '-fvisibility=hidden']
libbpf_link_args = ['-Wl,--version-script=@0@'.format(libbpf_map_abs_path)]
libbpf_shared = shared_library(
'bpf',
libbpf_sources,
c_args : libbpf_c_args,
dependencies : deps,
include_directories : libbpf_includes,
install : true,
link_args : libbpf_link_args,
link_depends : libbpf_map_source_path,
soversion : patchlevel,
version : meson.project_version())
libbpf_shared_dep = declare_dependency(link_with : libbpf_shared)
install_headers(libbpf_headers, subdir : 'bpf')
pkg = import('pkgconfig')
pkg.generate(
name: meson.project_name(),
version: meson.project_version(),
libraries: libbpf_shared,
requires_private: ['libelf'],
description: '''BPF library''')

105
scripts/coverity.sh Executable file
View File

@@ -0,0 +1,105 @@
#!/bin/bash
# Taken from: https://scan.coverity.com/scripts/travisci_build_coverity_scan.sh
# Local changes are annotated with "#[local]"
set -e
# Environment check
echo -e "\033[33;1mNote: COVERITY_SCAN_PROJECT_NAME and COVERITY_SCAN_TOKEN are available on Project Settings page on scan.coverity.com\033[0m"
[ -z "$COVERITY_SCAN_PROJECT_NAME" ] && echo "ERROR: COVERITY_SCAN_PROJECT_NAME must be set" && exit 1
[ -z "$COVERITY_SCAN_NOTIFICATION_EMAIL" ] && echo "ERROR: COVERITY_SCAN_NOTIFICATION_EMAIL must be set" && exit 1
[ -z "$COVERITY_SCAN_BRANCH_PATTERN" ] && echo "ERROR: COVERITY_SCAN_BRANCH_PATTERN must be set" && exit 1
[ -z "$COVERITY_SCAN_BUILD_COMMAND" ] && echo "ERROR: COVERITY_SCAN_BUILD_COMMAND must be set" && exit 1
[ -z "$COVERITY_SCAN_TOKEN" ] && echo "ERROR: COVERITY_SCAN_TOKEN must be set" && exit 1
PLATFORM=`uname`
#[local] Use /var/tmp for TOOL_ARCHIVE and TOOL_BASE, as on certain systems
# /tmp is tmpfs and is sometimes too small to handle all necessary tooling
TOOL_ARCHIVE=/var//tmp/cov-analysis-${PLATFORM}.tgz
TOOL_URL=https://scan.coverity.com/download/${PLATFORM}
TOOL_BASE=/var/tmp/coverity-scan-analysis
UPLOAD_URL="https://scan.coverity.com/builds"
SCAN_URL="https://scan.coverity.com"
# Do not run on pull requests
if [ "${TRAVIS_PULL_REQUEST}" = "true" ]; then
echo -e "\033[33;1mINFO: Skipping Coverity Analysis: branch is a pull request.\033[0m"
exit 0
fi
# Verify this branch should run
IS_COVERITY_SCAN_BRANCH=`ruby -e "puts '${TRAVIS_BRANCH}' =~ /\\A$COVERITY_SCAN_BRANCH_PATTERN\\z/ ? 1 : 0"`
if [ "$IS_COVERITY_SCAN_BRANCH" = "1" ]; then
echo -e "\033[33;1mCoverity Scan configured to run on branch ${TRAVIS_BRANCH}\033[0m"
else
echo -e "\033[33;1mCoverity Scan NOT configured to run on branch ${TRAVIS_BRANCH}\033[0m"
exit 1
fi
# Verify upload is permitted
AUTH_RES=`curl -s --form project="$COVERITY_SCAN_PROJECT_NAME" --form token="$COVERITY_SCAN_TOKEN" $SCAN_URL/api/upload_permitted`
if [ "$AUTH_RES" = "Access denied" ]; then
echo -e "\033[33;1mCoverity Scan API access denied. Check COVERITY_SCAN_PROJECT_NAME and COVERITY_SCAN_TOKEN.\033[0m"
exit 1
else
AUTH=`echo $AUTH_RES | ruby -e "require 'rubygems'; require 'json'; puts JSON[STDIN.read]['upload_permitted']"`
if [ "$AUTH" = "true" ]; then
echo -e "\033[33;1mCoverity Scan analysis authorized per quota.\033[0m"
else
WHEN=`echo $AUTH_RES | ruby -e "require 'rubygems'; require 'json'; puts JSON[STDIN.read]['next_upload_permitted_at']"`
echo -e "\033[33;1mCoverity Scan analysis NOT authorized until $WHEN.\033[0m"
exit 0
fi
fi
if [ ! -d $TOOL_BASE ]; then
# Download Coverity Scan Analysis Tool
if [ ! -e $TOOL_ARCHIVE ]; then
echo -e "\033[33;1mDownloading Coverity Scan Analysis Tool...\033[0m"
wget -nv -O $TOOL_ARCHIVE $TOOL_URL --post-data "project=$COVERITY_SCAN_PROJECT_NAME&token=$COVERITY_SCAN_TOKEN"
fi
# Extract Coverity Scan Analysis Tool
echo -e "\033[33;1mExtracting Coverity Scan Analysis Tool...\033[0m"
mkdir -p $TOOL_BASE
pushd $TOOL_BASE
tar xzf $TOOL_ARCHIVE
popd
fi
TOOL_DIR=`find $TOOL_BASE -type d -name 'cov-analysis*'`
export PATH=$TOOL_DIR/bin:$PATH
# Build
echo -e "\033[33;1mRunning Coverity Scan Analysis Tool...\033[0m"
COV_BUILD_OPTIONS=""
#COV_BUILD_OPTIONS="--return-emit-failures 8 --parse-error-threshold 85"
RESULTS_DIR="cov-int"
eval "${COVERITY_SCAN_BUILD_COMMAND_PREPEND}"
COVERITY_UNSUPPORTED=1 cov-build --dir $RESULTS_DIR $COV_BUILD_OPTIONS $COVERITY_SCAN_BUILD_COMMAND
cov-import-scm --dir $RESULTS_DIR --scm git --log $RESULTS_DIR/scm_log.txt 2>&1
# Upload results
echo -e "\033[33;1mTarring Coverity Scan Analysis results...\033[0m"
RESULTS_ARCHIVE=analysis-results.tgz
tar czf $RESULTS_ARCHIVE $RESULTS_DIR
SHA=`git rev-parse --short HEAD`
echo -e "\033[33;1mUploading Coverity Scan Analysis results...\033[0m"
response=$(curl \
--silent --write-out "\n%{http_code}\n" \
--form project=$COVERITY_SCAN_PROJECT_NAME \
--form token=$COVERITY_SCAN_TOKEN \
--form email=$COVERITY_SCAN_NOTIFICATION_EMAIL \
--form file=@$RESULTS_ARCHIVE \
--form version=$SHA \
--form description="Travis CI build" \
$UPLOAD_URL)
status_code=$(echo "$response" | sed -n '$p')
#[local] Coverity used to return 201 on success, but it's 200 now
# See https://github.com/systemd/systemd/blob/master/tools/coverity.sh#L145
if [ "$status_code" != "200" ]; then
TEXT=$(echo "$response" | sed '$d')
echo -e "\033[33;1mCoverity Scan upload failed: $TEXT.\033[0m"
exit 1
fi

View File

@@ -1,40 +1,212 @@
#!/bin/bash
usage () {
echo "USAGE: ./sync-kernel.sh <kernel-repo> <libbpf-repo> [<baseline-commit>]"
echo ""
echo "If <baseline-commit> is not specified, it's read from <libbpf-repo>/CHECKPOINT-COMMIT"
exit 1
echo "USAGE: ./sync-kernel.sh <libbpf-repo> <kernel-repo> <bpf-branch>"
echo ""
echo "Set BPF_NEXT_BASELINE to override bpf-next tree commit, otherwise read from <libbpf-repo>/CHECKPOINT-COMMIT."
echo "Set BPF_BASELINE to override bpf tree commit, otherwise read from <libbpf-repo>/BPF-CHECKPOINT-COMMIT."
echo "Set MANUAL_MODE to 1 to manually control every cherry-picked commits."
echo "Set IGNORE_CONSISTENCY to 1 to ignore failed contents consistency check."
exit 1
}
LINUX_REPO=${1-""}
LIBBPF_REPO=${2-""}
if [ -z "${LINUX_REPO}" ]; then
usage
fi
if [ -z "${LIBBPF_REPO}" ]; then
usage
fi
set -eu
WORKDIR=$(pwd)
trap "cd ${WORKDIR}; exit" INT TERM EXIT
LIBBPF_REPO=${1-""}
LINUX_REPO=${2-""}
BPF_BRANCH=${3-""}
BASELINE_COMMIT=${BPF_NEXT_BASELINE:-$(cat ${LIBBPF_REPO}/CHECKPOINT-COMMIT)}
BPF_BASELINE_COMMIT=${BPF_BASELINE:-$(cat ${LIBBPF_REPO}/BPF-CHECKPOINT-COMMIT)}
echo "WORKDIR: ${WORKDIR}"
echo "LINUX REPO: ${LINUX_REPO}"
echo "LIBBPF REPO: ${LIBBPF_REPO}"
if [ -z "${LIBBPF_REPO}" ] || [ -z "${LINUX_REPO}" ] || [ -z "${BPF_BRANCH}" ]; then
echo "Error: libbpf or linux repos are not specified"
usage
fi
if [ -z "${BPF_BRANCH}" ]; then
echo "Error: linux's bpf tree branch is not specified"
usage
fi
if [ -z "${BASELINE_COMMIT}" ] || [ -z "${BPF_BASELINE_COMMIT}" ]; then
echo "Error: bpf or bpf-next baseline commits are not provided"
usage
fi
SUFFIX=$(date --utc +%Y-%m-%dT%H-%M-%S.%3NZ)
BASELINE_COMMIT=${3-$(cat ${LIBBPF_REPO}/CHECKPOINT-COMMIT)}
WORKDIR=$(pwd)
TMP_DIR=$(mktemp -d)
trap "cd ${WORKDIR}; exit" INT TERM EXIT
declare -A PATH_MAP
PATH_MAP=( \
[tools/lib/bpf]=src \
[tools/include/uapi/linux/bpf_common.h]=include/uapi/linux/bpf_common.h \
[tools/include/uapi/linux/bpf.h]=include/uapi/linux/bpf.h \
[tools/include/uapi/linux/btf.h]=include/uapi/linux/btf.h \
[tools/include/uapi/linux/if_link.h]=include/uapi/linux/if_link.h \
[tools/include/uapi/linux/if_xdp.h]=include/uapi/linux/if_xdp.h \
[tools/include/uapi/linux/netlink.h]=include/uapi/linux/netlink.h \
[tools/include/tools/libc_compat.h]=include/tools/libc_compat.h \
)
LIBBPF_PATHS="${!PATH_MAP[@]}"
LIBBPF_VIEW_PATHS="${PATH_MAP[@]}"
LIBBPF_VIEW_EXCLUDE_REGEX='^src/(Makefile|Build|test_libbpf\.c|bpf_helper_defs\.h|\.gitignore)$'
LIBBPF_TREE_FILTER="mkdir -p __libbpf/include/uapi/linux __libbpf/include/tools && "$'\\\n'
for p in "${!PATH_MAP[@]}"; do
LIBBPF_TREE_FILTER+="git mv -kf ${p} __libbpf/${PATH_MAP[${p}]} && "$'\\\n'
done
LIBBPF_TREE_FILTER+="git rm --ignore-unmatch -f __libbpf/src/{Makefile,Build,test_libbpf.c,.gitignore} >/dev/null"
cd_to()
{
cd ${WORKDIR} && cd "$1"
}
# Output brief single-line commit description
# $1 - commit ref
commit_desc()
{
git log -n1 --pretty='%h ("%s")' $1
}
# Create commit single-line signature, which consists of:
# - full commit hash
# - author date in ISO8601 format
# - full commit body with newlines replaced with vertical bars (|)
# - shortstat appended at the end
# The idea is that this single-line signature is good enough to make final
# decision about whether two commits are the same, across different repos.
# $1 - commit ref
commit_signature()
{
git log -n1 --pretty='("%s")|%aI|%b' --shortstat $1 | tr '\n' '|'
}
# Validate there are no non-empty merges (we can't handle them)
# $1 - baseline tag
# $2 - tip tag
validate_merges()
{
local baseline_tag=$1
local tip_tag=$2
local new_merges
local merge_change_cnt
local ignore_merge_resolutions
local desc
new_merges=$(git rev-list --merges --topo-order --reverse ${baseline_tag}..${tip_tag} ${LIBBPF_PATHS[@]})
for new_merge in ${new_merges}; do
desc=$(commit_desc ${new_merge})
echo "MERGE: ${desc}"
merge_change_cnt=$(git show --format='' ${new_merge} | wc -l)
if ((${merge_change_cnt} > 0)); then
read -p "Merge '${desc}' is non-empty, which will cause conflicts! Do you want to proceed? [y/N]: " ignore_merge_resolutions
case "${ignore_merge_resolutions}" in
"y" | "Y")
echo "Skipping '${desc}'..."
continue
;;
esac
exit 3
fi
done
}
# Cherry-pick commits touching libbpf-related files
# $1 - baseline_tag
# $2 - tip_tag
cherry_pick_commits()
{
local manual_mode=${MANUAL_MODE:-0}
local baseline_tag=$1
local tip_tag=$2
local new_commits
local signature
local should_skip
local synced_cnt
local manual_check
local libbpf_conflict_cnt
local desc
new_commits=$(git rev-list --no-merges --topo-order --reverse ${baseline_tag}..${tip_tag} ${LIBBPF_PATHS[@]})
for new_commit in ${new_commits}; do
desc="$(commit_desc ${new_commit})"
signature="$(commit_signature ${new_commit})"
synced_cnt=$(grep -F "${signature}" ${TMP_DIR}/libbpf_commits.txt | wc -l)
manual_check=0
if ((${synced_cnt} > 0)); then
# commit with the same subject is already in libbpf, but it's
# not 100% the same commit, so check with user
echo "Commit '${desc}' is synced into libbpf as:"
grep -F "${signature}" ${TMP_DIR}/libbpf_commits.txt | \
cut -d'|' -f1 | sed -e 's/^/- /'
if ((${manual_mode} != 1 && ${synced_cnt} == 1)); then
echo "Skipping '${desc}' due to unique match..."
continue
fi
if ((${synced_cnt} > 1)); then
echo "'${desc} matches multiple commits, please, double-check!"
manual_check=1
fi
fi
if ((${manual_mode} == 1 || ${manual_check} == 1)); then
read -p "Do you want to skip '${desc}'? [y/N]: " should_skip
case "${should_skip}" in
"y" | "Y")
echo "Skipping '${desc}'..."
continue
;;
esac
fi
# commit hasn't been synced into libbpf yet
echo "Picking '${desc}'..."
if ! git cherry-pick ${new_commit} &>/dev/null; then
echo "Warning! Cherry-picking '${desc} failed, checking if it's non-libbpf files causing problems..."
libbpf_conflict_cnt=$(git diff --name-only --diff-filter=U -- ${LIBBPF_PATHS[@]} | wc -l)
conflict_cnt=$(git diff --name-only | wc -l)
if ((${libbpf_conflict_cnt} == 0)); then
echo "Looks like only non-libbpf files have conflicts, ignoring..."
if ((${conflict_cnt} == 0)); then
echo "Empty cherry-pick, skipping it..."
git cherry-pick --abort
continue
fi
git add .
# GIT_EDITOR=true to avoid editor popping up to edit commit message
if ! GIT_EDITOR=true git cherry-pick --continue &>/dev/null; then
echo "Error! That still failed! Please resolve manually."
else
echo "Success! All cherry-pick conflicts were resolved for '${desc}'!"
continue
fi
fi
read -p "Error! Cherry-picking '${desc}' failed, please fix manually and press <return> to proceed..."
fi
done
}
cd_to ${LIBBPF_REPO}
GITHUB_ABS_DIR=$(pwd)
echo "Dumping existing libbpf commit signatures..."
for h in $(git log --pretty='%h' -n500); do
echo $h "$(commit_signature $h)" >> ${TMP_DIR}/libbpf_commits.txt
done
# Use current kernel repo HEAD as a source of patches
cd ${LINUX_REPO}
cd_to ${LINUX_REPO}
LINUX_ABS_DIR=$(pwd)
TIP_SYM_REF=$(git symbolic-ref -q --short HEAD || git rev-parse HEAD)
TIP_COMMIT=$(git rev-parse HEAD)
BPF_TIP_COMMIT=$(git rev-parse ${BPF_BRANCH})
BASELINE_TAG=libbpf-baseline-${SUFFIX}
TIP_TAG=libbpf-tip-${SUFFIX}
BPF_BASELINE_TAG=libbpf-bpf-baseline-${SUFFIX}
BPF_TIP_TAG=libbpf-bpf-tip-${SUFFIX}
VIEW_TAG=libbpf-view-${SUFFIX}
LIBBPF_SYNC_TAG=libbpf-sync-${SUFFIX}
@@ -43,75 +215,41 @@ SQUASH_BASE_TAG=libbpf-squash-base-${SUFFIX}
SQUASH_TIP_TAG=libbpf-squash-tip-${SUFFIX}
SQUASH_COMMIT=$(git commit-tree ${BASELINE_COMMIT}^{tree} -m "BASELINE SQUASH ${BASELINE_COMMIT}")
echo "SUFFIX: ${SUFFIX}"
echo "BASELINE COMMIT: $(git log --pretty=oneline --no-walk ${BASELINE_COMMIT})"
echo "TIP COMMIT: $(git log --pretty=oneline --no-walk ${TIP_COMMIT})"
echo "SQUASH COMMIT: ${SQUASH_COMMIT}"
echo "BASELINE TAG: ${BASELINE_TAG}"
echo "TIP TAG: ${TIP_TAG}"
echo "SQUASH BASE TAG: ${SQUASH_BASE_TAG}"
echo "SQUASH TIP TAG: ${SQUASH_TIP_TAG}"
echo "VIEW TAG: ${VIEW_TAG}"
echo "LIBBPF SYNC TAG: ${LIBBPF_SYNC_TAG}"
TMP_DIR=$(mktemp -d)
echo "TEMP DIR: ${TMP_DIR}"
echo "PATCHES+COVER: ${TMP_DIR}/patches"
echo "PATCHSET: ${TMP_DIR}/patchset.patch"
echo "WORKDIR: ${WORKDIR}"
echo "LINUX REPO: ${LINUX_REPO}"
echo "LIBBPF REPO: ${LIBBPF_REPO}"
echo "TEMP DIR: ${TMP_DIR}"
echo "SUFFIX: ${SUFFIX}"
echo "BASE COMMIT: '$(commit_desc ${BASELINE_COMMIT})'"
echo "TIP COMMIT: '$(commit_desc ${TIP_COMMIT})'"
echo "BPF BASE COMMIT: '$(commit_desc ${BPF_BASELINE_COMMIT})'"
echo "BPF TIP COMMIT: '$(commit_desc ${BPF_TIP_COMMIT})'"
echo "SQUASH COMMIT: ${SQUASH_COMMIT}"
echo "BASELINE TAG: ${BASELINE_TAG}"
echo "TIP TAG: ${TIP_TAG}"
echo "BPF BASELINE TAG: ${BPF_BASELINE_TAG}"
echo "BPF TIP TAG: ${BPF_TIP_TAG}"
echo "SQUASH BASE TAG: ${SQUASH_BASE_TAG}"
echo "SQUASH TIP TAG: ${SQUASH_TIP_TAG}"
echo "VIEW TAG: ${VIEW_TAG}"
echo "LIBBPF SYNC TAG: ${LIBBPF_SYNC_TAG}"
echo "PATCHES: ${TMP_DIR}/patches"
git branch ${BASELINE_TAG} ${BASELINE_COMMIT}
git branch ${TIP_TAG} ${TIP_COMMIT}
git branch ${BPF_BASELINE_TAG} ${BPF_BASELINE_COMMIT}
git branch ${BPF_TIP_TAG} ${BPF_TIP_COMMIT}
git branch ${SQUASH_BASE_TAG} ${SQUASH_COMMIT}
git checkout -b ${SQUASH_TIP_TAG} ${SQUASH_COMMIT}
# Validate there are no non-empty merges in bpf-next and bpf trees
validate_merges ${BASELINE_TAG} ${TIP_TAG}
validate_merges ${BPF_BASELINE_TAG} ${BPF_TIP_TAG}
# Cherry-pick new commits onto squashed baseline commit
LIBBPF_PATHS=(tools/lib/bpf tools/include/uapi/linux/{bpf_common.h,bpf.h,btf.h,if_link.h,netlink.h} tools/include/tools/libc_compat.h)
cherry_pick_commits ${BASELINE_TAG} ${TIP_TAG}
cherry_pick_commits ${BPF_BASELINE_TAG} ${BPF_TIP_TAG}
LIBBPF_NEW_MERGES=$(git rev-list --merges --topo-order --reverse ${BASELINE_TAG}..${TIP_TAG} ${LIBBPF_PATHS[@]})
for LIBBPF_NEW_MERGE in ${LIBBPF_NEW_MERGES}; do
printf "MERGE:\t" && git log --oneline -n1 ${LIBBPF_NEW_MERGE}
MERGE_CHANGES=$(git log --format='' -n1 ${LIBBPF_NEW_MERGE} | wc -l)
if ((${MERGE_CHANGES} > 0)); then
echo "Merge is non empty, aborting!.."
exit 3
fi
done
cd ${WORKDIR} && cd ${LIBBPF_REPO}
git log --oneline -n500 > ${TMP_DIR}/libbpf_commits.txt
cd ${WORKDIR} && cd ${LINUX_REPO}
LIBBPF_NEW_COMMITS=$(git rev-list --no-merges --topo-order --reverse ${BASELINE_TAG}..${TIP_TAG} ${LIBBPF_PATHS[@]})
for LIBBPF_NEW_COMMIT in ${LIBBPF_NEW_COMMITS}; do
echo "Checking commit '${LIBBPF_NEW_COMMIT}'"
SYNCED_COMMITS=$(grep -F "$(git log -n1 --pretty=format:%s ${LIBBPF_NEW_COMMIT})" ${TMP_DIR}/libbpf_commits.txt || echo "")
if [ -n "${SYNCED_COMMITS}" ]; then
# commit with the same subject is already in libbpf, but it's not 100% the same commit, so check with user
echo "Commit '$(git log -n1 --oneline ${LIBBPF_NEW_COMMIT})' appears to be already synced into libbpf..."
echo "Corresponding libbpf commit(s):"
echo "${SYNCED_COMMITS}"
read -p "Do you want to skip it? [y/N]: " SHOULD_SKIP
case "${SHOULD_SKIP}" in
"y" | "Y")
echo "Skipping '$(git log -n1 --oneline ${LIBBPF_NEW_COMMIT})'..."
continue
;;
esac
fi
# commit hasn't been synced into libbpf yet
if ! git cherry-pick ${LIBBPF_NEW_COMMIT}; then
read -p "Cherry-picking '$(git log --oneline -n1 ${LIBBPF_NEW_COMMIT})' failed, please fix manually and press <return> to proceed..."
fi
done
LIBBPF_TREE_FILTER=' \
mkdir -p __libbpf/include/uapi/linux __libbpf/include/tools && \
git mv -kf tools/lib/bpf __libbpf/src && \
git mv -kf tools/include/uapi/linux/{bpf_common.h,bpf.h,btf.h,if_link.h,if_xdp.h,netlink.h} \
__libbpf/include/uapi/linux && \
git mv -kf tools/include/tools/libc_compat.h __libbpf/include/tools && \
git rm --ignore-unmatch -f __libbpf/src/{Makefile,Build,test_libbpf.cpp,.gitignore} \
'
# Move all libbpf files into __libbpf directory.
git filter-branch --prune-empty -f --tree-filter "${LIBBPF_TREE_FILTER}" ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}
# Make __libbpf a new root directory
@@ -126,59 +264,92 @@ fi
# Exclude baseline commit and generate nice cover letter with summary
git format-patch ${SQUASH_BASE_TAG}..${SQUASH_TIP_TAG} --cover-letter -o ${TMP_DIR}/patches
# Now generate single-file patchset w/o cover to apply on top of libbpf repo
git format-patch ${SQUASH_BASE_TAG}..${SQUASH_TIP_TAG} --stdout > ${TMP_DIR}/patchset.patch
# Now is time to re-apply libbpf-related linux patches to libbpf repo
cd ${WORKDIR} && cd ${LIBBPF_REPO}
# Now is time to re-apply libbpf-related linux patches to libbpf repo
cd_to ${LIBBPF_REPO}
git checkout -b ${LIBBPF_SYNC_TAG}
git am --committer-date-is-author-date ${TMP_DIR}/patchset.patch
for patch in $(ls -1 ${TMP_DIR}/patches | tail -n +2); do
if ! git am --committer-date-is-author-date "${TMP_DIR}/patches/${patch}"; then
read -p "Applying ${TMP_DIR}/patches/${patch} failed, please resolve manually and press <return> to proceed..."
fi
done
# Generate bpf_helper_defs.h and commit, if anything changed
# restore Linux tip to use bpf_helpers_doc.py
cd_to ${LINUX_REPO}
git checkout ${TIP_TAG}
# re-generate bpf_helper_defs.h
cd_to ${LIBBPF_REPO}
"${LINUX_ABS_DIR}/scripts/bpf_helpers_doc.py" --header \
--file include/uapi/linux/bpf.h > src/bpf_helper_defs.h
# if anything changed, commit it
helpers_changes=$(git status --porcelain src/bpf_helper_defs.h | wc -l)
if ((${helpers_changes} == 1)); then
git add src/bpf_helper_defs.h
git commit -m "sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
" -- src/bpf_helper_defs.h
fi
# Use generated cover-letter as a template for "sync commit" with
# baseline and checkpoint commits from kernel repo (and leave summary
# from cover letter intact, of course)
echo ${TIP_COMMIT} > CHECKPOINT-COMMIT && \
echo ${BPF_TIP_COMMIT} > BPF-CHECKPOINT-COMMIT && \
git add CHECKPOINT-COMMIT && \
git add BPF-CHECKPOINT-COMMIT && \
awk '/\*\*\* BLURB HERE \*\*\*/ {p=1} p' ${TMP_DIR}/patches/0000-cover-letter.patch | \
sed "s/\*\*\* BLURB HERE \*\*\*/\
sync: latest libbpf changes from kernel\n\
\n\
Syncing latest libbpf commits from kernel repository.\n\
Baseline commit: ${BASELINE_COMMIT}\n\
Checkpoint commit: ${TIP_COMMIT}/" | \
Baseline bpf-next commit: ${BASELINE_COMMIT}\n\
Checkpoint bpf-next commit: ${TIP_COMMIT}\n\
Baseline bpf commit: ${BPF_BASELINE_COMMIT}\n\
Checkpoint bpf commit: ${BPF_TIP_COMMIT}/" | \
git commit --file=-
echo "SUCCESS! ${COMMIT_CNT} commits synced."
echo "Verifying Linux's and Github's libbpf state"
LIBBPF_VIEW_PATHS=(src include/uapi/linux/{bpf_common.h,bpf.h,btf.h,if_link.h,if_xdp.h,netlink.h} include/tools/libc_compat.h)
LIBBPF_VIEW_EXCLUDE_REGEX='^src/(Makefile|Build|test_libbpf.cpp|\.gitignore)$'
cd ${WORKDIR} && cd ${LINUX_REPO}
LINUX_ABS_DIR=$(pwd)
cd_to ${LINUX_REPO}
git checkout -b ${VIEW_TAG} ${TIP_COMMIT}
git filter-branch -f --tree-filter "${LIBBPF_TREE_FILTER}" ${VIEW_TAG}^..${VIEW_TAG}
git filter-branch -f --subdirectory-filter __libbpf ${VIEW_TAG}^..${VIEW_TAG}
git ls-files -- ${LIBBPF_VIEW_PATHS[@]} > ${TMP_DIR}/linux-view.ls
cd ${WORKDIR} && cd ${LIBBPF_REPO}
GITHUB_ABS_DIR=$(pwd)
cd_to ${LIBBPF_REPO}
git ls-files -- ${LIBBPF_VIEW_PATHS[@]} | grep -v -E "${LIBBPF_VIEW_EXCLUDE_REGEX}" > ${TMP_DIR}/github-view.ls
echo "Comparing list of files..."
diff ${TMP_DIR}/linux-view.ls ${TMP_DIR}/github-view.ls
diff -u ${TMP_DIR}/linux-view.ls ${TMP_DIR}/github-view.ls
echo "Comparing file contents..."
CONSISTENT=1
for F in $(cat ${TMP_DIR}/linux-view.ls); do
diff "${LINUX_ABS_DIR}/${F}" "${GITHUB_ABS_DIR}/${F}"
if ! diff -u "${LINUX_ABS_DIR}/${F}" "${GITHUB_ABS_DIR}/${F}"; then
echo "${LINUX_ABS_DIR}/${F} and ${GITHUB_ABS_DIR}/${F} are different!"
CONSISTENT=0
fi
done
echo "Contents appear identical!"
if ((${CONSISTENT} == 1)); then
echo "Great! Content is identical!"
else
echo "Unfortunately, there are consistency problems!"
if ((${IGNORE_CONSISTENCY-0} != 1)); then
exit 4
fi
fi
echo "Cleaning up..."
rm -r ${TMP_DIR}
cd ${WORKDIR} && cd ${LINUX_REPO}
cd_to ${LINUX_REPO}
git checkout ${TIP_SYM_REF}
git branch -D ${BASELINE_TAG} ${TIP_TAG} ${SQUASH_BASE_TAG} ${SQUASH_TIP_TAG} ${VIEW_TAG}
git branch -D ${BASELINE_TAG} ${TIP_TAG} ${BPF_BASELINE_TAG} ${BPF_TIP_TAG} \
${SQUASH_BASE_TAG} ${SQUASH_TIP_TAG} ${VIEW_TAG}
cd ${WORKDIR}
cd_to .
echo "DONE."

4
src/.gitignore vendored
View File

@@ -1,2 +1,6 @@
*.o
*.a
/libbpf.pc
/libbpf.so*
/staticobjs
/sharedobjs

View File

@@ -1,10 +1,9 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
VERSION = 0
PATCHLEVEL = 0
EXTRAVERSION = 3
LIBBPF_VERSION = $(VERSION).$(PATCHLEVEL).$(EXTRAVERSION)
LIBBPF_VERSION := $(shell \
grep -oE '^LIBBPF_([0-9.]+)' libbpf.map | \
sort -rV | head -n1 | cut -d'_' -f2)
LIBBPF_MAJOR_VERSION := $(firstword $(subst ., ,$(LIBBPF_VERSION)))
TOPDIR = ..
@@ -16,15 +15,13 @@ ifneq ($(FEATURE_REALLOCARRAY),)
ALL_CFLAGS += -DCOMPAT_NEED_REALLOCARRAY
endif
ifndef BUILD_STATIC_ONLY
ALL_CFLAGS += -fPIC -fvisibility=hidden
endif
SHARED_CFLAGS += -fPIC -fvisibility=hidden -DSHARED
CFLAGS ?= -g -O2 -Werror -Wall
ALL_CFLAGS += $(CFLAGS)
ALL_CFLAGS += $(CFLAGS) -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64
ALL_LDFLAGS += $(LDFLAGS)
ifdef NO_PKG_CONFIG
ALL_LDFLAGS += -lelf
ALL_LDFLAGS += -lelf -lz
else
PKG_CONFIG ?= pkg-config
ALL_CFLAGS += $(shell $(PKG_CONFIG) --cflags libelf)
@@ -32,21 +29,27 @@ else
endif
OBJDIR ?= .
SHARED_OBJDIR := $(OBJDIR)/sharedobjs
STATIC_OBJDIR := $(OBJDIR)/staticobjs
OBJS := bpf.o btf.o libbpf.o libbpf_errno.o netlink.o \
nlattr.o str_error.o libbpf_probes.o bpf_prog_linfo.o xsk.o \
btf_dump.o hashmap.o
SHARED_OBJS := $(addprefix $(SHARED_OBJDIR)/,$(OBJS))
STATIC_OBJS := $(addprefix $(STATIC_OBJDIR)/,$(OBJS))
OBJS := $(addprefix $(OBJDIR)/,bpf.o btf.o libbpf.o libbpf_errno.o netlink.o \
nlattr.o str_error.o libbpf_probes.o bpf_prog_linfo.o xsk.o)
LIBS := $(OBJDIR)/libbpf.a
STATIC_LIBS := $(OBJDIR)/libbpf.a
ifndef BUILD_STATIC_ONLY
LIBS += $(OBJDIR)/libbpf.so \
$(OBJDIR)/libbpf.so.$(VERSION) \
$(OBJDIR)/libbpf.so.$(LIBBPF_VERSION)
SHARED_LIBS := $(OBJDIR)/libbpf.so \
$(OBJDIR)/libbpf.so.$(LIBBPF_MAJOR_VERSION) \
$(OBJDIR)/libbpf.so.$(LIBBPF_VERSION)
VERSION_SCRIPT := libbpf.map
endif
HEADERS := bpf.h libbpf.h btf.h xsk.h libbpf_util.h
UAPI_HEADERS := $(addprefix $(TOPDIR)/include/uapi/linux/,bpf.h bpf_common.h \
btf.h)
HEADERS := bpf.h libbpf.h btf.h xsk.h libbpf_util.h \
bpf_helpers.h bpf_helper_defs.h bpf_tracing.h \
bpf_endian.h bpf_core_read.h libbpf_common.h
UAPI_HEADERS := $(addprefix $(TOPDIR)/include/uapi/linux/,\
bpf.h bpf_common.h btf.h)
PC_FILE := $(OBJDIR)/libbpf.pc
@@ -65,21 +68,23 @@ LIBDIR ?= $(PREFIX)/$(LIBSUBDIR)
INCLUDEDIR ?= $(PREFIX)/include
UAPIDIR ?= $(PREFIX)/include
all: $(LIBS) $(PC_FILE)
TAGS_PROG := $(if $(shell which etags 2>/dev/null),etags,ctags)
$(OBJDIR)/libbpf.a: $(OBJS)
all: $(STATIC_LIBS) $(SHARED_LIBS) $(PC_FILE)
$(OBJDIR)/libbpf.a: $(STATIC_OBJS)
$(AR) rcs $@ $^
$(OBJDIR)/libbpf.so: $(OBJDIR)/libbpf.so.$(VERSION)
$(OBJDIR)/libbpf.so: $(OBJDIR)/libbpf.so.$(LIBBPF_MAJOR_VERSION)
ln -sf $(^F) $@
$(OBJDIR)/libbpf.so.$(VERSION): $(OBJDIR)/libbpf.so.$(LIBBPF_VERSION)
$(OBJDIR)/libbpf.so.$(LIBBPF_MAJOR_VERSION): $(OBJDIR)/libbpf.so.$(LIBBPF_VERSION)
ln -sf $(^F) $@
$(OBJDIR)/libbpf.so.$(LIBBPF_VERSION): $(OBJS)
$(CC) -shared $(ALL_LDFLAGS) -Wl,--version-script=$(VERSION_SCRIPT) \
-Wl,-soname,libbpf.so.$(VERSION) \
$^ -o $@
$(OBJDIR)/libbpf.so.$(LIBBPF_VERSION): $(SHARED_OBJS)
$(CC) -shared -Wl,--version-script=$(VERSION_SCRIPT) \
-Wl,-soname,libbpf.so.$(LIBBPF_MAJOR_VERSION) \
$^ $(ALL_LDFLAGS) -o $@
$(OBJDIR)/libbpf.pc:
sed -e "s|@PREFIX@|$(PREFIX)|" \
@@ -87,9 +92,18 @@ $(OBJDIR)/libbpf.pc:
-e "s|@VERSION@|$(LIBBPF_VERSION)|" \
< libbpf.pc.template > $@
$(OBJDIR)/%.o: %.c
$(STATIC_OBJDIR):
mkdir -p $(STATIC_OBJDIR)
$(SHARED_OBJDIR):
mkdir -p $(SHARED_OBJDIR)
$(STATIC_OBJDIR)/%.o: %.c | $(STATIC_OBJDIR)
$(CC) $(ALL_CFLAGS) $(CPPFLAGS) -c $< -o $@
$(SHARED_OBJDIR)/%.o: %.c | $(SHARED_OBJDIR)
$(CC) $(ALL_CFLAGS) $(SHARED_CFLAGS) $(CPPFLAGS) -c $< -o $@
define do_install
if [ ! -d '$(DESTDIR)$2' ]; then \
$(INSTALL) -d -m 755 '$(DESTDIR)$2'; \
@@ -106,7 +120,7 @@ define do_s_install
endef
install: all install_headers install_pkgconfig
$(call do_s_install,$(LIBS),$(LIBDIR))
$(call do_s_install,$(STATIC_LIBS) $(SHARED_LIBS),$(LIBDIR))
install_headers:
$(call do_install,$(HEADERS),$(INCLUDEDIR)/bpf,644)
@@ -120,4 +134,13 @@ install_pkgconfig: $(PC_FILE)
$(call do_install,$(PC_FILE),$(LIBDIR)/pkgconfig,644)
clean:
rm -f *.o *.a *.so *.so.* *.pc
rm -rf *.o *.a *.so *.so.* *.pc $(SHARED_OBJDIR) $(STATIC_OBJDIR)
.PHONY: cscope tags
cscope:
ls *.c *.h > cscope.files
cscope -b -q -f cscope.out
tags:
rm -f TAGS tags
ls *.c *.h | xargs $(TAGS_PROG) -a

View File

@@ -9,7 +9,8 @@ described here. It's recommended to follow these conventions whenever a
new function or type is added to keep libbpf API clean and consistent.
All types and functions provided by libbpf API should have one of the
following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``xsk_``.
following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``xsk_``,
``perf_buffer_``.
System call wrappers
--------------------

130
src/bpf.c
View File

@@ -26,10 +26,14 @@
#include <memory.h>
#include <unistd.h>
#include <asm/unistd.h>
#include <errno.h>
#include <linux/bpf.h>
#include "bpf.h"
#include "libbpf.h"
#include <errno.h>
#include "libbpf_internal.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
/*
* When building perf, unistd.h is overridden. __NR_bpf is
@@ -53,10 +57,6 @@
# endif
#endif
#ifndef min
#define min(x, y) ((x) < (y) ? (x) : (y))
#endif
static inline __u64 ptr_to_u64(const void *ptr)
{
return (__u64) (unsigned long) ptr;
@@ -98,7 +98,11 @@ int bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr)
attr.btf_key_type_id = create_attr->btf_key_type_id;
attr.btf_value_type_id = create_attr->btf_value_type_id;
attr.map_ifindex = create_attr->map_ifindex;
attr.inner_map_fd = create_attr->inner_map_fd;
if (attr.map_type == BPF_MAP_TYPE_STRUCT_OPS)
attr.btf_vmlinux_value_type_id =
create_attr->btf_vmlinux_value_type_id;
else
attr.inner_map_fd = create_attr->inner_map_fd;
return sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
}
@@ -192,7 +196,7 @@ static void *
alloc_zero_tailing_info(const void *orecord, __u32 cnt,
__u32 actual_rec_size, __u32 expected_rec_size)
{
__u64 info_len = actual_rec_size * cnt;
__u64 info_len = (__u64)actual_rec_size * cnt;
void *info, *nrecord;
int i;
@@ -231,6 +235,16 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
memset(&attr, 0, sizeof(attr));
attr.prog_type = load_attr->prog_type;
attr.expected_attach_type = load_attr->expected_attach_type;
if (attr.prog_type == BPF_PROG_TYPE_STRUCT_OPS) {
attr.attach_btf_id = load_attr->attach_btf_id;
} else if (attr.prog_type == BPF_PROG_TYPE_TRACING ||
attr.prog_type == BPF_PROG_TYPE_EXT) {
attr.attach_btf_id = load_attr->attach_btf_id;
attr.attach_prog_fd = load_attr->attach_prog_fd;
} else {
attr.prog_ifindex = load_attr->prog_ifindex;
attr.kern_version = load_attr->kern_version;
}
attr.insn_cnt = (__u32)load_attr->insns_cnt;
attr.insns = ptr_to_u64(load_attr->insns);
attr.license = ptr_to_u64(load_attr->license);
@@ -244,8 +258,6 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
attr.log_size = 0;
}
attr.kern_version = load_attr->kern_version;
attr.prog_ifindex = load_attr->prog_ifindex;
attr.prog_btf_fd = load_attr->prog_btf_fd;
attr.func_info_rec_size = load_attr->func_info_rec_size;
attr.func_info_cnt = load_attr->func_info_cnt;
@@ -256,6 +268,7 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
if (load_attr->name)
memcpy(attr.prog_name, load_attr->name,
min(strlen(load_attr->name), BPF_OBJ_NAME_LEN - 1));
attr.prog_flags = load_attr->prog_flags;
fd = sys_bpf_prog_load(&attr, sizeof(attr));
if (fd >= 0)
@@ -440,6 +453,64 @@ int bpf_map_freeze(int fd)
return sys_bpf(BPF_MAP_FREEZE, &attr, sizeof(attr));
}
static int bpf_map_batch_common(int cmd, int fd, void *in_batch,
void *out_batch, void *keys, void *values,
__u32 *count,
const struct bpf_map_batch_opts *opts)
{
union bpf_attr attr;
int ret;
if (!OPTS_VALID(opts, bpf_map_batch_opts))
return -EINVAL;
memset(&attr, 0, sizeof(attr));
attr.batch.map_fd = fd;
attr.batch.in_batch = ptr_to_u64(in_batch);
attr.batch.out_batch = ptr_to_u64(out_batch);
attr.batch.keys = ptr_to_u64(keys);
attr.batch.values = ptr_to_u64(values);
attr.batch.count = *count;
attr.batch.elem_flags = OPTS_GET(opts, elem_flags, 0);
attr.batch.flags = OPTS_GET(opts, flags, 0);
ret = sys_bpf(cmd, &attr, sizeof(attr));
*count = attr.batch.count;
return ret;
}
int bpf_map_delete_batch(int fd, void *keys, __u32 *count,
const struct bpf_map_batch_opts *opts)
{
return bpf_map_batch_common(BPF_MAP_DELETE_BATCH, fd, NULL,
NULL, keys, NULL, count, opts);
}
int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch, void *keys,
void *values, __u32 *count,
const struct bpf_map_batch_opts *opts)
{
return bpf_map_batch_common(BPF_MAP_LOOKUP_BATCH, fd, in_batch,
out_batch, keys, values, count, opts);
}
int bpf_map_lookup_and_delete_batch(int fd, void *in_batch, void *out_batch,
void *keys, void *values, __u32 *count,
const struct bpf_map_batch_opts *opts)
{
return bpf_map_batch_common(BPF_MAP_LOOKUP_AND_DELETE_BATCH,
fd, in_batch, out_batch, keys, values,
count, opts);
}
int bpf_map_update_batch(int fd, void *keys, void *values, __u32 *count,
const struct bpf_map_batch_opts *opts)
{
return bpf_map_batch_common(BPF_MAP_UPDATE_BATCH, fd, NULL, NULL,
keys, values, count, opts);
}
int bpf_obj_pin(int fd, const char *pathname)
{
union bpf_attr attr;
@@ -463,14 +534,29 @@ int bpf_obj_get(const char *pathname)
int bpf_prog_attach(int prog_fd, int target_fd, enum bpf_attach_type type,
unsigned int flags)
{
DECLARE_LIBBPF_OPTS(bpf_prog_attach_opts, opts,
.flags = flags,
);
return bpf_prog_attach_xattr(prog_fd, target_fd, type, &opts);
}
int bpf_prog_attach_xattr(int prog_fd, int target_fd,
enum bpf_attach_type type,
const struct bpf_prog_attach_opts *opts)
{
union bpf_attr attr;
if (!OPTS_VALID(opts, bpf_prog_attach_opts))
return -EINVAL;
memset(&attr, 0, sizeof(attr));
attr.target_fd = target_fd;
attr.attach_bpf_fd = prog_fd;
attr.attach_type = type;
attr.attach_flags = flags;
attr.attach_flags = OPTS_GET(opts, flags, 0);
attr.replace_bpf_fd = OPTS_GET(opts, replace_prog_fd, 0);
return sys_bpf(BPF_PROG_ATTACH, &attr, sizeof(attr));
}
@@ -570,7 +656,7 @@ int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr)
return ret;
}
int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id)
static int bpf_obj_get_next_id(__u32 start_id, __u32 *next_id, int cmd)
{
union bpf_attr attr;
int err;
@@ -578,26 +664,26 @@ int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id)
memset(&attr, 0, sizeof(attr));
attr.start_id = start_id;
err = sys_bpf(BPF_PROG_GET_NEXT_ID, &attr, sizeof(attr));
err = sys_bpf(cmd, &attr, sizeof(attr));
if (!err)
*next_id = attr.next_id;
return err;
}
int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id)
{
return bpf_obj_get_next_id(start_id, next_id, BPF_PROG_GET_NEXT_ID);
}
int bpf_map_get_next_id(__u32 start_id, __u32 *next_id)
{
union bpf_attr attr;
int err;
return bpf_obj_get_next_id(start_id, next_id, BPF_MAP_GET_NEXT_ID);
}
memset(&attr, 0, sizeof(attr));
attr.start_id = start_id;
err = sys_bpf(BPF_MAP_GET_NEXT_ID, &attr, sizeof(attr));
if (!err)
*next_id = attr.next_id;
return err;
int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id)
{
return bpf_obj_get_next_id(start_id, next_id, BPF_BTF_GET_NEXT_ID);
}
int bpf_prog_get_fd_by_id(__u32 id)

View File

@@ -28,14 +28,12 @@
#include <stddef.h>
#include <stdint.h>
#include "libbpf_common.h"
#ifdef __cplusplus
extern "C" {
#endif
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
#endif
struct bpf_create_map_attr {
const char *name;
enum bpf_map_type map_type;
@@ -48,7 +46,10 @@ struct bpf_create_map_attr {
__u32 btf_key_type_id;
__u32 btf_value_type_id;
__u32 map_ifindex;
__u32 inner_map_fd;
union {
__u32 inner_map_fd;
__u32 btf_vmlinux_value_type_id;
};
};
LIBBPF_API int
@@ -77,8 +78,14 @@ struct bpf_load_program_attr {
const struct bpf_insn *insns;
size_t insns_cnt;
const char *license;
__u32 kern_version;
__u32 prog_ifindex;
union {
__u32 kern_version;
__u32 attach_prog_fd;
};
union {
__u32 prog_ifindex;
__u32 attach_btf_id;
};
__u32 prog_btf_fd;
__u32 func_info_rec_size;
const void *func_info;
@@ -87,6 +94,7 @@ struct bpf_load_program_attr {
const void *line_info;
__u32 line_info_cnt;
__u32 log_level;
__u32 prog_flags;
};
/* Flags to direct loading requirements */
@@ -119,10 +127,43 @@ LIBBPF_API int bpf_map_lookup_and_delete_elem(int fd, const void *key,
LIBBPF_API int bpf_map_delete_elem(int fd, const void *key);
LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key);
LIBBPF_API int bpf_map_freeze(int fd);
struct bpf_map_batch_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u64 elem_flags;
__u64 flags;
};
#define bpf_map_batch_opts__last_field flags
LIBBPF_API int bpf_map_delete_batch(int fd, void *keys,
__u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch,
void *keys, void *values, __u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_map_lookup_and_delete_batch(int fd, void *in_batch,
void *out_batch, void *keys,
void *values, __u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_map_update_batch(int fd, void *keys, void *values,
__u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_obj_pin(int fd, const char *pathname);
LIBBPF_API int bpf_obj_get(const char *pathname);
struct bpf_prog_attach_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
unsigned int flags;
int replace_prog_fd;
};
#define bpf_prog_attach_opts__last_field replace_prog_fd
LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd,
enum bpf_attach_type type, unsigned int flags);
LIBBPF_API int bpf_prog_attach_xattr(int prog_fd, int attachable_fd,
enum bpf_attach_type type,
const struct bpf_prog_attach_opts *opts);
LIBBPF_API int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);
LIBBPF_API int bpf_prog_detach2(int prog_fd, int attachable_fd,
enum bpf_attach_type type);
@@ -155,6 +196,7 @@ LIBBPF_API int bpf_prog_test_run(int prog_fd, int repeat, void *data,
__u32 *retval, __u32 *duration);
LIBBPF_API int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_map_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_prog_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_map_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);

263
src/bpf_core_read.h Normal file
View File

@@ -0,0 +1,263 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
#ifndef __BPF_CORE_READ_H__
#define __BPF_CORE_READ_H__
/*
* enum bpf_field_info_kind is passed as a second argument into
* __builtin_preserve_field_info() built-in to get a specific aspect of
* a field, captured as a first argument. __builtin_preserve_field_info(field,
* info_kind) returns __u32 integer and produces BTF field relocation, which
* is understood and processed by libbpf during BPF object loading. See
* selftests/bpf for examples.
*/
enum bpf_field_info_kind {
BPF_FIELD_BYTE_OFFSET = 0, /* field byte offset */
BPF_FIELD_BYTE_SIZE = 1,
BPF_FIELD_EXISTS = 2, /* field existence in target kernel */
BPF_FIELD_SIGNED = 3,
BPF_FIELD_LSHIFT_U64 = 4,
BPF_FIELD_RSHIFT_U64 = 5,
};
#define __CORE_RELO(src, field, info) \
__builtin_preserve_field_info((src)->field, BPF_FIELD_##info)
#if __BYTE_ORDER == __LITTLE_ENDIAN
#define __CORE_BITFIELD_PROBE_READ(dst, src, fld) \
bpf_probe_read((void *)dst, \
__CORE_RELO(src, fld, BYTE_SIZE), \
(const void *)src + __CORE_RELO(src, fld, BYTE_OFFSET))
#else
/* semantics of LSHIFT_64 assumes loading values into low-ordered bytes, so
* for big-endian we need to adjust destination pointer accordingly, based on
* field byte size
*/
#define __CORE_BITFIELD_PROBE_READ(dst, src, fld) \
bpf_probe_read((void *)dst + (8 - __CORE_RELO(src, fld, BYTE_SIZE)), \
__CORE_RELO(src, fld, BYTE_SIZE), \
(const void *)src + __CORE_RELO(src, fld, BYTE_OFFSET))
#endif
/*
* Extract bitfield, identified by s->field, and return its value as u64.
* All this is done in relocatable manner, so bitfield changes such as
* signedness, bit size, offset changes, this will be handled automatically.
* This version of macro is using bpf_probe_read() to read underlying integer
* storage. Macro functions as an expression and its return type is
* bpf_probe_read()'s return value: 0, on success, <0 on error.
*/
#define BPF_CORE_READ_BITFIELD_PROBED(s, field) ({ \
unsigned long long val = 0; \
\
__CORE_BITFIELD_PROBE_READ(&val, s, field); \
val <<= __CORE_RELO(s, field, LSHIFT_U64); \
if (__CORE_RELO(s, field, SIGNED)) \
val = ((long long)val) >> __CORE_RELO(s, field, RSHIFT_U64); \
else \
val = val >> __CORE_RELO(s, field, RSHIFT_U64); \
val; \
})
/*
* Extract bitfield, identified by s->field, and return its value as u64.
* This version of macro is using direct memory reads and should be used from
* BPF program types that support such functionality (e.g., typed raw
* tracepoints).
*/
#define BPF_CORE_READ_BITFIELD(s, field) ({ \
const void *p = (const void *)s + __CORE_RELO(s, field, BYTE_OFFSET); \
unsigned long long val; \
\
switch (__CORE_RELO(s, field, BYTE_SIZE)) { \
case 1: val = *(const unsigned char *)p; \
case 2: val = *(const unsigned short *)p; \
case 4: val = *(const unsigned int *)p; \
case 8: val = *(const unsigned long long *)p; \
} \
val <<= __CORE_RELO(s, field, LSHIFT_U64); \
if (__CORE_RELO(s, field, SIGNED)) \
val = ((long long)val) >> __CORE_RELO(s, field, RSHIFT_U64); \
else \
val = val >> __CORE_RELO(s, field, RSHIFT_U64); \
val; \
})
/*
* Convenience macro to check that field actually exists in target kernel's.
* Returns:
* 1, if matching field is present in target kernel;
* 0, if no matching field found.
*/
#define bpf_core_field_exists(field) \
__builtin_preserve_field_info(field, BPF_FIELD_EXISTS)
/*
* Convenience macro to get byte size of a field. Works for integers,
* struct/unions, pointers, arrays, and enums.
*/
#define bpf_core_field_size(field) \
__builtin_preserve_field_info(field, BPF_FIELD_BYTE_SIZE)
/*
* bpf_core_read() abstracts away bpf_probe_read() call and captures offset
* relocation for source address using __builtin_preserve_access_index()
* built-in, provided by Clang.
*
* __builtin_preserve_access_index() takes as an argument an expression of
* taking an address of a field within struct/union. It makes compiler emit
* a relocation, which records BTF type ID describing root struct/union and an
* accessor string which describes exact embedded field that was used to take
* an address. See detailed description of this relocation format and
* semantics in comments to struct bpf_field_reloc in libbpf_internal.h.
*
* This relocation allows libbpf to adjust BPF instruction to use correct
* actual field offset, based on target kernel BTF type that matches original
* (local) BTF, used to record relocation.
*/
#define bpf_core_read(dst, sz, src) \
bpf_probe_read(dst, sz, \
(const void *)__builtin_preserve_access_index(src))
/*
* bpf_core_read_str() is a thin wrapper around bpf_probe_read_str()
* additionally emitting BPF CO-RE field relocation for specified source
* argument.
*/
#define bpf_core_read_str(dst, sz, src) \
bpf_probe_read_str(dst, sz, \
(const void *)__builtin_preserve_access_index(src))
#define ___concat(a, b) a ## b
#define ___apply(fn, n) ___concat(fn, n)
#define ___nth(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, __11, N, ...) N
/*
* return number of provided arguments; used for switch-based variadic macro
* definitions (see ___last, ___arrow, etc below)
*/
#define ___narg(...) ___nth(_, ##__VA_ARGS__, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
/*
* return 0 if no arguments are passed, N - otherwise; used for
* recursively-defined macros to specify termination (0) case, and generic
* (N) case (e.g., ___read_ptrs, ___core_read)
*/
#define ___empty(...) ___nth(_, ##__VA_ARGS__, N, N, N, N, N, N, N, N, N, N, 0)
#define ___last1(x) x
#define ___last2(a, x) x
#define ___last3(a, b, x) x
#define ___last4(a, b, c, x) x
#define ___last5(a, b, c, d, x) x
#define ___last6(a, b, c, d, e, x) x
#define ___last7(a, b, c, d, e, f, x) x
#define ___last8(a, b, c, d, e, f, g, x) x
#define ___last9(a, b, c, d, e, f, g, h, x) x
#define ___last10(a, b, c, d, e, f, g, h, i, x) x
#define ___last(...) ___apply(___last, ___narg(__VA_ARGS__))(__VA_ARGS__)
#define ___nolast2(a, _) a
#define ___nolast3(a, b, _) a, b
#define ___nolast4(a, b, c, _) a, b, c
#define ___nolast5(a, b, c, d, _) a, b, c, d
#define ___nolast6(a, b, c, d, e, _) a, b, c, d, e
#define ___nolast7(a, b, c, d, e, f, _) a, b, c, d, e, f
#define ___nolast8(a, b, c, d, e, f, g, _) a, b, c, d, e, f, g
#define ___nolast9(a, b, c, d, e, f, g, h, _) a, b, c, d, e, f, g, h
#define ___nolast10(a, b, c, d, e, f, g, h, i, _) a, b, c, d, e, f, g, h, i
#define ___nolast(...) ___apply(___nolast, ___narg(__VA_ARGS__))(__VA_ARGS__)
#define ___arrow1(a) a
#define ___arrow2(a, b) a->b
#define ___arrow3(a, b, c) a->b->c
#define ___arrow4(a, b, c, d) a->b->c->d
#define ___arrow5(a, b, c, d, e) a->b->c->d->e
#define ___arrow6(a, b, c, d, e, f) a->b->c->d->e->f
#define ___arrow7(a, b, c, d, e, f, g) a->b->c->d->e->f->g
#define ___arrow8(a, b, c, d, e, f, g, h) a->b->c->d->e->f->g->h
#define ___arrow9(a, b, c, d, e, f, g, h, i) a->b->c->d->e->f->g->h->i
#define ___arrow10(a, b, c, d, e, f, g, h, i, j) a->b->c->d->e->f->g->h->i->j
#define ___arrow(...) ___apply(___arrow, ___narg(__VA_ARGS__))(__VA_ARGS__)
#define ___type(...) typeof(___arrow(__VA_ARGS__))
#define ___read(read_fn, dst, src_type, src, accessor) \
read_fn((void *)(dst), sizeof(*(dst)), &((src_type)(src))->accessor)
/* "recursively" read a sequence of inner pointers using local __t var */
#define ___rd_first(src, a) ___read(bpf_core_read, &__t, ___type(src), src, a);
#define ___rd_last(...) \
___read(bpf_core_read, &__t, \
___type(___nolast(__VA_ARGS__)), __t, ___last(__VA_ARGS__));
#define ___rd_p1(...) const void *__t; ___rd_first(__VA_ARGS__)
#define ___rd_p2(...) ___rd_p1(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p3(...) ___rd_p2(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p4(...) ___rd_p3(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p5(...) ___rd_p4(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p6(...) ___rd_p5(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p7(...) ___rd_p6(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p8(...) ___rd_p7(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p9(...) ___rd_p8(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___read_ptrs(src, ...) \
___apply(___rd_p, ___narg(__VA_ARGS__))(src, __VA_ARGS__)
#define ___core_read0(fn, dst, src, a) \
___read(fn, dst, ___type(src), src, a);
#define ___core_readN(fn, dst, src, ...) \
___read_ptrs(src, ___nolast(__VA_ARGS__)) \
___read(fn, dst, ___type(src, ___nolast(__VA_ARGS__)), __t, \
___last(__VA_ARGS__));
#define ___core_read(fn, dst, src, a, ...) \
___apply(___core_read, ___empty(__VA_ARGS__))(fn, dst, \
src, a, ##__VA_ARGS__)
/*
* BPF_CORE_READ_INTO() is a more performance-conscious variant of
* BPF_CORE_READ(), in which final field is read into user-provided storage.
* See BPF_CORE_READ() below for more details on general usage.
*/
#define BPF_CORE_READ_INTO(dst, src, a, ...) \
({ \
___core_read(bpf_core_read, dst, src, a, ##__VA_ARGS__) \
})
/*
* BPF_CORE_READ_STR_INTO() does same "pointer chasing" as
* BPF_CORE_READ() for intermediate pointers, but then executes (and returns
* corresponding error code) bpf_core_read_str() for final string read.
*/
#define BPF_CORE_READ_STR_INTO(dst, src, a, ...) \
({ \
___core_read(bpf_core_read_str, dst, src, a, ##__VA_ARGS__) \
})
/*
* BPF_CORE_READ() is used to simplify BPF CO-RE relocatable read, especially
* when there are few pointer chasing steps.
* E.g., what in non-BPF world (or in BPF w/ BCC) would be something like:
* int x = s->a.b.c->d.e->f->g;
* can be succinctly achieved using BPF_CORE_READ as:
* int x = BPF_CORE_READ(s, a.b.c, d.e, f, g);
*
* BPF_CORE_READ will decompose above statement into 4 bpf_core_read (BPF
* CO-RE relocatable bpf_probe_read() wrapper) calls, logically equivalent to:
* 1. const void *__t = s->a.b.c;
* 2. __t = __t->d.e;
* 3. __t = __t->f;
* 4. return __t->g;
*
* Equivalence is logical, because there is a heavy type casting/preservation
* involved, as well as all the reads are happening through bpf_probe_read()
* calls using __builtin_preserve_access_index() to emit CO-RE relocations.
*
* N.B. Only up to 9 "field accessors" are supported, which should be more
* than enough for any practical purpose.
*/
#define BPF_CORE_READ(src, a, ...) \
({ \
___type(src, a, ##__VA_ARGS__) __r; \
BPF_CORE_READ_INTO(&__r, src, a, ##__VA_ARGS__); \
__r; \
})
#endif

72
src/bpf_endian.h Normal file
View File

@@ -0,0 +1,72 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
#ifndef __BPF_ENDIAN__
#define __BPF_ENDIAN__
#include <linux/stddef.h>
#include <linux/swab.h>
/* LLVM's BPF target selects the endianness of the CPU
* it compiles on, or the user specifies (bpfel/bpfeb),
* respectively. The used __BYTE_ORDER__ is defined by
* the compiler, we cannot rely on __BYTE_ORDER from
* libc headers, since it doesn't reflect the actual
* requested byte order.
*
* Note, LLVM's BPF target has different __builtin_bswapX()
* semantics. It does map to BPF_ALU | BPF_END | BPF_TO_BE
* in bpfel and bpfeb case, which means below, that we map
* to cpu_to_be16(). We could use it unconditionally in BPF
* case, but better not rely on it, so that this header here
* can be used from application and BPF program side, which
* use different targets.
*/
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define __bpf_ntohs(x) __builtin_bswap16(x)
# define __bpf_htons(x) __builtin_bswap16(x)
# define __bpf_constant_ntohs(x) ___constant_swab16(x)
# define __bpf_constant_htons(x) ___constant_swab16(x)
# define __bpf_ntohl(x) __builtin_bswap32(x)
# define __bpf_htonl(x) __builtin_bswap32(x)
# define __bpf_constant_ntohl(x) ___constant_swab32(x)
# define __bpf_constant_htonl(x) ___constant_swab32(x)
# define __bpf_be64_to_cpu(x) __builtin_bswap64(x)
# define __bpf_cpu_to_be64(x) __builtin_bswap64(x)
# define __bpf_constant_be64_to_cpu(x) ___constant_swab64(x)
# define __bpf_constant_cpu_to_be64(x) ___constant_swab64(x)
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
# define __bpf_ntohs(x) (x)
# define __bpf_htons(x) (x)
# define __bpf_constant_ntohs(x) (x)
# define __bpf_constant_htons(x) (x)
# define __bpf_ntohl(x) (x)
# define __bpf_htonl(x) (x)
# define __bpf_constant_ntohl(x) (x)
# define __bpf_constant_htonl(x) (x)
# define __bpf_be64_to_cpu(x) (x)
# define __bpf_cpu_to_be64(x) (x)
# define __bpf_constant_be64_to_cpu(x) (x)
# define __bpf_constant_cpu_to_be64(x) (x)
#else
# error "Fix your compiler's __BYTE_ORDER__?!"
#endif
#define bpf_htons(x) \
(__builtin_constant_p(x) ? \
__bpf_constant_htons(x) : __bpf_htons(x))
#define bpf_ntohs(x) \
(__builtin_constant_p(x) ? \
__bpf_constant_ntohs(x) : __bpf_ntohs(x))
#define bpf_htonl(x) \
(__builtin_constant_p(x) ? \
__bpf_constant_htonl(x) : __bpf_htonl(x))
#define bpf_ntohl(x) \
(__builtin_constant_p(x) ? \
__bpf_constant_ntohl(x) : __bpf_ntohl(x))
#define bpf_cpu_to_be64(x) \
(__builtin_constant_p(x) ? \
__bpf_constant_cpu_to_be64(x) : __bpf_cpu_to_be64(x))
#define bpf_be64_to_cpu(x) \
(__builtin_constant_p(x) ? \
__bpf_constant_be64_to_cpu(x) : __bpf_be64_to_cpu(x))
#endif /* __BPF_ENDIAN__ */

2799
src/bpf_helper_defs.h Normal file

File diff suppressed because it is too large Load Diff

58
src/bpf_helpers.h Normal file
View File

@@ -0,0 +1,58 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
#ifndef __BPF_HELPERS__
#define __BPF_HELPERS__
#include "bpf_helper_defs.h"
#define __uint(name, val) int (*name)[val]
#define __type(name, val) typeof(val) *name
/* Helper macro to print out debug messages */
#define bpf_printk(fmt, ...) \
({ \
char ____fmt[] = fmt; \
bpf_trace_printk(____fmt, sizeof(____fmt), \
##__VA_ARGS__); \
})
/*
* Helper macro to place programs, maps, license in
* different sections in elf_bpf file. Section names
* are interpreted by elf_bpf loader
*/
#define SEC(NAME) __attribute__((section(NAME), used))
#ifndef __always_inline
#define __always_inline __attribute__((always_inline))
#endif
#ifndef __weak
#define __weak __attribute__((weak))
#endif
/*
* Helper structure used by eBPF C program
* to describe BPF map attributes to libbpf loader
*/
struct bpf_map_def {
unsigned int type;
unsigned int key_size;
unsigned int value_size;
unsigned int max_entries;
unsigned int map_flags;
};
enum libbpf_pin_type {
LIBBPF_PIN_NONE,
/* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
LIBBPF_PIN_BY_NAME,
};
enum libbpf_tristate {
TRI_NO = 0,
TRI_YES = 1,
TRI_MODULE = 2,
};
#define __kconfig __attribute__((section(".kconfig")))
#endif

View File

@@ -6,10 +6,10 @@
#include <linux/err.h>
#include <linux/bpf.h>
#include "libbpf.h"
#include "libbpf_internal.h"
#ifndef min
#define min(x, y) ((x) < (y) ? (x) : (y))
#endif
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
struct bpf_prog_linfo {
void *raw_linfo;
@@ -104,6 +104,7 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)
{
struct bpf_prog_linfo *prog_linfo;
__u32 nr_linfo, nr_jited_func;
__u64 data_sz;
nr_linfo = info->nr_line_info;
@@ -125,11 +126,11 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)
/* Copy xlated line_info */
prog_linfo->nr_linfo = nr_linfo;
prog_linfo->rec_size = info->line_info_rec_size;
prog_linfo->raw_linfo = malloc(nr_linfo * prog_linfo->rec_size);
data_sz = (__u64)nr_linfo * prog_linfo->rec_size;
prog_linfo->raw_linfo = malloc(data_sz);
if (!prog_linfo->raw_linfo)
goto err_free;
memcpy(prog_linfo->raw_linfo, (void *)(long)info->line_info,
nr_linfo * prog_linfo->rec_size);
memcpy(prog_linfo->raw_linfo, (void *)(long)info->line_info, data_sz);
nr_jited_func = info->nr_jited_ksyms;
if (!nr_jited_func ||
@@ -145,13 +146,12 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)
/* Copy jited_line_info */
prog_linfo->nr_jited_func = nr_jited_func;
prog_linfo->jited_rec_size = info->jited_line_info_rec_size;
prog_linfo->raw_jited_linfo = malloc(nr_linfo *
prog_linfo->jited_rec_size);
data_sz = (__u64)nr_linfo * prog_linfo->jited_rec_size;
prog_linfo->raw_jited_linfo = malloc(data_sz);
if (!prog_linfo->raw_jited_linfo)
goto err_free;
memcpy(prog_linfo->raw_jited_linfo,
(void *)(long)info->jited_line_info,
nr_linfo * prog_linfo->jited_rec_size);
(void *)(long)info->jited_line_info, data_sz);
/* Number of jited_line_info per jited func */
prog_linfo->nr_jited_linfo_per_func = malloc(nr_jited_func *

195
src/bpf_tracing.h Normal file
View File

@@ -0,0 +1,195 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
#ifndef __BPF_TRACING_H__
#define __BPF_TRACING_H__
/* Scan the ARCH passed in from ARCH env variable (see Makefile) */
#if defined(__TARGET_ARCH_x86)
#define bpf_target_x86
#define bpf_target_defined
#elif defined(__TARGET_ARCH_s390)
#define bpf_target_s390
#define bpf_target_defined
#elif defined(__TARGET_ARCH_arm)
#define bpf_target_arm
#define bpf_target_defined
#elif defined(__TARGET_ARCH_arm64)
#define bpf_target_arm64
#define bpf_target_defined
#elif defined(__TARGET_ARCH_mips)
#define bpf_target_mips
#define bpf_target_defined
#elif defined(__TARGET_ARCH_powerpc)
#define bpf_target_powerpc
#define bpf_target_defined
#elif defined(__TARGET_ARCH_sparc)
#define bpf_target_sparc
#define bpf_target_defined
#else
#undef bpf_target_defined
#endif
/* Fall back to what the compiler says */
#ifndef bpf_target_defined
#if defined(__x86_64__)
#define bpf_target_x86
#elif defined(__s390__)
#define bpf_target_s390
#elif defined(__arm__)
#define bpf_target_arm
#elif defined(__aarch64__)
#define bpf_target_arm64
#elif defined(__mips__)
#define bpf_target_mips
#elif defined(__powerpc__)
#define bpf_target_powerpc
#elif defined(__sparc__)
#define bpf_target_sparc
#endif
#endif
#if defined(bpf_target_x86)
#ifdef __KERNEL__
#define PT_REGS_PARM1(x) ((x)->di)
#define PT_REGS_PARM2(x) ((x)->si)
#define PT_REGS_PARM3(x) ((x)->dx)
#define PT_REGS_PARM4(x) ((x)->cx)
#define PT_REGS_PARM5(x) ((x)->r8)
#define PT_REGS_RET(x) ((x)->sp)
#define PT_REGS_FP(x) ((x)->bp)
#define PT_REGS_RC(x) ((x)->ax)
#define PT_REGS_SP(x) ((x)->sp)
#define PT_REGS_IP(x) ((x)->ip)
#else
#ifdef __i386__
/* i386 kernel is built with -mregparm=3 */
#define PT_REGS_PARM1(x) ((x)->eax)
#define PT_REGS_PARM2(x) ((x)->edx)
#define PT_REGS_PARM3(x) ((x)->ecx)
#define PT_REGS_PARM4(x) 0
#define PT_REGS_PARM5(x) 0
#define PT_REGS_RET(x) ((x)->esp)
#define PT_REGS_FP(x) ((x)->ebp)
#define PT_REGS_RC(x) ((x)->eax)
#define PT_REGS_SP(x) ((x)->esp)
#define PT_REGS_IP(x) ((x)->eip)
#else
#define PT_REGS_PARM1(x) ((x)->rdi)
#define PT_REGS_PARM2(x) ((x)->rsi)
#define PT_REGS_PARM3(x) ((x)->rdx)
#define PT_REGS_PARM4(x) ((x)->rcx)
#define PT_REGS_PARM5(x) ((x)->r8)
#define PT_REGS_RET(x) ((x)->rsp)
#define PT_REGS_FP(x) ((x)->rbp)
#define PT_REGS_RC(x) ((x)->rax)
#define PT_REGS_SP(x) ((x)->rsp)
#define PT_REGS_IP(x) ((x)->rip)
#endif
#endif
#elif defined(bpf_target_s390)
/* s390 provides user_pt_regs instead of struct pt_regs to userspace */
struct pt_regs;
#define PT_REGS_S390 const volatile user_pt_regs
#define PT_REGS_PARM1(x) (((PT_REGS_S390 *)(x))->gprs[2])
#define PT_REGS_PARM2(x) (((PT_REGS_S390 *)(x))->gprs[3])
#define PT_REGS_PARM3(x) (((PT_REGS_S390 *)(x))->gprs[4])
#define PT_REGS_PARM4(x) (((PT_REGS_S390 *)(x))->gprs[5])
#define PT_REGS_PARM5(x) (((PT_REGS_S390 *)(x))->gprs[6])
#define PT_REGS_RET(x) (((PT_REGS_S390 *)(x))->gprs[14])
/* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_FP(x) (((PT_REGS_S390 *)(x))->gprs[11])
#define PT_REGS_RC(x) (((PT_REGS_S390 *)(x))->gprs[2])
#define PT_REGS_SP(x) (((PT_REGS_S390 *)(x))->gprs[15])
#define PT_REGS_IP(x) (((PT_REGS_S390 *)(x))->psw.addr)
#elif defined(bpf_target_arm)
#define PT_REGS_PARM1(x) ((x)->uregs[0])
#define PT_REGS_PARM2(x) ((x)->uregs[1])
#define PT_REGS_PARM3(x) ((x)->uregs[2])
#define PT_REGS_PARM4(x) ((x)->uregs[3])
#define PT_REGS_PARM5(x) ((x)->uregs[4])
#define PT_REGS_RET(x) ((x)->uregs[14])
#define PT_REGS_FP(x) ((x)->uregs[11]) /* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_RC(x) ((x)->uregs[0])
#define PT_REGS_SP(x) ((x)->uregs[13])
#define PT_REGS_IP(x) ((x)->uregs[12])
#elif defined(bpf_target_arm64)
/* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */
struct pt_regs;
#define PT_REGS_ARM64 const volatile struct user_pt_regs
#define PT_REGS_PARM1(x) (((PT_REGS_ARM64 *)(x))->regs[0])
#define PT_REGS_PARM2(x) (((PT_REGS_ARM64 *)(x))->regs[1])
#define PT_REGS_PARM3(x) (((PT_REGS_ARM64 *)(x))->regs[2])
#define PT_REGS_PARM4(x) (((PT_REGS_ARM64 *)(x))->regs[3])
#define PT_REGS_PARM5(x) (((PT_REGS_ARM64 *)(x))->regs[4])
#define PT_REGS_RET(x) (((PT_REGS_ARM64 *)(x))->regs[30])
/* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_FP(x) (((PT_REGS_ARM64 *)(x))->regs[29])
#define PT_REGS_RC(x) (((PT_REGS_ARM64 *)(x))->regs[0])
#define PT_REGS_SP(x) (((PT_REGS_ARM64 *)(x))->sp)
#define PT_REGS_IP(x) (((PT_REGS_ARM64 *)(x))->pc)
#elif defined(bpf_target_mips)
#define PT_REGS_PARM1(x) ((x)->regs[4])
#define PT_REGS_PARM2(x) ((x)->regs[5])
#define PT_REGS_PARM3(x) ((x)->regs[6])
#define PT_REGS_PARM4(x) ((x)->regs[7])
#define PT_REGS_PARM5(x) ((x)->regs[8])
#define PT_REGS_RET(x) ((x)->regs[31])
#define PT_REGS_FP(x) ((x)->regs[30]) /* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_RC(x) ((x)->regs[1])
#define PT_REGS_SP(x) ((x)->regs[29])
#define PT_REGS_IP(x) ((x)->cp0_epc)
#elif defined(bpf_target_powerpc)
#define PT_REGS_PARM1(x) ((x)->gpr[3])
#define PT_REGS_PARM2(x) ((x)->gpr[4])
#define PT_REGS_PARM3(x) ((x)->gpr[5])
#define PT_REGS_PARM4(x) ((x)->gpr[6])
#define PT_REGS_PARM5(x) ((x)->gpr[7])
#define PT_REGS_RC(x) ((x)->gpr[3])
#define PT_REGS_SP(x) ((x)->sp)
#define PT_REGS_IP(x) ((x)->nip)
#elif defined(bpf_target_sparc)
#define PT_REGS_PARM1(x) ((x)->u_regs[UREG_I0])
#define PT_REGS_PARM2(x) ((x)->u_regs[UREG_I1])
#define PT_REGS_PARM3(x) ((x)->u_regs[UREG_I2])
#define PT_REGS_PARM4(x) ((x)->u_regs[UREG_I3])
#define PT_REGS_PARM5(x) ((x)->u_regs[UREG_I4])
#define PT_REGS_RET(x) ((x)->u_regs[UREG_I7])
#define PT_REGS_RC(x) ((x)->u_regs[UREG_I0])
#define PT_REGS_SP(x) ((x)->u_regs[UREG_FP])
/* Should this also be a bpf_target check for the sparc case? */
#if defined(__arch64__)
#define PT_REGS_IP(x) ((x)->tpc)
#else
#define PT_REGS_IP(x) ((x)->pc)
#endif
#endif
#if defined(bpf_target_powerpc)
#define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ (ip) = (ctx)->link; })
#define BPF_KRETPROBE_READ_RET_IP BPF_KPROBE_READ_RET_IP
#elif defined(bpf_target_sparc)
#define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ (ip) = PT_REGS_RET(ctx); })
#define BPF_KRETPROBE_READ_RET_IP BPF_KPROBE_READ_RET_IP
#else
#define BPF_KPROBE_READ_RET_IP(ip, ctx) \
({ bpf_probe_read(&(ip), sizeof(ip), (void *)PT_REGS_RET(ctx)); })
#define BPF_KRETPROBE_READ_RET_IP(ip, ctx) \
({ bpf_probe_read(&(ip), sizeof(ip), \
(void *)(PT_REGS_FP(ctx) + sizeof(ip))); })
#endif
#endif

787
src/btf.c

File diff suppressed because it is too large Load Diff

235
src/btf.h
View File

@@ -4,18 +4,19 @@
#ifndef __LIBBPF_BTF_H
#define __LIBBPF_BTF_H
#include <stdarg.h>
#include <linux/btf.h>
#include <linux/types.h>
#include "libbpf_common.h"
#ifdef __cplusplus
extern "C" {
#endif
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
#endif
#define BTF_ELF_SEC ".BTF"
#define BTF_EXT_ELF_SEC ".BTF.ext"
#define MAPS_ELF_SEC ".maps"
struct btf;
struct btf_ext;
@@ -55,19 +56,28 @@ struct btf_ext_header {
__u32 func_info_len;
__u32 line_info_off;
__u32 line_info_len;
/* optional part of .BTF.ext header */
__u32 field_reloc_off;
__u32 field_reloc_len;
};
LIBBPF_API void btf__free(struct btf *btf);
LIBBPF_API struct btf *btf__new(__u8 *data, __u32 size);
LIBBPF_API struct btf *btf__parse_elf(const char *path,
struct btf_ext **btf_ext);
LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf);
LIBBPF_API int btf__load(struct btf *btf);
LIBBPF_API __s32 btf__find_by_name(const struct btf *btf,
const char *type_name);
LIBBPF_API __s32 btf__find_by_name_kind(const struct btf *btf,
const char *type_name, __u32 kind);
LIBBPF_API __u32 btf__get_nr_types(const struct btf *btf);
LIBBPF_API const struct btf_type *btf__type_by_id(const struct btf *btf,
__u32 id);
LIBBPF_API __s64 btf__resolve_size(const struct btf *btf, __u32 type_id);
LIBBPF_API int btf__resolve_type(const struct btf *btf, __u32 type_id);
LIBBPF_API int btf__align_of(const struct btf *btf, __u32 id);
LIBBPF_API int btf__fd(const struct btf *btf);
LIBBPF_API const void *btf__get_raw_data(const struct btf *btf, __u32 *size);
LIBBPF_API const char *btf__name_by_offset(const struct btf *btf, __u32 offset);
@@ -92,6 +102,8 @@ LIBBPF_API int btf_ext__reloc_line_info(const struct btf *btf,
LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
LIBBPF_API struct btf *libbpf_find_kernel_btf(void);
struct btf_dedup_opts {
unsigned int dedup_table_size;
bool dont_resolve_fwds;
@@ -100,6 +112,221 @@ struct btf_dedup_opts {
LIBBPF_API int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
const struct btf_dedup_opts *opts);
struct btf_dump;
struct btf_dump_opts {
void *ctx;
};
typedef void (*btf_dump_printf_fn_t)(void *ctx, const char *fmt, va_list args);
LIBBPF_API struct btf_dump *btf_dump__new(const struct btf *btf,
const struct btf_ext *btf_ext,
const struct btf_dump_opts *opts,
btf_dump_printf_fn_t printf_fn);
LIBBPF_API void btf_dump__free(struct btf_dump *d);
LIBBPF_API int btf_dump__dump_type(struct btf_dump *d, __u32 id);
struct btf_dump_emit_type_decl_opts {
/* size of this struct, for forward/backward compatiblity */
size_t sz;
/* optional field name for type declaration, e.g.:
* - struct my_struct <FNAME>
* - void (*<FNAME>)(int)
* - char (*<FNAME>)[123]
*/
const char *field_name;
/* extra indentation level (in number of tabs) to emit for multi-line
* type declarations (e.g., anonymous struct); applies for lines
* starting from the second one (first line is assumed to have
* necessary indentation already
*/
int indent_level;
};
#define btf_dump_emit_type_decl_opts__last_field indent_level
LIBBPF_API int
btf_dump__emit_type_decl(struct btf_dump *d, __u32 id,
const struct btf_dump_emit_type_decl_opts *opts);
/*
* A set of helpers for easier BTF types handling
*/
static inline __u16 btf_kind(const struct btf_type *t)
{
return BTF_INFO_KIND(t->info);
}
static inline __u16 btf_vlen(const struct btf_type *t)
{
return BTF_INFO_VLEN(t->info);
}
static inline bool btf_kflag(const struct btf_type *t)
{
return BTF_INFO_KFLAG(t->info);
}
static inline bool btf_is_int(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_INT;
}
static inline bool btf_is_ptr(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_PTR;
}
static inline bool btf_is_array(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_ARRAY;
}
static inline bool btf_is_struct(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_STRUCT;
}
static inline bool btf_is_union(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_UNION;
}
static inline bool btf_is_composite(const struct btf_type *t)
{
__u16 kind = btf_kind(t);
return kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION;
}
static inline bool btf_is_enum(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_ENUM;
}
static inline bool btf_is_fwd(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_FWD;
}
static inline bool btf_is_typedef(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_TYPEDEF;
}
static inline bool btf_is_volatile(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_VOLATILE;
}
static inline bool btf_is_const(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_CONST;
}
static inline bool btf_is_restrict(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_RESTRICT;
}
static inline bool btf_is_mod(const struct btf_type *t)
{
__u16 kind = btf_kind(t);
return kind == BTF_KIND_VOLATILE ||
kind == BTF_KIND_CONST ||
kind == BTF_KIND_RESTRICT;
}
static inline bool btf_is_func(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_FUNC;
}
static inline bool btf_is_func_proto(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_FUNC_PROTO;
}
static inline bool btf_is_var(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_VAR;
}
static inline bool btf_is_datasec(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_DATASEC;
}
static inline __u8 btf_int_encoding(const struct btf_type *t)
{
return BTF_INT_ENCODING(*(__u32 *)(t + 1));
}
static inline __u8 btf_int_offset(const struct btf_type *t)
{
return BTF_INT_OFFSET(*(__u32 *)(t + 1));
}
static inline __u8 btf_int_bits(const struct btf_type *t)
{
return BTF_INT_BITS(*(__u32 *)(t + 1));
}
static inline struct btf_array *btf_array(const struct btf_type *t)
{
return (struct btf_array *)(t + 1);
}
static inline struct btf_enum *btf_enum(const struct btf_type *t)
{
return (struct btf_enum *)(t + 1);
}
static inline struct btf_member *btf_members(const struct btf_type *t)
{
return (struct btf_member *)(t + 1);
}
/* Get bit offset of a member with specified index. */
static inline __u32 btf_member_bit_offset(const struct btf_type *t,
__u32 member_idx)
{
const struct btf_member *m = btf_members(t) + member_idx;
bool kflag = btf_kflag(t);
return kflag ? BTF_MEMBER_BIT_OFFSET(m->offset) : m->offset;
}
/*
* Get bitfield size of a member, assuming t is BTF_KIND_STRUCT or
* BTF_KIND_UNION. If member is not a bitfield, zero is returned.
*/
static inline __u32 btf_member_bitfield_size(const struct btf_type *t,
__u32 member_idx)
{
const struct btf_member *m = btf_members(t) + member_idx;
bool kflag = btf_kflag(t);
return kflag ? BTF_MEMBER_BITFIELD_SIZE(m->offset) : 0;
}
static inline struct btf_param *btf_params(const struct btf_type *t)
{
return (struct btf_param *)(t + 1);
}
static inline struct btf_var *btf_var(const struct btf_type *t)
{
return (struct btf_var *)(t + 1);
}
static inline struct btf_var_secinfo *
btf_var_secinfos(const struct btf_type *t)
{
return (struct btf_var_secinfo *)(t + 1);
}
#ifdef __cplusplus
} /* extern "C" */
#endif

1371
src/btf_dump.c Normal file

File diff suppressed because it is too large Load Diff

232
src/hashmap.c Normal file
View File

@@ -0,0 +1,232 @@
// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/*
* Generic non-thread safe hash map implementation.
*
* Copyright (c) 2019 Facebook
*/
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <linux/err.h>
#include "hashmap.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
/* start with 4 buckets */
#define HASHMAP_MIN_CAP_BITS 2
static void hashmap_add_entry(struct hashmap_entry **pprev,
struct hashmap_entry *entry)
{
entry->next = *pprev;
*pprev = entry;
}
static void hashmap_del_entry(struct hashmap_entry **pprev,
struct hashmap_entry *entry)
{
*pprev = entry->next;
entry->next = NULL;
}
void hashmap__init(struct hashmap *map, hashmap_hash_fn hash_fn,
hashmap_equal_fn equal_fn, void *ctx)
{
map->hash_fn = hash_fn;
map->equal_fn = equal_fn;
map->ctx = ctx;
map->buckets = NULL;
map->cap = 0;
map->cap_bits = 0;
map->sz = 0;
}
struct hashmap *hashmap__new(hashmap_hash_fn hash_fn,
hashmap_equal_fn equal_fn,
void *ctx)
{
struct hashmap *map = malloc(sizeof(struct hashmap));
if (!map)
return ERR_PTR(-ENOMEM);
hashmap__init(map, hash_fn, equal_fn, ctx);
return map;
}
void hashmap__clear(struct hashmap *map)
{
free(map->buckets);
map->cap = map->cap_bits = map->sz = 0;
}
void hashmap__free(struct hashmap *map)
{
if (!map)
return;
hashmap__clear(map);
free(map);
}
size_t hashmap__size(const struct hashmap *map)
{
return map->sz;
}
size_t hashmap__capacity(const struct hashmap *map)
{
return map->cap;
}
static bool hashmap_needs_to_grow(struct hashmap *map)
{
/* grow if empty or more than 75% filled */
return (map->cap == 0) || ((map->sz + 1) * 4 / 3 > map->cap);
}
static int hashmap_grow(struct hashmap *map)
{
struct hashmap_entry **new_buckets;
struct hashmap_entry *cur, *tmp;
size_t new_cap_bits, new_cap;
size_t h;
int bkt;
new_cap_bits = map->cap_bits + 1;
if (new_cap_bits < HASHMAP_MIN_CAP_BITS)
new_cap_bits = HASHMAP_MIN_CAP_BITS;
new_cap = 1UL << new_cap_bits;
new_buckets = calloc(new_cap, sizeof(new_buckets[0]));
if (!new_buckets)
return -ENOMEM;
hashmap__for_each_entry_safe(map, cur, tmp, bkt) {
h = hash_bits(map->hash_fn(cur->key, map->ctx), new_cap_bits);
hashmap_add_entry(&new_buckets[h], cur);
}
map->cap = new_cap;
map->cap_bits = new_cap_bits;
free(map->buckets);
map->buckets = new_buckets;
return 0;
}
static bool hashmap_find_entry(const struct hashmap *map,
const void *key, size_t hash,
struct hashmap_entry ***pprev,
struct hashmap_entry **entry)
{
struct hashmap_entry *cur, **prev_ptr;
if (!map->buckets)
return false;
for (prev_ptr = &map->buckets[hash], cur = *prev_ptr;
cur;
prev_ptr = &cur->next, cur = cur->next) {
if (map->equal_fn(cur->key, key, map->ctx)) {
if (pprev)
*pprev = prev_ptr;
*entry = cur;
return true;
}
}
return false;
}
int hashmap__insert(struct hashmap *map, const void *key, void *value,
enum hashmap_insert_strategy strategy,
const void **old_key, void **old_value)
{
struct hashmap_entry *entry;
size_t h;
int err;
if (old_key)
*old_key = NULL;
if (old_value)
*old_value = NULL;
h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits);
if (strategy != HASHMAP_APPEND &&
hashmap_find_entry(map, key, h, NULL, &entry)) {
if (old_key)
*old_key = entry->key;
if (old_value)
*old_value = entry->value;
if (strategy == HASHMAP_SET || strategy == HASHMAP_UPDATE) {
entry->key = key;
entry->value = value;
return 0;
} else if (strategy == HASHMAP_ADD) {
return -EEXIST;
}
}
if (strategy == HASHMAP_UPDATE)
return -ENOENT;
if (hashmap_needs_to_grow(map)) {
err = hashmap_grow(map);
if (err)
return err;
h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits);
}
entry = malloc(sizeof(struct hashmap_entry));
if (!entry)
return -ENOMEM;
entry->key = key;
entry->value = value;
hashmap_add_entry(&map->buckets[h], entry);
map->sz++;
return 0;
}
bool hashmap__find(const struct hashmap *map, const void *key, void **value)
{
struct hashmap_entry *entry;
size_t h;
h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits);
if (!hashmap_find_entry(map, key, h, NULL, &entry))
return false;
if (value)
*value = entry->value;
return true;
}
bool hashmap__delete(struct hashmap *map, const void *key,
const void **old_key, void **old_value)
{
struct hashmap_entry **pprev, *entry;
size_t h;
h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits);
if (!hashmap_find_entry(map, key, h, &pprev, &entry))
return false;
if (old_key)
*old_key = entry->key;
if (old_value)
*old_value = entry->value;
hashmap_del_entry(pprev, entry);
free(entry);
map->sz--;
return true;
}

178
src/hashmap.h Normal file
View File

@@ -0,0 +1,178 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/*
* Generic non-thread safe hash map implementation.
*
* Copyright (c) 2019 Facebook
*/
#ifndef __LIBBPF_HASHMAP_H
#define __LIBBPF_HASHMAP_H
#include <stdbool.h>
#include <stddef.h>
#ifdef __GLIBC__
#include <bits/wordsize.h>
#else
#include <bits/reg.h>
#endif
#include "libbpf_internal.h"
static inline size_t hash_bits(size_t h, int bits)
{
/* shuffle bits and return requested number of upper bits */
return (h * 11400714819323198485llu) >> (__WORDSIZE - bits);
}
typedef size_t (*hashmap_hash_fn)(const void *key, void *ctx);
typedef bool (*hashmap_equal_fn)(const void *key1, const void *key2, void *ctx);
struct hashmap_entry {
const void *key;
void *value;
struct hashmap_entry *next;
};
struct hashmap {
hashmap_hash_fn hash_fn;
hashmap_equal_fn equal_fn;
void *ctx;
struct hashmap_entry **buckets;
size_t cap;
size_t cap_bits;
size_t sz;
};
#define HASHMAP_INIT(hash_fn, equal_fn, ctx) { \
.hash_fn = (hash_fn), \
.equal_fn = (equal_fn), \
.ctx = (ctx), \
.buckets = NULL, \
.cap = 0, \
.cap_bits = 0, \
.sz = 0, \
}
void hashmap__init(struct hashmap *map, hashmap_hash_fn hash_fn,
hashmap_equal_fn equal_fn, void *ctx);
struct hashmap *hashmap__new(hashmap_hash_fn hash_fn,
hashmap_equal_fn equal_fn,
void *ctx);
void hashmap__clear(struct hashmap *map);
void hashmap__free(struct hashmap *map);
size_t hashmap__size(const struct hashmap *map);
size_t hashmap__capacity(const struct hashmap *map);
/*
* Hashmap insertion strategy:
* - HASHMAP_ADD - only add key/value if key doesn't exist yet;
* - HASHMAP_SET - add key/value pair if key doesn't exist yet; otherwise,
* update value;
* - HASHMAP_UPDATE - update value, if key already exists; otherwise, do
* nothing and return -ENOENT;
* - HASHMAP_APPEND - always add key/value pair, even if key already exists.
* This turns hashmap into a multimap by allowing multiple values to be
* associated with the same key. Most useful read API for such hashmap is
* hashmap__for_each_key_entry() iteration. If hashmap__find() is still
* used, it will return last inserted key/value entry (first in a bucket
* chain).
*/
enum hashmap_insert_strategy {
HASHMAP_ADD,
HASHMAP_SET,
HASHMAP_UPDATE,
HASHMAP_APPEND,
};
/*
* hashmap__insert() adds key/value entry w/ various semantics, depending on
* provided strategy value. If a given key/value pair replaced already
* existing key/value pair, both old key and old value will be returned
* through old_key and old_value to allow calling code do proper memory
* management.
*/
int hashmap__insert(struct hashmap *map, const void *key, void *value,
enum hashmap_insert_strategy strategy,
const void **old_key, void **old_value);
static inline int hashmap__add(struct hashmap *map,
const void *key, void *value)
{
return hashmap__insert(map, key, value, HASHMAP_ADD, NULL, NULL);
}
static inline int hashmap__set(struct hashmap *map,
const void *key, void *value,
const void **old_key, void **old_value)
{
return hashmap__insert(map, key, value, HASHMAP_SET,
old_key, old_value);
}
static inline int hashmap__update(struct hashmap *map,
const void *key, void *value,
const void **old_key, void **old_value)
{
return hashmap__insert(map, key, value, HASHMAP_UPDATE,
old_key, old_value);
}
static inline int hashmap__append(struct hashmap *map,
const void *key, void *value)
{
return hashmap__insert(map, key, value, HASHMAP_APPEND, NULL, NULL);
}
bool hashmap__delete(struct hashmap *map, const void *key,
const void **old_key, void **old_value);
bool hashmap__find(const struct hashmap *map, const void *key, void **value);
/*
* hashmap__for_each_entry - iterate over all entries in hashmap
* @map: hashmap to iterate
* @cur: struct hashmap_entry * used as a loop cursor
* @bkt: integer used as a bucket loop cursor
*/
#define hashmap__for_each_entry(map, cur, bkt) \
for (bkt = 0; bkt < map->cap; bkt++) \
for (cur = map->buckets[bkt]; cur; cur = cur->next)
/*
* hashmap__for_each_entry_safe - iterate over all entries in hashmap, safe
* against removals
* @map: hashmap to iterate
* @cur: struct hashmap_entry * used as a loop cursor
* @tmp: struct hashmap_entry * used as a temporary next cursor storage
* @bkt: integer used as a bucket loop cursor
*/
#define hashmap__for_each_entry_safe(map, cur, tmp, bkt) \
for (bkt = 0; bkt < map->cap; bkt++) \
for (cur = map->buckets[bkt]; \
cur && ({tmp = cur->next; true; }); \
cur = tmp)
/*
* hashmap__for_each_key_entry - iterate over entries associated with given key
* @map: hashmap to iterate
* @cur: struct hashmap_entry * used as a loop cursor
* @key: key to iterate entries for
*/
#define hashmap__for_each_key_entry(map, cur, _key) \
for (cur = ({ size_t bkt = hash_bits(map->hash_fn((_key), map->ctx),\
map->cap_bits); \
map->buckets ? map->buckets[bkt] : NULL; }); \
cur; \
cur = cur->next) \
if (map->equal_fn(cur->key, (_key), map->ctx))
#define hashmap__for_each_key_entry_safe(map, cur, tmp, _key) \
for (cur = ({ size_t bkt = hash_bits(map->hash_fn((_key), map->ctx),\
map->cap_bits); \
cur = map->buckets ? map->buckets[bkt] : NULL; }); \
cur && ({ tmp = cur->next; true; }); \
cur = tmp) \
if (map->equal_fn(cur->key, (_key), map->ctx))
#endif /* __LIBBPF_HASHMAP_H */

File diff suppressed because it is too large Load Diff

View File

@@ -17,14 +17,12 @@
#include <sys/types.h> // for size_t
#include <linux/bpf.h>
#include "libbpf_common.h"
#ifdef __cplusplus
extern "C" {
#endif
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
#endif
enum libbpf_errno {
__LIBBPF_ERRNO__START = 4000,
@@ -57,7 +55,7 @@ enum libbpf_print_level {
typedef int (*libbpf_print_fn_t)(enum libbpf_print_level level,
const char *, va_list ap);
LIBBPF_API void libbpf_set_print(libbpf_print_fn_t fn);
LIBBPF_API libbpf_print_fn_t libbpf_set_print(libbpf_print_fn_t fn);
/* Hide internal to user */
struct bpf_object;
@@ -67,18 +65,61 @@ struct bpf_object_open_attr {
enum bpf_prog_type prog_type;
};
struct bpf_object_open_opts {
/* size of this struct, for forward/backward compatiblity */
size_t sz;
/* object name override, if provided:
* - for object open from file, this will override setting object
* name from file path's base name;
* - for object open from memory buffer, this will specify an object
* name and will override default "<addr>-<buf-size>" name;
*/
const char *object_name;
/* parse map definitions non-strictly, allowing extra attributes/data */
bool relaxed_maps;
/* DEPRECATED: handle CO-RE relocations non-strictly, allowing failures.
* Value is ignored. Relocations always are processed non-strictly.
* Non-relocatable instructions are replaced with invalid ones to
* prevent accidental errors.
* */
bool relaxed_core_relocs;
/* maps that set the 'pinning' attribute in their definition will have
* their pin_path attribute set to a file in this directory, and be
* auto-pinned to that path on load; defaults to "/sys/fs/bpf".
*/
const char *pin_root_path;
__u32 attach_prog_fd;
/* Additional kernel config content that augments and overrides
* system Kconfig for CONFIG_xxx externs.
*/
const char *kconfig;
};
#define bpf_object_open_opts__last_field kconfig
LIBBPF_API struct bpf_object *bpf_object__open(const char *path);
LIBBPF_API struct bpf_object *
bpf_object__open_file(const char *path, const struct bpf_object_open_opts *opts);
LIBBPF_API struct bpf_object *
bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz,
const struct bpf_object_open_opts *opts);
/* deprecated bpf_object__open variants */
LIBBPF_API struct bpf_object *
bpf_object__open_buffer(const void *obj_buf, size_t obj_buf_sz,
const char *name);
LIBBPF_API struct bpf_object *
bpf_object__open_xattr(struct bpf_object_open_attr *attr);
struct bpf_object *__bpf_object__open_xattr(struct bpf_object_open_attr *attr,
int flags);
LIBBPF_API struct bpf_object *bpf_object__open_buffer(void *obj_buf,
size_t obj_buf_sz,
const char *name);
int bpf_object__section_size(const struct bpf_object *obj, const char *name,
__u32 *size);
int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
__u32 *off);
enum libbpf_pin_type {
LIBBPF_PIN_NONE,
/* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
LIBBPF_PIN_BY_NAME,
};
/* pin_maps and unpin_maps can both be called with a NULL path, in which case
* they will use the pin_path attribute of each map (and ignore all maps that
* don't have a pin_path set).
*/
LIBBPF_API int bpf_object__pin_maps(struct bpf_object *obj, const char *path);
LIBBPF_API int bpf_object__unpin_maps(struct bpf_object *obj,
const char *path);
@@ -89,18 +130,30 @@ LIBBPF_API int bpf_object__unpin_programs(struct bpf_object *obj,
LIBBPF_API int bpf_object__pin(struct bpf_object *object, const char *path);
LIBBPF_API void bpf_object__close(struct bpf_object *object);
struct bpf_object_load_attr {
struct bpf_object *obj;
int log_level;
const char *target_btf_path;
};
/* Load/unload object into/from kernel */
LIBBPF_API int bpf_object__load(struct bpf_object *obj);
LIBBPF_API int bpf_object__load_xattr(struct bpf_object_load_attr *attr);
LIBBPF_API int bpf_object__unload(struct bpf_object *obj);
LIBBPF_API const char *bpf_object__name(struct bpf_object *obj);
LIBBPF_API unsigned int bpf_object__kversion(struct bpf_object *obj);
LIBBPF_API const char *bpf_object__name(const struct bpf_object *obj);
LIBBPF_API unsigned int bpf_object__kversion(const struct bpf_object *obj);
struct btf;
LIBBPF_API struct btf *bpf_object__btf(struct bpf_object *obj);
LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj);
LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj);
LIBBPF_API struct bpf_program *
bpf_object__find_program_by_title(struct bpf_object *obj, const char *title);
bpf_object__find_program_by_title(const struct bpf_object *obj,
const char *title);
LIBBPF_API struct bpf_program *
bpf_object__find_program_by_name(const struct bpf_object *obj,
const char *name);
LIBBPF_API struct bpf_object *bpf_object__next(struct bpf_object *prev);
#define bpf_object__for_each_safe(pos, tmp) \
@@ -112,18 +165,20 @@ LIBBPF_API struct bpf_object *bpf_object__next(struct bpf_object *prev);
typedef void (*bpf_object_clear_priv_t)(struct bpf_object *, void *);
LIBBPF_API int bpf_object__set_priv(struct bpf_object *obj, void *priv,
bpf_object_clear_priv_t clear_priv);
LIBBPF_API void *bpf_object__priv(struct bpf_object *prog);
LIBBPF_API void *bpf_object__priv(const struct bpf_object *prog);
LIBBPF_API int
libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type,
enum bpf_attach_type *expected_attach_type);
LIBBPF_API int libbpf_attach_type_by_name(const char *name,
enum bpf_attach_type *attach_type);
LIBBPF_API int libbpf_find_vmlinux_btf_id(const char *name,
enum bpf_attach_type attach_type);
/* Accessors of bpf_program */
struct bpf_program;
LIBBPF_API struct bpf_program *bpf_program__next(struct bpf_program *prog,
struct bpf_object *obj);
const struct bpf_object *obj);
#define bpf_object__for_each_program(pos, obj) \
for ((pos) = bpf_program__next(NULL, (obj)); \
@@ -131,24 +186,27 @@ LIBBPF_API struct bpf_program *bpf_program__next(struct bpf_program *prog,
(pos) = bpf_program__next((pos), (obj)))
LIBBPF_API struct bpf_program *bpf_program__prev(struct bpf_program *prog,
struct bpf_object *obj);
const struct bpf_object *obj);
typedef void (*bpf_program_clear_priv_t)(struct bpf_program *,
void *);
typedef void (*bpf_program_clear_priv_t)(struct bpf_program *, void *);
LIBBPF_API int bpf_program__set_priv(struct bpf_program *prog, void *priv,
bpf_program_clear_priv_t clear_priv);
LIBBPF_API void *bpf_program__priv(struct bpf_program *prog);
LIBBPF_API void *bpf_program__priv(const struct bpf_program *prog);
LIBBPF_API void bpf_program__set_ifindex(struct bpf_program *prog,
__u32 ifindex);
LIBBPF_API const char *bpf_program__title(struct bpf_program *prog,
LIBBPF_API const char *bpf_program__name(const struct bpf_program *prog);
LIBBPF_API const char *bpf_program__title(const struct bpf_program *prog,
bool needs_copy);
/* returns program size in bytes */
LIBBPF_API size_t bpf_program__size(const struct bpf_program *prog);
LIBBPF_API int bpf_program__load(struct bpf_program *prog, char *license,
__u32 kern_version);
LIBBPF_API int bpf_program__fd(struct bpf_program *prog);
LIBBPF_API int bpf_program__fd(const struct bpf_program *prog);
LIBBPF_API int bpf_program__pin_instance(struct bpf_program *prog,
const char *path,
int instance);
@@ -159,6 +217,34 @@ LIBBPF_API int bpf_program__pin(struct bpf_program *prog, const char *path);
LIBBPF_API int bpf_program__unpin(struct bpf_program *prog, const char *path);
LIBBPF_API void bpf_program__unload(struct bpf_program *prog);
struct bpf_link;
LIBBPF_API void bpf_link__disconnect(struct bpf_link *link);
LIBBPF_API int bpf_link__destroy(struct bpf_link *link);
LIBBPF_API struct bpf_link *
bpf_program__attach(struct bpf_program *prog);
LIBBPF_API struct bpf_link *
bpf_program__attach_perf_event(struct bpf_program *prog, int pfd);
LIBBPF_API struct bpf_link *
bpf_program__attach_kprobe(struct bpf_program *prog, bool retprobe,
const char *func_name);
LIBBPF_API struct bpf_link *
bpf_program__attach_uprobe(struct bpf_program *prog, bool retprobe,
pid_t pid, const char *binary_path,
size_t func_offset);
LIBBPF_API struct bpf_link *
bpf_program__attach_tracepoint(struct bpf_program *prog,
const char *tp_category,
const char *tp_name);
LIBBPF_API struct bpf_link *
bpf_program__attach_raw_tracepoint(struct bpf_program *prog,
const char *tp_name);
LIBBPF_API struct bpf_link *
bpf_program__attach_trace(struct bpf_program *prog);
struct bpf_map;
LIBBPF_API struct bpf_link *bpf_map__attach_struct_ops(struct bpf_map *map);
struct bpf_insn;
/*
@@ -221,7 +307,7 @@ typedef int (*bpf_program_prep_t)(struct bpf_program *prog, int n,
LIBBPF_API int bpf_program__set_prep(struct bpf_program *prog, int nr_instance,
bpf_program_prep_t prep);
LIBBPF_API int bpf_program__nth_fd(struct bpf_program *prog, int n);
LIBBPF_API int bpf_program__nth_fd(const struct bpf_program *prog, int n);
/*
* Adjust type of BPF program. Default is kprobe.
@@ -234,20 +320,31 @@ LIBBPF_API int bpf_program__set_sched_cls(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_sched_act(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_xdp(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_perf_event(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_tracing(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_struct_ops(struct bpf_program *prog);
LIBBPF_API int bpf_program__set_extension(struct bpf_program *prog);
LIBBPF_API enum bpf_prog_type bpf_program__get_type(struct bpf_program *prog);
LIBBPF_API void bpf_program__set_type(struct bpf_program *prog,
enum bpf_prog_type type);
LIBBPF_API enum bpf_attach_type
bpf_program__get_expected_attach_type(struct bpf_program *prog);
LIBBPF_API void
bpf_program__set_expected_attach_type(struct bpf_program *prog,
enum bpf_attach_type type);
LIBBPF_API bool bpf_program__is_socket_filter(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_tracepoint(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_raw_tracepoint(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_kprobe(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sched_cls(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sched_act(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_xdp(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_perf_event(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_socket_filter(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_tracepoint(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_raw_tracepoint(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_kprobe(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sched_cls(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sched_act(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_xdp(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_perf_event(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_tracing(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_struct_ops(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_extension(const struct bpf_program *prog);
/*
* No need for __attribute__((packed)), all members of 'bpf_map_def'
@@ -267,12 +364,11 @@ struct bpf_map_def {
* The 'struct bpf_map' in include/linux/bpf.h is internal to the kernel,
* so no need to worry about a name clash.
*/
struct bpf_map;
LIBBPF_API struct bpf_map *
bpf_object__find_map_by_name(struct bpf_object *obj, const char *name);
bpf_object__find_map_by_name(const struct bpf_object *obj, const char *name);
LIBBPF_API int
bpf_object__find_map_fd_by_name(struct bpf_object *obj, const char *name);
bpf_object__find_map_fd_by_name(const struct bpf_object *obj, const char *name);
/*
* Get bpf_map through the offset of corresponding struct bpf_map_def
@@ -282,7 +378,7 @@ LIBBPF_API struct bpf_map *
bpf_object__find_map_by_offset(struct bpf_object *obj, size_t offset);
LIBBPF_API struct bpf_map *
bpf_map__next(struct bpf_map *map, struct bpf_object *obj);
bpf_map__next(const struct bpf_map *map, const struct bpf_object *obj);
#define bpf_object__for_each_map(pos, obj) \
for ((pos) = bpf_map__next(NULL, (obj)); \
(pos) != NULL; \
@@ -290,23 +386,26 @@ bpf_map__next(struct bpf_map *map, struct bpf_object *obj);
#define bpf_map__for_each bpf_object__for_each_map
LIBBPF_API struct bpf_map *
bpf_map__prev(struct bpf_map *map, struct bpf_object *obj);
bpf_map__prev(const struct bpf_map *map, const struct bpf_object *obj);
LIBBPF_API int bpf_map__fd(struct bpf_map *map);
LIBBPF_API const struct bpf_map_def *bpf_map__def(struct bpf_map *map);
LIBBPF_API const char *bpf_map__name(struct bpf_map *map);
LIBBPF_API int bpf_map__fd(const struct bpf_map *map);
LIBBPF_API const struct bpf_map_def *bpf_map__def(const struct bpf_map *map);
LIBBPF_API const char *bpf_map__name(const struct bpf_map *map);
LIBBPF_API __u32 bpf_map__btf_key_type_id(const struct bpf_map *map);
LIBBPF_API __u32 bpf_map__btf_value_type_id(const struct bpf_map *map);
typedef void (*bpf_map_clear_priv_t)(struct bpf_map *, void *);
LIBBPF_API int bpf_map__set_priv(struct bpf_map *map, void *priv,
bpf_map_clear_priv_t clear_priv);
LIBBPF_API void *bpf_map__priv(struct bpf_map *map);
LIBBPF_API void *bpf_map__priv(const struct bpf_map *map);
LIBBPF_API int bpf_map__reuse_fd(struct bpf_map *map, int fd);
LIBBPF_API int bpf_map__resize(struct bpf_map *map, __u32 max_entries);
LIBBPF_API bool bpf_map__is_offload_neutral(struct bpf_map *map);
LIBBPF_API bool bpf_map__is_internal(struct bpf_map *map);
LIBBPF_API bool bpf_map__is_offload_neutral(const struct bpf_map *map);
LIBBPF_API bool bpf_map__is_internal(const struct bpf_map *map);
LIBBPF_API void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex);
LIBBPF_API int bpf_map__set_pin_path(struct bpf_map *map, const char *path);
LIBBPF_API const char *bpf_map__get_pin_path(const struct bpf_map *map);
LIBBPF_API bool bpf_map__is_pinned(const struct bpf_map *map);
LIBBPF_API int bpf_map__pin(struct bpf_map *map, const char *path);
LIBBPF_API int bpf_map__unpin(struct bpf_map *map, const char *path);
@@ -320,6 +419,7 @@ struct bpf_prog_load_attr {
enum bpf_attach_type expected_attach_type;
int ifindex;
int log_level;
int prog_flags;
};
LIBBPF_API int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
@@ -327,8 +427,38 @@ LIBBPF_API int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
LIBBPF_API int bpf_prog_load(const char *file, enum bpf_prog_type type,
struct bpf_object **pobj, int *prog_fd);
struct xdp_link_info {
__u32 prog_id;
__u32 drv_prog_id;
__u32 hw_prog_id;
__u32 skb_prog_id;
__u8 attach_mode;
};
LIBBPF_API int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags);
LIBBPF_API int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags);
LIBBPF_API int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info,
size_t info_size, __u32 flags);
struct perf_buffer;
typedef void (*perf_buffer_sample_fn)(void *ctx, int cpu,
void *data, __u32 size);
typedef void (*perf_buffer_lost_fn)(void *ctx, int cpu, __u64 cnt);
/* common use perf buffer options */
struct perf_buffer_opts {
/* if specified, sample_cb is called for each sample */
perf_buffer_sample_fn sample_cb;
/* if specified, lost_cb is called for each batch of lost samples */
perf_buffer_lost_fn lost_cb;
/* ctx is provided to sample_cb and lost_cb */
void *ctx;
};
LIBBPF_API struct perf_buffer *
perf_buffer__new(int map_fd, size_t page_cnt,
const struct perf_buffer_opts *opts);
enum bpf_perf_event_ret {
LIBBPF_PERF_EVENT_DONE = 0,
@@ -337,6 +467,35 @@ enum bpf_perf_event_ret {
};
struct perf_event_header;
typedef enum bpf_perf_event_ret
(*perf_buffer_event_fn)(void *ctx, int cpu, struct perf_event_header *event);
/* raw perf buffer options, giving most power and control */
struct perf_buffer_raw_opts {
/* perf event attrs passed directly into perf_event_open() */
struct perf_event_attr *attr;
/* raw event callback */
perf_buffer_event_fn event_cb;
/* ctx is provided to event_cb */
void *ctx;
/* if cpu_cnt == 0, open all on all possible CPUs (up to the number of
* max_entries of given PERF_EVENT_ARRAY map)
*/
int cpu_cnt;
/* if cpu_cnt > 0, cpus is an array of CPUs to open ring buffers on */
int *cpus;
/* if cpu_cnt > 0, map_keys specify map keys to set per-CPU FDs for */
int *map_keys;
};
LIBBPF_API struct perf_buffer *
perf_buffer__new_raw(int map_fd, size_t page_cnt,
const struct perf_buffer_raw_opts *opts);
LIBBPF_API void perf_buffer__free(struct perf_buffer *pb);
LIBBPF_API int perf_buffer__poll(struct perf_buffer *pb, int timeout_ms);
typedef enum bpf_perf_event_ret
(*bpf_perf_event_print_t)(struct perf_event_header *hdr,
void *private_data);
@@ -345,18 +504,6 @@ bpf_perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size,
void **copy_mem, size_t *copy_size,
bpf_perf_event_print_t fn, void *private_data);
struct nlattr;
typedef int (*libbpf_dump_nlmsg_t)(void *cookie, void *msg, struct nlattr **tb);
int libbpf_netlink_open(unsigned int *nl_pid);
int libbpf_nl_get_link(int sock, unsigned int nl_pid,
libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie);
int libbpf_nl_get_class(int sock, unsigned int nl_pid, int ifindex,
libbpf_dump_nlmsg_t dump_class_nlmsg, void *cookie);
int libbpf_nl_get_qdisc(int sock, unsigned int nl_pid, int ifindex,
libbpf_dump_nlmsg_t dump_qdisc_nlmsg, void *cookie);
int libbpf_nl_get_filter(int sock, unsigned int nl_pid, int ifindex, int handle,
libbpf_dump_nlmsg_t dump_filter_nlmsg, void *cookie);
struct bpf_prog_linfo;
struct bpf_prog_info;
@@ -383,6 +530,7 @@ LIBBPF_API bool bpf_probe_prog_type(enum bpf_prog_type prog_type,
LIBBPF_API bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex);
LIBBPF_API bool bpf_probe_helper(enum bpf_func_id id,
enum bpf_prog_type prog_type, __u32 ifindex);
LIBBPF_API bool bpf_probe_large_insn_limit(__u32 ifindex);
/*
* Get bpf_prog_info in continuous memory
@@ -447,6 +595,66 @@ bpf_program__bpil_addr_to_offs(struct bpf_prog_info_linear *info_linear);
LIBBPF_API void
bpf_program__bpil_offs_to_addr(struct bpf_prog_info_linear *info_linear);
/*
* A helper function to get the number of possible CPUs before looking up
* per-CPU maps. Negative errno is returned on failure.
*
* Example usage:
*
* int ncpus = libbpf_num_possible_cpus();
* if (ncpus < 0) {
* // error handling
* }
* long values[ncpus];
* bpf_map_lookup_elem(per_cpu_map_fd, key, values);
*
*/
LIBBPF_API int libbpf_num_possible_cpus(void);
struct bpf_map_skeleton {
const char *name;
struct bpf_map **map;
void **mmaped;
};
struct bpf_prog_skeleton {
const char *name;
struct bpf_program **prog;
struct bpf_link **link;
};
struct bpf_object_skeleton {
size_t sz; /* size of this struct, for forward/backward compatibility */
const char *name;
void *data;
size_t data_sz;
struct bpf_object **obj;
int map_cnt;
int map_skel_sz; /* sizeof(struct bpf_skeleton_map) */
struct bpf_map_skeleton *maps;
int prog_cnt;
int prog_skel_sz; /* sizeof(struct bpf_skeleton_prog) */
struct bpf_prog_skeleton *progs;
};
LIBBPF_API int
bpf_object__open_skeleton(struct bpf_object_skeleton *s,
const struct bpf_object_open_opts *opts);
LIBBPF_API int bpf_object__load_skeleton(struct bpf_object_skeleton *s);
LIBBPF_API int bpf_object__attach_skeleton(struct bpf_object_skeleton *s);
LIBBPF_API void bpf_object__detach_skeleton(struct bpf_object_skeleton *s);
LIBBPF_API void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s);
enum libbpf_tristate {
TRI_NO = 0,
TRI_YES = 1,
TRI_MODULE = 2,
};
#ifdef __cplusplus
} /* extern "C" */
#endif

View File

@@ -164,3 +164,74 @@ LIBBPF_0.0.3 {
bpf_map_freeze;
btf__finalize_data;
} LIBBPF_0.0.2;
LIBBPF_0.0.4 {
global:
bpf_link__destroy;
bpf_object__load_xattr;
bpf_program__attach_kprobe;
bpf_program__attach_perf_event;
bpf_program__attach_raw_tracepoint;
bpf_program__attach_tracepoint;
bpf_program__attach_uprobe;
btf_dump__dump_type;
btf_dump__free;
btf_dump__new;
btf__parse_elf;
libbpf_num_possible_cpus;
perf_buffer__free;
perf_buffer__new;
perf_buffer__new_raw;
perf_buffer__poll;
xsk_umem__create;
} LIBBPF_0.0.3;
LIBBPF_0.0.5 {
global:
bpf_btf_get_next_id;
} LIBBPF_0.0.4;
LIBBPF_0.0.6 {
global:
bpf_get_link_xdp_info;
bpf_map__get_pin_path;
bpf_map__is_pinned;
bpf_map__set_pin_path;
bpf_object__open_file;
bpf_object__open_mem;
bpf_program__attach_trace;
bpf_program__get_expected_attach_type;
bpf_program__get_type;
bpf_program__is_tracing;
bpf_program__set_tracing;
bpf_program__size;
btf__find_by_name_kind;
libbpf_find_vmlinux_btf_id;
} LIBBPF_0.0.5;
LIBBPF_0.0.7 {
global:
btf_dump__emit_type_decl;
bpf_link__disconnect;
bpf_map__attach_struct_ops;
bpf_map_delete_batch;
bpf_map_lookup_and_delete_batch;
bpf_map_lookup_batch;
bpf_map_update_batch;
bpf_object__find_program_by_name;
bpf_object__attach_skeleton;
bpf_object__destroy_skeleton;
bpf_object__detach_skeleton;
bpf_object__load_skeleton;
bpf_object__open_skeleton;
bpf_probe_large_insn_limit;
bpf_prog_attach_xattr;
bpf_program__attach;
bpf_program__name;
bpf_program__is_extension;
bpf_program__is_struct_ops;
bpf_program__set_extension;
bpf_program__set_struct_ops;
btf__align_of;
libbpf_find_kernel_btf;
} LIBBPF_0.0.6;

View File

@@ -8,5 +8,5 @@ Name: libbpf
Description: BPF library
Version: @VERSION@
Libs: -L${libdir} -lbpf
Requires.private: libelf
Requires.private: libelf zlib
Cflags: -I${includedir}

40
src/libbpf_common.h Normal file
View File

@@ -0,0 +1,40 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/*
* Common user-facing libbpf helpers.
*
* Copyright (c) 2019 Facebook
*/
#ifndef __LIBBPF_LIBBPF_COMMON_H
#define __LIBBPF_LIBBPF_COMMON_H
#include <string.h>
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
#endif
/* Helper macro to declare and initialize libbpf options struct
*
* This dance with uninitialized declaration, followed by memset to zero,
* followed by assignment using compound literal syntax is done to preserve
* ability to use a nice struct field initialization syntax and **hopefully**
* have all the padding bytes initialized to zero. It's not guaranteed though,
* when copying literal, that compiler won't copy garbage in literal's padding
* bytes, but that's the best way I've found and it seems to work in practice.
*
* Macro declares opts struct of given type and name, zero-initializes,
* including any extra padding, it with memset() and then assigns initial
* values provided by users in struct initializer-syntax as varargs.
*/
#define DECLARE_LIBBPF_OPTS(TYPE, NAME, ...) \
struct TYPE NAME = ({ \
memset(&NAME, 0, sizeof(struct TYPE)); \
(struct TYPE) { \
.sz = sizeof(struct TYPE), \
__VA_ARGS__ \
}; \
})
#endif /* __LIBBPF_LIBBPF_COMMON_H */

View File

@@ -13,6 +13,9 @@
#include "libbpf.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
#define ERRNO_OFFSET(e) ((e) - __LIBBPF_ERRNO__START)
#define ERRCODE_OFFSET(c) ERRNO_OFFSET(LIBBPF_ERRNO__##c)
#define NR_ERRNO (__LIBBPF_ERRNO__END - __LIBBPF_ERRNO__START)

View File

@@ -9,6 +9,8 @@
#ifndef __LIBBPF_LIBBPF_INTERNAL_H
#define __LIBBPF_LIBBPF_INTERNAL_H
#include "libbpf.h"
#define BTF_INFO_ENC(kind, kind_flag, vlen) \
((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
#define BTF_TYPE_ENC(name, info, size_or_type) (name), (info), (size_or_type)
@@ -21,6 +23,33 @@
#define BTF_PARAM_ENC(name, type) (name), (type)
#define BTF_VAR_SECINFO_ENC(type, offset, size) (type), (offset), (size)
#ifndef min
# define min(x, y) ((x) < (y) ? (x) : (y))
#endif
#ifndef max
# define max(x, y) ((x) < (y) ? (y) : (x))
#endif
#ifndef offsetofend
# define offsetofend(TYPE, FIELD) \
(offsetof(TYPE, FIELD) + sizeof(((TYPE *)0)->FIELD))
#endif
/* Symbol versioning is different between static and shared library.
* Properly versioned symbols are needed for shared library, but
* only the symbol of the new version is needed for static library.
*/
#ifdef SHARED
# define COMPAT_VERSION(internal_name, api_name, version) \
asm(".symver " #internal_name "," #api_name "@" #version);
# define DEFAULT_VERSION(internal_name, api_name, version) \
asm(".symver " #internal_name "," #api_name "@@" #version);
#else
# define COMPAT_VERSION(internal_name, api_name, version)
# define DEFAULT_VERSION(internal_name, api_name, version) \
extern typeof(internal_name) api_name \
__attribute__((alias(#internal_name)));
#endif
extern void libbpf_print(enum libbpf_print_level level,
const char *format, ...)
__attribute__((format(printf, 2, 3)));
@@ -30,11 +59,178 @@ do { \
libbpf_print(level, "libbpf: " fmt, ##__VA_ARGS__); \
} while (0)
#define pr_warning(fmt, ...) __pr(LIBBPF_WARN, fmt, ##__VA_ARGS__)
#define pr_warn(fmt, ...) __pr(LIBBPF_WARN, fmt, ##__VA_ARGS__)
#define pr_info(fmt, ...) __pr(LIBBPF_INFO, fmt, ##__VA_ARGS__)
#define pr_debug(fmt, ...) __pr(LIBBPF_DEBUG, fmt, ##__VA_ARGS__)
int libbpf__probe_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len);
static inline bool libbpf_validate_opts(const char *opts,
size_t opts_sz, size_t user_sz,
const char *type_name)
{
if (user_sz < sizeof(size_t)) {
pr_warn("%s size (%zu) is too small\n", type_name, user_sz);
return false;
}
if (user_sz > opts_sz) {
size_t i;
for (i = opts_sz; i < user_sz; i++) {
if (opts[i]) {
pr_warn("%s has non-zero extra bytes\n",
type_name);
return false;
}
}
}
return true;
}
#define OPTS_VALID(opts, type) \
(!(opts) || libbpf_validate_opts((const char *)opts, \
offsetofend(struct type, \
type##__last_field), \
(opts)->sz, #type))
#define OPTS_HAS(opts, field) \
((opts) && opts->sz >= offsetofend(typeof(*(opts)), field))
#define OPTS_GET(opts, field, fallback_value) \
(OPTS_HAS(opts, field) ? (opts)->field : fallback_value)
int parse_cpu_mask_str(const char *s, bool **mask, int *mask_sz);
int parse_cpu_mask_file(const char *fcpu, bool **mask, int *mask_sz);
int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len);
int bpf_object__section_size(const struct bpf_object *obj, const char *name,
__u32 *size);
int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
__u32 *off);
struct nlattr;
typedef int (*libbpf_dump_nlmsg_t)(void *cookie, void *msg, struct nlattr **tb);
int libbpf_netlink_open(unsigned int *nl_pid);
int libbpf_nl_get_link(int sock, unsigned int nl_pid,
libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie);
int libbpf_nl_get_class(int sock, unsigned int nl_pid, int ifindex,
libbpf_dump_nlmsg_t dump_class_nlmsg, void *cookie);
int libbpf_nl_get_qdisc(int sock, unsigned int nl_pid, int ifindex,
libbpf_dump_nlmsg_t dump_qdisc_nlmsg, void *cookie);
int libbpf_nl_get_filter(int sock, unsigned int nl_pid, int ifindex, int handle,
libbpf_dump_nlmsg_t dump_filter_nlmsg, void *cookie);
struct btf_ext_info {
/*
* info points to the individual info section (e.g. func_info and
* line_info) from the .BTF.ext. It does not include the __u32 rec_size.
*/
void *info;
__u32 rec_size;
__u32 len;
};
#define for_each_btf_ext_sec(seg, sec) \
for (sec = (seg)->info; \
(void *)sec < (seg)->info + (seg)->len; \
sec = (void *)sec + sizeof(struct btf_ext_info_sec) + \
(seg)->rec_size * sec->num_info)
#define for_each_btf_ext_rec(seg, sec, i, rec) \
for (i = 0, rec = (void *)&(sec)->data; \
i < (sec)->num_info; \
i++, rec = (void *)rec + (seg)->rec_size)
struct btf_ext {
union {
struct btf_ext_header *hdr;
void *data;
};
struct btf_ext_info func_info;
struct btf_ext_info line_info;
struct btf_ext_info field_reloc_info;
__u32 data_size;
};
struct btf_ext_info_sec {
__u32 sec_name_off;
__u32 num_info;
/* Followed by num_info * record_size number of bytes */
__u8 data[0];
};
/* The minimum bpf_func_info checked by the loader */
struct bpf_func_info_min {
__u32 insn_off;
__u32 type_id;
};
/* The minimum bpf_line_info checked by the loader */
struct bpf_line_info_min {
__u32 insn_off;
__u32 file_name_off;
__u32 line_off;
__u32 line_col;
};
/* bpf_field_info_kind encodes which aspect of captured field has to be
* adjusted by relocations. Currently supported values are:
* - BPF_FIELD_BYTE_OFFSET: field offset (in bytes);
* - BPF_FIELD_EXISTS: field existence (1, if field exists; 0, otherwise);
*/
enum bpf_field_info_kind {
BPF_FIELD_BYTE_OFFSET = 0, /* field byte offset */
BPF_FIELD_BYTE_SIZE = 1,
BPF_FIELD_EXISTS = 2, /* field existence in target kernel */
BPF_FIELD_SIGNED = 3,
BPF_FIELD_LSHIFT_U64 = 4,
BPF_FIELD_RSHIFT_U64 = 5,
};
/* The minimum bpf_field_reloc checked by the loader
*
* Field relocation captures the following data:
* - insn_off - instruction offset (in bytes) within a BPF program that needs
* its insn->imm field to be relocated with actual field info;
* - type_id - BTF type ID of the "root" (containing) entity of a relocatable
* field;
* - access_str_off - offset into corresponding .BTF string section. String
* itself encodes an accessed field using a sequence of field and array
* indicies, separated by colon (:). It's conceptually very close to LLVM's
* getelementptr ([0]) instruction's arguments for identifying offset to
* a field.
*
* Example to provide a better feel.
*
* struct sample {
* int a;
* struct {
* int b[10];
* };
* };
*
* struct sample *s = ...;
* int x = &s->a; // encoded as "0:0" (a is field #0)
* int y = &s->b[5]; // encoded as "0:1:0:5" (anon struct is field #1,
* // b is field #0 inside anon struct, accessing elem #5)
* int z = &s[10]->b; // encoded as "10:1" (ptr is used as an array)
*
* type_id for all relocs in this example will capture BTF type id of
* `struct sample`.
*
* Such relocation is emitted when using __builtin_preserve_access_index()
* Clang built-in, passing expression that captures field address, e.g.:
*
* bpf_probe_read(&dst, sizeof(dst),
* __builtin_preserve_access_index(&src->a.b.c));
*
* In this case Clang will emit field relocation recording necessary data to
* be able to find offset of embedded `a.b.c` field within `src` struct.
*
* [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction
*/
struct bpf_field_reloc {
__u32 insn_off;
__u32 type_id;
__u32 access_str_off;
enum bpf_field_info_kind kind;
};
#endif /* __LIBBPF_LIBBPF_INTERNAL_H */

View File

@@ -17,6 +17,9 @@
#include "libbpf.h"
#include "libbpf_internal.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
static bool grep(const char *buffer, const char *pattern)
{
return !!strstr(buffer, pattern);
@@ -101,6 +104,10 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
case BPF_PROG_TYPE_SK_REUSEPORT:
case BPF_PROG_TYPE_FLOW_DISSECTOR:
case BPF_PROG_TYPE_CGROUP_SYSCTL:
case BPF_PROG_TYPE_CGROUP_SOCKOPT:
case BPF_PROG_TYPE_TRACING:
case BPF_PROG_TYPE_STRUCT_OPS:
case BPF_PROG_TYPE_EXT:
default:
break;
}
@@ -133,8 +140,8 @@ bool bpf_probe_prog_type(enum bpf_prog_type prog_type, __u32 ifindex)
return errno != EINVAL && errno != EOPNOTSUPP;
}
int libbpf__probe_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len)
int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len)
{
struct btf_header hdr = {
.magic = BTF_MAGIC,
@@ -157,14 +164,9 @@ int libbpf__probe_raw_btf(const char *raw_types, size_t types_len,
memcpy(raw_btf + hdr.hdr_len + hdr.type_len, str_sec, hdr.str_len);
btf_fd = bpf_load_btf(raw_btf, btf_len, NULL, 0, false);
if (btf_fd < 0) {
free(raw_btf);
return 0;
}
close(btf_fd);
free(raw_btf);
return 1;
return btf_fd;
}
static int load_sk_storage_btf(void)
@@ -190,7 +192,7 @@ static int load_sk_storage_btf(void)
BTF_MEMBER_ENC(23, 2, 32),/* struct bpf_spin_lock l; */
};
return libbpf__probe_raw_btf((char *)types, sizeof(types),
return libbpf__load_raw_btf((char *)types, sizeof(types),
strs, sizeof(strs));
}
@@ -248,11 +250,13 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
case BPF_MAP_TYPE_ARRAY_OF_MAPS:
case BPF_MAP_TYPE_HASH_OF_MAPS:
case BPF_MAP_TYPE_DEVMAP:
case BPF_MAP_TYPE_DEVMAP_HASH:
case BPF_MAP_TYPE_SOCKMAP:
case BPF_MAP_TYPE_CPUMAP:
case BPF_MAP_TYPE_XSKMAP:
case BPF_MAP_TYPE_SOCKHASH:
case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
case BPF_MAP_TYPE_STRUCT_OPS:
default:
break;
}
@@ -323,3 +327,24 @@ bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type,
return res;
}
/*
* Probe for availability of kernel commit (5.3):
*
* c04c0d2b968a ("bpf: increase complexity limit and maximum program size")
*/
bool bpf_probe_large_insn_limit(__u32 ifindex)
{
struct bpf_insn insns[BPF_MAXINSNS + 1];
int i;
for (i = 0; i < BPF_MAXINSNS; i++)
insns[i] = BPF_MOV64_IMM(BPF_REG_0, 1);
insns[BPF_MAXINSNS] = BPF_EXIT_INSN();
errno = 0;
probe_load(BPF_PROG_TYPE_SCHED_CLS, insns, ARRAY_SIZE(insns), NULL, 0,
ifindex);
return errno != E2BIG && errno != EINVAL;
}

View File

@@ -12,8 +12,12 @@
#include "bpf.h"
#include "libbpf.h"
#include "libbpf_internal.h"
#include "nlattr.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
@@ -24,7 +28,7 @@ typedef int (*__dump_nlmsg_t)(struct nlmsghdr *nlmsg, libbpf_dump_nlmsg_t,
struct xdp_id_md {
int ifindex;
__u32 flags;
__u32 id;
struct xdp_link_info info;
};
int libbpf_netlink_open(__u32 *nl_pid)
@@ -43,7 +47,7 @@ int libbpf_netlink_open(__u32 *nl_pid)
if (setsockopt(sock, SOL_NETLINK, NETLINK_EXT_ACK,
&one, sizeof(one)) < 0) {
fprintf(stderr, "Netlink error reporting not supported\n");
pr_warn("Netlink error reporting not supported\n");
}
if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
@@ -202,26 +206,11 @@ static int __dump_link_nlmsg(struct nlmsghdr *nlh,
return dump_link_nlmsg(cookie, ifi, tb);
}
static unsigned char get_xdp_id_attr(unsigned char mode, __u32 flags)
{
if (mode != XDP_ATTACHED_MULTI)
return IFLA_XDP_PROG_ID;
if (flags & XDP_FLAGS_DRV_MODE)
return IFLA_XDP_DRV_PROG_ID;
if (flags & XDP_FLAGS_HW_MODE)
return IFLA_XDP_HW_PROG_ID;
if (flags & XDP_FLAGS_SKB_MODE)
return IFLA_XDP_SKB_PROG_ID;
return IFLA_XDP_UNSPEC;
}
static int get_xdp_id(void *cookie, void *msg, struct nlattr **tb)
static int get_xdp_info(void *cookie, void *msg, struct nlattr **tb)
{
struct nlattr *xdp_tb[IFLA_XDP_MAX + 1];
struct xdp_id_md *xdp_id = cookie;
struct ifinfomsg *ifinfo = msg;
unsigned char mode, xdp_attr;
int ret;
if (xdp_id->ifindex && xdp_id->ifindex != ifinfo->ifi_index)
@@ -237,27 +226,40 @@ static int get_xdp_id(void *cookie, void *msg, struct nlattr **tb)
if (!xdp_tb[IFLA_XDP_ATTACHED])
return 0;
mode = libbpf_nla_getattr_u8(xdp_tb[IFLA_XDP_ATTACHED]);
if (mode == XDP_ATTACHED_NONE)
xdp_id->info.attach_mode = libbpf_nla_getattr_u8(
xdp_tb[IFLA_XDP_ATTACHED]);
if (xdp_id->info.attach_mode == XDP_ATTACHED_NONE)
return 0;
xdp_attr = get_xdp_id_attr(mode, xdp_id->flags);
if (!xdp_attr || !xdp_tb[xdp_attr])
return 0;
if (xdp_tb[IFLA_XDP_PROG_ID])
xdp_id->info.prog_id = libbpf_nla_getattr_u32(
xdp_tb[IFLA_XDP_PROG_ID]);
xdp_id->id = libbpf_nla_getattr_u32(xdp_tb[xdp_attr]);
if (xdp_tb[IFLA_XDP_SKB_PROG_ID])
xdp_id->info.skb_prog_id = libbpf_nla_getattr_u32(
xdp_tb[IFLA_XDP_SKB_PROG_ID]);
if (xdp_tb[IFLA_XDP_DRV_PROG_ID])
xdp_id->info.drv_prog_id = libbpf_nla_getattr_u32(
xdp_tb[IFLA_XDP_DRV_PROG_ID]);
if (xdp_tb[IFLA_XDP_HW_PROG_ID])
xdp_id->info.hw_prog_id = libbpf_nla_getattr_u32(
xdp_tb[IFLA_XDP_HW_PROG_ID]);
return 0;
}
int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags)
int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info,
size_t info_size, __u32 flags)
{
struct xdp_id_md xdp_id = {};
int sock, ret;
__u32 nl_pid;
__u32 mask;
if (flags & ~XDP_FLAGS_MASK)
if (flags & ~XDP_FLAGS_MASK || !info_size)
return -EINVAL;
/* Check whether the single {HW,DRV,SKB} mode is set */
@@ -273,14 +275,44 @@ int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags)
xdp_id.ifindex = ifindex;
xdp_id.flags = flags;
ret = libbpf_nl_get_link(sock, nl_pid, get_xdp_id, &xdp_id);
if (!ret)
*prog_id = xdp_id.id;
ret = libbpf_nl_get_link(sock, nl_pid, get_xdp_info, &xdp_id);
if (!ret) {
size_t sz = min(info_size, sizeof(xdp_id.info));
memcpy(info, &xdp_id.info, sz);
memset((void *) info + sz, 0, info_size - sz);
}
close(sock);
return ret;
}
static __u32 get_xdp_id(struct xdp_link_info *info, __u32 flags)
{
if (info->attach_mode != XDP_ATTACHED_MULTI)
return info->prog_id;
if (flags & XDP_FLAGS_DRV_MODE)
return info->drv_prog_id;
if (flags & XDP_FLAGS_HW_MODE)
return info->hw_prog_id;
if (flags & XDP_FLAGS_SKB_MODE)
return info->skb_prog_id;
return 0;
}
int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags)
{
struct xdp_link_info info;
int ret;
ret = bpf_get_link_xdp_info(ifindex, &info, sizeof(info), flags);
if (!ret)
*prog_id = get_xdp_id(&info, flags);
return ret;
}
int libbpf_nl_get_link(int sock, unsigned int nl_pid,
libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie)
{

View File

@@ -8,10 +8,14 @@
#include <errno.h>
#include "nlattr.h"
#include "libbpf_internal.h"
#include <linux/rtnetlink.h>
#include <string.h>
#include <stdio.h>
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
static uint16_t nla_attr_minlen[LIBBPF_NLA_TYPE_MAX+1] = {
[LIBBPF_NLA_U8] = sizeof(uint8_t),
[LIBBPF_NLA_U16] = sizeof(uint16_t),
@@ -121,8 +125,8 @@ int libbpf_nla_parse(struct nlattr *tb[], int maxtype, struct nlattr *head,
}
if (tb[type])
fprintf(stderr, "Attribute of type %#x found multiple times in message, "
"previous attribute is being ignored.\n", type);
pr_warn("Attribute of type %#x found multiple times in message, "
"previous attribute is being ignored.\n", type);
tb[type] = nla;
}
@@ -181,15 +185,14 @@ int libbpf_nla_dump_errormsg(struct nlmsghdr *nlh)
if (libbpf_nla_parse(tb, NLMSGERR_ATTR_MAX, attr, alen,
extack_policy) != 0) {
fprintf(stderr,
"Failed to parse extended error attributes\n");
pr_warn("Failed to parse extended error attributes\n");
return 0;
}
if (tb[NLMSGERR_ATTR_MSG])
errmsg = (char *) libbpf_nla_data(tb[NLMSGERR_ATTR_MSG]);
fprintf(stderr, "Kernel error message: %s\n", errmsg);
pr_warn("Kernel error message: %s\n", errmsg);
return 0;
}

View File

@@ -4,6 +4,9 @@
#include <stdio.h>
#include "str_error.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
/*
* Wrapper to allow for building in non-GNU systems such as Alpine Linux's musl
* libc, while checking strerror_r() return to avoid having to check this in
@@ -11,7 +14,7 @@
*/
char *libbpf_strerror_r(int err, char *dst, int len)
{
int ret = strerror_r(err, dst, len);
int ret = strerror_r(err < 0 ? -err : err, dst, len);
if (ret)
snprintf(dst, len, "ERROR: strerror_r(%d)=%d", err, ret);
return dst;

367
src/xsk.c
View File

@@ -32,6 +32,9 @@
#include "libbpf_internal.h"
#include "xsk.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
#ifndef SOL_XDP
#define SOL_XDP 283
#endif
@@ -60,10 +63,8 @@ struct xsk_socket {
struct xsk_umem *umem;
struct xsk_socket_config config;
int fd;
int xsks_map;
int ifindex;
int prog_fd;
int qidconf_map_fd;
int xsks_map_fd;
__u32 queue_id;
char ifname[IFNAMSIZ];
@@ -75,22 +76,20 @@ struct xsk_nl_info {
int fd;
};
/* For 32-bit systems, we need to use mmap2 as the offsets are 64-bit.
* Unfortunately, it is not part of glibc.
*/
static inline void *xsk_mmap(void *addr, size_t length, int prot, int flags,
int fd, __u64 offset)
{
#ifdef __NR_mmap2
unsigned int page_shift = __builtin_ffs(getpagesize()) - 1;
long ret = syscall(__NR_mmap2, addr, length, prot, flags, fd,
(off_t)(offset >> page_shift));
/* Up until and including Linux 5.3 */
struct xdp_ring_offset_v1 {
__u64 producer;
__u64 consumer;
__u64 desc;
};
return (void *)ret;
#else
return mmap(addr, length, prot, flags, fd, offset);
#endif
}
/* Up until and including Linux 5.3 */
struct xdp_mmap_offsets_v1 {
struct xdp_ring_offset_v1 rx;
struct xdp_ring_offset_v1 tx;
struct xdp_ring_offset_v1 fr;
struct xdp_ring_offset_v1 cr;
};
int xsk_umem__fd(const struct xsk_umem *umem)
{
@@ -117,6 +116,7 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg,
cfg->comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
cfg->frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
cfg->frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM;
cfg->flags = XSK_UMEM__DEFAULT_FLAGS;
return;
}
@@ -124,6 +124,7 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg,
cfg->comp_size = usr_cfg->comp_size;
cfg->frame_size = usr_cfg->frame_size;
cfg->frame_headroom = usr_cfg->frame_headroom;
cfg->flags = usr_cfg->flags;
}
static int xsk_set_xdp_socket_config(struct xsk_socket_config *cfg,
@@ -150,14 +151,66 @@ static int xsk_set_xdp_socket_config(struct xsk_socket_config *cfg,
return 0;
}
int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, __u64 size,
struct xsk_ring_prod *fill, struct xsk_ring_cons *comp,
const struct xsk_umem_config *usr_config)
static void xsk_mmap_offsets_v1(struct xdp_mmap_offsets *off)
{
struct xdp_mmap_offsets_v1 off_v1;
/* getsockopt on a kernel <= 5.3 has no flags fields.
* Copy over the offsets to the correct places in the >=5.4 format
* and put the flags where they would have been on that kernel.
*/
memcpy(&off_v1, off, sizeof(off_v1));
off->rx.producer = off_v1.rx.producer;
off->rx.consumer = off_v1.rx.consumer;
off->rx.desc = off_v1.rx.desc;
off->rx.flags = off_v1.rx.consumer + sizeof(__u32);
off->tx.producer = off_v1.tx.producer;
off->tx.consumer = off_v1.tx.consumer;
off->tx.desc = off_v1.tx.desc;
off->tx.flags = off_v1.tx.consumer + sizeof(__u32);
off->fr.producer = off_v1.fr.producer;
off->fr.consumer = off_v1.fr.consumer;
off->fr.desc = off_v1.fr.desc;
off->fr.flags = off_v1.fr.consumer + sizeof(__u32);
off->cr.producer = off_v1.cr.producer;
off->cr.consumer = off_v1.cr.consumer;
off->cr.desc = off_v1.cr.desc;
off->cr.flags = off_v1.cr.consumer + sizeof(__u32);
}
static int xsk_get_mmap_offsets(int fd, struct xdp_mmap_offsets *off)
{
socklen_t optlen;
int err;
optlen = sizeof(*off);
err = getsockopt(fd, SOL_XDP, XDP_MMAP_OFFSETS, off, &optlen);
if (err)
return err;
if (optlen == sizeof(*off))
return 0;
if (optlen == sizeof(struct xdp_mmap_offsets_v1)) {
xsk_mmap_offsets_v1(off);
return 0;
}
return -EINVAL;
}
int xsk_umem__create_v0_0_4(struct xsk_umem **umem_ptr, void *umem_area,
__u64 size, struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *usr_config)
{
struct xdp_mmap_offsets off;
struct xdp_umem_reg mr;
struct xsk_umem *umem;
socklen_t optlen;
void *map;
int err;
@@ -179,10 +232,12 @@ int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, __u64 size,
umem->umem_area = umem_area;
xsk_set_umem_config(&umem->config, usr_config);
memset(&mr, 0, sizeof(mr));
mr.addr = (uintptr_t)umem_area;
mr.len = size;
mr.chunk_size = umem->config.frame_size;
mr.headroom = umem->config.frame_headroom;
mr.flags = umem->config.flags;
err = setsockopt(umem->fd, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr));
if (err) {
@@ -204,17 +259,15 @@ int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, __u64 size,
goto out_socket;
}
optlen = sizeof(off);
err = getsockopt(umem->fd, SOL_XDP, XDP_MMAP_OFFSETS, &off, &optlen);
err = xsk_get_mmap_offsets(umem->fd, &off);
if (err) {
err = -errno;
goto out_socket;
}
map = xsk_mmap(NULL, off.fr.desc +
umem->config.fill_size * sizeof(__u64),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE,
umem->fd, XDP_UMEM_PGOFF_FILL_RING);
map = mmap(NULL, off.fr.desc + umem->config.fill_size * sizeof(__u64),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, umem->fd,
XDP_UMEM_PGOFF_FILL_RING);
if (map == MAP_FAILED) {
err = -errno;
goto out_socket;
@@ -225,13 +278,13 @@ int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, __u64 size,
fill->size = umem->config.fill_size;
fill->producer = map + off.fr.producer;
fill->consumer = map + off.fr.consumer;
fill->flags = map + off.fr.flags;
fill->ring = map + off.fr.desc;
fill->cached_cons = umem->config.fill_size;
map = xsk_mmap(NULL,
off.cr.desc + umem->config.comp_size * sizeof(__u64),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE,
umem->fd, XDP_UMEM_PGOFF_COMPLETION_RING);
map = mmap(NULL, off.cr.desc + umem->config.comp_size * sizeof(__u64),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, umem->fd,
XDP_UMEM_PGOFF_COMPLETION_RING);
if (map == MAP_FAILED) {
err = -errno;
goto out_mmap;
@@ -242,6 +295,7 @@ int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, __u64 size,
comp->size = umem->config.comp_size;
comp->producer = map + off.cr.producer;
comp->consumer = map + off.cr.consumer;
comp->flags = map + off.cr.flags;
comp->ring = map + off.cr.desc;
*umem_ptr = umem;
@@ -256,6 +310,29 @@ out_umem_alloc:
return err;
}
struct xsk_umem_config_v1 {
__u32 fill_size;
__u32 comp_size;
__u32 frame_size;
__u32 frame_headroom;
};
int xsk_umem__create_v0_0_2(struct xsk_umem **umem_ptr, void *umem_area,
__u64 size, struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *usr_config)
{
struct xsk_umem_config config;
memcpy(&config, usr_config, sizeof(struct xsk_umem_config_v1));
config.flags = 0;
return xsk_umem__create_v0_0_4(umem_ptr, umem_area, size, fill, comp,
&config);
}
COMPAT_VERSION(xsk_umem__create_v0_0_2, xsk_umem__create, LIBBPF_0.0.2)
DEFAULT_VERSION(xsk_umem__create_v0_0_4, xsk_umem__create, LIBBPF_0.0.4)
static int xsk_load_xdp_prog(struct xsk_socket *xsk)
{
static const int log_buf_size = 16 * 1024;
@@ -265,42 +342,55 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk)
/* This is the C-program:
* SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
* {
* int *qidconf, index = ctx->rx_queue_index;
* int ret, index = ctx->rx_queue_index;
*
* // A set entry here means that the correspnding queue_id
* // has an active AF_XDP socket bound to it.
* qidconf = bpf_map_lookup_elem(&qidconf_map, &index);
* if (!qidconf)
* return XDP_ABORTED;
* ret = bpf_redirect_map(&xsks_map, index, XDP_PASS);
* if (ret > 0)
* return ret;
*
* if (*qidconf)
* // Fallback for pre-5.3 kernels, not supporting default
* // action in the flags parameter.
* if (bpf_map_lookup_elem(&xsks_map, &index))
* return bpf_redirect_map(&xsks_map, index, 0);
*
* return XDP_PASS;
* }
*/
struct bpf_insn prog[] = {
/* r1 = *(u32 *)(r1 + 16) */
BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_1, 16),
/* *(u32 *)(r10 - 4) = r1 */
BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_1, -4),
/* r2 = *(u32 *)(r1 + 16) */
BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 16),
/* *(u32 *)(r10 - 4) = r2 */
BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -4),
/* r1 = xskmap[] */
BPF_LD_MAP_FD(BPF_REG_1, xsk->xsks_map_fd),
/* r3 = XDP_PASS */
BPF_MOV64_IMM(BPF_REG_3, 2),
/* call bpf_redirect_map */
BPF_EMIT_CALL(BPF_FUNC_redirect_map),
/* if w0 != 0 goto pc+13 */
BPF_JMP32_IMM(BPF_JSGT, BPF_REG_0, 0, 13),
/* r2 = r10 */
BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
/* r2 += -4 */
BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4),
BPF_LD_MAP_FD(BPF_REG_1, xsk->qidconf_map_fd),
/* r1 = xskmap[] */
BPF_LD_MAP_FD(BPF_REG_1, xsk->xsks_map_fd),
/* call bpf_map_lookup_elem */
BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
/* r1 = r0 */
BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
BPF_MOV32_IMM(BPF_REG_0, 0),
/* if r1 == 0 goto +8 */
BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 8),
BPF_MOV32_IMM(BPF_REG_0, 2),
/* r1 = *(u32 *)(r1 + 0) */
BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_1, 0),
/* if r1 == 0 goto +5 */
/* r0 = XDP_PASS */
BPF_MOV64_IMM(BPF_REG_0, 2),
/* if r1 == 0 goto pc+5 */
BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 5),
/* r2 = *(u32 *)(r10 - 4) */
BPF_LD_MAP_FD(BPF_REG_1, xsk->xsks_map_fd),
BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_10, -4),
BPF_MOV32_IMM(BPF_REG_3, 0),
/* r1 = xskmap[] */
BPF_LD_MAP_FD(BPF_REG_1, xsk->xsks_map_fd),
/* r3 = 0 */
BPF_MOV64_IMM(BPF_REG_3, 0),
/* call bpf_redirect_map */
BPF_EMIT_CALL(BPF_FUNC_redirect_map),
/* The jumps are to this instruction */
BPF_EXIT_INSN(),
@@ -311,7 +401,7 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk)
"LGPL-2.1 or BSD-2-Clause", 0, log_buf,
log_buf_size);
if (prog_fd < 0) {
pr_warning("BPF log buffer:\n%s", log_buf);
pr_warn("BPF log buffer:\n%s", log_buf);
return prog_fd;
}
@@ -327,30 +417,35 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk)
static int xsk_get_max_queues(struct xsk_socket *xsk)
{
struct ethtool_channels channels;
struct ifreq ifr;
struct ethtool_channels channels = { .cmd = ETHTOOL_GCHANNELS };
struct ifreq ifr = {};
int fd, err, ret;
fd = socket(AF_INET, SOCK_DGRAM, 0);
if (fd < 0)
return -errno;
channels.cmd = ETHTOOL_GCHANNELS;
ifr.ifr_data = (void *)&channels;
strncpy(ifr.ifr_name, xsk->ifname, IFNAMSIZ);
memcpy(ifr.ifr_name, xsk->ifname, IFNAMSIZ - 1);
ifr.ifr_name[IFNAMSIZ - 1] = '\0';
err = ioctl(fd, SIOCETHTOOL, &ifr);
if (err && errno != EOPNOTSUPP) {
ret = -errno;
goto out;
}
if (channels.max_combined == 0 || errno == EOPNOTSUPP)
if (err) {
/* If the device says it has no channels, then all traffic
* is sent to a single stream, so max queues = 1.
*/
ret = 1;
else
ret = channels.max_combined;
} else {
/* Take the max of rx, tx, combined. Drivers return
* the number of channels in different ways.
*/
ret = max(channels.max_rx, channels.max_tx);
ret = max(ret, (int)channels.max_combined);
}
out:
close(fd);
@@ -366,18 +461,11 @@ static int xsk_create_bpf_maps(struct xsk_socket *xsk)
if (max_queues < 0)
return max_queues;
fd = bpf_create_map_name(BPF_MAP_TYPE_ARRAY, "qidconf_map",
fd = bpf_create_map_name(BPF_MAP_TYPE_XSKMAP, "xsks_map",
sizeof(int), sizeof(int), max_queues, 0);
if (fd < 0)
return fd;
xsk->qidconf_map_fd = fd;
fd = bpf_create_map_name(BPF_MAP_TYPE_XSKMAP, "xsks_map",
sizeof(int), sizeof(int), max_queues, 0);
if (fd < 0) {
close(xsk->qidconf_map_fd);
return fd;
}
xsk->xsks_map_fd = fd;
return 0;
@@ -385,10 +473,8 @@ static int xsk_create_bpf_maps(struct xsk_socket *xsk)
static void xsk_delete_bpf_maps(struct xsk_socket *xsk)
{
close(xsk->qidconf_map_fd);
bpf_map_delete_elem(xsk->xsks_map_fd, &xsk->queue_id);
close(xsk->xsks_map_fd);
xsk->qidconf_map_fd = -1;
xsk->xsks_map_fd = -1;
}
static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
@@ -417,10 +503,9 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
if (err)
goto out_map_ids;
for (i = 0; i < prog_info.nr_map_ids; i++) {
if (xsk->qidconf_map_fd != -1 && xsk->xsks_map_fd != -1)
break;
xsk->xsks_map_fd = -1;
for (i = 0; i < prog_info.nr_map_ids; i++) {
fd = bpf_map_get_fd_by_id(map_ids[i]);
if (fd < 0)
continue;
@@ -431,11 +516,6 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
continue;
}
if (!strcmp(map_info.name, "qidconf_map")) {
xsk->qidconf_map_fd = fd;
continue;
}
if (!strcmp(map_info.name, "xsks_map")) {
xsk->xsks_map_fd = fd;
continue;
@@ -445,40 +525,18 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
}
err = 0;
if (xsk->qidconf_map_fd < 0 || xsk->xsks_map_fd < 0) {
if (xsk->xsks_map_fd == -1)
err = -ENOENT;
xsk_delete_bpf_maps(xsk);
}
out_map_ids:
free(map_ids);
return err;
}
static void xsk_clear_bpf_maps(struct xsk_socket *xsk)
{
int qid = false;
bpf_map_update_elem(xsk->qidconf_map_fd, &xsk->queue_id, &qid, 0);
bpf_map_delete_elem(xsk->xsks_map_fd, &xsk->queue_id);
}
static int xsk_set_bpf_maps(struct xsk_socket *xsk)
{
int qid = true, fd = xsk->fd, err;
err = bpf_map_update_elem(xsk->qidconf_map_fd, &xsk->queue_id, &qid, 0);
if (err)
goto out;
err = bpf_map_update_elem(xsk->xsks_map_fd, &xsk->queue_id, &fd, 0);
if (err)
goto out;
return 0;
out:
xsk_clear_bpf_maps(xsk);
return err;
return bpf_map_update_elem(xsk->xsks_map_fd, &xsk->queue_id,
&xsk->fd, 0);
}
static int xsk_setup_xdp_prog(struct xsk_socket *xsk)
@@ -497,26 +555,30 @@ static int xsk_setup_xdp_prog(struct xsk_socket *xsk)
return err;
err = xsk_load_xdp_prog(xsk);
if (err)
goto out_maps;
if (err) {
xsk_delete_bpf_maps(xsk);
return err;
}
} else {
xsk->prog_fd = bpf_prog_get_fd_by_id(prog_id);
if (xsk->prog_fd < 0)
return -errno;
err = xsk_lookup_bpf_maps(xsk);
if (err)
goto out_load;
if (err) {
close(xsk->prog_fd);
return err;
}
}
err = xsk_set_bpf_maps(xsk);
if (err)
goto out_load;
if (xsk->rx)
err = xsk_set_bpf_maps(xsk);
if (err) {
xsk_delete_bpf_maps(xsk);
close(xsk->prog_fd);
return err;
}
return 0;
out_load:
close(xsk->prog_fd);
out_maps:
xsk_delete_bpf_maps(xsk);
return err;
}
int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
@@ -528,21 +590,26 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
struct sockaddr_xdp sxdp = {};
struct xdp_mmap_offsets off;
struct xsk_socket *xsk;
socklen_t optlen;
int err;
if (!umem || !xsk_ptr || !rx || !tx)
if (!umem || !xsk_ptr || !(rx || tx))
return -EFAULT;
if (umem->refcount) {
pr_warning("Error: shared umems not supported by libbpf.\n");
return -EBUSY;
}
xsk = calloc(1, sizeof(*xsk));
if (!xsk)
return -ENOMEM;
err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
if (err)
goto out_xsk_alloc;
if (umem->refcount &&
!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
pr_warn("Error: shared umems not supported by libbpf supplied XDP program.\n");
err = -EBUSY;
goto out_xsk_alloc;
}
if (umem->refcount++ > 0) {
xsk->fd = socket(AF_XDP, SOCK_RAW, 0);
if (xsk->fd < 0) {
@@ -561,11 +628,8 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
err = -errno;
goto out_socket;
}
strncpy(xsk->ifname, ifname, IFNAMSIZ);
err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
if (err)
goto out_socket;
memcpy(xsk->ifname, ifname, IFNAMSIZ - 1);
xsk->ifname[IFNAMSIZ - 1] = '\0';
if (rx) {
err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
@@ -586,19 +650,17 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
}
}
optlen = sizeof(off);
err = getsockopt(xsk->fd, SOL_XDP, XDP_MMAP_OFFSETS, &off, &optlen);
err = xsk_get_mmap_offsets(xsk->fd, &off);
if (err) {
err = -errno;
goto out_socket;
}
if (rx) {
rx_map = xsk_mmap(NULL, off.rx.desc +
xsk->config.rx_size * sizeof(struct xdp_desc),
PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_POPULATE,
xsk->fd, XDP_PGOFF_RX_RING);
rx_map = mmap(NULL, off.rx.desc +
xsk->config.rx_size * sizeof(struct xdp_desc),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE,
xsk->fd, XDP_PGOFF_RX_RING);
if (rx_map == MAP_FAILED) {
err = -errno;
goto out_socket;
@@ -608,16 +670,16 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
rx->size = xsk->config.rx_size;
rx->producer = rx_map + off.rx.producer;
rx->consumer = rx_map + off.rx.consumer;
rx->flags = rx_map + off.rx.flags;
rx->ring = rx_map + off.rx.desc;
}
xsk->rx = rx;
if (tx) {
tx_map = xsk_mmap(NULL, off.tx.desc +
xsk->config.tx_size * sizeof(struct xdp_desc),
PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_POPULATE,
xsk->fd, XDP_PGOFF_TX_RING);
tx_map = mmap(NULL, off.tx.desc +
xsk->config.tx_size * sizeof(struct xdp_desc),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE,
xsk->fd, XDP_PGOFF_TX_RING);
if (tx_map == MAP_FAILED) {
err = -errno;
goto out_mmap_rx;
@@ -627,6 +689,7 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
tx->size = xsk->config.tx_size;
tx->producer = tx_map + off.tx.producer;
tx->consumer = tx_map + off.tx.consumer;
tx->flags = tx_map + off.tx.flags;
tx->ring = tx_map + off.tx.desc;
tx->cached_cons = xsk->config.tx_size;
}
@@ -635,7 +698,12 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
sxdp.sxdp_family = PF_XDP;
sxdp.sxdp_ifindex = xsk->ifindex;
sxdp.sxdp_queue_id = xsk->queue_id;
sxdp.sxdp_flags = xsk->config.bind_flags;
if (umem->refcount > 1) {
sxdp.sxdp_flags = XDP_SHARED_UMEM;
sxdp.sxdp_shared_umem_fd = umem->fd;
} else {
sxdp.sxdp_flags = xsk->config.bind_flags;
}
err = bind(xsk->fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
if (err) {
@@ -643,8 +711,7 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
goto out_mmap_tx;
}
xsk->qidconf_map_fd = -1;
xsk->xsks_map_fd = -1;
xsk->prog_fd = -1;
if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
err = xsk_setup_xdp_prog(xsk);
@@ -674,7 +741,6 @@ out_xsk_alloc:
int xsk_umem__delete(struct xsk_umem *umem)
{
struct xdp_mmap_offsets off;
socklen_t optlen;
int err;
if (!umem)
@@ -683,8 +749,7 @@ int xsk_umem__delete(struct xsk_umem *umem)
if (umem->refcount)
return -EBUSY;
optlen = sizeof(off);
err = getsockopt(umem->fd, SOL_XDP, XDP_MMAP_OFFSETS, &off, &optlen);
err = xsk_get_mmap_offsets(umem->fd, &off);
if (!err) {
munmap(umem->fill->ring - off.fr.desc,
off.fr.desc + umem->config.fill_size * sizeof(__u64));
@@ -702,17 +767,17 @@ void xsk_socket__delete(struct xsk_socket *xsk)
{
size_t desc_sz = sizeof(struct xdp_desc);
struct xdp_mmap_offsets off;
socklen_t optlen;
int err;
if (!xsk)
return;
xsk_clear_bpf_maps(xsk);
xsk_delete_bpf_maps(xsk);
if (xsk->prog_fd != -1) {
xsk_delete_bpf_maps(xsk);
close(xsk->prog_fd);
}
optlen = sizeof(off);
err = getsockopt(xsk->fd, SOL_XDP, XDP_MMAP_OFFSETS, &off, &optlen);
err = xsk_get_mmap_offsets(xsk->fd, &off);
if (!err) {
if (xsk->rx) {
munmap(xsk->rx->ring - off.rx.desc,

View File

@@ -32,6 +32,7 @@ struct name { \
__u32 *producer; \
__u32 *consumer; \
void *ring; \
__u32 *flags; \
}
DEFINE_XSK_RING(xsk_ring_prod);
@@ -76,6 +77,11 @@ xsk_ring_cons__rx_desc(const struct xsk_ring_cons *rx, __u32 idx)
return &descs[idx & rx->mask];
}
static inline int xsk_ring_prod__needs_wakeup(const struct xsk_ring_prod *r)
{
return *r->flags & XDP_RING_NEED_WAKEUP;
}
static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb)
{
__u32 free_entries = r->cached_cons - r->cached_prod;
@@ -162,20 +168,37 @@ static inline void *xsk_umem__get_data(void *umem_area, __u64 addr)
return &((char *)umem_area)[addr];
}
static inline __u64 xsk_umem__extract_addr(__u64 addr)
{
return addr & XSK_UNALIGNED_BUF_ADDR_MASK;
}
static inline __u64 xsk_umem__extract_offset(__u64 addr)
{
return addr >> XSK_UNALIGNED_BUF_OFFSET_SHIFT;
}
static inline __u64 xsk_umem__add_offset_to_addr(__u64 addr)
{
return xsk_umem__extract_addr(addr) + xsk_umem__extract_offset(addr);
}
LIBBPF_API int xsk_umem__fd(const struct xsk_umem *umem);
LIBBPF_API int xsk_socket__fd(const struct xsk_socket *xsk);
#define XSK_RING_CONS__DEFAULT_NUM_DESCS 2048
#define XSK_RING_PROD__DEFAULT_NUM_DESCS 2048
#define XSK_UMEM__DEFAULT_FRAME_SHIFT 11 /* 2048 bytes */
#define XSK_UMEM__DEFAULT_FRAME_SHIFT 12 /* 4096 bytes */
#define XSK_UMEM__DEFAULT_FRAME_SIZE (1 << XSK_UMEM__DEFAULT_FRAME_SHIFT)
#define XSK_UMEM__DEFAULT_FRAME_HEADROOM 0
#define XSK_UMEM__DEFAULT_FLAGS 0
struct xsk_umem_config {
__u32 fill_size;
__u32 comp_size;
__u32 frame_size;
__u32 frame_headroom;
__u32 flags;
};
/* Flags for the libbpf_flags field. */
@@ -195,6 +218,16 @@ LIBBPF_API int xsk_umem__create(struct xsk_umem **umem,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
LIBBPF_API int xsk_umem__create_v0_0_2(struct xsk_umem **umem,
void *umem_area, __u64 size,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
LIBBPF_API int xsk_umem__create_v0_0_4(struct xsk_umem **umem,
void *umem_area, __u64 size,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
LIBBPF_API int xsk_socket__create(struct xsk_socket **xsk,
const char *ifname, __u32 queue_id,
struct xsk_umem *umem,

View File

@@ -13,6 +13,10 @@ function info() {
echo -e "\033[33;1m$1\033[0m"
}
function error() {
echo -e "\033[31;1m$1\033[0m"
}
function docker_exec() {
docker exec $ENV_VARS -it $CONT_NAME "$@"
}
@@ -26,43 +30,43 @@ for phase in "${PHASES[@]}"; do
SETUP)
info "Setup phase"
info "Using Debian $DEBIAN_RELEASE"
docker pull debian:$DEBIAN_RELEASE
docker pull debian:$DEBIAN_RELEASE
info "Starting container $CONT_NAME"
$DOCKER_RUN -v $REPO_ROOT:/build:rw \
-w /build --privileged=true --name $CONT_NAME \
-dit --net=host debian:$DEBIAN_RELEASE /bin/bash
-dit --net=host debian:$DEBIAN_RELEASE /bin/bash
docker_exec bash -c "echo deb-src http://deb.debian.org/debian $DEBIAN_RELEASE main >>/etc/apt/sources.list"
docker_exec apt-get -y update
docker_exec apt-get -y build-dep libelf-dev
docker_exec apt-get -y install libelf-dev
docker_exec apt-get -y install "${ADDITIONAL_DEPS[@]}"
;;
RUN|RUN_CLANG|RUN_GCC8)
if [[ "$phase" = "RUN_CLANG" ]]; then
RUN|RUN_CLANG|RUN_GCC8|RUN_ASAN|RUN_CLANG_ASAN|RUN_GCC8_ASAN)
if [[ "$phase" = *"CLANG"* ]]; then
ENV_VARS="-e CC=clang -e CXX=clang++"
CC="clang"
elif [[ "$phase" = "RUN_GCC8" ]]; then
elif [[ "$phase" = *"GCC8"* ]]; then
ENV_VARS="-e CC=gcc-8 -e CXX=g++-8"
CC="gcc-8"
else
CFLAGS="${CFLAGS} -Wno-stringop-truncation"
fi
docker_exec mkdir build
docker_exec ${CC:-cc} --version
docker_exec make CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
docker_exec rm -rf build
;;
RUN_ASAN|RUN_CLANG_ASAN|RUN_GCC8_ASAN)
if [[ "$phase" = "RUN_CLANG_ASAN" ]]; then
ENV_VARS="-e CC=clang -e CXX=clang++"
CC="clang"
elif [[ "$phase" = "RUN_GCC8_ASAN" ]]; then
ENV_VARS="-e CC=gcc-8 -e CXX=g++-8"
CC="gcc-8"
if [[ "$phase" = *"ASAN"* ]]; then
CFLAGS="${CFLAGS} -fsanitize=address,undefined"
fi
CFLAGS="${CFLAGS} -fsanitize=address,undefined"
docker_exec mkdir build
docker_exec mkdir build install
docker_exec ${CC:-cc} --version
info "build"
docker_exec make CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
docker_exec rm -rf build
info "ldd build/libbpf.so:"
docker_exec ldd build/libbpf.so
if ! docker_exec ldd build/libbpf.so | grep -q libelf; then
error "No reference to libelf.so in libbpf.so!"
exit 1
fi
info "install"
docker_exec make -C src OBJDIR=../build DESTDIR=../install install
docker_exec rm -rf build install
;;
CLEANUP)
info "Cleanup phase"

27
travis-ci/managers/ubuntu.sh Executable file
View File

@@ -0,0 +1,27 @@
#!/bin/bash
set -e
set -x
RELEASE="bionic"
echo "deb-src http://archive.ubuntu.com/ubuntu/ $RELEASE main restricted universe multiverse" >>/etc/apt/sources.list
apt-get update
apt-get -y build-dep libelf-dev
apt-get install -y libelf-dev pkg-config
source "$(dirname $0)/travis_wait.bash"
cd $REPO_ROOT
CFLAGS="-g -O2 -Werror -Wall -fsanitize=address,undefined"
mkdir build install
cc --version
make CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
ldd build/libbpf.so
if ! ldd build/libbpf.so | grep -q libelf; then
echo "FAIL: No reference to libelf.so in libbpf.so!"
exit 1
fi
make -C src OBJDIR=../build DESTDIR=../install install
rm -rf build install

View File

@@ -1,17 +0,0 @@
#!/bin/bash
set -e
set -x
apt-get update
apt-get -y build-dep libelf-dev
apt-get install -y libelf-dev
source "$(dirname $0)/travis_wait.bash"
cd $REPO_ROOT
CFLAGS="-g -O2 -Werror -Wall -fsanitize=address,undefined"
mkdir build
cc --version
make CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
rm -rf build

View File

@@ -0,0 +1,8 @@
#!/bin/bash
set -eux
GIT_FETCH_DEPTH="${GIT_FETCH_DEPTH}" ${VMTEST_ROOT}/checkout_latest_kernel.sh $1
cd $1
cp ${VMTEST_ROOT}/configs/latest.config .config
make -j $((4*$(nproc))) olddefconfig all

View File

@@ -0,0 +1,20 @@
#!/bin/bash
LIBBPF_PATH="${REPO_ROOT}"
REPO_PATH="travis-ci/vmtest/bpf-next"
make \
CLANG=clang-10 \
LLC=llc-10 \
LLVM_STRIP=llvm-strip-10 \
VMLINUX_BTF="${VMLINUX_BTF}" \
-C "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" \
-j $((4*$(nproc)))
mkdir ${LIBBPF_PATH}/selftests
cp -R "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" \
${LIBBPF_PATH}/selftests
cd ${LIBBPF_PATH}
rm selftests/bpf/.gitignore
git add selftests
blacklist_path="${VMTEST_ROOT}/configs/blacklist"
git add "${blacklist_path}"

View File

@@ -0,0 +1,24 @@
#!/bin/bash
set -eux
CWD=$(pwd)
LIBBPF_PATH=$(pwd)
REPO_PATH=$1
BPF_NEXT_ORIGIN=https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
LINUX_SHA=$(cat ${LIBBPF_PATH}/CHECKPOINT-COMMIT)
echo REPO_PATH = ${REPO_PATH}
echo LINUX_SHA = ${LINUX_SHA}
if [ ! -d "${REPO_PATH}" ]; then
mkdir -p ${REPO_PATH}
cd ${REPO_PATH}
git init
git remote add bpf-next ${BPF_NEXT_ORIGIN}
git fetch --depth ${GIT_FETCH_DEPTH} bpf-next
git reset --hard ${LINUX_SHA}
else
cd ${REPO_PATH}
fi

View File

@@ -0,0 +1,6 @@
INDEX https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/INDEX
libbpf-vmtest-rootfs-2020.01.10.tar.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/libbpf-vmtest-rootfs-2020.01.10.tar.zst
vmlinux-5.5.0-rc6.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinux-5.5.0-rc6.zst
vmlinux-5.5.0.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinux-5.5.0.zst
vmlinuz-5.5.0-rc6 https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-5.5.0-rc6
vmlinuz-5.5.0 https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-5.5.0

View File

@@ -0,0 +1,21 @@
mmap
dctcp
cubic
bpf_tcp_ca
bpf_verif_scale
cgroup_attach
pinning
send_signal_tracepoint_thread
test_syncookie
select_reuseport
send_signal
sockopt_inherit
strobemeta_nounroll2
stacktrace_build_id
tp_attach_query
tcp_rtt
task_fd_query_tp
stacktrace_map
test_global_funcs
skb_ctx
fexit_bpf2bpf

View File

@@ -0,0 +1,18 @@
mmap
dctcp
cubic
bpf_tcp_ca
bpf_verif_scale
cgroup_attach
pinning
send_signal_tracepoint_thread
test_syncookie
select_reuseport
send_signal
sockopt_inherit
strobemeta_nounroll2
stacktrace_build_id
tp_attach_query
tcp_rtt
task_fd_query_tp
stacktrace_map

View File

@@ -0,0 +1,23 @@
mmap
dctcp
cubic
bpf_tcp_ca
bpf_verif_scale
cgroup_attach
pinning
send_signal_tracepoint_thread
test_syncookie
select_reuseport
send_signal
sockopt_inherit
strobemeta_nounroll2
stacktrace_build_id
tp_attach_query
tcp_rtt
task_fd_query_tp
stacktrace_map
fentry
test_overhead
kfree_skb
fexit_stress
fexit_test

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,11 @@
#!/bin/bash
set -eux
GIT_FETCH_DEPTH="${GIT_FETCH_DEPTH}" ${VMTEST_ROOT}/checkout_latest_kernel.sh $1
# Fix runqslower build
# TODO(hex@): remove after the patch is merged from bpf to bpf-next tree
cd $1
wget https://lore.kernel.org/bpf/908498f794661c44dca54da9e09dc0c382df6fcb.1580425879.git.hex@fb.com/t.mbox.gz
gunzip t.mbox.gz
git apply t.mbox

438
travis-ci/vmtest/run.sh Executable file
View File

@@ -0,0 +1,438 @@
#!/bin/bash
set -uo pipefail
trap 'exit 2' ERR
usage () {
USAGE_STRING="usage: $0 [-k KERNELRELEASE|-b DIR] [[-r ROOTFSVERSION] [-fo]|-I] [-Si] [-d DIR] IMG
$0 [-k KERNELRELEASE] -l
$0 -h
Run "${PROJECT_NAME}" tests in a virtual machine.
This exits with status 0 on success, 1 if the virtual machine ran successfully
but tests failed, and 2 if we encountered a fatal error.
This script uses sudo to mount and modify the disk image.
Arguments:
IMG path of virtual machine disk image to create
Versions:
-k, --kernel=KERNELRELEASE
kernel release to test. This is a glob pattern; the
newest (sorted by version number) release that matches
the pattern is used (default: newest available release)
-b, --build DIR use the kernel built in the given directory. This option
cannot be combined with -k
-r, --rootfs=ROOTFSVERSION
version of root filesystem to use (default: newest
available version)
Setup:
-f, --force overwrite IMG if it already exists
-o, --one-shot one-shot mode. By default, this script saves a clean copy
of the downloaded root filesystem image and vmlinux and
makes a copy (reflinked, when possible) for executing the
virtual machine. This allows subsequent runs to skip
downloading these files. If this option is given, the
root filesystem image and vmlinux are always
re-downloaded and are not saved. This option implies -f
-s, --setup-cmd setup commands run on VM boot. Whitespace characters
should be escaped with preceding '\'.
-I, --skip-image skip creating the disk image; use the existing one at
IMG. This option cannot be combined with -r, -f, or -o
-S, --skip-source skip copying the source files and init scripts
Miscellaneous:
-i, --interactive interactive mode. Boot the virtual machine into an
interactive shell instead of automatically running tests
-d, --dir=DIR working directory to use for downloading and caching
files (default: current working directory)
-l, --list list available kernel releases instead of running tests.
The list may be filtered with -k
-h, --help display this help message and exit"
case "$1" in
out)
echo "$USAGE_STRING"
exit 0
;;
err)
echo "$USAGE_STRING" >&2
exit 2
;;
esac
}
TEMP=$(getopt -o 'k:b:r:fos:ISid:lh' --long 'kernel:,build:,rootfs:,force,one-shot,setup-cmd,skip-image,skip-source:,interactive,dir:,list,help' -n "$0" -- "$@")
eval set -- "$TEMP"
unset TEMP
unset KERNELRELEASE
unset BUILDDIR
unset ROOTFSVERSION
unset IMG
unset SETUPCMD
FORCE=0
ONESHOT=0
SKIPIMG=0
SKIPSOURCE=0
APPEND=""
DIR="$PWD"
LIST=0
while true; do
case "$1" in
-k|--kernel)
KERNELRELEASE="$2"
shift 2
;;
-b|--build)
BUILDDIR="$2"
shift 2
;;
-r|--rootfs)
ROOTFSVERSION="$2"
shift 2
;;
-f|--force)
FORCE=1
shift
;;
-o|--one-shot)
ONESHOT=1
FORCE=1
shift
;;
-s|--setup-cmd)
SETUPCMD="$2"
shift 2
;;
-I|--skip-image)
SKIPIMG=1
shift
;;
-S|--skip-source)
SKIPSOURCE=1
shift
;;
-i|--interactive)
APPEND=" single"
shift
;;
-d|--dir)
DIR="$2"
shift 2
;;
-l|--list)
LIST=1
;;
-h|--help)
usage out
;;
--)
shift
break
;;
*)
usage err
;;
esac
done
if [[ -v BUILDDIR ]]; then
if [[ -v KERNELRELEASE ]]; then
usage err
fi
elif [[ ! -v KERNELRELEASE ]]; then
KERNELRELEASE='*'
fi
if [[ $SKIPIMG -ne 0 && ( -v ROOTFSVERSION || $FORCE -ne 0 ) ]]; then
usage err
fi
if (( LIST )); then
if [[ $# -ne 0 || -v BUILDDIR || -v ROOTFSVERSION || $FORCE -ne 0 ||
$SKIPIMG -ne 0 || $SKIPSOURCE -ne 0 || -n $APPEND ]]; then
usage err
fi
else
if [[ $# -ne 1 ]]; then
usage err
fi
IMG="${!OPTIND}"
fi
unset URLS
cache_urls() {
if ! declare -p URLS &> /dev/null; then
# This URL contains a mapping from file names to URLs where
# those files can be downloaded.
declare -gA URLS
while IFS=$'\t' read -r name url; do
URLS["$name"]="$url"
done < <(cat "${VMTEST_ROOT}/configs/INDEX")
fi
}
matching_kernel_releases() {
local pattern="$1"
{
for file in "${!URLS[@]}"; do
if [[ $file =~ ^vmlinux-(.*).zst$ ]]; then
release="${BASH_REMATCH[1]}"
case "$release" in
$pattern)
# sort -V handles rc versions properly
# if we use "~" instead of "-".
echo "${release//-rc/~rc}"
;;
esac
fi
done
} | sort -rV | sed 's/~rc/-rc/g'
}
newest_rootfs_version() {
{
for file in "${!URLS[@]}"; do
if [[ $file =~ ^${PROJECT_NAME}-vmtest-rootfs-(.*)\.tar\.zst$ ]]; then
echo "${BASH_REMATCH[1]}"
fi
done
} | sort -rV | head -1
}
download() {
local file="$1"
cache_urls
if [[ ! -v URLS[$file] ]]; then
echo "$file not found" >&2
return 1
fi
echo "Downloading $file..." >&2
curl -Lf "${URLS[$file]}" "${@:2}"
}
set_nocow() {
touch "$@"
chattr +C "$@" >/dev/null 2>&1 || true
}
cp_img() {
set_nocow "$2"
cp --reflink=auto "$1" "$2"
}
create_rootfs_img() {
local path="$1"
set_nocow "$path"
truncate -s 2G "$path"
mkfs.ext4 -q "$path"
}
download_rootfs() {
local rootfsversion="$1"
local dir="$2"
download "${PROJECT_NAME}-vmtest-rootfs-$rootfsversion.tar.zst" |
zstd -d | sudo tar -C "$dir" -x
}
if (( LIST )); then
cache_urls
matching_kernel_releases "$KERNELRELEASE"
exit 0
fi
if [[ $FORCE -eq 0 && $SKIPIMG -eq 0 && -e $IMG ]]; then
echo "$IMG already exists; use -f to overwrite it or -I to reuse it" >&2
exit 1
fi
# Only go to the network if it's actually a glob pattern.
if [[ -v BUILDDIR ]]; then
KERNELRELEASE="$(make -C "$BUILDDIR" -s kernelrelease)"
elif [[ ! $KERNELRELEASE =~ ^([^\\*?[]|\\[*?[])*\\?$ ]]; then
# We need to cache the list of URLs outside of the command
# substitution, which happens in a subshell.
cache_urls
KERNELRELEASE="$(matching_kernel_releases "$KERNELRELEASE" | head -1)"
if [[ -z $KERNELRELEASE ]]; then
echo "No matching kernel release found" >&2
exit 1
fi
fi
if [[ $SKIPIMG -eq 0 && ! -v ROOTFSVERSION ]]; then
cache_urls
ROOTFSVERSION="$(newest_rootfs_version)"
fi
echo "Kernel release: $KERNELRELEASE" >&2
if (( SKIPIMG )); then
echo "Not extracting root filesystem" >&2
else
echo "Root filesystem version: $ROOTFSVERSION" >&2
fi
echo "Disk image: $IMG" >&2
tmp=
ARCH_DIR="$DIR/x86_64"
mkdir -p "$ARCH_DIR"
mnt="$(mktemp -d -p "$DIR" mnt.XXXXXXXXXX)"
cleanup() {
if [[ -n $tmp ]]; then
rm -f "$tmp" || true
fi
if mountpoint -q "$mnt"; then
sudo umount "$mnt" || true
fi
if [[ -d "$mnt" ]]; then
rmdir "$mnt" || true
fi
}
trap cleanup EXIT
if [[ -v BUILDDIR ]]; then
vmlinuz="$BUILDDIR/$(make -C "$BUILDDIR" -s image_name)"
else
vmlinuz="${ARCH_DIR}/vmlinuz-${KERNELRELEASE}"
if [[ ! -e $vmlinuz ]]; then
tmp="$(mktemp "$vmlinuz.XXX.part")"
download "vmlinuz-${KERNELRELEASE}" -o "$tmp"
mv "$tmp" "$vmlinuz"
tmp=
fi
fi
# Mount and set up the rootfs image.
if (( ONESHOT )); then
rm -f "$IMG"
create_rootfs_img "$IMG"
sudo mount -o loop "$IMG" "$mnt"
download_rootfs "$ROOTFSVERSION" "$mnt"
else
if (( ! SKIPIMG )); then
rootfs_img="${ARCH_DIR}/${PROJECT_NAME}-vmtest-rootfs-${ROOTFSVERSION}.img"
if [[ ! -e $rootfs_img ]]; then
tmp="$(mktemp "$rootfs_img.XXX.part")"
set_nocow "$tmp"
truncate -s 2G "$tmp"
mkfs.ext4 -q "$tmp"
sudo mount -o loop "$tmp" "$mnt"
download_rootfs "$ROOTFSVERSION" "$mnt"
sudo umount "$mnt"
mv "$tmp" "$rootfs_img"
tmp=
fi
rm -f "$IMG"
cp_img "$rootfs_img" "$IMG"
fi
sudo mount -o loop "$IMG" "$mnt"
fi
# Install vmlinux.
vmlinux="$mnt/boot/vmlinux-${KERNELRELEASE}"
if [[ -v BUILDDIR || $ONESHOT -eq 0 ]]; then
if [[ -v BUILDDIR ]]; then
source_vmlinux="${BUILDDIR}/vmlinux"
else
source_vmlinux="${ARCH_DIR}/vmlinux-${KERNELRELEASE}"
if [[ ! -e $source_vmlinux ]]; then
tmp="$(mktemp "$source_vmlinux.XXX.part")"
download "vmlinux-${KERNELRELEASE}.zst" | zstd -dfo "$tmp"
mv "$tmp" "$source_vmlinux"
tmp=
fi
fi
echo "Copying vmlinux..." >&2
sudo rsync -cp --chmod 0644 "$source_vmlinux" "$vmlinux"
else
# We could use "sudo zstd -o", but let's not run zstd as root with
# input from the internet.
download "vmlinux-${KERNELRELEASE}.zst" |
zstd -d | sudo tee "$vmlinux" > /dev/null
sudo chmod 644 "$vmlinux"
fi
LIBBPF_PATH="${REPO_ROOT}" \
REPO_PATH="travis-ci/vmtest/bpf-next" \
VMTEST_ROOT="${VMTEST_ROOT}" \
VMLINUX_BTF=${vmlinux} ${VMTEST_ROOT}/build_selftests.sh
if (( SKIPSOURCE )); then
echo "Not copying source files..." >&2
else
echo "Copying source files..." >&2
# Copy the source files in.
sudo mkdir -p -m 0755 "$mnt/${PROJECT_NAME}"
{
if [[ -e .git ]]; then
git ls-files -z
else
tr '\n' '\0' < "${PROJECT_NAME}.egg-info/SOURCES.txt"
fi
} | sudo rsync --files-from=- -0cpt . "$mnt/${PROJECT_NAME}"
fi
setup_script="#!/bin/sh
echo 'Skipping setup commands'
echo 0 > /exitstatus
chmod 644 /exitstatus"
# Create the init scripts.
if [[ ! -z SETUPCMD ]]; then
# Unescape whitespace characters.
setup_cmd=$(sed 's/\(\\\)\([[:space:]]\)/\2/g' <<< "${SETUPCMD}")
kernel="${KERNELRELEASE}"
if [[ -v BUILDDIR ]]; then kernel='latest'; fi
setup_envvars="export KERNEL=${kernel}"
setup_script=$(printf "#!/bin/sh
set -e
echo 'Running setup commands'
%s
%s
echo $? > /exitstatus
chmod 644 /exitstatus" "${setup_envvars}" "${setup_cmd}")
fi
echo "${setup_script}" | sudo tee "$mnt/etc/rcS.d/S50-run-tests" > /dev/null
sudo chmod 755 "$mnt/etc/rcS.d/S50-run-tests"
poweroff_script="#!/bin/sh
poweroff"
echo "${poweroff_script}" | sudo tee "$mnt/etc/rcS.d/S99-poweroff" > /dev/null
sudo chmod 755 "$mnt/etc/rcS.d/S99-poweroff"
sudo umount "$mnt"
echo "Starting virtual machine..." >&2
qemu-system-x86_64 -nodefaults -display none -serial mon:stdio \
-cpu kvm64 -enable-kvm -smp "$(nproc)" -m 2G \
-drive file="$IMG",format=raw,index=1,media=disk,if=virtio,cache=none \
-kernel "$vmlinuz" -append "root=/dev/vda rw console=ttyS0,115200$APPEND"
sudo mount -o loop "$IMG" "$mnt"
if exitstatus="$(cat "$mnt/exitstatus" 2>/dev/null)"; then
printf '\nTests exit status: %s\n' "$exitstatus" >&2
else
printf '\nCould not read tests exit status\n' >&2
exitstatus=1
fi
sudo umount "$mnt"
exit "$exitstatus"

View File

@@ -0,0 +1,19 @@
#!/bin/bash
set -eux
configs_path='libbpf/travis-ci/vmtest/configs'
blacklist_path="$configs_path/blacklist/BLACKLIST-${KERNEL}"
if [[ -s "${blacklist_path}" ]]; then
BLACKLIST=$(cat "${blacklist_path}" | tr '\n' ',')
fi
whitelist_path="$configs_path/whitelist/WHITELIST-${KERNEL}"
if [[ -s "${whitelist_path}" ]]; then
WHITELIST=$(cat "${whitelist_path}" | tr '\n' ',')
fi
cd libbpf/selftests/bpf
echo TEST_PROGS
./test_progs ${BLACKLIST:+-b$BLACKLIST} ${WHITELIST:+-t$WHITELIST}

View File

@@ -0,0 +1,11 @@
#!/bin/sh
# An example of a script run on VM boot.
# To execute it in TravisCI set VMTEST_SETUPCMD env var of .travis.yml in
# libbpf root folder, e.g.
# VMTEST_SETUPCMD="./${PROJECT_NAME}/travis-ci/vmtest/setup_example.sh"
if [ ! -z "${PROJECT_NAME}" ]; then
echo "Running ${PROJECT_NAME} setup scripts..."
fi
echo "Hello, ${USER}!"