It's quite common on some systems to have more CPUs enlisted as "possible",
than there are (and could ever be) present/online CPUs. In such cases,
perf_buffer creationg will fail due to inability to create perf event on
missing CPU with error like this:
libbpf: failed to open perf buffer event on cpu #16: No such device
This patch fixes the logic of perf_buffer__new() to ignore CPUs that are
missing or currently offline. In rare cases where user explicitly listed
specific CPUs to connect to, behavior is unchanged: libbpf will try to open
perf event buffer on specified CPU(s) anyways.
Fixes: fb84b8224655 ("libbpf: add perf buffer API")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212013609.1691168-1-andriin@fb.com
This logic is re-used for parsing a set of online CPUs. Having it as an
isolated piece of code working with input string makes it conveninent to test
this logic as well. While refactoring, also improve the robustness of original
implementation.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212013548.1690564-1-andriin@fb.com
Allow loading BPF object files that contain SK_REUSEPORT programs without
having to manually set the program type before load if the the section name
is set to "sk_reuseport".
Makes user-space code needed to load SK_REUSEPORT BPF program more concise.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212102259.418536-2-jakub@cloudflare.com
A follow up of [1]
Travis CI stages use default phases when no override provided.
This leads to Coverity scan stage fail due to execuing the default
before_script: phase of VMTEST.
Fix this with an explicit override with empty value.
[1] https://github.com/libbpf/libbpf/pull/108
Extend continuous integration tests by adding testing against various kernel
versions.
The code is based on vmtest CI scripts implemented by osandov@
for drgn [1] with the following modifications:
- The downloadables are stored in Amazon S3 cloud indexed in [2]
- `--setup-cmd` command line option is added to vmtest/run.sh so
setup commands run on VM boot can be set in e.g. `.travis.yml`
- Travis build matrix [2] is introduced for VM tests so VM tests are
followed by the existing CI tests. The matrix has `KERNEL` and
`VMTEST_SETUPCMD` dimensions.
- Minor style fixes.
The vmtest extention code is located in travis-ci/vmtest and contains
`run.sh` and `setup_example.sh`
- `run.sh` is responsible for the vmtest workflow: downloading vmlinux
and rootfs image from the cloud, fs mounting, syncing libbpf sources
to the image, setting up scripts run on VM boot, starting VM using
QEMU.
`run.sh` covers more use cases than a script for a job run in TravisCI,
e.g. int can build a kernel w/ `--build` option.
- `setup_example.sh` is an example of a script run in VM which can be
modified to e.g. run actual libbpf tests. A setup script should have
executable permission.
To set up a new kernel version for a test:
1) upload vmlinuz.* and vmlinux.*\.zst to Amazon S3 store
located at [4];
2) modify INDEX [2] file.
[1] https://github.com/osandov/drgn
[2] https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/INDEX
[3] https://docs.travis-ci.com/user/build-matrix
[4] https://libbpf-vmtest.s3-us-west-1.amazonaws.com/
On ppc64le __u64 and __s64 are defined as long int and unsigned long int,
respectively. This causes compiler to emit warning when %lld/%llu are used to
printf 64-bit numbers. Fix this by casting to size_t/ssize_t with %zu and %zd
format specifiers, respectively.
v1->v2:
- use size_t/ssize_t instead of custom typedefs (Martin).
Fixes: 1f8e2bcb2cd5 ("libbpf: Refactor relocation handling")
Fixes: abd29c931459 ("libbpf: allow specifying map definitions using BTF")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191212171918.638010-1-andriin@fb.com
These were added to the kernel repo, but not in Github. However, they are
useful for browsing the source in Github while prototyping new features and
compiling them into userspace utilities.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
For BPF_PROG_TYPE_TRACING, the bpf_prog's ctx is an array of u64.
This patch borrows the idea from BPF_CALL_x in filter.h to
convert a u64 to the arg type of the traced function.
The new BPF_TRACE_x has an arg to specify the return type of a bpf_prog.
It will be used in the future TCP-ops bpf_prog that may return "void".
The new macros are defined in the new header file "bpf_trace_helpers.h".
It is under selftests/bpf/ for now. It could be moved to libbpf later
after seeing more upcoming non-tracing use cases.
The tests are changed to use these new macros also. Hence,
the k[s]u8/16/32/64 are no longer needed and they are removed
from the bpf_helpers.h.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191123202504.1502696-1-kafai@fb.com
The st_value field is a 64-bit value and causing this error on 32-bit arches:
In file included from libbpf.c:52:
libbpf.c: In function 'bpf_program__record_reloc':
libbpf_internal.h:59:22: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'Elf64_Addr' {aka 'const long long unsigned int'} [-Werror=format=]
Fix it with (__u64) cast.
Fixes: 1f8e2bcb2cd5 ("libbpf: Refactor relocation handling")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Similarly to a0d7da26ce86 ("libbpf: Fix call relocation offset calculation
bug"), relocations against global variables need to take into account
referenced symbol's st_value, which holds offset into a corresponding data
section (and, subsequently, offset into internal backing map). For static
variables this offset is always zero and data offset is completely described
by respective instruction's imm field.
Convert a bunch of selftests to global variables. Previously they were relying
on `static volatile` trick to ensure Clang doesn't inline static variables,
which with global variables is not necessary anymore.
Fixes: 393cdfbee809 ("libbpf: Support initialized global variables")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191127200651.1381348-1-andriin@fb.com
Libbpf is dual-licensed under BSD-2-Clause and LGPL-2.1 licenses. Include
their texts in the root of the repo.
Suggestes-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
For BPF_PROG_TYPE_TRACING, the bpf_prog's ctx is an array of u64.
This patch borrows the idea from BPF_CALL_x in filter.h to
convert a u64 to the arg type of the traced function.
The new BPF_TRACE_x has an arg to specify the return type of a bpf_prog.
It will be used in the future TCP-ops bpf_prog that may return "void".
The new macros are defined in the new header file "bpf_trace_helpers.h".
It is under selftests/bpf/ for now. It could be moved to libbpf later
after seeing more upcoming non-tracing use cases.
The tests are changed to use these new macros also. Hence,
the k[s]u8/16/32/64 are no longer needed and they are removed
from the bpf_helpers.h.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191123202504.1502696-1-kafai@fb.com
If bpf_object__open_file() gets path like "some/dir/obj.o", it should derive
BPF object's name as "obj" (unless overriden through opts->object_name).
Instead, due to using `path` as a fallback value for opts->obj_name, path is
used as is for object name, so for above example BPF object's name will be
verbatim "some/dir/obj", which leads to all sorts of troubles, especially when
internal maps are concern (they are using up to 8 characters of object name).
Fix that by ensuring object_name stays NULL, unless overriden.
Fixes: 291ee02b5e40 ("libbpf: Refactor bpf_object__open APIs to use common opts")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191122003527.551556-1-andriin@fb.com
Initialized global variables are no different in ELF from static variables,
and don't require any extra support from libbpf. But they are matching
semantics of global data (backed by BPF maps) more closely, preventing
LLVM/Clang from aggressively inlining constant values and not requiring
volatile incantations to prevent those. This patch enables global variables.
It still disables uninitialized variables, which will be put into special COM
(common) ELF section, because BPF doesn't allow uninitialized data to be
accessed.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191121070743.1309473-5-andriin@fb.com
Relocation handling code is convoluted and unnecessarily deeply nested. Split
out per-relocation logic into separate function. Also refactor the logic to be
more a sequence of per-relocation type checks and processing steps, making it
simpler to follow control flow. This makes it easier to further extends it to
new kinds of relocations (e.g., support for extern variables).
This patch also makes relocation's section verification more robust.
Previously relocations against not yet supported externs were silently ignored
because of obj->efile.text_shndx was zero, when all BPF programs had custom
section names and there was no .text section. Also, invalid LDIMM64 relocations
against non-map sections were passed through, if they were pointing to a .text
section (or 0, which is invalid section). All these bugs are fixed within this
refactoring and checks are made more appropriate for each type of relocation.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191121070743.1309473-3-andriin@fb.com
Add ability to memory-map contents of BPF array map. This is extremely useful
for working with BPF global data from userspace programs. It allows to avoid
typical bpf_map_{lookup,update}_elem operations, improving both performance
and usability.
There had to be special considerations for map freezing, to avoid having
writable memory view into a frozen map. To solve this issue, map freezing and
mmap-ing is happening under mutex now:
- if map is already frozen, no writable mapping is allowed;
- if map has writable memory mappings active (accounted in map->writecnt),
map freezing will keep failing with -EBUSY;
- once number of writable memory mappings drops to zero, map freezing can be
performed again.
Only non-per-CPU plain arrays are supported right now. Maps with spinlocks
can't be memory mapped either.
For BPF_F_MMAPABLE array, memory allocation has to be done through vmalloc()
to be mmap()'able. We also need to make sure that array data memory is
page-sized and page-aligned, so we over-allocate memory in such a way that
struct bpf_array is at the end of a single page of memory with array->value
being aligned with the start of the second page. On deallocation we need to
accomodate this memory arrangement to free vmalloc()'ed memory correctly.
One important consideration regarding how memory-mapping subsystem functions.
Memory-mapping subsystem provides few optional callbacks, among them open()
and close(). close() is called for each memory region that is unmapped, so
that users can decrease their reference counters and free up resources, if
necessary. open() is *almost* symmetrical: it's called for each memory region
that is being mapped, **except** the very first one. So bpf_map_mmap does
initial refcnt bump, while open() will do any extra ones after that. Thus
number of close() calls is equal to number of open() calls plus one more.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-4-andriin@fb.com
The main reason why this is necessary is that gcc 5.x on Xenial doesn't
support ASan on s390x. Bumping the release to Bionic with gcc 7.x allows
us to build libbpf on s390x with ASan without issues.
Currently, libbpf only provides a function to get a single ID for the XDP
program attached to the interface. However, it can be useful to get the
full set of program IDs attached, along with the attachment mode, in one
go. Add a new getter function to support this, using an extendible
structure to carry the information. Express the old bpf_get_link_id()
function in terms of the new function.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157333185164.88376.7520653040667637246.stgit@toke.dk
When loading an eBPF program, libbpf overrides the return code for EPERM
errors instead of returning it to the caller. This makes it hard to figure
out what went wrong on load.
In particular, EPERM is returned when the system rlimit is too low to lock
the memory required for the BPF program. Previously, this was somewhat
obscured because the rlimit error would be hit on map creation (which does
return it correctly). However, since maps can now be reused, object load
can proceed all the way to loading programs without hitting the error;
propagating it even in this case makes it possible for the caller to react
appropriately (and, e.g., attempt to raise the rlimit before retrying).
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333184946.88376.11768171652794234561.stgit@toke.dk
Since the automatic map-pinning happens during load, it will leave pinned
maps around if the load fails at a later stage. Fix this by unpinning any
pinned maps on cleanup. To avoid unpinning pinned maps that were reused
rather than newly pinned, add a new boolean property on struct bpf_map to
keep track of whether that map was reused or not; and only unpin those maps
that were not reused.
Fixes: 57a00f41644f ("libbpf: Add auto-pinning of maps when loading BPF objects")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333184731.88376.9992935027056165873.stgit@toke.dk
If we get ELF file with "maps" section, but no symbols pointing to it, we'll
end up with division by zero. Add check against this situation and exit early
with error. Found by Coverity scan against Github libbpf sources.
Fixes: bf82927125dd ("libbpf: refactor map initialization")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-6-andriin@fb.com
Perform size check always in btf__resolve_size. Makes the logic a bit more
robust against corrupted BTF and silences LGTM/Coverity complaining about
always true (size < 0) check.
Fixes: 69eaab04c675 ("btf: extract BTF type size calculation")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-5-andriin@fb.com
Coverity scan against Github libbpf code found the issue of not freeing memory and
leaving already freed memory still referenced from bpf_program. Fix it by
re-assigning successfully reallocated memory sooner.
Fixes: 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-2-andriin@fb.com
Fix issue reported by static analysis (Coverity). If bpf_prog_get_fd_by_id()
fails, xsk_lookup_bpf_maps() will fail as well and clean-up code will attempt
close() with fd=-1. Fix by checking bpf_prog_get_fd_by_id() return result and
exiting early.
Fixes: 10a13bb40e54 ("libbpf: remove qidconf and better support external bpf programs.")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107054059.313884-1-andriin@fb.com