Compare commits

..

75 Commits

Author SHA1 Message Date
Andrii Nakryiko
550aa56dd4 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline commit:   8fc9f8bedf1bdaea48382ae2e3dd558e2b939cee
Checkpoint commit: 66b5f1c439843bcbab01cc7f3854ae2742f3d1e3

Andrii Nakryiko (2):
  libbpf: fix ptr to u64 conversion warning on 32-bit platforms
  libbpf: fix another GCC8 warning for strncpy

Baruch Siach (1):
  bpf: fix uapi bpf_prog_info fields alignment

Mauro Carvalho Chehab (2):
  docs: cgroup-v1: convert docs to ReST and rename to *.rst
  docs: cgroup-v1: add it to the admin-guide book

Stanislav Fomichev (1):
  bpf: sync bpf.h to tools/

 include/uapi/linux/bpf.h | 7 ++++---
 src/libbpf.c             | 4 ++--
 src/xsk.c                | 3 ++-
 3 files changed, 8 insertions(+), 6 deletions(-)

--
2.17.1
2019-07-23 14:26:50 -07:00
Andrii Nakryiko
54facd3fce libbpf: fix another GCC8 warning for strncpy
Similar issue was fixed in cdfc7f888c2a ("libbpf: fix GCC8 warning for
strncpy") already. This one was missed. Fixing now.

Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-23 14:26:50 -07:00
Stanislav Fomichev
1346b5b538 bpf: sync bpf.h to tools/
Update bpf_sock_addr comments to indicate support for 8-byte reads
from user_ip6 and msg_src_ip6.

Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-23 14:26:50 -07:00
Andrii Nakryiko
78d3666065 libbpf: fix ptr to u64 conversion warning on 32-bit platforms
On 32-bit platforms compiler complains about conversion:

libbpf.c: In function ‘perf_event_open_probe’:
libbpf.c:4112:17: error: cast from pointer to integer of different
size [-Werror=pointer-to-int-cast]
  attr.config1 = (uint64_t)(void *)name; /* kprobe_func or uprobe_path */
                 ^

Reported-by: Matt Hart <matthew.hart@linaro.org>
Fixes: b26500274767 ("libbpf: add kprobe/uprobe attach API")
Tested-by: Matt Hart <matthew.hart@linaro.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-23 14:26:50 -07:00
Mauro Carvalho Chehab
ce2eb85588 docs: cgroup-v1: add it to the admin-guide book
Those files belong to the admin guide, so add them.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-23 14:26:50 -07:00
Baruch Siach
6d4104b077 bpf: fix uapi bpf_prog_info fields alignment
Merge commit 1c8c5a9d38f60 ("Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next") undid the
fix from commit 36f9814a494 ("bpf: fix uapi hole for 32 bit compat
applications") by taking the gpl_compatible 1-bit field definition from
commit b85fab0e67b162 ("bpf: Add gpl_compatible flag to struct
bpf_prog_info") as is. That breaks architectures with 16-bit alignment
like m68k. Add 31-bit pad after gpl_compatible to restore alignment of
following fields.

Thanks to Dmitry V. Levin his analysis of this bug history.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-23 14:26:50 -07:00
Mauro Carvalho Chehab
43c14e871c docs: cgroup-v1: convert docs to ReST and rename to *.rst
Convert the cgroup-v1 files to ReST format, in order to
allow a later addition to the admin-guide.

The conversion is actually:
  - add blank lines and identation in order to identify paragraphs;
  - fix tables markups;
  - add some lists markups;
  - mark literal blocks;
  - adjust title markups.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2019-07-23 14:26:50 -07:00
Andrii Nakryiko
e61f4b8269 scripts/sync-kernel.sh: add missing if_xdp.h in one of file lists
if_xdp.h wasn't added to one of few file lists, causing sync content
verification failure.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-07-19 15:12:56 -07:00
Andrii Nakryiko
80a0eca14b libbpf: sanitize VAR to conservative 1-byte INT
If VAR in non-sanitized BTF was size less than 4, converting such VAR
into an INT with size=4 will cause BTF validation failure due to
violationg of STRUCT (into which DATASEC was converted) member size.
Fix by conservatively using size=1.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-07-19 13:01:08 -07:00
Andrii Nakryiko
5a8c675d0a libbpf: fix SIGSEGV when BTF loading fails, but .BTF.ext exists
In case when BTF loading fails despite sanitization, but BPF object has
.BTF.ext loaded as well, we free and null obj->btf, but not
obj->btf_ext. This leads to an attempt to relocate .BTF.ext later on
during bpf_object__load(), which assumes obj->btf is present. This leads
to SIGSEGV on null pointer access. Fix bug by freeing and nulling
obj->btf_ext as well.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-07-19 13:01:08 -07:00
Andrii Nakryiko
45ad862601 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline commit:   2c6a577d719e96c89b1aaa631089cce69b4d0045
Checkpoint commit: 8fc9f8bedf1bdaea48382ae2e3dd558e2b939cee

Andrii Nakryiko (19):
  libbpf: add common min/max macro to libbpf_internal.h
  libbpf: extract BTF loading logic
  libbpf: streamline ELF parsing error-handling
  libbpf: refactor map initialization
  libbpf: identify maps by section index in addition to offset
  libbpf: split initialization and loading of BTF
  libbpf: allow specifying map definitions using BTF
  libbpf: constify getter APIs
  libbpf: fix GCC8 warning for strncpy
  libbpf: make libbpf_strerror_r agnostic to sign of error
  libbpf: introduce concept of bpf_link
  libbpf: add ability to attach/detach BPF program to perf event
  libbpf: add kprobe/uprobe attach API
  libbpf: add tracepoint attach API
  libbpf: add raw tracepoint attach API
  libbpf: capture value in BTF type info for BTF-defined map defs
  libbpf: add perf buffer API
  libbpf: auto-set PERF_EVENT_ARRAY size to number of CPUs
  libbpf: add perf_buffer_ prefix to README

Colin Ian King (1):
  libbpf: fix spelling mistake "conflictling" -> "conflicting"

Daniel Borkmann (2):
  bpf: sync tooling uapi header
  bpf, libbpf: enable recvmsg attach types

Ivan Khoronzhuk (1):
  libbpf: fix max() type mismatch for 32bit

Leo Yan (1):
  bpf, libbpf, smatch: Fix potential NULL pointer dereference

Martynas Pumputis (1):
  bpf: sync BPF_FIB_LOOKUP flag changes with BPF uapi

Maxim Mikityanskiy (3):
  xsk: Add getsockopt XDP_OPTIONS
  libbpf: Support getsockopt XDP_OPTIONS
  xsk: Change the default frame size to 4096 and allow controlling it

Michal Rostecki (1):
  libbpf: Return btf_fd for load_sk_storage_btf

Stanislav Fomichev (4):
  bpf: sync bpf.h to tools/
  libbpf: support sockopt hooks
  bpf/tools: sync bpf.h
  bpf: sync bpf.h to tools/

Vincent Bernat (1):
  bonding: add an option to specify a delay between peer notifications

 include/uapi/linux/bpf.h     |   38 +-
 include/uapi/linux/if_link.h |    1 +
 include/uapi/linux/if_xdp.h  |    8 +
 src/README.rst               |    3 +-
 src/bpf.c                    |    7 +-
 src/bpf_prog_linfo.c         |    5 +-
 src/btf.c                    |    3 -
 src/btf.h                    |    1 +
 src/btf_dump.c               |    3 -
 src/libbpf.c                 | 1670 ++++++++++++++++++++++++++++------
 src/libbpf.h                 |  132 ++-
 src/libbpf.map               |   12 +-
 src/libbpf_internal.h        |   11 +-
 src/libbpf_probes.c          |   14 +-
 src/str_error.c              |    2 +-
 src/xsk.c                    |   15 +-
 src/xsk.h                    |    2 +-
 17 files changed, 1601 insertions(+), 326 deletions(-)

--
2.17.1
2019-07-08 12:52:46 -07:00
Stanislav Fomichev
5e4da17d43 bpf: sync bpf.h to tools/
Sync user_ip6 & msg_src_ip6 comments.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
2d8ab5cf2c libbpf: add perf_buffer_ prefix to README
perf_buffer "object" is part of libbpf API now, add it to the list of
libbpf function prefixes.

Suggested-by: Daniel Borkman <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
fe1ce6bd74 libbpf: auto-set PERF_EVENT_ARRAY size to number of CPUs
For BPF_MAP_TYPE_PERF_EVENT_ARRAY typically correct size is number of
possible CPUs. This is impossible to specify at compilation time. This
change adds automatic setting of PERF_EVENT_ARRAY size to number of
system CPUs, unless non-zero size is specified explicitly. This allows
to adjust size for advanced specific cases, while providing convenient
and logical defaults.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
9007494e6c libbpf: add perf buffer API
BPF_MAP_TYPE_PERF_EVENT_ARRAY map is often used to send data from BPF program
to user space for additional processing. libbpf already has very low-level API
to read single CPU perf buffer, bpf_perf_event_read_simple(), but it's hard to
use and requires a lot of code to set everything up. This patch adds
perf_buffer abstraction on top of it, abstracting setting up and polling
per-CPU logic into simple and convenient API, similar to what BCC provides.

perf_buffer__new() sets up per-CPU ring buffers and updates corresponding BPF
map entries. It accepts two user-provided callbacks: one for handling raw
samples and one for get notifications of lost samples due to buffer overflow.

perf_buffer__new_raw() is similar, but provides more control over how
perf events are set up (by accepting user-provided perf_event_attr), how
they are handled (perf_event_header pointer is passed directly to
user-provided callback), and on which CPUs ring buffers are created
(it's possible to provide a list of CPUs and corresponding map keys to
update). This API allows advanced users fuller control.

perf_buffer__poll() is used to fetch ring buffer data across all CPUs,
utilizing epoll instance.

perf_buffer__free() does corresponding clean up and unsets FDs from BPF map.

All APIs are not thread-safe. User should ensure proper locking/coordination if
used in multi-threaded set up.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
8b82b9c82b libbpf: capture value in BTF type info for BTF-defined map defs
Change BTF-defined map definitions to capture compile-time integer
values as part of BTF type definition, to avoid split of key/value type
information and actual type/size/flags initialization for maps.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
9a361d2fdd libbpf: add raw tracepoint attach API
Add a wrapper utilizing bpf_link "infrastructure" to allow attaching BPF
programs to raw tracepoints.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
01272d3040 libbpf: add tracepoint attach API
Allow attaching BPF programs to kernel tracepoint BPF hooks specified by
category and name.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
0a216f37f8 libbpf: add kprobe/uprobe attach API
Add ability to attach to kernel and user probes and retprobes.
Implementation depends on perf event support for kprobes/uprobes.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
04f987a89a libbpf: add ability to attach/detach BPF program to perf event
bpf_program__attach_perf_event allows to attach BPF program to existing
perf event hook, providing most generic and most low-level way to attach BPF
programs. It returns struct bpf_link, which should be passed to
bpf_link__destroy to detach and free resources, associated with a link.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
86171433b7 libbpf: introduce concept of bpf_link
bpf_link is an abstraction of an association of a BPF program and one of
many possible BPF attachment points (hooks). This allows to have uniform
interface for detaching BPF programs regardless of the nature of link
and how it was created. Details of creation and setting up of a specific
bpf_link is handled by corresponding attachment methods
(bpf_program__attach_xxx) added in subsequent commits. Once successfully
created, bpf_link has to be eventually destroyed with
bpf_link__destroy(), at which point BPF program is disassociated from
a hook and all the relevant resources are freed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
3e8f8914cb libbpf: make libbpf_strerror_r agnostic to sign of error
It's often inconvenient to switch sign of error when passing it into
libbpf_strerror_r. It's better for it to handle that automatically.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Stanislav Fomichev
960ec9ace6 bpf/tools: sync bpf.h
Sync new bpf_tcp_sock fields and new BPF_PROG_TYPE_SOCK_OPS RTT callback.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Priyaranjan Jha <priyarjha@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Leo Yan
0cc3d9d332 bpf, libbpf, smatch: Fix potential NULL pointer dereference
Based on the following report from Smatch, fix the potential NULL
pointer dereference check:

  tools/lib/bpf/libbpf.c:3493
  bpf_prog_load_xattr() warn: variable dereferenced before check 'attr'
  (see line 3483)

  3479 int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
  3480                         struct bpf_object **pobj, int *prog_fd)
  3481 {
  3482         struct bpf_object_open_attr open_attr = {
  3483                 .file           = attr->file,
  3484                 .prog_type      = attr->prog_type,
                                         ^^^^^^
  3485         };

At the head of function, it directly access 'attr' without checking
if it's NULL pointer. This patch moves the values assignment after
validating 'attr' and 'attr->file'.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
50a63f31b6 libbpf: fix GCC8 warning for strncpy
GCC8 started emitting warning about using strncpy with number of bytes
exactly equal destination size, which is generally unsafe, as can lead
to non-zero terminated string being copied. Use IFNAMSIZ - 1 as number
of bytes to ensure name is always zero-terminated.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Stanislav Fomichev
32a605a9a6 libbpf: support sockopt hooks
Make libbpf aware of new sockopt hooks so it can derive prog type
and hook point from the section names.

Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-08 12:52:46 -07:00
Stanislav Fomichev
5efb454851 bpf: sync bpf.h to tools/
Export new prog type and hook points to the libbpf.

Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-08 12:52:46 -07:00
Maxim Mikityanskiy
e0ee1593fd xsk: Change the default frame size to 4096 and allow controlling it
The typical XDP memory scheme is one packet per page. Change the AF_XDP
frame size in libbpf to 4096, which is the page size on x86, to allow
libbpf to be used with the drivers with the packet-per-page scheme.

Add a command line option -f to xdpsock to allow to specify a custom
frame size.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Maxim Mikityanskiy
421ecf02c8 libbpf: Support getsockopt XDP_OPTIONS
Query XDP_OPTIONS in libbpf to determine if the zero-copy mode is active
or not.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Maxim Mikityanskiy
38b91c640f xsk: Add getsockopt XDP_OPTIONS
Make it possible for the application to determine whether the AF_XDP
socket is running in zero-copy mode. To achieve this, add a new
getsockopt option XDP_OPTIONS that returns flags. The only flag
supported for now is the zero-copy mode indicator.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Ivan Khoronzhuk
f925686015 libbpf: fix max() type mismatch for 32bit
It fixes build error for 32bit caused by type mismatch
size_t/unsigned long.

Fixes: bf82927125dd ("libbpf: refactor map initialization")
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Colin Ian King
be0f832d40 libbpf: fix spelling mistake "conflictling" -> "conflicting"
There are several spelling mistakes in pr_warning messages. Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-08 12:52:46 -07:00
Vincent Bernat
f29b6fd1da bonding: add an option to specify a delay between peer notifications
Currently, gratuitous ARP/ND packets are sent every `miimon'
milliseconds. This commit allows a user to specify a custom delay
through a new option, `peer_notif_delay'.

Like for `updelay' and `downdelay', this delay should be a multiple of
`miimon' to avoid managing an additional work queue. The configuration
logic is copied from `updelay' and `downdelay'. However, the default
value cannot be set using a module parameter: Netlink or sysfs should
be used to configure this feature.

When setting `miimon' to 100 and `peer_notif_delay' to 500, we can
observe the 500 ms delay is respected:

    20:30:19.354693 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
    20:30:19.874892 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
    20:30:20.394919 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
    20:30:20.914963 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28

In bond_mii_monitor(), I have tried to keep the lock logic readable.
The change is due to the fact we cannot rely on a notification to
lower the value of `bond->send_peer_notif' as `NETDEV_NOTIFY_PEERS' is
only triggered once every N times, while we need to decrement the
counter each time.

iproute2 also needs to be updated to be able to specify this new
attribute through `ip link'.

Signed-off-by: Vincent Bernat <vincent@bernat.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
1ed7b6ade1 libbpf: constify getter APIs
Add const qualifiers to bpf_object/bpf_program/bpf_map arguments for
getter APIs. There is no need for them to not be const pointers.

Verified that

make -C tools/lib/bpf
make -C tools/testing/selftests/bpf
make -C tools/perf

all build without warnings.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
ec13b30349 libbpf: allow specifying map definitions using BTF
This patch adds support for a new way to define BPF maps. It relies on
BTF to describe mandatory and optional attributes of a map, as well as
captures type information of key and value naturally. This eliminates
the need for BPF_ANNOTATE_KV_PAIR hack and ensures key/value sizes are
always in sync with the key/value type.

Relying on BTF, this approach allows for both forward and backward
compatibility w.r.t. extending supported map definition features. By
default, any unrecognized attributes are treated as an error, but it's
possible relax this using MAPS_RELAX_COMPAT flag. New attributes, added
in the future will need to be optional.

The outline of the new map definition (short, BTF-defined maps) is as follows:
1. All the maps should be defined in .maps ELF section. It's possible to
   have both "legacy" map definitions in `maps` sections and BTF-defined
   maps in .maps sections. Everything will still work transparently.
2. The map declaration and initialization is done through
   a global/static variable of a struct type with few mandatory and
   extra optional fields:
   - type field is mandatory and specified type of BPF map;
   - key/value fields are mandatory and capture key/value type/size information;
   - max_entries attribute is optional; if max_entries is not specified or
     initialized, it has to be provided in runtime through libbpf API
     before loading bpf_object;
   - map_flags is optional and if not defined, will be assumed to be 0.
3. Key/value fields should be **a pointer** to a type describing
   key/value. The pointee type is assumed (and will be recorded as such
   and used for size determination) to be a type describing key/value of
   the map. This is done to save excessive amounts of space allocated in
   corresponding ELF sections for key/value of big size.
4. As some maps disallow having BTF type ID associated with key/value,
   it's possible to specify key/value size explicitly without
   associating BTF type ID with it. Use key_size and value_size fields
   to do that (see example below).

Here's an example of simple ARRAY map defintion:

struct my_value { int x, y, z; };

struct {
	int type;
	int max_entries;
	int *key;
	struct my_value *value;
} btf_map SEC(".maps") = {
	.type = BPF_MAP_TYPE_ARRAY,
	.max_entries = 16,
};

This will define BPF ARRAY map 'btf_map' with 16 elements. The key will
be of type int and thus key size will be 4 bytes. The value is struct
my_value of size 12 bytes. This map can be used from C code exactly the
same as with existing maps defined through struct bpf_map_def.

Here's an example of STACKMAP definition (which currently disallows BTF type
IDs for key/value):

struct {
	__u32 type;
	__u32 max_entries;
	__u32 map_flags;
	__u32 key_size;
	__u32 value_size;
} stackmap SEC(".maps") = {
	.type = BPF_MAP_TYPE_STACK_TRACE,
	.max_entries = 128,
	.map_flags = BPF_F_STACK_BUILD_ID,
	.key_size = sizeof(__u32),
	.value_size = PERF_MAX_STACK_DEPTH * sizeof(struct bpf_stack_build_id),
};

This approach is naturally extended to support map-in-map, by making a value
field to be another struct that describes inner map. This feature is not
implemented yet. It's also possible to incrementally add features like pinning
with full backwards and forward compatibility. Support for static
initialization of BPF_MAP_TYPE_PROG_ARRAY using pointers to BPF programs
is also on the roadmap.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
e64e62d19f libbpf: split initialization and loading of BTF
Libbpf does sanitization of BTF before loading it into kernel, if kernel
doesn't support some of newer BTF features. This removes some of the
important information from BTF (e.g., DATASEC and VAR description),
which will be used for map construction. This patch splits BTF
processing into initialization step, in which BTF is initialized from
ELF and all the original data is still preserved; and
sanitization/loading step, which ensures that BTF is safe to load into
kernel. This allows to use full BTF information to construct maps, while
still loading valid BTF into older kernels.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
115c0e02cb libbpf: identify maps by section index in addition to offset
To support maps to be defined in multiple sections, it's important to
identify map not just by offset within its section, but section index as
well. This patch adds tracking of section index.

For global data, we record section index of corresponding
.data/.bss/.rodata ELF section for uniformity, and thus don't need
a special value of offset for those maps.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
67057c6b7d libbpf: refactor map initialization
User and global data maps initialization has gotten pretty complicated
and unnecessarily convoluted. This patch splits out the logic for global
data map and user-defined map initialization. It also removes the
restriction of pre-calculating how many maps will be initialized,
instead allowing to keep adding new maps as they are discovered, which
will be used later for BTF-defined map definitions.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
bdf65f9fea libbpf: streamline ELF parsing error-handling
Simplify ELF parsing logic by exiting early, as there is no common clean
up path to execute. That makes it unnecessary to track when err was set
and when it was cleared. It also reduces nesting in some places.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
0559e41969 libbpf: extract BTF loading logic
As a preparation for adding BTF-based BPF map loading, extract .BTF and
.BTF.ext loading logic.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
abc096b71d libbpf: add common min/max macro to libbpf_internal.h
Multiple files in libbpf redefine their own definitions for min/max.
Let's define them in libbpf_internal.h and use those everywhere.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Martynas Pumputis
7586f784e6 bpf: sync BPF_FIB_LOOKUP flag changes with BPF uapi
Sync the changes to the flags made in "bpf: simplify definition of
BPF_FIB_LOOKUP related flags" with the BPF UAPI headers.

Doing in a separate commit to ease syncing of github/libbpf.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08 12:52:46 -07:00
Daniel Borkmann
76ff616fcd bpf, libbpf: enable recvmsg attach types
Another trivial patch to libbpf in order to enable identifying and
attaching programs to BPF_CGROUP_UDP{4,6}_RECVMSG by section name.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-08 12:52:46 -07:00
Daniel Borkmann
04a05786c3 bpf: sync tooling uapi header
Sync BPF uapi header in order to pull in BPF_CGROUP_UDP{4,6}_RECVMSG
attach types. This is done and preferred as an extra patch in order
to ease sync of libbpf.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-08 12:52:46 -07:00
Michal Rostecki
ddba4024c0 libbpf: Return btf_fd for load_sk_storage_btf
Before this change, function load_sk_storage_btf expected that
libbpf__probe_raw_btf was returning a BTF descriptor, but in fact it was
returning an information about whether the probe was successful (0 or
1). load_sk_storage_btf was using that value as an argument of the close
function, which was resulting in closing stdout and thus terminating the
process which called that function.

That bug was visible in bpftool. `bpftool feature` subcommand was always
exiting too early (because of closed stdout) and it didn't display all
requested probes. `bpftool -j feature` or `bpftool -p feature` were not
returning a valid json object.

This change renames the libbpf__probe_raw_btf function to
libbpf__load_raw_btf, which now returns a BTF descriptor, as expected in
load_sk_storage_btf.

v2:
- Fix typo in the commit message.

v3:
- Simplify BTF descriptor handling in bpf_object__probe_btf_* functions.
- Rename libbpf__probe_raw_btf function to libbpf__load_raw_btf and
return a BTF descriptor.

v4:
- Fix typo in the commit message.

Fixes: d7c4b3980c18 ("libbpf: detect supported kernel BTF features and sanitize BTF")
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-08 12:52:46 -07:00
Andrii Nakryiko
52ec16bce8 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline commit:   5e2ac390fbd08b2a462db66cef2663e4db0d5191
Checkpoint commit: 2c6a577d719e96c89b1aaa631089cce69b4d0045

Andrii Nakryiko (1):
  libbpf: fix check for presence of associated BTF for map creation

Stanislav Fomichev (1):
  bpf/tools: sync bpf.h

 include/uapi/linux/bpf.h | 2 ++
 src/libbpf.c             | 9 +++++----
 2 files changed, 7 insertions(+), 4 deletions(-)

--
2.17.1
2019-06-14 16:53:36 -07:00
Stanislav Fomichev
a4e4dbc35a bpf/tools: sync bpf.h
Add sk to struct bpf_sock_addr and struct bpf_sock_ops.

Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-14 16:53:36 -07:00
Andrii Nakryiko
21742bc952 libbpf: fix check for presence of associated BTF for map creation
Kernel internally checks that either key or value type ID is specified,
before using btf_fd. Do the same in libbpf's map creation code for
determining when to retry map creation w/o BTF.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: fba01a0689a9 ("libbpf: use negative fd to specify missing BTF")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-14 16:53:36 -07:00
Andrii Nakryiko
39de671179 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline commit:   e672db03ab0e43e41ab6f8b2156a10d6e40f243d
Checkpoint commit: 5e2ac390fbd08b2a462db66cef2663e4db0d5191

Andrii Nakryiko (9):
  libbpf: fix detection of corrupted BPF instructions section
  libbpf: preserve errno before calling into user callback
  libbpf: simplify endianness check
  libbpf: check map name retrieved from ELF
  libbpf: fix error code returned on corrupted ELF
  libbpf: use negative fd to specify missing BTF
  libbpf: simplify two pieces of logic
  libbpf: typo and formatting fixes
  libbpf: reduce unnecessary line wrapping

Hechao Li (1):
  bpf: add a new API libbpf_num_possible_cpus()

Jonathan Lemon (2):
  bpf/tools: sync bpf.h
  libbpf: remove qidconf and better support external bpf programs.

Quentin Monnet (1):
  libbpf: prevent overwriting of log_level in bpf_object__load_progs()

 include/uapi/linux/bpf.h |   4 +
 src/libbpf.c             | 207 ++++++++++++++++++++++-----------------
 src/libbpf.h             |  16 +++
 src/libbpf.map           |   1 +
 src/xsk.c                | 103 ++++++-------------
 5 files changed, 167 insertions(+), 164 deletions(-)

--
2.17.1
2019-06-11 10:18:48 -07:00
Hechao Li
49967c1f5a bpf: add a new API libbpf_num_possible_cpus()
Adding a new API libbpf_num_possible_cpus() that helps user with
per-CPU map operations.

Signed-off-by: Hechao Li <hechaol@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Jonathan Lemon
4f260bccf5 libbpf: remove qidconf and better support external bpf programs.
Use the recent change to XSKMAP bpf_map_lookup_elem() to test if
there is a xsk present in the map instead of duplicating the work
with qidconf.

Fix things so callers using XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD
bypass any internal bpf maps, so xsk_socket__{create|delete} works
properly.

Clean up error handling path.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-11 10:18:48 -07:00
Jonathan Lemon
80aeaa33e7 bpf/tools: sync bpf.h
Sync uapi/linux/bpf.h

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
fcbc2604c8 libbpf: reduce unnecessary line wrapping
There are a bunch of lines of code or comments that are unnecessary
wrapped into multi-lines. Fix that without violating any code
guidelines.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
896c231d8c libbpf: typo and formatting fixes
A bunch of typo and formatting fixes.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
e13049a667 libbpf: simplify two pieces of logic
Extra check for type is unnecessary in first case.

Extra zeroing is unnecessary, as snprintf guarantees that it will
zero-terminate string.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
8e01ffc179 libbpf: use negative fd to specify missing BTF
0 is a valid FD, so it's better to initialize it to -1, as is done in
other places. Also, technically, BTF type ID 0 is valid (it's a VOID
type), so it's more reliable to check btf_fd, instead of
btf_key_type_id, to determine if there is any BTF associated with a map.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
c7b8b2e3a5 libbpf: fix error code returned on corrupted ELF
All of libbpf errors are negative, except this one. Fix it.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
e977af8516 libbpf: check map name retrieved from ELF
Validate there was no error retrieving symbol name corresponding to
a BPF map.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
dd55058989 libbpf: simplify endianness check
Rewrite endianness check to use "more canonical" way, using
compiler-defined macros, similar to few other places in libbpf. It also
is more obvious and shorter.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
64852b2fb9 libbpf: preserve errno before calling into user callback
pr_warning ultimately may call into user-provided callback function,
which can clobber errno value, so we need to save it before that.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
1aedc35d5d libbpf: fix detection of corrupted BPF instructions section
Ensure that size of a section w/ BPF instruction is exactly a multiple
of BPF instruction size.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Quentin Monnet
d5013de6a5 libbpf: prevent overwriting of log_level in bpf_object__load_progs()
There are two functions in libbpf that support passing a log_level
parameter for the verifier for loading programs:
bpf_object__load_xattr() and bpf_prog_load_xattr(). Both accept an
attribute object containing the log_level, and apply it to the programs
to load.

It turns out that to effectively load the programs, the latter function
eventually relies on the former. This was not taken into account when
adding support for log_level in bpf_object__load_xattr(), and the
log_level passed to bpf_prog_load_xattr() later gets overwritten with a
zero value, thus disabling verifier logs for the program in all cases:

bpf_prog_load_xattr()             // prog->log_level = attr1->log_level;
-> bpf_object__load()             // attr2->log_level = 0;
   -> bpf_object__load_xattr()    // <pass prog and attr2>
      -> bpf_object__load_progs() // prog->log_level = attr2->log_level;

Fix this by OR-ing the log_level in bpf_object__load_progs(), instead of
overwriting it.

v2: Fix commit log description (confusion on function names in v1).

Fixes: 60276f984998 ("libbpf: add bpf_object__load_xattr() API function to pass log_level")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 10:18:48 -07:00
Andrii Nakryiko
0e37e0d03a build: add hashmap and btf_dump to OBJS list
Fix Makefile to include expected object files.
Fixes https://github.com/libbpf/libbpf/issues/52.

Reported-by: Robert McCabe <robert.mccabe@rockwellcollins.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-05-30 15:07:51 -07:00
Andrii Nakryiko
75db50f4a0 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline commit:   f49aa1de98363b6c5fba4637678d6b0ba3d18065
Checkpoint commit: e672db03ab0e43e41ab6f8b2156a10d6e40f243d

Andrii Nakryiko (5):
  libbpf: ensure libbpf.h is included along libbpf_internal.h
  libbpf: add btf__parse_elf API to load .BTF and .BTF.ext
  libbpf: add resizable non-thread safe internal hashmap
  libbpf: switch btf_dedup() to hashmap for dedup table
  libbpf: add btf_dump API for BTF-to-C conversion

Hariprasad Kelam (1):
  libbpf: fix warning that PTR_ERR_OR_ZERO can be used

Jiong Wang (2):
  tools: bpf: sync uapi header bpf.h
  libbpf: add "prog_flags" to
    bpf_program/bpf_prog_load_attr/bpf_load_program_attr

Quentin Monnet (1):
  libbpf: add bpf_object__load_xattr() API function to pass log_level

Yonghong Song (1):
  tools/bpf: sync bpf uapi header bpf.h to tools directory

 include/uapi/linux/bpf.h |   35 +-
 src/bpf.c                |    1 +
 src/bpf.h                |    1 +
 src/btf.c                |  329 ++++++----
 src/btf.h                |   19 +
 src/btf_dump.c           | 1336 ++++++++++++++++++++++++++++++++++++++
 src/hashmap.c            |  229 +++++++
 src/hashmap.h            |  173 +++++
 src/libbpf.c             |   27 +-
 src/libbpf.h             |    7 +
 src/libbpf.map           |    9 +
 src/libbpf_internal.h    |    2 +
 12 files changed, 2045 insertions(+), 123 deletions(-)
 create mode 100644 src/btf_dump.c
 create mode 100644 src/hashmap.c
 create mode 100644 src/hashmap.h

--
2.17.1
2019-05-29 10:27:27 -07:00
Quentin Monnet
d714245dd9 libbpf: add bpf_object__load_xattr() API function to pass log_level
libbpf was recently made aware of the log_level attribute for programs,
used to specify the level of information expected to be dumped by the
verifier. Function bpf_prog_load_xattr() got support for this log_level
parameter.

But some applications using libbpf rely on another function to load
programs, bpf_object__load(), which does accept any parameter for log
level. Create an API function based on bpf_object__load(), but accepting
an "attr" object as a parameter. Then add a log_level field to that
object, so that applications calling the new bpf_object__load_xattr()
can pick the desired log level.

v3:
- Rewrite commit log.

v2:
- We are in a new cycle, bump libbpf extraversion number.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-29 10:27:27 -07:00
Hariprasad Kelam
1d0ddcdbda libbpf: fix warning that PTR_ERR_OR_ZERO can be used
Fix below warning reported by coccicheck:

/tools/lib/bpf/libbpf.c:3461:1-3: WARNING: PTR_ERR_OR_ZERO can be used

Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-29 10:27:27 -07:00
Jiong Wang
edf3123942 libbpf: add "prog_flags" to bpf_program/bpf_prog_load_attr/bpf_load_program_attr
libbpf doesn't allow passing "prog_flags" during bpf program load in a
couple of load related APIs, "bpf_load_program_xattr", "load_program" and
"bpf_prog_load_xattr".

It makes sense to allow passing "prog_flags" which is useful for
customizing program loading.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Jiong Wang
1f8e2f7208 tools: bpf: sync uapi header bpf.h
Sync new bpf prog load flag "BPF_F_TEST_RND_HI32" to tools/.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Yonghong Song
ca76b123f1 tools/bpf: sync bpf uapi header bpf.h to tools directory
The bpf uapi header include/uapi/linux/bpf.h is sync'ed
to tools/include/uapi/linux/bpf.h.

Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-29 10:27:27 -07:00
Andrii Nakryiko
4fe8a54efe libbpf: add btf_dump API for BTF-to-C conversion
BTF contains enough type information to allow generating valid
compilable C header w/ correct layout of structs/unions and all the
typedef/enum definitions. This patch adds a new "object" - btf_dump to
facilitate dumping BTF as valid C. btf_dump__dump_type() is the main API
which takes care of dumping out (through user-provided printf-like
callback function) C definitions for given type ID and it's required
dependencies. This allows for not just dumping out entirety of BTF types,
but also selective filtering based on user-provided criterias w/ minimal
set of dependent types.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Andrii Nakryiko
5f3f74892a libbpf: switch btf_dedup() to hashmap for dedup table
Utilize libbpf's hashmap as a multimap fof dedup_table implementation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Andrii Nakryiko
553db8ba73 libbpf: add resizable non-thread safe internal hashmap
There is a need for fast point lookups inside libbpf for multiple use
cases (e.g., name resolution for BTF-to-C conversion, by-name lookups in
BTF for upcoming BPF CO-RE relocation support, etc). This patch
implements simple resizable non-thread safe hashmap using single linked
list chains.

Four different insert strategies are supported:
 - HASHMAP_ADD - only add key/value if key doesn't exist yet;
 - HASHMAP_SET - add key/value pair if key doesn't exist yet; otherwise,
   update value;
 - HASHMAP_UPDATE - update value, if key already exists; otherwise, do
   nothing and return -ENOENT;
 - HASHMAP_APPEND - always add key/value pair, even if key already exists.
   This turns hashmap into a multimap by allowing multiple values to be
   associated with the same key. Most useful read API for such hashmap is
   hashmap__for_each_key_entry() iteration. If hashmap__find() is still
   used, it will return last inserted key/value entry (first in a bucket
   chain).

For HASHMAP_SET and HASHMAP_UPDATE, old key/value pair is returned, so
that calling code can handle proper memory management, if necessary.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Andrii Nakryiko
939da1eb3f libbpf: add btf__parse_elf API to load .BTF and .BTF.ext
Loading BTF and BTF.ext from ELF file is a common need. Instead of
requiring every user to re-implement it, let's provide this API from
libbpf itself. It's mostly copy/paste from `bpftool btf dump`
implementation, which will be switched to libbpf's version in next
patch. btf__parse_elf allows to load BTF and optionally BTF.ext.
This is also useful for tests that need to load/work with BTF, loaded
from test ELF files.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Andrii Nakryiko
d557b32f71 libbpf: ensure libbpf.h is included along libbpf_internal.h
libbpf_internal.h expects a bunch of stuff defined in libbpf.h to be
defined. This patch makes sure that libbpf.h is always included.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 10:27:27 -07:00
Andrii Nakryiko
e60460f4e5 linux/err.h: add PTR_ERR_OR_ZERO
Add missing PTR_ERR_OR_ZERO implementation. Also add a bunch of
generated files to .gitignore.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2019-05-29 10:19:20 -07:00
25 changed files with 3832 additions and 608 deletions

View File

@@ -1 +1 @@
f49aa1de98363b6c5fba4637678d6b0ba3d18065
66b5f1c439843bcbab01cc7f3854ae2742f3d1e3

View File

@@ -30,4 +30,9 @@ static inline bool IS_ERR_OR_NULL(const void *ptr)
return (!ptr) || IS_ERR_VALUE((unsigned long)ptr);
}
static inline long PTR_ERR_OR_ZERO(const void *ptr)
{
return IS_ERR(ptr) ? PTR_ERR(ptr) : 0;
}
#endif

View File

@@ -170,6 +170,7 @@ enum bpf_prog_type {
BPF_PROG_TYPE_FLOW_DISSECTOR,
BPF_PROG_TYPE_CGROUP_SYSCTL,
BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
BPF_PROG_TYPE_CGROUP_SOCKOPT,
};
enum bpf_attach_type {
@@ -192,6 +193,10 @@ enum bpf_attach_type {
BPF_LIRC_MODE2,
BPF_FLOW_DISSECTOR,
BPF_CGROUP_SYSCTL,
BPF_CGROUP_UDP4_RECVMSG,
BPF_CGROUP_UDP6_RECVMSG,
BPF_CGROUP_GETSOCKOPT,
BPF_CGROUP_SETSOCKOPT,
__MAX_BPF_ATTACH_TYPE
};
@@ -260,6 +265,24 @@ enum bpf_attach_type {
*/
#define BPF_F_ANY_ALIGNMENT (1U << 1)
/* BPF_F_TEST_RND_HI32 is used in BPF_PROG_LOAD command for testing purpose.
* Verifier does sub-register def/use analysis and identifies instructions whose
* def only matters for low 32-bit, high 32-bit is never referenced later
* through implicit zero extension. Therefore verifier notifies JIT back-ends
* that it is safe to ignore clearing high 32-bit for these instructions. This
* saves some back-ends a lot of code-gen. However such optimization is not
* necessary on some arches, for example x86_64, arm64 etc, whose JIT back-ends
* hence hasn't used verifier's analysis result. But, we really want to have a
* way to be able to verify the correctness of the described optimization on
* x86_64 on which testsuites are frequently exercised.
*
* So, this flag is introduced. Once it is set, verifier will randomize high
* 32-bit for those instructions who has been identified as safe to ignore them.
* Then, if verifier is not doing correct analysis, such randomization will
* regress tests to expose bugs.
*/
#define BPF_F_TEST_RND_HI32 (1U << 2)
/* When BPF ldimm64's insn[0].src_reg != 0 then this can have
* two extensions:
*
@@ -783,7 +806,7 @@ union bpf_attr {
* based on a user-provided identifier for all traffic coming from
* the tasks belonging to the related cgroup. See also the related
* kernel documentation, available from the Linux sources in file
* *Documentation/cgroup-v1/net_cls.txt*.
* *Documentation/admin-guide/cgroup-v1/net_cls.rst*.
*
* The Linux kernel has two versions for cgroups: there are
* cgroups v1 and cgroups v2. Both are available to users, who can
@@ -1744,6 +1767,7 @@ union bpf_attr {
* * **BPF_SOCK_OPS_RTO_CB_FLAG** (retransmission time out)
* * **BPF_SOCK_OPS_RETRANS_CB_FLAG** (retransmission)
* * **BPF_SOCK_OPS_STATE_CB_FLAG** (TCP state change)
* * **BPF_SOCK_OPS_RTT_CB_FLAG** (every RTT)
*
* Therefore, this function can be used to clear a callback flag by
* setting the appropriate bit to zero. e.g. to disable the RTO
@@ -2672,6 +2696,20 @@ union bpf_attr {
* 0 on success.
*
* **-ENOENT** if the bpf-local-storage cannot be found.
*
* int bpf_send_signal(u32 sig)
* Description
* Send signal *sig* to the current task.
* Return
* 0 on success or successfully queued.
*
* **-EBUSY** if work queue under nmi is full.
*
* **-EINVAL** if *sig* is invalid.
*
* **-EPERM** if no permission to send the *sig*.
*
* **-EAGAIN** if bpf program can try again.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -2782,7 +2820,8 @@ union bpf_attr {
FN(strtol), \
FN(strtoul), \
FN(sk_storage_get), \
FN(sk_storage_delete),
FN(sk_storage_delete), \
FN(send_signal),
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
* function eBPF program intends to call
@@ -3031,6 +3070,12 @@ struct bpf_tcp_sock {
* sum(delta(snd_una)), or how many bytes
* were acked.
*/
__u32 dsack_dups; /* RFC4898 tcpEStatsStackDSACKDups
* total number of DSACK blocks received
*/
__u32 delivered; /* Total data packets delivered incl. rexmits */
__u32 delivered_ce; /* Like the above but only ECE marked packets */
__u32 icsk_retransmits; /* Number of unrecovered [RTO] timeouts */
};
struct bpf_sock_tuple {
@@ -3050,6 +3095,10 @@ struct bpf_sock_tuple {
};
};
struct bpf_xdp_sock {
__u32 queue_id;
};
#define XDP_PACKET_HEADROOM 256
/* User return codes for XDP prog type.
@@ -3141,6 +3190,7 @@ struct bpf_prog_info {
char name[BPF_OBJ_NAME_LEN];
__u32 ifindex;
__u32 gpl_compatible:1;
__u32 :31; /* alignment pad */
__u64 netns_dev;
__u64 netns_ino;
__u32 nr_jited_ksyms;
@@ -3195,7 +3245,7 @@ struct bpf_sock_addr {
__u32 user_ip4; /* Allows 1,2,4-byte read and 4-byte write.
* Stored in network byte order.
*/
__u32 user_ip6[4]; /* Allows 1,2,4-byte read an 4-byte write.
__u32 user_ip6[4]; /* Allows 1,2,4,8-byte read and 4,8-byte write.
* Stored in network byte order.
*/
__u32 user_port; /* Allows 4-byte read and write.
@@ -3204,12 +3254,13 @@ struct bpf_sock_addr {
__u32 family; /* Allows 4-byte read, but no write */
__u32 type; /* Allows 4-byte read, but no write */
__u32 protocol; /* Allows 4-byte read, but no write */
__u32 msg_src_ip4; /* Allows 1,2,4-byte read an 4-byte write.
__u32 msg_src_ip4; /* Allows 1,2,4-byte read and 4-byte write.
* Stored in network byte order.
*/
__u32 msg_src_ip6[4]; /* Allows 1,2,4-byte read an 4-byte write.
__u32 msg_src_ip6[4]; /* Allows 1,2,4,8-byte read and 4,8-byte write.
* Stored in network byte order.
*/
__bpf_md_ptr(struct bpf_sock *, sk);
};
/* User bpf_sock_ops struct to access socket values and specify request ops
@@ -3261,13 +3312,15 @@ struct bpf_sock_ops {
__u32 sk_txhash;
__u64 bytes_received;
__u64 bytes_acked;
__bpf_md_ptr(struct bpf_sock *, sk);
};
/* Definitions for bpf_sock_ops_cb_flags */
#define BPF_SOCK_OPS_RTO_CB_FLAG (1<<0)
#define BPF_SOCK_OPS_RETRANS_CB_FLAG (1<<1)
#define BPF_SOCK_OPS_STATE_CB_FLAG (1<<2)
#define BPF_SOCK_OPS_ALL_CB_FLAGS 0x7 /* Mask of all currently
#define BPF_SOCK_OPS_RTT_CB_FLAG (1<<3)
#define BPF_SOCK_OPS_ALL_CB_FLAGS 0xF /* Mask of all currently
* supported cb flags
*/
@@ -3322,6 +3375,8 @@ enum {
BPF_SOCK_OPS_TCP_LISTEN_CB, /* Called on listen(2), right after
* socket transition to LISTEN state.
*/
BPF_SOCK_OPS_RTT_CB, /* Called on every RTT.
*/
};
/* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
@@ -3376,8 +3431,8 @@ struct bpf_raw_tracepoint_args {
/* DIRECT: Skip the FIB rules and go to FIB table associated with device
* OUTPUT: Do lookup from egress perspective; default is ingress
*/
#define BPF_FIB_LOOKUP_DIRECT BIT(0)
#define BPF_FIB_LOOKUP_OUTPUT BIT(1)
#define BPF_FIB_LOOKUP_DIRECT (1U << 0)
#define BPF_FIB_LOOKUP_OUTPUT (1U << 1)
enum {
BPF_FIB_LKUP_RET_SUCCESS, /* lookup successful */
@@ -3500,4 +3555,15 @@ struct bpf_sysctl {
*/
};
struct bpf_sockopt {
__bpf_md_ptr(struct bpf_sock *, sk);
__bpf_md_ptr(void *, optval);
__bpf_md_ptr(void *, optval_end);
__s32 level;
__s32 optname;
__s32 optlen;
__s32 retval;
};
#endif /* _UAPI__LINUX_BPF_H__ */

View File

@@ -636,6 +636,7 @@ enum {
IFLA_BOND_AD_USER_PORT_KEY,
IFLA_BOND_AD_ACTOR_SYSTEM,
IFLA_BOND_TLB_DYNAMIC_LB,
IFLA_BOND_PEER_NOTIF_DELAY,
__IFLA_BOND_MAX,
};

View File

@@ -46,6 +46,7 @@ struct xdp_mmap_offsets {
#define XDP_UMEM_FILL_RING 5
#define XDP_UMEM_COMPLETION_RING 6
#define XDP_STATISTICS 7
#define XDP_OPTIONS 8
struct xdp_umem_reg {
__u64 addr; /* Start of packet data area */
@@ -60,6 +61,13 @@ struct xdp_statistics {
__u64 tx_invalid_descs; /* Dropped due to invalid descriptor */
};
struct xdp_options {
__u32 flags;
};
/* Flags for the flags field of struct xdp_options */
#define XDP_OPTIONS_ZEROCOPY (1 << 0)
/* Pgoff for mmaping the rings */
#define XDP_PGOFF_RX_RING 0
#define XDP_PGOFF_TX_RING 0x80000000

View File

@@ -65,7 +65,7 @@ git branch ${SQUASH_BASE_TAG} ${SQUASH_COMMIT}
git checkout -b ${SQUASH_TIP_TAG} ${SQUASH_COMMIT}
# Cherry-pick new commits onto squashed baseline commit
LIBBPF_PATHS=(tools/lib/bpf tools/include/uapi/linux/{bpf_common.h,bpf.h,btf.h,if_link.h,netlink.h} tools/include/tools/libc_compat.h)
LIBBPF_PATHS=(tools/lib/bpf tools/include/uapi/linux/{bpf_common.h,bpf.h,btf.h,if_link.h,if_xdp.h,netlink.h} tools/include/tools/libc_compat.h)
LIBBPF_NEW_MERGES=$(git rev-list --merges --topo-order --reverse ${BASELINE_TAG}..${TIP_TAG} ${LIBBPF_PATHS[@]})
for LIBBPF_NEW_MERGE in ${LIBBPF_NEW_MERGES}; do

2
src/.gitignore vendored
View File

@@ -1,2 +1,4 @@
*.o
*.a
/libbpf.pc
/libbpf.so*

View File

@@ -34,7 +34,8 @@ endif
OBJDIR ?= .
OBJS := $(addprefix $(OBJDIR)/,bpf.o btf.o libbpf.o libbpf_errno.o netlink.o \
nlattr.o str_error.o libbpf_probes.o bpf_prog_linfo.o xsk.o)
nlattr.o str_error.o libbpf_probes.o bpf_prog_linfo.o xsk.o \
btf_dump.o hashmap.o)
LIBS := $(OBJDIR)/libbpf.a
ifndef BUILD_STATIC_ONLY

View File

@@ -9,7 +9,8 @@ described here. It's recommended to follow these conventions whenever a
new function or type is added to keep libbpf API clean and consistent.
All types and functions provided by libbpf API should have one of the
following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``xsk_``.
following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``xsk_``,
``perf_buffer_``.
System call wrappers
--------------------

View File

@@ -26,10 +26,11 @@
#include <memory.h>
#include <unistd.h>
#include <asm/unistd.h>
#include <errno.h>
#include <linux/bpf.h>
#include "bpf.h"
#include "libbpf.h"
#include <errno.h>
#include "libbpf_internal.h"
/*
* When building perf, unistd.h is overridden. __NR_bpf is
@@ -53,10 +54,6 @@
# endif
#endif
#ifndef min
#define min(x, y) ((x) < (y) ? (x) : (y))
#endif
static inline __u64 ptr_to_u64(const void *ptr)
{
return (__u64) (unsigned long) ptr;
@@ -256,6 +253,7 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
if (load_attr->name)
memcpy(attr.prog_name, load_attr->name,
min(strlen(load_attr->name), BPF_OBJ_NAME_LEN - 1));
attr.prog_flags = load_attr->prog_flags;
fd = sys_bpf_prog_load(&attr, sizeof(attr));
if (fd >= 0)

View File

@@ -87,6 +87,7 @@ struct bpf_load_program_attr {
const void *line_info;
__u32 line_info_cnt;
__u32 log_level;
__u32 prog_flags;
};
/* Flags to direct loading requirements */

View File

@@ -6,10 +6,7 @@
#include <linux/err.h>
#include <linux/bpf.h>
#include "libbpf.h"
#ifndef min
#define min(x, y) ((x) < (y) ? (x) : (y))
#endif
#include "libbpf_internal.h"
struct bpf_prog_linfo {
void *raw_linfo;

334
src/btf.c
View File

@@ -4,17 +4,17 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <linux/err.h>
#include <linux/btf.h>
#include <gelf.h>
#include "btf.h"
#include "bpf.h"
#include "libbpf.h"
#include "libbpf_internal.h"
#define max(a, b) ((a) > (b) ? (a) : (b))
#define min(a, b) ((a) < (b) ? (a) : (b))
#include "hashmap.h"
#define BTF_MAX_NR_TYPES 0x7fffffff
#define BTF_MAX_STR_OFFSET 0x7fffffff
@@ -417,6 +417,132 @@ done:
return btf;
}
static bool btf_check_endianness(const GElf_Ehdr *ehdr)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
return ehdr->e_ident[EI_DATA] == ELFDATA2LSB;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
return ehdr->e_ident[EI_DATA] == ELFDATA2MSB;
#else
# error "Unrecognized __BYTE_ORDER__"
#endif
}
struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext)
{
Elf_Data *btf_data = NULL, *btf_ext_data = NULL;
int err = 0, fd = -1, idx = 0;
struct btf *btf = NULL;
Elf_Scn *scn = NULL;
Elf *elf = NULL;
GElf_Ehdr ehdr;
if (elf_version(EV_CURRENT) == EV_NONE) {
pr_warning("failed to init libelf for %s\n", path);
return ERR_PTR(-LIBBPF_ERRNO__LIBELF);
}
fd = open(path, O_RDONLY);
if (fd < 0) {
err = -errno;
pr_warning("failed to open %s: %s\n", path, strerror(errno));
return ERR_PTR(err);
}
err = -LIBBPF_ERRNO__FORMAT;
elf = elf_begin(fd, ELF_C_READ, NULL);
if (!elf) {
pr_warning("failed to open %s as ELF file\n", path);
goto done;
}
if (!gelf_getehdr(elf, &ehdr)) {
pr_warning("failed to get EHDR from %s\n", path);
goto done;
}
if (!btf_check_endianness(&ehdr)) {
pr_warning("non-native ELF endianness is not supported\n");
goto done;
}
if (!elf_rawdata(elf_getscn(elf, ehdr.e_shstrndx), NULL)) {
pr_warning("failed to get e_shstrndx from %s\n", path);
goto done;
}
while ((scn = elf_nextscn(elf, scn)) != NULL) {
GElf_Shdr sh;
char *name;
idx++;
if (gelf_getshdr(scn, &sh) != &sh) {
pr_warning("failed to get section(%d) header from %s\n",
idx, path);
goto done;
}
name = elf_strptr(elf, ehdr.e_shstrndx, sh.sh_name);
if (!name) {
pr_warning("failed to get section(%d) name from %s\n",
idx, path);
goto done;
}
if (strcmp(name, BTF_ELF_SEC) == 0) {
btf_data = elf_getdata(scn, 0);
if (!btf_data) {
pr_warning("failed to get section(%d, %s) data from %s\n",
idx, name, path);
goto done;
}
continue;
} else if (btf_ext && strcmp(name, BTF_EXT_ELF_SEC) == 0) {
btf_ext_data = elf_getdata(scn, 0);
if (!btf_ext_data) {
pr_warning("failed to get section(%d, %s) data from %s\n",
idx, name, path);
goto done;
}
continue;
}
}
err = 0;
if (!btf_data) {
err = -ENOENT;
goto done;
}
btf = btf__new(btf_data->d_buf, btf_data->d_size);
if (IS_ERR(btf))
goto done;
if (btf_ext && btf_ext_data) {
*btf_ext = btf_ext__new(btf_ext_data->d_buf,
btf_ext_data->d_size);
if (IS_ERR(*btf_ext))
goto done;
} else if (btf_ext) {
*btf_ext = NULL;
}
done:
if (elf)
elf_end(elf);
close(fd);
if (err)
return ERR_PTR(err);
/*
* btf is always parsed before btf_ext, so no need to clean up
* btf_ext, if btf loading failed
*/
if (IS_ERR(btf))
return btf;
if (btf_ext && IS_ERR(*btf_ext)) {
btf__free(btf);
err = PTR_ERR(*btf_ext);
return ERR_PTR(err);
}
return btf;
}
static int compare_vsi_off(const void *_a, const void *_b)
{
const struct btf_var_secinfo *a = _a;
@@ -1165,16 +1291,9 @@ done:
return err;
}
#define BTF_DEDUP_TABLE_DEFAULT_SIZE (1 << 14)
#define BTF_DEDUP_TABLE_MAX_SIZE_LOG 31
#define BTF_UNPROCESSED_ID ((__u32)-1)
#define BTF_IN_PROGRESS_ID ((__u32)-2)
struct btf_dedup_node {
struct btf_dedup_node *next;
__u32 type_id;
};
struct btf_dedup {
/* .BTF section to be deduped in-place */
struct btf *btf;
@@ -1190,7 +1309,7 @@ struct btf_dedup {
* candidates, which is fine because we rely on subsequent
* btf_xxx_equal() checks to authoritatively verify type equality.
*/
struct btf_dedup_node **dedup_table;
struct hashmap *dedup_table;
/* Canonical types map */
__u32 *map;
/* Hypothetical mapping, used during type graph equivalence checks */
@@ -1215,30 +1334,18 @@ struct btf_str_ptrs {
__u32 cap;
};
static inline __u32 hash_combine(__u32 h, __u32 value)
static long hash_combine(long h, long value)
{
/* 2^31 + 2^29 - 2^25 + 2^22 - 2^19 - 2^16 + 1 */
#define GOLDEN_RATIO_PRIME 0x9e370001UL
return h * 37 + value * GOLDEN_RATIO_PRIME;
#undef GOLDEN_RATIO_PRIME
return h * 31 + value;
}
#define for_each_dedup_cand(d, hash, node) \
for (node = d->dedup_table[hash & (d->opts.dedup_table_size - 1)]; \
node; \
node = node->next)
#define for_each_dedup_cand(d, node, hash) \
hashmap__for_each_key_entry(d->dedup_table, node, (void *)hash)
static int btf_dedup_table_add(struct btf_dedup *d, __u32 hash, __u32 type_id)
static int btf_dedup_table_add(struct btf_dedup *d, long hash, __u32 type_id)
{
struct btf_dedup_node *node = malloc(sizeof(struct btf_dedup_node));
int bucket = hash & (d->opts.dedup_table_size - 1);
if (!node)
return -ENOMEM;
node->type_id = type_id;
node->next = d->dedup_table[bucket];
d->dedup_table[bucket] = node;
return 0;
return hashmap__append(d->dedup_table,
(void *)hash, (void *)(long)type_id);
}
static int btf_dedup_hypot_map_add(struct btf_dedup *d,
@@ -1267,36 +1374,10 @@ static void btf_dedup_clear_hypot_map(struct btf_dedup *d)
d->hypot_cnt = 0;
}
static void btf_dedup_table_free(struct btf_dedup *d)
{
struct btf_dedup_node *head, *tmp;
int i;
if (!d->dedup_table)
return;
for (i = 0; i < d->opts.dedup_table_size; i++) {
while (d->dedup_table[i]) {
tmp = d->dedup_table[i];
d->dedup_table[i] = tmp->next;
free(tmp);
}
head = d->dedup_table[i];
while (head) {
tmp = head;
head = head->next;
free(tmp);
}
}
free(d->dedup_table);
d->dedup_table = NULL;
}
static void btf_dedup_free(struct btf_dedup *d)
{
btf_dedup_table_free(d);
hashmap__free(d->dedup_table);
d->dedup_table = NULL;
free(d->map);
d->map = NULL;
@@ -1310,40 +1391,43 @@ static void btf_dedup_free(struct btf_dedup *d)
free(d);
}
/* Find closest power of two >= to size, capped at 2^max_size_log */
static __u32 roundup_pow2_max(__u32 size, int max_size_log)
static size_t btf_dedup_identity_hash_fn(const void *key, void *ctx)
{
int i;
for (i = 0; i < max_size_log && (1U << i) < size; i++)
;
return 1U << i;
return (size_t)key;
}
static size_t btf_dedup_collision_hash_fn(const void *key, void *ctx)
{
return 0;
}
static bool btf_dedup_equal_fn(const void *k1, const void *k2, void *ctx)
{
return k1 == k2;
}
static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
const struct btf_dedup_opts *opts)
{
struct btf_dedup *d = calloc(1, sizeof(struct btf_dedup));
hashmap_hash_fn hash_fn = btf_dedup_identity_hash_fn;
int i, err = 0;
__u32 sz;
if (!d)
return ERR_PTR(-ENOMEM);
d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
sz = opts && opts->dedup_table_size ? opts->dedup_table_size
: BTF_DEDUP_TABLE_DEFAULT_SIZE;
sz = roundup_pow2_max(sz, BTF_DEDUP_TABLE_MAX_SIZE_LOG);
d->opts.dedup_table_size = sz;
/* dedup_table_size is now used only to force collisions in tests */
if (opts && opts->dedup_table_size == 1)
hash_fn = btf_dedup_collision_hash_fn;
d->btf = btf;
d->btf_ext = btf_ext;
d->dedup_table = calloc(d->opts.dedup_table_size,
sizeof(struct btf_dedup_node *));
if (!d->dedup_table) {
err = -ENOMEM;
d->dedup_table = hashmap__new(hash_fn, btf_dedup_equal_fn, NULL);
if (IS_ERR(d->dedup_table)) {
err = PTR_ERR(d->dedup_table);
d->dedup_table = NULL;
goto done;
}
@@ -1662,9 +1746,9 @@ done:
return err;
}
static __u32 btf_hash_common(struct btf_type *t)
static long btf_hash_common(struct btf_type *t)
{
__u32 h;
long h;
h = hash_combine(0, t->name_off);
h = hash_combine(h, t->info);
@@ -1680,10 +1764,10 @@ static bool btf_equal_common(struct btf_type *t1, struct btf_type *t2)
}
/* Calculate type signature hash of INT. */
static __u32 btf_hash_int(struct btf_type *t)
static long btf_hash_int(struct btf_type *t)
{
__u32 info = *(__u32 *)(t + 1);
__u32 h;
long h;
h = btf_hash_common(t);
h = hash_combine(h, info);
@@ -1703,9 +1787,9 @@ static bool btf_equal_int(struct btf_type *t1, struct btf_type *t2)
}
/* Calculate type signature hash of ENUM. */
static __u32 btf_hash_enum(struct btf_type *t)
static long btf_hash_enum(struct btf_type *t)
{
__u32 h;
long h;
/* don't hash vlen and enum members to support enum fwd resolving */
h = hash_combine(0, t->name_off);
@@ -1757,11 +1841,11 @@ static bool btf_compat_enum(struct btf_type *t1, struct btf_type *t2)
* as referenced type IDs equivalence is established separately during type
* graph equivalence check algorithm.
*/
static __u32 btf_hash_struct(struct btf_type *t)
static long btf_hash_struct(struct btf_type *t)
{
struct btf_member *member = (struct btf_member *)(t + 1);
__u32 vlen = BTF_INFO_VLEN(t->info);
__u32 h = btf_hash_common(t);
long h = btf_hash_common(t);
int i;
for (i = 0; i < vlen; i++) {
@@ -1804,10 +1888,10 @@ static bool btf_shallow_equal_struct(struct btf_type *t1, struct btf_type *t2)
* under assumption that they were already resolved to canonical type IDs and
* are not going to change.
*/
static __u32 btf_hash_array(struct btf_type *t)
static long btf_hash_array(struct btf_type *t)
{
struct btf_array *info = (struct btf_array *)(t + 1);
__u32 h = btf_hash_common(t);
long h = btf_hash_common(t);
h = hash_combine(h, info->type);
h = hash_combine(h, info->index_type);
@@ -1858,11 +1942,11 @@ static bool btf_compat_array(struct btf_type *t1, struct btf_type *t2)
* under assumption that they were already resolved to canonical type IDs and
* are not going to change.
*/
static inline __u32 btf_hash_fnproto(struct btf_type *t)
static long btf_hash_fnproto(struct btf_type *t)
{
struct btf_param *member = (struct btf_param *)(t + 1);
__u16 vlen = BTF_INFO_VLEN(t->info);
__u32 h = btf_hash_common(t);
long h = btf_hash_common(t);
int i;
for (i = 0; i < vlen; i++) {
@@ -1880,7 +1964,7 @@ static inline __u32 btf_hash_fnproto(struct btf_type *t)
* This function is called during reference types deduplication to compare
* FUNC_PROTO to potential canonical representative.
*/
static inline bool btf_equal_fnproto(struct btf_type *t1, struct btf_type *t2)
static bool btf_equal_fnproto(struct btf_type *t1, struct btf_type *t2)
{
struct btf_param *m1, *m2;
__u16 vlen;
@@ -1906,7 +1990,7 @@ static inline bool btf_equal_fnproto(struct btf_type *t1, struct btf_type *t2)
* IDs. This check is performed during type graph equivalence check and
* referenced types equivalence is checked separately.
*/
static inline bool btf_compat_fnproto(struct btf_type *t1, struct btf_type *t2)
static bool btf_compat_fnproto(struct btf_type *t1, struct btf_type *t2)
{
struct btf_param *m1, *m2;
__u16 vlen;
@@ -1937,11 +2021,12 @@ static inline bool btf_compat_fnproto(struct btf_type *t1, struct btf_type *t2)
static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
{
struct btf_type *t = d->btf->types[type_id];
struct hashmap_entry *hash_entry;
struct btf_type *cand;
struct btf_dedup_node *cand_node;
/* if we don't find equivalent type, then we are canonical */
__u32 new_id = type_id;
__u32 h;
__u32 cand_id;
long h;
switch (BTF_INFO_KIND(t->info)) {
case BTF_KIND_CONST:
@@ -1960,10 +2045,11 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
case BTF_KIND_INT:
h = btf_hash_int(t);
for_each_dedup_cand(d, h, cand_node) {
cand = d->btf->types[cand_node->type_id];
for_each_dedup_cand(d, hash_entry, h) {
cand_id = (__u32)(long)hash_entry->value;
cand = d->btf->types[cand_id];
if (btf_equal_int(t, cand)) {
new_id = cand_node->type_id;
new_id = cand_id;
break;
}
}
@@ -1971,10 +2057,11 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
case BTF_KIND_ENUM:
h = btf_hash_enum(t);
for_each_dedup_cand(d, h, cand_node) {
cand = d->btf->types[cand_node->type_id];
for_each_dedup_cand(d, hash_entry, h) {
cand_id = (__u32)(long)hash_entry->value;
cand = d->btf->types[cand_id];
if (btf_equal_enum(t, cand)) {
new_id = cand_node->type_id;
new_id = cand_id;
break;
}
if (d->opts.dont_resolve_fwds)
@@ -1982,21 +2069,22 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
if (btf_compat_enum(t, cand)) {
if (btf_is_enum_fwd(t)) {
/* resolve fwd to full enum */
new_id = cand_node->type_id;
new_id = cand_id;
break;
}
/* resolve canonical enum fwd to full enum */
d->map[cand_node->type_id] = type_id;
d->map[cand_id] = type_id;
}
}
break;
case BTF_KIND_FWD:
h = btf_hash_common(t);
for_each_dedup_cand(d, h, cand_node) {
cand = d->btf->types[cand_node->type_id];
for_each_dedup_cand(d, hash_entry, h) {
cand_id = (__u32)(long)hash_entry->value;
cand = d->btf->types[cand_id];
if (btf_equal_common(t, cand)) {
new_id = cand_node->type_id;
new_id = cand_id;
break;
}
}
@@ -2397,12 +2485,12 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d)
*/
static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
{
struct btf_dedup_node *cand_node;
struct btf_type *cand_type, *t;
struct hashmap_entry *hash_entry;
/* if we don't find equivalent type, then we are canonical */
__u32 new_id = type_id;
__u16 kind;
__u32 h;
long h;
/* already deduped or is in process of deduping (loop detected) */
if (d->map[type_id] <= BTF_MAX_NR_TYPES)
@@ -2415,7 +2503,8 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
return 0;
h = btf_hash_struct(t);
for_each_dedup_cand(d, h, cand_node) {
for_each_dedup_cand(d, hash_entry, h) {
__u32 cand_id = (__u32)(long)hash_entry->value;
int eq;
/*
@@ -2428,17 +2517,17 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
* creating a loop (FWD -> STRUCT and STRUCT -> FWD), because
* FWD and compatible STRUCT/UNION are considered equivalent.
*/
cand_type = d->btf->types[cand_node->type_id];
cand_type = d->btf->types[cand_id];
if (!btf_shallow_equal_struct(t, cand_type))
continue;
btf_dedup_clear_hypot_map(d);
eq = btf_dedup_is_equiv(d, type_id, cand_node->type_id);
eq = btf_dedup_is_equiv(d, type_id, cand_id);
if (eq < 0)
return eq;
if (!eq)
continue;
new_id = cand_node->type_id;
new_id = cand_id;
btf_dedup_merge_hypot_map(d);
break;
}
@@ -2488,12 +2577,12 @@ static int btf_dedup_struct_types(struct btf_dedup *d)
*/
static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
{
struct btf_dedup_node *cand_node;
struct hashmap_entry *hash_entry;
__u32 new_id = type_id, cand_id;
struct btf_type *t, *cand;
/* if we don't find equivalent type, then we are representative type */
__u32 new_id = type_id;
int ref_type_id;
__u32 h;
long h;
if (d->map[type_id] == BTF_IN_PROGRESS_ID)
return -ELOOP;
@@ -2516,10 +2605,11 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
t->type = ref_type_id;
h = btf_hash_common(t);
for_each_dedup_cand(d, h, cand_node) {
cand = d->btf->types[cand_node->type_id];
for_each_dedup_cand(d, hash_entry, h) {
cand_id = (__u32)(long)hash_entry->value;
cand = d->btf->types[cand_id];
if (btf_equal_common(t, cand)) {
new_id = cand_node->type_id;
new_id = cand_id;
break;
}
}
@@ -2539,10 +2629,11 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
info->index_type = ref_type_id;
h = btf_hash_array(t);
for_each_dedup_cand(d, h, cand_node) {
cand = d->btf->types[cand_node->type_id];
for_each_dedup_cand(d, hash_entry, h) {
cand_id = (__u32)(long)hash_entry->value;
cand = d->btf->types[cand_id];
if (btf_equal_array(t, cand)) {
new_id = cand_node->type_id;
new_id = cand_id;
break;
}
}
@@ -2570,10 +2661,11 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
}
h = btf_hash_fnproto(t);
for_each_dedup_cand(d, h, cand_node) {
cand = d->btf->types[cand_node->type_id];
for_each_dedup_cand(d, hash_entry, h) {
cand_id = (__u32)(long)hash_entry->value;
cand = d->btf->types[cand_id];
if (btf_equal_fnproto(t, cand)) {
new_id = cand_node->type_id;
new_id = cand_id;
break;
}
}
@@ -2600,7 +2692,9 @@ static int btf_dedup_ref_types(struct btf_dedup *d)
if (err < 0)
return err;
}
btf_dedup_table_free(d);
/* we won't need d->dedup_table anymore */
hashmap__free(d->dedup_table);
d->dedup_table = NULL;
return 0;
}

View File

@@ -4,6 +4,7 @@
#ifndef __LIBBPF_BTF_H
#define __LIBBPF_BTF_H
#include <stdarg.h>
#include <linux/types.h>
#ifdef __cplusplus
@@ -16,6 +17,7 @@ extern "C" {
#define BTF_ELF_SEC ".BTF"
#define BTF_EXT_ELF_SEC ".BTF.ext"
#define MAPS_ELF_SEC ".maps"
struct btf;
struct btf_ext;
@@ -59,6 +61,8 @@ struct btf_ext_header {
LIBBPF_API void btf__free(struct btf *btf);
LIBBPF_API struct btf *btf__new(__u8 *data, __u32 size);
LIBBPF_API struct btf *btf__parse_elf(const char *path,
struct btf_ext **btf_ext);
LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf);
LIBBPF_API int btf__load(struct btf *btf);
LIBBPF_API __s32 btf__find_by_name(const struct btf *btf,
@@ -100,6 +104,22 @@ struct btf_dedup_opts {
LIBBPF_API int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
const struct btf_dedup_opts *opts);
struct btf_dump;
struct btf_dump_opts {
void *ctx;
};
typedef void (*btf_dump_printf_fn_t)(void *ctx, const char *fmt, va_list args);
LIBBPF_API struct btf_dump *btf_dump__new(const struct btf *btf,
const struct btf_ext *btf_ext,
const struct btf_dump_opts *opts,
btf_dump_printf_fn_t printf_fn);
LIBBPF_API void btf_dump__free(struct btf_dump *d);
LIBBPF_API int btf_dump__dump_type(struct btf_dump *d, __u32 id);
#ifdef __cplusplus
} /* extern "C" */
#endif

1333
src/btf_dump.c Normal file

File diff suppressed because it is too large Load Diff

229
src/hashmap.c Normal file
View File

@@ -0,0 +1,229 @@
// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/*
* Generic non-thread safe hash map implementation.
*
* Copyright (c) 2019 Facebook
*/
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <linux/err.h>
#include "hashmap.h"
/* start with 4 buckets */
#define HASHMAP_MIN_CAP_BITS 2
static void hashmap_add_entry(struct hashmap_entry **pprev,
struct hashmap_entry *entry)
{
entry->next = *pprev;
*pprev = entry;
}
static void hashmap_del_entry(struct hashmap_entry **pprev,
struct hashmap_entry *entry)
{
*pprev = entry->next;
entry->next = NULL;
}
void hashmap__init(struct hashmap *map, hashmap_hash_fn hash_fn,
hashmap_equal_fn equal_fn, void *ctx)
{
map->hash_fn = hash_fn;
map->equal_fn = equal_fn;
map->ctx = ctx;
map->buckets = NULL;
map->cap = 0;
map->cap_bits = 0;
map->sz = 0;
}
struct hashmap *hashmap__new(hashmap_hash_fn hash_fn,
hashmap_equal_fn equal_fn,
void *ctx)
{
struct hashmap *map = malloc(sizeof(struct hashmap));
if (!map)
return ERR_PTR(-ENOMEM);
hashmap__init(map, hash_fn, equal_fn, ctx);
return map;
}
void hashmap__clear(struct hashmap *map)
{
free(map->buckets);
map->cap = map->cap_bits = map->sz = 0;
}
void hashmap__free(struct hashmap *map)
{
if (!map)
return;
hashmap__clear(map);
free(map);
}
size_t hashmap__size(const struct hashmap *map)
{
return map->sz;
}
size_t hashmap__capacity(const struct hashmap *map)
{
return map->cap;
}
static bool hashmap_needs_to_grow(struct hashmap *map)
{
/* grow if empty or more than 75% filled */
return (map->cap == 0) || ((map->sz + 1) * 4 / 3 > map->cap);
}
static int hashmap_grow(struct hashmap *map)
{
struct hashmap_entry **new_buckets;
struct hashmap_entry *cur, *tmp;
size_t new_cap_bits, new_cap;
size_t h;
int bkt;
new_cap_bits = map->cap_bits + 1;
if (new_cap_bits < HASHMAP_MIN_CAP_BITS)
new_cap_bits = HASHMAP_MIN_CAP_BITS;
new_cap = 1UL << new_cap_bits;
new_buckets = calloc(new_cap, sizeof(new_buckets[0]));
if (!new_buckets)
return -ENOMEM;
hashmap__for_each_entry_safe(map, cur, tmp, bkt) {
h = hash_bits(map->hash_fn(cur->key, map->ctx), new_cap_bits);
hashmap_add_entry(&new_buckets[h], cur);
}
map->cap = new_cap;
map->cap_bits = new_cap_bits;
free(map->buckets);
map->buckets = new_buckets;
return 0;
}
static bool hashmap_find_entry(const struct hashmap *map,
const void *key, size_t hash,
struct hashmap_entry ***pprev,
struct hashmap_entry **entry)
{
struct hashmap_entry *cur, **prev_ptr;
if (!map->buckets)
return false;
for (prev_ptr = &map->buckets[hash], cur = *prev_ptr;
cur;
prev_ptr = &cur->next, cur = cur->next) {
if (map->equal_fn(cur->key, key, map->ctx)) {
if (pprev)
*pprev = prev_ptr;
*entry = cur;
return true;
}
}
return false;
}
int hashmap__insert(struct hashmap *map, const void *key, void *value,
enum hashmap_insert_strategy strategy,
const void **old_key, void **old_value)
{
struct hashmap_entry *entry;
size_t h;
int err;
if (old_key)
*old_key = NULL;
if (old_value)
*old_value = NULL;
h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits);
if (strategy != HASHMAP_APPEND &&
hashmap_find_entry(map, key, h, NULL, &entry)) {
if (old_key)
*old_key = entry->key;
if (old_value)
*old_value = entry->value;
if (strategy == HASHMAP_SET || strategy == HASHMAP_UPDATE) {
entry->key = key;
entry->value = value;
return 0;
} else if (strategy == HASHMAP_ADD) {
return -EEXIST;
}
}
if (strategy == HASHMAP_UPDATE)
return -ENOENT;
if (hashmap_needs_to_grow(map)) {
err = hashmap_grow(map);
if (err)
return err;
h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits);
}
entry = malloc(sizeof(struct hashmap_entry));
if (!entry)
return -ENOMEM;
entry->key = key;
entry->value = value;
hashmap_add_entry(&map->buckets[h], entry);
map->sz++;
return 0;
}
bool hashmap__find(const struct hashmap *map, const void *key, void **value)
{
struct hashmap_entry *entry;
size_t h;
h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits);
if (!hashmap_find_entry(map, key, h, NULL, &entry))
return false;
if (value)
*value = entry->value;
return true;
}
bool hashmap__delete(struct hashmap *map, const void *key,
const void **old_key, void **old_value)
{
struct hashmap_entry **pprev, *entry;
size_t h;
h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits);
if (!hashmap_find_entry(map, key, h, &pprev, &entry))
return false;
if (old_key)
*old_key = entry->key;
if (old_value)
*old_value = entry->value;
hashmap_del_entry(pprev, entry);
free(entry);
map->sz--;
return true;
}

173
src/hashmap.h Normal file
View File

@@ -0,0 +1,173 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/*
* Generic non-thread safe hash map implementation.
*
* Copyright (c) 2019 Facebook
*/
#ifndef __LIBBPF_HASHMAP_H
#define __LIBBPF_HASHMAP_H
#include <stdbool.h>
#include <stddef.h>
#include "libbpf_internal.h"
static inline size_t hash_bits(size_t h, int bits)
{
/* shuffle bits and return requested number of upper bits */
return (h * 11400714819323198485llu) >> (__WORDSIZE - bits);
}
typedef size_t (*hashmap_hash_fn)(const void *key, void *ctx);
typedef bool (*hashmap_equal_fn)(const void *key1, const void *key2, void *ctx);
struct hashmap_entry {
const void *key;
void *value;
struct hashmap_entry *next;
};
struct hashmap {
hashmap_hash_fn hash_fn;
hashmap_equal_fn equal_fn;
void *ctx;
struct hashmap_entry **buckets;
size_t cap;
size_t cap_bits;
size_t sz;
};
#define HASHMAP_INIT(hash_fn, equal_fn, ctx) { \
.hash_fn = (hash_fn), \
.equal_fn = (equal_fn), \
.ctx = (ctx), \
.buckets = NULL, \
.cap = 0, \
.cap_bits = 0, \
.sz = 0, \
}
void hashmap__init(struct hashmap *map, hashmap_hash_fn hash_fn,
hashmap_equal_fn equal_fn, void *ctx);
struct hashmap *hashmap__new(hashmap_hash_fn hash_fn,
hashmap_equal_fn equal_fn,
void *ctx);
void hashmap__clear(struct hashmap *map);
void hashmap__free(struct hashmap *map);
size_t hashmap__size(const struct hashmap *map);
size_t hashmap__capacity(const struct hashmap *map);
/*
* Hashmap insertion strategy:
* - HASHMAP_ADD - only add key/value if key doesn't exist yet;
* - HASHMAP_SET - add key/value pair if key doesn't exist yet; otherwise,
* update value;
* - HASHMAP_UPDATE - update value, if key already exists; otherwise, do
* nothing and return -ENOENT;
* - HASHMAP_APPEND - always add key/value pair, even if key already exists.
* This turns hashmap into a multimap by allowing multiple values to be
* associated with the same key. Most useful read API for such hashmap is
* hashmap__for_each_key_entry() iteration. If hashmap__find() is still
* used, it will return last inserted key/value entry (first in a bucket
* chain).
*/
enum hashmap_insert_strategy {
HASHMAP_ADD,
HASHMAP_SET,
HASHMAP_UPDATE,
HASHMAP_APPEND,
};
/*
* hashmap__insert() adds key/value entry w/ various semantics, depending on
* provided strategy value. If a given key/value pair replaced already
* existing key/value pair, both old key and old value will be returned
* through old_key and old_value to allow calling code do proper memory
* management.
*/
int hashmap__insert(struct hashmap *map, const void *key, void *value,
enum hashmap_insert_strategy strategy,
const void **old_key, void **old_value);
static inline int hashmap__add(struct hashmap *map,
const void *key, void *value)
{
return hashmap__insert(map, key, value, HASHMAP_ADD, NULL, NULL);
}
static inline int hashmap__set(struct hashmap *map,
const void *key, void *value,
const void **old_key, void **old_value)
{
return hashmap__insert(map, key, value, HASHMAP_SET,
old_key, old_value);
}
static inline int hashmap__update(struct hashmap *map,
const void *key, void *value,
const void **old_key, void **old_value)
{
return hashmap__insert(map, key, value, HASHMAP_UPDATE,
old_key, old_value);
}
static inline int hashmap__append(struct hashmap *map,
const void *key, void *value)
{
return hashmap__insert(map, key, value, HASHMAP_APPEND, NULL, NULL);
}
bool hashmap__delete(struct hashmap *map, const void *key,
const void **old_key, void **old_value);
bool hashmap__find(const struct hashmap *map, const void *key, void **value);
/*
* hashmap__for_each_entry - iterate over all entries in hashmap
* @map: hashmap to iterate
* @cur: struct hashmap_entry * used as a loop cursor
* @bkt: integer used as a bucket loop cursor
*/
#define hashmap__for_each_entry(map, cur, bkt) \
for (bkt = 0; bkt < map->cap; bkt++) \
for (cur = map->buckets[bkt]; cur; cur = cur->next)
/*
* hashmap__for_each_entry_safe - iterate over all entries in hashmap, safe
* against removals
* @map: hashmap to iterate
* @cur: struct hashmap_entry * used as a loop cursor
* @tmp: struct hashmap_entry * used as a temporary next cursor storage
* @bkt: integer used as a bucket loop cursor
*/
#define hashmap__for_each_entry_safe(map, cur, tmp, bkt) \
for (bkt = 0; bkt < map->cap; bkt++) \
for (cur = map->buckets[bkt]; \
cur && ({tmp = cur->next; true; }); \
cur = tmp)
/*
* hashmap__for_each_key_entry - iterate over entries associated with given key
* @map: hashmap to iterate
* @cur: struct hashmap_entry * used as a loop cursor
* @key: key to iterate entries for
*/
#define hashmap__for_each_key_entry(map, cur, _key) \
for (cur = ({ size_t bkt = hash_bits(map->hash_fn((_key), map->ctx),\
map->cap_bits); \
map->buckets ? map->buckets[bkt] : NULL; }); \
cur; \
cur = cur->next) \
if (map->equal_fn(cur->key, (_key), map->ctx))
#define hashmap__for_each_key_entry_safe(map, cur, tmp, _key) \
for (cur = ({ size_t bkt = hash_bits(map->hash_fn((_key), map->ctx),\
map->cap_bits); \
cur = map->buckets ? map->buckets[bkt] : NULL; }); \
cur && ({ tmp = cur->next; true; }); \
cur = tmp) \
if (map->equal_fn(cur->key, (_key), map->ctx))
#endif /* __LIBBPF_HASHMAP_H */

File diff suppressed because it is too large Load Diff

View File

@@ -89,18 +89,25 @@ LIBBPF_API int bpf_object__unpin_programs(struct bpf_object *obj,
LIBBPF_API int bpf_object__pin(struct bpf_object *object, const char *path);
LIBBPF_API void bpf_object__close(struct bpf_object *object);
struct bpf_object_load_attr {
struct bpf_object *obj;
int log_level;
};
/* Load/unload object into/from kernel */
LIBBPF_API int bpf_object__load(struct bpf_object *obj);
LIBBPF_API int bpf_object__load_xattr(struct bpf_object_load_attr *attr);
LIBBPF_API int bpf_object__unload(struct bpf_object *obj);
LIBBPF_API const char *bpf_object__name(struct bpf_object *obj);
LIBBPF_API unsigned int bpf_object__kversion(struct bpf_object *obj);
LIBBPF_API const char *bpf_object__name(const struct bpf_object *obj);
LIBBPF_API unsigned int bpf_object__kversion(const struct bpf_object *obj);
struct btf;
LIBBPF_API struct btf *bpf_object__btf(struct bpf_object *obj);
LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj);
LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj);
LIBBPF_API struct bpf_program *
bpf_object__find_program_by_title(struct bpf_object *obj, const char *title);
bpf_object__find_program_by_title(const struct bpf_object *obj,
const char *title);
LIBBPF_API struct bpf_object *bpf_object__next(struct bpf_object *prev);
#define bpf_object__for_each_safe(pos, tmp) \
@@ -112,7 +119,7 @@ LIBBPF_API struct bpf_object *bpf_object__next(struct bpf_object *prev);
typedef void (*bpf_object_clear_priv_t)(struct bpf_object *, void *);
LIBBPF_API int bpf_object__set_priv(struct bpf_object *obj, void *priv,
bpf_object_clear_priv_t clear_priv);
LIBBPF_API void *bpf_object__priv(struct bpf_object *prog);
LIBBPF_API void *bpf_object__priv(const struct bpf_object *prog);
LIBBPF_API int
libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type,
@@ -123,7 +130,7 @@ LIBBPF_API int libbpf_attach_type_by_name(const char *name,
/* Accessors of bpf_program */
struct bpf_program;
LIBBPF_API struct bpf_program *bpf_program__next(struct bpf_program *prog,
struct bpf_object *obj);
const struct bpf_object *obj);
#define bpf_object__for_each_program(pos, obj) \
for ((pos) = bpf_program__next(NULL, (obj)); \
@@ -131,24 +138,23 @@ LIBBPF_API struct bpf_program *bpf_program__next(struct bpf_program *prog,
(pos) = bpf_program__next((pos), (obj)))
LIBBPF_API struct bpf_program *bpf_program__prev(struct bpf_program *prog,
struct bpf_object *obj);
const struct bpf_object *obj);
typedef void (*bpf_program_clear_priv_t)(struct bpf_program *,
void *);
typedef void (*bpf_program_clear_priv_t)(struct bpf_program *, void *);
LIBBPF_API int bpf_program__set_priv(struct bpf_program *prog, void *priv,
bpf_program_clear_priv_t clear_priv);
LIBBPF_API void *bpf_program__priv(struct bpf_program *prog);
LIBBPF_API void *bpf_program__priv(const struct bpf_program *prog);
LIBBPF_API void bpf_program__set_ifindex(struct bpf_program *prog,
__u32 ifindex);
LIBBPF_API const char *bpf_program__title(struct bpf_program *prog,
LIBBPF_API const char *bpf_program__title(const struct bpf_program *prog,
bool needs_copy);
LIBBPF_API int bpf_program__load(struct bpf_program *prog, char *license,
__u32 kern_version);
LIBBPF_API int bpf_program__fd(struct bpf_program *prog);
LIBBPF_API int bpf_program__fd(const struct bpf_program *prog);
LIBBPF_API int bpf_program__pin_instance(struct bpf_program *prog,
const char *path,
int instance);
@@ -159,6 +165,27 @@ LIBBPF_API int bpf_program__pin(struct bpf_program *prog, const char *path);
LIBBPF_API int bpf_program__unpin(struct bpf_program *prog, const char *path);
LIBBPF_API void bpf_program__unload(struct bpf_program *prog);
struct bpf_link;
LIBBPF_API int bpf_link__destroy(struct bpf_link *link);
LIBBPF_API struct bpf_link *
bpf_program__attach_perf_event(struct bpf_program *prog, int pfd);
LIBBPF_API struct bpf_link *
bpf_program__attach_kprobe(struct bpf_program *prog, bool retprobe,
const char *func_name);
LIBBPF_API struct bpf_link *
bpf_program__attach_uprobe(struct bpf_program *prog, bool retprobe,
pid_t pid, const char *binary_path,
size_t func_offset);
LIBBPF_API struct bpf_link *
bpf_program__attach_tracepoint(struct bpf_program *prog,
const char *tp_category,
const char *tp_name);
LIBBPF_API struct bpf_link *
bpf_program__attach_raw_tracepoint(struct bpf_program *prog,
const char *tp_name);
struct bpf_insn;
/*
@@ -221,7 +248,7 @@ typedef int (*bpf_program_prep_t)(struct bpf_program *prog, int n,
LIBBPF_API int bpf_program__set_prep(struct bpf_program *prog, int nr_instance,
bpf_program_prep_t prep);
LIBBPF_API int bpf_program__nth_fd(struct bpf_program *prog, int n);
LIBBPF_API int bpf_program__nth_fd(const struct bpf_program *prog, int n);
/*
* Adjust type of BPF program. Default is kprobe.
@@ -240,14 +267,14 @@ LIBBPF_API void
bpf_program__set_expected_attach_type(struct bpf_program *prog,
enum bpf_attach_type type);
LIBBPF_API bool bpf_program__is_socket_filter(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_tracepoint(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_raw_tracepoint(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_kprobe(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sched_cls(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sched_act(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_xdp(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_perf_event(struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_socket_filter(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_tracepoint(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_raw_tracepoint(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_kprobe(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sched_cls(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_sched_act(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_xdp(const struct bpf_program *prog);
LIBBPF_API bool bpf_program__is_perf_event(const struct bpf_program *prog);
/*
* No need for __attribute__((packed)), all members of 'bpf_map_def'
@@ -269,10 +296,10 @@ struct bpf_map_def {
*/
struct bpf_map;
LIBBPF_API struct bpf_map *
bpf_object__find_map_by_name(struct bpf_object *obj, const char *name);
bpf_object__find_map_by_name(const struct bpf_object *obj, const char *name);
LIBBPF_API int
bpf_object__find_map_fd_by_name(struct bpf_object *obj, const char *name);
bpf_object__find_map_fd_by_name(const struct bpf_object *obj, const char *name);
/*
* Get bpf_map through the offset of corresponding struct bpf_map_def
@@ -282,7 +309,7 @@ LIBBPF_API struct bpf_map *
bpf_object__find_map_by_offset(struct bpf_object *obj, size_t offset);
LIBBPF_API struct bpf_map *
bpf_map__next(struct bpf_map *map, struct bpf_object *obj);
bpf_map__next(const struct bpf_map *map, const struct bpf_object *obj);
#define bpf_object__for_each_map(pos, obj) \
for ((pos) = bpf_map__next(NULL, (obj)); \
(pos) != NULL; \
@@ -290,22 +317,22 @@ bpf_map__next(struct bpf_map *map, struct bpf_object *obj);
#define bpf_map__for_each bpf_object__for_each_map
LIBBPF_API struct bpf_map *
bpf_map__prev(struct bpf_map *map, struct bpf_object *obj);
bpf_map__prev(const struct bpf_map *map, const struct bpf_object *obj);
LIBBPF_API int bpf_map__fd(struct bpf_map *map);
LIBBPF_API const struct bpf_map_def *bpf_map__def(struct bpf_map *map);
LIBBPF_API const char *bpf_map__name(struct bpf_map *map);
LIBBPF_API int bpf_map__fd(const struct bpf_map *map);
LIBBPF_API const struct bpf_map_def *bpf_map__def(const struct bpf_map *map);
LIBBPF_API const char *bpf_map__name(const struct bpf_map *map);
LIBBPF_API __u32 bpf_map__btf_key_type_id(const struct bpf_map *map);
LIBBPF_API __u32 bpf_map__btf_value_type_id(const struct bpf_map *map);
typedef void (*bpf_map_clear_priv_t)(struct bpf_map *, void *);
LIBBPF_API int bpf_map__set_priv(struct bpf_map *map, void *priv,
bpf_map_clear_priv_t clear_priv);
LIBBPF_API void *bpf_map__priv(struct bpf_map *map);
LIBBPF_API void *bpf_map__priv(const struct bpf_map *map);
LIBBPF_API int bpf_map__reuse_fd(struct bpf_map *map, int fd);
LIBBPF_API int bpf_map__resize(struct bpf_map *map, __u32 max_entries);
LIBBPF_API bool bpf_map__is_offload_neutral(struct bpf_map *map);
LIBBPF_API bool bpf_map__is_internal(struct bpf_map *map);
LIBBPF_API bool bpf_map__is_offload_neutral(const struct bpf_map *map);
LIBBPF_API bool bpf_map__is_internal(const struct bpf_map *map);
LIBBPF_API void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex);
LIBBPF_API int bpf_map__pin(struct bpf_map *map, const char *path);
LIBBPF_API int bpf_map__unpin(struct bpf_map *map, const char *path);
@@ -320,6 +347,7 @@ struct bpf_prog_load_attr {
enum bpf_attach_type expected_attach_type;
int ifindex;
int log_level;
int prog_flags;
};
LIBBPF_API int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
@@ -330,6 +358,26 @@ LIBBPF_API int bpf_prog_load(const char *file, enum bpf_prog_type type,
LIBBPF_API int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags);
LIBBPF_API int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags);
struct perf_buffer;
typedef void (*perf_buffer_sample_fn)(void *ctx, int cpu,
void *data, __u32 size);
typedef void (*perf_buffer_lost_fn)(void *ctx, int cpu, __u64 cnt);
/* common use perf buffer options */
struct perf_buffer_opts {
/* if specified, sample_cb is called for each sample */
perf_buffer_sample_fn sample_cb;
/* if specified, lost_cb is called for each batch of lost samples */
perf_buffer_lost_fn lost_cb;
/* ctx is provided to sample_cb and lost_cb */
void *ctx;
};
LIBBPF_API struct perf_buffer *
perf_buffer__new(int map_fd, size_t page_cnt,
const struct perf_buffer_opts *opts);
enum bpf_perf_event_ret {
LIBBPF_PERF_EVENT_DONE = 0,
LIBBPF_PERF_EVENT_ERROR = -1,
@@ -337,6 +385,35 @@ enum bpf_perf_event_ret {
};
struct perf_event_header;
typedef enum bpf_perf_event_ret
(*perf_buffer_event_fn)(void *ctx, int cpu, struct perf_event_header *event);
/* raw perf buffer options, giving most power and control */
struct perf_buffer_raw_opts {
/* perf event attrs passed directly into perf_event_open() */
struct perf_event_attr *attr;
/* raw event callback */
perf_buffer_event_fn event_cb;
/* ctx is provided to event_cb */
void *ctx;
/* if cpu_cnt == 0, open all on all possible CPUs (up to the number of
* max_entries of given PERF_EVENT_ARRAY map)
*/
int cpu_cnt;
/* if cpu_cnt > 0, cpus is an array of CPUs to open ring buffers on */
int *cpus;
/* if cpu_cnt > 0, map_keys specify map keys to set per-CPU FDs for */
int *map_keys;
};
LIBBPF_API struct perf_buffer *
perf_buffer__new_raw(int map_fd, size_t page_cnt,
const struct perf_buffer_raw_opts *opts);
LIBBPF_API void perf_buffer__free(struct perf_buffer *pb);
LIBBPF_API int perf_buffer__poll(struct perf_buffer *pb, int timeout_ms);
typedef enum bpf_perf_event_ret
(*bpf_perf_event_print_t)(struct perf_event_header *hdr,
void *private_data);
@@ -447,6 +524,22 @@ bpf_program__bpil_addr_to_offs(struct bpf_prog_info_linear *info_linear);
LIBBPF_API void
bpf_program__bpil_offs_to_addr(struct bpf_prog_info_linear *info_linear);
/*
* A helper function to get the number of possible CPUs before looking up
* per-CPU maps. Negative errno is returned on failure.
*
* Example usage:
*
* int ncpus = libbpf_num_possible_cpus();
* if (ncpus < 0) {
* // error handling
* }
* long values[ncpus];
* bpf_map_lookup_elem(per_cpu_map_fd, key, values);
*
*/
LIBBPF_API int libbpf_num_possible_cpus(void);
#ifdef __cplusplus
} /* extern "C" */
#endif

View File

@@ -164,3 +164,23 @@ LIBBPF_0.0.3 {
bpf_map_freeze;
btf__finalize_data;
} LIBBPF_0.0.2;
LIBBPF_0.0.4 {
global:
bpf_link__destroy;
bpf_object__load_xattr;
bpf_program__attach_kprobe;
bpf_program__attach_perf_event;
bpf_program__attach_raw_tracepoint;
bpf_program__attach_tracepoint;
bpf_program__attach_uprobe;
btf_dump__dump_type;
btf_dump__free;
btf_dump__new;
btf__parse_elf;
libbpf_num_possible_cpus;
perf_buffer__free;
perf_buffer__new;
perf_buffer__new_raw;
perf_buffer__poll;
} LIBBPF_0.0.3;

View File

@@ -9,6 +9,8 @@
#ifndef __LIBBPF_LIBBPF_INTERNAL_H
#define __LIBBPF_LIBBPF_INTERNAL_H
#include "libbpf.h"
#define BTF_INFO_ENC(kind, kind_flag, vlen) \
((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
#define BTF_TYPE_ENC(name, info, size_or_type) (name), (info), (size_or_type)
@@ -21,6 +23,13 @@
#define BTF_PARAM_ENC(name, type) (name), (type)
#define BTF_VAR_SECINFO_ENC(type, offset, size) (type), (offset), (size)
#ifndef min
# define min(x, y) ((x) < (y) ? (x) : (y))
#endif
#ifndef max
# define max(x, y) ((x) < (y) ? (y) : (x))
#endif
extern void libbpf_print(enum libbpf_print_level level,
const char *format, ...)
__attribute__((format(printf, 2, 3)));
@@ -34,7 +43,7 @@ do { \
#define pr_info(fmt, ...) __pr(LIBBPF_INFO, fmt, ##__VA_ARGS__)
#define pr_debug(fmt, ...) __pr(LIBBPF_DEBUG, fmt, ##__VA_ARGS__)
int libbpf__probe_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len);
int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len);
#endif /* __LIBBPF_LIBBPF_INTERNAL_H */

View File

@@ -101,6 +101,7 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
case BPF_PROG_TYPE_SK_REUSEPORT:
case BPF_PROG_TYPE_FLOW_DISSECTOR:
case BPF_PROG_TYPE_CGROUP_SYSCTL:
case BPF_PROG_TYPE_CGROUP_SOCKOPT:
default:
break;
}
@@ -133,8 +134,8 @@ bool bpf_probe_prog_type(enum bpf_prog_type prog_type, __u32 ifindex)
return errno != EINVAL && errno != EOPNOTSUPP;
}
int libbpf__probe_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len)
int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len)
{
struct btf_header hdr = {
.magic = BTF_MAGIC,
@@ -157,14 +158,9 @@ int libbpf__probe_raw_btf(const char *raw_types, size_t types_len,
memcpy(raw_btf + hdr.hdr_len + hdr.type_len, str_sec, hdr.str_len);
btf_fd = bpf_load_btf(raw_btf, btf_len, NULL, 0, false);
if (btf_fd < 0) {
free(raw_btf);
return 0;
}
close(btf_fd);
free(raw_btf);
return 1;
return btf_fd;
}
static int load_sk_storage_btf(void)
@@ -190,7 +186,7 @@ static int load_sk_storage_btf(void)
BTF_MEMBER_ENC(23, 2, 32),/* struct bpf_spin_lock l; */
};
return libbpf__probe_raw_btf((char *)types, sizeof(types),
return libbpf__load_raw_btf((char *)types, sizeof(types),
strs, sizeof(strs));
}

View File

@@ -11,7 +11,7 @@
*/
char *libbpf_strerror_r(int err, char *dst, int len)
{
int ret = strerror_r(err, dst, len);
int ret = strerror_r(err < 0 ? -err : err, dst, len);
if (ret)
snprintf(dst, len, "ERROR: strerror_r(%d)=%d", err, ret);
return dst;

119
src/xsk.c
View File

@@ -60,13 +60,12 @@ struct xsk_socket {
struct xsk_umem *umem;
struct xsk_socket_config config;
int fd;
int xsks_map;
int ifindex;
int prog_fd;
int qidconf_map_fd;
int xsks_map_fd;
__u32 queue_id;
char ifname[IFNAMSIZ];
bool zc;
};
struct xsk_nl_info {
@@ -265,15 +264,11 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk)
/* This is the C-program:
* SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
* {
* int *qidconf, index = ctx->rx_queue_index;
* int index = ctx->rx_queue_index;
*
* // A set entry here means that the correspnding queue_id
* // has an active AF_XDP socket bound to it.
* qidconf = bpf_map_lookup_elem(&qidconf_map, &index);
* if (!qidconf)
* return XDP_ABORTED;
*
* if (*qidconf)
* if (bpf_map_lookup_elem(&xsks_map, &index))
* return bpf_redirect_map(&xsks_map, index, 0);
*
* return XDP_PASS;
@@ -286,15 +281,10 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk)
BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_1, -4),
BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4),
BPF_LD_MAP_FD(BPF_REG_1, xsk->qidconf_map_fd),
BPF_LD_MAP_FD(BPF_REG_1, xsk->xsks_map_fd),
BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
BPF_MOV32_IMM(BPF_REG_0, 0),
/* if r1 == 0 goto +8 */
BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 8),
BPF_MOV32_IMM(BPF_REG_0, 2),
/* r1 = *(u32 *)(r1 + 0) */
BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_1, 0),
/* if r1 == 0 goto +5 */
BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 5),
/* r2 = *(u32 *)(r10 - 4) */
@@ -337,7 +327,8 @@ static int xsk_get_max_queues(struct xsk_socket *xsk)
channels.cmd = ETHTOOL_GCHANNELS;
ifr.ifr_data = (void *)&channels;
strncpy(ifr.ifr_name, xsk->ifname, IFNAMSIZ);
strncpy(ifr.ifr_name, xsk->ifname, IFNAMSIZ - 1);
ifr.ifr_name[IFNAMSIZ - 1] = '\0';
err = ioctl(fd, SIOCETHTOOL, &ifr);
if (err && errno != EOPNOTSUPP) {
ret = -errno;
@@ -366,18 +357,11 @@ static int xsk_create_bpf_maps(struct xsk_socket *xsk)
if (max_queues < 0)
return max_queues;
fd = bpf_create_map_name(BPF_MAP_TYPE_ARRAY, "qidconf_map",
fd = bpf_create_map_name(BPF_MAP_TYPE_XSKMAP, "xsks_map",
sizeof(int), sizeof(int), max_queues, 0);
if (fd < 0)
return fd;
xsk->qidconf_map_fd = fd;
fd = bpf_create_map_name(BPF_MAP_TYPE_XSKMAP, "xsks_map",
sizeof(int), sizeof(int), max_queues, 0);
if (fd < 0) {
close(xsk->qidconf_map_fd);
return fd;
}
xsk->xsks_map_fd = fd;
return 0;
@@ -385,10 +369,8 @@ static int xsk_create_bpf_maps(struct xsk_socket *xsk)
static void xsk_delete_bpf_maps(struct xsk_socket *xsk)
{
close(xsk->qidconf_map_fd);
bpf_map_delete_elem(xsk->xsks_map_fd, &xsk->queue_id);
close(xsk->xsks_map_fd);
xsk->qidconf_map_fd = -1;
xsk->xsks_map_fd = -1;
}
static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
@@ -417,10 +399,9 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
if (err)
goto out_map_ids;
for (i = 0; i < prog_info.nr_map_ids; i++) {
if (xsk->qidconf_map_fd != -1 && xsk->xsks_map_fd != -1)
break;
xsk->xsks_map_fd = -1;
for (i = 0; i < prog_info.nr_map_ids; i++) {
fd = bpf_map_get_fd_by_id(map_ids[i]);
if (fd < 0)
continue;
@@ -431,11 +412,6 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
continue;
}
if (!strcmp(map_info.name, "qidconf_map")) {
xsk->qidconf_map_fd = fd;
continue;
}
if (!strcmp(map_info.name, "xsks_map")) {
xsk->xsks_map_fd = fd;
continue;
@@ -445,40 +421,18 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
}
err = 0;
if (xsk->qidconf_map_fd < 0 || xsk->xsks_map_fd < 0) {
if (xsk->xsks_map_fd == -1)
err = -ENOENT;
xsk_delete_bpf_maps(xsk);
}
out_map_ids:
free(map_ids);
return err;
}
static void xsk_clear_bpf_maps(struct xsk_socket *xsk)
{
int qid = false;
bpf_map_update_elem(xsk->qidconf_map_fd, &xsk->queue_id, &qid, 0);
bpf_map_delete_elem(xsk->xsks_map_fd, &xsk->queue_id);
}
static int xsk_set_bpf_maps(struct xsk_socket *xsk)
{
int qid = true, fd = xsk->fd, err;
err = bpf_map_update_elem(xsk->qidconf_map_fd, &xsk->queue_id, &qid, 0);
if (err)
goto out;
err = bpf_map_update_elem(xsk->xsks_map_fd, &xsk->queue_id, &fd, 0);
if (err)
goto out;
return 0;
out:
xsk_clear_bpf_maps(xsk);
return err;
return bpf_map_update_elem(xsk->xsks_map_fd, &xsk->queue_id,
&xsk->fd, 0);
}
static int xsk_setup_xdp_prog(struct xsk_socket *xsk)
@@ -497,26 +451,27 @@ static int xsk_setup_xdp_prog(struct xsk_socket *xsk)
return err;
err = xsk_load_xdp_prog(xsk);
if (err)
goto out_maps;
if (err) {
xsk_delete_bpf_maps(xsk);
return err;
}
} else {
xsk->prog_fd = bpf_prog_get_fd_by_id(prog_id);
err = xsk_lookup_bpf_maps(xsk);
if (err)
goto out_load;
if (err) {
close(xsk->prog_fd);
return err;
}
}
err = xsk_set_bpf_maps(xsk);
if (err)
goto out_load;
if (err) {
xsk_delete_bpf_maps(xsk);
close(xsk->prog_fd);
return err;
}
return 0;
out_load:
close(xsk->prog_fd);
out_maps:
xsk_delete_bpf_maps(xsk);
return err;
}
int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
@@ -527,6 +482,7 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
void *rx_map = NULL, *tx_map = NULL;
struct sockaddr_xdp sxdp = {};
struct xdp_mmap_offsets off;
struct xdp_options opts;
struct xsk_socket *xsk;
socklen_t optlen;
int err;
@@ -561,7 +517,8 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
err = -errno;
goto out_socket;
}
strncpy(xsk->ifname, ifname, IFNAMSIZ);
strncpy(xsk->ifname, ifname, IFNAMSIZ - 1);
xsk->ifname[IFNAMSIZ - 1] = '\0';
err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
if (err)
@@ -643,8 +600,16 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
goto out_mmap_tx;
}
xsk->qidconf_map_fd = -1;
xsk->xsks_map_fd = -1;
xsk->prog_fd = -1;
optlen = sizeof(opts);
err = getsockopt(xsk->fd, SOL_XDP, XDP_OPTIONS, &opts, &optlen);
if (err) {
err = -errno;
goto out_mmap_tx;
}
xsk->zc = opts.flags & XDP_OPTIONS_ZEROCOPY;
if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
err = xsk_setup_xdp_prog(xsk);
@@ -708,8 +673,10 @@ void xsk_socket__delete(struct xsk_socket *xsk)
if (!xsk)
return;
xsk_clear_bpf_maps(xsk);
xsk_delete_bpf_maps(xsk);
if (xsk->prog_fd != -1) {
xsk_delete_bpf_maps(xsk);
close(xsk->prog_fd);
}
optlen = sizeof(off);
err = getsockopt(xsk->fd, SOL_XDP, XDP_MMAP_OFFSETS, &off, &optlen);

View File

@@ -167,7 +167,7 @@ LIBBPF_API int xsk_socket__fd(const struct xsk_socket *xsk);
#define XSK_RING_CONS__DEFAULT_NUM_DESCS 2048
#define XSK_RING_PROD__DEFAULT_NUM_DESCS 2048
#define XSK_UMEM__DEFAULT_FRAME_SHIFT 11 /* 2048 bytes */
#define XSK_UMEM__DEFAULT_FRAME_SHIFT 12 /* 4096 bytes */
#define XSK_UMEM__DEFAULT_FRAME_SIZE (1 << XSK_UMEM__DEFAULT_FRAME_SHIFT)
#define XSK_UMEM__DEFAULT_FRAME_HEADROOM 0