511 Commits

Author SHA1 Message Date
Andrii Nakryiko
6a1615c263 vmtests: blacklist mmap test on 5.5
5.5 kernel has a bug in kernel allowing to violate read-only access to
mmap()-ed map. Disable selftest that now is failing.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
v0.0.8
2020-04-17 15:31:03 -07:00
Andrii Nakryiko
e66d297441 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   1a323ea5356edbb3073dc59d51b9e6b86908857d
Checkpoint bpf-next commit: 2fcd80144b93ff90836a44f2054b4d82133d3a85
Baseline bpf commit:        94b18a87efdd1626a1e6aef87271af4a7c616d36
Checkpoint bpf commit:      edadedf1c5b4e4404192a0a4c3c0c05e3b7672ab

Andrey Ignatov (1):
  libbpf: Fix bpf_get_link_xdp_id flags handling

Andrii Nakryiko (1):
  libbpf: Always specify expected_attach_type on program load if
    supported

Jeremy Cline (1):
  libbpf: Initialize *nl_pid so gcc 10 is happy

Toke Høiland-Jørgensen (1):
  libbpf: Fix type of old_fd in bpf_xdp_set_link_opts

 src/libbpf.c  | 126 ++++++++++++++++++++++++++++++++------------------
 src/libbpf.h  |   2 +-
 src/netlink.c |   6 +--
 3 files changed, 86 insertions(+), 48 deletions(-)

--
2.24.1
2020-04-17 15:31:03 -07:00
Toke Høiland-Jørgensen
632afdff45 libbpf: Fix type of old_fd in bpf_xdp_set_link_opts
The 'old_fd' parameter used for atomic replacement of XDP programs is
supposed to be an FD, but was left as a u32 from an earlier iteration of
the patch that added it. It was converted to an int when read, so things
worked correctly even with negative values, but better change the
definition to correctly reflect the intention.

Fixes: bd5ca3ef93cd ("libbpf: Add function to set link XDP fd while specifying old program")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200414145025.182163-1-toke@redhat.com
2020-04-17 15:31:03 -07:00
Andrii Nakryiko
6e706b38bd libbpf: Always specify expected_attach_type on program load if supported
For some types of BPF programs that utilize expected_attach_type, libbpf won't
set load_attr.expected_attach_type, even if expected_attach_type is known from
section definition. This was done to preserve backwards compatibility with old
kernels that didn't recognize expected_attach_type attribute yet (which was
added in 5e43f899b03a ("bpf: Check attach type at prog load time"). But this
is problematic for some BPF programs that utilize newer features that require
kernel to know specific expected_attach_type (e.g., extended set of return
codes for cgroup_skb/egress programs).

This patch makes libbpf specify expected_attach_type by default, but also
detect support for this field in kernel and not set it during program load.
This allows to have a good metadata for bpf_program
(e.g., bpf_program__get_extected_attach_type()), but still work with old
kernels (for cases where it can work at all).

Additionally, due to expected_attach_type being always set for recognized
program types, bpf_program__attach_cgroup doesn't have to do extra checks to
determine correct attach type, so remove that additional logic.

Also adjust section_names selftest to account for this change.

More detailed discussion can be found in [0].

  [0] https://lore.kernel.org/bpf/20200412003604.GA15986@rdna-mbp.dhcp.thefacebook.com/

Fixes: 5cf1e9145630 ("bpf: cgroup inet skb programs can return 0 to 3")
Fixes: 5e43f899b03a ("bpf: Check attach type at prog load time")
Reported-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/20200414182645.1368174-1-andriin@fb.com
2020-04-17 15:31:03 -07:00
Andrey Ignatov
850293ba1c libbpf: Fix bpf_get_link_xdp_id flags handling
Currently if one of XDP_FLAGS_{DRV,HW,SKB}_MODE flags is passed to
bpf_get_link_xdp_id() and there is a single XDP program attached to
ifindex, that program's id will be returned by bpf_get_link_xdp_id() in
prog_id argument no matter what mode the program is attached in, i.e.
flags argument is not taken into account.

For example, if there is a single program attached with
XDP_FLAGS_SKB_MODE but user calls bpf_get_link_xdp_id() with flags =
XDP_FLAGS_DRV_MODE, that skb program will be returned.

Fix it by returning info->prog_id only if user didn't specify flags. If
flags is specified then return corresponding mode-specific-field from
struct xdp_link_info.

The initial error was introduced in commit 50db9f073188 ("libbpf: Add a
support for getting xdp prog id on ifindex") and then refactored in
473f4e133a12 so 473f4e133a12 is used in the Fixes tag.

Fixes: 473f4e133a12 ("libbpf: Add bpf_get_link_xdp_info() function to get more XDP information")
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/0e9e30490b44b447bb2bebc69c7135e7fe7e4e40.1586236080.git.rdna@fb.com
2020-04-17 15:31:03 -07:00
Jeremy Cline
fb528063b2 libbpf: Initialize *nl_pid so gcc 10 is happy
Builds of Fedora's kernel-tools package started to fail with "may be
used uninitialized" warnings for nl_pid in bpf_set_link_xdp_fd() and
bpf_get_link_xdp_info() on the s390 architecture.

Although libbpf_netlink_open() always returns a negative number when it
does not set *nl_pid, the compiler does not determine this and thus
believes the variable might be used uninitialized. Assuage gcc's fears
by explicitly initializing nl_pid.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1807781

Signed-off-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200404051430.698058-1-jcline@redhat.com
2020-04-17 15:31:03 -07:00
Andrii Nakryiko
97ada10bd8 ci: update blacklists and Kconfig
Disable some of newest selftests on 5.5.0, turn on BPF_LSM on latest.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
9a35753b42 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   483d7a30f538e2f8addd32aa9a3d2e94ae55fa65
Checkpoint bpf-next commit: 1a323ea5356edbb3073dc59d51b9e6b86908857d
Baseline bpf commit:        94b18a87efdd1626a1e6aef87271af4a7c616d36
Checkpoint bpf commit:      94b18a87efdd1626a1e6aef87271af4a7c616d36

Andrii Nakryiko (2):
  bpf: Implement bpf_link-based cgroup BPF program attachment
  libbpf: Add support for bpf_link-based cgroup attachment

Antoine Tenart (1):
  net: macsec: add support for offloading to the MAC

Daniel Borkmann (2):
  bpf: Add netns cookie and enable it for bpf cgroup hooks
  bpf: Enable bpf cgroup hooks to retrieve cgroup v2 and ancestor id

Fletcher Dunn (1):
  libbpf, xsk: Init all ring members in xsk_umem__create and
    xsk_socket__create

Joe Stringer (1):
  bpf: Add socket assign support

KP Singh (2):
  bpf: Introduce BPF_PROG_TYPE_LSM
  tools/libbpf: Add support for BPF_PROG_TYPE_LSM

Mark Starovoytov (1):
  net: macsec: add support for specifying offload upon link creation

Stanislav Fomichev (1):
  libbpf: Don't allocate 16M for log buffer by default

Tobias Klauser (1):
  libbpf: Remove unused parameter `def` to get_map_field_int

Toke Høiland-Jørgensen (3):
  tools: Add EXPECTED_FD-related definitions in if_link.h
  libbpf: Add function to set link XDP fd while specifying old program
  libbpf: Add setter for initial value for internal maps

 include/uapi/linux/bpf.h     |  82 ++++++++++++++++++++-
 include/uapi/linux/if_link.h |   6 +-
 src/bpf.c                    |  37 +++++++++-
 src/bpf.h                    |  19 +++++
 src/btf.c                    |  20 ++++--
 src/libbpf.c                 | 134 +++++++++++++++++++++++++++++------
 src/libbpf.h                 |  22 +++++-
 src/libbpf.map               |   9 +++
 src/libbpf_probes.c          |   1 +
 src/netlink.c                |  34 ++++++++-
 src/xsk.c                    |  16 ++++-
 11 files changed, 345 insertions(+), 35 deletions(-)

--
2.24.1
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
c4af2093cc sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
1543a19f36 libbpf: Add support for bpf_link-based cgroup attachment
Add bpf_program__attach_cgroup(), which uses BPF_LINK_CREATE subcommand to
create an FD-based kernel bpf_link. Also add low-level bpf_link_create() API.

If expected_attach_type is not specified explicitly with
bpf_program__set_expected_attach_type(), libbpf will try to determine proper
attach type from BPF program's section definition.

Also add support for bpf_link's underlying BPF program replacement:
  - unconditional through high-level bpf_link__update_program() API;
  - cmpxchg-like with specifying expected current BPF program through
    low-level bpf_link_update() API.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200330030001.2312810-4-andriin@fb.com
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
8b41602694 bpf: Implement bpf_link-based cgroup BPF program attachment
Implement new sub-command to attach cgroup BPF programs and return FD-based
bpf_link back on success. bpf_link, once attached to cgroup, cannot be
replaced, except by owner having its FD. Cgroup bpf_link supports only
BPF_F_ALLOW_MULTI semantics. Both link-based and prog-based BPF_F_ALLOW_MULTI
attachments can be freely intermixed.

To prevent bpf_cgroup_link from keeping cgroup alive past the point when no
BPF program can be executed, implement auto-detachment of link. When
cgroup_bpf_release() is called, all attached bpf_links are forced to release
cgroup refcounts, but they leave bpf_link otherwise active and allocated, as
well as still owning underlying bpf_prog. This is because user-space might
still have FDs open and active, so bpf_link as a user-referenced object can't
be freed yet. Once last active FD is closed, bpf_link will be freed and
underlying bpf_prog refcount will be dropped. But cgroup refcount won't be
touched, because cgroup is released already.

The inherent race between bpf_cgroup_link release (from closing last FD) and
cgroup_bpf_release() is resolved by both operations taking cgroup_mutex. So
the only additional check required is when bpf_cgroup_link attempts to detach
itself from cgroup. At that time we need to check whether there is still
cgroup associated with that link. And if not, exit with success, because
bpf_cgroup_link was already successfully detached.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Link: https://lore.kernel.org/bpf/20200330030001.2312810-2-andriin@fb.com
2020-04-02 00:02:25 -07:00
Joe Stringer
cecb299ac4 bpf: Add socket assign support
Add support for TPROXY via a new bpf helper, bpf_sk_assign().

This helper requires the BPF program to discover the socket via a call
to bpf_sk*_lookup_*(), then pass this socket to the new helper. The
helper takes its own reference to the socket in addition to any existing
reference that may or may not currently be obtained for the duration of
BPF processing. For the destination socket to receive the traffic, the
traffic must be routed towards that socket via local route. The
simplest example route is below, but in practice you may want to route
traffic more narrowly (eg by CIDR):

  $ ip route add local default dev lo

This patch avoids trying to introduce an extra bit into the skb->sk, as
that would require more invasive changes to all code interacting with
the socket to ensure that the bit is handled correctly, such as all
error-handling cases along the path from the helper in BPF through to
the orphan path in the input. Instead, we opt to use the destructor
variable to switch on the prefetch of the socket.

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200329225342.16317-2-joe@wand.net.nz
2020-04-02 00:02:25 -07:00
KP Singh
90e89264b9 tools/libbpf: Add support for BPF_PROG_TYPE_LSM
Since BPF_PROG_TYPE_LSM uses the same attaching mechanism as
BPF_PROG_TYPE_TRACING, the common logic is refactored into a static
function bpf_program__attach_btf_id.

A new API call bpf_program__attach_lsm is still added to avoid userspace
conflicts if this ever changes in the future.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Florent Revest <revest@google.com>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200329004356.27286-7-kpsingh@chromium.org
2020-04-02 00:02:25 -07:00
KP Singh
f69cc97272 bpf: Introduce BPF_PROG_TYPE_LSM
Introduce types and configs for bpf programs that can be attached to
LSM hooks. The programs can be enabled by the config option
CONFIG_BPF_LSM.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Florent Revest <revest@google.com>
Reviewed-by: Thomas Garnier <thgarnie@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Link: https://lore.kernel.org/bpf/20200329004356.27286-2-kpsingh@chromium.org
2020-04-02 00:02:25 -07:00
Toke Høiland-Jørgensen
a6e9750c8a libbpf: Add setter for initial value for internal maps
For internal maps (most notably the maps backing global variables), libbpf
uses an internal mmaped area to store the data after opening the object.
This data is subsequently copied into the kernel map when the object is
loaded.

This adds a function to set a new value for that data, which can be used to
before it is loaded into the kernel. This is especially relevant for RODATA
maps, since those are frozen on load.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200329132253.232541-1-toke@redhat.com
2020-04-02 00:02:25 -07:00
Toke Høiland-Jørgensen
60bade6674 libbpf: Add function to set link XDP fd while specifying old program
This adds a new function to set the XDP fd while specifying the FD of the
program to replace, using the newly added IFLA_XDP_EXPECTED_FD netlink
parameter. The new function uses the opts struct mechanism to be extendable
in the future.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/158515700857.92963.7052131201257841700.stgit@toke.dk
2020-04-02 00:02:25 -07:00
Toke Høiland-Jørgensen
e13c1b7b85 tools: Add EXPECTED_FD-related definitions in if_link.h
This adds the IFLA_XDP_EXPECTED_FD netlink attribute definition and the
XDP_FLAGS_REPLACE flag to if_link.h in tools/include.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/158515700747.92963.8615391897417388586.stgit@toke.dk
2020-04-02 00:02:25 -07:00
Fletcher Dunn
1d8451ccaf libbpf, xsk: Init all ring members in xsk_umem__create and xsk_socket__create
Fix a sharp edge in xsk_umem__create and xsk_socket__create.  Almost all of
the members of the ring buffer structs are initialized, but the "cached_xxx"
variables are not all initialized.  The caller is required to zero them.
This is needlessly dangerous.  The results if you don't do it can be very bad.
For example, they can cause xsk_prod_nb_free and xsk_cons_nb_avail to return
values greater than the size of the queue.  xsk_ring_cons__peek can return an
index that does not refer to an item that has been queued.

I have confirmed that without this change, my program misbehaves unless I
memset the ring buffers to zero before calling the function.  Afterwards,
my program works without (or with) the memset.

Signed-off-by: Fletcher Dunn <fletcherd@valvesoftware.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/85f12913cde94b19bfcb598344701c38@valvesoftware.com
2020-04-02 00:02:25 -07:00
Daniel Borkmann
fad6e249ea bpf: Enable bpf cgroup hooks to retrieve cgroup v2 and ancestor id
Enable the bpf_get_current_cgroup_id() helper for connect(), sendmsg(),
recvmsg() and bind-related hooks in order to retrieve the cgroup v2
context which can then be used as part of the key for BPF map lookups,
for example. Given these hooks operate in process context 'current' is
always valid and pointing to the app that is performing mentioned
syscalls if it's subject to a v2 cgroup. Also with same motivation of
commit 7723628101aa ("bpf: Introduce bpf_skb_ancestor_cgroup_id helper")
enable retrieval of ancestor from current so the cgroup id can be used
for policy lookups which can then forbid connect() / bind(), for example.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/d2a7ef42530ad299e3cbb245e6c12374b72145ef.1585323121.git.daniel@iogearbox.net
2020-04-02 00:02:25 -07:00
Daniel Borkmann
64f7fa917c bpf: Add netns cookie and enable it for bpf cgroup hooks
In Cilium we're mainly using BPF cgroup hooks today in order to implement
kube-proxy free Kubernetes service translation for ClusterIP, NodePort (*),
ExternalIP, and LoadBalancer as well as HostPort mapping [0] for all traffic
between Cilium managed nodes. While this works in its current shape and avoids
packet-level NAT for inter Cilium managed node traffic, there is one major
limitation we're facing today, that is, lack of netns awareness.

In Kubernetes, the concept of Pods (which hold one or multiple containers)
has been built around network namespaces, so while we can use the global scope
of attaching to root BPF cgroup hooks also to our advantage (e.g. for exposing
NodePort ports on loopback addresses), we also have the need to differentiate
between initial network namespaces and non-initial one. For example, ExternalIP
services mandate that non-local service IPs are not to be translated from the
host (initial) network namespace as one example. Right now, we have an ugly
work-around in place where non-local service IPs for ExternalIP services are
not xlated from connect() and friends BPF hooks but instead via less efficient
packet-level NAT on the veth tc ingress hook for Pod traffic.

On top of determining whether we're in initial or non-initial network namespace
we also have a need for a socket-cookie like mechanism for network namespaces
scope. Socket cookies have the nice property that they can be combined as part
of the key structure e.g. for BPF LRU maps without having to worry that the
cookie could be recycled. We are planning to use this for our sessionAffinity
implementation for services. Therefore, add a new bpf_get_netns_cookie() helper
which would resolve both use cases at once: bpf_get_netns_cookie(NULL) would
provide the cookie for the initial network namespace while passing the context
instead of NULL would provide the cookie from the application's network namespace.
We're using a hole, so no size increase; the assignment happens only once.
Therefore this allows for a comparison on initial namespace as well as regular
cookie usage as we have today with socket cookies. We could later on enable
this helper for other program types as well as we would see need.

  (*) Both externalTrafficPolicy={Local|Cluster} types
  [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/c47d2346982693a9cf9da0e12690453aded4c788.1585323121.git.daniel@iogearbox.net
2020-04-02 00:02:25 -07:00
Stanislav Fomichev
240b8fa098 libbpf: Don't allocate 16M for log buffer by default
For each prog/btf load we allocate and free 16 megs of verifier buffer.
On production systems it doesn't really make sense because the
programs/btf have gone through extensive testing and (mostly) guaranteed
to successfully load.

Let's assume successful case by default and skip buffer allocation
on the first try. If there is an error, start with BPF_LOG_BUF_SIZE
and double it on each ENOSPC iteration.

v3:
* Return -ENOMEM when can't allocate log buffer (Andrii Nakryiko)

v2:
* Don't allocate the buffer at all on the first try (Andrii Nakryiko)

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200325195521.112210-1-sdf@google.com
2020-04-02 00:02:25 -07:00
Tobias Klauser
3756d20499 libbpf: Remove unused parameter def to get_map_field_int
Has been unused since commit ef99b02b23ef ("libbpf: capture value in BTF
type info for BTF-defined map defs").

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200325113655.19341-1-tklauser@distanz.ch
2020-04-02 00:02:25 -07:00
Mark Starovoytov
9e8b23289f net: macsec: add support for specifying offload upon link creation
This patch adds new netlink attribute to allow a user to (optionally)
specify the desired offload mode immediately upon MACSec link creation.

Separate iproute patch will be required to support this from user space.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-02 00:02:25 -07:00
Antoine Tenart
902eca48e5 net: macsec: add support for offloading to the MAC
This patch adds a new MACsec offloading option, MACSEC_OFFLOAD_MAC,
allowing a user to select a MAC as a provider for MACsec offloading
operations.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-02 00:02:25 -07:00
Andrii Nakryiko
9f0d55c24a vmtests: organize blacklists, enable sockmap_listen tests
Enable now-fixed sockmap_listen tests. Disabled vmlinux test on 5.5, on which
hrtimer_nanosleep() signature is incompatible. Filled out remaining
permanently disabled tests resons.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
e53dd1c436 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9b79c0be350d3825ef26ed9eebac6ae50df506bc
Checkpoint bpf-next commit: 483d7a30f538e2f8addd32aa9a3d2e94ae55fa65
Baseline bpf commit:        90db6d772f749e38171d04619a5e3cd8804a6d02
Checkpoint bpf commit:      94b18a87efdd1626a1e6aef87271af4a7c616d36

Andrii Nakryiko (2):
  libbpf: Ignore incompatible types with matching name during CO-RE
    relocation
  libbpf: Provide CO-RE variants of PT_REGS macros

Wenbo Zhang (1):
  bpf, libbpf: Fix ___bpf_kretprobe_args1(x) macro definition

 src/bpf_tracing.h | 105 +++++++++++++++++++++++++++++++++++++++++++++-
 src/libbpf.c      |   4 ++
 2 files changed, 108 insertions(+), 1 deletion(-)

--
2.17.1
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
da790d6014 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-03-17 14:56:36 -07:00
Wenbo Zhang
3d81b13b36 bpf, libbpf: Fix ___bpf_kretprobe_args1(x) macro definition
Use PT_REGS_RC instead of PT_REGS_RET to get ret correctly.

Fixes: df8ff35311c8 ("libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's bpf_tracing.h")
Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200315083252.22274-1-ethercflow@gmail.com
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
64bd9e074b libbpf: Provide CO-RE variants of PT_REGS macros
Syscall raw tracepoints have struct pt_regs pointer as tracepoint's first
argument. After that, reading any of pt_regs fields requires bpf_probe_read(),
even for tp_btf programs. Due to that, PT_REGS_PARMx macros are not usable as
is. This patch adds CO-RE variants of those macros that use BPF_CORE_READ() to
read necessary fields. This provides relocatable architecture-agnostic pt_regs
field accesses.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200313172336.1879637-4-andriin@fb.com
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
53d473dd8e libbpf: Ignore incompatible types with matching name during CO-RE relocation
When finding target type candidates, ignore forward declarations, functions,
and other named types of incompatible kind. Not doing this can cause false
errors.  See [0] for one such case (due to struct pt_regs forward
declaration).

  [0] https://github.com/iovisor/bcc/pull/2806#issuecomment-598543645

Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Reported-by: Wenbo Zhang <ethercflow@gmail.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200313172336.1879637-3-andriin@fb.com
2020-03-17 14:56:36 -07:00
Andrii Nakryiko
6d64d927a2 vmtests: enable previously failing kprobe selftests
With fixes in selftests, these tests should now pass.
Also add ability to add comments to blacklist/whitelist to explain why certain
test is disabled.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
cd87f1568e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   abbc61a5f26d52a5d3abbbe552b275360b2c6631
Checkpoint bpf-next commit: 9b79c0be350d3825ef26ed9eebac6ae50df506bc
Baseline bpf commit:        542bf38f11d11bf98c69b2f83f3519ada8a76e95
Checkpoint bpf commit:      90db6d772f749e38171d04619a5e3cd8804a6d02

Andrii Nakryiko (4):
  libbpf: Fix handling of optional field_name in
    btf_dump__emit_type_decl
  bpf: Switch BPF UAPI #define constants used from BPF program side to
    enums
  libbpf: Assume unsigned values for BTF_KIND_ENUM
  libbpf: Split BTF presence checks into libbpf- and kernel-specific
    parts

Carlos Neira (1):
  bpf: Added new helper bpf_get_ns_current_pid_tgid

Eelco Chaudron (1):
  bpf: Add bpf_xdp_output() helper

KP Singh (2):
  bpf: Introduce BPF_MODIFY_RETURN
  tools/libbpf: Add support for BPF_MODIFY_RETURN

Willem de Bruijn (1):
  bpf: Sync uapi bpf.h to tools/

 include/uapi/linux/bpf.h | 223 +++++++++++++++++++++++++++------------
 src/btf_dump.c           |  10 +-
 src/libbpf.c             |  21 +++-
 3 files changed, 176 insertions(+), 78 deletions(-)

--
2.17.1
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
c417a4cb6f sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-03-12 22:57:51 -07:00
Eelco Chaudron
fa21d33fff bpf: Add bpf_xdp_output() helper
Introduce new helper that reuses existing xdp perf_event output
implementation, but can be called from raw_tracepoint programs
that receive 'struct xdp_buff *' as a tracepoint argument.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158348514556.2239.11050972434793741444.stgit@xdp-tutorial
2020-03-12 22:57:51 -07:00
Carlos Neira
84cf76de9c bpf: Added new helper bpf_get_ns_current_pid_tgid
New bpf helper bpf_get_ns_current_pid_tgid,
This helper will return pid and tgid from current task
which namespace matches dev_t and inode number provided,
this will allows us to instrument a process inside a container.

Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200304204157.58695-3-cneirabustos@gmail.com
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
2ef4fdac6c libbpf: Split BTF presence checks into libbpf- and kernel-specific parts
Needs for application BTF being present differs between user-space libbpf needs and kernel
needs. Currently, BTF is mandatory only in kernel only when BPF application is
using STRUCT_OPS. While libbpf itself relies more heavily on presense of BTF:
  - for BTF-defined maps;
  - for Kconfig externs;
  - for STRUCT_OPS as well.

Thus, checks for presence and validness of bpf_object's BPF needs to be
performed separately, which is patch does.

Fixes: 5327644614a1 ("libbpf: Relax check whether BTF is mandatory")
Reported-by: Michal Rostecki <mrostecki@opensuse.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200312185033.736911-1-andriin@fb.com
2020-03-12 22:57:51 -07:00
KP Singh
1d72c9c382 tools/libbpf: Add support for BPF_MODIFY_RETURN
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200304191853.1529-6-kpsingh@chromium.org
2020-03-12 22:57:51 -07:00
KP Singh
7930230b43 bpf: Introduce BPF_MODIFY_RETURN
When multiple programs are attached, each program receives the return
value from the previous program on the stack and the last program
provides the return value to the attached function.

The fmod_ret bpf programs are run after the fentry programs and before
the fexit programs. The original function is only called if all the
fmod_ret programs return 0 to avoid any unintended side-effects. The
success value, i.e. 0 is not currently configurable but can be made so
where user-space can specify it at load time.

For example:

int func_to_be_attached(int a, int b)
{  <--- do_fentry

do_fmod_ret:
   <update ret by calling fmod_ret>
   if (ret != 0)
        goto do_fexit;

original_function:

    <side_effects_happen_here>

}  <--- do_fexit

The fmod_ret program attached to this function can be defined as:

SEC("fmod_ret/func_to_be_attached")
int BPF_PROG(func_name, int a, int b, int ret)
{
        // This will skip the original function logic.
        return 1;
}

The first fmod_ret program is passed 0 in its return argument.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200304191853.1529-4-kpsingh@chromium.org
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
483a8c238f libbpf: Assume unsigned values for BTF_KIND_ENUM
Currently, BTF_KIND_ENUM type doesn't record whether enum values should be
interpreted as signed or unsigned. In Linux, most enums are unsigned, though,
so interpreting them as unsigned matches real world better.

Change btf_dump test case to test maximum 32-bit value, instead of negative
value.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200303003233.3496043-3-andriin@fb.com
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
26cbe2384c bpf: Switch BPF UAPI #define constants used from BPF program side to enums
Switch BPF UAPI constants, previously defined as #define macro, to anonymous
enum values. This preserves constants values and behavior in expressions, but
has added advantaged of being captured as part of DWARF and, subsequently, BTF
type info. Which, in turn, greatly improves usefulness of generated vmlinux.h
for BPF applications, as it will not require BPF users to copy/paste various
flags and constants, which are frequently used with BPF helpers. Only those
constants that are used/useful from BPF program side are converted.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200303003233.3496043-2-andriin@fb.com
2020-03-12 22:57:51 -07:00
Andrii Nakryiko
cb4a430c8a libbpf: Fix handling of optional field_name in btf_dump__emit_type_decl
Internal functions, used by btf_dump__emit_type_decl(), assume field_name is
never going to be NULL. Ensure it's always the case.

Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200303180800.3303471-1-andriin@fb.com
2020-03-12 22:57:51 -07:00
Willem de Bruijn
f67d535cdb bpf: Sync uapi bpf.h to tools/
sync tools/include/uapi/linux/bpf.h to match include/uapi/linux/bpf.h

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200303200503.226217-3-willemdebruijn.kernel@gmail.com
2020-03-12 22:57:51 -07:00
Julia Kartseva
ef4785f065 vmtest: libbpf#137 follow-ups
- Run test_{maps|verifier} only with the latest kernel
- Mount run control script
- Style

Signed-off-by: Julia Kartseva (hex@fb.com)
2020-03-12 21:36:30 -07:00
Andrii Nakryiko
9a424bea42 vmtests: add few missing Kconfig settings
Add few missing Kconfig settings that might be relied on in selftests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-03-11 14:44:28 -07:00
Julia Kartseva
10e4311ad7 vmtest: add mkrootfs.sh to build Arch Linux disk image
Generate a disk image for libbpf testing in compressed *.zst format

The mkrootfs.sh has the following stages:
- run pacstrap to install libbpf and selftests dependencies.
- create /etc/fstab w/ bpffs and debugfs filesystems
- create /etc/init.d/rcS to mount in bootime
- create /etc/inittab to invoke /etc/init.d/rcS
- compress an image

In addition ./travis-ci/vmtest/run.sh set up ext4 fs and mounts
it as a loop device:
mkfs.ext4 -q "$tmp"
mount -o loop "$tmp" "$mnt"

Signed-off-by: Julia Kartseva (hex@fb.com)
2020-03-11 08:31:13 -07:00
Julia Kartseva
50febacba1 vmtest: disk image update; run test_{maps|verifier}; blacklist update
The disk image is updated to 2020-03-11.

blacklist for LATEST kernel:
attach_probe (needs root cause)
perf_buffer (needs root cause)
send_signal (flaky)
sockmap_listen (flaky)

Run test_maps and test_verifier.
test_maps is not expected to pass for kernels other then LATEST.

Signed-off-by: Julia Kartseva (hex@fb.com)
2020-03-11 08:31:13 -07:00
Andrii Nakryiko
ef7d57fcec vmtest: blacklist link_pinning selftest on 5.5.0
Link pinning is not supported by 5.5.0 and older kernels.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
2020-03-03 00:05:56 -08:00
Andrii Nakryiko
7e7a15321e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   503d539a6e417b018616bf3060e0b5814fafce47
Checkpoint bpf-next commit: abbc61a5f26d52a5d3abbbe552b275360b2c6631
Baseline bpf commit:        41f57cfde186dba6e357f9db25eafbed017e4487
Checkpoint bpf commit:      542bf38f11d11bf98c69b2f83f3519ada8a76e95

Andrii Nakryiko (3):
  libbpf: Fix use of PT_REGS_PARM macros with vmlinux.h
  libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's
    bpf_tracing.h
  libbpf: Add bpf_link pinning/unpinning

 src/bpf_tracing.h | 120 +++++++++++++++++++++++++++++++++++++++++-
 src/libbpf.c      | 131 ++++++++++++++++++++++++++++++++++++----------
 src/libbpf.h      |   5 ++
 src/libbpf.map    |   5 ++
 4 files changed, 233 insertions(+), 28 deletions(-)

--
2.17.1
2020-03-03 00:05:56 -08:00
Andrii Nakryiko
77ac09c3eb libbpf: Add bpf_link pinning/unpinning
With bpf_link abstraction supported by kernel explicitly, add
pinning/unpinning API for links. Also allow to create (open) bpf_link from BPF
FS file.

This API allows to have an "ephemeral" FD-based BPF links (like raw tracepoint
or fexit/freplace attachments) surviving user process exit, by pinning them in
a BPF FS, which is an important use case for long-running BPF programs.

As part of this, expose underlying FD for bpf_link. While legacy bpf_link's
might not have a FD associated with them (which will be expressed as
a bpf_link with fd=-1), kernel's abstraction is based around FD-based usage,
so match it closely. This, subsequently, allows to have a generic
pinning/unpinning API for generalized bpf_link. For some types of bpf_links
kernel might not support pinning, in which case bpf_link__pin() will return
error.

With FD being part of generic bpf_link, also get rid of bpf_link_fd in favor
of using vanialla bpf_link.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200303043159.323675-3-andriin@fb.com
2020-03-03 00:05:56 -08:00
Andrii Nakryiko
40a08ef216 libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's bpf_tracing.h
Move BPF_PROG, BPF_KPROBE, and BPF_KRETPROBE macro into libbpf's bpf_tracing.h
header to make it available for non-selftests users.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200229231112.1240137-5-andriin@fb.com
2020-03-03 00:05:56 -08:00