Commit Graph

2797 Commits

Author SHA1 Message Date
Ihor Solodrai
f5dcbae736 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   d7988720ef3ea5926f1b886b27eddf08abbadba0
Checkpoint bpf-next commit: ca0f39a369c5f927c3d004e63a5a778b08a9df94
Baseline bpf commit:        593fffb8bcfdacc2111a9951afe1fae77988aa4a
Checkpoint bpf commit:      e06e6b8001233241eb5b2e2791162f0585f50f4b

Andrey Grodzovsky (1):
  libbpf: Optimize kprobe.session attachment for exact function names

Ihor Solodrai (1):
  libbpf: Remove extern declaration of bpf_stream_vprintk()

Jakub Kicinski (1):
  bpftool: Fix truncated netlink dumps

Jiri Olsa (2):
  libbpf: Add uprobe syscall feature detection
  libbpf: Add support to detect nop,nop5 instructions combo for usdt
    probe

Josef Bacik (1):
  libbpf: Support appending split BTF in btf__add_btf()

 src/bpf_helpers.h     |  3 ---
 src/btf.c             | 27 ++++++++++++++++---------
 src/features.c        | 24 ++++++++++++++++++++++
 src/libbpf.c          | 19 ++++++++++++++++-
 src/libbpf_internal.h |  2 ++
 src/netlink.c         |  4 +++-
 src/usdt.c            | 47 +++++++++++++++++++++++++++++++++++++++----
 7 files changed, 108 insertions(+), 18 deletions(-)

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
v1.7.0_pnetdata
2026-03-12 13:02:22 -07:00
Ihor Solodrai
8bf68faba8 sync: update .mailmap
Update .mailmap based on libbpf's list of contributors and on the latest
.mailmap version in the upstream repository.

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2026-03-12 13:02:22 -07:00
Andrey Grodzovsky
e08da3014b libbpf: Optimize kprobe.session attachment for exact function names
Detect exact function names (no wildcards) in
bpf_program__attach_kprobe_multi_opts() and bypass kallsyms parsing,
passing the symbol directly to the kernel via syms[] array.  This
benefits all callers, not just kprobe.session.

When the pattern contains no '*' or '?' characters, set syms to point
directly at the pattern string and cnt to 1, skipping the expensive
/proc/kallsyms or available_filter_functions parsing (~150ms per
function).

Error code normalization: the fast path returns ESRCH from kernel's
ftrace_lookup_symbols(), while the slow path returns ENOENT from
userspace kallsyms parsing.  Convert ESRCH to ENOENT in the
bpf_link_create error path to maintain API consistency - both paths
now return identical error codes for "symbol not found".
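
A minimal sketch of the two ideas in this message, the wildcard check and the error-code normalization. The helper names is_exact_name() and normalize_attach_err() are hypothetical and are not taken from the libbpf sources; the real logic lives inside bpf_program__attach_kprobe_multi_opts().

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when the pattern contains neither '*' nor '?',
 * i.e. it names exactly one function and needs no kallsyms scan. */
static bool is_exact_name(const char *pattern)
{
	return strpbrk(pattern, "*?") == NULL;
}

/* Hypothetical helper: fold the fast path's -ESRCH into the -ENOENT that
 * the slow path reports, so callers see one "symbol not found" code. */
static int normalize_attach_err(int err)
{
	return err == -ESRCH ? -ENOENT : err;
}
```

With helpers like these, a caller would take the fast path only when is_exact_name() holds, and pass any link-creation error through normalize_attach_err() before returning it.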

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260302200837.317907-2-andrey.grodzovsky@crowdstrike.com
2026-03-12 13:02:22 -07:00
Josef Bacik
ddb0c14f1b libbpf: Support appending split BTF in btf__add_btf()
btf__add_btf() currently rejects split BTF sources with -ENOTSUP.
This prevents merging types from multiple kernel module BTFs that
are all split against the same vmlinux base.

Extend btf__add_btf() to handle split BTF sources by:

- Replacing the blanket -ENOTSUP with a validation that src and dst
  share the same base BTF pointer when both are split, returning
  -EOPNOTSUPP on mismatch.

- Computing src_start_id from the source's base to distinguish base
  type ID references (which must remain unchanged) from split type
  IDs (which must be remapped to new positions in the destination).

- Using src_btf->nr_types instead of btf__type_cnt()-1 for the type
  count, which is correct for both split and non-split sources.

- Skipping base string offsets (< start_str_off) during the string
  rewrite loop, mirroring the type ID skip pattern.  Since src and
  dst share the same base BTF, base string offsets are already valid
  and need no remapping.

For non-split sources the behavior is identical: src_start_id is 1,
the type_id < 1 guard is never true (VOID is already skipped), and
the remapping formula reduces to the original.  start_str_off is 0
so no string offsets are skipped.
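
The type ID remapping described above can be sketched as follows. This is an illustration under assumptions, not the code from src/btf.c: remap_type_id() and dst_next_id (the ID that the first appended type receives in the destination) are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical sketch of the remapping rule.
 * src_start_id: first type ID that belongs to the source's own types
 *               (1 for non-split BTF, larger for split BTF).
 * dst_next_id:  ID assigned to the first type appended to the destination. */
static uint32_t remap_type_id(uint32_t id, uint32_t src_start_id,
			      uint32_t dst_next_id)
{
	/* References into the shared base (or to VOID) stay unchanged. */
	if (id < src_start_id)
		return id;
	/* The source's own types shift to their new position. */
	return id - src_start_id + dst_next_id;
}
```

For a non-split source src_start_id is 1, so only ID 0 (VOID) is left alone and every other ID is shifted by a constant, which matches the statement that the formula reduces to the original one.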

Assisted-by: Claude:claude-opus-4-6

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/c00216ed48cf7897078d9645679059d5ebf42738.1772657690.git.josef@toxicpanda.com
2026-03-12 13:02:22 -07:00
Jiri Olsa
67cce78af9 libbpf: Add support to detect nop,nop5 instructions combo for usdt probe
Add support for detecting the nop,nop5 instruction combo for USDT probes
by checking whether the probe's nop is followed by a nop5 instruction.

When the nop,nop5 combo is detected together with uprobe syscall,
we can place the probe on top of nop5 and get it optimized.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260224103915.1369690-3-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-12 13:02:22 -07:00
Jiri Olsa
8cbfbe1415 libbpf: Add uprobe syscall feature detection
Add uprobe syscall feature detection, which will be used
in the following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260224103915.1369690-2-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-12 13:02:22 -07:00
Ihor Solodrai
cc7760ce9a libbpf: Remove extern declaration of bpf_stream_vprintk()
An issue was reported that building a BPF program that includes both
vmlinux.h and bpf_helpers.h from libbpf fails due to conflicting
declarations of bpf_stream_vprintk().

Remove the extern declaration from bpf_helpers.h to address this.

In order to use bpf_stream_printk() macro, BPF programs are expected
to either include vmlinux.h of the kernel they are targeting, or add
their own extern declaration.

Reported-by: Luca Boccassi <luca.boccassi@gmail.com>
Closes: https://github.com/libbpf/libbpf/issues/947
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260218215651.2057673-3-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-12 13:02:22 -07:00
Jakub Kicinski
7155e8e234 bpftool: Fix truncated netlink dumps
Netlink requires that the recv buffer used during dumps is at least
min(PAGE_SIZE, 8k) (see the man page). Otherwise the messages will
get truncated. Make sure bpftool follows this requirement to avoid
missing information on systems with large pages.
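
As an illustration of the size requirement quoted above, the lower bound can be computed at run time. This is a sketch only: nl_dump_min_bufsz() is a hypothetical name, and the actual change in src/netlink.c may size its buffer differently.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: smallest receive buffer that satisfies the
 * min(PAGE_SIZE, 8k) rule for netlink dumps. */
static size_t nl_dump_min_bufsz(void)
{
	long page = sysconf(_SC_PAGESIZE);

	/* Fall back to 8k if the page size cannot be queried. */
	if (page <= 0)
		return 8192;
	return page < 8192 ? (size_t)page : 8192;
}
```

A fixed 4096-byte buffer meets this bound on 4k-page systems but falls short wherever the page size is 8k or more, which is the case the message calls out.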

Acked-by: Quentin Monnet <qmo@kernel.org>
Fixes: 7084566a236f ("tools/bpftool: Remove libbpf_internal.h usage in bpftool")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20260217194150.734701-1-kuba@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-12 13:02:22 -07:00
Ihor Solodrai
6314de34ef Update checkpoint commit due to upstream rebase
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2026-03-12 13:02:22 -07:00
Ihor Solodrai
bacf439175 ci/diffs: Add a patch fixing up bpftool_helpers.c
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2026-03-12 13:02:22 -07:00
Ihor Solodrai
eca706a5e1 ci: fix concurrency group for push and schedule triggers
github.head_ref is empty for push and schedule events, so all those
runs shared the same concurrency group and cancelled each other. Fall
back to github.run_id to give each non-PR run a unique group.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2026-03-11 18:35:55 -07:00
Ihor Solodrai
ac0952761f ci: run vmtest job inside kbuilder-debian container
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2026-03-11 18:35:55 -07:00
Ihor Solodrai
6b9b5e34b1 ci: remove LATEST kernel handling from vmtest
The libbpf CI always builds the kernel from source at CHECKPOINT-COMMIT;
there is no prebuilt-kernel matrix entry. Remove the `kernel` input and
the conditional build-vs-download logic, simplifying the workflow.

The run-vmtest action defaults KERNEL to "LATEST" internally when the
env var is unset, so DENYLIST-LATEST is still picked up. The vmlinuz
path is auto-discovered via `make -s image_name` when not passed.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2026-03-11 18:35:55 -07:00
Ihor Solodrai
09ae947cd8 ci: consolidate and clean up denylists
Merge DENYLIST-LATEST into DENYLIST and remove the per-kernel denylist
files. With LATEST being the only kernel mode, there's no need for
separate files. Also remove the s390x denylist (libbpf CI only tests
x86_64) and drop stale entries fixed upstream.

Add map_kptr, test_profiler (kprobes not available in VM kernel), and
sockmap udp multi channels (flaky) based on CI run results.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2026-03-11 18:35:55 -07:00
Ihor Solodrai
b57c0a1b38 ci: update libbpf/ci action references from v3 to v4
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2026-03-11 18:35:55 -07:00
Ihor Solodrai
f0b6d8cd49 ci: remove stale patches from ci/diffs
All 4 patches either fail to apply (context mismatch) or are already
applied upstream. They produce noise in CI logs.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2026-03-11 18:35:55 -07:00
Andrii Nakryiko
6ddc03d4fe sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   dc855b77719fe452d670cae2cf64da1eb51f16cc
Checkpoint bpf-next commit: 4c51f90d45dca71e7974ed5a7c40b9c04a6c6762
Baseline bpf commit:        1a7eb7a3d74031e6c173f0822023f354c2870354
Checkpoint bpf commit:      593fffb8bcfdacc2111a9951afe1fae77988aa4a

Amery Hung (1):
  libbpf: Fix invalid write loop logic in bpf_linker__add_buf()

Daniel Borkmann (2):
  net: Add queue-create operation
  netkit: Add single device mode for netkit

Emil Tsalapatis (3):
  libbpf: Add gating for arena globals relocation feature
  libbpf: Do not use PROG_TYPE_TRACEPOINT program for feature gating
  libbpf: Delay feature gate check until object prepare time

Jakub Kicinski (1):
  Revert "Merge branch
    'netkit-support-for-io_uring-zero-copy-and-af_xdp'"

Jonas Köppeler (1):
  net/sched: sch_cake: share shaper state across sub-instances of
    cake_mq

Paolo Abeni (1):
  geneve: add netlink support for GRO hint

 include/uapi/linux/if_link.h   |  1 +
 include/uapi/linux/pkt_sched.h |  1 +
 src/features.c                 | 65 ++++++++++++++++++++++++++++++++++
 src/libbpf.c                   | 17 ++++++---
 src/libbpf_internal.h          |  2 ++
 src/linker.c                   |  2 +-
 6 files changed, 83 insertions(+), 5 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2026-02-17 20:15:42 -08:00
Emil Tsalapatis
93c0673751 libbpf: Delay feature gate check until object prepare time
Commit 728ff167910e ("libbpf: Add gating for arena globals relocation feature")
adds a feature gate check that loads a map and BPF program to
test the running kernel supports large direct offsets for LDIMM64
instructions. This check is currently used to calculate arena symbol
offsets during bpf_object__collect_relos, itself called by
bpf_object_open.

However, the program calling bpf_object_open may not have the permissions to
load maps and programs. This is the case with the BPF selftests, where
bpftool is invoked at compilation time during skeleton generation. This
causes errors as the feature gate unexpectedly fails with -EPERM.

Avoid this by moving all the use of the FEAT_LDIMM64_FULL_RANGE_OFF feature gate
to BPF object preparation time instead.

Fixes: 728ff167910e ("libbpf: Add gating for arena globals relocation feature")
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260217204345.548648-3-emil@etsalapatis.com
2026-02-17 20:15:42 -08:00
Emil Tsalapatis
8a596848c7 libbpf: Do not use PROG_TYPE_TRACEPOINT program for feature gating
Commit 728ff167910e uses a PROG_TYPE_TRACEPOINT BPF test program to
check whether the running kernel supports large LDIMM64 offsets. The
feature gate incorrectly assumes that the program will fail at
verification time with one of two messages, depending on whether the
feature is supported by the running kernel. However,
PROG_TYPE_TRACEPOINT programs may fail to load before verification even
starts, e.g., if the shell does not have the appropriate capabilities.
Use a BPF_PROG_TYPE_SOCKET_FILTER program for the feature gate instead.

Also fix two minor issues. First, ensure the log buffer for the test is
initialized: Failing program load before verification led to libbpf dumping
uninitialized data to stdout. Also, ensure that close() is only called
for program_fd in the probe if the program load actually succeeded. The
call was currently failing silently with -EBADF most of the time.

Fixes: 728ff167910e ("libbpf: Add gating for arena globals relocation feature")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260217204345.548648-2-emil@etsalapatis.com
2026-02-17 20:15:42 -08:00
Paolo Abeni
a61d018702 geneve: add netlink support for GRO hint
Allow configuring and dumping the new device option, and cache its value
into the geneve socket itself.
The new option is not tied to any code yet.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/2295d4e4d1e919a3189425141bbc71c7850a2de0.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-17 20:15:42 -08:00
Jakub Kicinski
13ae2d1841 Revert "Merge branch 'netkit-support-for-io_uring-zero-copy-and-af_xdp'"
This reverts commit 77b9c4a438fc66e2ab004c411056b3fb71a54f2c, reversing
changes made to 4515ec4ad58a37e70a9e1256c0b993958c9b7497:

 931420a2fc36 ("selftests/net: Add netkit container tests")
 ab771c938d9a ("selftests/net: Make NetDrvContEnv support queue leasing")
 6be87fbb2776 ("selftests/net: Add env for container based tests")
 61d99ce3dfc2 ("selftests/net: Add bpf skb forwarding program")
 920da3634194 ("netkit: Add xsk support for af_xdp applications")
 eef51113f8af ("netkit: Add netkit notifier to check for unregistering devices")
 b5ef109d22d4 ("netkit: Implement rtnl_link_ops->alloc and ndo_queue_create")
 b5c3fa4a0b16 ("netkit: Add single device mode for netkit")
 0073d2fd679d ("xsk: Proxy pool management for leased queues")
 1ecea95dd3b5 ("xsk: Extend xsk_rcv_check validation")
 804bf334d08a ("net: Proxy netdev_queue_get_dma_dev for leased queues")
 0caa9a8ddec3 ("net: Proxy net_mp_{open,close}_rxq for leased queues")
 ff8889ff9107 ("net, ethtool: Disallow leased real rxqs to be resized")
 9e2103f36110 ("net: Add lease info to queue-get response")
 31127deddef4 ("net: Implement netdev_nl_queue_create_doit")
 a5546e18f77c ("net: Add queue-create operation")

The series will conflict with io_uring work, and the code needs more
polish.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-17 20:15:42 -08:00
Daniel Borkmann
05121e3fd9 netkit: Add single device mode for netkit
Add a single device mode for netkit instead of netkit pairs. The primary
target for the paired devices is to connect network namespaces, of course,
and support has been implemented in projects like Cilium [0]. For the rxq
leasing the plan is to support two main scenarios related to single device
mode:

* For the use-case of io_uring zero-copy, the control plane can either
  set up a netkit pair where the peer device can perform rxq leasing which
  is then tied to the lifetime of the peer device, or the control plane
  can use a regular netkit pair to connect the hostns to a Pod/container
  and dynamically add/remove rxq leasing through a single device without
  having to interrupt the device pair. In the case of io_uring, the memory
  pool is used as skb non-linear pages, and thus the skb will go its way
  through the regular stack into netkit. Things like the netkit policy when
  no BPF is attached or skb scrubbing etc apply as-is in case the paired
  devices are used, or if the backend memory is tied to the single device
  and traffic goes through a paired device.

* For the use-case of AF_XDP, the control plane needs to use netkit in the
  single device mode. The single device mode currently enforces only a
  pass policy when no BPF is attached, and does not yet support BPF link
  attachments for AF_XDP. skbs sent to that device get dropped at the
  moment. Given AF_XDP operates at a lower layer of the stack tying this
  to the netkit pair did not make sense. In future, the plan is to allow
  BPF at the XDP layer which can: i) process traffic coming from the AF_XDP
  application (e.g. QEMU with AF_XDP backend) to filter egress traffic or
  to push selected egress traffic up to the single netkit device to the
  local stack (e.g. DHCP requests), and ii) vice-versa skbs sent to the
  single netkit into the AF_XDP application (e.g. DHCP replies). Also,
  the control-plane can dynamically manage rxq leasing for the single
  netkit device without having to interrupt (e.g. down/up cycle) the main
  netkit pair for the Pod which has traffic going in and out.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: David Wei <dw@davidwei.uk>
Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Jordan Rife <jordan@jrife.io>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://docs.cilium.io/en/stable/operations/performance/tuning/#netkit-device-mode [0]
Link: https://patch.msgid.link/20260115082603.219152-10-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-17 20:15:42 -08:00
Daniel Borkmann
a623828cdc net: Add queue-create operation
Add a ynl netdev family operation called queue-create that creates a
new queue on a netdevice:

      name: queue-create
      attribute-set: queue
      flags: [admin-perm]
      do:
        request:
          attributes:
            - ifindex
            - type
            - lease
        reply: &queue-create-op
          attributes:
            - id

This is a generic operation such that it can be extended for various
use cases in future. Right now it is mandatory to specify ifindex,
the queue type which is enforced to rx and a lease. The newly created
queue id is returned to the caller.

A queue from a virtual device can have a lease which refers to another
queue from a physical device. This is useful for memory providers
and AF_XDP operations which take an ifindex and queue id to allow
applications to bind against virtual devices in containers. The lease
couples both queues together and allows proxying operations from
a virtual device in a container to the physical device.

In future, the nested lease attribute can be lifted and made optional
for other use-cases such as dynamic queue creation for physical
netdevs. The lack of lease and the specification of the physical
device as an ifindex will imply that we need a real queue to be
allocated. Similarly, the queue type enforcement to rx can then be
lifted as well to support tx.

An early implementation had only driver-specific integration [0], but
in order for other virtual devices to reuse, it makes sense to have
this as a generic API in core net.

For leasing queues, the virtual netdev must have real_num_rx_queue
less than num_rx_queues at the time of calling queue-create. The
queue-type must be rx as only rx queues are supported for leasing
for now. We also enforce that the queue-create ifindex must point
to a virtual device, and that the nested lease attribute's ifindex
must point to a physical device. The nested lease attribute set
contains a netns-id attribute which is currently only intended for
dumping as part of the queue-get operation. Also, it is modeled as
an s32 type similarly as done elsewhere in the stack.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: David Wei <dw@davidwei.uk>
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://bpfconf.ebpf.io/bpfconf2025/bpfconf2025_material/lsfmmbpf_2025_netkit_borkmann.pdf [0]
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20260115082603.219152-2-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-17 20:15:42 -08:00
Jonas Köppeler
4f8bdd9ce3 net/sched: sch_cake: share shaper state across sub-instances of cake_mq
This commit adds shared shaper state across the cake instances beneath a
cake_mq qdisc. It works by periodically tracking the number of active
instances, and scaling the configured rate by the number of active
queues.

The scan is lockless and simply reads the qlen and the last_active state
variable of each of the instances configured beneath the parent cake_mq
instance. Locking is not required since the values are only updated by
the owning instance, and eventual consistency is sufficient for the
purpose of estimating the number of active queues.

The interval for scanning the number of active queues is set to 200 us.
We found this to be a good tradeoff between overhead and response time.
For a detailed analysis of this aspect see the Netdevconf talk:

https://netdevconf.info/0x19/docs/netdev-0x19-paper16-talk-paper.pdf

Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-5-8d613fece5d8@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-17 20:15:42 -08:00
Andrii Nakryiko
fa0bbf147e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   08a7491843224f8b96518fbe70d9e48163046054
Checkpoint bpf-next commit: dc855b77719fe452d670cae2cf64da1eb51f16cc
Baseline bpf commit:        22cc16c04b7893d8fc22810599f49a305d600b9e
Checkpoint bpf commit:      1a7eb7a3d74031e6c173f0822023f354c2870354

Amery Hung (1):
  libbpf: Fix invalid write loop logic in bpf_linker__add_buf()

Bill Wendling (1):
  compiler_types.h: Attributes: Add __counted_by_ptr macro

Dapeng Mi (1):
  perf/x86/intel: Add support for PEBS memory auxiliary info field in
    DMR

Emil Tsalapatis (1):
  libbpf: Add gating for arena globals relocation feature

 include/uapi/linux/perf_event.h | 27 ++++++++++++--
 include/uapi/linux/stddef.h     |  4 +++
 src/features.c                  | 64 +++++++++++++++++++++++++++++++++
 src/libbpf.c                    |  7 ++--
 src/libbpf_internal.h           |  2 ++
 src/linker.c                    |  2 +-
 6 files changed, 100 insertions(+), 6 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2026-02-11 12:04:56 -08:00
Andrii Nakryiko
4c4f39f873 sync: update .mailmap
Update .mailmap based on libbpf's list of contributors and on the latest
.mailmap version in the upstream repository.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2026-02-11 12:04:56 -08:00
Amery Hung
fc735dab54 libbpf: Fix invalid write loop logic in bpf_linker__add_buf()
Fix bpf_linker__add_buf()'s logic of copying data from memory buffer into
memfd. In the event of short write not writing entire buf_sz bytes into memfd
file, we'll append bytes from the beginning of buf *again* (corrupting ELF
file contents) instead of correctly appending the rest of not-yet-read buf
contents.
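
The corrected pattern can be sketched as a generic write loop that resumes after a short write. write_all() is a hypothetical helper written for illustration, not the code in src/linker.c.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write all buf_sz bytes to fd. After a short write,
 * continue from the first unwritten byte instead of restarting at buf. */
static int write_all(int fd, const void *buf, size_t buf_sz)
{
	const char *p = buf;
	size_t written = 0;

	while (written < buf_sz) {
		ssize_t n = write(fd, p + written, buf_sz - written);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -errno;
		}
		written += (size_t)n;
	}
	return 0;
}
```

The bug described above corresponds to passing buf rather than buf plus the running offset on each iteration.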

Closes: https://github.com/libbpf/libbpf/issues/945
Fixes: 6d5e5e5d7ce1 ("libbpf: Extend linker API to support in-memory ELF files")
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20260209230134.3530521-1-ameryhung@gmail.com
2026-02-11 12:04:56 -08:00
Emil Tsalapatis
429aaef6a3 libbpf: Add gating for arena globals relocation feature
Add feature gating for the arena globals relocation introduced in
commit c1f61171d44b. The commit depends on a previous commit in the
same patchset that is absent from older kernels
(12a1fe6e12db "bpf/verifier: Do not limit maximum direct offset into arena map").

Without this commit, arena globals relocation with arenas >= 512MiB
fails to load and breaks libbpf's backwards compatibility.

Introduce a libbpf feature to check whether the running kernel allows for
full range ldimm64 offset, and only relocate arena globals if it does.

Fixes: c1f61171d44b ("libbpf: Move arena globals to the end of the arena")
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260210184532.255475-1-emil@etsalapatis.com
2026-02-11 12:04:56 -08:00
Dapeng Mi
ad0d0e5112 perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
With the introduction of the OMR feature, the PEBS memory auxiliary info
field for load and store latency events has been restructured for DMR.

The memory auxiliary info field's bit[8] indicates whether a L2 cache
miss occurred for a memory load or store instruction. If bit[8] is 0,
it signifies no L2 cache miss, and bits[7:0] specify the exact cache data
source (up to the L2 cache level). If bit[8] is 1, bits[7:0] represent
the OMR encoding, indicating the specific L3 cache or memory region
involved in the memory access.

A significant enhancement for OMR encoding is the ability to provide
up to 8 fine-grained memory regions in addition to the cache region,
offering more detailed insights into memory access regions.
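
The bit layout described above can be sketched as a small decoder. The struct and function names are hypothetical, and the sketch does not reproduce the kernel's actual data structures or the meaning of individual source or OMR codes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of the auxiliary info field. */
struct mem_aux_info {
	bool l2_miss;	/* bit[8]: 1 if an L2 cache miss occurred */
	uint8_t code;	/* bits[7:0]: cache data source if !l2_miss,
			 * OMR encoding if l2_miss */
};

static struct mem_aux_info decode_mem_aux(uint16_t aux)
{
	struct mem_aux_info info = {
		.l2_miss = (aux >> 8) & 1,
		.code = aux & 0xff,
	};

	return info;
}
```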

For detailed information on the memory auxiliary info encoding, please
refer to section 16.2 "PEBS LOAD LATENCY AND STORE LATENCY FACILITY" in
the ISE documentation.

This patch ensures that the PEBS memory auxiliary info field is correctly
interpreted and utilized in DMR.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260114011750.350569-3-dapeng1.mi@linux.intel.com
2026-02-11 12:04:56 -08:00
Bill Wendling
1317132162 compiler_types.h: Attributes: Add __counted_by_ptr macro
Introduce __counted_by_ptr(), which works like __counted_by(), but for
pointer struct members.

struct foo {
	int a, b, c;
	char *buffer __counted_by_ptr(bytes);
	short nr_bars;
	struct bar *bars __counted_by_ptr(nr_bars);
	size_t bytes;
};

Because "counted_by" can only be applied to pointer members in very
recent compiler versions, its application ends up needing to be distinct
from flexible array "counted_by" annotations, hence a separate macro.

This is a reworking of Kees' previous patch [1].

Link: https://lore.kernel.org/all/20251020220118.1226740-1-kees@kernel.org/ [1]
Co-developed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Bill Wendling <morbo@google.com>
Link: https://patch.msgid.link/20260116005838.2419118-1-morbo@google.com
Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-11 12:04:56 -08:00
Andrii Nakryiko
85d9be97eb sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   6f0b824a61f212e9707ff68abcabfdfa4724b811
Checkpoint bpf-next commit: 08a7491843224f8b96518fbe70d9e48163046054
Baseline bpf commit:        1d528e794f3db5d32279123a89957c44c4406a09
Checkpoint bpf commit:      22cc16c04b7893d8fc22810599f49a305d600b9e

Donglin Peng (4):
  libbpf: Add BTF permutation support for type reordering
  libbpf: Optimize type lookup with binary search for sorted BTF
  libbpf: Verify BTF sorting
  btf: Refactor the code by calling str_is_empty

Emil Tsalapatis (2):
  libbpf: Turn relo_core->sym_off unsigned
  libbpf: Move arena globals to the end of the arena

Ihor Solodrai (1):
  bpf: Migrate bpf_stream_vprintk() to KF_IMPLICIT_ARGS

Leon Hwang (2):
  bpf: Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags
  libbpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu maps

Matt Bobrowski (1):
  bpf: add new BPF_CGROUP_ITER_CHILDREN control option

Menglong Dong (2):
  bpf: add fsession support
  libbpf: add fsession support

Thomas Gleixner (1):
  treewide: Update email address

Thomas Weißschuh (1):
  vfs: use UAPI types for new struct delegation definition

Varun R Mallya (1):
  libbpf: Fix OOB read in btf_dump_get_bitfield_value

 include/uapi/linux/bpf.h        |  11 ++
 include/uapi/linux/fcntl.h      |  10 +-
 include/uapi/linux/perf_event.h |   2 +-
 src/bpf.c                       |   1 +
 src/bpf.h                       |   8 +
 src/bpf_helpers.h               |   6 +-
 src/btf.c                       | 276 +++++++++++++++++++++++++++-----
 src/btf.h                       |  42 +++++
 src/btf_dump.c                  |   9 ++
 src/libbpf.c                    |  64 +++++---
 src/libbpf.h                    |  21 +--
 src/libbpf.map                  |   1 +
 12 files changed, 369 insertions(+), 82 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
v1.6.3.1p_netdata
2026-01-29 14:10:19 -08:00
Andrii Nakryiko
fddf93d20b sync: update .mailmap
Update .mailmap based on libbpf's list of contributors and on the latest
.mailmap version in the upstream repository.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2026-01-29 14:10:19 -08:00
Matt Bobrowski
ed6bb65cf1 bpf: add new BPF_CGROUP_ITER_CHILDREN control option
Currently, the BPF cgroup iterator supports walking descendants in
either pre-order (BPF_CGROUP_ITER_DESCENDANTS_PRE) or post-order
(BPF_CGROUP_ITER_DESCENDANTS_POST). These modes perform an exhaustive
depth-first search (DFS) of the hierarchy. In scenarios where a BPF
program may need to inspect only the direct children of a given parent
cgroup, a full DFS is unnecessarily expensive.

This patch introduces a new BPF cgroup iterator control option,
BPF_CGROUP_ITER_CHILDREN. This control option restricts the traversal
to the immediate children of a specified parent cgroup, allowing for
more targeted and efficient iteration, particularly when exhaustive
depth-first search (DFS) traversal is not required.
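
To illustrate the cost difference, here is a toy tree walk in plain C. It does not use the kernel's cgroup structures or the BPF iterator API; struct node and both functions are hypothetical.

```c
#include <stddef.h>

/* Toy hierarchy node, standing in for a cgroup. */
struct node {
	struct node *first_child;
	struct node *next_sibling;
};

/* Exhaustive pre-order DFS: visits every descendant of parent. */
static size_t count_descendants(const struct node *parent)
{
	size_t n = 0;

	for (const struct node *c = parent->first_child; c; c = c->next_sibling)
		n += 1 + count_descendants(c);
	return n;
}

/* Children-only walk: visits just the immediate children of parent. */
static size_t count_children(const struct node *parent)
{
	size_t n = 0;

	for (const struct node *c = parent->first_child; c; c = c->next_sibling)
		n++;
	return n;
}
```

The first walk touches the whole subtree, while the second touches only one level, which is the saving the new control option is after.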

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://lore.kernel.org/r/20260127085112.3608687-1-mattbobrowski@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-29 14:10:19 -08:00
Menglong Dong
5ee8863eaf libbpf: add fsession support
Add BPF_TRACE_FSESSION to libbpf.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Link: https://lore.kernel.org/r/20260124062008.8657-9-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-29 14:10:19 -08:00
Menglong Dong
adde4f55b7 bpf: add fsession support
fsession is similar to a kprobe session. It allows attaching a single
BPF program to both the entry and the exit of the target functions.

Introduce struct bpf_fsession_link, which allows adding the link to
both the fentry and fexit progs_hlist of the trampoline.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Co-developed-by: Leon Hwang <leon.hwang@linux.dev>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260124062008.8657-2-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-29 14:10:19 -08:00
Ihor Solodrai
977b1f820c bpf: Migrate bpf_stream_vprintk() to KF_IMPLICIT_ARGS
Implement bpf_stream_vprintk with an implicit bpf_prog_aux argument,
and remove bpf_stream_vprintk_impl from the kernel.

Update the selftests to use the new API with implicit argument.

bpf_stream_vprintk macro is changed to use the new bpf_stream_vprintk
kfunc, and the extern definition of bpf_stream_vprintk_impl is
replaced accordingly.

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260120222638.3976562-11-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-29 14:10:19 -08:00
Thomas Gleixner
5d02120e10 treewide: Update email address
In a vain attempt to consolidate the email zoo switch everything to the
kernel.org account.

Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-01-29 14:10:19 -08:00
Donglin Peng
8a090ef1e5 btf: Refactor the code by calling str_is_empty
Call the str_is_empty function to clarify the code. No functional
changes are introduced.
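
As an illustration only (a sketch of what such a helper usually looks like, not text copied from the tree), an emptiness check of this kind treats both a NULL pointer and a zero-length string as empty:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sketch: a string is "empty" when it is NULL or starts with the NUL byte. */
static bool str_is_empty(const char *s)
{
	return s == NULL || s[0] == '\0';
}

/* Hypothetical caller: count the non-empty names in an array. */
static size_t count_named(const char *const *names, size_t cnt)
{
	size_t n = 0;

	assert(names != NULL || cnt == 0);
	for (size_t i = 0; i < cnt; i++)
		if (!str_is_empty(names[i]))
			n++;
	return n;
}
```

Replacing open-coded `!name || !name[0]` tests with one named helper keeps the call sites readable without changing behavior.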

Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-12-dolinux.peng@gmail.com
2026-01-29 14:10:19 -08:00
Donglin Peng
ad9c763445 libbpf: Verify BTF sorting
This patch checks whether the BTF is sorted by name in ascending
order. If sorted, binary search will be used when looking up types.

Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-6-dolinux.peng@gmail.com
2026-01-29 14:10:19 -08:00
Donglin Peng
1c96b72cb0 libbpf: Optimize type lookup with binary search for sorted BTF
This patch introduces binary search optimization for BTF type lookups
when the BTF instance contains sorted types.

The optimization significantly improves performance when searching for
types in large BTF instances with sorted types. For unsorted BTF, the
implementation falls back to the original linear search.
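
The lookup strategy described above can be sketched as follows. This is a minimal illustration under stated assumptions, not libbpf's implementation: `struct type_table`, `check_sorted()` and `find_by_name()` are hypothetical names, and a plain array of strings stands in for BTF types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a BTF type table: just an array of names. */
struct type_table {
	const char **names;
	size_t cnt;
	bool sorted;	/* result of check_sorted(), cached by the caller */
};

/* Returns true if the names are in ascending strcmp() order. */
static bool check_sorted(const struct type_table *t)
{
	for (size_t i = 1; i < t->cnt; i++)
		if (strcmp(t->names[i - 1], t->names[i]) > 0)
			return false;
	return true;
}

/* Returns the index of name, or -1 if it is not present. */
static long find_by_name(const struct type_table *t, const char *name)
{
	assert(t != NULL && name != NULL);

	if (t->sorted) {
		size_t lo = 0, hi = t->cnt;	/* search window is [lo, hi) */

		while (lo < hi) {
			size_t mid = lo + (hi - lo) / 2;
			int cmp = strcmp(t->names[mid], name);

			if (cmp == 0)
				return (long)mid;
			if (cmp < 0)
				lo = mid + 1;
			else
				hi = mid;
		}
		return -1;
	}

	/* Unsorted table: fall back to a linear scan. */
	for (size_t i = 0; i < t->cnt; i++)
		if (strcmp(t->names[i], name) == 0)
			return (long)i;
	return -1;
}
```

The sortedness check costs one linear pass, so it only pays off when its result is cached and reused across many lookups.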

Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-5-dolinux.peng@gmail.com
2026-01-29 14:10:19 -08:00
Donglin Peng
b7c6c02b5f libbpf: Add BTF permutation support for type reordering
Introduce btf__permute() API to allow in-place rearrangement of BTF types.
This function reorganizes BTF type order according to a provided array of
type IDs, updating all type references to maintain consistency.
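
To make the idea concrete, here is a toy model of such a permutation. It is a sketch under assumptions and does not use the real API: `struct toy_type` and `toy_permute()` are hypothetical, each type references at most one other type, ID 0 is assumed to stay fixed, and the array is assumed to map old IDs to new IDs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TYPES 16

/* Hypothetical simplified type record: a name plus one referenced type ID. */
struct toy_type {
	const char *name;
	unsigned int ref;	/* ID of the referenced type, 0 = none */
};

/*
 * Move the type at old ID i to ID new_id[i] and rewrite every reference to
 * the new numbering. The result is written back into the caller's array.
 * Returns 0 on success, -1 if new_id is not a valid permutation that keeps
 * ID 0 in place or if a reference is out of range.
 */
static int toy_permute(struct toy_type *types, size_t cnt,
		       const unsigned int *new_id)
{
	struct toy_type tmp[MAX_TYPES];
	unsigned char seen[MAX_TYPES] = {0};

	assert(types != NULL && new_id != NULL);
	if (cnt == 0 || cnt > MAX_TYPES || new_id[0] != 0)
		return -1;

	/* Validate everything first so a failure leaves types untouched. */
	for (size_t i = 0; i < cnt; i++) {
		if (new_id[i] >= cnt || seen[new_id[i]] || types[i].ref >= cnt)
			return -1;
		seen[new_id[i]] = 1;
	}

	for (size_t i = 0; i < cnt; i++) {
		tmp[new_id[i]] = types[i];
		tmp[new_id[i]].ref = new_id[types[i].ref];
	}
	memcpy(types, tmp, cnt * sizeof(*types));
	return 0;
}
```

Remapping the references in the same pass as the move is what keeps the graph consistent: a pointer to old ID 1 must point to wherever old ID 1 ended up.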

Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-2-dolinux.peng@gmail.com
2026-01-29 14:10:19 -08:00
Varun R Mallya
2c5038dcf4 libbpf: Fix OOB read in btf_dump_get_bitfield_value
When dumping bitfield data, btf_dump_get_bitfield_value() reads data
based on the underlying type's size (t->size). However, it does not
verify that the provided data buffer (data_sz) is large enough to
contain these bytes.

If btf_dump__dump_type_data() is called with a buffer smaller than
the type's size, this leads to an out-of-bounds read. This was
confirmed by AddressSanitizer in the linked issue.

Fix this by ensuring we do not read past the provided data_sz limit.
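
One way to enforce such a limit is sketched below. This is not the actual patch: `read_bitfield()` is a hypothetical helper, it assumes little-endian byte order, and the choice of -E2BIG for a short buffer is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: assemble a little-endian integer of type_sz bytes
 * from data, then extract bits_cnt bits starting at bit_off.
 * Refuses to read when the buffer holds fewer than type_sz bytes.
 */
static int read_bitfield(const uint8_t *data, size_t data_sz, size_t type_sz,
			 unsigned int bit_off, unsigned int bits_cnt,
			 uint64_t *out)
{
	uint64_t num = 0;

	assert(data != NULL && out != NULL);
	if (type_sz == 0 || type_sz > sizeof(num))
		return -EINVAL;
	if (bits_cnt == 0 || bits_cnt > type_sz * 8 ||
	    bit_off > type_sz * 8 - bits_cnt)
		return -EINVAL;
	if (data_sz < type_sz)
		return -E2BIG;	/* the bounds check: never read past data_sz */

	for (size_t i = 0; i < type_sz; i++)
		num |= (uint64_t)data[i] << (8 * i);

	num >>= bit_off;
	if (bits_cnt < 64)
		num &= (UINT64_C(1) << bits_cnt) - 1;
	*out = num;
	return 0;
}
```

The key point is that the size of the underlying type and the size of the caller's buffer are independent inputs, so the comparison has to happen before the first byte is read.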

Fixes: a1d3cc3c5eca ("libbpf: Avoid use of __int128 in typed dump display")
Reported-by: Harrison Green <harrisonmichaelgreen@gmail.com>
Suggested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Varun R Mallya <varunrmallya@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260106233527.163487-1-varunrmallya@gmail.com

Closes: https://github.com/libbpf/libbpf/issues/928
2026-01-29 14:10:19 -08:00
Leon Hwang
dc8673b28b libbpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu maps
Add libbpf support for the BPF_F_CPU flag for percpu maps by embedding the
cpu info into the high 32 bits of:

1. **flags**: bpf_map_lookup_elem_flags(), bpf_map__lookup_elem(),
   bpf_map_update_elem() and bpf_map__update_elem()
2. **opts->elem_flags**: bpf_map_lookup_batch() and
   bpf_map_update_batch()

The flag can also be BPF_F_ALL_CPUS, but the two flags cannot be combined
as 'BPF_F_CPU | BPF_F_ALL_CPUS'.

Behavior:

* If the flag is BPF_F_ALL_CPUS, the update is applied across all CPUs.
* If the flag is BPF_F_CPU, an update writes the value only to the specified CPU.
* If the flag is BPF_F_CPU, a lookup reads the value only from the specified CPU.
* Lookup does not support BPF_F_ALL_CPUS.
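
The flag encoding and the rules above can be sketched as follows. This is an illustration under assumptions: the `EX_F_*` bit values are placeholders (real code should take BPF_F_CPU and BPF_F_ALL_CPUS from <linux/bpf.h>), and the `ex_*` helpers are hypothetical, not libbpf functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bits for illustration only. */
#define EX_F_CPU      (UINT64_C(1) << 3)
#define EX_F_ALL_CPUS (UINT64_C(1) << 4)

/* Build a flags word that targets one CPU: CPU number in the high 32 bits. */
static uint64_t ex_flags_for_cpu(uint64_t base_flags, uint32_t cpu)
{
	return (base_flags & UINT64_C(0xffffffff)) | EX_F_CPU |
	       ((uint64_t)cpu << 32);
}

/* Extract the CPU number from a flags word. */
static uint32_t ex_flags_cpu(uint64_t flags)
{
	return (uint32_t)(flags >> 32);
}

/* Apply the rules from the commit message; returns 0 or -EINVAL. */
static int ex_check_flags(uint64_t flags, bool is_lookup)
{
	bool one_cpu = (flags & EX_F_CPU) != 0;
	bool all_cpus = (flags & EX_F_ALL_CPUS) != 0;

	if (one_cpu && all_cpus)
		return -EINVAL;	/* the two flags are mutually exclusive */
	if (is_lookup && all_cpus)
		return -EINVAL;	/* lookup does not support ALL_CPUS */
	return 0;
}
```

A caller would pass the resulting 64-bit value as the flags argument of the lookup or update call, or as opts->elem_flags for the batch variants.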

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260107022022.12843-7-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-29 14:10:19 -08:00
Leon Hwang
a6d7ceaaeb bpf: Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags
Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags and check them for the
following APIs:

* 'map_lookup_elem()'
* 'map_update_elem()'
* 'generic_map_lookup_batch()'
* 'generic_map_update_batch()'

Also compute the correct value size for these APIs.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260107022022.12843-2-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-29 14:10:19 -08:00
Thomas Weißschuh
e64e125ef6 vfs: use UAPI types for new struct delegation definition
Using libc types and headers from the UAPI headers is problematic as it
introduces a dependency on a full C toolchain.

Use the fixed-width integer types provided by the UAPI headers instead.

Fixes: 1602bad16d7d ("vfs: expose delegation support to userland")
Fixes: 4be9e04ebf75 ("vfs: add needed headers for new struct delegation definition")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20251203-uapi-fcntl-v1-1-490c67bf3425@linutronix.de
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-29 14:10:19 -08:00
Emil Tsalapatis
9dd6fda504 libbpf: Move arena globals to the end of the arena
Arena globals are currently placed at the beginning of the arena
by libbpf. This is convenient, but prevents users from reserving
guard pages in the beginning of the arena to identify NULL pointer
dereferences. Adjust the load logic to place the globals at the
end of the arena instead.

Also modify bpftool to set the arena pointer in the program's BPF
skeleton to point to the globals. Users now call bpf_map__initial_value()
to find the beginning of the arena mapping and use the arena pointer
in the skeleton to determine which part of the mapping holds the
arena globals and which part is free.
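
The resulting layout can be sketched with a small offset calculation. This is a toy model under assumptions, not libbpf's load logic: `globals_offset()` is a hypothetical helper, and rounding the globals up to a whole number of pages is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the arena size, the size of the globals and the
 * page size, compute the offset at which the globals start when they are
 * placed at the end of the arena. Everything in [0, *off) stays free, so the
 * first pages can serve as guard pages.
 * Returns 0 on success, -1 on invalid input.
 */
static int globals_offset(uint64_t arena_sz, uint64_t globals_sz,
			  uint64_t page_sz, uint64_t *off)
{
	uint64_t rounded;

	assert(off != 0);
	if (page_sz == 0 || (page_sz & (page_sz - 1)) != 0)
		return -1;	/* page size must be a power of two */
	if (globals_sz > UINT64_MAX - (page_sz - 1))
		return -1;	/* rounding up would overflow */

	rounded = (globals_sz + page_sz - 1) & ~(page_sz - 1);
	if (rounded > arena_sz)
		return -1;	/* globals do not fit in the arena */

	*off = arena_sz - rounded;
	return 0;
}
```

For a 16-page arena with 4096-byte pages and 5000 bytes of globals, the globals occupy the last two pages and the first 14 pages remain free.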

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20251216173325.98465-5-emil@etsalapatis.com
2026-01-29 14:10:19 -08:00
Emil Tsalapatis
2c7fe6ec5d libbpf: Turn relo_core->sym_off unsigned
The symbols' relocation offsets in BPF are stored in an int field,
but cannot actually be negative. Once libbpf relocates globals to the
end of the arena in the next patch, valid offsets > 2GiB also become
possible when calculating the final relo offsets.
Avoid accidentally interpreting large offsets as negative by turning
the sym_off field unsigned.
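
The hazard can be shown with a short sketch. The helper names are hypothetical, and the signed variant relies on the wrapping conversion that GCC and Clang implement for out-of-range values (implementation-defined in C standards before C23).

```c
#include <assert.h>
#include <stdint.h>

/* Offset stored in a signed 32-bit field: values above 2GiB turn negative. */
static int64_t resolve_signed(int64_t base, uint32_t raw_off)
{
	int32_t off = (int32_t)raw_off;	/* wraps on GCC/Clang */

	return base + off;
}

/* Offset stored in an unsigned 32-bit field: the full range stays positive. */
static int64_t resolve_unsigned(int64_t base, uint32_t raw_off)
{
	return base + (int64_t)raw_off;
}

/* Hypothetical check: do both interpretations agree for this offset? */
static int interpretations_agree(uint32_t raw_off)
{
	assert(raw_off != UINT32_MAX);	/* arbitrary guard for the example */
	return resolve_signed(0, raw_off) == resolve_unsigned(0, raw_off);
}
```

Both helpers agree for offsets below 2GiB; they diverge exactly in the range that becomes reachable when globals sit at the end of a large arena.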

Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20251216173325.98465-4-emil@etsalapatis.com
2026-01-29 14:10:19 -08:00
Andrii Nakryiko
160423d498 ci: denylist flaky 'bpf_cookie/perf_event' selftest
It keeps failing. It relies on perf_events, so it is not very reliable.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2026-01-26 13:12:58 -08:00
Andrii Nakryiko
afb8b17bc5 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f8c67d8550ee69ce684c7015b2c8c63cda24bbfb
Checkpoint bpf-next commit: 6f0b824a61f212e9707ff68abcabfdfa4724b811
Baseline bpf commit:        e427054ae7bc8b1268cf1989381a43885795616f
Checkpoint bpf commit:      1d528e794f3db5d32279123a89957c44c4406a09

Alan Maguire (1):
  libbpf: Add debug messaging in dedup equivalence/identity matching

Amery Hung (2):
  bpf: Support associating BPF program with struct_ops
  libbpf: Add support for associating BPF program with struct_ops

Asbjørn Sloth Tønnesen (1):
  tools: ynl-gen: add regeneration comment

Heiko Carstens (1):
  tools: Remove s390 compat support

James Clark (1):
  perf: Add perf_event_attr::config4

Jeff Layton (2):
  vfs: expose delegation support to userland
  vfs: add needed headers for new struct delegation definition

Jianyun Gao (1):
  libbpf: Fix some incorrect @param descriptions in the comment of
    libbpf.h

Kuniyuki Iwashima (1):
  bpf: Introduce SK_BPF_BYPASS_PROT_MEM.

Mikhail Gavrilov (1):
  libbpf: Fix -Wdiscarded-qualifiers under C23

Paul Houssel (1):
  libbpf: Fix BTF dedup to support recursive typedef definitions

Peter Zijlstra (1):
  perf: Support deferred user unwind

Samiullah Khawaja (1):
  net: Extend NAPI threaded polling to allow kthread based busy polling

 include/uapi/linux/bpf.h        |  19 ++++++
 include/uapi/linux/fcntl.h      |  16 +++++
 include/uapi/linux/netdev.h     |   2 +
 include/uapi/linux/perf_event.h |  23 +++++++-
 src/bpf.c                       |  19 ++++++
 src/bpf.h                       |  21 +++++++
 src/btf.c                       | 100 +++++++++++++++++++++++++-------
 src/libbpf.c                    |  42 +++++++++++---
 src/libbpf.h                    |  43 ++++++++++----
 src/libbpf.map                  |   2 +
 src/usdt.c                      |   2 -
 11 files changed, 247 insertions(+), 42 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2025-12-16 09:52:07 -08:00
Mikhail Gavrilov
fda2bfcb7a libbpf: Fix -Wdiscarded-qualifiers under C23
glibc ≥ 2.42 (GCC 15) defaults to -std=gnu23, which promotes
-Wdiscarded-qualifiers to an error.

In C23, strstr() and strchr() return "const char *".

Change variable types to const char * where the pointers are never
modified (res, sym_sfx, next_path).
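
The pattern looks roughly like the sketch below. The helpers and the ".llvm." suffix are hypothetical examples chosen for illustration, not the code touched by this patch. Declaring the result pointer as const char * compiles cleanly whether strstr() returns char * (older C) or const char * (C23 with a const argument).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the part of sym starting at ".llvm.", or NULL if there is none. */
static const char *find_suffix(const char *sym)
{
	/* const-qualified result: no qualifier is discarded in any C mode. */
	const char *sfx = strstr(sym, ".llvm.");

	return sfx;
}

/* Length of the symbol name without the suffix found by find_suffix(). */
static size_t base_len(const char *sym)
{
	const char *sfx;

	assert(sym != NULL);
	sfx = find_suffix(sym);
	return sfx ? (size_t)(sfx - sym) : strlen(sym);
}
```

Since the pointers are only read through, adding const changes nothing at run time; it only aligns the declarations with the stricter return type.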

Suggested-by: Florian Weimer <fweimer@redhat.com>
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Link: https://lore.kernel.org/r/20251206092825.1471385-1-mikhail.v.gavrilov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-12-16 09:52:07 -08:00