| Age | Commit message (Collapse) | Author | Lines |
|
Pull drm updates from Dave Airlie:
"Highlights:
- new DRM RAS infrastructure using netlink
- amdgpu: enable DC on CIK APUs, and more IP enablement, and more
user queue work
- xe: purgeable BO support, and new hw enablement
- dma-buf : add revocable operations
Full summary:
mm:
- two-pass MMU interval notifiers
- add gpu active/reclaim per-node stat counters
math:
- provide __KERNEL_DIV_ROUND_CLOSEST() in UAPI
- implement DIV_ROUND_CLOSEST() with __KERNEL_DIV_ROUND_CLOSEST()
rust:
- shared tag with driver-core: register macro and io infra
- core: rework DMA coherent API
- core: add interop::list to interop with C linked lists
- core: add more num::Bounded operations
- core: enable generic_arg_infer and add EMSGSIZE
- workqueue: add ARef<T> support for work and delayed work
- add GPU buddy allocator abstraction
- add DRM shmem GEM helper abstraction
- allow drm:::Device to dispatch work and delayed work items
to driver private data
- add dma_resv_lock helper and raw accessors
core:
- introduce DRM RAS infrastructure over netlink
- add connector panel_type property
- fourcc: add ARM interleaved 64k modifier
- colorop: add destroy helper
- suballoc: split into alloc and init helpers
- mode: provide DRM_ARGB_GET*() macros for reading color components
edid:
- provide drm_output_color_Format
dma-buf:
- provide revoke mechanism for shared buffers
- rename move_notify to invalidate_mappings
- always enable move_notify
- protect dma_fence_ops with RCU and improve locking
- clean pages with helpers
atomic:
- allocate drm_private_state via callback
- helper: use system_percpu_wq
buddy:
- make buddy allocator available to gpu level
- add kernel-doc for buddy allocator
- improve aligned allocation
ttm:
- fix fence signalling
- improve tests and docs
- improve handling of gfp_retry_mayfail
- use per-node stat counters to track memory allocations
- port pool to use list_lru
- drop NUMA specific pools
- make pool shrinker numa aware
- track allocated pages per numa node
coreboot:
- cleanup coreboot framebuffer support
sched:
- fix race condition in drm_sched_fini
pagemap:
- enable THP support
- pass pagemap_addr by reference
gem-shmem:
- Track page accessed/dirty status across mmap/vmap
gpusvm:
- reenable device to device migration
- fix unbalanced unclock
bridge:
- anx7625: Support USB-C plus DT bindings
- connector: Fix EDID detection
- dw-hdmi-qp: Support Vendor-Specfic and SDP Infoframes; improve
others
- fsl-ldb: Fix visual artifacts plus related DT property
'enable-termination-resistor'
- imx8qxp-pixel-link: Improve bridge reference handling
- lt9611: Support Port-B-only input plus DT bindings
- tda998x: Support DRM_BRIDGE_ATTACH_NO_CONNECTOR; Clean up
- Support TH1520 HDMI plus DT bindings
- waveshare-dsi: Fix register and attach; Support 1..4 DSI lanes plus
DT bindings
- anx7625: Fix USB Type-C handling
- cdns-mhdp8546-core: Handle HDCP state in bridge atomic_check
- Support Lontium LT8713SX DP MST bridge plus DT bindings
- analogix_dp: Use DP helpers for link training
panel:
- panel-jdi-lt070me05000: Use mipi-dsi multi functions
- panel-edp: Support Add AUO B116XAT04.1 (HW: 1A); Support CMN
N116BCL-EAK (C2); Support FriendlyELEC plus DT changes
- panel-edp: Fix timings for BOE NV140WUM-N64
- ilitek-ili9882t: Allow GPIO calls to sleep
- jadard: Support TAIGUAN XTI05101-01A
- lxd: Support LXD M9189A plus DT bindings
- mantix: Fix pixel clock; Clean up
- motorola: Support Motorola Atrix 4G and Droid X2 plus DT bindings
- novatek: Support Novatek/Tianma NT37700F plus DT bindings
- simple: Support EDT ET057023UDBA plus DT bindings; Support Powertip
PH800480T032-ZHC19 plus DT bindings; Support Waveshare 13.3"
- novatek-nt36672a: Use mipi_dsi_*_multi() functions
- panel-edp: Support BOE NV153WUM-N42, CMN N153JCA-ELK, CSW
MNF307QS3-2
- support Himax HX83121A plus DT bindings
- support JuTouch JT070TM041 plus DT bindings
- support Samsung S6E8FC0 plus DT bindings
- himax-hx83102c: support Samsung S6E8FC0 plus DT bindings; support
backlight
- ili9806e: support Rocktech RK050HR345-CT106A plus DT bindings
- simple: support Tianma TM050RDH03 plus DT bindings
amdgpu:
- enable DC by default on CIK APUs
- userq fence ioctl param size fixes
- set panel_type to OLED for eDP
- refactor DC i2c code
- FAMS2 update
- rework ttm handling to allow multiple engines
- DC DCE 6.x cleanup
- DC support for NUTMEG/TRAVIS DP bridge
- DCN 4.2 support
- GC12 idle power fix for compute
- use struct drm_edid in non-DC code
- enable NV12/P010 support on primary planes
- support newer IP discovery tables
- VCN/JPEG 5.0.2 support
- GC/MES 12.1 updates
- USERQ fixes
- add DC idle state manager
- eDP DSC seamless boot
amdkfd:
- GC 12.1 updates
- non 4K page fixes
xe:
- basic Xe3p_LPG and NVL-P enabling patches
- allow VM_BIND decompress support
- add purgeable buffer object support
- add xe_vm_get_property_ioctl
- restrict multi-lrc to VCS/VECS engines
- allow disabling VM overcommit in fault mode
- dGPU memory optimizations
- Workaround cleanups and simplification
- Allow VFs VRAM quote changes using sysfs
- convert GT stats to per-cpu counters
- pagefault refactors
- enable multi-queue on xe3p_xpc
- disable DCC on PTL
- make MMIO communication more robust
- disable D3Cold for BMG on specific platforms
- vfio: improve FLR sync for Xe VFIO
i915/display:
- C10/C20/LT PHY PLL divider verification
- use trans push mechanism to generate PSR frame change on LNL+
- refactor DP DSC slice config
- VGA decode refactoring
- refactor DPT, gen2-4 overlay, masked field register macro helpers
- refactor stolen memory allocation decisions
- prepare for UHBR DP tunnels
- refactor LT PHY PLL to use DPLL framework
- implement register polling/waiting in display code
- add shared stepping header between i915 and display
i915:
- fix potential overflow of shmem scatterlist length
nouveau:
- provide Z cull info to userspace
- initial GA100 support
- shutdown on PCI device shutdown
nova-core:
- harden GSP command queue
- add support for large RPCs
- simplify GSP sequencer and message handling
- refactor falcon firmware handling
- convert to new register macro
- conver to new DMA coherent API
- use checked arithmetic
- add debugfs support for gsp-rm log buffers
- fix aux device registration for multi-GPU
msm:
- CI:
- Uprev mesa
- Restore CI jobs for Qualcomm APQ8016 and APQ8096 devices
- Core:
- Switched to of_get_available_child_by_name()
- DPU:
- Fixes for DSC panels
- Fixed brownout because of the frequency / OPP mismatch
- Quad pipe preparation (not enabled yet)
- Switched to virtual planes by default
- Dropped VBIF_NRT support
- Added support for Eliza platform
- Reworked alpha handling
- Switched to correct CWB definitions on Eliza
- Dropped dummy INTF_0 on MSM8953
- Corrected INTFs related to DP-MST
- DP:
- Removed debug prints looking into PHY internals
- DSI:
- Fixes for DSC panels
- RGB101010 support
- Support for SC8280XP
- Moved PHY bindings from display/ to phy/
- GPU:
- Preemption support for x2-85 and a840
- IFPC support for a840
- SKU detection support for x2-85 and a840
- Expose AQE support (VK ray-pipeline)
- Avoid locking in VM_BIND fence signaling path
- Fix to avoid reclaim in GPU snapshot path
- Disallow foreign mapping of _NO_SHARE BOs
- HDMI:
- Fixed infoframes programming
- MDP5:
- Dropped support for MSM8974v1
- Dropped now unused code for MSM8974 v1 and SDM660 / MSM8998
panthor:
- add tracepoints for power and IRQs
- fix fence handling
- extend timestamp query with flags
- support various sources for timestamp queries
tyr:
- fix names and model/versions
rockchip:
- vop2: use drm logging function
- rk3576 displayport support
- support CRTC background color
atmel-hlcdc:
- support sana5d65 LCD controller
tilcdc:
- use DT bindings schema
- use managed DRM interfaces
- support DRM_BRIDGE_ATTACH_NO_CONNECTOR
verisilicon:
- support DC8200 + DT bindings
virtgpu:
- support PRIME import with 3D enabled
komeda:
- fix integer overflow in AFBC checks
mcde:
- improve bridge handling
gma500:
- use drm client buffer for fbdev framebuffer
amdxdna:
- add sensors ioctls
- provide NPU power estimate
- support column utilization sensor
- allow forcing DMA through IOMMU IOVA
- support per-BO mem usage queries
- refactor GEM implementation
ivpu:
- update boot API to v3.29.4
- limit per-user number of doorbells/contexts
- perform engine reset on TDR error
loongson:
- replace custom code with drm_gem_ttm_dumb_map_offset()
imx:
- support planes behind the primary plane
- fix bus-format selection
vkms:
- support CRTC background color
v3d:
- improve handling of struct v3d_stats
komeda:
- support Arm China Linlon D6 plus DT bindings
imagination:
- improve power-off sequence
- support context-reset notification from firmware
mediatek:
- mtk_dsi: enable hs clock during pre-enable
- Remove all conflicting aperture devices during probe
- Add support for mt8167 display blocks"
* tag 'drm-next-2026-04-15' of https://gitlab.freedesktop.org/drm/kernel: (1735 commits)
drm/ttm/tests: Remove checks from ttm_pool_free_no_dma_alloc
drm/ttm/tests: fix lru_count ASSERT
drm/vram: remove DRM_VRAM_MM_FILE_OPERATIONS from docs
drm/fb-helper: Fix a locking bug in an error path
dma-fence: correct kernel-doc function parameter @flags
ttm/pool: track allocated_pages per numa node.
ttm/pool: make pool shrinker NUMA aware (v2)
ttm/pool: drop numa specific pools
ttm/pool: port to list_lru. (v2)
drm/ttm: use gpu mm stats to track gpu memory allocations. (v4)
mm: add gpu active/reclaim per-node stat counters (v2)
gpu: nova-core: fix missing colon in SEC2 boot debug message
gpu: nova-core: vbios: use from_le_bytes() for PCI ROM header parsing
gpu: nova-core: bitfield: fix broken Default implementation
gpu: nova-core: falcon: pad firmware DMA object size to required block alignment
gpu: nova-core: gsp: fix undefined behavior in command queue code
drm/shmem_helper: Make sure PMD entries get the writeable upgrade
accel/ivpu: Trigger recovery on TDR with OS scheduling
drm/msm: Use of_get_available_child_by_name()
dt-bindings: display/msm: move DSI PHY bindings to phy/ subdir
...
|
|
The bridge-stp usermode helper is currently restricted to the initial
network namespace, preventing userspace STP daemons (e.g. mstpd) from
operating on bridges in other network namespaces. Since commit
ff62198553e4 ("bridge: Only call /sbin/bridge-stp for the initial
network namespace"), bridges in non-init namespaces silently fall
back to kernel STP with no way to use userspace STP.
Add a new bridge attribute IFLA_BR_STP_MODE that allows explicit
per-bridge control over STP mode selection:
BR_STP_MODE_AUTO (default) - Existing behavior: invoke the
/sbin/bridge-stp helper in init_net only; fall back to kernel STP
if it fails or in non-init namespaces.
BR_STP_MODE_USER - Directly enable userspace STP (BR_USER_STP)
without invoking the helper. Works in any network namespace.
Userspace is responsible for ensuring an STP daemon manages the
bridge.
BR_STP_MODE_KERNEL - Directly enable kernel STP (BR_KERNEL_STP)
without invoking the helper.
The mode can only be changed while STP is disabled, or set to the
same value (-EBUSY otherwise). IFLA_BR_STP_MODE is processed before
IFLA_BR_STP_STATE in br_changelink(), so both can be set atomically
in a single netlink message. The mode can also be changed in the
same message that disables STP.
The stp_mode struct field is u8 since all possible values fit, while
NLA_U32 is used for the netlink attribute since it occupies the same
space in the netlink message as NLA_U8.
A new stp_helper_active boolean tracks whether the /sbin/bridge-stp
helper was invoked during br_stp_start(), so that br_stp_stop() only
calls the helper for stop when it was called for start. This avoids
calling the helper asymmetrically when stp_mode changes between
start and stop.
Suggested-by: Ido Schimmel <idosch@nvidia.com>
Assisted-by: Claude:claude-opus-4-6
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Andy Roulin <aroulin@nvidia.com>
Link: https://patch.msgid.link/20260405205224.3163000-2-aroulin@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a single device mode for netkit instead of netkit pairs. The primary
target for the paired devices is to connect network namespaces, of course,
and support has been implemented in projects like Cilium [0]. For the rxq
leasing the plan is to support two main scenarios related to single device
mode:
* For the use-case of io_uring zero-copy, the control plane can either
set up a netkit pair where the peer device can perform rxq leasing which
is then tied to the lifetime of the peer device, or the control plane
can use a regular netkit pair to connect the hostns to a Pod/container
and dynamically add/remove rxq leasing through a single device without
having to interrupt the device pair. In the case of io_uring, the memory
pool is used as skb non-linear pages, and thus the skb will go its way
through the regular stack into netkit. Things like the netkit policy when
no BPF is attached or skb scrubbing etc apply as-is in case the paired
devices are used, or if the backend memory is tied to the single device
and traffic goes through a paired device.
* For the use-case of AF_XDP, the control plane needs to use netkit in the
single device mode. The single device mode currently enforces only a
pass policy when no BPF is attached, and does not yet support BPF link
attachments for AF_XDP. skbs sent to that device get dropped at the
moment. Given AF_XDP operates at a lower layer of the stack tying this
to the netkit pair did not make sense. In future, the plan is to allow
BPF at the XDP layer which can: i) process traffic coming from the AF_XDP
application (e.g. QEMU with AF_XDP backend) to filter egress traffic or
to push selected egress traffic up to the single netkit device to the
local stack (e.g. DHCP requests), and ii) vice-versa skbs sent to the
single netkit into the AF_XDP application (e.g. DHCP replies). Also,
the control-plane can dynamically manage rxq leasing for the single
netkit device without having to interrupt (e.g. down/up cycle) the main
netkit pair for the Pod which has traffic going in and out.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: David Wei <dw@davidwei.uk>
Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Jordan Rife <jordan@jrife.io>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://docs.cilium.io/en/stable/operations/performance/tuning/#netkit-device-mode [0]
Link: https://patch.msgid.link/20260402231031.447597-11-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a ynl netdev family operation called queue-create that creates a
new queue on a netdevice:
name: queue-create
attribute-set: queue
flags: [admin-perm]
do:
request:
attributes:
- ifindex
- type
- lease
reply: &queue-create-op
attributes:
- id
This is a generic operation such that it can be extended for various
use cases in future. Right now it is mandatory to specify ifindex,
the queue type which is enforced to rx and a lease. The newly created
queue id is returned to the caller.
A queue from a virtual device can have a lease which refers to another
queue from a physical device. This is useful for memory providers
and AF_XDP operations which take an ifindex and queue id to allow
applications to bind against virtual devices in containers. The lease
couples both queues together and allows to proxy the operations from
a virtual device in a container to the physical device.
In future, the nested lease attribute can be lifted and made optional
for other use-cases such as dynamic queue creation for physical
netdevs. The lack of lease and the specification of the physical
device as an ifindex will imply that we need a real queue to be
allocated. Similarly, the queue type enforcement to rx can then be
lifted as well to support tx.
An early implementation had only driver-specific integration [0], but
in order for other virtual devices to reuse, it makes sense to have
this as a generic API in core net.
For leasing queues, the virtual netdev must have real_num_rx_queues
less than num_rx_queues at the time of calling queue-create. The
queue-type must be rx as only rx queues are supported for leasing
for now. We also enforce that the queue-create ifindex must point
to a virtual device, and that the nested lease attribute's ifindex
must point to a physical device. The nested lease attribute set
contains a netns-id attribute which is optional and can specify a
netns-id relative to the caller's netns. It requires cap_net_admin
and if the netns-id attribute is not specified, the lease ifindex
will be retrieved from the current netns. Also, it is modeled as
an s32 type similarly as done elsewhere in the stack.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: David Wei <dw@davidwei.uk>
Signed-off-by: David Wei <dw@davidwei.uk>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://bpfconf.ebpf.io/bpfconf2025/bpfconf2025_material/lsfmmbpf_2025_netkit_borkmann.pdf [0]
Link: https://patch.msgid.link/20260402231031.447597-2-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Allow filtering the resource dump to device-level or port-level
resources using the 'scope' option.
Example - dump only device-level resources:
$ devlink resource show scope dev
pci/0000:03:00.0:
name max_local_SFs size 128 unit entry dpipe_tables none
name max_external_SFs size 128 unit entry dpipe_tables none
pci/0000:03:00.1:
name max_local_SFs size 128 unit entry dpipe_tables none
name max_external_SFs size 128 unit entry dpipe_tables none
Example - dump only port-level resources:
$ devlink resource show scope port
pci/0000:03:00.0/196608:
name max_SFs size 128 unit entry dpipe_tables none
pci/0000:03:00.0/196609:
name max_SFs size 128 unit entry dpipe_tables none
pci/0000:03:00.1/196708:
name max_SFs size 128 unit entry dpipe_tables none
pci/0000:03:00.1/196709:
name max_SFs size 128 unit entry dpipe_tables none
Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260407194107.148063-11-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Allow querying devlink resources per-port via the resource-dump doit
handler. When a port-index attribute is provided, only that port's
resources are returned. When no port-index is given, only device-level
resources are returned, preserving backward compatibility.
Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260407194107.148063-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add dumpit handler for resource-dump command to iterate over all devlink
devices and show their resources.
$ devlink resource show
pci/0000:08:00.0:
name local_max_SFs size 508 unit entry
name external_max_SFs size 508 unit entry
pci/0000:08:00.1:
name local_max_SFs size 508 unit entry
name external_max_SFs size 508 unit entry
Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Reviewed-by: Shay Drori <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260407194107.148063-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit f05d26198cf2 ("psp: add stats from psp spec to driver facing
api") added device statistics (rx-packets, rx-bytes, rx-auth-fail,
rx-error, rx-bad, tx-packets, tx-bytes, tx-error) to the stats
attribute-set but did not add them to the get-stats operation reply
attributes. The kernel reports these attributes in the reply, so
list them in the spec to match.
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260403-psp-yaml-fix-v1-1-dacee0663903@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add DPLL_A_FREQUENCY_MONITOR device attribute to allow control over
the frequency monitor feature. The attribute uses the existing
dpll_feature_state enum (enable/disable) and is present in both
device-get reply and device-set request.
Add DPLL_A_PIN_MEASURED_FREQUENCY pin attribute to expose the measured
input frequency in millihertz (mHz). The attribute is present in the
pin-get reply. Add DPLL_PIN_MEASURED_FREQUENCY_DIVIDER constant to
allow userspace to extract integer and fractional parts.
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20260402184057.1890514-2-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Linux 7.0-rc6
Requested by a few people on irc to resolve conflicts in other tress.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Cross-merge networking fixes after downstream PR (net-7.0-rc5).
net/netfilter/nft_set_rbtree.c
598adea720b97 ("netfilter: revert nft_set_rbtree: validate open interval overlap")
3aea466a43998 ("netfilter: nft_set_rbtree: don't disable bh when acquiring tree lock")
https://lore.kernel.org/abgaQBpeGstdN4oq@sirena.org.uk
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We look up a netdev during prep of Netlink ops (pre- callbacks)
and take a ref to it. Then later in the body of the callback
we take its lock or RCU which are the actual protections.
The netdev may get unregistered in between the time we take
the ref and the time we lock it. We may allocate the hierarchy
after flush has already run, which would lead to a leak.
Take the instance lock in pre- already, this saves us from the race
and removes the need for dedicated lock/unlock callbacks completely.
After all, if there's any chance of write happening concurrently
with the flush - we're back to leaking the hierarchy.
We may take the lock for devices which don't support shapers but
we're only dealing with SET operations here, not taking the lock
would be optimizing for an error case.
Fixes: 93954b40f6a4 ("net-shapers: implement NL set and delete operations")
Link: https://lore.kernel.org/20260309173450.538026-1-p@1g4.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20260317161014.779569-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Antonio Quartulli says:
====================
Included features:
* use bitops.h API when possible
* send netlink notification in case of client float event
* implement support for asymmetric peer IDs
* consolidate memory allocations during crypto operations
* add netlink notification check in selftests
* add FW mark check in selftest
* tag 'ovpn-net-next-20260317' of https://github.com/OpenVPN/ovpn-net-next:
ovpn: consolidate crypto allocations in one chunk
selftests: ovpn: add test for the FW mark feature
selftests: ovpn: check asymmetric peer-id
ovpn: add support for asymmetric peer IDs
selftests: ovpn: add notification parsing and matching
ovpn: notify userspace on client float event
ovpn: pktid: use bitops.h API
ovpn: use correct array size to parse nested attributes in ovpn_nl_key_swap_doit
selftests: ovpn: allow compiling ovpn-cli.c with mbedtls3
====================
Link: https://patch.msgid.link/20260317104023.192548-1-antonio@openvpn.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add two parameters for drivers supporting Rx CQE coalescing /
descriptor writeback.
ETHTOOL_A_COALESCE_RX_CQE_FRAMES:
Maximum number of frames that can be coalesced into a CQE or
writeback.
ETHTOOL_A_COALESCE_RX_CQE_NSECS:
Max time in nanoseconds after the first packet arrival in a
coalesced CQE or writeback to be sent.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260317191826.1346111-2-haiyangz@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In order to support the multipeer architecture, upon connection setup
each side of a tunnel advertises a unique ID that the other side must
include in packets sent to them. Therefore when transmitting a packet, a
peer inserts the recipient's advertised ID for that specific tunnel into
the peer ID field. When receiving a packet, a peer expects to find its
own unique receive ID for that specific tunnel in the peer ID field.
Add support for the TX peer ID and embed it into transmitting packets.
If no TX peer ID is specified, fallback to using the same peer ID both
for RX and TX in order to be compatible with the non-multipeer compliant
peers.
Cc: horms@kernel.org
Cc: donald.hunter@gmail.com
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
|
|
Send a netlink notification when a client updates its remote UDP
endpoint. The notification includes the new IP address, port, and scope
ID (for IPv6).
Cc: linux-kselftest@vger.kernel.org
Cc: horms@kernel.org
Cc: shuah@kernel.org
Cc: donald.hunter@gmail.com
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
|
|
Currently devlink instances are addressed bus_name/dev_name tuple.
Allow the newly introduced DEVLINK_ATTR_INDEX to be used as
an alternative handle for all devlink commands.
When DEVLINK_ATTR_INDEX is present in the request, use it for a direct
xarray lookup instead of iterating over all instances comparing
bus_name/dev_name strings.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20260312100407.551173-5-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Each devlink instance has an internally assigned index used for xarray
storage. Expose it as a new DEVLINK_ATTR_INDEX uint attribute alongside
the existing bus_name and dev_name handle.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20260312100407.551173-2-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is a duplicated unspec entry. Remove it.
No user impact expected, found by inspection.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20260311-b4-drop_dup_unspec-v1-1-e0dfa47b5981@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Backmerging to bring in 7.00-rc3. Important ahead GPU SVM merging THP
support.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
|
|
GENL_UNS_ADMIN_PERM may be required by protocols using
the `genetlink` family, however, this flag is currently
only allowed in `genetlink-legacy`.
Add it to the list of possible values in genetlink.yaml too.
Cc: Simon Horman <horms@kernel.org>
Cc: Donald Hunter <donald.hunter@gmail.com>
Link: https://github.com/OpenVPN/ovpn-net-next/issues/33
Suggested-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Link: https://patch.msgid.link/20260304141020.23270-1-antonio@openvpn.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Filled out operation attributes:
- newtable
- gettable
- deltable
- destroytable
- newchain
- getchain
- delchain
- destroychain
- newrule
- getrule
- getrule-reset
- delrule
- destroyrule
- newset
- getset
- delset
- destroyset
- newsetelem
- getsetelem
- getsetelem-reset
- delsetelem
- destroysetelem
- getgen
- newobj
- getobj
- delobj
- destroyobj
- newflowtable
- getflowtable
- delflowtable
- destroyflowtable
Signed-off-by: Remy D. Farley <one-d-wide@protonmail.com>
Link: https://patch.msgid.link/20260303195638.381642-6-one-d-wide@protonmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
New sub-messsages:
- log
- match
- numgen
- range
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Remy D. Farley <one-d-wide@protonmail.com>
Link: https://patch.msgid.link/20260303195638.381642-5-one-d-wide@protonmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
New attribute sets:
- log-attrs
- numgen-attrs
- range-attrs
- compat-target-attrs
- compat-match-attrs
- compat-attrs
Added missing attributes:
- table-attrs (pad, owner)
- set-attrs (type, count)
Added missing checks:
- range-attrs
- expr-bitwise-attrs
- compat-target-attrs
- compat-match-attrs
- compat-attrs
Annotated doc comment or associated enum:
- batch-attrs
- verdict-attrs
- expr-payload-attrs
Fixed byte order:
- nft-counter-attrs
- expr-counter-attrs
- rule-compat-attrs
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Remy D. Farley <one-d-wide@protonmail.com>
Link: https://patch.msgid.link/20260303195638.381642-4-one-d-wide@protonmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
New enums/flags:
- payload-base
- range-ops
- registers
- numgen-types
- log-level
- log-flags
Added missing enumerations:
- bitwise-ops
Annotated doc comment or associated enum:
- bitwise-ops
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Remy D. Farley <one-d-wide@protonmail.com>
Link: https://patch.msgid.link/20260303195638.381642-3-one-d-wide@protonmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add definitions for max check and len-or-limit type, the same as in other
specifications.
Suggested-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Remy D. Farley <one-d-wide@protonmail.com>
Link: https://patch.msgid.link/20260303195638.381642-2-one-d-wide@protonmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduces the DRM RAS infrastructure over generic netlink.
The new interface allows drivers to expose RAS nodes and their
associated error counters to userspace in a structured and extensible
way. Each drm_ras node can register its own set of error counters, which
are then discoverable and queryable through netlink operations. This
lays the groundwork for reporting and managing hardware error states
in a unified manner across different DRM drivers.
Currently it only supports error-counter nodes. But it can be
extended later.
The registration is also not tied to any drm node, so it can be
used by accel devices as well.
It uses the new and mandatory YAML description format stored in
Documentation/netlink/specs/. This forces a single generic netlink
family namespace for the entire drm: "drm-ras".
But multiple-endpoints are supported within the single family.
Any modification to this API needs to be applied to
Documentation/netlink/specs/drm_ras.yaml before regenerating the
code:
$ tools/net/ynl/pyynl/ynl_gen_c.py --spec \
Documentation/netlink/specs/drm_ras.yaml --mode uapi --header \
-o include/uapi/drm/drm_ras.h
$ tools/net/ynl/pyynl/ynl_gen_c.py --spec \
Documentation/netlink/specs/drm_ras.yaml --mode kernel \
--header -o drivers/gpu/drm/drm_ras_nl.h
$ tools/net/ynl/pyynl/ynl_gen_c.py --spec \
Documentation/netlink/specs/drm_ras.yaml \
--mode kernel --source -o drivers/gpu/drm/drm_ras_nl.c
Cc: Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org
Co-developed-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/20260304074412.464435-8-riana.tauro@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Cross-merge networking fixes after downstream PR (net-7.0-rc3).
No conflicts.
Adjacent changes:
net/netfilter/nft_set_rbtree.c
fb7fb4016300 ("netfilter: nf_tables: clone set on flush only")
3aea466a4399 ("netfilter: nft_set_rbtree: don't disable bh when acquiring tree lock")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With TX pause enabled, if a device is unable to pass packets up to the
stack (e.g., CPU is hanged), the device can cause pause storm. Given
that devices can have native support to protect the neighbor from such
flooding, such events need some tracking. This support is to track TX
pause storm events for better observability.
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Link: https://patch.msgid.link/20260302230149.1580195-2-mohsin.bashr@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Restore previous nfsd thread count reporting behavior
- Fix credential reference leaks in the NFSD netlink admin protocol
* tag 'nfsd-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
nfsd: report the requested maximum number of threads instead of number running
nfsd: Fix cred ref leak in nfsd_nl_listener_set_doit().
nfsd: Fix cred ref leak in nfsd_nl_threads_set_doit().
|
|
The current netlink and /proc interfaces deviate from their traditional
values when dynamic threading is enabled, and there is currently no way
to know what the current setting is. This patch brings the reporting
back in line with traditional behavior.
Make these interfaces report the requested maximum number of threads
instead of the number currently running. Also, update documentation and
comments to reflect that this value represents a maximum and not the
number currently running.
Fixes: d8316b837c2c ("nfsd: add controls to set the minimum number of threads per pool")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Pull nfsd updates from Chuck Lever:
"Neil Brown and Jeff Layton contributed a dynamic thread pool sizing
mechanism for NFSD. The sunrpc layer now tracks minimum and maximum
thread counts per pool, and NFSD adjusts running thread counts based
on workload: idle threads exit after a timeout when the pool exceeds
its minimum, and new threads spawn automatically when all threads are
busy. Administrators control this behavior via the nfsdctl netlink
interface.
Rick Macklem, FreeBSD NFS maintainer, generously contributed server-
side support for the POSIX ACL extension to NFSv4, as specified in
draft-ietf-nfsv4-posix-acls. This extension allows NFSv4 clients to
get and set POSIX access and default ACLs using native NFSv4
operations, eliminating the need for sideband protocols. The feature
is gated by a Kconfig option since the IETF draft has not yet been
ratified.
Chuck Lever delivered numerous improvements to the xdrgen tool. Error
reporting now covers parsing, AST transformation, and invalid
declarations. Generated enum decoders validate incoming values against
valid enumerator lists. New features include pass-through line support
for embedding C directives in XDR specifications, 16-bit integer
types, and program number definitions. Several code generation issues
were also addressed.
When an administrator revokes NFSv4 state for a filesystem via the
unlock_fs interface, ongoing async COPY operations referencing that
filesystem are now cancelled, with CB_OFFLOAD callbacks notifying
affected clients.
The remaining patches in this pull request are clean-ups and minor
optimizations. Sincere thanks to all contributors, reviewers, testers,
and bug reporters who participated in the v7.0 NFSD development cycle"
* tag 'nfsd-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (45 commits)
NFSD: Add POSIX ACL file attributes to SUPPATTR bitmasks
NFSD: Add POSIX draft ACL support to the NFSv4 SETATTR operation
NFSD: Add support for POSIX draft ACLs for file creation
NFSD: Add support for XDR decoding POSIX draft ACLs
NFSD: Refactor nfsd_setattr()'s ACL error reporting
NFSD: Do not allow NFSv4 (N)VERIFY to check POSIX ACL attributes
NFSD: Add nfsd4_encode_fattr4_posix_access_acl
NFSD: Add nfsd4_encode_fattr4_posix_default_acl
NFSD: Add nfsd4_encode_fattr4_acl_trueform_scope
NFSD: Add nfsd4_encode_fattr4_acl_trueform
Add RPC language definition of NFSv4 POSIX ACL extension
NFSD: Add a Kconfig setting to enable support for NFSv4 POSIX ACLs
xdrgen: Implement pass-through lines in specifications
nfsd: cancel async COPY operations when admin revokes filesystem state
nfsd: add controls to set the minimum number of threads per pool
nfsd: adjust number of running nfsd threads based on activity
sunrpc: allow svc_recv() to return -ETIMEDOUT and -EBUSY
sunrpc: split new thread creation into a separate function
sunrpc: introduce the concept of a minimum number of threads per pool
sunrpc: track the max number of requested threads in a pool
...
|
|
Merge in late fixes in preparation for the net-next PR.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The following warnings were visible:
$ ./scripts/kernel-doc -Wall -none \
net/mptcp/ include/net/mptcp.h include/uapi/linux/mptcp*.h \
include/trace/events/mptcp.h
Warning: net/mptcp/token.c:108 No description found for return value of 'mptcp_token_new_request'
Warning: net/mptcp/token.c:151 No description found for return value of 'mptcp_token_new_connect'
Warning: net/mptcp/token.c:246 No description found for return value of 'mptcp_token_get_sock'
Warning: net/mptcp/token.c:298 No description found for return value of 'mptcp_token_iter_next'
Warning: net/mptcp/protocol.c:4431 No description found for return value of 'mptcp_splice_read'
Warning: include/uapi/linux/mptcp_pm.h:13 missing initial short description on line:
* enum mptcp_event_type
Address all of them: either by using the 'Return:' keyword, or by adding
a missing initial short description.
The MPTCP CI will soon report issues with kdoc to avoid introducing new
issues and being flagged by the Netdev CI.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260205-net-mptcp-misc-fixes-6-19-rc8-v2-3-c2720ce75c34@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, the dpll subsystem exports the fractional frequency offset
(FFO) in parts per million (ppm). This granularity is insufficient for
high-precision synchronization scenarios which often require parts per
trillion (ppt) resolution.
Add a new netlink attribute DPLL_A_PIN_FRACTIONAL_FREQUENCY_OFFSET_PPT
to expose the FFO in ppt.
Update the dpll netlink core to expect the driver-provided FFO value
to be in ppt. To maintain backward compatibility with existing userspace
tools, populate the legacy DPLL_A_PIN_FRACTIONAL_FREQUENCY_OFFSET
attribute by dividing the new ppt value by 1,000,000.
Update the zl3073x and mlx5 drivers to provide the FFO value in ppt:
- zl3073x: adjust the fixed-point calculation to produce ppt (10^12)
instead of ppm (10^6).
- mlx5: scale the existing ppm value by 1,000,000.
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20260126162253.27890-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a new "min_threads" variable to the nfsd_net, along with the
corresponding netlink interface, to set that value from userland.
Pass that value to svc_set_pool_threads() and svc_set_num_threads().
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Allow configuring and dumping the new device option, and cache its value
into the geneve socket itself.
The new option is not tie to it any code yet.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/2295d4e4d1e919a3189425141bbc71c7850a2de0.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.19-rc7).
Conflicts:
drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
b35a6fd37a00 ("hinic3: Add adaptive IRQ coalescing with DIM")
fb2bb2a1ebf7 ("hinic3: Fix netif_queue_set_napi queue_index input parameter error")
https://lore.kernel.org/fc0a7fdf08789a52653e8ad05281a0a849e79206.1768915707.git.zhuyikai1@h-partners.com
drivers/net/wireless/ath/ath12k/mac.c
drivers/net/wireless/ath/ath12k/wifi7/hw.c
31707572108d ("wifi: ath12k: Fix wrong P2P device link id issue")
c26f294fef2a ("wifi: ath12k: Move ieee80211_ops callback to the arch specific module")
https://lore.kernel.org/20260114123751.6a208818@canb.auug.org.au
Adjacent changes:
drivers/net/wireless/ath/ath12k/mac.c
8b8d6ee53dfd ("wifi: ath12k: Fix scan state stuck in ABORTING after cancel_remain_on_channel")
914c890d3b90 ("wifi: ath12k: Add framework for hardware specific ieee80211_ops registration")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from CAN and wireless.
Pretty big, but hard to make up any cohesive story that would explain
it, a random collection of fixes. The two reverts of bad patches from
this release here feel like stuff that'd normally show up by rc5 or
rc6. Perhaps obvious thing to say, given the holiday timing.
That said, no active investigations / regressions. Let's see what the
next week brings.
Current release - fix to a fix:
- can: alloc_candev_mqs(): add missing default CAN capabilities
Current release - regressions:
- usbnet: fix crash due to missing BQL accounting after resume
- Revert "net: wwan: mhi_wwan_mbim: Avoid -Wflex-array-member-not ...
Previous releases - regressions:
- Revert "nfc/nci: Add the inconsistency check between the input ...
Previous releases - always broken:
- number of driver fixes for incorrect use of seqlocks on stats
- rxrpc: fix recvmsg() unconditional requeue, don't corrupt rcv queue
when MSG_PEEK was set
- ipvlan: make the addrs_lock be per port avoid races in the port
hash table
- sched: enforce that teql can only be used as root qdisc
- virtio: coalesce only linear skb
- wifi: ath12k: fix dead lock while flushing management frames
- eth: igc: reduce TSN TX packet buffer from 7KB to 5KB per queue"
* tag 'net-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (96 commits)
Octeontx2-af: Add proper checks for fwdata
dpll: Prevent duplicate registrations
net/sched: act_ife: avoid possible NULL deref
hinic3: Fix netif_queue_set_napi queue_index input parameter error
vsock/test: add stream TX credit bounds test
vsock/virtio: cap TX credit to local buffer size
vsock/test: fix seqpacket message bounds test
vsock/virtio: fix potential underflow in virtio_transport_get_credit()
net: fec: account for VLAN header in frame length calculations
net: openvswitch: fix data race in ovs_vport_get_upcall_stats
octeontx2-af: Fix error handling
net: pcs: pcs-mtk-lynxi: report in-band capability for 2500Base-X
rxrpc: Fix data-race warning and potential load/store tearing
net: dsa: fix off-by-one in maximum bridge ID determination
net: bcmasp: Fix network filter wake for asp-3.0
bonding: provide a net pointer to __skb_flow_dissect()
selftests: net: amt: wait longer for connection before sending packets
be2net: Fix NULL pointer dereference in be_cmd_get_mac_from_list
Revert "net: wwan: mhi_wwan_mbim: Avoid -Wflex-array-member-not-at-end warning"
netrom: fix double-free in nr_route_frame()
...
|
|
This reverts commit 77b9c4a438fc66e2ab004c411056b3fb71a54f2c, reversing
changes made to 4515ec4ad58a37e70a9e1256c0b993958c9b7497:
931420a2fc36 ("selftests/net: Add netkit container tests")
ab771c938d9a ("selftests/net: Make NetDrvContEnv support queue leasing")
6be87fbb2776 ("selftests/net: Add env for container based tests")
61d99ce3dfc2 ("selftests/net: Add bpf skb forwarding program")
920da3634194 ("netkit: Add xsk support for af_xdp applications")
eef51113f8af ("netkit: Add netkit notifier to check for unregistering devices")
b5ef109d22d4 ("netkit: Implement rtnl_link_ops->alloc and ndo_queue_create")
b5c3fa4a0b16 ("netkit: Add single device mode for netkit")
0073d2fd679d ("xsk: Proxy pool management for leased queues")
1ecea95dd3b5 ("xsk: Extend xsk_rcv_check validation")
804bf334d08a ("net: Proxy netdev_queue_get_dma_dev for leased queues")
0caa9a8ddec3 ("net: Proxy net_mp_{open,close}_rxq for leased queues")
ff8889ff9107 ("net, ethtool: Disallow leased real rxqs to be resized")
9e2103f36110 ("net: Add lease info to queue-get response")
31127deddef4 ("net: Implement netdev_nl_queue_create_doit")
a5546e18f77c ("net: Add queue-create operation")
The series will conflict with io_uring work, and the code needs more
polish.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a ynl netdev family operation called queue-create that creates a
new queue on a netdevice:
name: queue-create
attribute-set: queue
flags: [admin-perm]
do:
request:
attributes:
- ifindex
- type
- lease
reply: &queue-create-op
attributes:
- id
This is a generic operation such that it can be extended for various
use cases in future. Right now it is mandatory to specify ifindex,
the queue type which is enforced to rx and a lease. The newly created
queue id is returned to the caller.
A queue from a virtual device can have a lease which refers to another
queue from a physical device. This is useful for memory providers
and AF_XDP operations which take an ifindex and queue id to allow
applications to bind against virtual devices in containers. The lease
couples both queues together and allows to proxy the operations from
a virtual device in a container to the physical device.
In future, the nested lease attribute can be lifted and made optional
for other use-cases such as dynamic queue creation for physical
netdevs. The lack of lease and the specification of the physical
device as an ifindex will imply that we need a real queue to be
allocated. Similarly, the queue type enforcement to rx can then be
lifted as well to support tx.
An early implementation had only driver-specific integration [0], but
in order for other virtual devices to reuse, it makes sense to have
this as a generic API in core net.
For leasing queues, the virtual netdev must have real_num_rx_queue
less than num_rx_queues at the time of calling queue-create. The
queue-type must be rx as only rx queues are supported for leasing
for now. We also enforce that the queue-create ifindex must point
to a virtual device, and that the nested lease attribute's ifindex
must point to a physical device. The nested lease attribute set
contains a netns-id attribute which is currently only intended for
dumping as part of the queue-get operation. Also, it is modeled as
an s32 type similarly as done elsewhere in the stack.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: David Wei <dw@davidwei.uk>
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://bpfconf.ebpf.io/bpfconf2025/bpfconf2025_material/lsfmmbpf_2025_netkit_borkmann.pdf [0]
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20260115082603.219152-2-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently, userspace can retrieve the DPLL working mode but cannot
configure it. This prevents changing the device operation, such as
switching from manual to automatic mode and vice versa.
Add a new callback .mode_set() to struct dpll_device_ops. Extend
the netlink policy and device-set command handling to process
the DPLL_A_MODE attribute. Update the netlink YAML specification
to include the mode attribute in the device-set operation.
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20260114122726.120303-3-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
fou_udp_recv() has the same problem mentioned in the previous
patch.
If FOU_ATTR_IPPROTO is set to 0, skb is not freed by
fou_udp_recv() nor "resubmit"-ted in ip_protocol_deliver_rcu().
Let's forbid 0 for FOU_ATTR_IPPROTO.
Fixes: 23461551c0062 ("fou: Support for foo-over-udp RX path")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260115172533.693652-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Merge fixes related to the energy model management for 6.19-rc6:
- Fix a memory leak in em_create_pd() error path (Malaya Kumar Rout)
- Fix stale description of the cost field in struct em_perf_state to
reflect the current code (Yaxiong Tian)
- Fix and revamp the energy model YNL specification added recently
along with the energy model netlink interface (Changwoo Min)
* pm-em:
PM: EM: Add dump to get-perf-domains in the EM YNL spec
PM: EM: Change cpus' type from string to u64 array in the EM YNL spec
PM: EM: Rename em.yaml to dev-energymodel.yaml
PM: EM: Fix yamllint warnings in the EM YNL spec
PM: EM: Fix memory leak in em_create_pd() error path
PM: EM: Fix incorrect description of the cost field in struct em_perf_state
|
|
This commit adds shared shaper state across the cake instances beneath a
cake_mq qdisc. It works by periodically tracking the number of active
instances, and scaling the configured rate by the number of active
queues.
The scan is lockless and simply reads the qlen and the last_active state
variable of each of the instances configured beneath the parent cake_mq
instance. Locking is not required since the values are only updated by
the owning instance, and eventual consistency is sufficient for the
purpose of estimating the number of active queues.
The interval for scanning the number of active queues is set to 200 us.
We found this to be a good tradeoff between overhead and response time.
For a detailed analysis of this aspect see the Netdevconf talk:
https://netdevconf.info/0x19/docs/netdev-0x19-paper16-talk-paper.pdf
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-5-8d613fece5d8@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add dump to get-perf-domains, so that a user can fetch either information
about a specific performance domain with do or information about all
performance domains with dump. Share the reply format of do and dump using
perf-domain-attrs, so remove perf-domains. The YNL spec, autogenerated
files, and the do implementation are updated, and the dump implementation
is added.
Suggested-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Link: https://patch.msgid.link/20260108053212.642478-5-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Previously, the cpus attribute was a string format which was a "%*pb"
stringification of a bitmap. That is not very consumable for a UAPI,
so let’s change it to an u64 array of CPU ids.
Suggested-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Link: https://patch.msgid.link/20260108053212.642478-4-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The EM YNL specification used many acronyms, including ‘em’, ‘pd’,
‘ps’, etc. While the acronyms are short and convenient, they could be
confusing. So, let’s spell them out to be more specific. The following
changes were made in the spec. Note that the protocol name cannot exceed
GENL_NAMSIZ (16).
em -> dev-energymodel
pds -> perf-domains
pd -> perf-domain
pd-id -> perf-domain-id
pd-table -> perf-table
ps -> perf-state
get-pds -> get-perf-domains
get-pd-table -> get-perf-table
pd-created -> perf-domain-created
pd-updated -> perf-domain-updated
pd-deleted -> perf-domain-deleted
In addition. doc strings were added to the spec. based on the comments in
energy_model.h. Two flag attributes (perf-state-flags and
perf-domain-flags) were added for easily interpreting the bit flags.
Finally, the autogenerated files and em_netlink.c were updated accordingly
to reflect the name changes.
Suggested-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Link: https://patch.msgid.link/20260108053212.642478-3-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The energy model YNL spec has the following two warnings
when checking with yamlint:
3:1 warning missing document start "---" (document-start)
107:13 error wrong indentation: expected 10 but found 12 (indentation)
So let’s fix whose lint warnings.
Fixes: bd26631ccdfd ("PM: EM: Add em.yaml and autogen files")
Suggested-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Link: https://patch.msgid.link/20260108053212.642478-2-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The phrasing of the page-pool-get doc is very confusing.
It's supposed to highlight that support depends on the driver
doing its part but it sounds like orphaned page pools won't
be visible.
The description of the ifindex is completely wrong.
We move the page pool to loopback and skip the attribute if
ifindex is loopback.
Link: https://lore.kernel.org/20260104084347.5de3a537@kernel.org
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://patch.msgid.link/20260104165232.710460-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|