| Age | Commit message (Collapse) | Author | Lines |
|
The module is supported, enable it.
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
LLVM generates bpf_addr_space_cast instruction while translating pointers
between native (zero) address space and __attribute__((address_space(N))).
The addr_space=0 is reserved as bpf_arena address space.
rY = addr_space_cast(rX, 0, 1) is processed by the verifier and converted
to normal 32-bit move: wX = wY
rY = addr_space_cast(rX, 1, 0) has to be converted by JIT.
With this, the following test cases passed:
$ ./test_progs -a arena_htab,arena_list,arena_strsearch,verifier_arena,verifier_arena_large
#4/1 arena_htab/arena_htab_llvm:OK
#4/2 arena_htab/arena_htab_asm:OK
#4 arena_htab:OK
#5/1 arena_list/arena_list_1:OK
#5/2 arena_list/arena_list_1000:OK
#5 arena_list:OK
#7/1 arena_strsearch/arena_strsearch:OK
#7 arena_strsearch:OK
#507/1 verifier_arena/basic_alloc1:OK
#507/2 verifier_arena/basic_alloc2:OK
#507/3 verifier_arena/basic_alloc3:OK
#507/4 verifier_arena/basic_reserve1:OK
#507/5 verifier_arena/basic_reserve2:OK
#507/6 verifier_arena/reserve_twice:OK
#507/7 verifier_arena/reserve_invalid_region:OK
#507/8 verifier_arena/iter_maps1:OK
#507/9 verifier_arena/iter_maps2:OK
#507/10 verifier_arena/iter_maps3:OK
#507 verifier_arena:OK
#508/1 verifier_arena_large/big_alloc1:OK
#508/2 verifier_arena_large/access_reserved:OK
#508/3 verifier_arena_large/request_partially_reserved:OK
#508/4 verifier_arena_large/free_reserved:OK
#508/5 verifier_arena_large/big_alloc2:OK
#508 verifier_arena_large:OK
Summary: 5/20 PASSED, 0 SKIPPED, 0 FAILED
Acked-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Tested-by: Vincent Li <vincent.mc.li@gmail.com>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add support for `{LDX,STX,ST} | PROBE_MEM32 | {B,H,W,DW}` instructions.
They are similar to PROBE_MEM instructions with the following differences:
* PROBE_MEM32 supports store.
* PROBE_MEM32 relies on the verifier to clear upper 32-bit of the
src/dst register
* PROBE_MEM32 adds 64-bit kern_vm_start address (which is stored in S6
in the prologue). Due to bpf_arena constructions such S6 + reg +
off16 access is guaranteed to be within arena virtual range, so no
address check at run-time.
* S6 is a free callee-saved register, so it is used to store arena_vm_start
* PROBE_MEM32 allows ST and STX. If they fault the store is a nop. When
LDX faults the destination register is zeroed.
To support these on LoongArch, we employ the t2/t3 registers to store the
intermediate results of reg_arena + src/dst reg and use the t2/t3 registers
as the new src/dst reg. This allows us to reuse most of the existing code.
Acked-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Tested-by: Vincent Li <vincent.mc.li@gmail.com>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Use bpf_jit_binary_pack_alloc() for BPF JIT binaries. The BPF prog pack
allocator creates a pair of RW and RX buffers. The BPF JIT writes the
program into the RW buffer. When the JIT is done, the program is copied
to the final RX buffer with bpf_jit_binary_pack_finalize().
Acked-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Tested-by: Vincent Li <vincent.mc.li@gmail.com>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
In commit a759e37fb467 ("err.h: add ERR_PTR_PCPU(), PTR_ERR_PCPU() and
IS_ERR_PCPU() macros"), specialized macros were added to check an error
within a __percpu pointer, so use them instead of manually casting with
__force, like all other users of register_wide_hw_breakpoint().
Signed-off-by: Carlos López <clopez@suse.de>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
kasan_init_generic() indicates that kasan is fully initialized, so it
should be put at end of kasan_init().
Otherwise bringing up the primary CPU failed when CONFIG_KASAN is set
on PTW-enabled systems, here are the call chains:
kernel_entry()
start_kernel()
setup_arch()
kasan_init()
kasan_init_generic()
The reason is PTW-enabled systems have speculative accesses which means
memory accesses to the shadow memory after kasan_init() may be executed
by hardware before. However, accessing shadow memory is safe only after
kasan fully initialized because kasan_init() uses a temporary PGD table
until we have populated all levels of shadow page tables and writen the
PGD register. Moving kasan_init_generic() later can defer the occasion
of kasan_enabled(), so as to avoid speculative accesses on shadow pages.
After moving kasan_init_generic() to the end, kasan_init() can no longer
call kasan_mem_to_shadow() for shadow address conversion because it will
always return kasan_early_shadow_page. On the other hand, we should keep
the current logic of kasan_mem_to_shadow() for both the early and final
stage because there may be instrumentation before kasan_init().
To solve this, we factor out a new mem_to_shadow() function from current
kasan_mem_to_shadow() for the shadow address conversion in kasan_init().
Cc: stable@vger.kernel.org
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
According to Documentation/dev-tools/kasan.rst, software KASAN modes use
compiler instrumentation to insert validity checks. Such instrumentation
might be incompatible with some parts of the kernel, and therefore needs
to be disabled, just use the attribute __no_sanitize_address to disable
instrumentation for the low level function setup_ptwalker().
Otherwise bringing up the secondary CPUs failed when CONFIG_KASAN is set
(especially when PTW is enabled), here are the call chains:
smpboot_entry()
start_secondary()
cpu_probe()
per_cpu_trap_init()
tlb_init()
setup_tlb_handler()
setup_ptwalker()
The reason is the PGD registers are configured in setup_ptwalker(), but
KASAN instrumentation may cause TLB exceptions before that.
Cc: stable@vger.kernel.org
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
There are declarations of the variable "eentry", "pcpu_handlers[]" and
"exception_handlers[]" in asm/setup.h, the source files already include
this header file directly or indirectly, so no need to declare them in
the source files, just remove the code.
Cc: stable@vger.kernel.org
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
After commit 88fd2b70120d ("LoongArch: Fix sleeping in atomic context for
PREEMPT_RT"), it should guard percpu handler under !CONFIG_PREEMPT_RT to
avoid redundant operations.
Cc: stable@vger.kernel.org
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
After commit 4cd641a79e69 ("LoongArch: Remove unnecessary checks for ORC
unwinder"), the system can not boot normally under some configs (such as
enable KASAN), there are many error messages "cannot find unwind pc".
The kernel boots normally with the defconfig, so no problem found out at
the first time. Here is one way to reproduce:
cd linux
make mrproper defconfig -j"$(nproc)"
scripts/config -e KASAN
make olddefconfig all -j"$(nproc)"
sudo make modules_install
sudo make install
sudo reboot
The address that can not unwind is not a valid kernel address which is
between "pcpu_handlers[cpu]" and "pcpu_handlers[cpu] + vec_sz" due to
the code of eentry was copied to the new area of pcpu_handlers[cpu] in
setup_tlb_handler(), handle this special case to get the valid address
to unwind normally.
Cc: stable@vger.kernel.org
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Currently, use %p to prevent leaking information about the kernel memory
layout when printing the PC address, but the kernel log messages are not
useful to debug problem if bt_address() returns 0. Given that the type of
"pc" variable is unsigned long, it should use %px to print the unmodified
unwinding address.
Cc: stable@vger.kernel.org
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Currently we use bottom-up allocation after sparse_init(), the reason is
sparse_init() need a lot of memory, and bottom-up allocation may exhaust
precious low memory (below 4GB). On the other hand, SWIOTLB and CMA need
low memories for DMA32, so swiotlb_init() and dma_contiguous_reserve()
need bottom-up allocation.
Since swiotlb_init() and dma_contiguous_reserve() are both called in
arch_mem_init(), we no longer need bottom-up allocation after that. So
we set the allocation policy to top-down at the end of arch_mem_init(),
in order to avoid later memory allocations (such as KASAN) exhaust low
memory.
This solve at least two problems:
1. Some buggy BIOSes use 0xfd000000~0xfe000000 for secondary CPUs, but
didn't reserve this range, which causes smpboot failures.
2. Some DMA32 devices, such as Loongson-DRM and OHCI, cannot work with
KASAN enabled.
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
For benchmarking or debugging purpose, we usually want to control SMT
via boot parameter and sysfs knobs. So add HOTPLUG_SMT implementation.
1. Boot parameters:
nosmt: Disable SMT, can be enabled via sysfs knobs.
nosmt=force: Disable SMT, cannot be enabled via sysfs knobs.
2. Runtime sysfs controls:
Write "on", "off", "forceoff" or the number of SMT threads (1, 2, ...)
to /sys/devices/system/cpu/smt/control.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The arch definition of cpumask_of_node() cannot handle NUMA_NO_NODE -
which is a valid index - so add a check for this.
Cc: stable@vger.kernel.org
Signed-off-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
LoongArch supports ARCH_HAS_SET_DIRECT_MAP, therefore wire up the
memfd_secret system call, which just depends on it.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Signed-off-by: Lain "Fearyncess" Yang <fearyncess@aosc.io>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Fix warnings like: "Prefer seq_puts to seq_printf" by checkpatch.pl.
Replace seq_printf() calls with seq_puts() in show_cpuinfo() when
outputting simple constant strings without format specifiers.
This improves performance slightly as seq_puts() avoids parsing the
format string.
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Implement 128-bit atomic compare-and-exchange using LoongArch's
LL.D/SC.Q instructions.
At the same time, this fix the BPF scheduler test failures (scx_central
and scx_qmap) caused by kmalloc_nolock_noprof() returning NULL, due to
missing 128-bit atomics. The NULL returns lead to -ENOMEM errors during
scheduler initialization, causing test cases to fail.
Verified by testing with the scx_qmap scheduler
(located in tools/sched_ext/).
Building with `make` and running ./tools/sched_ext/build/bin/scx_qmap.
Link: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=5fb750e8a9ae
Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Co-developed-by:: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Check the CPUCFG2_SCQ bit to determine if the current CPU supports the
SC.Q instruction.
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Co-developed-by: Yangyang Lian <lianyangyang@kylinos.cn>
Signed-off-by: Yangyang Lian <lianyangyang@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
LoongArch has already implemented cmpxchg_local(), this_cpu_cmpxchg()
and similar functions, so select HAVE_CMPXCHG_LOCAL in Kconfig to avoid
incurring the overhead of local_irq_save()/local_irq_restore() for page
state helpers in mm/vmstat.c.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Relax the maximum allowed Secure Execution (SE) header size from
8 KiB to 1 MiB. This allows individual secure guest images to run on a
wider range of physical machines.
Signed-off-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
|
|
Farhan has been doing a masterful job coming on in the
s390 PCI space, and my own attention has been lacking.
Let's make MAINTAINERS reflect reality.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Acked-by: Alex Williamson <alex@shazbot.org>
Acked-by: Farhan Ali <alifm@linux.ibm.com>
Acked-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
|
|
Felix Maurer says:
====================
hsr: Implement more robust duplicate discard algorithm
The duplicate discard algorithms for PRP and HSR do not work reliably
with certain link faults. Especially with packet loss on one link, the
duplicate discard algorithms drop valid packets. For a more thorough
description see patches 4 (for PRP) and 6 (for HSR).
This patchset replaces the current algorithms (based on a drop window
for PRP and highest seen sequence number for HSR) with a single new one
that tracks the received sequence numbers individually (descriptions
again in patches 4 and 6).
The changes will lead to higher memory usage and more work to do for
each packet. But I argue that this is an acceptable trade-off to make
for a more robust PRP and HSR behavior with faulty links. After all,
both protocols are to be used in environments where redundancy is needed
and people are willing to setup special network topologies to achieve
that.
Some more reasoning on the overhead and expected scale of the deployment
from the RFC discussion:
> As for the expected scale, there are two dimensions: the number of nodes
> in the network and the data rate with which they send.
>
> The number of nodes in the network affect the memory usage because each
> node now has the block buffer. For PRP that's 64 blocks * 32 byte =
> 2kbyte for each node in the node table. A PRP network doesn't have an
> explicit limit for the number of nodes. However, the whole network is a
> single layer-2 segment which shouldn't grow too large anyways. Even if
> one really tries to put 1000 nodes into the PRP network, the memory
> overhead (2Mbyte) is acceptable in my opinion.
>
> For HSR, the blocks would be larger because we need to track the
> sequence numbers per port. I expect 64 blocks * 80 byte = 5kbyte per
> node in the node table. There is no explicit limit for the size of an
> HSR ring either. But I expect them to be of limited size because the
> forwarding delays add up throughout the ring. I've seen vendors limiting
> the ring size to 50 nodes with 100Mbit/s links and 300 with 1Gbit/s
> links. In both cases I consider the memory overhead acceptable.
>
> The data rates are harder to reason about. In general, the data rates
> for HSR and PRP are limited because too high packet rates would lead to
> very fast re-use of the 16bit sequence numbers. The IEC 62439-3:2021
> mentions 100Mbit/s links and 1Gbit/s links. I don't expect HSR or PRP
> networks to scale out to, e.g., 10Gbit/s links with the current
> specification as this would mean that sequence numbers could repeat as
> often as every ~4ms. The default constants in the IEC standard, which we
> also use, are oriented at a 100Mbit/s network.
>
> In my tests with veth pairs, the CPU overhead didn't lead to
> significantly lower data rates. The main factor limiting the data rate
> at the moment, I assume, is the per-node spinlock that is taken for each
> received packet. IMHO, there is a lot more to gain in terms of CPU
> overhead from making this lock smaller or getting rid of it, than we
> loose with the more accurate duplicate discard algorithm in this patchset.
>
> The CPU overhead of the algorithm benefits from the fact that in high
> packet rate scenarios (where it really matters) many packets will have
> sequence numbers in already initialized blocks. These packets just have
> additionally: one xarray lookup, one comparison, and one bit setting. If
> a block needs to be initialized (once every 128 packets plus their 128
> duplicates if all sequence numbers are seen), we will have: one
> xa_erase, a bunch of memory writes, and one xa_store.
>
> In theory, all packets could end up in the slow path if a node sends
> every 128th packet to us. If this is sent from a well behaving node, the
> packet rate wouldn't be an issue anymore, though.
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
====================
Link: https://patch.msgid.link/cover.1770299429.git.fmaurer@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Despite the HSR subsystem being orphaned at the moment due to the original
maintainer being unreachable for a while, assign the selftests to the
subsystem for the future.
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/f4a356b96f5e0c99d9db3984ea62596c99a97469.1770299429.git.fmaurer@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Run the packet loss and reordering tests also for both HSR versions. Now
they can be removed from the hsr_ping tests completely. The timeout needs
to be increased because there are 15 link fault test cases now, with each
of them taking 5-6sec for the test and at most 5sec for the HSR node tables
to get merged and we also want some room to make the test runs stable.
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/eb6f667d3804ce63d86f0ee3fbc0e0ac9e1a209a.1770299429.git.fmaurer@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The HSR duplicate discard algorithm had even more basic problems than the
described for PRP in the previous patch. It relied only on the last
received sequence number to decide if a new frame should be forwarded to
any port. This does not work correctly in any case where frames are
received out of order. The linked bug report claims that this can even
happen with perfectly fine links due to the order in which incoming frames
are processed (which can be unexpected on multi-core systems). The issue
also occasionally shows up in the HSR selftests. The main reason is that
the sequence number that was last forwarded to the master port may have
skipped a number which will in turn never be delivered to the host.
As the problem (we accidentally skip over a sequence number that has not
been received but will be received in the future) is similar to PRP, we can
apply a similar solution. The duplicate discard algorithm based on the
"sparse bitmap" works well for HSR if it is extended to track one bitmap
for each port (A, B, master, interlink). To do this, change the sequence
number blocks to contain a flexible array member as the last member that
can keep chunks for as many bitmaps as we need. This design makes it easy
to reuse the same algorithm in a potential PRP RedBox implementation.
The duplicate discard algorithm functions are modified to deal with
sequence number blocks of different sizes and to correctly use the array of
bitmap chunks. There is a notable speciality for HSR: the port type has a
special port type NONE with value 0. This leads to the number of port types
being 5 instead of actually 4. To save memory, remove the NONE port from
the bitmap (by subtracting 1) when setting up the block buffer and when
accessing the bitmap chunks in the array.
Removing the old algorithm allows us to get rid of a few fields that are
not needed any more: time_out and seq_out for each port. We can also remove
some functions that were only necessary for the previous duplicate discard
algorithm.
The removal of seq_out is possible despite its previous usage in
hsr_register_frame_in: it was used to prevent updates to time_in when
"invalid" sequence numbers were received. With the new duplicate discard
algorithm, time_in has no relevance for the expiry of sequence numbers
anymore. They will expire based on the timestamps in the sequence number
blocks after at most 400ms. There is no need that a node "re-registers" to
"resume communication": after 400ms, all sequence numbers are accepted
again. Also, according to the IEC 62439-3:2021, all nodes are supposed to
send no traffic for 500ms after boot to lead exactly to this expiry of seen
sequence numbers. time_in is still used for pruning nodes from the node
table after no traffic has been received for 60sec. Pruning is only needed
if the node is really gone and has not been sending any traffic for that
period.
seq_out was also used to report the last incoming sequence number from a
node through netlink. I am not sure how useful this value is to userspace
at all, but added getting it from the sequence number blocks. This number
can be outdated after node merging until a new block has been added.
Update the KUnit test for the PRP duplicate discard so that the node
allocation matches and expectations on the removed fields are removed.
Reported-by: Yoann Congal <yoann.congal@smile.fr>
Closes: https://lore.kernel.org/netdev/7d221a07-8358-4c0b-a09c-3b029c052245@smile.fr/
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/36dc3bc5bdb7e68b70bb5ef86f53ca95a3f35418.1770299429.git.fmaurer@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add tests where one link has different rates of packet loss or reorders
packets. PRP should still be able to recover from these link faults and
show no packet loss. However, it is acceptable to receive some level of
duplicate packets. This matches the current specification (IEC
62439-3:2021) of the duplicate discard algorithm that requires it to be
"designed such that it never rejects a legitimate frame, while occasional
acceptance of a duplicate can be tolerated." The rate of acceptable
duplicates in this test is intentionally high (10%) to make the test
stable, the values I observed in the worst test cases (20% loss) are around
5% duplicates.
The duplicates occur because of the 10ms ping interval in the test. As
blocks expire after 400ms based on the timestamp of the first received
sequence number in the block, every approx. 40th will lead to a new, clean
block being used where the sequence number hasn't been seen before. As this
occurs on both nodes in the test (for requests and replies), we observe
around 20 duplicate frames.
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/7b36506d3a80e53786fe56526cf6046c74dfeee1.1770299429.git.fmaurer@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The PRP duplicate discard algorithm does not work reliably with certain
link faults. Especially with packet loss on one link, the duplicate discard
algorithm drops valid packets which leads to packet loss on the PRP
interface where the link fault should in theory be perfectly recoverable by
PRP. This happens because the algorithm opens the drop window on the lossy
link, covering received and lost sequence numbers. If the other, non-lossy
link receives the duplicate for a lost frame, it is within the drop window
of the lossy link and therefore dropped.
Since IEC 62439-3:2012, a node has one sequence number counter for frames
it sends, instead of one sequence number counter for each destination.
Therefore, a node can not expect to receive contiguous sequence numbers
from a sender. A missing sequence number can be totally normal (if the
sender intermittently communicates with another node) or mean a frame was
lost.
The algorithm, as previously implemented in commit 05fd00e5e7b1 ("net: hsr:
Fix PRP duplicate detection"), was part of IEC 62439-3:2010 (HSRv0/PRPv0)
but was removed with IEC 62439-3:2012 (HSRv1/PRPv1). Since that, no
algorithm is specified but up to implementers. It should be "designed such
that it never rejects a legitimate frame, while occasional acceptance of a
duplicate can be tolerated" (IEC 62439-3:2021).
For the duplicate discard algorithm, this means that 1) we need to track
the sequence numbers individually to account for non-contiguous sequence
numbers, and 2) we should always err on the side of accepting a duplicate
than dropping a valid frame.
The idea of the new algorithm is to store the seen sequence numbers in a
bitmap. To keep the size of the bitmap in control, we store it as a "sparse
bitmap" where the bitmap is split into blocks and not all blocks exist at
the same time. The sparse bitmap is implemented using an xarray that keeps
the references to the individual blocks and a backing ring buffer that
stores the actual blocks. New blocks are initialized in the buffer and
added to the xarray as needed when new frames arrive. Existing blocks are
removed in two conditions:
1. The block found for an arriving sequence number is old and therefore not
relevant to the duplicate discard algorithm anymore, i.e., it has been
added more than the entry forget time ago. In this case, the block is
removed from the xarray and marked as forgotten (by setting its
timestamp to 0).
2. Space is needed in the ring buffer for a new block. In this case, the
block is removed from the xarray, if it hasn't already been forgotten
(by 1.). Afterwards, the new block is initialized in its place.
This has the nice property that we can reliably track sequence numbers on
low traffic situations (where they expire based on their timestamp) and
more quickly forget sequence numbers in high traffic situations before they
potentially wrap over and repeat before they are expired.
When nodes are merged, the blocks are merged as well. The timestamp of a
merged block is set to the minimum of the two timestamps to never keep
around a seen sequence number for too long. The bitmaps are or'd to mark
all seen sequence numbers as seen.
All of this still happens under seq_out_lock, to prevent concurrent
access to the blocks.
The KUnit test for the algorithm is updated as well. The updates are done
in a way to match the original intends pretty closely. Currently, there is
much knowledge about the actual algorithm baked into the tests (especially
the expectations) which may need some redesign in the future.
Reported-by: Steffen Lindner <steffen.lindner@de.abb.com>
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Steffen Lindner <steffen.lindner@de.abb.com>
Link: https://patch.msgid.link/8ce15a996099df2df5b700969a39e7df400e8dbb.1770299429.git.fmaurer@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add a test case that can support different types of faulty links for all
protocol versions (HSRv0, HSRv1, PRPv1). It starts with a baseline with
fully functional links. The first faulty case is one link being cut during
the ping. This test uses a different function for ping that sends more
packets in shorter intervals to stress the duplicate detection algorithms a
bit more and allow for future tests with other link faults (packet loss,
reordering, etc.).
As the link fault tests now cover the cut link for HSR and PRP, it can be
removed from the hsr_ping test. Note that the removed cut link test did not
really test the fault because do_ping_long takes about 1sec while the link
is only cut after a 3sec sleep.
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/dad52276e2c349ecb96168bef7e3001bf7becc81.1770299429.git.fmaurer@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Previously the hsr_ping test only checked that all nodes in a VLAN are
reachable (using do_ping). Update the test to also check that there is no
packet loss and no duplicate packets by running the same tests for VLANs as
without VLANs (including using do_ping_long). This also adds tests for IPv6
over VLAN. To unify the test code, the topology without VLANs now uses IP
addresses from dead:beef:0::/64 to align with the 100.64.0.0/24 range for
IPv4. Error messages are updated across the board to make it easier to find
what actually failed.
Also update the VLAN test to only run in VLAN 2, as there is no need to
check if ping really works with VLAN IDs 2, 3, 4, and 5. This lowers the
number of long ping tests on VLANs to keep the overall test runtime in
bounds.
It's still necessary to bump the test timeout a bit, though: a ping long
tests takes 1sec, do_ping_tests performs 12 of them, do_link_problem_tests
6, and the VLAN tests again 12. With some buffer for setup and waiting and
for two protocol versions, 90sec timeout seems reasonable.
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/e3ded0e2547b5f720524b62fabeb96debc579697.1770299429.git.fmaurer@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add a selftest for PRP that performs a basic ping test on IPv4 and IPv6,
over the plain PRP interface and a VLAN interface, similar to the existing
ping test for HSR. The test first checks reachability of the other node,
then checks for no loss and no duplicates.
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/4a342189e842d7308d037da72af566729ee75834.1770299429.git.fmaurer@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
zombie callers""
Commit 006568ab4c5c ("pid: Add a judgment for ns null in pid_nr_ns")
already added the ns != NULL check in pid_nr_ns(), so we can remove the
same check from __task_pid_nr_ns().
* patches from https://patch.msgid.link/20251015123550.GA9447@redhat.com:
pid: introduce task_ppid_vnr() helper
Revert "pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers"
Link: https://patch.msgid.link/20251015123550.GA9447@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Cosmetic change. Unlike all other similar helpers task_ppid_nr_ns() doesn't
have a _vnr() version; add one for consistency.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://patch.msgid.link/20251015123633.GB9456@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
OBJEXTS_NOSPIN_ALLOC was used to remember whether a slabobj_ext vector
was allocated via kmalloc_nolock(), so that free_slab_obj_exts() could
call kfree_nolock() instead of kfree().
Now that kfree() supports freeing kmalloc_nolock() objects, this flag is
no longer needed. Instead, pass the allow_spin parameter down to
free_slab_obj_exts() to determine whether kfree_nolock() or kfree()
should be called in the free path, and free one bit in
enum objext_flags.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Hao Li <hao.li@linux.dev>
Link: https://patch.msgid.link/20260210044642.139482-3-harry.yoo@oracle.com
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
|
|
This paves the way for scalable PID allocation later.
The 32 bit variant merely takes a spinlock for simplicity, the 64 bit
variant uses a scalable scheme.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://patch.msgid.link/20260120184539.1480930-1-mjguzik@gmail.com
Co-developed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This reverts commit abdfd4948e45c51b19162cf8b3f5003f8f53c9b9.
The changelog in this commit explains why it is not easy to avoid ns == NULL
when the caller is exiting, but pid_vnr() is equally unsafe in this case.
However, commit 006568ab4c5c ("pid: Add a judgment for ns null in pid_nr_ns")
already added the ns != NULL check in pid_nr_ns(), so we can remove the same
check from __task_pid_nr_ns().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://patch.msgid.link/20251015123613.GA9456@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Slab objects that are allocated with kmalloc_nolock() must be freed
using kfree_nolock() because only a subset of alloc hooks are called,
since kmalloc_nolock() can't spin on a lock during allocation.
This imposes a limitation: such objects cannot be freed with kfree_rcu(),
forcing users to work around this limitation by calling call_rcu()
with a callback that frees the object using kfree_nolock().
Remove this limitation by teaching kmemleak to gracefully ignore cases
when kmemleak_free() or kmemleak_ignore() is called without a prior
kmemleak_alloc().
Unlike kmemleak, kfence already handles this case, because,
due to its design, only a subset of allocations are served from kfence.
With this change, kfree() and kfree_rcu() can be used to free objects
that are allocated using kmalloc_nolock().
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
Link: https://patch.msgid.link/20260210044642.139482-2-harry.yoo@oracle.com
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
|
|
alloc_pid() loads pid_cachep, level and pid_max prior to taking the
lock.
It dirties idr and pid_allocated with the lock.
Some of these fields share the cacheline as is, split them up.
No change in the size of the struct.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://patch.msgid.link/20260120204820.1497002-1-mjguzik@gmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Mateusz reported performance penalties [1] during task creation because
pidfs uses pidmap_lock to add elements into the rbtree. Switch to an
rhashtable to have separate fine-grained locking and to decouple from
pidmap_lock moving all heavy manipulations outside of it.
Convert the pidfs inode-to-pid mapping from an rb-tree with seqcount
protection to an rhashtable. This removes the global pidmap_lock
contention from pidfs_ino_get_pid() lookups and allows the hashtable
insert to happen outside the pidmap_lock.
pidfs_add_pid() is split. pidfs_prepare_pid() allocates inode number and
initializes pid fields and is called inside pidmap_lock. pidfs_add_pid()
inserts pid into rhashtable and is called outside pidmap_lock. Insertion
into the rhashtable can fail and memory allocation may happen so we need
to drop the spinlock.
To guard against accidently opening an already reaped task
pidfs_ino_get_pid() uses additional checks beyond pid_vnr(). If
pid->attr is PIDFS_PID_DEAD or NULL the pid either never had a pidfd or
it already went through pidfs_exit() aka the process as already reaped.
If pid->attr is valid check PIDFS_ATTR_BIT_EXIT to figure out whether
the task has exited.
This slightly changes visibility semantics: pidfd creation is denied
after pidfs_exit() runs, which is just before the pid number is removed
from the via free_pid(). That should not be an issue though.
Link: https://lore.kernel.org/20251206131955.780557-1-mjguzik@gmail.com [1]
Link: https://patch.msgid.link/20260120-work-pidfs-rhashtable-v2-1-d593c4d0f576@kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add GPL-2.0 license id to file, replacing reference to
GPL in the header comment.
Signed-off-by: Tim Bird <tim.bird@sony.com>
Link: https://patch.msgid.link/20260117202759.692347-1-tim.bird@sony.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The shadow gmap returned by gmap_create_shadow() could get dropped
before taking the gmap->children_lock. This meant that the shadow gmap
was sometimes being used while its reference count was 0.
Fix this by taking the additional reference inside gmap_create_shadow()
while still holding gmap->children_lock, instead of afterwards.
Fixes: e38c884df921 ("KVM: s390: Switch to new gmap")
Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
|
|
It is possible that walk_guest_tables() is called on a shadow gmap that
has been removed already, in which case its parent will be NULL.
In such case, return -EAGAIN and let the callers deal with it.
Fixes: e38c884df921 ("KVM: s390: Switch to new gmap")
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
|
|
Stop using the userspace address to mark the guest page dirty.
mark_page_dirty() expects a guest frame number, but was being passed a
host virtual frame number. When slot == NULL, mark_page_dirty_in_slot()
does nothing and does not complain.
This means that in some circumstances the dirtiness of the guest page
might have been lost.
Fix by adding two fields in struct kvm_s390_adapter_int to keep the
guest addressses, and use those for mark_page_dirty().
Fixes: f65470661f36 ("KVM: s390/interrupt: do not pin adapter interrupt pages")
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
|
|
Reproducer available at [1].
The ATM send path (sendmsg -> vcc_sendmsg -> sigd_send) reads the vcc
pointer from msg->vcc and uses it directly without any validation. This
pointer comes from userspace via sendmsg() and can be arbitrarily forged:
int fd = socket(AF_ATMSVC, SOCK_DGRAM, 0);
ioctl(fd, ATMSIGD_CTRL); // become ATM signaling daemon
struct msghdr msg = { .msg_iov = &iov, ... };
*(unsigned long *)(buf + 4) = 0xdeadbeef; // fake vcc pointer
sendmsg(fd, &msg, 0); // kernel dereferences 0xdeadbeef
In normal operation, the kernel sends the vcc pointer to the signaling
daemon via sigd_enq() when processing operations like connect(), bind(),
or listen(). The daemon is expected to return the same pointer when
responding. However, a malicious daemon can send arbitrary pointer values.
Fix this by introducing find_get_vcc() which validates the pointer by
searching through vcc_hash (similar to how sigd_close() iterates over
all VCCs), and acquires a reference via sock_hold() if found.
Since struct atm_vcc embeds struct sock as its first member, they share
the same lifetime. Therefore using sock_hold/sock_put is sufficient to
keep the vcc alive while it is being used.
Note that there may be a race with sigd_close() which could mark the vcc
with various flags (e.g., ATM_VF_RELEASED) after find_get_vcc() returns.
However, sock_hold() guarantees the memory remains valid, so this race
only affects the logical state, not memory safety.
[1]: https://gist.github.com/mrpre/1ba5949c45529c511152e2f4c755b0f3
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+1f22cb1769f249df9fa0@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69039850.a70a0220.5b2ed.005d.GAE@google.com/T/
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Link: https://patch.msgid.link/20260205095501.131890-1-jiayuan.chen@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
`objtool` with Rust 1.84.0 reports:
rust/kernel.o: error: objtool: _RNvXNtNtCsaRPFapPOzLs_6kernel3str9parse_intaNtNtB2_7private12FromStrRadix14from_str_radix()
falls through to next function _RNvXNtNtCsaRPFapPOzLs_6kernel3str9parse_intaNtNtB2_7private12FromStrRadix16from_u64_negated()
This is very similar to commit c18f35e49049 ("objtool/rust: add one more
`noreturn` Rust function"), which added `from_ascii_radix_panic` for Rust
1.86.0, except that Rust 1.84.0 ends up needing `from_str_radix_panic`.
Thus add it to the list to fix the warning.
Cc: FUJITA Tomonori <fujita.tomonori@gmail.com>
Fixes: 51d9ee90ea90 ("rust: str: add radix prefixed integer parsing functions")
Reported-by: Alice Ryhl <aliceryhl@google.com>
Link: https://rust-for-linux.zulipchat.com/#narrow/channel/291565/topic/x/with/572427627
Tested-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260206204336.38462-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Wei Fang says:
====================
net: fec: improve XDP copy mode and add AF_XDP zero-copy support
This patch set optimizes the XDP copy mode logic as follows.
1. Separate the processing of RX XDP frames from fec_enet_rx_queue(),
and adds a separate function fec_enet_rx_queue_xdp() for handling XDP
frames.
2. For TX XDP packets, using the batch sending method to avoid frequent
MMIO writes.
3. Use the switch statement to check the tx_buf type instead of the
if...else... statement, making the cleanup logic of TX BD ring cleared
and more efficient.
We compared the performance of XDP copy mode before and after applying
this patch set, and the results show that the performance has improved.
Before applying this patch set.
root@imx93evk:~# ./xdp-bench tx eth0
Summary 396,868 rx/s 0 err,drop/s
Summary 396,024 rx/s 0 err,drop/s
root@imx93evk:~# ./xdp-bench drop eth0
Summary 684,781 rx/s 0 err/s
Summary 675,746 rx/s 0 err/s
root@imx93evk:~# ./xdp-bench pass eth0
Summary 208,552 rx/s 0 err,drop/s
Summary 208,654 rx/s 0 err,drop/s
root@imx93evk:~# ./xdp-bench redirect eth0 eth0
eth0->eth0 311,210 rx/s 0 err,drop/s 311,208 xmit/s
eth0->eth0 310,808 rx/s 0 err,drop/s 310,809 xmit/s
After applying this patch set.
root@imx93evk:~# ./xdp-bench tx eth0
Summary 425,778 rx/s 0 err,drop/s
Summary 426,042 rx/s 0 err,drop/s
root@imx93evk:~# ./xdp-bench drop eth0
Summary 698,351 rx/s 0 err/s
Summary 701,882 rx/s 0 err/s
root@imx93evk:~# ./xdp-bench pass eth0
Summary 210,348 rx/s 0 err,drop/s
Summary 210,016 rx/s 0 err,drop/s
root@imx93evk:~# ./xdp-bench redirect eth0 eth0
eth0->eth0 354,407 rx/s 0 err,drop/s 354,401 xmit/s
eth0->eth0 350,381 rx/s 0 err,drop/s 350,389 xmit/s
This patch set also addes the AF_XDP zero-copy support, and we tested
the performance on i.MX93 platform with xdpsock tool. The following is
the performance comparison of copy mode and zero-copy mode. It can be
seen that the performance of zero-copy mode is better than that of copy
mode.
1. MAC swap L2 forwarding
1.1 Zero-copy mode
root@imx93evk:~# ./xdpsock -i eth0 -l -z
sock0@eth0:0 l2fwd xdp-drv
pps pkts 1.00
rx 414715 415455
tx 414715 415455
1.2 Copy mode
root@imx93evk:~# ./xdpsock -i eth0 -l -c
sock0@eth0:0 l2fwd xdp-drv
pps pkts 1.00
rx 356396 356609
tx 356396 356609
2. TX only
2.1 Zero-copy mode
root@imx93evk:~# ./xdpsock -i eth0 -t -s 64 -z
sock0@eth0:0 txonly xdp-drv
pps pkts 1.00
rx 0 0
tx 1119573 1126720
2.2 Copy mode
root@imx93evk:~# ./xdpsock -i eth0 -t -s 64 -c
sock0@eth0:0 txonly xdp-drv
pps pkts 1.00
rx 0 0
tx 406864 407616
====================
Link: https://patch.msgid.link/20260205085742.2685134-1-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This patch adds AF_XDP zero-copy support for both TX and RX on the FEC
driver. It introduces new functions for XSK buffer allocation, RX/TX
queue processing in zero-copy mode, and XSK pool setup/teardown.
For RX, fec_alloc_rxq_buffers_zc() is added to allocate RX buffers from
XSK pool. And fec_enet_rx_queue_xsk() is used to process the frames from
the RX queue which is bound to the AF_XDP socket. Similar to the copy
mode, the zero-copy mode also supports XDP_TX, XDP_PASS, XDP_DROP and
XDP_REDIRECT actions. In addition, fec_enet_xsk_tx_xmit() is similar to
fec_enet_xdp_tx_xmit() and is used to handle XDP_TX action in zero-copy
mode.
For TX, there are two cases, one is the frames from the AF_XDP socket,
so fec_enet_xsk_xmit() is added to directly transmit the frames from
the socket and the buffer type is marked as FEC_TXBUF_T_XSK_XMIT. The
other one is the frames from the RX queue (XDP_TX action), the buffer
type is marked as FEC_TXBUF_T_XSK_TX. Therefore, fec_enet_tx_queue()
could correctly clean the TX queue base on the buffer type.
Also, some tests have been done on the i.MX93-EVK board with the xdpsock
tool, the following are the results.
Env: i.MX93 connects to a packet generator, the link speed is 1Gbps, and
flow-control is off. The RX packet size is 64 bytes including FCS. Only
one RX queue (CPU) is used to receive frames.
1. MAC swap L2 forwarding
1.1 Zero-copy mode
root@imx93evk:~# ./xdpsock -i eth0 -l -z
sock0@eth0:0 l2fwd xdp-drv
pps pkts 1.00
rx 414715 415455
tx 414715 415455
1.2 Copy mode
root@imx93evk:~# ./xdpsock -i eth0 -l -c
sock0@eth0:0 l2fwd xdp-drv
pps pkts 1.00
rx 356396 356609
tx 356396 356609
2. TX only
2.1 Zero-copy mode
root@imx93evk:~# ./xdpsock -i eth0 -t -s 64 -z
sock0@eth0:0 txonly xdp-drv
pps pkts 1.00
rx 0 0
tx 1119573 1126720
2.2 Copy mode
root@imx93evk:~# ./xdpsock -i eth0 -t -s 64 -c
sock0@eth0:0 txonly xdp-drv
pps pkts 1.00
rx 0 0
tx 406864 407616
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260205085742.2685134-16-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
To support AF_XDP zero-copy mode in the subsequent patch, the following
adjustments have been made to fec_tx_queue().
1. Change the parameters of fec_tx_queue().
2. Some variables are initialized at the time of declaration, and the
order of local variables is updated to follow the reverse xmas tree
style.
3. Remove the variable xdpf and add the variable tx_buf.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260205085742.2685134-15-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently, the buffers of RX queue are allocated from the page pool. In
the subsequent patches to support XDP zero copy, the RX buffers will be
allocated from the UMEM. Therefore, extract fec_alloc_rxq_buffers_pp()
from fec_enet_alloc_rxq_buffers() and we will add another helper to
allocate RX buffers from UMEM for the XDP zero copy mode. In addition,
fec_alloc_rxq_buffers_pp() only initializes bdp->bufaddr and does not
initialize other fields of bdp, because these will be initialized in
fec_enet_bd_init().
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260205085742.2685134-14-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Extract fec_xdp_rxq_info_reg() from fec_enet_create_page_pool() and move
it out of fec_enet_create_page_pool(), so that it can be reused in the
subsequent patches to support XDP zero copy mode.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260205085742.2685134-13-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Remove the size parameter from fec_enet_create_page_pool(), since
rxq->bd.ring_size already contains this information.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260205085742.2685134-12-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|