aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-sqlite.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2025-07-20dt-bindings: hwmon: national,lm90: Add missing Dallas max6654 and onsemi ↵Rob Herring (Arm)1-0/+8
nct72, nct214, and nct218 The onsemi nct72, nct214, and nct218 and Dallas/Analog max6654 temperature sensors are already supported, but not documented. Add them to the LM90 binding. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20250701-dt-hwmon-compatibles-v1-1-ad99e65cf11b@kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (w83627ehf) make the read-only arrays 'bit' static constColin Ian King1-3/+6
Don't populate the read-only arrays 'bit' on the stack at run time, instead make them static const. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20250714155505.1234012-1-colin.i.king@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (emc2305) Set initial PWM minimum value during probe based on thermal ↵Florin Leotescu1-2/+8
state Prevent the PWM value from being set to minimum when thermal zone temperature exceeds any trip point during driver probe. Otherwise, the PWM fan speed will remains at minimum speed and not respond to temperature changes. Signed-off-by: Florin Leotescu <florin.leotescu@nxp.com> Link: https://lore.kernel.org/r/20250603113125.3175103-5-florin.leotescu@oss.nxp.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (emc2305) Enable PWM polarity and output configurationFlorin Leotescu1-0/+14
Enable configuration of PWM polarity and PWM output config based Device Tree properties. Signed-off-by: Florin Leotescu <florin.leotescu@nxp.com> Link: https://lore.kernel.org/r/20250603113125.3175103-4-florin.leotescu@oss.nxp.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (emc2305) Configure PWM channels based on DT propertiesFlorin Leotescu1-22/+129
Add support for configuring each PWM channel using Device Tree (DT) properties by parsing the 'pwms' phandle arguments. Signed-off-by: Florin Leotescu <florin.leotescu@nxp.com> Link: https://lore.kernel.org/r/20250603113125.3175103-3-florin.leotescu@oss.nxp.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (emc2305) Add support for PWM frequency, polarity and outputFlorin Leotescu2-0/+12
Add three new attributes to the driver data structures to support configuration of PWM frequency, PWM polarity and PWM output config. Signed-off-by: Florin Leotescu <florin.leotescu@nxp.com> Link: https://lore.kernel.org/r/20250603113125.3175103-2-florin.leotescu@oss.nxp.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (amc6821) Add cooling device supportJoão Paulo Gonçalves1-6/+107
Add support for using the AMC6821 as a cooling device. The AMC6821 registers with the thermal framework only if the `cooling-levels` property is present in the fan device tree child node. If this property is present, the driver assumes the fan will operate in open-loop, and the kernel will control it directly. In this case, the driver will change the AMC6821 mode to manual (software DCY) and set the initial PWM duty cycle to the maximum fan cooling state level as defined in the DT. It is worth mentioning that the cooling device is registered on the child fan node, not on the fan controller node. Existing behavior is unchanged, so the AMC6821 can still be used without the thermal framework (hwmon only). Signed-off-by: João Paulo Gonçalves <joao.goncalves@toradex.com> Link: https://lore.kernel.org/r/20250613-b4-amc6821-cooling-device-support-v4-3-a8fc063c55de@toradex.com [groeck: Reduced line length when registering thermal device] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (amc6821) Move reading fan data from OF to a functionJoão Paulo Gonçalves1-8/+16
Move fan property reading from OF to a separate function. This keeps OF data handling separate from the code logic and makes it easier to add features like cooling device support that use the same fan node. Signed-off-by: João Paulo Gonçalves <joao.goncalves@toradex.com> Link: https://lore.kernel.org/r/20250613-b4-amc6821-cooling-device-support-v4-2-a8fc063c55de@toradex.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20dt-bindings: hwmon: amc6821: Add cooling levelsJoão Paulo Gonçalves1-0/+6
The fan can be used as a cooling device, add a description of the `cooling-levels` property and restrict the maximum value to 255, which is the highest PWM duty cycle supported by the AMC6821 fan controller. Signed-off-by: João Paulo Gonçalves <joao.goncalves@toradex.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20250613-b4-amc6821-cooling-device-support-v4-1-a8fc063c55de@toradex.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (ibmaem) match return type of wait_for_completion_timeoutQiushi Wu1-19/+8
Return type of wait_for_completion_timeout is unsigned long not int. Check its return value inline instead of introducing a throw-away variable. Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Qiushi Wu <qiushi@linux.ibm.com> Link: https://lore.kernel.org/r/20250613182413.1426367-1-qiushi@linux.ibm.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (ltc4282) fix copy paste on variable nameNuno Sá1-2/+2
The struct hwmon_chip_info was named ltc2947_chip_info which is obviously a copy paste leftover. Name it accordingly. Signed-off-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20250611-fix-ltc4282-repetead-write-v1-2-fe46edd08cf1@analog.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (pmbus/isl68137) Add support for RAA229621Chiang Brian1-0/+3
The RAA229621 is a digital dual output multiphase (X+Y <= 8) PWM controller designed to be compliant with AMD SVI3 specifications, targeting VDDCR_CPU and VDDCR_SOC rails. Add support for it to the isl68137 driver. Signed-off-by: Chiang Brian <chiang.brian@inventec.com> Link: https://lore.kernel.org/r/20250605040134.4012199-3-chiang.brian@inventec.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20dt-bindings: hwmon: (pmbus/isl68137) Add RAA229621 supportChiang Brian1-0/+1
Add device type support for raa229621 Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Chiang Brian <chiang.brian@inventec.com> Link: https://lore.kernel.org/r/20250605040134.4012199-2-chiang.brian@inventec.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (asus-ec-sensors) add ProArt X870E-CREATOR WIFIEugene Shalygin2-0/+29
Adds support for the ProArt X870E-CREATOR WIFI board. Signed-off-by: Eugene Shalygin <eugene.shalygin@gmail.com> Link: https://lore.kernel.org/r/20250607102626.9051-1-eugene.shalygin@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (asus-ec-sensors) add support for ROG STRIX Z490-F GAMINGRoy Seitz2-0/+33
This adds support for the ROG STRIX Z490-F GAMING board. Signed-off-by: Roy Seitz <royseitz@bluewin.ch> Signed-off-by: Eugene Shalygin <eugene.shalygin@gmail.com> Link: https://lore.kernel.org/r/20250529090222.154696-1-eugene.shalygin@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20hwmon: (gsc-hwmon) fix fan pwm setpoint show functionsTim Harvey1-2/+2
The Linux hwmon sysfs API values for pwmX_auto_pointY_pwm represent an integer value between 0 (0%) to 255 (100%) and the pwmX_auto_pointY_temp represent millidegrees Celcius. Commit a6d80df47ee2 ("hwmon: (gsc-hwmon) fix fan pwm temperature scaling") properly addressed the incorrect scaling in the pwm_auto_point_temp_store implementation but erroneously scaled the pwm_auto_point_pwm_show (pwm value) instead of the pwm_auto_point_temp_show (temp value) resulting in: # cat /sys/class/hwmon/hwmon0/pwm1_auto_point6_pwm 25500 # cat /sys/class/hwmon/hwmon0/pwm1_auto_point6_temp 4500 Fix the scaling of these attributes: # cat /sys/class/hwmon/hwmon0/pwm1_auto_point6_pwm 255 # cat /sys/class/hwmon/hwmon0/pwm1_auto_point6_temp 45000 Fixes: a6d80df47ee2 ("hwmon: (gsc-hwmon) fix fan pwm temperature scaling") Cc: stable@vger.kernel.org Signed-off-by: Tim Harvey <tharvey@gateworks.com> Link: https://lore.kernel.org/r/20250718200259.1840792-1-tharvey@gateworks.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-20Linux 6.16-rc7v6.16-rc7Linus Torvalds1-1/+1
2025-07-19Input: xpad - set correct controller type for Acer NGR200Nilton Perim Neto1-1/+1
The controller should have been set as XTYPE_XBOX360 and not XTYPE_XBOX. Also the entry is in the wrong place. Fix it. Reported-by: Vicki Pfau <vi@endrift.com> Signed-off-by: Nilton Perim Neto <niltonperimneto@gmail.com> Link: https://lore.kernel.org/r/20250708033126.26216-2-niltonperimneto@gmail.com Fixes: 22c69d786ef8 ("Input: xpad - support Acer NGR 200 Controller") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-07-19tracing: Add down_write(trace_event_sem) when adding trace eventSteven Rostedt1-0/+5
When a module is loaded, it adds trace events defined by the module. It may also need to modify the modules trace printk formats to replace enum names with their values. If two modules are loaded at the same time, the adding of the event to the ftrace_events list can corrupt the walking of the list in the code that is modifying the printk format strings and crash the kernel. The addition of the event should take the trace_event_sem for write while it adds the new event. Also add a lockdep_assert_held() on that semaphore in __trace_add_event_dirs() as it iterates the list. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/20250718223158.799bfc0c@batman.local.home Reported-by: Fusheng Huang(黄富生) <Fusheng.Huang@luxshare-ict.com> Closes: https://lore.kernel.org/all/20250717105007.46ccd18f@batman.local.home/ Fixes: 110bf2b764eb6 ("tracing: add protection around module events unload") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-18tracing/osnoise: Fix crash in timerlat_dump_stack()Tomas Glozar1-1/+1
We have observed kernel panics when using timerlat with stack saving, with the following dmesg output: memcpy: detected buffer overflow: 88 byte write of buffer size 0 WARNING: CPU: 2 PID: 8153 at lib/string_helpers.c:1032 __fortify_report+0x55/0xa0 CPU: 2 UID: 0 PID: 8153 Comm: timerlatu/2 Kdump: loaded Not tainted 6.15.3-200.fc42.x86_64 #1 PREEMPT(lazy) Call Trace: <TASK> ? trace_buffer_lock_reserve+0x2a/0x60 __fortify_panic+0xd/0xf __timerlat_dump_stack.cold+0xd/0xd timerlat_dump_stack.part.0+0x47/0x80 timerlat_fd_read+0x36d/0x390 vfs_read+0xe2/0x390 ? syscall_exit_to_user_mode+0x1d5/0x210 ksys_read+0x73/0xe0 do_syscall_64+0x7b/0x160 ? exc_page_fault+0x7e/0x1a0 entry_SYSCALL_64_after_hwframe+0x76/0x7e __timerlat_dump_stack() constructs the ftrace stack entry like this: struct stack_entry *entry; ... memcpy(&entry->caller, fstack->calls, size); entry->size = fstack->nr_entries; Since commit e7186af7fb26 ("tracing: Add back FORTIFY_SOURCE logic to kernel_stack event structure"), struct stack_entry marks its caller field with __counted_by(size). At the time of the memcpy, entry->size contains garbage from the ringbuffer, which under some circumstances is zero, triggering a kernel panic by buffer overflow. Populate the size field before the memcpy so that the out-of-bounds check knows the correct size. This is analogous to __ftrace_trace_stack(). Cc: stable@vger.kernel.org Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Attila Fazekas <afazekas@redhat.com> Link: https://lore.kernel.org/20250716143601.7313-1-tglozar@redhat.com Fixes: e7186af7fb26 ("tracing: Add back FORTIFY_SOURCE logic to kernel_stack event structure") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-18Fix SMB311 posix special file creation to servers which do not advertise ↵Steve French2-2/+4
reparse support Some servers (including Samba), support the SMB3.1.1 POSIX Extensions (which use reparse points for handling special files) but do not properly advertise file system attribute FILE_SUPPORTS_REPARSE_POINTS. Although we don't check for this attribute flag when querying special file information, we do check it when creating special files which causes them to fail unnecessarily. If we have negotiated SMB3.1.1 POSIX Extensions with the server we can expect the server to support creating special files via reparse points, and even if the server fails the operation due to really forbidding creating special files, then it should be no problem and is more likely to return a more accurate rc in any case (e.g. EACCES instead of EOPNOTSUPP). Allow creating special files as long as the server supports either reparse points or the SMB3.1.1 POSIX Extensions (note that if the "sfu" mount option is specified it uses a different way of storing special files that does not rely on reparse points). Cc: <stable@vger.kernel.org> Fixes: 6c06be908ca19 ("cifs: Check if server supports reparse points before using them") Acked-by: Ralph Boehme <slow@samba.org> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-07-18hwmon: (pmbus/ucd9000) Fix error in ucd9000_gpio_setTorben Nielsen1-4/+4
The GPIO output functionality does not work as intended. The ucd9000_gpio_set function should set UCD9000_GPIO_CONFIG_OUT_VALUE (bit 2) in order to change the output value of the selected GPIO. Instead UCD9000_GPIO_CONFIG_STATUS (bit 3) is set, but this is a read-only value. This patch fixes the mistake and provides the intended functionality of the GPIOs. See UCD90xxx Sequencer and System Health Controller PMBus Command SLVU352C section 10.43 for reference. Signed-off-by: Torben Nielsen <t8927095@gmail.com> Link: https://lore.kernel.org/r/20250718093644.356085-2-t8927095@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-18hwmon: (ina238) Report energy in microjoulesJonas Rebmann2-5/+5
The hwmon sysfs interface specifies that energy values should be reported in microjoules. This is also what tools such as lmsensors expect, reporting wrong values otherwise. Adjust the driver to scale the output accordingly and adjust ina238 driver documentation. Fixes: 6daaf15a1173 ("hwmon: (ina238) Add support for SQ52206") Signed-off-by: Jonas Rebmann <jre@pengutronix.de> Link: https://lore.kernel.org/r/20250715-hwmon-ina238-microjoules-v1-1-9df678568a41@pengutronix.de Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-07-18xfs: don't allocate the xfs_extent_busy structure for zoned RTGsChristoph Hellwig2-5/+17
Busy extent tracking is primarily used to ensure that freed blocks are not reused for data allocations before the transaction that deleted them has been committed to stable storage, and secondarily to drive online discard. None of the use cases applies to zoned RTGs, as the zoned allocator can't overwrite blocks before resetting the zone, which already flushes out all transactions touching the RTGs. So the busy extent tracking is not needed for zoned RTGs, and also not called for zoned RTGs. But somehow the code to skip allocating and freeing the structure got lost during the zoned XFS upstreaming process. This not only causes these structures to unnecessarily allocated, but can also lead to memory leaks as the xg_busy_extents pointer in the xfs_group structure is overlayed with the pointer for the linked list of to be reset zones. Stop allocating and freeing the structure to not pointlessly allocate memory which is then leaked when the zone is reset. Fixes: 080d01c41d44 ("xfs: implement zoned garbage collection") Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: <stable@vger.kernel.org> # v6.15 [cem: Fix type and add stable tag] Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-07-18efivarfs: Fix memory leak of efivarfs_fs_info in fs_context error pathsBreno Leitao1-0/+6
When processing mount options, efivarfs allocates efivarfs_fs_info (sfi) early in fs_context initialization. However, sfi is associated with the superblock and typically freed when the superblock is destroyed. If the fs_context is released (final put) before fill_super is called—such as on error paths or during reconfiguration—the sfi structure would leak, as ownership never transfers to the superblock. Implement the .free callback in efivarfs_context_ops to ensure any allocated sfi is properly freed if the fs_context is torn down before fill_super, preventing this memory leak. Suggested-by: James Bottomley <James.Bottomley@HansenPartnership.com> Fixes: 5329aa5101f73c ("efivarfs: Add uid/gid mount options") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-07-17libbpf: Fix handling of BPF arena relocationsAndrii Nakryiko1-7/+13
Initial __arena global variable support implementation in libbpf contains a bug: it remembers struct bpf_map pointer for arena, which is used later on to process relocations. Recording this pointer is problematic because map pointers are not stable during ELF relocation collection phase, as an array of struct bpf_map's can be reallocated, invalidating all the pointers. Libbpf is dealing with similar issues by using a stable internal map index, though for BPF arena map specifically this approach wasn't used due to an oversight. The resulting behavior is non-deterministic issue which depends on exact layout of ELF object file, number of actual maps, etc. We didn't hit this until very recently, when this bug started triggering crash in BPF CI when validating one of sched-ext BPF programs. The fix is rather straightforward: we just follow an established pattern of remembering map index (just like obj->kconfig_map_idx, for example) instead of `struct bpf_map *`, and resolving index to a pointer at the point where map information is necessary. While at it also add debug-level message for arena-related relocation resolution information, which we already have for all other kinds of maps. Fixes: 2e7ba4f8fd1f ("libbpf: Recognize __arena global variables.") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250718001009.610955-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-17drm/mediatek: mtk_dpi: Reorder output formats on MT8195/88Louis-Alexis Eyraud1-2/+2
Reorder output format arrays in both MT8195 DPI and DP_INTF block configuration by decreasing preference order instead of alphanumeric one, as expected by the atomic_get_output_bus_fmts callback function of drm_bridge controls, so the RGB ones are used first during the bus format negotiation process. Fixes: 20fa6a8fc588 ("drm/mediatek: mtk_dpi: Allow additional output formats on MT8195/88") Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20250606-mtk_dpi-mt8195-fix-wrong-color-v1-1-47988101b798@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2025-07-17drm/mediatek: only announce AFBC if really supportedIcenowy Zheng7-4/+27
Currently even the SoC's OVL does not declare the support of AFBC, AFBC is still announced to the userspace within the IN_FORMATS blob, which breaks modern Wayland compositors like KWin Wayland and others. Gate passing modifiers to drm_universal_plane_init() behind querying the driver of the hardware block for AFBC support. Fixes: c410fa9b07c3 ("drm/mediatek: Add AFBC support to Mediatek DRM driver") Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Reviewed-by: CK Hu <ck.hu@medaitek.com> Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20250531121140.387661-1-uwu@icenowy.me/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2025-07-17drm/mediatek: Add wait_event_timeout when disabling planeJason-JH Lin3-0/+39
Our hardware registers are set through GCE, not by the CPU. DRM might assume the hardware is disabled immediately after calling atomic_disable() of drm_plane, but it is only truly disabled after the GCE IRQ is triggered. Additionally, the cursor plane in DRM uses async_commit, so DRM will not wait for vblank and will free the buffer immediately after calling atomic_disable(). To prevent the framebuffer from being freed before the layer disable settings are configured into the hardware, which can cause an IOMMU fault error, a wait_event_timeout has been added to wait for the ddp_cmdq_cb() callback,indicating that the GCE IRQ has been triggered. Fixes: 2f965be7f900 ("drm/mediatek: apply CMDQ control flow") Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20250624113223.443274-1-jason-jh.lin@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2025-07-17btf: Fix virt_to_phys() on arm64 when mmapping BTFLorenz Bauer1-1/+1
Breno Leitao reports that arm64 emits the following warning with CONFIG_DEBUG_VIRTUAL: [ 58.896157] virt_to_phys used for non-linear address: 000000009fea9737 (__start_BTF+0x0/0x685530) [ 23.988669] WARNING: CPU: 25 PID: 1442 at arch/arm64/mm/physaddr.c:15 __virt_to_phys (arch/arm64/mm/physaddr.c:?) ... [ 24.075371] Tainted: [E]=UNSIGNED_MODULE, [N]=TEST [ 24.080276] Hardware name: Quanta S7GM 20S7GCU0010/S7G MB (CG1), BIOS 3D22 07/03/2024 [ 24.088295] pstate: 63400009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) [ 24.098440] pc : __virt_to_phys (arch/arm64/mm/physaddr.c:?) [ 24.105398] lr : __virt_to_phys (arch/arm64/mm/physaddr.c:?) ... [ 24.197257] Call trace: [ 24.199761] __virt_to_phys (arch/arm64/mm/physaddr.c:?) (P) [ 24.206883] btf_sysfs_vmlinux_mmap (kernel/bpf/sysfs_btf.c:27) [ 24.214264] sysfs_kf_bin_mmap (fs/sysfs/file.c:179) [ 24.218536] kernfs_fop_mmap (fs/kernfs/file.c:462) [ 24.222461] mmap_region (./include/linux/fs.h:? mm/internal.h:167 mm/vma.c:2405 mm/vma.c:2467 mm/vma.c:2622 mm/vma.c:2692) It seems that the memory layout on arm64 maps the kernel image in vmalloc space which is different than x86. This makes virt_to_phys emit the warning. Fix this by translating the address using __pa_symbol as suggested by Breno instead. Reported-by: Breno Leitao <leitao@debian.org> Closes: https://lore.kernel.org/bpf/g2gqhkunbu43awrofzqb4cs4sxkxg2i4eud6p4qziwrdh67q4g@mtw3d3aqfgmb/ Signed-off-by: Lorenz Bauer <lmb@isovalent.com> Tested-by: Breno Leitao <leitao@debian> Fixes: a539e2a6d51d ("btf: Allow mmap of vmlinux btf") Link: https://lore.kernel.org/r/20250717-vmlinux-mmap-pa-symbol-v1-1-970be6681158@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-17sched_ext: idle: Handle migration-disabled tasks in idle selectionAndrea Righi1-1/+1
When SCX_OPS_ENQ_MIGRATION_DISABLED is enabled, migration-disabled tasks are also routed to ops.enqueue(). A scheduler may attempt to dispatch such tasks directly to an idle CPU using the default idle selection policy via scx_bpf_select_cpu_and() or scx_bpf_select_cpu_dfl(). This scenario must be properly handled by the built-in idle policy to avoid returning an idle CPU where the target task isn't allowed to run. Otherwise, it can lead to errors such as: EXIT: runtime error (SCX_DSQ_LOCAL[_ON] cannot move migration disabled Chrome_ChildIOT[291646] from CPU 3 to 14) Prevent this by explicitly handling migration-disabled tasks in the built-in idle selection logic, maintaining their CPU affinity. Fixes: a730e3f7a48bc ("sched_ext: idle: Consolidate default idle CPU selection kfuncs") Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-07-17Revert "cgroup_freezer: cgroup_freezing: Check if not frozen"Chen Ridong1-7/+1
This reverts commit cff5f49d433fcd0063c8be7dd08fa5bf190c6c37. Commit cff5f49d433f ("cgroup_freezer: cgroup_freezing: Check if not frozen") modified the cgroup_freezing() logic to verify that the FROZEN flag is not set, affecting the return value of the freezing() function, in order to address a warning in __thaw_task. A race condition exists that may allow tasks to escape being frozen. The following scenario demonstrates this issue: CPU 0 (get_signal path) CPU 1 (freezer.state reader) try_to_freeze read freezer.state __refrigerator freezer_read update_if_frozen WRITE_ONCE(current->__state, TASK_FROZEN); ... /* Task is now marked frozen */ /* frozen(task) == true */ /* Assuming other tasks are frozen */ freezer->state |= CGROUP_FROZEN; /* freezing(current) returns false */ /* because cgroup is frozen (not freezing) */ break out __set_current_state(TASK_RUNNING); /* Bug: Task resumes running when it should remain frozen */ The existing !frozen(p) check in __thaw_task makes the WARN_ON_ONCE(freezing(p)) warning redundant. Removing this warning enables reverting the commit cff5f49d433f ("cgroup_freezer: cgroup_freezing: Check if not frozen") to resolve the issue. The warning has been removed in the previous patch. This patch revert the commit cff5f49d433f ("cgroup_freezer: cgroup_freezing: Check if not frozen") to complete the fix. Fixes: cff5f49d433f ("cgroup_freezer: cgroup_freezing: Check if not frozen") Reported-by: Zhong Jiawei<zhongjiawei1@huawei.com> Signed-off-by: Chen Ridong <chenridong@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-07-17sched,freezer: Remove unnecessary warning in __thaw_taskChen Ridong1-12/+3
Commit cff5f49d433f ("cgroup_freezer: cgroup_freezing: Check if not frozen") modified the cgroup_freezing() logic to verify that the FROZEN flag is not set, affecting the return value of the freezing() function, in order to address a warning in __thaw_task. A race condition exists that may allow tasks to escape being frozen. The following scenario demonstrates this issue: CPU 0 (get_signal path) CPU 1 (freezer.state reader) try_to_freeze read freezer.state __refrigerator freezer_read update_if_frozen WRITE_ONCE(current->__state, TASK_FROZEN); ... /* Task is now marked frozen */ /* frozen(task) == true */ /* Assuming other tasks are frozen */ freezer->state |= CGROUP_FROZEN; /* freezing(current) returns false */ /* because cgroup is frozen (not freezing) */ break out __set_current_state(TASK_RUNNING); /* Bug: Task resumes running when it should remain frozen */ The existing !frozen(p) check in __thaw_task makes the WARN_ON_ONCE(freezing(p)) warning redundant. Removing this warning enables reverting commit cff5f49d433f ("cgroup_freezer: cgroup_freezing: Check if not frozen") to resolve the issue. This patch removes the warning from __thaw_task. A subsequent patch will revert commit cff5f49d433f ("cgroup_freezer: cgroup_freezing: Check if not frozen") to complete the fix. Reported-by: Zhong Jiawei<zhongjiawei1@huawei.com> Signed-off-by: Chen Ridong <chenridong@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-07-17rxrpc: Fix to use conn aborts for conn-wide failuresDavid Howells5-19/+37
Fix rxrpc to use connection-level aborts for things that affect the whole connection, such as the service ID not matching a local service. Fixes: 57af281e5389 ("rxrpc: Tidy up abort generation infrastructure") Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-6-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17rxrpc: Fix transmission of an abort in response to an abortDavid Howells1-0/+3
Under some circumstances, such as when a server socket is closing, ABORT packets will be generated in response to incoming packets. Unfortunately, this also may include generating aborts in response to incoming aborts - which may cause a cycle. It appears this may be made possible by giving the client a multicast address. Fix this such that rxrpc_reject_packet() will refuse to generate aborts in response to aborts. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Linus Torvalds <torvalds@linux-foundation.org> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-5-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17rxrpc: Fix notification vs call-release vs recvmsgDavid Howells3-17/+18
When a call is released, rxrpc takes the spinlock and removes it from ->recvmsg_q in an effort to prevent racing recvmsg() invocations from seeing the same call. Now, rxrpc_recvmsg() only takes the spinlock when actually removing a call from the queue; it doesn't, however, take it in the lead up to that when it checks to see if the queue is empty. It *does* hold the socket lock, which prevents a recvmsg/recvmsg race - but this doesn't prevent sendmsg from ending the call because sendmsg() drops the socket lock and relies on the call->user_mutex. Fix this by firstly removing the bit in rxrpc_release_call() that dequeues the released call and, instead, rely on recvmsg() to simply discard released calls (done in a preceding fix). Secondly, rxrpc_notify_socket() is abandoned if the call is already marked as released rather than trying to be clever by setting both pointers in call->recvmsg_link to NULL to trick list_empty(). This isn't perfect and can still race, resulting in a released call on the queue, but recvmsg() will now clean that up. Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-4-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17rxrpc: Fix recv-recv race of completed callDavid Howells3-2/+21
If a call receives an event (such as incoming data), the call gets placed on the socket's queue and a thread in recvmsg can be awakened to go and process it. Once the thread has picked up the call off of the queue, further events will cause it to be requeued, and once the socket lock is dropped (recvmsg uses call->user_mutex to allow the socket to be used in parallel), a second thread can come in and its recvmsg can pop the call off the socket queue again. In such a case, the first thread will be receiving stuff from the call and the second thread will be blocked on call->user_mutex. The first thread can, at this point, process both the event that it picked call for and the event that the second thread picked the call for and may see the call terminate - in which case the call will be "released", decoupling the call from the user call ID assigned to it (RXRPC_USER_CALL_ID in the control message). The first thread will return okay, but then the second thread will wake up holding the user_mutex and, if it sees that the call has been released by the first thread, it will BUG thusly: kernel BUG at net/rxrpc/recvmsg.c:474! Fix this by just dequeuing the call and ignoring it if it is seen to be already released. We can't tell userspace about it anyway as the user call ID has become stale. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17rxrpc: Fix irq-disabled in local_bh_enable()David Howells3-4/+4
The rxrpc_assess_MTU_size() function calls down into the IP layer to find out the MTU size for a route. When accepting an incoming call, this is called from rxrpc_new_incoming_call() which holds interrupts disabled across the code that calls down to it. Unfortunately, the IP layer uses local_bh_enable() which, config dependent, throws a warning if IRQs are enabled: WARNING: CPU: 1 PID: 5544 at kernel/softirq.c:387 __local_bh_enable_ip+0x43/0xd0 ... RIP: 0010:__local_bh_enable_ip+0x43/0xd0 ... Call Trace: <TASK> rt_cache_route+0x7e/0xa0 rt_set_nexthop.isra.0+0x3b3/0x3f0 __mkroute_output+0x43a/0x460 ip_route_output_key_hash+0xf7/0x140 ip_route_output_flow+0x1b/0x90 rxrpc_assess_MTU_size.isra.0+0x2a0/0x590 rxrpc_new_incoming_peer+0x46/0x120 rxrpc_alloc_incoming_call+0x1b1/0x400 rxrpc_new_incoming_call+0x1da/0x5e0 rxrpc_input_packet+0x827/0x900 rxrpc_io_thread+0x403/0xb60 kthread+0x2f7/0x310 ret_from_fork+0x2a/0x230 ret_from_fork_asm+0x1a/0x30 ... hardirqs last enabled at (23): _raw_spin_unlock_irq+0x24/0x50 hardirqs last disabled at (24): _raw_read_lock_irq+0x17/0x70 softirqs last enabled at (0): copy_process+0xc61/0x2730 softirqs last disabled at (25): rt_add_uncached_list+0x3c/0x90 Fix this by moving the call to rxrpc_assess_MTU_size() out of rxrpc_init_peer() and further up the stack where it can be done without interrupts disabled. It shouldn't be a problem for rxrpc_new_incoming_call() to do it after the locks are dropped as pmtud is going to be performed by the I/O thread - and we're in the I/O thread at this point. Fixes: a2ea9a907260 ("rxrpc: Use irq-disabling spinlocks between app and I/O thread") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-2-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptyingWilliam Liu1-0/+26
Ensure that any deactivation and row emptying that occurs during htb_dequeue_tree does not cause a kernel panic. This scenario originally triggered a kernel BUG_ON, and we are checking for a graceful fail now. Signed-off-by: William Liu <will@willsroot.io> Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io> Link: https://patch.msgid.link/20250717022912.221426-1-will@willsroot.io Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtreeWilliam Liu1-1/+3
htb_lookup_leaf has a BUG_ON that can trigger with the following: tc qdisc del dev lo root tc qdisc add dev lo root handle 1: htb default 1 tc class add dev lo parent 1: classid 1:1 htb rate 64bit tc qdisc add dev lo parent 1:1 handle 2: netem tc qdisc add dev lo parent 2:1 handle 3: blackhole ping -I lo -c1 -W0.001 127.0.0.1 The root cause is the following: 1. htb_dequeue calls htb_dequeue_tree which calls the dequeue handler on the selected leaf qdisc 2. netem_dequeue calls enqueue on the child qdisc 3. blackhole_enqueue drops the packet and returns a value that is not just NET_XMIT_SUCCESS 4. Because of this, netem_dequeue calls qdisc_tree_reduce_backlog, and since qlen is now 0, it calls htb_qlen_notify -> htb_deactivate -> htb_deactiviate_prios -> htb_remove_class_from_row -> htb_safe_rb_erase 5. As this is the only class in the selected hprio rbtree, __rb_change_child in __rb_erase_augmented sets the rb_root pointer to NULL 6. Because blackhole_dequeue returns NULL, netem_dequeue returns NULL, which causes htb_dequeue_tree to call htb_lookup_leaf with the same hprio rbtree, and fail the BUG_ON The function graph for this scenario is shown here: 0) | htb_enqueue() { 0) + 13.635 us | netem_enqueue(); 0) 4.719 us | htb_activate_prios(); 0) # 2249.199 us | } 0) | htb_dequeue() { 0) 2.355 us | htb_lookup_leaf(); 0) | netem_dequeue() { 0) + 11.061 us | blackhole_enqueue(); 0) | qdisc_tree_reduce_backlog() { 0) | qdisc_lookup_rcu() { 0) 1.873 us | qdisc_match_from_root(); 0) 6.292 us | } 0) 1.894 us | htb_search(); 0) | htb_qlen_notify() { 0) 2.655 us | htb_deactivate_prios(); 0) 6.933 us | } 0) + 25.227 us | } 0) 1.983 us | blackhole_dequeue(); 0) + 86.553 us | } 0) # 2932.761 us | qdisc_warn_nonwc(); 0) | htb_lookup_leaf() { 0) | BUG_ON(); ------------------------------------------ The full original bug report can be seen here [1]. We can fix this just by returning NULL instead of the BUG_ON, as htb_dequeue_tree returns NULL when htb_lookup_leaf returns NULL. [1] https://lore.kernel.org/netdev/pF5XOOIim0IuEfhI-SOxTgRvNoDwuux7UHKnE_Y5-zVd4wmGvNk2ceHjKb8ORnzw0cGwfmVu42g9dL7XyJLf1NEzaztboTWcm0Ogxuojoeo=@willsroot.io/ Fixes: 512bb43eb542 ("pkt_sched: sch_htb: Optimize WARN_ONs in htb_dequeue_tree() etc.") Signed-off-by: William Liu <will@willsroot.io> Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io> Link: https://patch.msgid.link/20250717022816.221364-1-will@willsroot.io Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17net: bridge: Do not offload IGMP/MLD messagesJoseph Huang1-0/+3
Do not offload IGMP/MLD messages as it could lead to IGMP/MLD Reports being unintentionally flooded to Hosts. Instead, let the bridge decide where to send these IGMP/MLD messages. Consider the case where the local host is sending out reports in response to a remote querier like the following: mcast-listener-process (IP_ADD_MEMBERSHIP) \ br0 / \ swp1 swp2 | | QUERIER SOME-OTHER-HOST In the above setup, br0 will want to br_forward() reports for mcast-listener-process's group(s) via swp1 to QUERIER; but since the source hwdom is 0, the report is eligible for tx offloading, and is flooded by hardware to both swp1 and swp2, reaching SOME-OTHER-HOST as well. (Example and illustration provided by Tobias.) Fixes: 472111920f1c ("net: bridge: switchdev: allow the TX data plane forwarding to be offloaded") Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250716153551.1830255-1-Joseph.Huang@garmin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17selftests: Add test cases for vlan_filter modification during runtimeDong Chenchen1-12/+86
Add test cases for vlan_filter modification during runtime, which may triger null-ptr-ref or memory leak of vlan0. Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Link: https://patch.msgid.link/20250716034504.2285203-3-dongchenchen2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtimeDong Chenchen2-9/+34
Assuming the "rx-vlan-filter" feature is enabled on a net device, the 8021q module will automatically add or remove VLAN 0 when the net device is put administratively up or down, respectively. There are a couple of problems with the above scheme. The first problem is a memory leak that can happen if the "rx-vlan-filter" feature is disabled while the device is running: # ip link add bond1 up type bond mode 0 # ethtool -K bond1 rx-vlan-filter off # ip link del dev bond1 When the device is put administratively down the "rx-vlan-filter" feature is disabled, so the 8021q module will not remove VLAN 0 and the memory will be leaked [1]. Another problem that can happen is that the kernel can automatically delete VLAN 0 when the device is put administratively down despite not adding it when the device was put administratively up since during that time the "rx-vlan-filter" feature was disabled. null-ptr-unref or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance if toggling filtering during runtime: $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ip link del vlan0 Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //vlan_info is null Fix both problems by noting in the VLAN info whether VLAN 0 was automatically added upon NETDEV_UP and based on that decide whether it should be deleted upon NETDEV_DOWN, regardless of the state of the "rx-vlan-filter" feature. [1] unreferenced object 0xffff8880068e3100 (size 256): comm "ip", pid 384, jiffies 4296130254 hex dump (first 32 bytes): 00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 81ce31fa): __kmalloc_cache_noprof+0x2b5/0x340 vlan_vid_add+0x434/0x940 vlan_device_event.cold+0x75/0xa8 notifier_call_chain+0xca/0x150 __dev_notify_flags+0xe3/0x250 rtnl_configure_link+0x193/0x260 rtnl_newlink_create+0x383/0x8e0 __rtnl_newlink+0x22c/0xa40 rtnl_newlink+0x627/0xb00 rtnetlink_rcv_msg+0x6fb/0xb70 netlink_rcv_skb+0x11f/0x350 netlink_unicast+0x426/0x710 netlink_sendmsg+0x75a/0xc20 __sock_sendmsg+0xc1/0x150 ____sys_sendmsg+0x5aa/0x7b0 ___sys_sendmsg+0xfc/0x180 [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 #61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Fixes: ad1afb003939 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250716034504.2285203-2-dongchenchen2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17tls: always refresh the queue when reading sockJakub Kicinski1-2/+1
After recent changes in net-next TCP compacts skbs much more aggressively. This unearthed a bug in TLS where we may try to operate on an old skb when checking if all skbs in the queue have matching decrypt state and geometry. BUG: KASAN: slab-use-after-free in tls_strp_check_rcv+0x898/0x9a0 [tls] (net/tls/tls_strp.c:436 net/tls/tls_strp.c:530 net/tls/tls_strp.c:544) Read of size 4 at addr ffff888013085750 by task tls/13529 CPU: 2 UID: 0 PID: 13529 Comm: tls Not tainted 6.16.0-rc5-virtme Call Trace: kasan_report+0xca/0x100 tls_strp_check_rcv+0x898/0x9a0 [tls] tls_rx_rec_wait+0x2c9/0x8d0 [tls] tls_sw_recvmsg+0x40f/0x1aa0 [tls] inet_recvmsg+0x1c3/0x1f0 Always reload the queue, fast path is to have the record in the queue when we wake, anyway (IOW the path going down "if !strp->stm.full_len"). Fixes: 0d87bbd39d7f ("tls: strp: make sure the TCP skbs do not have overlapping data") Link: https://patch.msgid.link/20250716143850.1520292-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17virtio-net: fix recursived rtnl_lock() during probe()Zigit Zo1-1/+1
The deadlock appears in a stack trace like: virtnet_probe() rtnl_lock() virtio_config_changed_work() netdev_notify_peers() rtnl_lock() It happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing. The config_work in probe() will get scheduled until virtnet_open() enables the config change notification via virtio_config_driver_enable(). Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20250716115717.1472430-1-zuozhijie@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17net/mlx5: Update the list of the PCI supported devicesMaor Gottlieb1-0/+1
Add the upcoming ConnectX-10 device ID to the table of supported PCI device IDs. Cc: stable@vger.kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1752650969-148501-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17hv_netvsc: Set VF priv_flags to IFF_NO_ADDRCONF before open to prevent IPv6 ↵Li Tian1-1/+4
addrconf Set an additional flag IFF_NO_ADDRCONF to prevent ipv6 addrconf. Commit under Fixes added a new flag change that was not made to hv_netvsc resulting in the VF being assinged an IPv6. Fixes: 8a321cf7becc ("net: add IFF_NO_ADDRCONF and use it in bonding to prevent ipv6 addrconf") Suggested-by: Cathy Avery <cavery@redhat.com> Signed-off-by: Li Tian <litian@redhat.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/20250716002607.4927-1-litian@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept()Nathan Chancellor1-1/+1
A new warning in clang [1] points out a place in pep_sock_accept() where dst is uninitialized then passed as a const pointer to pep_find_pipe(): net/phonet/pep.c:829:37: error: variable 'dst' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer] 829 | newsk = pep_find_pipe(&pn->hlist, &dst, pipe_handle); | ^~~: Move the call to pn_skb_get_dst_sockaddr(), which initializes dst, to before the call to pep_find_pipe(), so that dst is consistently used initialized throughout the function. Cc: stable@vger.kernel.org Fixes: f7ae8d59f661 ("Phonet: allocate sock from accept syscall rather than soft IRQ") Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d441f19b319e [1] Closes: https://github.com/ClangBuiltLinux/linux/issues/2101 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://patch.msgid.link/20250715-net-phonet-fix-uninit-const-pointer-v1-1-8efd1bd188b3@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17Bluetooth: L2CAP: Fix attempting to adjust outgoing MTULuiz Augusto von Dentz1-5/+21
Configuration request only configure the incoming direction of the peer initiating the request, so using the MTU is the other direction shall not be used, that said the spec allows the peer responding to adjust: Bluetooth Core 6.1, Vol 3, Part A, Section 4.5 'Each configuration parameter value (if any is present) in an L2CAP_CONFIGURATION_RSP packet reflects an ‘adjustment’ to a configuration parameter value that has been sent (or, in case of default values, implied) in the corresponding L2CAP_CONFIGURATION_REQ packet.' That said adjusting the MTU in the response shall be limited to ERTM channels only as for older modes the remote stack may not be able to detect the adjustment causing it to silently drop packets. Link: https://github.com/bluez/bluez/issues/1422 Link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/149 Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4793 Fixes: 042bb9603c44 ("Bluetooth: L2CAP: Fix L2CAP MTU negotiation") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-07-17KVM: TDX: Don't report base TDVMCALLsXiaoyao Li1-2/+0
Remove TDVMCALLINFO_GET_QUOTE from user_tdvmcallinfo_1_r11 reported to userspace to align with the direction of the GHCI spec. Recently, concern was raised about a gap in the GHCI spec that left ambiguity in how to expose to the guest that only a subset of GHCI TDVMCalls were supported. During the back and forth on the spec details[0], <GetQuote> was moved from an individually enumerable TDVMCall, to one that is part of the 'base spec', meaning it doesn't have a specific bit in the <GetTDVMCallInfo> return values. Although the spec[1] is still in draft form, the GetQoute part has been agreed by the major TDX VMMs. Unfortunately the commits that were upstreamed still treat <GetQuote> as individually enumerable. They set bit 0 in the user_tdvmcallinfo_1_r11 which is reported to userspace to tell supported optional TDVMCalls, intending to say that <GetQuote> is supported. So stop reporting <GetQute> in user_tdvmcallinfo_1_r11 to align with the direction of the spec, and allow some future TDVMCall to use that bit. [0] https://lore.kernel.org/all/aEmuKII8FGU4eQZz@google.com/ [1] https://cdrdv2.intel.com/v1/dl/getContent/858626 Fixes: 28224ef02b56 ("KVM: TDX: Report supported optional TDVMCALLs in TDX capabilities") Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20250717022010.677645-1-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>