| Age | Commit message (Collapse) | Author | Files | Lines |
|
KHO test stores physical addresses of the preserved folios directly in
fdt. Use kho_preserve_vmalloc() instead of it and kho_restore_vmalloc()
to retrieve the addresses after kexec.
This makes the test more scalable from one side and adds tests coverage
for kho_preserve_vmalloc() from the other.
Link: https://lkml.kernel.org/r/20250921054458.4043761-5-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
A vmalloc allocation is preserved using binary structure similar to global
KHO memory tracker. It's a linked list of pages where each page is an
array of physical address of pages in vmalloc area.
kho_preserve_vmalloc() hands out the physical address of the head page to
the caller. This address is used as the argument to kho_vmalloc_restore()
to restore the mapping in the vmalloc address space and populate it with
the preserved pages.
[pasha.tatashin@soleen.com: free chunks using free_page() not kfree()]
Link: https://lkml.kernel.org/r/mafs0a52idbeg.fsf@kernel.org
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/20250921054458.4043761-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
to make it clear that KHO operates on pages rather than on a random
physical address.
The kho_preserve_pages() will be also used in upcoming support for vmalloc
preservation.
Link: https://lkml.kernel.org/r/20250921054458.4043761-3-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "kho: add support for preserving vmalloc allocations", v5.
Following the discussion about preservation of memfd with LUO [1] these
patches add support for preserving vmalloc allocations.
Any KHO uses case presumes that there's a data structure that lists
physical addresses of preserved folios (and potentially some additional
metadata). Allowing vmalloc preservations with KHO allows scalable
preservation of such data structures.
For instance, instead of allocating array describing preserved folios in
the fdt, memfd preservation can use vmalloc:
preserved_folios = vmalloc_array(nr_folios, sizeof(*preserved_folios));
memfd_luo_preserve_folios(preserved_folios, folios, nr_folios);
kho_preserve_vmalloc(preserved_folios, &folios_info);
This patch (of 4):
Instead of checking if kho is finalized in each caller of
__kho_preserve_order(), do it in the core function itself.
Link: https://lkml.kernel.org/r/20250921054458.4043761-1-rppt@kernel.org
Link: https://lkml.kernel.org/r/20250921054458.4043761-2-rppt@kernel.org
Link: https://lore.kernel.org/all/20250807014442.3829950-30-pasha.tatashin@soleen.com [1]
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Map my new professional email address in .mailmap and also update
the IMX283 driver entry with the same.
At the same time, update my role to a reviewer for the IMX283 driver.
Link: https://lkml.kernel.org/r/20250929074025.84407-1-uajain@igalia.com
Signed-off-by: Umang Jain <uajain@igalia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
In `nouveau_bo_move_prep`, if `nouveau_mem_map` fails, an error code
should be returned. Currently, it returns zero even if vmm addr is not
correctly mapped.
Cc: stable@vger.kernel.org
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Fixes: 9ce523cc3bf2 ("drm/nouveau: separate buffer object backing memory from nvkm structures")
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:
- tools headers: rename missed CONFIG_CFI_CLANG in merge (Carlos
Llamas)
- kconfig: Avoid prompting for transitional symbols
* tag 'hardening-fix1-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
tools headers: kcfi: rename missed CONFIG_CFI_CLANG
kconfig: Avoid prompting for transitional symbols
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski:
"Two small fixes for the recently performed code refactoring (Shigeru
Yoshida) and missing handling of direction parameter in DMA debug code
(Petr Tesarik)"
* tag 'dma-mapping-6.18-2025-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-mapping: fix direction in dma_alloc direction traces
kmsan: fix kmsan_handle_dma() to avoid false positives
|
|
cifs_swn_send_register_message()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* Return directly after a call of the function “genlmsg_new” failed
at the beginning.
* Delete the label “fail” which became unnecessary
with this refactoring.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Here are some small nvmem and fastrpc fixes that missed the cut-off to
get into 6.17-final, due to me being slow in getting them out, my
fault, not the maintainers of these subsystems :(
Anyway, better late than never. Changes included in here are:
- nvmem fix for automatic module loading
- fastrpc driver fixes for reported issues
All of these have been in linux-next for weeks (4?) with no reported
issues"
* tag 'char-misc-6.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
misc: fastrpc: Skip reference for DMA handles
misc: fastrpc: fix possible map leak in fastrpc_put_args
misc: fastrpc: Fix fastrpc_map_lookup operation
misc: fastrpc: Save actual DMA size in fastrpc_map structure
nvmem: layouts: fix automatic module loading
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are some staging driver fixes that missed 6.17-final due to my
travel schedule. They fix a number of reported issues in the axis-fifo
driver, one of which was just independently discovered by someone else
today so someone is looking at this code.
All of these fixes have been in linux-next for many weeks with no
reported issues"
* tag 'staging-6.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: axis-fifo: flush RX FIFO on read errors
staging: axis-fifo: fix TX handling on copy_from_user() failure
staging: axis-fifo: fix maximum TX packet length check
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty driver fix from Greg KH:
"Here is a single driver fix for the qcom_geni_serial driver. It has
been in my tree for weeks, but missed being sent to you for 6.17-final
due to travel on my side.
This fixes a reported regression for this driver that prevents 6.17
from working properly on this platform.
It has been in linux-next for many weeks with no reported issues"
* tag 'tty-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: qcom-geni: Fix blocked task
|
|
Use a label once more so that a bit of common code can be better reused
at the end of this function implementation.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Enzo Matsumiya <ematsumiya@suse.de>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
"Fix RZ/G3E driver introduction fall-out (Geert Uytterhoeven) and
improve the compilation and installation of the thermal library for
user space (Emil Dahl Juhl and Sascha Hauer)"
* tag 'thermal-6.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
tools: lib: thermal: expose thermal_exit symbols
tools: lib: thermal: don't preserve owner in install
tools: lib: thermal: use pkg-config to locate libnl3
thermal: renesas: Fix RZ/G3E fall-out
|
|
[WHY]
hinit/vinit are incorrect in the case of mirroring.
[HOW]
Cositing sign must be flipped when image is mirrored in the vertical
or horizontal direction.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Samson Tam <samson.tam@amd.com>
Signed-off-by: Jesse Agate <jesse.agate@amd.com>
Signed-off-by: Brendan Leder <breleder@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHAT]
Since dcn35, DTBCLK can be disabled when no DP2 sink connected for
power saving purpose.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If reinitialization of one of the GPUs fails after reset, it logs
failure on all subsequent GPUs eventhough they have resumed
successfully.
A sample log where only device at 0000:95:00.0 had a failure -
amdgpu 0000:15:00.0: amdgpu: GPU reset(19) succeeded!
amdgpu 0000:65:00.0: amdgpu: GPU reset(19) succeeded!
amdgpu 0000:75:00.0: amdgpu: GPU reset(19) succeeded!
amdgpu 0000:85:00.0: amdgpu: GPU reset(19) succeeded!
amdgpu 0000:95:00.0: amdgpu: GPU reset(19) failed
amdgpu 0000:e5:00.0: amdgpu: GPU reset(19) failed
amdgpu 0000:f5:00.0: amdgpu: GPU reset(19) failed
amdgpu 0000:05:00.0: amdgpu: GPU reset(19) failed
amdgpu 0000:15:00.0: amdgpu: GPU reset end with ret = -5
To avoid confusion, report the error for each device
separately and return the first error as the overall result.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The CI systems are pointing out list corruptions, so we still need to
fix something here.
Keep the asserts, but revert the lock changes for now.
Fixes: 59e4405e9ee2 ("drm/amdgpu: revert to old status lock handling v3")
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The point of isolating code that uses kernel mode FPU in separate
compilation units is to ensure that even implicit uses of, e.g., SIMD
registers for spilling occur only in a context where this is permitted,
i.e., from inside a kernel_fpu_begin/end block.
This is important on arm64, which uses -mgeneral-regs-only to build all
kernel code, with the exception of such compilation units where FP or
SIMD registers are expected to be used. Given that the compiler may
invent uses of FP/SIMD anywhere in such a unit, none of its code may be
accessible from outside a kernel_fpu_begin/end block.
This means that all callers into such compilation units must use the
DC_FP start/end macros, which must not occur there themselves. For
robustness, all functions with external linkage that reside there should
call dc_assert_fp_enabled() to assert that the FPU context was set up
correctly.
Fix this for the DCN35, DCN351 and DCN36 implementations.
Cc: Austin Zheng <austin.zheng@amd.com>
Cc: Jun Lei <jun.lei@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Rodrigo Siqueira <siqueira@igalia.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Disable VCN reset capability for the program 4 as it's
causing regressions.
Fixes: 9d20f37a106f ("drm/amd/pm: Add VCN reset support for SMU v13.0.6")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
After GPU reset with VRAM loss, a general protection fault occurs
during user queue restoration when accessing vm_bo->vm after
spinlock release in amdgpu_vm_bo_reset_state_machine.
The root cause is that vm_bo points to the last entry from the
list_for_each_entry loop, but this becomes invalid after the
spinlock is released. Accessing vm_bo->vm at this point leads
to memory corruption.
Crash log shows:
[ 326.981811] Oops: general protection fault, probably for non-canonical address 0x4156415741e58ac8: 0000 [#1] SMP NOPTI
[ 326.981820] CPU: 13 UID: 0 PID: 1035 Comm: kworker/13:3 Tainted: G E 6.16.0+ #25 PREEMPT(voluntary)
[ 326.981826] Tainted: [E]=UNSIGNED_MODULE
[ 326.981827] Hardware name: Gigabyte Technology Co., Ltd. X870E AORUS PRO ICE/X870E AORUS PRO ICE, BIOS F3i 12/19/2024
[ 326.981831] Workqueue: events amdgpu_userq_restore_worker [amdgpu]
[ 326.981999] RIP: 0010:amdgpu_vm_assert_locked+0x16/0x70 [amdgpu]
[ 326.982094] Code: 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 85 ff 74 45 48 8b 87 80 03 00 00 48 85 c0 74 40 <48> 8b b8 80 01 00 00 48 85 ff 74 3b 8b 05 0c b7 0e f0 85 c0 75 05
[ 326.982098] RSP: 0018:ffffaa91c2a6bc20 EFLAGS: 00010206
[ 326.982100] RAX: 4156415741e58948 RBX: ffff9e8f013e8330 RCX: 0000000000000000
[ 326.982102] RDX: 0000000000000005 RSI: 000000001d254e88 RDI: ffffffffc144814a
[ 326.982104] RBP: ffffaa91c2a6bc68 R08: 0000004c21a25674 R09: 0000000000000001
[ 326.982106] R10: 0000000000000001 R11: dccaf3f2f82863fc R12: ffff9e8f013e8000
[ 326.982108] R13: ffff9e8f013e8000 R14: 0000000000000000 R15: ffff9e8f09980000
[ 326.982110] FS: 0000000000000000(0000) GS:ffff9e9e79995000(0000) knlGS:0000000000000000
[ 326.982112] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 326.982114] CR2: 000055ed6c9caa80 CR3: 0000000797060000 CR4: 0000000000750ef0
[ 326.982116] PKRU: 55555554
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
For saving switch state, check if the GPU is having SWUS/DS
architecture. Otherwise, skip saving.
Reported-by: Roman Elshin <roman.elshin@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4602
Fixes: 1dd2fa0e00f1 ("drm/amdgpu: Save and restore switch state")
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Queue read and write pointers are "to KFD", not "from KFD".
Suggested-by: Robert Liu <robert.liu@amd.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Robert Liu <robert.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
PMFW interface version is not used by some IP implementations like SMU
v13.0.6/12, instead rely on PMFW version checks. Avoid the log if
interface version is not used.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
As KFD no longer uses a separate PASID, the global amdgpu_vm_set_pasid()function is no longer necessary.
Merge its functionality directly intoamdgpu_vm_init() to simplify code flow and eliminate redundant locking.
v2: remove superflous check
adjust amdgpu_vm_fin and remove amdgpu_vm_set_pasid (Chritian)
v3: drop amdgpu_vm_assert_locked (Chritian)
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4614
Fixes: 59e4405e9ee2 ("drm/amdgpu: revert to old status lock handling v3")
Reviewed-by: Christian König <christian.koenig@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
MES version 0x83 is not stable to use the inv_tlbs API. Defer it to 0x84 vertsion.
Fixes: 85442bac8466 ("drm/amd/amdgpu: Fix the mes version that support inv_tlbs")
Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com>
Reviewed-by: Michael Chen <michael.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Not all renoir hardware supports secure display. If the TA is present
but the feature isn't supported it will fail to load or send commands.
This shows ERR messages to the user that make it seems like there is
a problem.
[How]
Check the resp_status of the context to see if there was an error
before trying to send any secure display commands.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1415
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If mmap write lock is taken while draining retry fault, mmap write lock
is not released because svm_range_restore_pages calls mmap_read_unlock
then returns. This causes deadlock and system hangs later because mmap
read or write lock cannot be taken.
Downgrade mmap write lock to read lock if draining retry fault fix this
bug.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
kfd_lookup_process_by_pid hold the kfd process reference to ensure it
doesn't get destroyed while sending the segfault event to user space.
Calling kfd_lookup_process_by_pid as function parameter leaks the kfd
process refcount and miss the NULL pointer check if app process is
already destroyed.
Fixes: 2d274bf7099b ("amd/amdkfd: Trigger segfault for early userptr unmmapping")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There is some probability that reset workqueue is blocked by KIQ I/O for 10+ seconds after gpu hangs.
So we need to add a in_reset check during each KIQ register poll.
Signed-off-by: Heng Zhou <Heng.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Scaling doesn't work on DCE6 at the moment, the current
register programming produces incorrect output when using
fractional scaling (between 100-200%) on resolutions higher
than 1080p.
Disable it until we figure out how to program it properly.
Fixes: 7c15fd86aaec ("drm/amd/display: dc/dce: add initial DCE6 support (v10)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
SCL_SCALER_ENABLE can be used to enable/disable the scaler
on DCE6. Program it to 0 when scaling isn't used, 1 when used.
Additionally, clear some other registers when scaling is
disabled and program the SCL_UPDATE register as recommended.
This fixes visible glitches for users whose BIOS sets up a
mode with scaling at boot, which DC was unable to clean up.
Fixes: b70aaf5586f2 ("drm/amd/display: dce_transform: add DCE6 specific macros,functions")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Previously, the code would set a bit field which didn't exist
on DCE6 so it would be effectively a no-op.
Fixes: b70aaf5586f2 ("drm/amd/display: dce_transform: add DCE6 specific macros,functions")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Without these, it's impossible to program these registers.
Fixes: 102b2f587ac8 ("drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init (v2)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes: 102b2f587ac8 ("drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init (v2)")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
syzbot reported the lockdep splat below in __kmem_cache_release(). [0]
The problem is that __kmem_cache_release() could be called from
do_kmem_cache_create() before init_kmem_cache_cpus() registers
the lockdep key.
Let's perform lockdep_unregister_key() only when init_kmem_cache_cpus()
has been done, which we can determine by checking s->cpu_slab
[0]:
WARNING: CPU: 1 PID: 6128 at kernel/locking/lockdep.c:6606 lockdep_unregister_key+0x2ca/0x310 kernel/locking/lockdep.c:6606
Modules linked in:
CPU: 1 UID: 0 PID: 6128 Comm: syz.4.21 Not tainted syzkaller #0 PREEMPT_{RT,(full)}
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025
RIP: 0010:lockdep_unregister_key+0x2ca/0x310 kernel/locking/lockdep.c:6606
Code: 50 e4 0f 48 3b 44 24 10 0f 84 26 fe ff ff e8 bd cd 17 09 e8 e8 ce 17 09 41 f7 c7 00 02 00 00 74 bd fb 40 84 ed 75 bc eb cd 90 <0f> 0b 90 e9 19 ff ff ff 90 0f 0b 90 e9 2a ff ff ff 48 c7 c7 d0 ac
RSP: 0018:ffffc90003e870d0 EFLAGS: 00010002
RAX: eb1525397f5bdf00 RBX: ffff88803c121148 RCX: 1ffff920007d0dfc
RDX: 0000000000000000 RSI: ffffffff8acb1500 RDI: ffffffff8b1dd0e0
RBP: 00000000ffffffea R08: ffffffff8eb5aa37 R09: 1ffffffff1d6b546
R10: dffffc0000000000 R11: fffffbfff1d6b547 R12: 0000000000000000
R13: ffff88814d1b8900 R14: 0000000000000000 R15: 0000000000000203
FS: 00007f773f75e6c0(0000) GS:ffff88812712f000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffdaea3af52 CR3: 000000003a5ca000 CR4: 00000000003526f0
Call Trace:
<TASK>
__kmem_cache_release+0xe3/0x1e0 mm/slub.c:7696
do_kmem_cache_create+0x74e/0x790 mm/slub.c:8575
create_cache mm/slab_common.c:242 [inline]
__kmem_cache_create_args+0x1ce/0x330 mm/slab_common.c:340
nfsd_file_cache_init+0x1d6/0x530 fs/nfsd/filecache.c:816
nfsd_startup_generic fs/nfsd/nfssvc.c:282 [inline]
nfsd_startup_net fs/nfsd/nfssvc.c:377 [inline]
nfsd_svc+0x393/0x900 fs/nfsd/nfssvc.c:786
nfsd_nl_threads_set_doit+0x84a/0x960 fs/nfsd/nfsctl.c:1639
genl_family_rcv_msg_doit+0x212/0x300 net/netlink/genetlink.c:1115
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0x60e/0x790 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2552
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
netlink_unicast+0x846/0xa10 net/netlink/af_netlink.c:1346
netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1896
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg+0x219/0x270 net/socket.c:742
____sys_sendmsg+0x508/0x820 net/socket.c:2630
___sys_sendmsg+0x21f/0x2a0 net/socket.c:2684
__sys_sendmsg net/socket.c:2716 [inline]
__do_sys_sendmsg net/socket.c:2721 [inline]
__se_sys_sendmsg net/socket.c:2719 [inline]
__x64_sys_sendmsg+0x1a1/0x260 net/socket.c:2719
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f77400eeec9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f773f75e038 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f7740345fa0 RCX: 00007f77400eeec9
RDX: 0000000000008004 RSI: 0000200000000180 RDI: 0000000000000006
RBP: 00007f7740171f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f7740346038 R14: 00007f7740345fa0 R15: 00007ffce616f8d8
</TASK>
[alexei.starovoitov@gmail.com: simplify the fix]
Link: https://lore.kernel.org/all/20251007052534.2776661-1-kuniyu@google.com/
Fixes: 83382af9ddc3 ("slab: Make slub local_(try)lock more precise for LOCKDEP")
Reported-by: syzbot+a6f4d69b9b23404bbabf@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e4a3d1.a00a0220.298cc0.0471.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
|
|
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
The reproducer:
echo | ./usr/gen_init_cpio /dev/stdin > /dev/null
fsync() on a pipe fd returns -EINVAL, which makes gen_init_cpio fail.
Ignore -EINVAL from fsync().
Fixes: ae18b94099b0 ("gen_init_cpio: support -o <output_file> parameter")
Cc: David Disseldorp <ddiss@suse.de>
Cc: Nicolas Schier <nsc@kernel.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Link: https://patch.msgid.link/20251007-gen_init_cpio-pipe-v2-1-b098ab94b58a@arista.com
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
|
|
Commit 27758d8c2583 ("kbuild: enable -Werror for hostprogs")
unconditionally enabled -Werror for the compiler, assembler, and linker
when building the host programs, as the build footprint of the host
programs is small (thus risk of build failures from warnings are low)
and that stage of the build may not have Kconfig values (thus
CONFIG_WERROR could not be used as a precondition).
While turning warnings into errors unconditionally happens in a few
places within the kernel, it can be disruptive to people who may be
building with newer compilers, such as while doing a bisect. While it is
possible to avoid this behavior by passing HOSTCFLAGS=-w or
HOSTCFLAGS=-Wno-error, it may not be the most intuitive for regular
users not intimately familiar with Kbuild.
Avoid being disruptive to the entire build by depending on the explicit
opt-in of CONFIG_WERROR or W=e to enable -Werror and the like while
building the host programs. While this means there is a small portion of
the build that does not have -Werror enabled (namely scripts/kconfig/*
and scripts/basic/fixdep), it is better than not having it altogether.
Fixes: 27758d8c2583 ("kbuild: enable -Werror for hostprogs")
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Reported-by: Askar Safin <safinaskar@gmail.com>
Closes: https://lore.kernel.org/20251005011100.1035272-1-safinaskar@gmail.com/
Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Tested-by: Miguel Ojeda <ojeda@kernel.org> # Rust
Link: https://patch.msgid.link/20251006-kbuild-hostprogs-werror-fix-v1-1-23cf1ffced5c@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"These fix a driver bug, clean up two pieces of code and improve the
fwnode API consistency:
- Add missing synchronization between interface updates in the ACPI
battery driver (Rafael Wysocki)
- Remove open coded check for cpu_feature_enabled() from
acpi_processor_power_init_bm_check() (Mario Limonciello)
- Remove redundant rcu_read_lock/unlock() under spinlock from
ghes_notify_hed() in the ACPI APEI support code (pengdonglin)
- Make the .get_next_child_node() callback in the ACPI fwnode backend
skip ACPI devices that are not present for consistency with the
analogous callback in the OF fwnode backend (Sakari Ailus)"
* tag 'acpi-6.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: property: Return present device nodes only on fwnode interface
ACPI: APEI: Remove redundant rcu_read_lock/unlock() under spinlock
ACPI: battery: Add synchronization between interface updates
x86/acpi/cstate: Remove open coded check for cpu_feature_enabled()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These are cpufreq fixes and cleanups on top of the material merged
previously, a power management core code fix and updates of the
runtime PM framework including unit tests, documentation updates and
introduction of auto-cleanup macros for runtime PM "resume and get"
and "get without resuming" operations.
Specifics:
- Make cpufreq drivers setting the default CPU transition latency to
CPUFREQ_ETERNAL specify a proper default transition latency value
instead which addresses a regression introduced during the 6.6
cycle that broke CPUFREQ_ETERNAL handling (Rafael Wysocki)
- Make the cpufreq CPPC driver use a proper transition delay value
when CPUFREQ_ETERNAL is returned by cppc_get_transition_latency()
to indicate an error condition (Rafael Wysocki)
- Make cppc_get_transition_latency() return a negative error code to
indicate error conditions instead of using CPUFREQ_ETERNAL for this
purpose and drop CPUFREQ_ETERNAL that has no other users (Rafael
Wysocki, Gopi Krishna Menon)
- Fix device leak in the mediatek cpufreq driver (Johan Hovold)
- Set target frequency on all CPUs sharing a policy during frequency
updates in the tegra186 cpufreq driver and make it initialize all
cores to max frequencies (Aaron Kling)
- Rust cpufreq helper cleanup (Thorsten Blum)
- Make pm_runtime_put*() family of functions return 1 when the given
device is already suspended which is consistent with the
documentation (Brian Norris)
- Add basic kunit tests for runtime PM API contracts and update
return values in kerneldoc comments for the runtime PM API (Brian
Norris, Dan Carpenter)
- Add auto-cleanup macros for runtime PM "resume and get" and "get
without resume" operations, use one of them in the PCI core and
drop the existing "free" macro introduced for similar purpose, but
somewhat cumbersome to use (Rafael Wysocki)
- Make the core power management code avoid waiting on device links
marked as SYNC_STATE_ONLY which is consistent with the handling of
those device links elsewhere (Pin-yen Lin)"
* tag 'pm-6.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
docs/zh_CN: Fix malformed table
docs/zh_TW: Fix malformed table
PM: runtime: Fix error checking for kunit_device_register()
PM: runtime: Introduce one more usage counter guard
cpufreq: Drop unused symbol CPUFREQ_ETERNAL
ACPI: CPPC: Do not use CPUFREQ_ETERNAL as an error value
cpufreq: CPPC: Avoid using CPUFREQ_ETERNAL as transition delay
cpufreq: Make drivers using CPUFREQ_ETERNAL specify transition latency
PM: runtime: Drop DEFINE_FREE() for pm_runtime_put()
PCI/sysfs: Use runtime PM guard macro for auto-cleanup
PM: runtime: Add auto-cleanup macros for "resume and get" operations
cpufreq: tegra186: Initialize all cores to max frequencies
cpufreq: tegra186: Set target frequency for all cpus in policy
rust: cpufreq: streamline find_supply_names
cpufreq: mediatek: fix device leak on probe failure
PM: sleep: Do not wait on SYNC_STATE_ONLY device links
PM: runtime: Update kerneldoc return codes
PM: runtime: Make put{,_sync}() return 1 when already suspended
PM: runtime: Add basic kunit tests for API contracts
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"There's a bunch of patches here across drivers/clk/ to migrate drivers
to use struct clk_ops::determine_rate() instead of the round_rate()
one so that we can remove the round_rate clk_op entirely. Brian has
taken up that task which nobody else has wanted to do for close to a
decade. Thanks Brian!
This is all prerequisite work to get to the real task of improving the
clk rate setting process. Once we have determine_rate() used
everywhere, we'll be able to do things like chain the rate request
structs in linked lists to order the rate setting operations or add
more parameters without having to change every clk driver in
existence. It's also nice to not have multiple ways to do something
which just causes confusion for clk driver authors. Overall I'm glad
this is getting done.
Beyond this change we also have a tweak to the clk_lookup() function
in the core framework to use hashing on the clk name instead of a clk
tree walk with string comparisons. We _still_ rely on the clk name to
be unique, because historically we've used globally unique strings to
describe the clk tree topology. This tree walk becomes increasingly
slow as more clks are added to the system. Searching from the roots
for a duplicate is simple but pretty dumb and it wastes boot time so
we're using a hash table as an improvement. Ideally we wouldn't rely
on the strings to be unique at all, relegating them to simply debug
information, but that is future work that will likely require some
sort of Kconfig knob indicating strings aren't used for topology
description.
Outside of the core framework changes we have the usual new SoC
support and fixes to clk drivers for things that were discovered once
the clks were used by consumer drivers. Nothing in particular is
jumping out at me in the "misc" pile, except maybe the Amlogic driver
that has gone through a refactoring. That series got a fix from
testing in -next though so it seems likely that things have been
getting good test coverage for a couple weeks already"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (299 commits)
clk: microchip: core: remove duplicate roclk_determine_rate()
reset: aspeed: register AST2700 reset auxiliary bus device
dt-bindings: clock: ast2700: modify soc0/1 clock define
clk: tegra: do not overallocate memory for bpmp clocks
clk: ep93xx: Use int type to store negative error codes
clk: nxp: Fix pll0 rate check condition in LPC18xx CGU driver
clk: loongson2: Add clock definitions for Loongson-2K0300 SoC
clk: loongson2: Avoid hardcoding firmware name of the reference clock
clk: loongson2: Allow zero divisors for dividers
clk: loongson2: Support scale clocks with an alternative mode
clk: loongson2: Allow specifying clock flags for gate clock
dt-bindings: clock: loongson2: Add Loongson-2K0300 compatible
clk: clocking-wizard: Fix output clock register offset for Versal platforms
clk: xilinx: Optimize divisor search in clk_wzrd_get_divisors_ver()
clk: mmp: pxa1908: Instantiate power driver through auxiliary bus
clk: s2mps11: add support for S2MPG10 PMIC clock
dt-bindings: clock: samsung,s2mps11: add s2mpg10
dt-bindings: stm32: cosmetic fixes for STM32MP25 clock and reset bindings
clk: stm32: introduce clocks for STM32MP21 platform
dt-bindings: stm32: add STM32MP21 clocks and reset bindings
...
|
|
Expand the prefault memory selftest to add a regression test for a KVM bug
where KVM's retry logic would result in (breakable) deadlock due to the
memslot deletion waiting on prefaulting to release SRCU, and prefaulting
waiting on the memslot to fully disappear (KVM uses a two-step process to
delete memslots, and KVM x86 retries page faults if a to-be-deleted, a.k.a.
INVALID, memslot is encountered).
To exercise concurrent memslot remove, spawn a second thread to initiate
memslot removal at roughly the same time as prefaulting. Test memslot
removal for all testcases, i.e. don't limit concurrent removal to only the
success case. There are essentially three prefault scenarios (so far)
that are of interest:
1. Success
2. ENOENT due to no memslot
3. EAGAIN due to INVALID memslot
For all intents and purposes, #1 and #2 are mutually exclusive, or rather,
easier to test via separate testcases since writing to non-existent memory
is trivial. But for #3, making it mutually exclusive with #1 _or_ #2 is
actually more complex than testing memslot removal for all scenarios. The
only requirement to let memslot removal coexist with other scenarios is a
way to guarantee a stable result, e.g. that the "no memslot" test observes
ENOENT, not EAGAIN, for the final checks.
So, rather than make memslot removal mutually exclusive with the ENOENT
scenario, simply restore the memslot and retry prefaulting. For the "no
memslot" case, KVM_PRE_FAULT_MEMORY should be idempotent, i.e. should
always fail with ENOENT regardless of how many times userspace attempts
prefaulting.
Pass in both the base GPA and the offset (instead of the "full" GPA) so
that the worker can recreate the memslot.
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20250924174255.2141847-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Use two additional labels so that another bit of common code can be better
reused at the end of this function implementation.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
cifs_spnego_key_instantiate()
* Return a status code without storing it in an intermediate variable.
* Delete the local variable “ret” and the label “error”
which became unnecessary with this refactoring.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
- Preserve old 'tt_core' UAPI for Hisilicon L3C PMU driver
- Ensure linear alias of kprobes instruction page is not writable
- Fix kernel stack unwinding from BPF
- Fix build warnings from the Fujitsu uncore PMU documentation
- Fix hang with deferred 'struct page' initialisation and MTE
- Consolidate KPTI page-table re-writing code
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mte: Do not flag the zero page as PG_mte_tagged
docs: perf: Fujitsu: Fix htmldocs build warnings and errors
arm64: mm: Move KPTI helpers to mmu.c
tracing: Fix the bug where bpf_get_stackid returns -EFAULT on the ARM64
arm64: kprobes: call set_memory_rox() for kprobe page
drivers/perf: hisi: Add tt_core_deprecated for compatibility
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- Unify guest entry code for KVM and MSHV (Sean Christopherson)
- Switch Hyper-V MSI domain to use msi_create_parent_irq_domain()
(Nam Cao)
- Add CONFIG_HYPERV_VMBUS and limit the semantics of CONFIG_HYPERV
(Mukesh Rathor)
- Add kexec/kdump support on Azure CVMs (Vitaly Kuznetsov)
- Deprecate hyperv_fb in favor of Hyper-V DRM driver (Prasanna
Kumar T S M)
- Miscellaneous enhancements, fixes and cleanups (Abhishek Tiwari,
Alok Tiwari, Nuno Das Neves, Wei Liu, Roman Kisel, Michael Kelley)
* tag 'hyperv-next-signed-20251006' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hyperv: Remove the spurious null directive line
MAINTAINERS: Mark hyperv_fb driver Obsolete
fbdev/hyperv_fb: deprecate this in favor of Hyper-V DRM driver
Drivers: hv: Make CONFIG_HYPERV bool
Drivers: hv: Add CONFIG_HYPERV_VMBUS option
Drivers: hv: vmbus: Fix typos in vmbus_drv.c
Drivers: hv: vmbus: Fix sysfs output format for ring buffer index
Drivers: hv: vmbus: Clean up sscanf format specifier in target_cpu_store()
x86/hyperv: Switch to msi_create_parent_irq_domain()
mshv: Use common "entry virt" APIs to do work in root before running guest
entry: Rename "kvm" entry code assets to "virt" to genericize APIs
entry/kvm: KVM: Move KVM details related to signal/-EINTR into KVM proper
mshv: Handle NEED_RESCHED_LAZY before transferring to guest
x86/hyperv: Add kexec/kdump support on Azure CVMs
Drivers: hv: Simplify data structures for VMBus channel close message
Drivers: hv: util: Cosmetic changes for hv_utils_transport.c
mshv: Add support for a new parent partition configuration
clocksource: hyper-v: Skip unnecessary checks for the root partition
hyperv: Add missing field to hv_output_map_device_interrupt
|
|
pm_runtime_get_sync() and pm_runtime_put_autosuspend() were previously
called in cmdq_mbox_send_data(), which is under a spinlock in msg_submit()
(mailbox.c). This caused lockdep warnings such as "sleeping function
called from invalid context" when running with lockdebug enabled.
The BUG report:
BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1164
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 3616, name: kworker/u17:3
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
CPU: 1 PID: 3616 Comm: kworker/u17:3 Not tainted 6.1.87-lockdep-14133-g26e933aca785 #1
Hardware name: Google Ciri sku0/unprovisioned board (DT)
Workqueue: imgsys_runner imgsys_runner_func
Call trace:
dump_backtrace+0x100/0x120
show_stack+0x20/0x2c
dump_stack_lvl+0x84/0xb4
dump_stack+0x18/0x48
__might_resched+0x354/0x4c0
__might_sleep+0x98/0xe4
__pm_runtime_resume+0x70/0x124
cmdq_mbox_send_data+0xe4/0xb1c
msg_submit+0x194/0x2dc
mbox_send_message+0x190/0x330
imgsys_cmdq_sendtask+0x1618/0x2224
imgsys_runner_func+0xac/0x11c
process_one_work+0x638/0xf84
worker_thread+0x808/0xcd0
kthread+0x24c/0x324
ret_from_fork+0x10/0x20
Additionally, pm_runtime_put_autosuspend() should be invoked from the
GCE IRQ handler to ensure the hardware has actually completed its work.
To resolve these issues, remove the pm_runtime calls from
cmdq_mbox_send_data() and delegate power management responsibilities
to the client driver.
Fixes: 8afe816b0c99 ("mailbox: mtk-cmdq-mailbox: Implement Runtime PM with autosuspend")
Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
|