| Age | Commit message (Collapse) | Author | Lines |
|
Convert the low-hanging fruits of workaround checks to the workaround
framework. Instead of having display structure checks for the
workarounds all over, concentrate the checks in intel_display_wa.c.
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/20260305100100.332956-7-luciano.coelho@intel.com
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Convert the low-hanging fruits of workaround checks to the workaround
framework. Instead of having display structure checks for the
workarounds all over, concentrate the checks in intel_display_wa.c.
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/20260305100100.332956-6-luciano.coelho@intel.com
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Convert the low-hanging fruits of workaround checks to the workaround
framework. Instead of having display structure checks for the
workarounds all over, concentrate the checks in intel_display_wa.c.
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/20260305100100.332956-5-luciano.coelho@intel.com
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Convert the low-hanging fruits of workaround checks to the workaround
framework. Instead of having display structure checks for the
workarounds all over, concentrate the checks in intel_display_wa.c.
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/20260305100100.332956-4-luciano.coelho@intel.com
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Convert the low-hanging fruits of workaround checks to the workaround
framework. Instead of having display structure checks for the
workarounds all over, concentrate the checks in intel_display_wa.c.
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/20260305100100.332956-3-luciano.coelho@intel.com
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
There's not much use in passing a number to the macro and let it
convert that into the enum and a string. It just hides the symbols.
Remove the number to enum conversion magic in intel_display_wa().
This has the side-effect of changing the print in the drm_WARN() that
is issued when the number is not implemented, but that is moot anyway
and can be changed later to something cleaner if needed.
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/20260305100100.332956-2-luciano.coelho@intel.com
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Currently Tyr defines a convenience type alias for its DRM device type,
`TyrDrmDevice` but it does not use the alias outside of `tyr/driver.rs`.
Replace `drm::Device<TyrDrmDriver>` with the alias `TyrDrmDevice` across
the driver.
This change will ease future upstream Tyr development by reducing the
diffs when multiple series are touching these files.
No functional changes are intended.
Signed-off-by: Deborah Brouwer <deborah.brouwer@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260302202331.176140-1-deborah.brouwer@collabora.com
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
|
|
Add a new kunit test gpu_test_buddy_alloc_range() that exercises the
__gpu_buddy_alloc_range() exact-range allocation path, triggered when
start + size == end with flags=0.
The test covers:
- Basic exact-range allocation of the full mm
- Exact-range allocation of equal sub-ranges (quarters)
- Minimum chunk-size exact ranges at start, middle, and end offsets
- Non power-of-two mm size with multiple roots, including cross-root
exact-range allocation
- Randomized exact-range allocations of N contiguous page-aligned
slices in random order
- Negative: partially allocated range must reject overlapping exact
alloc
- Negative: checkerboard allocation pattern rejects exact range over
partially occupied pairs
- Negative: misaligned start, unaligned size, and out-of-bounds end
- Free and re-allocate the same exact range across multiple iterations
- Various power-of-two exact ranges at natural alignment
Cc: Christian König <christian.koenig@amd.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://patch.msgid.link/20260302150947.47535-2-sanjay.kumar.yadav@intel.com
|
|
For eDP read the ALPM DPCD caps after DPCD initalization and just before
the PSR init.
v2: Move intel_alpm_init to intel_edp_init_dpcd (Jouni)
v3: Add Fixes with commit-id (Jouni)
v4: Separated the alpm dpcd read caps from alpm_init and moved to
intel_edp_init_dpcd.
v5: Read alpm_caps always for eDP irrespective of the eDP version (Jouni)
v6: replace drm_dp_dpcd_readb with drm_dp_dpcd_read_byte (Jouni)
Fixes: 15438b325987 ("drm/i915/alpm: Add compute config for lobf")
Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patch.msgid.link/20260304072157.1123283-1-arun.r.murthy@intel.com
|
|
Add KUnit test to validate offset-aligned allocations in the DRM buddy
allocator.
Validate offset-aligned allocation:
The test covers allocations with sizes smaller than the alignment constraint
and verifies correct size preservation, offset alignment, and behavior across
multiple allocation sizes. It also exercises fragmentation by freeing
alternating blocks and confirms that allocation fails once all aligned offsets
are consumed.
Stress subtree_max_alignment propagation:
Exercise subtree_max_alignment tracking by allocating blocks with descending
alignment constraints and freeing them in reverse order. This verifies that
free-tree augmentation correctly propagates the maximum offset alignment
present in each subtree at every stage.
v2:
- Move the patch to gpu/tests/gpu_buddy_test.c file.
v3:
- Fixed build warnings reported by kernel test robot <lkp@intel.com>
v4:(Matthew)
- Use IS_ALIGNED() instead of manual alignment checks
- Simplify order iteration loop for readability
- Remove extra newline
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260306060155.2114-2-Arunpravin.PaneerSelvam@amd.com
|
|
Large alignment requests previously forced the buddy allocator to search by
alignment order, which often caused higher-order free blocks to be split even
when a suitably aligned smaller region already existed within them. This led
to excessive fragmentation, especially for workloads requesting small sizes
with large alignment constraints.
This change prioritizes the requested allocation size during the search and
uses an augmented RB-tree field (subtree_max_alignment) to efficiently locate
free blocks that satisfy both size and offset-alignment requirements. As a
result, the allocator can directly select an aligned sub-region without
splitting larger blocks unnecessarily.
A practical example is the VKCTS test
dEQP-VK.memory.allocation.basic.size_8KiB.reverse.count_4000, which repeatedly
allocates 8 KiB buffers with a 256 KiB alignment. Previously, such allocations
caused large blocks to be split aggressively, despite smaller aligned regions
being sufficient. With this change, those aligned regions are reused directly,
significantly reducing fragmentation.
This improvement is visible in the amdgpu VRAM buddy allocator state
(/sys/kernel/debug/dri/1/amdgpu_vram_mm). After the change, higher-order blocks
are preserved and the number of low-order fragments is substantially reduced.
Before:
order- 5 free: 1936 MiB, blocks: 15490
order- 4 free: 967 MiB, blocks: 15486
order- 3 free: 483 MiB, blocks: 15485
order- 2 free: 241 MiB, blocks: 15486
order- 1 free: 241 MiB, blocks: 30948
After:
order- 5 free: 493 MiB, blocks: 3941
order- 4 free: 246 MiB, blocks: 3943
order- 3 free: 123 MiB, blocks: 4101
order- 2 free: 61 MiB, blocks: 4101
order- 1 free: 61 MiB, blocks: 8018
By avoiding unnecessary splits, this change improves allocator efficiency and
helps maintain larger contiguous free regions under heavy offset-aligned
allocation workloads.
v2:(Matthew)
- Update augmented information along the path to the inserted node.
v3:
- Move the patch to gpu/buddy.c file.
v4:(Matthew)
- Use the helper instead of calling _ffs directly
- Remove gpu_buddy_block_order(block) >= order check and drop order
- Drop !node check as all callers handle this already
- Return larger than any other possible alignment for __ffs64(0)
- Replace __ffs with __ffs64
v5:(Matthew)
- Drop subtree_max_alignment initialization at gpu_block_alloc()
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260306060155.2114-1-Arunpravin.PaneerSelvam@amd.com
|
|
There are no offsets in `FalconUCodeDescV2` to give the non-secure and
secure IMEM sections start offsets relative to the beginning of the
firmware object.
The start offsets for both sections were set to `0`, but that is
obviously incorrect since two different sections cannot start at the
same offset. Since these offsets were not used by the bootloader, this
doesn't prevent proper function but is incorrect nonetheless.
Fix this by computing the start of the secure IMEM section relatively to
the start of the firmware object and setting it properly. Also add and
improve comments to explain how the values are obtained.
Fixes: dbfb5aa41f16 ("gpu: nova-core: add FalconUCodeDescV2 support")
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Link: https://patch.msgid.link/20260306-turing_prep-v11-9-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
There are slice row per frame and pic height parameters in DSC that needs
to be configured on every Selective Update in Early Transport mode. Use
helper provided by DSC code to configure these on Selective Update when in
Early Transport mode. Also fill crtc_state->psr2_su_area with full frame
area on full frame update for DSC calculation.
v2: move psr2_su_area under skip_sel_fetch_set_loop label
Bspec: 68927, 71709
Fixes: 467e4e061c44 ("drm/i915/psr: Enable psr2 early transport as possible")
Cc: <stable@vger.kernel.org> # v6.9+
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patch.msgid.link/20260304113011.626542-5-jouni.hogander@intel.com
|
|
There are slice row per frame and pic height configuration in DSC Selective
Update Parameter Set 1 register. Add helper for configuring these.
v2:
- Add WARN_ON_ONCE if vdsc instances per pipe > 2
- instead of checking vdsc instances per pipe being > 1 check == 2
Bspec: 71709
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patch.msgid.link/20260304113011.626542-4-jouni.hogander@intel.com
|
|
Add definitions for DSC_SU_PARAMETER_SET_0_DSC0 and
DSC_SU_PARAMETER_SET_0_DSC1 registers. These are for Selective Update Early
Transport configuration.
Bspec: 71709
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patch.msgid.link/20260304113011.626542-3-jouni.hogander@intel.com
|
|
Currently we are aligning Selective Update area to cover cursor fully if
needed only once. It may happen that cursor is in Selective Update area
after pipe alignment and after that covering cursor plane only
partially. Fix this by looping alignment as long as alignment isn't needed
anymore.
v2:
- do not unecessarily loop if cursor was already fully covered
- rename aligned as su_area_changed
Fixes: 1bff93b8bc27 ("drm/i915/psr: Extend SU area to cover cursor fully if needed")
Cc: <stable@vger.kernel.org> # v6.9+
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patch.msgid.link/20260304113011.626542-2-jouni.hogander@intel.com
|
|
There is no member in `FalconUCodeDescV3` to describe the start offsets
of the IMEM and DMEM section in the firmware object. Add comments to
justify how they are computed.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Link: https://patch.msgid.link/20260306-turing_prep-v11-8-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
On Turing and GA100, a new firmware image called the Generic Bootloader
(gen_bootloader) must be used to load FWSEC into Falcon memory. The
driver loads the generic bootloader into Falcon IMEM, passes a
descriptor that points to FWSEC using DMEM, and then boots the generic
bootloader. The bootloader will then load FWSEC into IMEM and boot it.
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-12-8f0042c5d026@nvidia.com
|
|
Turing GPUs need an additional firmware file (the FWSEC generic
bootloader) in order to initialize. Add it to `ModInfoBuilder`.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-11-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
We will use this method from const context.
Also take `self` by value since it is the size of a primitive type and
implements `Copy`.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-10-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
This safety check was an assumption based on the firmwares we work with
- it is not based on an actual hardware limitation. Thus, remove it.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-7-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Turing and GA100 use programmed I/O (PIO) instead of DMA to upload
firmware images into Falcon memory.
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-6-8f0042c5d026@nvidia.com
|
|
These methods are relevant no matter the loading method used, thus move
them to the common `FalconFirmware` trait.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-5-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Not all firmware is necessarily loaded by DMA. Remove the requirement
for `FalconFirmware` to implement `FalconDmaLoadable`, and adapt
`Falcon`'s methods constraints accordingly.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-4-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
The current `FalconLoadParams` and `FalconLoadTarget` types are fit for
DMA loading, but not so much for PIO loading which will require its own
types. Start by renaming them to something that indicates that they are
indeed DMA-related.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-3-8f0042c5d026@nvidia.com
[acourbot@nvidia.com: fixup order of import items.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Falcon memory blocks are 256 bytes in size. This is a hard constant on
all models.
This value was hardcoded, so turn it into a documented constant. It will
also become useful with the PIO loading code.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-2-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
When DMA was the only loading option for falcon firmwares, we decided to
store them in DMA objects as soon as they were loaded from disk and
patch them in-place to avoid having to do an extra copy.
This decision complicates the PIO loading patch considerably, and
actually does not even stand on its own when put into perspective with
the fact that it requires 8 unsafe statements in the code that wouldn't
exist if we stored the firmware into a `KVVec` and copied it into a DMA
object at the last minute.
The cost of the copy is, as can be expected, imperceptible at runtime.
Thus, switch to a lazy DMA object creation model and simplify our code
a bit. This will also have the nice side-effect of being more fit for
PIO loading.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-1-8f0042c5d026@nvidia.com
[acourbot@nvidia.com: add TODO item to switch back to a coherent
allocation when it becomes convenient to do so.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
gud_plane_atomic_update() currently handles both crtc state and
framebuffer updates - the complexity has led to a few accidental
NULL pointer dereferences.
Commit dc2d5ddb193e ("drm/gud: fix NULL fb and crtc dereferences
on USB disconnect") [1] fixed an earlier dereference but planes
can also be disabled in non-hotplug paths (e.g. display disables
via the desktop environment). The drm_dev_enter() call would not
cause an early return in those and subsequently oops on
dereferencing crtc:
BUG: kernel NULL pointer dereference, address: 00000000000005c8
CPU: 6 UID: 1000 PID: 3473 Comm: kwin_wayland Not tainted 6.18.2-200.vanilla.gud.fc42.x86_64 #1 PREEMPT(lazy)
RIP: 0010:gud_plane_atomic_update+0x148/0x470 [gud]
<TASK>
drm_atomic_helper_commit_planes+0x28e/0x310
drm_atomic_helper_commit_tail+0x2a/0x70
commit_tail+0xf1/0x150
drm_atomic_helper_commit+0x13c/0x180
drm_atomic_commit+0xb1/0xe0
info ? __pfx___drm_printfn_info+0x10/0x10
drm_mode_atomic_ioctl+0x70f/0x7c0
? __pfx_drm_mode_atomic_ioctl+0x10/0x10
drm_ioctl_kernel+0xae/0x100
drm_ioctl+0x2a8/0x550
? __pfx_drm_mode_atomic_ioctl+0x10/0x10
__x64_sys_ioctl+0x97/0xe0
do_syscall_64+0x7e/0x7f0
? __ct_user_enter+0x56/0xd0
? do_syscall_64+0x158/0x7f0
? __ct_user_enter+0x56/0xd0
? do_syscall_64+0x158/0x7f0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Split out crtc handling from gud_plane_atomic_update() into
atomic_enable() and atomic_disable() functions to delegate
crtc state transitioning work to the DRM helpers.
To preserve the gud state commit sequence [2], switch to
the runtime PM version of drm_atomic_helper_commit_tail() which
ensures that crtcs are enabled (hence sending the
GUD_REQ_SET_CONTROLLER_ENABLE and GUD_REQ_SET_DISPLAY_ENABLE
requests) before a framebuffer update is sent.
[1] https://lore.kernel.org/all/20251231055039.44266-1-me@shenghaoyang.info/
[2] https://github.com/notro/gud/wiki/GUD-Protocol#display-state
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202601142159.0v8ilfVs-lkp@intel.com/
Fixes: 73cfd166e045 ("drm/gud: Replace simple display pipe with DRM atomic helpers")
Cc: <stable@vger.kernel.org> # 6.19.x
Cc: <stable@vger.kernel.org> # 6.18.x
Signed-off-by: Shenghao Yang <me@shenghaoyang.info>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Ruben Wauters <rubenru09@aol.com>
Signed-off-by: Ruben Wauters <rubenru09@aol.com>
Link: https://patch.msgid.link/20260222054551.80864-1-me@shenghaoyang.info
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for v7.1:
Cross-subsystem Changes:
dma-buf:
- Prepare for compile-time concurrency analysis
Core Changes:
buddy:
- Improve assert testing
sched:
- Fix race condition in drm_sched_fini()
- Mark slow tests
Driver Changes:
bridge:
- waveshare-dsi: Fix register and attach; Support 1..4 DSI lanes plus DT bindings
gma500:
- Use DRM client buffer for fbdev framebuffer
gud:
- Test for imported buffers with helper
imagination:
- Fix power domain handling
ivpu:
- Update boot API to v3.29.4
- Limit per-user number of doorbells and contexts
nouveau:
- Test for imported buffers with helper
panel:
- panel-edp: Fix timings for BOE NV140WUM-N64
panfrost:
- Test for imported buffers with helper
panthor:
- Test for imported buffers with helper
vc4:
- Test for imported buffers with helper
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260305081140.GA171266@linux.fritz.box
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-7.1-2026-03-04:
amdgpu:
- FAMS2 updates
- Refactor DC I2C
- Rework ttm handling to allow for multiple engines
- UserQ updates
- Ring reset improvements
- DC DCE 6.x cleanups
- DC support for NUTMEG and TRAVIS DP bridges
- Enable DC by default on CIK APUs
- Add DCN 4.2 support
- IPS fixes
- Overlay fixes for DCN4
- SDMA Limit updates
- Misc fixes
- RAS updates
- Register access callback rework
- GC 12.1 updates
amdkfd:
- Misc cleanups
UAPI:
- UserQ fence IOCTL parameter size fixes. The change is backwards compatible on LE, but not BE.
UserQs are still not considered stable and are disabled by default.
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260304213233.1938311-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Correctly set dbi->write_memory_bpw for the ST7586 driver. This driver
is for a monochrome display that has an unusual data format, so the
default value set in mipi_dbi_spi_init() is not correct simply because
this controller is non-standard.
Previously, we were using dbi->swap_bytes to make the same sort of
workaround, but it was removed in the same commit that added
dbi->write_memory_bpw, so we need to use the latter now to have the
correct behavior.
This fixes every 3 columns of pixels being swapped on the display. There
are 3 pixels per byte, so the byte swap caused this effect.
Fixes: df3fb27a74a4 ("drm/mipi-dbi: Make bits per word configurable for pixel transfers")
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20260228-drm-mipi-dbi-fix-st7586-byte-swap-v1-1-e78f6c24cd28@baylibre.com
|
|
Current `dma_read!`, `dma_write!` macros also use a custom
`addr_of!()`-based implementation for projecting pointers, which has
soundness issue as it relies on absence of `Deref` implementation on types.
It also has a soundness issue where it does not protect against unaligned
fields (when `#[repr(packed)]` is used) so it can generate misaligned
accesses.
This commit migrates them to use the general pointer projection
infrastructure, which handles these cases correctly.
As part of migration, the macro is updated to have an improved surface
syntax. The current macro have
dma_read!(a.b.c[d].e.f)
to mean `a.b.c` is a DMA coherent allocation and it should project into it
with `[d].e.f` and do a read, which is confusing as it makes the indexing
operator integral to the macro (so it will break if you have an array of
`CoherentAllocation`, for example).
This also is problematic as we would like to generalize
`CoherentAllocation` from just slices to arbitrary types.
Make the macro expects `dma_read!(path.to.dma, .path.inside.dma)` as the
canonical syntax. The index operator is no longer special and is just one
type of projection (in additional to field projection). Similarly, make
`dma_write!(path.to.dma, .path.inside.dma, value)` become the canonical
syntax for writing.
Another issue of the current macro is that it is always fallible. This
makes sense with existing design of `CoherentAllocation`, but once we
support fixed size arrays with `CoherentAllocation`, it is desirable to
have the ability to perform infallible indexing as well, e.g. doing a `[0]`
index of `[Foo; 2]` is okay and can be checked at build-time, so forcing
falliblity is non-ideal. To capture this, the macro is changed to use
`[idx]` as infallible projection and `[idx]?` as fallible index projection
(those syntax are part of the general projection infra). A benefit of this
is that while individual indexing operation may fail, the overall
read/write operation is not fallible.
Fixes: ad2907b4e308 ("rust: add dma coherent allocator abstraction")
Reviewed-by: Benno Lossin <lossin@kernel.org>
Signed-off-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260302164239.284084-4-gary@kernel.org
[ Capitalize safety comments; slightly improve wording in doc-comments.
- Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
Currently xe_migrate_prepare_vm() does three things.
1. Allocates pt_bo for migrate context.
2. Initializes pt_bo with actual pte details.
3. Initializes sa_manager for migrate context.
Split these implementations in their own functions for better
maintainability.
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260303101913.3576481-1-raag.jadav@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
|
|
I found a few more paths that cleanup fails due to a NULL version pointer
on unsupported hardware.
Add NULL checks as applicable.
Fixes: 39fc2bc4da00 ("drm/amdgpu: Protect GPU register accesses in powergated state in some paths")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f5a05f8414fc10f307eb965f303580c7778f8dd2)
Cc: stable@vger.kernel.org
|
|
Older versions of the MES firmware may cause abnormal GPU power consumption.
When performing inference tasks on the GPU (e.g., with Ollama using ROCm),
the GPU may show abnormal power consumption in idle state and incorrect GPU load information.
This issue has been fixed in firmware version 0x8b and newer.
Closes: https://github.com/ROCm/ROCm/issues/5706
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4e22a5fe6ea6e0b057e7f246df4ac3ff8bfbc46a)
|
|
The following members of struct amdgpu_mode_info do not have valid
references in the related kernel-doc sections:
- plane_shaper_lut_property
- plane_shaper_lut_size_property,
- plane_lut3d_size_property
Correct all affected comment blocks.
Fixes: f545d82479b4 ("drm/amd/display: add plane shaper LUT and TF driver-specific properties")
Fixes: 671994e3bf33 ("drm/amd/display: add plane 3D LUT driver-specific properties")
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ec5708d6e547f7efe2f009073bfa98dbc4c5c2ac)
|
|
When GPU initialization fails due to an unsupported HW block
IP blocks may have a NULL version pointer. During cleanup in
amdgpu_device_fini_hw, the code calls amdgpu_device_set_pg_state and
amdgpu_device_set_cg_state which iterate over all IP blocks and access
adev->ip_blocks[i].version without NULL checks, leading to a kernel
NULL pointer dereference.
Add NULL checks for adev->ip_blocks[i].version in both
amdgpu_device_set_cg_state and amdgpu_device_set_pg_state to prevent
dereferencing NULL pointers during GPU teardown when initialization has
failed.
Fixes: 39fc2bc4da00 ("drm/amdgpu: Protect GPU register accesses in powergated state in some paths")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b7ac77468cda92eecae560b05f62f997a12fe2f2)
Cc: stable@vger.kernel.org
|
|
add missing od setting PP_OD_FEATURE_ZERO_FAN_BIT for smu v14.0.2/14.0.3
Fixes: 9710b84e2a6a ("drm/amd/pm: add overdrive support on smu v14.0.2/3")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5018
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1b5cf07d80bb16d1593579ccdb23f08ea4262c14)
|
|
add missing od setting PP_OD_FEATURE_ZERO_FAN_BIT for smu v13.0.0/13.0.7
Fixes: cfffd980bf21 ("drm/amd/pm: add zero RPM OD setting support for SMU13")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5018
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 576a10797b607ee9e4068218daf367b481564120)
|
|
Debugfs files for amdgpu are better to be handled in the dedicated
amdgpu_debugfs.c/.h files.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Clean up the amdgpu_userq_create function clean up in
failure condition using goto method. This avoid replication
of cleanup for every failure condition.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The userq create path publishes queues to global xarrays such as
userq_doorbell_xa and userq_xa before creation was fully complete.
Later on if create queue fails, teardown could free an already
visible queue, opening a UAF race with concurrent queue walkers.
Also calling amdgpu_userq_put in such cases complicates the cleanup.
Solution is to defer queue publication until create succeeds and no
partially initialized queue is exposed.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
get_checkpoint_info() in kfd_mqd_manager_v9.c finds 32-bit value
ctl_stack_size by multiplying two 32-bit values. This can overflow to a
lower value, which could result in copying outside the bounds of
a buffer in checkpoint_mqd() in the same file.
Put in a check for the overflow, and fail with -EINVAL if detected.
v2: use check_mul_overflow()
Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Below is the warning thrown by the clang compiler:
linux/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:588:9: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
stats_dir_filename);
^~~~~~~~~~~~~~~~~~
linux/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:588:9: note: treat the string as an argument to avoid this
stats_dir_filename);
^
"%s",
linux/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:635:18: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
p->kobj, counters_dir_filename);
^~~~~~~~~~~~~~~~~~~~~
linux/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:635:18: note: treat the string as an argument to avoid this
p->kobj, counters_dir_filename);
^
"%s",
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
CC: Philip Yang <philip.yang@amd.com>
CC: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
linux/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:2358:24: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
sprintf(ring->name, amdgpu_sw_ring_name(i));
^~~~~~~~~~~~~~~~~~~~~~
linux/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:2358:24: note: treat the string as an argument to avoid this
sprintf(ring->name, amdgpu_sw_ring_name(i));
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Enable aid/xcd/hbm temperature reporting for smu_v13_0_12 via gpu
metrics
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add aid, xcd & hbm temperatures to gpu metrics for smu_v13_0_12
v2: Use correct umc control per stack (Lijo)
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This is the best way to describe the GPU to a tool loading
the devcoredump.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Older versions of the MES firmware may cause abnormal GPU power consumption.
When performing inference tasks on the GPU (e.g., with Ollama using ROCm),
the GPU may show abnormal power consumption in idle state and incorrect GPU load information.
This issue has been fixed in firmware version 0x8b and newer.
Closes: https://github.com/ROCm/ROCm/issues/5706
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The following members of struct amdgpu_mode_info do not have valid
references in the related kernel-doc sections:
- plane_shaper_lut_property
- plane_shaper_lut_size_property,
- plane_lut3d_size_property
Correct all affected comment blocks.
Fixes: f545d82479b4 ("drm/amd/display: add plane shaper LUT and TF driver-specific properties")
Fixes: 671994e3bf33 ("drm/amd/display: add plane 3D LUT driver-specific properties")
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|