path: root/drivers/gpu
Age, Commit message, Author, Lines
2025-10-14drm/xe: Don't check BIOS-disabled FlatCCS if primary GT is disabledMatt Roper-0/+2
If the primary GT is disabled via configfs, we can't read the GT registers that would tell us whether the BIOS has disabled FlatCCS on a platform that would otherwise have it; we'll just proceed as if FlatCCS is still enabled. This is similar to the situation seen by SRIOV VFs and doesn't cause any functional problems since the hardware will simply drop writes to the CCS region and reads will always come back as 0 (indicating uncompressed data). We'll simply miss out on the chance to avoid some unnecessary overhead during BO creation and migration. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-45-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Check that GT is not NULL before testing Wa_16023588340Matt Roper-1/+2
If the primary GT is disabled, skip the check for this workaround (which only applies to dgpu platforms where the primary GT cannot be NULL). Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-44-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Correct lineage for Wa_22014953428 and only check with valid GTMatt Roper-2/+3
Wa_22014953428 was incorrectly labelled with a release-specific ID number rather than the cross-platform lineage number; fix that. Also check that the GT is not NULL before trying to lookup the workaround in it. Since this workaround only applies to DG2 discrete GPUs (where the primary GT cannot be disabled), no coverage is lost. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-43-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Bypass Wa_14018094691 when primary GT is disabledMatt Roper-2/+2
Don't try to lookup Wa_14018094691 on a NULL GT when the primary GT is disabled. Since this whole workaround centers around mid-thread preemption behavior, the workaround isn't relevant if the primary GT (where the engines that can do MTP live) is disabled. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-42-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
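The three workaround commits above share one pattern: a disabled primary GT shows up as a NULL pointer, so the pointer has to be tested before any workaround lookup. A minimal sketch of that guard follows. `struct gt`, `gt_has_wa()` and `primary_gt_needs_wa()` are hypothetical stand-ins for illustration, not the real Xe structures or workaround macros.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GT; the real driver uses struct xe_gt. */
struct gt {
	unsigned long active_was;	/* bitmask of workarounds active on this GT */
};

/* Hypothetical lookup: is workaround bit 'wa' active on this GT? */
static bool gt_has_wa(const struct gt *gt, unsigned int wa)
{
	return (gt->active_was >> wa) & 1UL;
}

/*
 * Guarded lookup: test the pointer before dereferencing it. A missing
 * (NULL) primary GT means the workaround simply does not apply.
 */
static bool primary_gt_needs_wa(const struct gt *primary_gt, unsigned int wa)
{
	return primary_gt && gt_has_wa(primary_gt, wa);
}
```

Short-circuit evaluation of `&&` guarantees that `gt_has_wa()` is never reached with a NULL pointer.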
2025-10-14drm/xe/rtp: Pass xe_device parameter to FUNC matchesMatt Roper-24/+41
FUNC matches in RTP only pass the GT and hwe, preventing them from being used effectively in device workarounds. Add an additional xe_device parameter so that we can use them in device workarounds where a GT may not be available. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-41-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Handle Wa_22010954014 and Wa_14022085890 as device workaroundsMatt Roper-7/+12
When Wa_22010954014 and Wa_14022085890 were first implemented, we didn't have a device workaround infrastructure so we hacked them into the GT workaround list. Now that we have proper device workaround support, move them to the proper place. Note that Wa_14022085890 specifically applies to BMG-G21 platforms, so this requires defining a BMG subplatform to capture the correct subset of device IDs. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-40-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe/irq: Don't try to lookup engine masks for non-existent primary GTMatt Roper-5/+9
If the primary GT is disabled via configfs, we shouldn't try to access it to lookup BCS/CCS engine masks. For the purposes of IRQ reset (which masks & disables interrupts in an sgunit register), assume all possible instances are present. Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-39-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Make display part of Wa_22019338487 a device workaroundMatt Roper-5/+5
The display part of Wa_22019338487 (i.e., avoiding use of stolen memory) is using a platform test rather than a graphics/media IP test. Since this workaround is focused on non-GT uses of stolen memory, it makes sense that we'd want to still apply the workaround on affected platforms even if the GTs themselves are disabled via configfs. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-38-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Check for primary GT before looking up Wa_22019338487Matt Roper-9/+22
If the primary GT is disabled via configfs, we need to make sure that we don't search for this workaround on a NULL xe_gt pointer. Since we can disable the primary GT only on igpu platforms, the media GT is the one we'd want to check anyway for this workaround. The ternary operators in ggtt_update_access_counter() were getting a bit long/complicated, so rewrite them with regular if/else statements. While we're at it, throw in a couple extra assertions to make sure that we're truly picking the expected GT according to igpu/dgpu type. v2: - Adjust indentation/wrapping; it's easier to read this with longer, unwrapped lines. (Lucas) - Tweak wording of commit message to remove ambiguity. (Gustavo) Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-37-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe/pmu: Initialize PMU event types based on first available GTMatt Roper-1/+10
GT ID#0 (primary GT on tile 0) may not always be available if the primary GT has been disabled via configfs. Instead use the first available GT when determining which PMU events are supported. If there are no GTs, then don't advertise any GT-related events. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-36-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
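The "first available GT" selection described above can be sketched as a scan over an array of GT pointers in which disabled GTs are NULL. The type `struct gt` and the helper `first_available_gt()` are hypothetical names for illustration; the real driver iterates over its own GT list.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct xe_gt. */
struct gt {
	int id;
};

/*
 * Return the first non-NULL entry of gts[0..count), or NULL when every
 * GT is disabled. With a NULL result the caller would advertise no
 * GT-related events at all.
 */
static struct gt *first_available_gt(struct gt *const *gts, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (gts[i])
			return gts[i];
	}
	return NULL;
}
```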
2025-10-14drm/xe/query: Report hwconfig size as 0 if primary GT is disabledMatt Roper-1/+1
The hwconfig table is part of the primary GT's GuC firmware. If the primary GT is disabled, the hwconfig is unavailable and should be reported to userspace as having size 0. Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-35-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Skip L2 / TDF cache flushes if primary GT is disabledMatt Roper-0/+5
If the primary GT is disabled via configfs, GT-side L2 and TD cache flushes are unnecessary since nothing is using/filling these caches. Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-34-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Move primary GT allocation from xe_tile_init_early to xe_tile_initMatt Roper-4/+4
During the early days of the Xe driver, there were cases where we accessed some fields in the primary GT's xe_gt structure before the GT itself was formally initialized; this required that the structure itself be allocated during xe_tile_init_early(). A lot of refactoring of the device probe has happened since that time and there's no longer a need to allocate the primary GT early. Move the allocation into xe_info_init() where GT initialization happens and where we're doing the allocation of the media GT. v2: - Only make this change after a separate patch to perform VF GMD_ID lookup with a dummy GT instead of xe_root_mmio_gt(). Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-33-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Read VF GMD_ID with a specifically-allocated dummy GTMatt Roper-31/+45
SRIOV VF initialization has a bit of a chicken and egg design problem. Determining the IP version of the graphics and media IPs can't be done via direct register reads as it is on PF or native and instead requires querying the GuC. However initialization of the GT, including its GuC, needs to wait until after we know the IP versions so that the proper initialization steps for the platform/IP are followed. Currently the (somewhat hacky) solution is to manually fill out just enough fields in tile 0's primary GT structure to make it look as if the GT has been initialized so that the GuC can be partially initialized and queried to obtain the GMD_ID values. When the GT gets properly initialized during the regular flows, the hacked-up values will get overwritten as part of the general initialization flows. Rather than using tile 0's primary GT structure to hold the hacked up values for querying every GT on every tile, instead allocate a dedicated dummy structure. This will allow us to move the tile->primary_gt's allocation to a more consistent place later in the initialization flow in future patches (i.e., we shouldn't even allocate this GT structure if the GT is disabled/unavailable). It also helps ensure there can't be any accidental leakage of initialization or state between the dummy initialization for GMD_ID and the real driver initialization of the GT. v2: - Initialize gt->tile for temporary GT. (CI, Michal) - Use scope-based cleanup handler to free temp GT. (Michal) - Propagate actual error code from xe_gt_sriov_vf_bootstrap() rather than just setting IP version to 0.0 now that read_gmdid() can return an error. (Michal) v3: - Explicitly initialize gt to NULL, just in case something else gets inserted before the kzalloc() in the future. (Lucas) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-32-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Move 'has_flatccs' flag back to platform descriptorMatt Roper-7/+7
FlatCCS presence/absence is a flag that should be tracked at the platform level rather than the IP level. FlatCCS affects the device-wide memory initialization and reservations so its effects are not confined to a single IP block or GT. This is also a trait that should be tied to the platform even if the graphics IP itself is not present (e.g., if we disable the primary GT via configfs). Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-31-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Move 'vram_flags' flag back to platform descriptorMatt Roper-6/+5
Restrictions and requirements on VRAM alignment are something that should be tracked at the platform level rather than the IP level. Even when mixing and matching various graphics, media, and display IP blocks, the platform as a whole has to have consistent memory allocation handling. This is also a trait that should be tied to the platform even if the graphics IP itself is not present (e.g., if we disable the primary GT via configfs). Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-30-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Move 'vm_max_level' flag back to platform descriptorMatt Roper-9/+16
The number of page table levels for PPGTT virtual addresses is something that should be tracked at the platform level rather than the IP level. Even when mixing and matching various graphics, media, and display IP blocks, the platform as a whole has to have consistent page table handling. This is also a trait that should be tied to the platform even if the graphics IP itself is not present (e.g., if we disable the primary GT via configfs). v2: - Drop default value of 4 and explicitly set the value in each platform descriptor. (Lucas) v3: - Drop outdated code comment and commit message paragraph about default value. (Gustavo) v4: - Add missing setting for tgl_desc. (Gustavo) Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-29-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Move 'va_bits' flag back to platform descriptorMatt Roper-7/+16
The number of virtual address bits is something that should be tracked at the platform level rather than the IP level. Even when mixing and matching various graphics, media, and display IP blocks, the platform as a whole has to have consistent page table handling. This is also a trait that should be tied to the platform even if the graphics IP itself is not present (e.g., if we disable the primary GT via configfs). v2: - Drop the default value of 48 and explicitly set it in each relevant descriptor. (Lucas, Michal) v3: - Drop an outdated comment about default value. (Michal) Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-28-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe: Drop GT parameter to xe_display_irq_postinstall()Matt Roper-6/+5
Display interrupt handling has no relation to GT(s) on the platforms supported by the Xe driver. We only call xe_display_irq_postinstall with the first tile's primary GT, so the single condition that uses the GT pointer within the function always evaluates to true. Drop the unnecessary parameter and the condition. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-27-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/xe/huc: Adjust HuC check on primary GTMatt Roper-3/+7
The HuC initialization code determines whether a platform can have a HuC on the primary GT by checking whether tile->media_gt is NULL; old Xe1 platforms that combined render+media into a single GT will always have a NULL media_gt pointer. However once we allow media to be disabled via configfs, there will also be cases where tile->media_gt is NULL on more modern platforms, causing this condition to behave incorrectly. To handle cases where media gets disabled via configfs (or theoretical cases where media is truly fused off in hardware), change the condition to consider the graphics version of the primary GT; only the old Xe1 platforms with graphics versions 12.55 or earlier should try to initialize a HuC on the primary GT. Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://lore.kernel.org/r/20251013200944.2499947-26-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
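The revised HuC condition above can be sketched as a version comparison instead of a NULL test on the media GT pointer. The struct and function names below are hypothetical, the `major * 100 + minor` version encoding is an assumption made for the example, and the rule that a standalone media GT may host a HuC is likewise an assumption rather than something stated in the commit message.

```c
#include <stdbool.h>

/* Hypothetical GT description; version encoded as major * 100 + minor. */
struct gt_info {
	bool is_media_gt;
	unsigned int graphics_verx100;	/* e.g. 12.55 -> 1255 */
};

static bool gt_may_have_huc(const struct gt_info *gt)
{
	/* Assumption: a standalone media GT is where newer platforms keep the HuC. */
	if (gt->is_media_gt)
		return true;

	/* Primary GT: only old combined render+media IP, graphics version <= 12.55. */
	return gt->graphics_verx100 <= 1255;
}
```

Keying the decision on the graphics version keeps the result stable whether or not the media GT has been disabled.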
2025-10-14drm/xe/kunit: Fix kerneldoc for parameterized testsMatt Roper-0/+5
Kunit's generate_params() was recently updated to take an additional test context parameter. Xe's IP and platform parameter generators were updated accordingly at the same time, but the new parameter was not added to the functions' kerneldoc, resulting in the following warnings: Warning: drivers/gpu/drm/xe/tests/xe_pci.c:78 function parameter 'test' not described in 'xe_pci_fake_data_gen_params' Warning: drivers/gpu/drm/xe/tests/xe_pci.c:254 function parameter 'test' not described in 'xe_pci_graphics_ip_gen_param' Warning: drivers/gpu/drm/xe/tests/xe_pci.c:278 function parameter 'test' not described in 'xe_pci_media_ip_gen_param' Warning: drivers/gpu/drm/xe/tests/xe_pci.c:302 function parameter 'test' not described in 'xe_pci_id_gen_param' Warning: drivers/gpu/drm/xe/tests/xe_pci.c:390 function parameter 'test' not described in 'xe_pci_live_device_gen_param' 5 warnings as errors Document the new parameter to eliminate the warnings and make CI happy. Fixes: b9a214b5f6aa ("kunit: Pass parameterized test context to generate_params()") Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://lore.kernel.org/r/20251013153014.2362879-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-10-14drm/rockchip: vop: add lut_size for RK3368 vop_dataWeiHao Li-0/+1
The VOP driver needs a correct lut_size to work properly. According to the Rockchip downstream kernel source [1], the lut_size is 0x400. [1] https://github.com/rockchip-linux/kernel/blob/develop-4.4/arch/arm64/boot/dts/rockchip/rk3368.dtsi#L1497 Signed-off-by: WeiHao Li <cn.liweihao@gmail.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20250905025632.222422-3-cn.liweihao@gmail.com
2025-10-14drm/rockchip: dsi: Add support for RK3368WeiHao Li-0/+20
RK3368 has DesignWare MIPI DSI controller and an external inno D-PHY. Signed-off-by: WeiHao Li <cn.liweihao@gmail.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20250905025632.222422-2-cn.liweihao@gmail.com
2025-10-14drm/rockchip: analogix_dp: Apply devm_clk_get_optional() for &rockchip_dp_device.grfclkDamon Ding-9/+3
The "grf" clock is optional for the Rockchip eDP controller (RK3399 needs it while RK3288 and RK3588 do not). Using devm_clk_get_optional() instead of devm_clk_get() with extra checks makes the code more concise. In addition, DRM_DEV_ERROR() is replaced by dev_err_probe(). Signed-off-by: Damon Ding <damon.ding@rock-chips.com> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20250928103734.4007257-1-damon.ding@rock-chips.com
2025-10-14drm/xe/svm: Ensure data will be migrated to system if indicated by madvise.Thomas Hellström-1/+3
If the location madvise() is set to DRM_XE_PREFERRED_LOC_DEFAULT_SYSTEM, the drm_pagemap in the SVM gpu fault handler will be set to NULL. However there is nothing that explicitly migrates the data to system if it is already present in device memory. In that case, set the device memory owner to NULL to ensure data gets properly migrated to system on page-fault. v2: - Remove redundant dpagemap assignment (Himal Prasad Ghimiray) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v1 Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20251010104149.72783-2-thomas.hellstrom@linux.intel.com Fixes: 10aa5c806030 ("drm/gpusvm, drm/xe: Fix userptr to not allow device private pages")
2025-10-14drm/xe/ct: Separate waiting for retry from ct send functionTomasz Lis-25/+39
The function `guc_ct_send_locked()` is really quite simple, but still looks complex due to exposed internals. It is sending a message, and in case of lack of space, waiting for a proper moment to send a retry. Clear separation of send function and wait function will help with readability. This is a cosmetic change only, no functional difference is expected. This patch introduces `guc_ct_send_wait_for_retry()`, and uses it to greatly simplify `guc_ct_send_locked()`. Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20251009170844.178199-1-tomasz.lis@intel.com
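The split between "send" and "wait for retry" described above can be sketched with a toy channel. Everything here (`struct channel`, `try_send()`, `wait_for_retry()`, `send_with_retry()` and the status codes) is hypothetical and only models the control flow, not the GuC CT code itself.

```c
#include <stdbool.h>

enum send_status { SEND_OK = 0, SEND_NO_SPACE = -1, SEND_GAVE_UP = -2 };

/* Toy channel: a slot counter plus a retry budget. */
struct channel {
	int free_slots;
	int retries_left;
};

/* Send step only: either consume a slot or report lack of space. */
static enum send_status try_send(struct channel *ch)
{
	if (ch->free_slots == 0)
		return SEND_NO_SPACE;
	ch->free_slots--;
	return SEND_OK;
}

/* Wait step only: returns false once the retry budget is exhausted. */
static bool wait_for_retry(struct channel *ch)
{
	if (ch->retries_left == 0)
		return false;
	ch->retries_left--;
	ch->free_slots++;	/* pretend the receiver drained one slot */
	return true;
}

/* With the wait factored out, the send loop stays short. */
static enum send_status send_with_retry(struct channel *ch)
{
	for (;;) {
		enum send_status ret = try_send(ch);

		if (ret != SEND_NO_SPACE)
			return ret;
		if (!wait_for_retry(ch))
			return SEND_GAVE_UP;
	}
}
```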
2025-10-14drm/i915/display: add HAS_AUX_CCS() feature checkJani Nikula-7/+4
We should try to get rid of checks that depend on struct drm_i915_private (or struct xe_device) in display code. HAS_FLAT_CCS() is one of them. In the interest of simplicity, add a reversed HAS_AUX_CCS() feature check macro, as that's what we mostly use it for in display. v2: include adl-p (Ville) Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20251013144552.1710851-1-jani.nikula@intel.com
2025-10-14drm/i915/display: duplicate 128-byte Y-tiling feature checkJani Nikula-3/+2
We should try to get rid of checks that depend on struct drm_i915_private (or struct xe_device) in display code. HAS_128_BYTE_Y_TILING() is one of them. In the interest of simplicity, just duplicate the check as HAS_128B_Y_TILING() in display. v2: gen2 also has 128-byte Y-tile Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/2a7877f8f1d11114c1a17869bd24d83e13b1fac2.1760094361.git.jani.nikula@intel.com
2025-10-14drm/i915: include gen 2 in HAS_128_BYTE_Y_TILING()Jani Nikula-5/+5
Gen 2 platforms actually have 128-byte Y-tile, it's just different from the 128-byte Y-tile on i945+. Make the HAS_128_BYTE_Y_TILING() feature check macro and its usage slightly less convoluted by including gen 2 in it. i915_tiling_ok() would strictly not need changing, but separate the if clauses to emphasize gen 2 X-tile also being 128 bytes. Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/41bf9d67a11f38f4ab0f82740f38d5c8fe0bb58b.1760094361.git.jani.nikula@intel.com
2025-10-14Merge drm/drm-next into drm-xe-nextThomas Hellström-821/+1537
Backmerging to bring in 6.18-rc1. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-10-14drm: atmel-hlcdc: fix atmel_xlcdc_plane_setup_scaler()Cyrille Pitchen-3/+24
On SoCs, like the SAM9X75, which embed the XLCDC IP, the registers that configure the unified scaling engine were not filled with proper values. Indeed, for YCbCr formats, the VXSCFACT bitfield of the HEOCFG25 register and the HXSCFACT bitfield of the HEOCFG27 register were incorrect. For 4:2:0 formats, both the vertical and horizontal factors for the chroma channels should be half the factors for the luma channel. Hence: HEOCFG24.VXSYFACT = VFACTOR; HEOCFG25.VXSCFACT = VFACTOR / 2; HEOCFG26.HXSYFACT = HFACTOR; HEOCFG27.HXSCFACT = HFACTOR / 2. However, for 4:2:2 formats, only the horizontal factor for the chroma channels should be half the factor for the luma channel; the vertical factor is the same for all the luma and chroma channels. Hence: HEOCFG24.VXSYFACT = VFACTOR; HEOCFG25.VXSCFACT = VFACTOR; HEOCFG26.HXSYFACT = HFACTOR; HEOCFG27.HXSCFACT = HFACTOR / 2. Fixes: d498771b0b83 ("drm: atmel_hlcdc: Add support for XLCDC using IP specific driver ops") Signed-off-by: Cyrille Pitchen <cyrille.pitchen@microchip.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20241014094942.325211-1-manikandan.m@microchip.com Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
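The factor rules for the two subsampling modes can be written as one small function. This is a sketch of the arithmetic only: the enum, the struct and `compute_factors()` are hypothetical names, the struct fields merely mirror the bitfield names from the commit message, and no register access is modelled.

```c
#include <stdint.h>

enum chroma_subsampling { SUBSAMPLING_420, SUBSAMPLING_422 };

/* One field per bitfield named in the commit message. */
struct scaler_factors {
	uint32_t vxsyfact;	/* HEOCFG24: vertical, luma */
	uint32_t vxscfact;	/* HEOCFG25: vertical, chroma */
	uint32_t hxsyfact;	/* HEOCFG26: horizontal, luma */
	uint32_t hxscfact;	/* HEOCFG27: horizontal, chroma */
};

static struct scaler_factors compute_factors(enum chroma_subsampling fmt,
					     uint32_t vfactor, uint32_t hfactor)
{
	struct scaler_factors f = {
		.vxsyfact = vfactor,
		/* Chroma is vertically subsampled only for 4:2:0. */
		.vxscfact = (fmt == SUBSAMPLING_420) ? vfactor / 2 : vfactor,
		.hxsyfact = hfactor,
		/* Chroma is horizontally subsampled for both 4:2:0 and 4:2:2. */
		.hxscfact = hfactor / 2,
	};

	return f;
}
```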
2025-10-14Merge drm/drm-fixes into drm-misc-fixesThomas Zimmermann-19291/+40342
Updating drm-misc-fixes to the state of v6.18-rc1. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-10-14drm: atmel-hlcdc: update the LCDC_ATTRE register in plane atomic_disableManikandan Muralidharan-4/+16
Update the LCDC_ATTRE register in drm plane atomic_disable to handle the configuration changes of each layer when a plane is disabled. Reviewed-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20241014064644.292943-1-manikandan.m@microchip.com Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
2025-10-14drm/rockchip: vop2: use correct destination rectangle height checkAlok Tiwari-1/+1
The vop2_plane_atomic_check() function incorrectly checks drm_rect_width(dest) twice instead of verifying both width and height. Fix the second condition to use drm_rect_height(dest) so that invalid destination rectangles with height < 4 are correctly rejected. Fixes: 604be85547ce ("drm/rockchip: Add VOP2 driver") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Andy Yan <andy.yan@rock-chips.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20251012142005.660727-1-alok.a.tiwari@oracle.com
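The corrected check can be illustrated outside the kernel with a plain rectangle type. `struct rect` and the helpers below are hypothetical stand-ins for `struct drm_rect`, `drm_rect_width()` and `drm_rect_height()`. The minimum of 4 for the height comes from the commit message; applying the same minimum to the width is an assumption.

```c
#include <stdbool.h>

/* Stand-in for struct drm_rect: x2/y2 are exclusive edges. */
struct rect {
	int x1, y1, x2, y2;
};

static int rect_width(const struct rect *r)
{
	return r->x2 - r->x1;
}

static int rect_height(const struct rect *r)
{
	return r->y2 - r->y1;
}

/* Test both dimensions, not the width twice. */
static bool dest_rect_ok(const struct rect *dest)
{
	return rect_width(dest) >= 4 && rect_height(dest) >= 4;
}
```

Under the buggy form of the check, a 100x2 rectangle would have been accepted because only its width was ever examined.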
2025-10-14drm/ttm: Add safety check for NULL man->bdev in ttm_resource_manager_usageJesse.Zhang-0/+3
The `ttm_resource_manager_usage()` function currently assumes `man->bdev` is non-NULL when accessing `man->bdev->lru_lock`. However, in scenarios where the resource manager is not fully initialized (e.g., APU platforms that lack dedicated VRAM, or incomplete manager setup), `man->bdev` may remain NULL. This leads to a NULL pointer dereference when attempting to acquire the `lru_lock`, triggering kernel OOPS. Fix this by adding an explicit safety check for `man->bdev` before accessing its members: - Use `WARN_ON_ONCE(!man->bdev)` to emit a one-time warning (a soft assertion) when `man->bdev` is NULL. This helps catch invalid usage patterns during debugging without breaking production workflows. - Return 0 immediately if `man->bdev` is NULL, as a non-initialized manager cannot have valid resource usage to report. Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Link: https://lore.kernel.org/r/20251013085849.1735612-1-Jesse.Zhang@amd.com
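A userspace sketch of the "warn once, then return 0" guard described above. The types and `manager_usage()` are hypothetical, a static flag plus `fprintf()` stands in for `WARN_ON_ONCE()`, and the locking done by the real function is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the TTM device and resource manager. */
struct device {
	int unused;
};

struct manager {
	struct device *bdev;	/* NULL until the manager is fully set up */
	uint64_t usage;
};

static uint64_t manager_usage(const struct manager *man)
{
	static int warned;	/* models the "once" in WARN_ON_ONCE() */

	if (!man->bdev) {
		if (!warned) {
			warned = 1;
			fprintf(stderr, "manager used before initialization\n");
		}
		/* An uninitialized manager has no usage to report. */
		return 0;
	}

	/* The real function would take bdev->lru_lock around this read. */
	return man->usage;
}
```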
2025-10-14Merge drm/drm-next into drm-intel-nextJani Nikula-893/+1640
Sync to v6.18-rc1. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-10-14drm/draw: fix color truncation in drm_draw_fill24Francesco Valla-2/+2
The color parameter passed to drm_draw_fill24() was truncated to 16 bits, leading to an incorrect color drawn to the target iosys_map. Fix this behavior, widening the parameter to 32 bits. Fixes: 31fa2c1ca0b2 ("drm/panic: Move drawing functions to drm_draw") Signed-off-by: Francesco Valla <francesco@valla.it> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20251003-drm_draw_fill24_fix-v1-1-8fb7c1c2a893@valla.it Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
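The effect of the truncation is easy to reproduce in isolation. In this sketch `color_via_u16()` models a 24-bit color squeezed through a 16-bit parameter, and `put_pixel24()` is a hypothetical byte-wise pixel store; neither is the actual drm_draw code.

```c
#include <stdint.h>

/* Models the old behavior: the value passes through a 16-bit type. */
static uint32_t color_via_u16(uint32_t color)
{
	uint16_t narrowed = (uint16_t)color;	/* bits 16..23 are lost here */

	return narrowed;
}

/* Hypothetical 24-bit pixel store, least significant byte first. */
static void put_pixel24(uint8_t *dst, uint32_t color)
{
	dst[0] = color & 0xff;
	dst[1] = (color >> 8) & 0xff;
	dst[2] = (color >> 16) & 0xff;
}
```

With a 32-bit parameter all three bytes reach the pixel; with the 16-bit one the third byte is always zero.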
2025-10-13drm/xe/guc: Check GuC running state before deregistering exec queueShuicheng Lin-1/+12
In normal operation, a registered exec queue is disabled and deregistered through the GuC, and freed only after the GuC confirms completion. However, if the driver is forced to unbind while the exec queue is still running, the user may call exec_destroy() after the GuC has already been stopped and CT communication disabled. In this case, the driver cannot receive a response from the GuC, preventing proper cleanup of exec queue resources. Fix this by directly releasing the resources when the GuC is not running. Here is the failure dmesg log: " [ 468.089581] ---[ end trace 0000000000000000 ]--- [ 468.089608] pci 0000:03:00.0: [drm] *ERROR* GT0: GUC ID manager unclean (1/65535) [ 468.090558] pci 0000:03:00.0: [drm] GT0: total 65535 [ 468.090562] pci 0000:03:00.0: [drm] GT0: used 1 [ 468.090564] pci 0000:03:00.0: [drm] GT0: range 1..1 (1) [ 468.092716] ------------[ cut here ]------------ [ 468.092719] WARNING: CPU: 14 PID: 4775 at drivers/gpu/drm/xe/xe_ttm_vram_mgr.c:298 ttm_vram_mgr_fini+0xf8/0x130 [xe] " v2: Use xe_uc_fw_is_running() instead of xe_guc_ct_enabled(), as CT may go down and come back during VF migration. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20251010172529.2967639-2-shuicheng.lin@intel.com (cherry picked from commit 9b42321a02c50a12b2beb6ae9469606257fbecea) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
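The two teardown paths can be modelled with a toy queue object. `struct queue` and `destroy_queue()` are hypothetical; the `fw_running` flag stands in for the firmware-running check, and the flags only record which path was taken.

```c
#include <stdbool.h>

/* Toy exec queue: flags record which teardown path ran. */
struct queue {
	bool registered;
	bool dereg_pending;	/* waiting for the firmware to confirm */
	bool freed;
};

static void destroy_queue(struct queue *q, bool fw_running)
{
	if (q->registered && fw_running) {
		/* Normal path: ask the firmware, free on its confirmation. */
		q->dereg_pending = true;
		return;
	}

	/* Firmware stopped: no reply will ever arrive, so release directly. */
	q->freed = true;
}
```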
2025-10-13drm/xe: Enable media sampler power gatingVinay Belgaumkar-0/+9
Where applicable, enable media sampler power gating. Also, add it to the powergate_info debugfs. v2: Remove the sampler powergate status since it is cleared quickly anyway. v3: Use vcs mask (Rodrigo) and fix the version check for media v4: Remove extra spaces v5: Media samplers are independent of vcs mask, use Media version 1255 (Matt Roper) Fixes: 38e8c4184ea0 ("drm/xe: Enable Coarse Power Gating") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20251010011047.2047584-1-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 4cbc08649a54c3d533df9832342d52d409dfbbf0) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-10-13drm/xe: Handle mixed mappings and existing VRAM on atomic faultsMatthew Brost-1/+12
Moving to VRAM will fail if mixed mappings are present or if the page is already located in VRAM. Atomic faults that require a move to VRAM currently retry without attempting to evict mixed mappings or locate existing VRAM mappings. This patch fixes the issue by attempting to evict mixed mappings or find existing VRAM pages when a move to VRAM fails during atomic fault handling. Fixes: a9ac0fa455b0 ("drm/xe: Strict migration policy for atomic SVM faults") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20251009130629.3531962-1-matthew.brost@intel.com (cherry picked from commit 75188605c56d10c1bd3b1cd94f4872f349c3a9c8) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-10-13drm/xe/migrate: Fix an error pathThomas Hellström-1/+1
The exhaustive eviction conversion accidentally changed an error path goto to a return. Fix this. Fixes: 59eabff2a352 ("drm/xe: Convert xe_bo_create_pin_map() for exhaustive eviction") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://lore.kernel.org/r/20250910160939.103473-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 381f1ed15159c4b3f00dd37cc70924dedebeb111) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-10-13drm/xe: Move rebar to be done earlierLucas De Marchi-8/+29
There may be cases in which the BAR0 also needs to move to accommodate the bigger BAR2. However, if it's not released, the BAR2 resize fails. During the vram probe it can't be released as it's already in use by xe_mmio for early register access.

Add a new function in xe_vram and let xe_pci call it directly before even early device probe. This allows the BAR2 to resize in cases where BAR0 also needs to move, assuming there aren't other reasons to hold that move:

    [] xe 0000:03:00.0: vgaarb: deactivate vga console
    [] xe 0000:03:00.0: [drm] Attempting to resize bar from 8192MiB -> 16384MiB
    [] xe 0000:03:00.0: BAR 0 [mem 0x83000000-0x83ffffff 64bit]: releasing
    [] xe 0000:03:00.0: BAR 2 [mem 0x4000000000-0x41ffffffff 64bit pref]: releasing
    [] pcieport 0000:02:01.0: bridge window [mem 0x4000000000-0x41ffffffff 64bit pref]: releasing
    [] pcieport 0000:01:00.0: bridge window [mem 0x4000000000-0x41ffffffff 64bit pref]: releasing
    [] pcieport 0000:01:00.0: bridge window [mem 0x4000000000-0x43ffffffff 64bit pref]: assigned
    [] pcieport 0000:02:01.0: bridge window [mem 0x4000000000-0x43ffffffff 64bit pref]: assigned
    [] xe 0000:03:00.0: BAR 2 [mem 0x4000000000-0x43ffffffff 64bit pref]: assigned
    [] xe 0000:03:00.0: BAR 0 [mem 0x83000000-0x83ffffff 64bit]: assigned
    [] pcieport 0000:00:01.0: PCI bridge to [bus 01-04]
    [] pcieport 0000:00:01.0: bridge window [mem 0x83000000-0x840fffff]
    [] pcieport 0000:00:01.0: bridge window [mem 0x4000000000-0x44007fffff 64bit pref]
    [] pcieport 0000:01:00.0: PCI bridge to [bus 02-04]
    [] pcieport 0000:01:00.0: bridge window [mem 0x83000000-0x840fffff]
    [] pcieport 0000:01:00.0: bridge window [mem 0x4000000000-0x43ffffffff 64bit pref]
    [] pcieport 0000:02:01.0: PCI bridge to [bus 03]
    [] pcieport 0000:02:01.0: bridge window [mem 0x83000000-0x83ffffff]
    [] pcieport 0000:02:01.0: bridge window [mem 0x4000000000-0x43ffffffff 64bit pref]
    [] xe 0000:03:00.0: [drm] BAR2 resized to 16384M
    [] xe 0000:03:00.0: [drm:xe_pci_probe [xe]] BATTLEMAGE e221:0000 dgfx:1 gfx:Xe2_HPG (20.02) ...

For BMG, additional fixes are needed on the PCI side, but this helps get it to a working resize.

All the rebar logic is more pci-specific than xe-specific and can be done very early in the probe sequence. In future it would be good to move it out of xe_vram.c, but this refactor is left for later.

Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: stable@vger.kernel.org # 6.12+
Link: https://lore.kernel.org/intel-xe/fafda2a3-fc63-ce97-d22b-803f771a4d19@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20250918-xe-pci-rebar-2-v1-2-6c094702a074@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 45e33f220fd625492c11e15733d8e9b4f9db82a4)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
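As a rough aid for reading the log in this commit message, the sketch below checks the BAR 2 ranges against the reported sizes. It is a standalone userspace illustration, not driver code: the helper names `range_bytes` and `rebar_encoding_to_bytes` are hypothetical, and the second helper assumes the usual Resizable BAR size encoding in which a value of n stands for 1 MiB << n.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of an inclusive [start, end] memory range, as printed in the log. */
static uint64_t range_bytes(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return end - start + 1;
}

/*
 * Assumed Resizable BAR size encoding: value n means 1 MiB << n.
 * Hypothetical helper name, for illustration only.
 */
static uint64_t rebar_encoding_to_bytes(unsigned int n)
{
	assert(n < 44); /* keep the shift within 64 bits */
	return 1ULL << (n + 20);
}
```

With these helpers, the old BAR 2 range 0x4000000000-0x41ffffffff works out to 8192 MiB and the new range 0x4000000000-0x43ffffffff to 16384 MiB, matching the "resize bar from 8192MiB -> 16384MiB" line.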
2025-10-13  drm/xe: Don't allow evicting of BOs in same VM in array of VM binds  Matthew Brost  -9/+24
An array of VM binds can potentially evict other buffer objects (BOs) within the same VM under certain conditions, which may lead to NULL pointer dereferences later in the bind pipeline. To prevent this, clear the allow_res_evict flag in the xe_bo_validate call. v2: - Invert polarity of no_res_evict (Thomas) - Add comment in code explaining issue (Thomas) Cc: stable@vger.kernel.org Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6268 Fixes: 774b5fa509a9 ("drm/xe: Avoid evicting object of the same vm in none fault mode") Fixes: 77f2ef3f16f5 ("drm/xe: Lock all gpuva ops during VM bind IOCTL") Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20251009110618.3481870-1-matthew.brost@intel.com (cherry picked from commit 8b9ba8d6d95fe75fed6b0480bb03da4b321bea08) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-10-13  drm/xe: Increase global invalidation timeout to 1000us  Kenneth Graunke  -1/+1
The previous timeout of 500us seems to be too small; panning the map in the Roll20 VTT in Firefox on a KDE/Wayland desktop reliably triggered timeouts within a few seconds of usage, causing the monitor to freeze and the following to be printed to dmesg: [Jul30 13:44] xe 0000:03:00.0: [drm] *ERROR* GT0: Global invalidation timeout [Jul30 13:48] xe 0000:03:00.0: [drm] *ERROR* [CRTC:82:pipe A] flip_done timed out I haven't hit a single timeout since increasing it to 1000us even after several multi-hour testing sessions. Fixes: 0dd2dd0182bc ("drm/xe: Move DSB l2 flush to a more sensible place") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5710 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: stable@vger.kernel.org Cc: Maarten Lankhorst <dev@lankhorst.se> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://lore.kernel.org/r/20250912223254.147940-1-kenneth@whitecape.org Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 146046907b56578263434107f5a7d5051847c459) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-10-13  drm/xe: Enable 2M pages in xe_migrate_vram  Matthew Brost  -8/+45
Using 2M pages in xe_migrate_vram has two benefits: we issue fewer instructions per 2M copy (1 vs. 512), and the cache hit rate should be higher. This results in increased copy engine bandwidth, as shown by benchmark IGTs. Enable 2M pages by reserving PDEs in the migrate VM and using 2M pages in xe_migrate_vram if the DMA address order matches 2M. v2: - Reuse build_pt_update_batch_sram (Stuart) - Fix build_pt_update_batch_sram for PAGE_SIZE > 4K v3: - More fixes for PAGE_SIZE > 4K, align chunk, decrement chunk as needed - Use stack incr var in xe_migrate_vram_use_pde (Stuart) v4: - Split PAGE_SIZE > 4K fix out in different patch (Stuart) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20251013034555.4121168-3-matthew.brost@intel.com
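The "1 vs. 512" figure and the order check described in this message can be sketched as follows. This is a minimal standalone illustration, not the driver's code: `chunk_is_2m` and `pte_writes_per_2m` are hypothetical names, and it assumes a chunk is described by a base page size, an order (log2 of the page count), and a DMA address that must itself be 2M-aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K 0x1000ULL
#define SZ_2M 0x200000ULL

/*
 * Hypothetical check: a chunk qualifies for a single 2M entry when it
 * covers exactly 2M and its DMA address is 2M-aligned.
 */
static bool chunk_is_2m(uint64_t dma_addr, uint64_t page_size, unsigned int order)
{
	assert(page_size != 0 && order < 32);
	return (page_size << order) == SZ_2M && (dma_addr & (SZ_2M - 1)) == 0;
}

/* Page-table entries needed to map 2M: one 2M entry, or 512 4K entries. */
static unsigned int pte_writes_per_2m(bool use_2m)
{
	return use_2m ? 1 : (unsigned int)(SZ_2M / SZ_4K);
}
```

With 4K base pages, an order-9 chunk at a 2M-aligned address passes the check and needs one entry instead of 512; anything smaller or misaligned falls back to 4K entries.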
2025-10-13  drm/xe: Fix build_pt_update_batch_sram for non-4K PAGE_SIZE  Matthew Brost  -8/+22
The build_pt_update_batch_sram function in the Xe migrate layer assumes PAGE_SIZE == XE_PAGE_SIZE (4K), which is not a valid assumption on non-x86 platforms. This patch updates build_pt_update_batch_sram to correctly handle PAGE_SIZE > 4K by programming multiple 4K GPU pages per CPU page. v5: - Mask off non-address bits during compare Signed-off-by: Matthew Brost <matthew.brost@intel.com> Tested-by: Simon Richter <Simon.Richter@hogyros.de> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20251013034555.4121168-2-matthew.brost@intel.com
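The idea of programming multiple 4K GPU pages per CPU page might look roughly like the sketch below. It is a standalone illustration under stated assumptions, not the actual build_pt_update_batch_sram: the function `expand_cpu_pages`, its parameters, and the plain address array standing in for page-table entries are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GPU_PAGE_SIZE 4096ULL /* stands in for XE_PAGE_SIZE (4K) */

/*
 * Hypothetical helper: for each CPU page DMA address, emit one 4K GPU
 * page address per 4K slice of that CPU page. Returns the number of
 * entries written, which is capped at gpu_pte_cap.
 */
static size_t expand_cpu_pages(const uint64_t *cpu_dma, size_t n_cpu_pages,
			       uint64_t cpu_page_size,
			       uint64_t *gpu_pte, size_t gpu_pte_cap)
{
	size_t per_cpu_page, out = 0;

	assert(cpu_page_size >= GPU_PAGE_SIZE);
	assert(cpu_page_size % GPU_PAGE_SIZE == 0);
	per_cpu_page = (size_t)(cpu_page_size / GPU_PAGE_SIZE);

	for (size_t i = 0; i < n_cpu_pages; i++) {
		for (size_t j = 0; j < per_cpu_page; j++) {
			if (out == gpu_pte_cap)
				return out;
			gpu_pte[out++] = cpu_dma[i] + j * GPU_PAGE_SIZE;
		}
	}
	return out;
}
```

With a 16K CPU page size, each input address yields four consecutive 4K entries; with a 4K CPU page size the mapping stays one-to-one, which is the case the original code assumed.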
2025-10-13  drm/xe: Fix comments in xe_gt struct  Shuicheng Lin  -5/+5
Correct several spelling and grammar issues in xe_gt struct documentation to improve readability: - Fix "to not" -> "do not". - Fix "mmigrations" -> "migrations". - Fix "Multiple queues exists" -> "Multiple queues exist". - Fix "are be processed" -> "to be processed". - Fix "have being processed" -> "have been processed". These changes are purely cosmetic and do not affect functionality. v2: drop kernel-doc formatting change. (Jani) Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20251006151317.2553182-2-shuicheng.lin@intel.com
2025-10-13  drm/i915/gem: fix typo in comment (I915_EXEC_NO_RELOC)  Marlon Henrique Sanches  -1/+1
The comment referenced the flag name incorrectly as 'I915_EXEC_NORELOC' (missing underscore). This patch corrects the spelling in the comment only; there is no functional change. Signed-off-by: Marlon Henrique Sanches <marlonsanches@estudante.ufscar.br> Link: https://lore.kernel.org/r/20251013183123.438573-1-marlonsanches@estudante.ufscar.br Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-10-13  drm/amd: Drop calls to restore power limit and clock from smu_resume()  Mario Limonciello  -13/+0
User-requested power limits and clock settings are already restored as part of smu_restore_dpm_user_profile(). It's unnecessary to call the same restore as part of smu_resume(). Revert the following commits to drop that extra restore: commit ed4efe426a49 ("drm/amd: Restore cached power limit during resume") commit 796ff8a7e01b ("drm/amd: Restore cached manual clock settings during resume") commit f9b80514a722 ("drm/amd: Only restore cached manual clock settings in restore if OD enabled") Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13  drm/amdgpu: update remove after reset flag for MES remove queue  Jonathan Kim  -2/+13
The remove-queue-after-reset flag is required when removing a queue that has been successfully reset, so that the MES' internal state is cleaned up. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>