linux/drivers/gpu/host1x, branch v5.1

gpu: host1x: Program stream ID to bypass without SMMU

2019-04-11T15:40:35Z

If SMMU support is not available, fall back to programming the bypass stream ID (0x7f). Fixes: de5469c21ff9 ("gpu: host1x: Program the channel stream ID") Suggested-by: Mikko Perttunen Signed-off-by: Arnd Bergmann Reviewed-by: Mikko Perttunen [treding@nvidia.com: rebase this on top of a later build fix] Signed-off-by: Thierry Reding

gpu: host1x: Fix compile error when IOMMU API is not available

2019-04-11T08:35:39Z

In case the IOMMU API is not available compiling host1x fails with the following error: In file included from drivers/gpu/host1x/hw/host1x06.c:27: drivers/gpu/host1x/hw/channel_hw.c: In function ‘host1x_channel_set_streamid’: drivers/gpu/host1x/hw/channel_hw.c:118:30: error: implicit declaration of function ‘dev_iommu_fwspec_get’; did you mean ‘iommu_fwspec_free’? [-Werror=implicit-function-declaration] struct iommu_fwspec *spec = dev_iommu_fwspec_get(channel->dev->parent); ^~~~~~~~~~~~~~~~~~~~ iommu_fwspec_free Fixes: de5469c21ff9 ("gpu: host1x: Program the channel stream ID") Signed-off-by: Stefan Agner Signed-off-by: Thierry Reding

gpu: host1x: Continue CDMA execution starting with a next job

2019-02-07T17:34:25Z

Currently gathers of a hung job are getting NOP'ed and a restarted CDMA executes the NOP'ed gathers. There shouldn't be a reason to not restart CDMA execution starting with a next job, avoiding the unnecessary churning with gathers NOP'ing. Signed-off-by: Dmitry Osipenko Reviewed-by: Mikko Perttunen Signed-off-by: Thierry Reding

gpu: host1x: Don't complete a completed job

2019-02-07T17:30:58Z

There is a chance that the last job has been completed at the time of CDMA timeout handler invocation. In this case there is no need to complete the completed job. Signed-off-by: Dmitry Osipenko Reviewed-by: Mikko Perttunen Signed-off-by: Thierry Reding

gpu: host1x: Cancel only job that actually got stuck

2019-02-07T17:30:24Z

Host1x doesn't have information about jobs inter-dependency, that is something that will become available once host1x will get a proper jobs scheduler implementation. Currently a hang job causes other unrelated jobs to be canceled, that is a relic from downstream driver which is irrelevant to upstream. Let's cancel only the hanging job and not to touch other jobs in queue. Signed-off-by: Dmitry Osipenko Reviewed-by: Mikko Perttunen Signed-off-by: Thierry Reding

gpu: host1x: Optimize CDMA push buffer memory usage

2019-02-07T17:28:59Z

The host1x CDMA push buffer is terminated by a special opcode (RESTART) that tells the CDMA to wrap around to the beginning of the push buffer. To accomodate the RESTART opcode, an extra 4 bytes are allocated on top of the 512 * 8 = 4096 bytes needed for the 512 slots (1 slot = 2 words) that are used for other commands passed to CDMA. This requires that two memory pages are allocated, but most of the second page (4092 bytes) is never used. Decrease the number of slots to 511 so that the RESTART opcode fits within the page. Adjust the push buffer wraparound code to take into account push buffer sizes that are not a power of two. Signed-off-by: Thierry Reding

gpu: host1x: Use correct semantics for HOST1X_CHANNEL_DMAEND

2019-02-07T17:28:58Z

The HOST1X_CHANNEL_DMAEND is an offset relative to the value written to the HOST1X_CHANNEL_DMASTART register, but it is currently treated as an absolute address. This can cause SMMU faults if the CDMA fetches past a pushbuffer's IOMMU mapping. Properly setting the DMAEND prevents the CDMA from fetching beyond that address and avoid such issues. This is currently not observed because a whole (almost) page of essentially scratch space absorbs any excessive prefetching by CDMA. However, changing the number of slots in the push buffer can trigger these SMMU faults. Signed-off-by: Thierry Reding

gpu: host1x: Support 40-bit addressing on Tegra186

2019-02-07T17:28:58Z

The host1x and clients instantiated on Tegra186 support addressing 40 bits of memory. Signed-off-by: Thierry Reding

gpu: host1x: Restrict IOVA space to DMA mask

2019-02-07T17:28:57Z

On Tegra186 and later, the ARM SMMU provides an input address space that is 48 bits wide. However, memory clients can only address up to 40 bits. If the geometry is used as-is, allocations of IOVA space can end up in a region that is not addressable by the memory clients. To fix this, restrict the IOVA space to the DMA mask of the host1x device. Signed-off-by: Thierry Reding

gpu: host1x: Support 40-bit addressing

2019-02-07T17:28:35Z

Tegra186 and later support 40 bits of address space. Additional registers need to be programmed to store the full 40 bits of push buffer addresses. Since command stream gathers can also reside in buffers in a 40-bit address space, a new variant of the GATHER opcode is also introduced. It takes two parameters: the first parameter contains the lower 32 bits of the address and the second parameter contains bits 32 to 39. Signed-off-by: Thierry Reding