linux/drivers/vhost, branch v6.17

vhost-net: flush batched before enabling notifications

2025-09-19T08:15:26Z

Commit 8c2e6b26ffe2 ("vhost/net: Defer TX queue re-enable until after sendmsg") tries to defer the notification enabling by moving the logic out of the loop after the vhost_tx_batch() when nothing new is spotted. This caused unexpected side effects as the new logic is reused for several other error conditions. A previous patch reverted 8c2e6b26ffe2. Now, bring the performance back up by flushing batched buffers before enabling notifications. Reported-by: Jon Kohler Cc: stable@vger.kernel.org Fixes: 8c2e6b26ffe2 ("vhost/net: Defer TX queue re-enable until after sendmsg") Signed-off-by: Jason Wang Signed-off-by: Michael S. Tsirkin Message-Id: <20250917063045.2042-3-jasowang@redhat.com>

Revert "vhost/net: Defer TX queue re-enable until after sendmsg"

2025-09-19T08:15:26Z

This reverts commit 8c2e6b26ffe243be1e78f5a4bfb1a857d6e6f6d6. It tries to defer the notification enabling by moving the logic out of the loop after the vhost_tx_batch() when nothing new is spotted. This will bring side effects as the new logic would be reused for several other error conditions. One example is the IOTLB: when there's an IOTLB miss, get_tx_bufs() might return -EAGAIN and exit the loop and see there's still available buffers, so it will queue the tx work again until userspace feed the IOTLB entry correctly. This will slowdown the tx processing and trigger the TX watchdog in the guest as reported in https://lkml.org/lkml/2025/9/10/1596. To fix, revert the change. A follow up patch will bring the performance back in a safe way. Reported-by: Jon Kohler Cc: stable@vger.kernel.org Fixes: 8c2e6b26ffe2 ("vhost/net: Defer TX queue re-enable until after sendmsg") Signed-off-by: Jason Wang Signed-off-by: Michael S. Tsirkin Message-Id: <20250917063045.2042-2-jasowang@redhat.com>

vhost-net: unbreak busy polling

2025-09-19T08:15:26Z

Commit 67a873df0c41 ("vhost: basic in order support") pass the number of used elem to vhost_net_rx_peek_head_len() to make sure it can signal the used correctly before trying to do busy polling. But it forgets to clear the count, this would cause the count run out of sync with handle_rx() and break the busy polling. Fixing this by passing the pointer of the count and clearing it after the signaling the used. Acked-by: Michael S. Tsirkin Cc: stable@vger.kernel.org Fixes: 67a873df0c41 ("vhost: basic in order support") Signed-off-by: Jason Wang Message-Id: <20250917063045.2042-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin

vhost-scsi: fix argument order in tport allocation error message

2025-09-15T22:25:58Z

The error log in vhost_scsi_make_tport() prints the arguments in the wrong order, producing confusing output. For example, when creating a target with a name in WWNN format such as "fc.port1234", the log looks like: Emulated fc.port1234 Address: FCP, exceeds max: 64 Instead, the message should report the emulated protocol type first, followed by the configfs name as: Emulated FCP Address: fc.port1234, exceeds max: 64 Fix the argument order so the error log is consistent and clear. Signed-off-by: Alok Tiwari Message-Id: <20250913154106.3995856-1-alok.a.tiwari@oracle.com> Signed-off-by: Michael S. Tsirkin Reviewed-by: Stefan Hajnoczi

vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()

2025-08-26T07:38:10Z

When operating on struct vhost_net_ubuf_ref, the following execution sequence is theoretically possible: CPU0 is finalizing DMA operation CPU1 is doing VHOST_NET_SET_BACKEND // ubufs->refcount == 2 vhost_net_ubuf_put() vhost_net_ubuf_put_wait_and_free(oldubufs) vhost_net_ubuf_put_and_wait() vhost_net_ubuf_put() int r = atomic_sub_return(1, &ubufs->refcount); // r = 1 int r = atomic_sub_return(1, &ubufs->refcount); // r = 0 wait_event(ubufs->wait, !atomic_read(&ubufs->refcount)); // no wait occurs here because condition is already true kfree(ubufs); if (unlikely(!r)) wake_up(&ubufs->wait); // use-after-free This leads to use-after-free on ubufs access. This happens because CPU1 skips waiting for wake_up() when refcount is already zero. To prevent that use a read-side RCU critical section in vhost_net_ubuf_put(), as suggested by Hillf Danton. For this lock to take effect, free ubufs with kfree_rcu(). Cc: stable@vger.kernel.org Fixes: 0ad8b480d6ee9 ("vhost: fix ref cnt checking deadlock") Reported-by: Andrey Ryabinin Suggested-by: Hillf Danton Signed-off-by: Nikolay Kuratov Message-Id: <20250805130917.727332-1-kniv@yandex-team.ru> Signed-off-by: Michael S. Tsirkin

vhost: initialize vq->nheads properly

2025-08-05T09:57:40Z

Commit 7918bb2d19c9 ("vhost: basic in order support") introduces vq->nheads to store the number of batched used buffers per used elem but it forgets to initialize the vq->nheads to NULL in vhost_dev_init() this will cause kfree() that would try to free it without be allocated if SET_OWNER is not called. Reported-by: JAEHOON KIM Reported-by: Breno Leitao Fixes: 45347e79b544 ("vhost: basic in order support") Signed-off-by: Jason Wang Message-Id: <20250729073916.80647-1-jasowang@redhat.com> Reviewed-by: Dawid Osuchowski Tested-by: Breno Leitao Reviewed-by: Stefano Garzarella Tested-by: Jaehoon Kim Signed-off-by: Michael S. Tsirkin

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

2025-08-01T21:17:48Z

Pull virtio updates from Michael Tsirkin: - vhost can now support legacy threading if enabled in Kconfig - vsock memory allocation strategies for large buffers have been improved, reducing pressure on kmalloc - vhost now supports the in-order feature. guest bits missed the merge window. - fixes, cleanups all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (30 commits) vsock/virtio: Allocate nonlinear SKBs for handling large transmit buffers vsock/virtio: Rename virtio_vsock_skb_rx_put() vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers vsock/virtio: Move SKB allocation lower-bound check to callers vsock/virtio: Rename virtio_vsock_alloc_skb() vsock/virtio: Resize receive buffers so that each SKB fits in a 4K page vsock/virtio: Move length check to callers of virtio_vsock_skb_rx_put() vsock/virtio: Validate length in packet header before skb_put() vhost/vsock: Avoid allocating arbitrarily-sized SKBs vhost_net: basic in_order support vhost: basic in order support vhost: fail early when __vhost_add_used() fails vhost: Reintroduce kthread API and add mode selection vdpa: Fix IDR memory leak in VDUSE module exit vdpa/mlx5: Fix release of uninitialized resources on error path vhost-scsi: Fix check for inline_sg_cnt exceeding preallocated limit virtio: virtio_dma_buf: fix missing parameter documentation vhost: Fix typos vhost: vringh: Remove unused functions vhost: vringh: Remove unused iotlb functions ...

vsock/virtio: Rename virtio_vsock_skb_rx_put()

2025-08-01T13:11:09Z

In preparation for using virtio_vsock_skb_rx_put() when populating SKBs on the vsock TX path, rename virtio_vsock_skb_rx_put() to virtio_vsock_skb_put(). No functional change. Reviewed-by: Stefano Garzarella Signed-off-by: Will Deacon Message-Id: <20250717090116.11987-9-will@kernel.org> Signed-off-by: Michael S. Tsirkin

vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers

2025-08-01T13:11:09Z

When receiving a packet from a guest, vhost_vsock_handle_tx_kick() calls vhost_vsock_alloc_linear_skb() to allocate and fill an SKB with the receive data. Unfortunately, these are always linear allocations and can therefore result in significant pressure on kmalloc() considering that the maximum packet size (VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM) is a little over 64KiB, resulting in a 128KiB allocation for each packet. Rework the vsock SKB allocation so that, for sizes with page order greater than PAGE_ALLOC_COSTLY_ORDER, a nonlinear SKB is allocated instead with the packet header in the SKB and the receive data in the fragments. Finally, add a debug warning if virtio_vsock_skb_rx_put() is ever called on an SKB with a non-zero length, as this would be destructive for the nonlinear case. Reviewed-by: Stefano Garzarella Signed-off-by: Will Deacon Message-Id: <20250717090116.11987-8-will@kernel.org> Signed-off-by: Michael S. Tsirkin

vsock/virtio: Move SKB allocation lower-bound check to callers

2025-08-01T13:11:09Z

virtio_vsock_alloc_linear_skb() checks that the requested size is at least big enough for the packet header (VIRTIO_VSOCK_SKB_HEADROOM). Of the three callers of virtio_vsock_alloc_linear_skb(), only vhost_vsock_alloc_skb() can potentially pass a packet smaller than the header size and, as it already has a check against the maximum packet size, extend its bounds checking to consider the minimum packet size and remove the check from virtio_vsock_alloc_linear_skb(). Reviewed-by: Stefano Garzarella Signed-off-by: Will Deacon Message-Id: <20250717090116.11987-7-will@kernel.org> Signed-off-by: Michael S. Tsirkin