<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/virtio, branch v6.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.4</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2023-04-28T02:42:02Z</updated>
<entry>
<title>Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2023-04-28T02:42:02Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-28T02:42:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7fa8a8ee9400fe8ec188426e40e481717bc5e924'/>
<id>urn:sha1:7fa8a8ee9400fe8ec188426e40e481717bc5e924</id>
<content type='text'>
Pull MM updates from Andrew Morton:

 - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
   switching from a user process to a kernel thread.

 - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj
   Raghav.

 - zsmalloc performance improvements from Sergey Senozhatsky.

 - Yue Zhao has found and fixed some data race issues around the
   alteration of memcg userspace tunables.

 - VFS rationalizations from Christoph Hellwig:
     - removal of most of the callers of write_one_page()
     - make __filemap_get_folio()'s return value more useful

 - Luis Chamberlain has changed tmpfs so it no longer requires swap
   backing. Use `mount -o noswap'.

 - Qi Zheng has made the slab shrinkers operate locklessly, providing
   some scalability benefits.

 - Keith Busch has improved dmapool's performance, making part of its
   operations O(1) rather than O(n).

 - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
   permitting userspace to wr-protect anon memory unpopulated ptes.

 - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive
   rather than exclusive, and has fixed a bunch of errors which were
   caused by its unintuitive meaning.

 - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
   which causes minor faults to install a write-protected pte.

 - Vlastimil Babka has done some maintenance work on vma_merge():
   cleanups to the kernel code and improvements to our userspace test
   harness.

 - Cleanups to do_fault_around() by Lorenzo Stoakes.

 - Mike Rapoport has moved a lot of initialization code out of various
   mm/ files and into mm/mm_init.c.

 - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
   DRM, but DRM doesn't use it any more.

 - Lorenzo has also coverted read_kcore() and vread() to use iterators
   and has thereby removed the use of bounce buffers in some cases.

 - Lorenzo has also contributed further cleanups of vma_merge().

 - Chaitanya Prakash provides some fixes to the mmap selftesting code.

 - Matthew Wilcox changes xfs and afs so they no longer take sleeping
   locks in -&gt;map_page(), a step towards RCUification of pagefaults.

 - Suren Baghdasaryan has improved mmap_lock scalability by switching to
   per-VMA locking.

 - Frederic Weisbecker has reworked the percpu cache draining so that it
   no longer causes latency glitches on cpu isolated workloads.

 - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
   logic.

 - Liu Shixin has changed zswap's initialization so we no longer waste a
   chunk of memory if zswap is not being used.

 - Yosry Ahmed has improved the performance of memcg statistics
   flushing.

 - David Stevens has fixed several issues involving khugepaged,
   userfaultfd and shmem.

 - Christoph Hellwig has provided some cleanup work to zram's IO-related
   code paths.

 - David Hildenbrand has fixed up some issues in the selftest code's
   testing of our pte state changing.

 - Pankaj Raghav has made page_endio() unneeded and has removed it.

 - Peter Xu contributed some rationalizations of the userfaultfd
   selftests.

 - Yosry Ahmed has fixed an issue around memcg's page recalim
   accounting.

 - Chaitanya Prakash has fixed some arm-related issues in the
   selftests/mm code.

 - Longlong Xia has improved the way in which KSM handles hwpoisoned
   pages.

 - Peter Xu fixes a few issues with uffd-wp at fork() time.

 - Stefan Roesch has changed KSM so that it may now be used on a
   per-process and per-cgroup basis.

* tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
  shmem: restrict noswap option to initial user namespace
  mm/khugepaged: fix conflicting mods to collapse_file()
  sparse: remove unnecessary 0 values from rc
  mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area()
  hugetlb: pte_alloc_huge() to replace huge pte_alloc_map()
  maple_tree: fix allocation in mas_sparse_area()
  mm: do not increment pgfault stats when page fault handler retries
  zsmalloc: allow only one active pool compaction context
  selftests/mm: add new selftests for KSM
  mm: add new KSM process and sysfs knobs
  mm: add new api to enable ksm per process
  mm: shrinkers: fix debugfs file permissions
  mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
  migrate_pages_batch: fix statistics for longterm pin retry
  userfaultfd: use helper function range_in_vma()
  lib/show_mem.c: use for_each_populated_zone() simplify code
  mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()
  fs/buffer: convert create_page_buffers to folio_create_buffers
  fs/buffer: add folio_create_empty_buffers helper
  ...
</content>
</entry>
<entry>
<title>Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost</title>
<updated>2023-04-28T00:05:34Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-28T00:05:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8ccd54fe45713cd458015b5b08d6098545e70543'/>
<id>urn:sha1:8ccd54fe45713cd458015b5b08d6098545e70543</id>
<content type='text'>
Pull virtio updates from Michael Tsirkin:
 "virtio,vhost,vdpa: features, fixes, and cleanups:

   - reduction in interrupt rate in virtio

   - perf improvement for VDUSE

   - scalability for vhost-scsi

   - non power of 2 ring support for packed rings

   - better management for mlx5 vdpa

   - suspend for snet

   - VIRTIO_F_NOTIFICATION_DATA

   - shared backend with vdpa-sim-blk

   - user VA support in vdpa-sim

   - better struct packing for virtio

  and fixes, cleanups all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (52 commits)
  vhost_vdpa: fix unmap process in no-batch mode
  MAINTAINERS: make me a reviewer of VIRTIO CORE AND NET DRIVERS
  tools/virtio: fix build caused by virtio_ring changes
  virtio_ring: add a struct device forward declaration
  vdpa_sim_blk: support shared backend
  vdpa_sim: move buffer allocation in the devices
  vdpa/snet: use likely/unlikely macros in hot functions
  vdpa/snet: implement kick_vq_with_data callback
  virtio-vdpa: add VIRTIO_F_NOTIFICATION_DATA feature support
  virtio: add VIRTIO_F_NOTIFICATION_DATA feature support
  vdpa/snet: support the suspend vDPA callback
  vdpa/snet: support getting and setting VQ state
  MAINTAINERS: add vringh.h to Virtio Core and Net Drivers
  vringh: address kdoc warnings
  vdpa: address kdoc warnings
  virtio_ring: don't update event idx on get_buf
  vdpa_sim: add support for user VA
  vdpa_sim: replace the spinlock with a mutex to protect the state
  vdpa_sim: use kthread worker
  vdpa_sim: make devices agnostic for work management
  ...
</content>
</entry>
<entry>
<title>virtio-vdpa: add VIRTIO_F_NOTIFICATION_DATA feature support</title>
<updated>2023-04-21T07:02:35Z</updated>
<author>
<name>Alvaro Karsz</name>
<email>alvaro.karsz@solid-run.com</email>
</author>
<published>2023-04-13T08:18:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2c4e4a22a3b090e06191f8544d21e0e1d72ce518'/>
<id>urn:sha1:2c4e4a22a3b090e06191f8544d21e0e1d72ce518</id>
<content type='text'>
Add VIRTIO_F_NOTIFICATION_DATA support for vDPA transport.
If this feature is negotiated, the driver passes extra data when kicking
a virtqueue.

A device that offers this feature needs to implement the
kick_vq_with_data callback.

kick_vq_with_data receives the vDPA device and data.
data includes:
16 bits vqn and 16 bits next available index for split virtqueues.
16 bits vqs, 15 least significant bits of next available index
and 1 bit next_wrap for packed virtqueues.

This patch follows a patch [1] by Viktor Prutyanov which adds support
for the MMIO, channel I/O and modern PCI transports.

Signed-off-by: Alvaro Karsz &lt;alvaro.karsz@solid-run.com&gt;
Message-Id: &lt;20230413081855.36643-3-alvaro.karsz@solid-run.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
</content>
</entry>
<entry>
<title>virtio: add VIRTIO_F_NOTIFICATION_DATA feature support</title>
<updated>2023-04-21T07:02:35Z</updated>
<author>
<name>Viktor Prutyanov</name>
<email>viktor@daynix.com</email>
</author>
<published>2023-04-13T08:18:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=af8ececda185078c096852edb4e1d7a2349e6856'/>
<id>urn:sha1:af8ececda185078c096852edb4e1d7a2349e6856</id>
<content type='text'>
According to VirtIO spec v1.2, VIRTIO_F_NOTIFICATION_DATA feature
indicates that the driver passes extra data along with the queue
notifications.

In a split queue case, the extra data is 16-bit available index. In a
packed queue case, the extra data is 1-bit wrap counter and 15-bit
available index.

Add support for this feature for MMIO, channel I/O and modern PCI
transports.

Signed-off-by: Viktor Prutyanov &lt;viktor@daynix.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Reviewed-by: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Message-Id: &lt;20230413081855.36643-2-alvaro.karsz@solid-run.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
</entry>
<entry>
<title>virtio_ring: don't update event idx on get_buf</title>
<updated>2023-04-21T07:02:34Z</updated>
<author>
<name>Albert Huang</name>
<email>huangjie.albert@bytedance.com</email>
</author>
<published>2023-03-29T10:23:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6c0b057cec5eade4c3afec3908821176931a9997'/>
<id>urn:sha1:6c0b057cec5eade4c3afec3908821176931a9997</id>
<content type='text'>
In virtio_net, if we disable napi_tx, when we trigger a tx interrupt,
the vq-&gt;event_triggered will be set to true. It is then never reset
until we explicitly call virtqueue_enable_cb_delayed or
virtqueue_enable_cb_prepare.

If we disable the napi_tx, virtqueue_enable_cb* will only be called when
the tx ring is getting relatively empty.

Since event_triggered is true, VRING_AVAIL_F_NO_INTERRUPT or
VRING_PACKED_EVENT_FLAG_DISABLE will not be set. As a result we update
vring_used_event(&amp;vq-&gt;split.vring) or vq-&gt;packed.vring.driver-&gt;off_wrap
every time we call virtqueue_get_buf_ctx. This causes more interrupts.

To summarize:
1) event_triggered was set to true in vring_interrupt()
2) after this nothing will happen in virtqueue_disable_cb() so
   VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
   and then it will publish a new event index

To fix:
update VRING_AVAIL_F_NO_INTERRUPT or VRING_PACKED_EVENT_FLAG_DISABLE in
the vq when we call virtqueue_disable_cb even when event_triggered is
true.

Tested with iperf:
iperf3 tcp stream:
vm1 -----------------&gt; vm2
vm2 just receives tcp data stream from vm1, and sends acks to vm1,
there are many tx interrupts in vm2.
with the patch applied there are just a few tx interrupts.

v2-&gt;v3:
-update the interrupt disable flag even with the event_triggered is set,
-instead of checking whether event_triggered is set in
-virtqueue_get_buf_ctx_{packed/split}, will cause the drivers  which have
-not called virtqueue_{enable/disable}_cb to miss notifications.

v3-&gt;v4:
-remove change for
-"if (vq-&gt;packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE)"
-in virtqueue_disable_cb_packed

Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
Signed-off-by: Albert Huang &lt;huangjie.albert@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Message-Id: &lt;20230329102300.61000-1-huangjie.albert@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
</entry>
<entry>
<title>vdpa: Add eventfd for the vdpa callback</title>
<updated>2023-04-21T07:02:32Z</updated>
<author>
<name>Xie Yongji</name>
<email>xieyongji@bytedance.com</email>
</author>
<published>2023-03-23T05:30:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5e68470f4e80a4120e9ecec408f6ab4ad386bd4a'/>
<id>urn:sha1:5e68470f4e80a4120e9ecec408f6ab4ad386bd4a</id>
<content type='text'>
Add eventfd for the vdpa callback so that user
can signal it directly instead of triggering the
callback. It will be used for vhost-vdpa case.

Signed-off-by: Xie Yongji &lt;xieyongji@bytedance.com&gt;
Message-Id: &lt;20230323053043.35-9-xieyongji@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
</content>
</entry>
<entry>
<title>virtio-vdpa: Support interrupt affinity spreading mechanism</title>
<updated>2023-04-21T07:02:31Z</updated>
<author>
<name>Xie Yongji</name>
<email>xieyongji@bytedance.com</email>
</author>
<published>2023-03-23T05:30:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3dad56823b5332ffdbe1867b2d7b50fbacea124a'/>
<id>urn:sha1:3dad56823b5332ffdbe1867b2d7b50fbacea124a</id>
<content type='text'>
To support interrupt affinity spreading mechanism,
this makes use of group_cpus_evenly() to create
an irq callback affinity mask for each virtqueue
of vdpa device. Then we will unify set_vq_affinity
callback to pass the affinity to the vdpa device driver.

Signed-off-by: Xie Yongji &lt;xieyongji@bytedance.com&gt;
Message-Id: &lt;20230323053043.35-4-xieyongji@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
</content>
</entry>
<entry>
<title>vdpa: Add set/get_vq_affinity callbacks in vdpa_config_ops</title>
<updated>2023-04-21T07:02:31Z</updated>
<author>
<name>Xie Yongji</name>
<email>xieyongji@bytedance.com</email>
</author>
<published>2023-03-23T05:30:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1d24692732fb299c94b0dcc032b48ac8fa85c854'/>
<id>urn:sha1:1d24692732fb299c94b0dcc032b48ac8fa85c854</id>
<content type='text'>
This introduces set/get_vq_affinity callbacks in
vdpa_config_ops to support virtqueue affinity
management for vdpa device drivers.

Signed-off-by: Xie Yongji &lt;xieyongji@bytedance.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Message-Id: &lt;20230323053043.35-3-xieyongji@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
</entry>
<entry>
<title>virtio_ring: Allow non power of 2 sizes for packed virtqueue</title>
<updated>2023-04-21T07:02:31Z</updated>
<author>
<name>Feng Liu</name>
<email>feliu@nvidia.com</email>
</author>
<published>2023-03-15T18:54:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a084983dcd4e75616fe4102c690d2fc922ac32b5'/>
<id>urn:sha1:a084983dcd4e75616fe4102c690d2fc922ac32b5</id>
<content type='text'>
According to the Virtio Specification, the Queue Size parameter of a
virtqueue corresponds to the maximum number of descriptors in that
queue, and it does not have to be a power of 2 for packed virtqueues.
However, the virtio_pci_modern driver enforced a power of 2 check for
virtqueue sizes, which is unnecessary and restrictive for packed
virtuqueue.

Split virtqueue still needs to check the virtqueue size is power_of_2
which has been done in vring_alloc_queue_split of the virtio_ring layer.

To validate this change, we tested various virtqueue sizes for packed
rings, including 128, 256, 512, 100, 200, 500, and 1000, with
CONFIG_PAGE_POISONING enabled, and all tests passed successfully.

Signed-off-by: Feng Liu &lt;feliu@nvidia.com&gt;
Reviewed-by: Jiri Pirko &lt;jiri@nvidia.com&gt;

Message-Id: &lt;20230315185458.11638-2-feliu@nvidia.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
</entry>
<entry>
<title>virtio_ring: Use const to annotate read-only pointer params</title>
<updated>2023-04-21T07:02:30Z</updated>
<author>
<name>Feng Liu</name>
<email>feliu@nvidia.com</email>
</author>
<published>2023-03-10T05:34:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4b6ec919b84830949313c805c586fb34f141ce0c'/>
<id>urn:sha1:4b6ec919b84830949313c805c586fb34f141ce0c</id>
<content type='text'>
Add const to make the read-only pointer parameters clear, similar to
many existing functions.

To implement this change, the commit also introduces the use of
`container_of_const` to implement `to_vvq`, which ensures the const-ness
of read-only parameters and avoids accidental modification of their
members.

Signed-off-by: Feng Liu &lt;feliu@nvidia.com&gt;
Reviewed-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Reviewed-by: Parav Pandit &lt;parav@nvidia.com&gt;
Reviewed-by: Gavin Li &lt;gavinl@nvidia.com&gt;
Reviewed-by: Bodong Wang &lt;bodong@nvidia.com&gt;

Message-Id: &lt;20230310053428.3376-4-feliu@nvidia.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
</entry>
</feed>
