<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/io_uring/poll.c, branch v6.19</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.19</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.19'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-12-05T17:23:28Z</updated>
<entry>
<title>io_uring/poll: unify poll waitqueue entry and list removal</title>
<updated>2025-12-05T17:23:28Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2025-12-05T17:20:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=55d57b3bcc7efcab812a8179e2dc17d781302997'/>
<id>urn:sha1:55d57b3bcc7efcab812a8179e2dc17d781302997</id>
<content type='text'>
For some cases, the order in which the waitq entry list and head
writing happens is important, for others it doesn't really matter.
But it's somewhat confusing to have them spread out over the file.

Abstract out the nicely documented code in io_pollfree_wake() and
move it into a helper, and use that helper consistently rather than
having other call sites manually do the same thing. While at it,
correct a comment function name as well.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/poll: correctly handle io_poll_add() return value on update</title>
<updated>2025-12-04T14:18:01Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2025-12-01T20:25:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=84230ad2d2afbf0c44c32967e525c0ad92e26b4e'/>
<id>urn:sha1:84230ad2d2afbf0c44c32967e525c0ad92e26b4e</id>
<content type='text'>
When the core of io_uring was updated to handle completions
consistently and with fixed return codes, the POLL_REMOVE opcode
with updates got slightly broken. If a POLL_ADD is pending and
then POLL_REMOVE is used to update the events of that request, if that
update causes the POLL_ADD to now trigger, then that completion is lost
and a CQE is never posted.

Additionally, ensure that if an update does cause an existing POLL_ADD
to complete, that the completion value isn't always overwritten with
-ECANCELED. For that case, whatever io_poll_add() set the value to
should just be retained.

Cc: stable@vger.kernel.org
Fixes: 97b388d70b53 ("io_uring: handle completions in the core")
Reported-by: syzbot+641eec6b7af1f62f2b99@syzkaller.appspotmail.com
Tested-by: syzbot+641eec6b7af1f62f2b99@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring: add wrapper type for io_req_tw_func_t arg</title>
<updated>2025-11-03T15:31:26Z</updated>
<author>
<name>Caleb Sander Mateos</name>
<email>csander@purestorage.com</email>
</author>
<published>2025-10-31T20:34:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c33e779aba6804778c1440192a8033a145ba588d'/>
<id>urn:sha1:c33e779aba6804778c1440192a8033a145ba588d</id>
<content type='text'>
In preparation for uring_cmd implementations to implement functions
with the io_req_tw_func_t signature, introduce a wrapper struct
io_tw_req to hide the struct io_kiocb * argument. The intention is for
only the io_uring core to access the inner struct io_kiocb *. uring_cmd
implementations should instead call a helper from io_uring/cmd.h to
convert struct io_tw_req to struct io_uring_cmd *.

Signed-off-by: Caleb Sander Mateos &lt;csander@purestorage.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring: unify task_work cancelation checks</title>
<updated>2025-10-20T16:37:48Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2025-09-23T10:25:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7be20254a743be4f02414b9d56cc3fe5f84e6500'/>
<id>urn:sha1:7be20254a743be4f02414b9d56cc3fe5f84e6500</id>
<content type='text'>
Rather than do per-tw checking, which needs to dip into the task_struct
for checking flags, do it upfront before running task_work. This places
a 'cancel' member in io_tw_token_t, which is assigned before running
task_work for that given ctx.

This is both more efficient in doing it upfront rather than for every
task_work, and it means that io_should_terminate_tw() can be made
private in io_uring.c rather than need to be called by various
callbacks of task_work.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-6.18/io_uring-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux</title>
<updated>2025-10-02T16:56:23Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-10-02T16:56:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5832d26433f2bd0d28f8b12526e3c2fdb203507f'/>
<id>urn:sha1:5832d26433f2bd0d28f8b12526e3c2fdb203507f</id>
<content type='text'>
Pull io_uring updates from Jens Axboe:

 - Store ring provided buffers locally for the users, rather than stuff
   them into struct io_kiocb.

   These types of buffers must always be fully consumed or recycled in
   the current context, and leaving them in struct io_kiocb is hence not
   a good ideas as that struct has a vastly different life time.

   Basically just an architecture cleanup that can help prevent issues
   with ring provided buffers in the future.

 - Support for mixed CQE sizes in the same ring.

   Before this change, a CQ ring either used the default 16b CQEs, or it
   was setup with 32b CQE using IORING_SETUP_CQE32. For use cases where
   a few 32b CQEs were needed, this caused everything else to use big
   CQEs. This is wasteful both in terms of memory usage, but also memory
   bandwidth for the posted CQEs.

   With IORING_SETUP_CQE_MIXED, applications may use request types that
   post both normal 16b and big 32b CQEs on the same ring.

 - Add helpers for async data management, to make it harder for opcode
   handlers to mess it up.

 - Add support for multishot for uring_cmd, which ublk can use. This
   helps improve efficiency, by providing a persistent request type that
   can trigger multiple CQEs.

 - Add initial support for ring feature querying.

   We had basic support for probe operations, but the API isn't great.
   Rather than expand that, add support for QUERY which is easily
   expandable and can cover a lot more cases than the existing probe
   support. This will help applications get a better idea of what
   operations are supported on a given host.

 - zcrx improvements from Pavel:
        - Improve refill entry alignment for better caching
        - Various cleanups, especially around deduplicating normal
          memory vs dmabuf setup.
        - Generalisation of the niov size (Patch 12). It's still hard
          coded to PAGE_SIZE on init, but will let the user to specify
          the rx buffer length on setup.
        - Syscall / synchronous bufer return. It'll be used as a slow
          fallback path for returning buffers when the refill queue is
          full. Useful for tolerating slight queue size misconfiguration
          or with inconsistent load.
        - Accounting more memory to cgroups.
        - Additional independent cleanups that will also be useful for
          mutli-area support.

 - Various fixes and cleanups

* tag 'for-6.18/io_uring-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (68 commits)
  io_uring/cmd: drop unused res2 param from io_uring_cmd_done()
  io_uring: fix nvme's 32b cqes on mixed cq
  io_uring/query: cap number of queries
  io_uring/query: prevent infinite loops
  io_uring/zcrx: account niov arrays to cgroup
  io_uring/zcrx: allow synchronous buffer return
  io_uring/zcrx: introduce io_parse_rqe()
  io_uring/zcrx: don't adjust free cache space
  io_uring/zcrx: use guards for the refill lock
  io_uring/zcrx: reduce netmem scope in refill
  io_uring/zcrx: protect netdev with pp_lock
  io_uring/zcrx: rename dma lock
  io_uring/zcrx: make niov size variable
  io_uring/zcrx: set sgt for umem area
  io_uring/zcrx: remove dmabuf_offset
  io_uring/zcrx: deduplicate area mapping
  io_uring/zcrx: pass ifq to io_zcrx_alloc_fallback()
  io_uring/zcrx: check all niovs filled with dma addresses
  io_uring/zcrx: move area reg checks into io_import_area
  io_uring/zcrx: don't pass slot to io_zcrx_create_area
  ...
</content>
</entry>
<entry>
<title>io_uring: include dying ring in task_work "should cancel" state</title>
<updated>2025-09-18T16:24:50Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2025-09-18T16:21:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3539b1467e94336d5854ebf976d9627bfb65d6c3'/>
<id>urn:sha1:3539b1467e94336d5854ebf976d9627bfb65d6c3</id>
<content type='text'>
When running task_work for an exiting task, rather than perform the
issue retry attempt, the task_work is canceled. However, this isn't
done for a ring that has been closed. This can lead to requests being
successfully completed post the ring being closed, which is somewhat
confusing and surprising to an application.

Rather than just check the task exit state, also include the ring
ref state in deciding whether or not to terminate a given request when
run from task_work.

Cc: stable@vger.kernel.org # 6.1+
Link: https://github.com/axboe/liburing/discussions/1459
Reported-by: Benedek Thaler &lt;thaler@thaler.hu&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring: remove async/poll related provided buffer recycles</title>
<updated>2025-08-24T17:41:12Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2025-08-21T02:03:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e973837b54024f070b2b48c7ee9725548548257a'/>
<id>urn:sha1:e973837b54024f070b2b48c7ee9725548548257a</id>
<content type='text'>
These aren't necessary anymore, get rid of them.

Link: https://lore.kernel.org/r/20250821020750.598432-13-axboe@kernel.dk
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/kbuf: switch to storing struct io_buffer_list locally</title>
<updated>2025-08-24T17:41:12Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2025-08-21T02:03:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5fda51255439addd1c9059098e30847a375a1008'/>
<id>urn:sha1:5fda51255439addd1c9059098e30847a375a1008</id>
<content type='text'>
Currently the buffer list is stored in struct io_kiocb. The buffer list
can be of two types:

1) Classic/legacy buffer list. These don't need to get referenced after
   a buffer pick, and hence storing them in struct io_kiocb is perfectly
   fine.

2) Ring provided buffer lists. These DO need to be referenced after the
   initial buffer pick, as they need to get consumed later on. This can
   be either just incrementing the head of the ring, or it can be
   consuming parts of a buffer if incremental buffer consumptions has
   been configured.

For case 2, io_uring needs to be careful not to access the buffer list
after the initial pick-and-execute context. The core does recycling of
these, but it's easy to make a mistake, because it's stored in the
io_kiocb which does persist across multiple execution contexts. Either
because it's a multishot request, or simply because it needed some kind
of async trigger (eg poll) for retry purposes.

Add a struct io_buffer_list to struct io_br_sel, which is always on
stack for the various users of it. This prevents the buffer list from
leaking outside of that execution context, and additionally it enables
kbuf to not even pass back the struct io_buffer_list if the given
context isn't appropriately locked already.

This doesn't fix any bugs, it's simply a defensive measure to prevent
any issues with reuse of a buffer list.

Link: https://lore.kernel.org/r/20250821020750.598432-12-axboe@kernel.dk
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/kbuf: pass in struct io_buffer_list to commit/recycle helpers</title>
<updated>2025-08-24T17:41:12Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2025-08-21T02:03:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1b5add75d7c894c62506c9b55f1d9eaadae50ef1'/>
<id>urn:sha1:1b5add75d7c894c62506c9b55f1d9eaadae50ef1</id>
<content type='text'>
Rather than have this implied being in the io_kiocb, pass it in directly
so it's immediately obvious where these users of -&gt;buf_list are coming
from.

Link: https://lore.kernel.org/r/20250821020750.598432-6-axboe@kernel.dk
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-6.17/io_uring-20250728' of git://git.kernel.dk/linux</title>
<updated>2025-07-28T23:30:12Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-07-28T23:30:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c3018a2c6adae9b32f7b9259f5b38257ba9a758e'/>
<id>urn:sha1:c3018a2c6adae9b32f7b9259f5b38257ba9a758e</id>
<content type='text'>
Pull io_uring updates from Jens Axboe:

 - Optimization to avoid reference counts on non-cloned registered
   buffers. This is how these buffers were handled prior to having
   cloning support, and we can still use that approach as long as the
   buffers haven't been cloned to another ring.

 - Cleanup and improvement for uring_cmd, where btrfs was the only user
   of storing allocated data for the lifetime of the uring_cmd. Clean
   that up so we can get rid of the need to do that.

 - Avoid unnecessary memory copies in uring_cmd usage. This is
   particularly important as a lot of uring_cmd usage necessitates the
   use of 128b SQEs.

 - A few updates for recv multishot, where it's now possible to add
   fairness limits for limiting how much is transferred for each retry
   loop. Additionally, recv multishot now supports an overall cap as
   well, where once reached the multishot recv will terminate. The
   latter is useful for buffer management and juggling many recv streams
   at the same time.

 - Add support for returning the TX timestamps via a new socket command.
   This feature can work in either singleshot or multishot mode, where
   the latter triggers a completion whenever new timestamps are
   available. This is an alternative to using the existing error queue.

 - Add support for an io_uring "mock" file, which is the start of being
   able to do 100% targeted testing in terms of exercising io_uring
   request handling. The idea is to have a file type that can be
   anything the tester would like, and behave exactly how you want it to
   behave in terms of hitting the code paths you want.

 - Improve zcrx by using sgtables to de-duplicate and improve dma
   address handling.

 - Prep work for supporting larger pages for zcrx.

 - Various little improvements and fixes.

* tag 'for-6.17/io_uring-20250728' of git://git.kernel.dk/linux: (42 commits)
  io_uring/zcrx: fix leaking pages on sg init fail
  io_uring/zcrx: don't leak pages on account failure
  io_uring/zcrx: fix null ifq on area destruction
  io_uring: fix breakage in EXPERT menu
  io_uring/cmd: remove struct io_uring_cmd_data
  btrfs/ioctl: store btrfs_uring_encoded_data in io_btrfs_cmd
  io_uring/cmd: introduce IORING_URING_CMD_REISSUE flag
  io_uring/zcrx: account area memory
  io_uring: export io_[un]account_mem
  io_uring/net: Support multishot receive len cap
  io_uring: deduplicate wakeup handling
  io_uring/net: cast min_not_zero() type
  io_uring/poll: cleanup apoll freeing
  io_uring/net: allow multishot receive per-invocation cap
  io_uring/net: move io_sr_msg-&gt;retry_flags to io_sr_msg-&gt;flags
  io_uring/net: use passed in 'len' in io_recv_buf_select()
  io_uring/zcrx: prepare fallback for larger pages
  io_uring/zcrx: assert area type in io_zcrx_iov_page
  io_uring/zcrx: allocate sgtable for umem areas
  io_uring/zcrx: introduce io_populate_area_dma
  ...
</content>
</entry>
</feed>
