<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/nvme/host, branch master</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=master</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2026-04-13T23:22:30Z</updated>
<entry>
<title>Merge tag 'for-7.1/io_uring-20260411' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux</title>
<updated>2026-04-13T23:22:30Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-04-13T23:22:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=23acda7c221a76ff711d65f4ca90029d43b249a0'/>
<id>urn:sha1:23acda7c221a76ff711d65f4ca90029d43b249a0</id>
<content type='text'>
Pull io_uring updates from Jens Axboe:

 - Add a callback driven main loop for io_uring, and BPF struct_ops
   on top to allow implementing custom event loop logic

 - Decouple IOPOLL from being a ring-wide all-or-nothing setting,
   allowing IOPOLL use cases to also issue certain white listed
   non-polled opcodes

 - Timeout improvements. Migrate internal timeout storage from
   timespec64 to ktime_t for simpler arithmetic and avoid copying of
   timespec data

 - Zero-copy receive (zcrx) updates:

      - Add a device-less mode (ZCRX_REG_NODEV) for testing and
        experimentation where data flows through the copy fallback path

      - Fix two-step unregistration regression, DMA length calculations,
        xarray mark usage, and a potential 32-bit overflow in id
        shifting

      - Refactoring toward multi-area support: dedicated refill queue
        struct, consolidated DMA syncing, netmem array refilling format,
        and guard-based locking

 - Zero-copy transmit (zctx) cleanup:

      - Unify io_send_zc() and io_sendmsg_zc() into a single function

      - Add vectorized registered buffer send for IORING_OP_SEND_ZC

      - Add separate notification user_data via sqe-&gt;addr3 so
        notification and completion CQEs can be distinguished without
        extra reference counting

 - Switch struct io_ring_ctx internal bitfields to explicit flag bits
   with atomic-safe accessors, and annotate the known harmless races on
   those flags

 - Various optimizations caching ctx and other request fields in local
   variables to avoid repeated loads, and cleanups for tctx setup, ring
   fd registration, and read path early returns

* tag 'for-7.1/io_uring-20260411' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (58 commits)
  io_uring: unify getting ctx from passed in file descriptor
  io_uring/register: don't get a reference to the registered ring fd
  io_uring/tctx: clean up __io_uring_add_tctx_node() error handling
  io_uring/tctx: have io_uring_alloc_task_context() return tctx
  io_uring/timeout: use 'ctx' consistently
  io_uring/rw: clean up __io_read() obsolete comment and early returns
  io_uring/zcrx: use correct mmap off constants
  io_uring/zcrx: use dma_len for chunk size calculation
  io_uring/zcrx: don't clear not allocated niovs
  io_uring/zcrx: don't use mark0 for allocating xarray
  io_uring: cast id to u64 before shifting in io_allocate_rbuf_ring()
  io_uring/zcrx: reject REG_NODEV with large rx_buf_size
  io_uring/cancel: validate opcode for IORING_ASYNC_CANCEL_OP
  io_uring/rsrc: use io_cache_free() to free node
  io_uring/zcrx: rename zcrx [un]register functions
  io_uring/zcrx: check ctrl op payload struct sizes
  io_uring/zcrx: cache fallback availability in zcrx ctx
  io_uring/zcrx: warn on a repeated area append
  io_uring/zcrx: consolidate dma syncing
  io_uring/zcrx: netmem array as refiling format
  ...
</content>
</entry>
<entry>
<title>Merge tag 'for-7.1/block-20260411' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux</title>
<updated>2026-04-13T22:51:31Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-04-13T22:51:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7fe6ac157b7e15c8976bd62ad7cb98e248884e83'/>
<id>urn:sha1:7fe6ac157b7e15c8976bd62ad7cb98e248884e83</id>
<content type='text'>
Pull block updates from Jens Axboe:

 - Add shared memory zero-copy I/O support for ublk, bypassing per-I/O
   copies between kernel and userspace by matching registered buffer
   PFNs at I/O time. Includes selftests.

 - Refactor bio integrity to support filesystem initiated integrity
   operations and arbitrary buffer alignment.

 - Clean up bio allocation, splitting bio_alloc_bioset() into clear fast
   and slow paths. Add bio_await() and bio_submit_or_kill() helpers,
   unify synchronous bi_end_io callbacks.

 - Fix zone write plug refcount handling and plug removal races. Add
   support for serializing zone writes at QD=1 for rotational zoned
   devices, yielding significant throughput improvements.

 - Add SED-OPAL ioctls for Single User Mode management and a STACK_RESET
   command.

 - Add io_uring passthrough (uring_cmd) support to the BSG layer.

 - Replace pp_buf in partition scanning with struct seq_buf.

 - zloop improvements and cleanups.

 - drbd genl cleanup, switching to pre_doit/post_doit.

 - NVMe pull request via Keith:
      - Fabrics authentication updates
      - Enhanced block queue limits support
      - Workqueue usage updates
      - A new write zeroes device quirk
      - Tagset cleanup fix for loop device

 - MD pull requests via Yu Kuai:
      - Fix raid5 soft lockup in retry_aligned_read()
      - Fix raid10 deadlock with check operation and nowait requests
      - Fix raid1 overlapping writes on writemostly disks
      - Fix sysfs deadlock on array_state=clear
      - Proactive RAID-5 parity building with llbitmap, with
        write_zeroes_unmap optimization for initial sync
      - Fix llbitmap barrier ordering, rdev skipping, and bitmap_ops
        version mismatch fallback
      - Fix bcache use-after-free and uninitialized closure
      - Validate raid5 journal metadata payload size
      - Various cleanups

 - Various other fixes, improvements, and cleanups

* tag 'for-7.1/block-20260411' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (146 commits)
  ublk: fix tautological comparison warning in ublk_ctrl_reg_buf
  scsi: bsg: fix buffer overflow in scsi_bsg_uring_cmd()
  block: refactor blkdev_zone_mgmt_ioctl
  MAINTAINERS: update ublk driver maintainer email
  Documentation: ublk: address review comments for SHMEM_ZC docs
  ublk: allow buffer registration before device is started
  ublk: replace xarray with IDA for shmem buffer index allocation
  ublk: simplify PFN range loop in __ublk_ctrl_reg_buf
  ublk: verify all pages in multi-page bvec fall within registered range
  ublk: widen ublk_shmem_buf_reg.len to __u64 for 4GB buffer support
  xfs: use bio_await in xfs_zone_gc_reset_sync
  block: add a bio_submit_or_kill helper
  block: factor out a bio_await helper
  block: unify the synchronous bi_end_io callbacks
  xfs: fix number of GC bvecs
  selftests/ublk: add read-only buffer registration test
  selftests/ublk: add filesystem fio verify test for shmem_zc
  selftests/ublk: add hugetlbfs shmem_zc test for loop target
  selftests/ublk: add shared memory zero-copy test
  selftests/ublk: add UBLK_F_SHMEM_ZC support for loop target
  ...
</content>
</entry>
<entry>
<title>nvme-auth: Don't propose NVME_AUTH_DHGROUP_NULL with SC_C</title>
<updated>2026-03-27T14:35:05Z</updated>
<author>
<name>Alistair Francis</name>
<email>alistair.francis@wdc.com</email>
</author>
<published>2026-03-20T00:20:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=33eb451044498098babb93b4161e896e0a3e9291'/>
<id>urn:sha1:33eb451044498098babb93b4161e896e0a3e9291</id>
<content type='text'>
Section 8.3.4.5.2 of the NVMe 2.1 base spec states that

"""
The 00h identifier shall not be proposed in an AUTH_Negotiate message
that requests secure channel concatenation (i.e., with the SC_C field
set to a non-zero value).
"""

We need to ensure that we don't set the NVME_AUTH_DHGROUP_NULL idlist if
SC_C is set.

Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Chris Leech &lt;cleech@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Kamaljit Singh &lt;kamaljit.singh@opensource.wdc.com&gt;
Signed-off-by: Alistair Francis &lt;alistair.francis@wdc.com&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-pci: add NVME_QUIRK_DISABLE_WRITE_ZEROES for Kingston OM3SGP4</title>
<updated>2026-03-27T14:35:05Z</updated>
<author>
<name>Robert Beckett</name>
<email>bob.beckett@collabora.com</email>
</author>
<published>2026-03-20T19:22:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a8eebf9699d69987cc49cec4e4fdb4111ab32423'/>
<id>urn:sha1:a8eebf9699d69987cc49cec4e4fdb4111ab32423</id>
<content type='text'>
The Kingston OM3SGP42048K2-A00 (PCI ID 2646:502f) firmware has a race
condition when processing concurrent write zeroes and DSM (discard)
commands, causing spurious "LBA Out of Range" errors and IOMMU page
faults at address 0x0.

The issue is reliably triggered by running two concurrent mkfs commands
on different partitions of the same drive, which generates interleaved
write zeroes and discard operations.

Disable write zeroes for this device, matching the pattern used for
other Kingston OM* drives that have similar firmware issues.

Cc: stable@vger.kernel.org
Signed-off-by: Robert Beckett &lt;bob.beckett@collabora.com&gt;
Assisted-by: claude-opus-4-6-v1
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme: respect NVME_QUIRK_DISABLE_WRITE_ZEROES when wzsl is set</title>
<updated>2026-03-27T14:35:05Z</updated>
<author>
<name>Robert Beckett</name>
<email>bob.beckett@collabora.com</email>
</author>
<published>2026-03-20T19:22:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=40f0496b617b431f8d2dd94d7f785c1121f8a68a'/>
<id>urn:sha1:40f0496b617b431f8d2dd94d7f785c1121f8a68a</id>
<content type='text'>
The NVM Command Set Identify Controller data may report a non-zero
Write Zeroes Size Limit (wzsl). When present, nvme_init_non_mdts_limits()
unconditionally overrides max_zeroes_sectors from wzsl, even if
NVME_QUIRK_DISABLE_WRITE_ZEROES previously set it to zero.

This effectively re-enables write zeroes for devices that need it
disabled, defeating the quirk. Several Kingston OM* drives rely on
this quirk to avoid firmware issues with write zeroes commands.

Check for the quirk before applying the wzsl override.

Fixes: 5befc7c26e5a ("nvme: implement non-mdts command limits")
Cc: stable@vger.kernel.org
Signed-off-by: Robert Beckett &lt;bob.beckett@collabora.com&gt;
Assisted-by: claude-opus-4-6-v1
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme: set discard_granularity from NPDG/NPDA</title>
<updated>2026-03-27T14:35:04Z</updated>
<author>
<name>Caleb Sander Mateos</name>
<email>csander@purestorage.com</email>
</author>
<published>2026-02-27T20:23:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1029298da36559866ecb5ddaa3d78deb02f2ab9b'/>
<id>urn:sha1:1029298da36559866ecb5ddaa3d78deb02f2ab9b</id>
<content type='text'>
Currently, nvme_config_discard() always sets the discard_granularity
queue limit to the logical block size. However, NVMe namespaces can
advertise a larger preferred discard granularity in the NPDG or NPDA
field of the Identify Namespace structure or the NPDGL or NPDAL fields
of the I/O Command Set Specific Identify Namespace structure.

Use these fields to compute the discard_granularity limit. The logic is
somewhat involved. First, the fields are optional. NPDG is only reported
if the low bit of OPTPERF is set in NSFEAT. NPDA is reported if any bit
of OPTPERF is set. And NPDGL and NPDAL are reported if the high bit of
OPTPERF is set. NPDGL and NPDAL can also each be set to 0 to opt out of
reporting a limit. I/O Command Set Specific Identify Namespace may also
not be supported by older NVMe controllers. Another complication is that
multiple values may be reported among NPDG, NPDGL, NPDA, and NPDAL. The
spec says to prefer the values reported in the L variants. The spec says
NPDG should be a multiple of NPDA and NPDGL should be a multiple of
NPDAL, but it doesn't specify a relationship between NPDG and NPDAL or
NPDGL and NPDA. So use the maximum of the reported NPDG(L) and NPDA(L)
values as the discard_granularity.

Signed-off-by: Caleb Sander Mateos &lt;csander@purestorage.com&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme: add from0based() helper</title>
<updated>2026-03-27T14:35:04Z</updated>
<author>
<name>Caleb Sander Mateos</name>
<email>csander@purestorage.com</email>
</author>
<published>2026-02-27T20:23:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b465046c8cca9e418c7b39cfb41bb8e3b62f22f6'/>
<id>urn:sha1:b465046c8cca9e418c7b39cfb41bb8e3b62f22f6</id>
<content type='text'>
The NVMe specifications are big fans of "0's based"/"0-based" fields for
encoding values that must be positive. The encoded value is 1 less than
the value it represents. nvmet already provides a helper to0based() for
encoding 0's based values, so add a corresponding helper to decode these
fields on the host side.

Suggested-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Caleb Sander Mateos &lt;csander@purestorage.com&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme: always issue I/O Command Set specific Identify Namespace</title>
<updated>2026-03-27T14:35:04Z</updated>
<author>
<name>Caleb Sander Mateos</name>
<email>csander@purestorage.com</email>
</author>
<published>2026-02-27T20:23:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=823340b7e877b410a814177360df34810878e916'/>
<id>urn:sha1:823340b7e877b410a814177360df34810878e916</id>
<content type='text'>
Currently, the I/O Command Set specific Identify Namespace structure is
only fetched for controllers that support extended LBA formats. This is
because struct nvme_id_ns_nvm is only used by nvme_configure_pi_elbas(),
which is only called when the ELBAS bit is set in the CTRATT field of
the Identify Controller structure.

However, the I/O Command Set specific Identify Namespace structure will
soon be used in nvme_update_disk_info(), so always try to obtain it in
nvme_update_ns_info_block(). This Identify structure is first defined in
NVMe spec version 2.0, but controllers reporting older versions could
still implement it.

Signed-off-by: Caleb Sander Mateos &lt;csander@purestorage.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme: update nvme_id_ns OPTPERF constants</title>
<updated>2026-03-27T14:35:04Z</updated>
<author>
<name>Caleb Sander Mateos</name>
<email>csander@purestorage.com</email>
</author>
<published>2026-02-27T20:23:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d3c04a6ea5fd7a3d81f7c80880125108df9a4cbd'/>
<id>urn:sha1:d3c04a6ea5fd7a3d81f7c80880125108df9a4cbd</id>
<content type='text'>
In NVMe verson 2.0 and below, OPTPERF comprises only bit 4 of NSFEAT in
the Identify Namespace structure. Since version 2.1, OPTPERF includes
both bits 4 and 5 of NSFEAT. Replace the NVME_NS_FEAT_IO_OPT constant
with NVME_NS_FEAT_OPTPERF_SHIFT, NVME_NS_FEAT_OPTPERF_MASK, and
NVME_NS_FEAT_OPTPERF_MASK_2_1, representing the first bit, pre-2.1 bit
width, and post-2.1 bit width of OPTPERF.

Update nvme_update_disk_info() to check both OPTPERF bits for
controllers that report version 2.1 or newer, as NPWG and NOWS are
supported even if only bit 5 is set.

Signed-off-by: Caleb Sander Mateos &lt;csander@purestorage.com&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme: fold nvme_config_discard() into nvme_update_disk_info()</title>
<updated>2026-03-27T14:35:04Z</updated>
<author>
<name>Caleb Sander Mateos</name>
<email>csander@purestorage.com</email>
</author>
<published>2026-02-27T20:23:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9110b85244f142ca4bcaea27be408c778d3c48d0'/>
<id>urn:sha1:9110b85244f142ca4bcaea27be408c778d3c48d0</id>
<content type='text'>
The choice of what queue limits are set in nvme_update_disk_info() vs.
nvme_config_discard() seems a bit arbitrary. A subsequent commit will
compute the discard_granularity limit using struct nvme_id_ns, which is
only passed to nvme_update_disk_info() currently. So move the logic in
nvme_config_discard() to nvme_update_disk_info(). Replace several
instances of ns-&gt;ctrl in nvme_update_disk_info() with the ctrl variable
brought from nvme_config_discard().

Signed-off-by: Caleb Sander Mateos &lt;csander@purestorage.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
</feed>
