<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/block, branch v5.12</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.12</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.12'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2021-04-21T16:49:37Z</updated>
<entry>
<title>block: return -EBUSY when there are open partitions in blkdev_reread_part</title>
<updated>2021-04-21T16:49:37Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2021-04-21T16:05:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=68e6582e8f2dc32fd2458b9926564faa1fb4560e'/>
<id>urn:sha1:68e6582e8f2dc32fd2458b9926564faa1fb4560e</id>
<content type='text'>
The switch to go through blkdev_get_by_dev means we now ignore the
return value from bdev_disk_changed in __blkdev_get.  Add a manual
check to restore the old semantics.
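
A minimal sketch of the restored check (hedged; assuming the v5.12
bd_part_count field, which tracks open partitions):

	/* in blkdev_reread_part(), before reopening the device */
	if (bdev-&gt;bd_part_count)
		return -EBUSY;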

Fixes: 4601b4b130de ("block: reopen the device in blkdev_reread_part")
Reported-by: Karel Zak &lt;kzak@redhat.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20210421160502.447418-1-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: remove the unused RQF_ALLOCED flag</title>
<updated>2021-04-02T17:18:31Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2021-04-02T17:17:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f06c609645ecd043c79380fac94145926603fb33'/>
<id>urn:sha1:f06c609645ecd043c79380fac94145926603fb33</id>
<content type='text'>
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: only update parent bi_status when bio fails</title>
<updated>2021-04-01T01:18:04Z</updated>
<author>
<name>Yufen Yu</name>
<email>yuyufen@huawei.com</email>
</author>
<published>2021-03-31T11:53:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3edf5346e4f2ce2fa0c94651a90a8dda169565ee'/>
<id>urn:sha1:3edf5346e4f2ce2fa0c94651a90a8dda169565ee</id>
<content type='text'>
For multiple split bios, if one of the bios fails, the whole request
should return an error to the application. But we found there is a
race between bio_integrity_verify_fn and bio completion which returns
I/O success to the application even after one of the bios has failed.
The race is as follows:

split bio(READ)          kworker

nvme_complete_rq
blk_update_request //split error=0
  bio_endio
    bio_integrity_endio
      queue_work(kintegrityd_wq, &amp;bip-&gt;bip_work);

                         bio_integrity_verify_fn
                         bio_endio //split bio
                          __bio_chain_endio
                             if (!parent-&gt;bi_status)

                               &lt;interrupt entry&gt;
                               nvme_irq
                                 blk_update_request //parent error=7
                                 req_bio_endio
                                    bio-&gt;bi_status = 7 //parent bio
                               &lt;interrupt exit&gt;

                               parent-&gt;bi_status = 0
                        parent-&gt;bi_end_io() // return bi_status=0

The bio has been split in two: split and parent. When the split bio
completes, it depends on a kworker to do the endio, but
bio_integrity_verify_fn is interrupted by the parent bio's completion
irq handler. The parent bio-&gt;bi_status that was set in the irq
handler is then overwritten by the kworker.

In fact, even without the above race, we also need to consider the
concurrency between multiple split bio completions updating the same
parent bi_status. Normally, multiple split bios are issued to the same
hctx and complete from the same irq vector. But if the queue map has
been updated between the split bios, these bios may complete on
different hw queues and from different irq vectors. The concurrent
updates of the parent bi_status may then produce a wrong final status.
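
A sketch of the fix in __bio_chain_endio(), which only propagates a
failed status to the parent (hedged; simplified from the actual
change):

	static struct bio *__bio_chain_endio(struct bio *bio)
	{
		struct bio *parent = bio-&gt;bi_private;

		/* only overwrite the parent's status on failure */
		if (bio-&gt;bi_status &amp;&amp; !parent-&gt;bi_status)
			parent-&gt;bi_status = bio-&gt;bi_status;
		bio_put(bio);
		return parent;
	}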

Suggested-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Yufen Yu &lt;yuyufen@huawei.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Link: https://lore.kernel.org/r/20210331115359.1125679-1-yuyufen@huawei.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: don't create too many partitions</title>
<updated>2021-03-27T15:22:18Z</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2021-03-27T07:13:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e82fc7855749aa197740a60ef22c492c41ea5d5f'/>
<id>urn:sha1:e82fc7855749aa197740a60ef22c492c41ea5d5f</id>
<content type='text'>
Commit a33df75c6328 ("block: use an xarray for disk-&gt;part_tbl") drops
the check on the max supported number of partitions, and allows
partitions with bigger partition numbers to be added. However,
-&gt;bd_partno is defined as a u8, so the partition index in the xarray
may not match -&gt;bd_partno. Then delete_partition() may delete an
unmatched partition, causing a use-after-free.
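
A minimal sketch of the restored limit check (hedged; assuming the
add_partition() entry point and the DISK_MAX_PARTS constant from
genhd.h):

	/* in add_partition(); partno must fit in the u8 bd_partno */
	if (partno &gt;= DISK_MAX_PARTS)
		return ERR_PTR(-EINVAL);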

Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Reported-by: syzbot+8fede7e30c7cee0de139@syzkaller.appspotmail.com
Fixes: a33df75c6328 ("block: use an xarray for disk-&gt;part_tbl")
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: support zone append bvecs</title>
<updated>2021-03-24T17:36:51Z</updated>
<author>
<name>Johannes Thumshirn</name>
<email>johannes.thumshirn@wdc.com</email>
</author>
<published>2021-03-24T15:47:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7de55b7d6f09a2865279d3c41c0fbdbfdb87486a'/>
<id>urn:sha1:7de55b7d6f09a2865279d3c41c0fbdbfdb87486a</id>
<content type='text'>
Some time ago Christoph reported that with zoned btrfs we'll likely
trigger the WARN_ON_ONCE() in bio_iov_iter_get_pages() which checks
that we're not submitting a bvec with REQ_OP_ZONE_APPEND, but I
couldn't reproduce it back then.

Now Naohiro was able to trigger the bug as well with xfstests generic/095
on a zoned btrfs.

There is nothing that prevents bvec submissions via REQ_OP_ZONE_APPEND
as long as the hardware's zone append limit is not exceeded.
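
A sketch of the resulting dispatch in bio_iov_iter_get_pages()
(hedged; bio_iov_bvec_set_append() here stands for the helper that
also validates the queue's zone append limit):

	if (iov_iter_is_bvec(iter)) {
		if (bio_op(bio) == REQ_OP_ZONE_APPEND)
			return bio_iov_bvec_set_append(bio, iter);
		return bio_iov_bvec_set(bio, iter);
	}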

Reported-by: Naohiro Aota &lt;naohiro.aota@wdc.com&gt;
Reported-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/10bd414d9326c90cd69029077db63b363854eee5.1616600835.git.johannes.thumshirn@wdc.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: recalculate segment count for multi-segment discards correctly</title>
<updated>2021-03-23T16:39:57Z</updated>
<author>
<name>David Jeffery</name>
<email>djeffery@redhat.com</email>
</author>
<published>2021-02-11T14:38:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a958937ff166fc60d1c3a721036f6ff41bfa2821'/>
<id>urn:sha1:a958937ff166fc60d1c3a721036f6ff41bfa2821</id>
<content type='text'>
When a stacked block device inserts a request into another block device
using blk_insert_cloned_request, the request's nr_phys_segments field gets
recalculated by a call to blk_recalc_rq_segments in
blk_cloned_rq_check_limits. But blk_recalc_rq_segments does not know how to
handle multi-segment discards. For disk types which can handle
multi-segment discards like nvme, this results in discard requests which
claim a single segment when they should report several, triggering a
warning in nvme and causing nvme to fail the discard due to the
invalid state.

 WARNING: CPU: 5 PID: 191 at drivers/nvme/host/core.c:700 nvme_setup_discard+0x170/0x1e0 [nvme_core]
 ...
 nvme_setup_cmd+0x217/0x270 [nvme_core]
 nvme_loop_queue_rq+0x51/0x1b0 [nvme_loop]
 __blk_mq_try_issue_directly+0xe7/0x1b0
 blk_mq_request_issue_directly+0x41/0x70
 ? blk_account_io_start+0x40/0x50
 dm_mq_queue_rq+0x200/0x3e0
 blk_mq_dispatch_rq_list+0x10a/0x7d0
 ? __sbitmap_queue_get+0x25/0x90
 ? elv_rb_del+0x1f/0x30
 ? deadline_remove_request+0x55/0xb0
 ? dd_dispatch_request+0x181/0x210
 __blk_mq_do_dispatch_sched+0x144/0x290
 ? bio_attempt_discard_merge+0x134/0x1f0
 __blk_mq_sched_dispatch_requests+0x129/0x180
 blk_mq_sched_dispatch_requests+0x30/0x60
 __blk_mq_run_hw_queue+0x47/0xe0
 __blk_mq_delay_run_hw_queue+0x15b/0x170
 blk_mq_sched_insert_requests+0x68/0xe0
 blk_mq_flush_plug_list+0xf0/0x170
 blk_finish_plug+0x36/0x50
 xlog_cil_committed+0x19f/0x290 [xfs]
 xlog_cil_process_committed+0x57/0x80 [xfs]
 xlog_state_do_callback+0x1e0/0x2a0 [xfs]
 xlog_ioend_work+0x2f/0x80 [xfs]
 process_one_work+0x1b6/0x350
 worker_thread+0x53/0x3e0
 ? process_one_work+0x350/0x350
 kthread+0x11b/0x140
 ? __kthread_bind_mask+0x60/0x60
 ret_from_fork+0x22/0x30

This patch fixes blk_recalc_rq_segments to be aware of devices that
can handle multi-segment discards. It calculates the correct discard
segment count by counting the bios, as each discard bio is considered
its own segment.
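
A sketch of that recalculation (hedged; mirrors the described change
to blk_recalc_rq_segments(), with nr_phys_segs starting at 0):

	switch (bio_op(rq-&gt;bio)) {
	case REQ_OP_DISCARD:
	case REQ_OP_SECURE_ERASE:
		if (queue_max_discard_segments(rq-&gt;q) &gt; 1) {
			struct bio *bio = rq-&gt;bio;

			/* each discard bio counts as its own segment */
			for_each_bio(bio)
				nr_phys_segs++;
			return nr_phys_segs;
		}
		return 1;
	}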

Fixes: 1e739730c5b9 ("block: optionally merge discontiguous discard bios into a single request")
Signed-off-by: David Jeffery &lt;djeffery@redhat.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Laurence Oberman &lt;loberman@redhat.com&gt;
Link: https://lore.kernel.org/r/20210211143807.GA115624@redhat
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Discard page cache of zone reset target range</title>
<updated>2021-03-11T18:49:25Z</updated>
<author>
<name>Shin'ichiro Kawasaki</name>
<email>shinichiro.kawasaki@wdc.com</email>
</author>
<published>2021-03-11T07:25:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e5113505904ea1c1c0e1f92c1cfa91fbf4da1694'/>
<id>urn:sha1:e5113505904ea1c1c0e1f92c1cfa91fbf4da1694</id>
<content type='text'>
When a zone reset ioctl and a data read race for the same zone on
zoned block devices, the data read leaves stale page cache even though
the zone reset ioctl zero-clears all the zone data on the device. To
avoid reading non-zero data from the stale page cache after a zone
reset, discard the page cache of the reset target zones in
blkdev_zone_mgmt_ioctl(). Introduce the helper function
blkdev_truncate_zone_range() to discard the page cache. Ensure the
page cache is discarded by calling the helper function before and
after the zone reset, in the same manner as fallocate does.
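
A sketch of the helper (hedged; assuming the truncate_bdev_range() and
get_capacity() interfaces, with the zone range converted to a byte
range):

	static int blkdev_truncate_zone_range(struct block_device *bdev,
			fmode_t mode, const struct blk_zone_range *zrange)
	{
		loff_t start, end;

		/* reject ranges that wrap or exceed the device capacity */
		if (zrange-&gt;sector + zrange-&gt;nr_sectors &lt;= zrange-&gt;sector ||
		    zrange-&gt;sector + zrange-&gt;nr_sectors &gt;
				get_capacity(bdev-&gt;bd_disk))
			return -EINVAL;

		start = zrange-&gt;sector &lt;&lt; SECTOR_SHIFT;
		end = ((zrange-&gt;sector + zrange-&gt;nr_sectors)
				&lt;&lt; SECTOR_SHIFT) - 1;

		return truncate_bdev_range(bdev, mode, start, end);
	}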

This patch can be applied back to the stable kernel version v5.10.y.
Rework is needed for older stable kernels.

Signed-off-by: Shin'ichiro Kawasaki &lt;shinichiro.kawasaki@wdc.com&gt;
Fixes: 3ed05a987e0f ("blk-zoned: implement ioctls")
Cc: &lt;stable@vger.kernel.org&gt; # 5.10+
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Link: https://lore.kernel.org/r/20210311072546.678999-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Suppress uevent for hidden device when removed</title>
<updated>2021-03-11T18:48:25Z</updated>
<author>
<name>Daniel Wagner</name>
<email>dwagner@suse.de</email>
</author>
<published>2021-03-11T15:19:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9ec491447b90ad6a4056a9656b13f0b3a1e83043'/>
<id>urn:sha1:9ec491447b90ad6a4056a9656b13f0b3a1e83043</id>
<content type='text'>
register_disk() suppresses uevents for devices with the GENHD_FL_HIDDEN
flag but enables uevents again at the end in order to announce the
disk after possible partitions have been created.

When the device is removed the uevents are still enabled, and user
land sees 'remove' messages for devices which were never 'add'ed to
the system.

  KERNEL[95481.571887] remove   /devices/virtual/nvme-fabrics/ctl/nvme5/nvme0c5n1 (block)

Let's suppress the uevents for GENHD_FL_HIDDEN devices by not enabling
the uevents at all.
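
A sketch of the change in register_disk() (hedged; the point is that
the early return for hidden devices now leaves uevents suppressed
instead of re-enabling them):

	if (disk-&gt;flags &amp; GENHD_FL_HIDDEN)
		return;	/* leave uevents suppressed */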

Signed-off-by: Daniel Wagner &lt;dwagner@suse.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Martin Wilck &lt;mwilck@suse.com&gt;
Link: https://lore.kernel.org/r/20210311151917.136091-1-dwagner@suse.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: rename BIO_MAX_PAGES to BIO_MAX_VECS</title>
<updated>2021-03-11T14:47:48Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2021-03-11T11:01:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a8affc03a9b375e19bc81573de0c9108317d78c7'/>
<id>urn:sha1:a8affc03a9b375e19bc81573de0c9108317d78c7</id>
<content type='text'>
Ever since the addition of multipage bio_vecs, BIO_MAX_PAGES has been
a horribly confusing misnomer. Rename it to BIO_MAX_VECS to stop
confusing users of the bio API.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Link: https://lore.kernel.org/r/20210311110137.1132391-2-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Fix REQ_OP_ZONE_RESET_ALL handling</title>
<updated>2021-03-10T14:45:47Z</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@wdc.com</email>
</author>
<published>2021-03-10T09:09:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=faa44c69daf9ccbd5b8a1aee13e0e0d037c0be17'/>
<id>urn:sha1:faa44c69daf9ccbd5b8a1aee13e0e0d037c0be17</id>
<content type='text'>
Similarly to a single zone reset operation (REQ_OP_ZONE_RESET), execute
REQ_OP_ZONE_RESET_ALL operations with REQ_SYNC set.
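
A sketch of the change (hedged; assuming the bio setup in
blkdev_zone_mgmt(), where the all-zones reset is issued):

	bio-&gt;bi_opf = REQ_OP_ZONE_RESET_ALL | REQ_SYNC;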

Signed-off-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
