<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/block/elevator.c, branch v5.8</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.8</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.8'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-11-07T19:27:19Z</updated>
<entry>
<title>Merge branch 'for-linus' into for-5.5/block</title>
<updated>2019-11-07T19:27:19Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2019-11-07T19:27:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=912c0a85911a6364ac127b6e1de8c2c7782db1bc'/>
<id>urn:sha1:912c0a85911a6364ac127b6e1de8c2c7782db1bc</id>
<content type='text'>
Pull on for-linus to resolve what otherwise would have been a conflict
with the cgroups rstat patchset from Tejun.

* for-linus: (942 commits)
  blkcg: make blkcg_print_stat() print stats only for online blkgs
  nvme: change nvme_passthru_cmd64 to explicitly mark rsvd
  nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
  nvme-rdma: fix a segmentation fault during module unload
  iocost: don't nest spin_lock_irq in ioc_weight_write()
  io_uring: ensure we clear io_kiocb-&gt;result before each issue
  um-ubd: Entrust re-queue to the upper layers
  nvme-multipath: remove unused groups_only mode in ana log
  nvme-multipath: fix possible io hang after ctrl reconnect
  io_uring: don't touch ctx in setup after ring fd install
  io_uring: Fix leaked shadow_req
  Linux 5.4-rc5
  riscv: cleanup do_trap_break
  nbd: verify socket is supported during setup
  ata: libahci_platform: Fix regulator_get_optional() misuse
  nbd: handle racing with error'ed out commands
  nbd: protect cmd-&gt;status with cmd-&gt;lock
  io_uring: fix bad inflight accounting for SETUP_IOPOLL|SETUP_SQTHREAD
  io_uring: used cached copies of sq-&gt;dropped and cq-&gt;overflow
  ARM: dts: stm32: relax qspi pins slew-rate for stm32mp157
  ...
</content>
</entry>
<entry>
<title>block: Warn if elevator= parameter is used</title>
<updated>2019-11-06T14:16:07Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2019-11-06T10:48:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f8db383507d658c5a729b062c97710efda876cd4'/>
<id>urn:sha1:f8db383507d658c5a729b062c97710efda876cd4</id>
<content type='text'>
With transition to blk-mq, the elevator= kernel argument was removed as
it makes less and less sense with the current variety of devices.  Since
this may surprise some users and there are advices on the Internet that
still suggest to use it, let's at least warn if the parameter is used.

Reviewed-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Fix elv_support_iosched()</title>
<updated>2019-10-14T19:54:09Z</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@wdc.com</email>
</author>
<published>2019-10-08T22:39:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7a7c5e715e722c86d602c56a09e77f000364e263'/>
<id>urn:sha1:7a7c5e715e722c86d602c56a09e77f000364e263</id>
<content type='text'>
A BIO based request queue does not have a tag_set, which prevent testing
for the flag BLK_MQ_F_NO_SCHED indicating that the queue does not
require an elevator. This leads to an incorrect initialization of a
default elevator in some cases such as BIO based null_blk
(queue_mode == BIO) with zoned mode enabled as the default elevator in
this case is mq-deadline instead of "none".

Fix this by testing for a NULL queue mq_ops field which indicates that
the queue is BIO based and should not have an elevator.

Reported-by: Shinichiro Kawasaki &lt;shinichiro.kawasaki@wdc.com&gt;
Reviewed-by: Bob Liu &lt;bob.liu@oracle.com&gt;
Signed-off-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: don't release queue's sysfs lock during switching elevator</title>
<updated>2019-09-26T06:45:51Z</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-09-23T15:12:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b89f625e28d44552083f43752f62d8621ded0a04'/>
<id>urn:sha1:b89f625e28d44552083f43752f62d8621ded0a04</id>
<content type='text'>
cecf5d87ff20 ("block: split .sysfs_lock into two locks") starts to
release &amp; acquire sysfs_lock before registering/un-registering elevator
queue during switching elevator for avoiding potential deadlock from
showing &amp; storing 'queue/iosched' attributes and removing elevator's
kobject.

Turns out there isn't such deadlock because 'q-&gt;sysfs_lock' isn't
required in .show &amp; .store of queue/iosched's attributes, and just
elevator's sysfs lock is acquired in elv_iosched_store() and
elv_iosched_show(). So it is safe to hold queue's sysfs lock when
registering/un-registering elevator queue.

The biggest issue is that commit cecf5d87ff20 assumes that concurrent
write on 'queue/scheduler' can't happen. However, this assumption isn't
true, because kernfs_fop_write() only guarantees that concurrent write
aren't called on the same open file, but the write could be from
different open on the file. So we can't release &amp; re-acquire queue's
sysfs lock during switching elevator, otherwise use-after-free on
elevator could be triggered.

Fixes the issue by not releasing queue's sysfs lock during switching
elevator.

Fixes: cecf5d87ff20 ("block: split .sysfs_lock into two locks")
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Hannes Reinecke &lt;hare@suse.com&gt;
Cc: Greg KH &lt;gregkh@linuxfoundation.org&gt;
Cc: Mike Snitzer &lt;snitzer@redhat.com&gt;
Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: fix elevator_get_by_features()</title>
<updated>2019-09-06T13:02:31Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2019-09-06T13:02:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a26142559c2be8c0975b941e3110d23a9e552ce5'/>
<id>urn:sha1:a26142559c2be8c0975b941e3110d23a9e552ce5</id>
<content type='text'>
The lookup logic is broken - 'e' will never be NULL, even if the
list is empty. Maintain lookup hit in a separate variable instead.

Fixes: a0958ba7fcdc ("block: Improve default elevator selection")
Reported-by: Julia Lawall &lt;julia.lawall@lip6.fr&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Delay default elevator initialization</title>
<updated>2019-09-06T01:52:34Z</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@wdc.com</email>
</author>
<published>2019-09-05T09:51:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=737eb78e82d52d35df166d29af32bf61992de71d'/>
<id>urn:sha1:737eb78e82d52d35df166d29af32bf61992de71d</id>
<content type='text'>
When elevator_init_mq() is called from blk_mq_init_allocated_queue(),
the only information known about the device is the number of hardware
queues as the block device scan by the device driver is not completed
yet for most drivers. The device type and elevator required features
are not set yet, preventing to correctly select the default elevator
most suitable for the device.

This currently affects all multi-queue zoned block devices which default
to the "none" elevator instead of the required "mq-deadline" elevator.
These drives currently include host-managed SMR disks connected to a
smartpqi HBA and null_blk block devices with zoned mode enabled.
Upcoming NVMe Zoned Namespace devices will also be affected.

Fix this by adding the boolean elevator_init argument to
blk_mq_init_allocated_queue() to control the execution of
elevator_init_mq(). Two cases exist:
1) elevator_init = false is used for calls to
   blk_mq_init_allocated_queue() within blk_mq_init_queue(). In this
   case, a call to elevator_init_mq() is added to __device_add_disk(),
   resulting in the delayed initialization of the queue elevator
   after the device driver finished probing the device information. This
   effectively allows elevator_init_mq() access to more information
   about the device.
2) elevator_init = true preserves the current behavior of initializing
   the elevator directly from blk_mq_init_allocated_queue(). This case
   is used for the special request based DM devices where the device
   gendisk is created before the queue initialization and device
   information (e.g. queue limits) is already known when the queue
   initialization is executed.

Additionally, to make sure that the elevator initialization is never
done while requests are in-flight (there should be none when the device
driver calls device_add_disk()), freeze and quiesce the device request
queue before calling blk_mq_init_sched() in elevator_init_mq().

Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Improve default elevator selection</title>
<updated>2019-09-06T01:52:34Z</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@wdc.com</email>
</author>
<published>2019-09-05T09:51:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a0958ba7fcdc316e3900f8d2afda519850d60985'/>
<id>urn:sha1:a0958ba7fcdc316e3900f8d2afda519850d60985</id>
<content type='text'>
For block devices that do not specify required features, preserve the
current default elevator selection (mq-deadline for single queue
devices, none for multi-queue devices). However, for devices specifying
required features (e.g. zoned block devices ELEVATOR_F_ZBD_SEQ_WRITE
feature), select the first available elevator providing the required
features.

In all cases, default to "none" if no elevator is available or if the
initialization of the default elevator fails.

Reviewed-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Introduce elevator features</title>
<updated>2019-09-06T01:52:33Z</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@wdc.com</email>
</author>
<published>2019-09-05T09:51:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=68c43f133a754c7bf5cb1018bb16dc0821cc43a1'/>
<id>urn:sha1:68c43f133a754c7bf5cb1018bb16dc0821cc43a1</id>
<content type='text'>
Introduce the definition of elevator features through the
elevator_features flags in the elevator_type structure. Each flag can
represent a feature supported by an elevator. The first feature defined
by this patch is support for zoned block device sequential write
constraint with the flag ELEVATOR_F_ZBD_SEQ_WRITE, which is implemented
by the mq-deadline elevator using zone write locking.

Other possible features are IO priorities, write hints, latency targets
or single-LUN dual-actuator disks (for which the elevator could maintain
one LBA ordered list per actuator).

The required_elevator_features field is also added to the request_queue
structure to allow a device driver to specify elevator feature flags
that an elevator must support for the correct operation of the device
(e.g. device drivers for zoned block devices can have the
ELEVATOR_F_ZBD_SEQ_WRITE flag as a required feature).
The helper function blk_queue_required_elevator_features() is
defined for setting this new field.

With these two new fields in place, the elevator functions
elevator_match() and elevator_find() are modified to allow a user to set
only an elevator with a set of features that satisfies the device
required features. Elevators not matching the device requirements are
not shown in the device sysfs queue/scheduler file to prevent their use.

The "none" elevator can always be selected as before.

Reviewed-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Change elevator_init_mq() to always succeed</title>
<updated>2019-09-06T01:52:33Z</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@wdc.com</email>
</author>
<published>2019-09-05T09:51:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=954b4a5ce4a806e7c284ce6b2659abdd03d0b6e2'/>
<id>urn:sha1:954b4a5ce4a806e7c284ce6b2659abdd03d0b6e2</id>
<content type='text'>
If the default elevator chosen is mq-deadline, elevator_init_mq() may
return an error if mq-deadline initialization fails, leading to
blk_mq_init_allocated_queue() returning an error, which in turn will
cause the block device initialization to fail and the device not being
exposed.

Instead of taking such extreme measure, handle mq-deadline
initialization failures in the same manner as when mq-deadline is not
available (no module to load), that is, default to the "none" scheduler.
With this change, elevator_init_mq() return type can be changed to void.

Reviewed-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Cleanup elevator_init_mq() use</title>
<updated>2019-09-06T01:52:33Z</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@wdc.com</email>
</author>
<published>2019-09-05T09:51:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=61db437d1cc16c470cf6fccc04d34be9cf6e4e4b'/>
<id>urn:sha1:61db437d1cc16c470cf6fccc04d34be9cf6e4e4b</id>
<content type='text'>
Instead of checking a queue tag_set BLK_MQ_F_NO_SCHED flag before
calling elevator_init_mq() to make sure that the queue supports IO
scheduling, use the elevator.c function elv_support_iosched() in
elevator_init_mq(). This does not introduce any functional change but
ensure that elevator_init_mq() does the right thing based on the queue
settings.

Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
