<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/block, branch v4.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.3</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-10-30T18:25:02Z</updated>
<entry>
<title>rbd: require stable pages if message data CRCs are enabled</title>
<updated>2015-10-30T18:25:02Z</updated>
<author>
<name>Ronny Hegewald</name>
<email>ronny.hegewald@online.de</email>
</author>
<published>2015-10-15T18:50:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bae818ee1577c27356093901a0ea48f672eda514'/>
<id>urn:sha1:bae818ee1577c27356093901a0ea48f672eda514</id>
<content type='text'>
rbd requires stable pages, as it performs a crc of the page data before
they are send to the OSDs.

But since kernel 3.9 (patch 1d1d1a767206fbe5d4c69493b7e6d2a8d08cc0a0
"mm: only enforce stable page writes if the backing device requires
it") it is not assumed anymore that block devices require stable pages.

This patch sets the necessary flag to get stable pages back for rbd.

In a ceph installation that provides multiple ext4 formatted rbd
devices "bad crc" messages appeared regularly (ca 1 message every 1-2
minutes on every OSD that provided the data for the rbd) in the
OSD-logs before this patch. After this patch this messages are pretty
much gone (only ca 1-2 / month / OSD).

Cc: stable@vger.kernel.org # 3.9+, needs backporting
Signed-off-by: Ronny Hegewald &lt;Ronny.Hegewald@online.de&gt;
[idryomov@gmail.com: require stable pages only in crc case, changelog]
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.dk/linux-block</title>
<updated>2015-10-23T22:20:57Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-10-23T22:20:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ea1ee5ff1b500ccdc64782ecef13d276afb08f14'/>
<id>urn:sha1:ea1ee5ff1b500ccdc64782ecef13d276afb08f14</id>
<content type='text'>
Pull block layer fixes from Jens Axboe:
 "A final set of fixes for 4.3.

  It is (again) bigger than I would have liked, but it's all been
  through the testing mill and has been carefully reviewed by multiple
  parties.  Each fix is either a regression fix for this cycle, or is
  marked stable.  You can scold me at KS.  The pull request contains:

   - Three simple fixes for NVMe, fixing regressions since 4.3.  From
     Arnd, Christoph, and Keith.

   - A single xen-blkfront fix from Cathy, fixing a NULL dereference if
     an error is returned through the staste change callback.

   - Fixup for some bad/sloppy code in nbd that got introduced earlier
     in this cycle.  From Markus Pargmann.

   - A blk-mq tagset use-after-free fix from Junichi.

   - A backing device lifetime fix from Tejun, fixing a crash.

   - And finally, a set of regression/stable fixes for cgroup writeback
     from Tejun"

* 'for-linus' of git://git.kernel.dk/linux-block:
  writeback: remove broken rbtree_postorder_for_each_entry_safe() usage in cgwb_bdi_destroy()
  NVMe: Fix memory leak on retried commands
  block: don't release bdi while request_queue has live references
  nvme: use an integer value to Linux errno values
  blk-mq: fix use-after-free in blk_mq_free_tag_set()
  nvme: fix 32-bit build warning
  writeback: fix incorrect calculation of available memory for memcg domains
  writeback: memcg dirty_throttle_control should be initialized with wb-&gt;memcg_completions
  writeback: bdi_writeback iteration must not skip dying ones
  writeback: fix bdi_writeback iteration in wakeup_dirtytime_writeback()
  writeback: laptop_mode_timer_fn() needs rcu_read_lock() around bdi_writeback iteration
  nbd: Add locking for tasks
  xen-blkfront: check for null drvdata in blkback_changed (XenbusStateClosing)
</content>
</entry>
<entry>
<title>rbd: prevent kernel stack blow up on rbd map</title>
<updated>2015-10-23T16:37:24Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-10-11T17:38:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6d69bb536bac0d403d83db1ca841444981b280cd'/>
<id>urn:sha1:6d69bb536bac0d403d83db1ca841444981b280cd</id>
<content type='text'>
Mapping an image with a long parent chain (e.g. image foo, whose parent
is bar, whose parent is baz, etc) currently leads to a kernel stack
overflow, due to the following recursion in the reply path:

  rbd_osd_req_callback()
    rbd_obj_request_complete()
      rbd_img_obj_callback()
        rbd_img_parent_read_callback()
          rbd_obj_request_complete()
            ...

Limit the parent chain to 16 images, which is ~5K worth of stack.  When
the above recursion is eliminated, this limit can be lifted.

Fixes: http://tracker.ceph.com/issues/12538

Cc: stable@vger.kernel.org # 3.10+, needs backporting for &lt; 4.2
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
</content>
</entry>
<entry>
<title>rbd: don't leak parent_spec in rbd_dev_probe_parent()</title>
<updated>2015-10-23T16:36:03Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-10-11T17:38:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1f2c6651f69c14d0d3a9cfbda44ea101b02160ba'/>
<id>urn:sha1:1f2c6651f69c14d0d3a9cfbda44ea101b02160ba</id>
<content type='text'>
Currently we leak parent_spec and trigger a "parent reference
underflow" warning if rbd_dev_create() in rbd_dev_probe_parent() fails.
The problem is we take the !parent out_err branch and that only drops
refcounts; parent_spec that would've been freed had we called
rbd_dev_unparent() remains and triggers rbd_warn() in
rbd_dev_parent_put() - at that point we have parent_spec != NULL and
parent_ref == 0, so counter ends up being -1 after the decrement.

Redo rbd_dev_probe_parent() to fix this.

Cc: stable@vger.kernel.org # 3.10+, needs backporting for &lt; 4.2
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
</content>
</entry>
<entry>
<title>rbd: use writefull op for object size writes</title>
<updated>2015-10-16T14:49:01Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-10-07T15:27:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e30b7577bf1d338ca8a273bd2f881de5a41572b7'/>
<id>urn:sha1:e30b7577bf1d338ca8a273bd2f881de5a41572b7</id>
<content type='text'>
This covers only the simplest case - an object size sized write, but
it's still useful in tiering setups when EC is used for the base tier
as writefull op can be proxied, saving an object promotion.

Even though updating ceph_osdc_new_request() to allow writefull should
just be a matter of fixing an assert, I didn't do it because its only
user is cephfs.  All other sites were updated.

Reflects ceph.git commit 7bfb7f9025a8ee0d2305f49bf0336d2424da5b5b.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
</content>
</entry>
<entry>
<title>rbd: set max_sectors explicitly</title>
<updated>2015-10-16T14:48:36Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-10-07T14:09:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0d9fde4fc8f59a6bd316559d267a936b0737d05a'/>
<id>urn:sha1:0d9fde4fc8f59a6bd316559d267a936b0737d05a</id>
<content type='text'>
Commit 30e2bc08b2bb ("Revert "block: remove artifical max_hw_sectors
cap"") restored a clamp on max_sectors.  It's now 2560 sectors instead
of 1024, but it's not good enough: we set max_hw_sectors to rbd object
size because we don't want object sized I/Os to be split, and the
default object size is 4M.

So, set max_sectors to max_hw_sectors in rbd at queue init time.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
</content>
</entry>
<entry>
<title>NVMe: Fix memory leak on retried commands</title>
<updated>2015-10-15T19:38:48Z</updated>
<author>
<name>Keith Busch</name>
<email>keith.busch@intel.com</email>
</author>
<published>2015-10-15T19:38:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0dfc70c33409afc232ef0b9ec210535dfbf9bc61'/>
<id>urn:sha1:0dfc70c33409afc232ef0b9ec210535dfbf9bc61</id>
<content type='text'>
Resources are reallocated for requeued commands, so unmap and release
the iod for the failed command.

It's a pretty bad memory leak and causes a kernel hang if you remove a
drive because of a busy dma pool. You'll get messages spewing like this:

  nvme 0000:xx:xx.x: dma_pool_destroy prp list 256, ffff880420dec000 busy

and lock up pci and the driver since removal never completes while
holding a lock.

Cc: stable@vger.kernel.org
Cc: &lt;stable@vger.kernel.org&gt; # 4.0.x-
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>nvme: use an integer value to Linux errno values</title>
<updated>2015-10-15T15:49:20Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2015-10-12T19:23:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=81c04b943872e0332872df18cec1dec89b178b4d'/>
<id>urn:sha1:81c04b943872e0332872df18cec1dec89b178b4d</id>
<content type='text'>
Use a separate integer variable to hold the signed Linux errno
values we pass back to the block layer.  Note that for pass through
commands those might still be NVMe values, but those fit into the
int as well.

Fixes: f4829a9b7a61: ("blk-mq: fix racy updates of rq-&gt;errors")
Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>nvme: fix 32-bit build warning</title>
<updated>2015-10-12T19:09:40Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2015-10-06T20:29:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=835da3f99d329b1160a1f7fc82c7ac81163d63d0'/>
<id>urn:sha1:835da3f99d329b1160a1f7fc82c7ac81163d63d0</id>
<content type='text'>
Compiling the nvme driver on 32-bit warns about a cast from a __u64
variable to a pointer:

drivers/block/nvme-core.c: In function 'nvme_submit_io':
drivers/block/nvme-core.c:1847:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    (void __user *)io.addr, length, NULL, 0);

The cast here is intentional and safe, so we can shut up the
gcc warning by adding an intermediate cast to 'uintptr_t'.

I had previously submitted a patch to fix this problem in the
nvme driver, but it was accepted on the same day that two new
warnings got added.

For clarification, I also change the third instance of this cast
to use uintptr_t instead of unsigned long now.

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Fixes: d29ec8241c10e ("nvme: submit internal commands through the block layer")
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>nbd: Add locking for tasks</title>
<updated>2015-10-08T20:21:24Z</updated>
<author>
<name>Markus Pargmann</name>
<email>mpa@pengutronix.de</email>
</author>
<published>2015-10-06T18:03:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dcc909d90ccdbb73226397ff6d298f7af35b0e11'/>
<id>urn:sha1:dcc909d90ccdbb73226397ff6d298f7af35b0e11</id>
<content type='text'>
The timeout handling introduced in
	7e2893a16d3e (nbd: Fix timeout detection)
introduces a race condition which may lead to killing of tasks that are
not in nbd context anymore. This was not observed or reproducable yet.

This patch adds locking to critical use of task_recv and task_send to
avoid killing tasks that already left the NBD thread functions. This
lock is only acquired if a timeout occures or the nbd device
starts/stops.

Reported-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Signed-off-by: Markus Pargmann &lt;mpa@pengutronix.de&gt;
Reviewed-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Fixes: 7e2893a16d3e ("nbd: Fix timeout detection")
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
</feed>
