<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/md/dm-writecache.c, branch v5.6</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.6</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.6'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-03-03T16:10:21Z</updated>
<entry>
<title>dm: bump version of core and various targets</title>
<updated>2020-03-03T16:10:21Z</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2020-02-27T19:25:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=636be4241bdd88fec273b38723e44bad4e1c4fae'/>
<id>urn:sha1:636be4241bdd88fec273b38723e44bad4e1c4fae</id>
<content type='text'>
Changes made during the 5.6 cycle warrant bumping the version number
for DM core and the targets modified by this commit.

It should be noted that dm-thin, dm-crypt and dm-raid already had
their target version bumped during the 5.6 merge window.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm writecache: verify watermark during resume</title>
<updated>2020-02-27T21:44:24Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2020-02-24T09:20:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=41c526c5af46d4c4dab7f72c99000b7fac0b9702'/>
<id>urn:sha1:41c526c5af46d4c4dab7f72c99000b7fac0b9702</id>
<content type='text'>
Verify the watermark upon resume, so that if the target is reloaded
with a lower watermark, it will start the cleanup process immediately.

Fixes: 48debafe4f2f ("dm: add writecache target")
Cc: stable@vger.kernel.org # 4.18+
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm: report suspended device during destroy</title>
<updated>2020-02-27T21:40:58Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2020-02-24T09:20:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=adc0daad366b62ca1bce3e2958a40b0b71a8b8b3'/>
<id>urn:sha1:adc0daad366b62ca1bce3e2958a40b0b71a8b8b3</id>
<content type='text'>
The function dm_suspended returns true if the target is suspended.
However, when the target is being suspended during unload, it returns
false.

An example where this is a problem: the test "!dm_suspended(wc-&gt;ti)" in
writecache_writeback is not sufficient, because dm_suspended returns
zero while writecache_suspend is in progress.  As is, without an
enhanced dm_suspended, simply switching from flush_workqueue to
drain_workqueue still emits warnings:
workqueue writecache-writeback: drain_workqueue() isn't complete after 10 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 100 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 200 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 300 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 400 tries

writecache_suspend calls flush_workqueue(wc-&gt;writeback_wq) - this function
flushes the current work. However, the work may re-queue itself and
flush_workqueue doesn't wait for re-queued works to finish. Because of
this, the function writecache_writeback continues execution after the
device was suspended and then runs concurrently with writecache_dtr,
causing a crash in writecache_writeback.

We must use drain_workqueue - that waits until the work and all re-queued
works finish.
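
The difference between the two calls can be illustrated with a toy
model. This is only a sketch in Python, not kernel code: ToyWorkqueue,
make_writeback and leftover_after are hypothetical names, and the model
ignores threads entirely. It shows only that a flush that runs the items
queued at call time leaves a re-queued work pending, while a drain loops
until nothing is left.

```python
from collections import deque


class ToyWorkqueue:
    """Single-threaded stand-in for a workqueue (hypothetical model)."""

    def __init__(self):
        self.pending = deque()

    def queue(self, work):
        self.pending.append(work)

    def flush(self):
        # Run only the items that were queued when flush was called.
        for _ in range(len(self.pending)):
            self.pending.popleft()(self)

    def drain(self):
        # Keep running until no work, including re-queued work, remains.
        while self.pending:
            self.pending.popleft()(self)


def make_writeback(remaining):
    # The work re-queues itself until the counter reaches zero.
    def work(wq):
        remaining[0] -= 1
        if remaining[0]:
            wq.queue(work)
    return work


def leftover_after(method_name):
    wq = ToyWorkqueue()
    wq.queue(make_writeback([3]))
    getattr(wq, method_name)()
    return len(wq.pending)


print("pending after flush:", leftover_after("flush"))
print("pending after drain:", leftover_after("drain"))
```

In this model the work left pending after flush is what would later run
against a device that has already been torn down.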

As a prereq for switching to drain_workqueue, this commit fixes
dm_suspended to return true after the presuspend hook and before the
postsuspend hook - just like during a normal suspend. It allows
simplifying the dm-integrity and dm-writecache targets so that they
don't have to maintain suspended flags on their own.

With this change, drain_workqueue() can be used effectively.  This
change was tested with the lvm2 testsuite and the cryptsetup testsuite
and there are no regressions.

Fixes: 48debafe4f2f ("dm: add writecache target")
Cc: stable@vger.kernel.org # 4.18+
Reported-by: Corey Marthaler &lt;cmarthal@redhat.com&gt;
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm writecache: improve performance of large linear writes on SSDs</title>
<updated>2020-01-16T18:34:17Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2020-01-15T09:35:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dcd195071f22d4770911ca46694ca398b6d5101d'/>
<id>urn:sha1:dcd195071f22d4770911ca46694ca398b6d5101d</id>
<content type='text'>
When dm-writecache is used with SSD as a cache device, it would submit a
separate bio for each written block. The I/Os would be merged by the disk
scheduler, but this merging degrades performance.

Improve dm-writecache performance by submitting larger bios - this is
possible as long as there is consecutive free space on the cache
device.
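
As a rough illustration of why consecutive space matters, the sketch
below groups cache block numbers into runs of consecutive blocks, where
each run could be covered by a single larger bio. It is written in
Python for brevity and is not the code from this patch; the function
name coalesce and the (start, count) representation are assumptions
made for the example.

```python
def coalesce(blocks):
    """Group block numbers into (start, count) runs of consecutive blocks."""
    runs = []
    for block in blocks:
        if runs and runs[-1][0] + runs[-1][1] == block:
            # The block directly follows the last run, so extend that run.
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            # A gap on the cache device forces a new run.
            runs.append((block, 1))
    return runs


# Six blocks, but only three runs, so three submissions instead of six.
print(coalesce([10, 11, 12, 20, 21, 30]))
```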

Benchmark (arm64 with 64k page size, using /dev/ram0 as a cache device):

fio --bs=512k --iodepth=32 --size=400M --direct=1 \
    --filename=/dev/mapper/cache --rw=randwrite --numjobs=1 --name=test

block	old	new
size	MiB/s	MiB/s
---------------------
512	181	700
1k	347	1256
2k	644	2020
4k	1183	2759
8k	1852	3333
16k	2469	3509
32k	2974	3670
64k	3404	3810

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm writecache: fix incorrect flush sequence when doing SSD mode commit</title>
<updated>2020-01-15T01:22:48Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2020-01-08T15:46:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=aa9509209c5ac2f0b35d01a922bf9ae072d0c2fc'/>
<id>urn:sha1:aa9509209c5ac2f0b35d01a922bf9ae072d0c2fc</id>
<content type='text'>
When committing state, the function writecache_flush does the following:
1. write metadata (writecache_commit_flushed)
2. flush disk cache (writecache_commit_flushed)
3. wait for data writes to complete (writecache_wait_for_ios)
4. increase superblock seq_count
5. write the superblock
6. flush disk cache

It may happen that at step 3, when we wait for some write to finish, the
disk may report the write as finished, but the write only hit the disk
cache and it is not yet stored in persistent storage. At step 5 we write
the superblock - it may happen that the superblock is written before the
write that we waited for in step 3. If the machine crashes, it may result
in incorrect data being returned after reboot.

In order to fix the bug, we must swap steps 2 and 3 in the above sequence,
so that we first wait for writes to complete and then flush the disk
cache.
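
The ordering problem can be sketched with a toy model of a disk that has
a volatile write cache. This is a Python illustration under simplifying
assumptions, not the driver code: ToyDisk and commit_then_crash are
hypothetical names, and the early write-back of the superblock is forced
by hand to stand in for a disk that reorders cached writes.

```python
class ToyDisk:
    """Disk with a volatile write cache (hypothetical model)."""

    def __init__(self):
        self.in_flight = []   # submitted, not yet completed
        self.cache = {}       # completed, but only in the volatile cache
        self.persistent = {}  # survives a crash

    def submit(self, key, value):
        self.in_flight.append((key, value))

    def write_cached(self, key, value):
        # A write that has completed into the volatile cache.
        self.cache[key] = value

    def wait_for_ios(self):
        # Completion only means the write reached the volatile cache.
        for key, value in self.in_flight:
            self.cache[key] = value
        self.in_flight = []

    def flush_cache(self):
        self.persistent.update(self.cache)
        self.cache = {}

    def persist_one(self, key):
        # The disk writes back one cached entry on its own.
        self.persistent[key] = self.cache.pop(key)

    def crash(self):
        # Only persistent contents survive.
        return dict(self.persistent)


def commit_then_crash(fixed):
    disk = ToyDisk()
    disk.submit("data", "new")             # data write still in flight
    disk.write_cached("metadata", "new")   # step 1
    if fixed:
        disk.wait_for_ios()                # wait first, then flush
        disk.flush_cache()
    else:
        disk.flush_cache()                 # old step 2
        disk.wait_for_ios()                # old step 3
    disk.write_cached("superblock", "seq+1")  # steps 4 and 5
    disk.persist_one("superblock")         # superblock reaches media early
    return disk.crash()                    # crash before step 6


print("old order:", sorted(commit_then_crash(False)))
print("new order:", sorted(commit_then_crash(True)))
```

With the old order the model ends up with a persistent superblock whose
data never reached persistent storage; with the swapped order the data
is persistent before the superblock is written.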

Fixes: 48debafe4f2f ("dm: add writecache target")
Cc: stable@vger.kernel.org # 4.18+
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm writecache: handle REQ_FUA</title>
<updated>2019-11-05T19:21:40Z</updated>
<author>
<name>Maged Mokhtar</name>
<email>mmokhtar@petasan.org</email>
</author>
<published>2019-10-23T20:41:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c1005322ff02110a4df7f0033368ea015062b583'/>
<id>urn:sha1:c1005322ff02110a4df7f0033368ea015062b583</id>
<content type='text'>
Call writecache_flush() on REQ_FUA in writecache_map().

Cc: stable@vger.kernel.org # 4.18+
Signed-off-by: Maged Mokhtar &lt;mmokhtar@petasan.org&gt;
Acked-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm writecache: fix uninitialized variable warning</title>
<updated>2019-11-05T19:11:44Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2019-10-02T11:07:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8dd85873a0bd8f162e95bc9402eea0050dbdac01'/>
<id>urn:sha1:8dd85873a0bd8f162e95bc9402eea0050dbdac01</id>
<content type='text'>
This fixes coverity warning CID 1454301.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm writecache: skip writecache_wait for pmem mode</title>
<updated>2019-09-05T17:22:05Z</updated>
<author>
<name>Huaisheng Ye</name>
<email>yehs1@lenovo.com</email>
</author>
<published>2019-09-02T10:04:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6d1959138c8bdaf69f1116c86c77e6733db6ab34'/>
<id>urn:sha1:6d1959138c8bdaf69f1116c86c77e6733db6ab34</id>
<content type='text'>
The counters in the array bio_in_progress[2] are only incremented and
decremented in ssd mode. In pmem mode, they are not involved at all.
So skip writecache_wait_for_ios in writecache_flush for pmem.

Suggested-by: Doris Yu &lt;tyu1@lenovo.com&gt;
Signed-off-by: Huaisheng Ye &lt;yehs1@lenovo.com&gt;
Acked-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm writecache: optimize performance by sorting the blocks for writeback_all</title>
<updated>2019-08-26T14:59:00Z</updated>
<author>
<name>Huaisheng Ye</name>
<email>yehs1@lenovo.com</email>
</author>
<published>2019-08-25T07:24:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5229b4896e8f32bda4bfe29ff91e594ae7aa8a75'/>
<id>urn:sha1:5229b4896e8f32bda4bfe29ff91e594ae7aa8a75</id>
<content type='text'>
During writeback, the blocks that have been placed in wbl.list for
imminent writeback are only partially ordered: only the contiguous ones
are in order.

When writeback_all has been set, which is the case most of the time and
also by default, a lot of blocks in pmem need to be written back at the
same time. For this case, we can optimize performance by sorting all
blocks in wbl.list. writecache_writeback doesn't need to take blocks from
the tail of wc-&gt;lru; instead it starts from the first rb_node of the
rb_tree.

The benefit is that writecache_writeback incurs no cost for sorting the
blocks, because all blocks are already in ascending order in the rb_tree.
There will be a writecache_flush when writeback_all begins to work, which
will eliminate duplicate blocks in the cache based on their
committed/uncommitted state.

Testing platform: Thinksystem SR630 with persistent memory.
The cache is on pmem and is 1006MB in size. The origin device is an HDD,
of which 2GB is used.

Testing steps:
 1) dmsetup create mycache --table '0 4194304 writecache p /dev/sdb1 /dev/pmem4  4096 0'
 2) fio -filename=/dev/mapper/mycache -direct=1 -iodepth=20 -rw=randwrite
 -ioengine=libaio -bs=4k -loops=1  -size=2g -group_reporting -name=mytest1
 3) time dmsetup message /dev/mapper/mycache 0 flush

Here are the results:
With the patch:
 # fio -filename=/dev/mapper/mycache -direct=1 -iodepth=20 -rw=randwrite
 -ioengine=libaio -bs=4k -loops=1  -size=2g -group_reporting -name=mytest1
   iops        : min= 1582, max=199470, avg=5305.94, stdev=21273.44, samples=197
 # time dmsetup message /dev/mapper/mycache 0 flush
real	0m44.020s
user	0m0.002s
sys	0m0.003s

Without the patch:
 # fio -filename=/dev/mapper/mycache -direct=1 -iodepth=20 -rw=randwrite
 -ioengine=libaio -bs=4k -loops=1  -size=2g -group_reporting -name=mytest1
   iops        : min= 1202, max=197650, avg=4968.67, stdev=20480.17, samples=211
 # time dmsetup message /dev/mapper/mycache 0 flush
real	1m39.221s
user	0m0.001s
sys	0m0.003s

I have also checked data integrity with this patch by creating an EXT4
filesystem on mycache, then mounting it and checking the md5 sums of the
files on it. The test result is positive: with this patch, writeback_all
takes less than half the time.

Signed-off-by: Huaisheng Ye &lt;yehs1@lenovo.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm writecache: add unlikely for getting two block with same LBA</title>
<updated>2019-08-26T14:54:41Z</updated>
<author>
<name>Huaisheng Ye</name>
<email>yehs1@lenovo.com</email>
</author>
<published>2019-08-25T07:24:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=62421b3880c71da7ecbf9c4072dc64ef3e65ad0a'/>
<id>urn:sha1:62421b3880c71da7ecbf9c4072dc64ef3e65ad0a</id>
<content type='text'>
In function writecache_writeback, entries g and f have the same original
sector only when entry f has been committed but entry g has NOT yet
been committed.

The probability of this happening is very low within the at most
256 blocks that follow entry e.

Signed-off-by: Huaisheng Ye &lt;yehs1@lenovo.com&gt;
Acked-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
</feed>
