<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/block/loop.h, branch v4.15</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.15</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.15'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2017-09-26T13:41:22Z</updated>
<entry>
<title>block/loop: make loop cgroup aware</title>
<updated>2017-09-26T13:41:22Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-09-25T19:07:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d4478e92d6186ce37947a36994de407c27446266'/>
<id>urn:sha1:d4478e92d6186ce37947a36994de407c27446266</id>
<content type='text'>
loop block device handles IO in a separate thread. The actual IO
dispatched isn't cloned from the IO loop device received, so the
dispatched IO loses the cgroup context.

I'm ignoring buffer IO case now, which is quite complicated.  Making the
loop thread aware cgroup context doesn't really help. The loop device
only writes to a single file. In current writeback cgroup
implementation, the file can only belong to one cgroup.

For direct IO case, we could workaround the issue in theory. For
example, say we assign cgroup1 5M/s BW for loop device and cgroup2
10M/s. We can create a special cgroup for loop thread and assign at
least 15M/s for the underlayer disk. In this way, we correctly throttle
the two cgroups. But this is tricky to setup.

This patch tries to address the issue. We record bio's css in loop
command. When loop thread is handling the command, we then use the API
provided in patch 1 to set the css for current task. The bio layer will
use the css for new IO (from patch 3).

Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>loop: remove union of use_aio and ref in struct loop_cmd</title>
<updated>2017-09-25T14:56:05Z</updated>
<author>
<name>Omar Sandoval</name>
<email>osandov@fb.com</email>
</author>
<published>2017-09-20T21:24:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e5313c141b49c1b1af43d1ca81398185d66ad1a6'/>
<id>urn:sha1:e5313c141b49c1b1af43d1ca81398185d66ad1a6</id>
<content type='text'>
When the request is completed, lo_complete_rq() checks cmd-&gt;use_aio.
However, if this is in fact an aio request, cmd-&gt;use_aio will have
already been reused as cmd-&gt;ref by lo_rw_aio*. Fix it by not using a
union. On x86_64, there's a hole after the union anyways, so this
doesn't make struct loop_cmd any bigger.

Fixes: 92d773324b7e ("block/loop: fix use after free")
Signed-off-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block/loop: remove unused field</title>
<updated>2017-09-01T19:57:35Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-09-01T18:15:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bc75705d00637c5f7b0346bf63094a9899c3d516'/>
<id>urn:sha1:bc75705d00637c5f7b0346bf63094a9899c3d516</id>
<content type='text'>
nobody uses the list.

Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block/loop: fix use after free</title>
<updated>2017-09-01T19:57:33Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-09-01T18:15:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=92d773324b7edbd36bf0c28c1e0157763aeccc92'/>
<id>urn:sha1:92d773324b7edbd36bf0c28c1e0157763aeccc92</id>
<content type='text'>
lo_rw_aio-&gt;call_read_iter-&gt;
1       aops-&gt;direct_IO
2       iov_iter_revert
lo_rw_aio_complete could happen between 1 and 2, the bio and bvec could
be freed before 2, which accesses bvec.

Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block/loop: allow request merge for directio mode</title>
<updated>2017-09-01T14:44:34Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-09-01T05:09:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=40326d8a33d5b70039849d233975b63c733d94a2'/>
<id>urn:sha1:40326d8a33d5b70039849d233975b63c733d94a2</id>
<content type='text'>
Currently loop disables merge. While it makes sense for buffer IO mode,
directio mode can benefit from request merge. Without merge, loop could
send small size IO to underlayer disk and harm performance.

Reviewed-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>loop: get rid of lo_blocksize</title>
<updated>2017-08-31T19:51:10Z</updated>
<author>
<name>Omar Sandoval</name>
<email>osandov@fb.com</email>
</author>
<published>2017-08-24T07:03:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8a0740c4109d646d8697d359962edea47301c652'/>
<id>urn:sha1:8a0740c4109d646d8697d359962edea47301c652</id>
<content type='text'>
This is only used for setting the soft block size on the struct
block_device once and then never used again.

Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Revert "loop: support 4k physical blocksize"</title>
<updated>2017-08-23T21:57:55Z</updated>
<author>
<name>Omar Sandoval</name>
<email>osandov@fb.com</email>
</author>
<published>2017-08-23T21:54:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1e6ec9ea89d30739b9447c1860fcb07fc29f3aef'/>
<id>urn:sha1:1e6ec9ea89d30739b9447c1860fcb07fc29f3aef</id>
<content type='text'>
There's some stuff still up in the air, let's not get stuck with a
subpar ABI. I'll follow up with something better for 4.14.

Signed-off-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>loop: support 4k physical blocksize</title>
<updated>2017-06-08T14:40:00Z</updated>
<author>
<name>Hannes Reinecke</name>
<email>hare@suse.de</email>
</author>
<published>2017-06-08T11:46:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f2c6df7dbf9a60e1cd9941f9fb376d4d9ad1e8dd'/>
<id>urn:sha1:f2c6df7dbf9a60e1cd9941f9fb376d4d9ad1e8dd</id>
<content type='text'>
When generating bootable VM images certain systems (most notably
s390x) require devices with 4k blocksize. This patch implements
a new flag 'LO_FLAGS_BLOCKSIZE' which will set the physical
blocksize to that of the underlying device, and allow to change
the logical blocksize for up to the physical blocksize.

Signed-off-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>loop: zero-fill bio on the submitting cpu</title>
<updated>2017-04-20T18:16:10Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2017-04-20T14:03:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fe2cb2905c3dc872158e7ce30df05d72c3989645'/>
<id>urn:sha1:fe2cb2905c3dc872158e7ce30df05d72c3989645</id>
<content type='text'>
In thruth I've just audited which blk-mq drivers don't currently have a
complete callback, but I think this change is at least borderline useful.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>block: loop: support DIO &amp; AIO</title>
<updated>2015-09-23T17:01:16Z</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@canonical.com</email>
</author>
<published>2015-08-17T02:31:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bc07c10a3603a5ab3ef01ba42b3d41f9ac63d1b6'/>
<id>urn:sha1:bc07c10a3603a5ab3ef01ba42b3d41f9ac63d1b6</id>
<content type='text'>
There are at least 3 advantages to use direct I/O and AIO on
read/write loop's backing file:

1) double cache can be avoided, then memory usage gets
decreased a lot

2) not like user space direct I/O, there isn't cost of
pinning pages

3) avoid context switch for obtaining good throughput
- in buffered file read, random I/O top throughput is often obtained
only if they are submitted concurrently from lots of tasks; but for
sequential I/O, most of times they can be hit from page cache, so
concurrent submissions often introduce unnecessary context switch
and can't improve throughput much. There was such discussion[1]
to use non-blocking I/O to improve the problem for application.
- with direct I/O and AIO, concurrent submissions can be
avoided and random read throughput can't be affected meantime

xfstests(-g auto, ext4) is basically passed when running with
direct I/O(aio), one exception is generic/232, but it failed in
loop buffered I/O(4.2-rc6-next-20150814) too.

Follows the fio test result for performance purpose:
	4 jobs fio test inside ext4 file system over loop block

1) How to run
	- KVM: 4 VCPUs, 2G RAM
	- linux kernel: 4.2-rc6-next-20150814(base) with the patchset
	- the loop block is over one image on SSD.
	- linux psync, 4 jobs, size 1500M, ext4 over loop block
	- test result: IOPS from fio output

2) Throughput(IOPS) becomes a bit better with direct I/O(aio)
        -------------------------------------------------------------
        test cases          |randread   |read   |randwrite  |write  |
        -------------------------------------------------------------
        base                |8015       |113811 |67442      |106978
        -------------------------------------------------------------
        base+loop aio       |8136       |125040 |67811      |111376
        -------------------------------------------------------------

- somehow, it should be caused by more page cache avaiable for
application or one extra page copy is avoided in case of direct I/O

3) context switch
        - context switch decreased by ~50% with loop direct I/O(aio)
	compared with loop buffered I/O(4.2-rc6-next-20150814)

4) memory usage from /proc/meminfo
        -------------------------------------------------------------
                                   | Buffers       | Cached
        -------------------------------------------------------------
        base                       | &gt; 760MB       | ~950MB
        -------------------------------------------------------------
        base+loop direct I/O(aio)  | &lt; 5MB         | ~1.6GB
        -------------------------------------------------------------

- so there are much more page caches available for application with
direct I/O

[1] https://lwn.net/Articles/612483/

Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
</feed>
