<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/ext4, branch v3.1</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.1</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.1'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2011-09-21T20:20:21Z</updated>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.dk/linux-block</title>
<updated>2011-09-21T20:20:21Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-09-21T20:20:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fed678dc8a8b839c8189b5d889a94e865cd327dd'/>
<id>urn:sha1:fed678dc8a8b839c8189b5d889a94e865cd327dd</id>
<content type='text'>
* 'for-linus' of git://git.kernel.dk/linux-block:
  floppy: use del_timer_sync() in init cleanup
  blk-cgroup: be able to remove the record of unplugged device
  block: Don't check QUEUE_FLAG_SAME_COMP in __blk_complete_request
  mm: Add comment explaining task state setting in bdi_forker_thread()
  mm: Cleanup clearing of BDI_pending bit in bdi_forker_thread()
  block: simplify force plug flush code a little bit
  block: change force plug flush call order
  block: Fix queue_flag update when rq_affinity goes from 2 to 1
  block: separate priority boosting from REQ_META
  block: remove READ_META and WRITE_META
  xen-blkback: fixed indentation and comments
  xen-blkback: Don't disconnect backend until state switched to XenbusStateClosed.
</content>
</entry>
<entry>
<title>ext4: remove i_mutex lock in ext4_evict_inode to fix lockdep complaining</title>
<updated>2011-08-31T15:50:51Z</updated>
<author>
<name>Jiaying Zhang</name>
<email>jiayingz@google.com</email>
</author>
<published>2011-08-31T15:50:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8c0bec2151a47906bf779c6715a10ce04453ab77'/>
<id>urn:sha1:8c0bec2151a47906bf779c6715a10ce04453ab77</id>
<content type='text'>
The i_mutex lock and flush_completed_IO() added by commit 2581fdc810
in ext4_evict_inode() causes lockdep complaining about potential
deadlock in several places.  In most/all of these LOCKDEP complaints
it looks like it's a false positive, since many of the potential
circular locking cases can't take place by the time the
ext4_evict_inode() is called; but since at the very least it may mask
real problems, we need to address this.

This change removes the flush_completed_IO() and i_mutex lock in
ext4_evict_inode().  Instead, we take a different approach to resolve
the software lockup that commit 2581fdc810 intends to fix.  Rather
than having ext4-dio-unwritten thread wait for grabing the i_mutex
lock of an inode, we use mutex_trylock() instead, and simply requeue
the work item if we fail to grab the inode's i_mutex lock.

This should speed up work queue processing in general and also
prevents the following deadlock scenario: During page fault,
shrink_icache_memory is called that in turn evicts another inode B.
Inode B has some pending io_end work so it calls ext4_ioend_wait()
that waits for inode B's i_ioend_count to become zero.  However, inode
B's ioend work was queued behind some of inode A's ioend work on the
same cpu's ext4-dio-unwritten workqueue.  As the ext4-dio-unwritten
thread on that cpu is processing inode A's ioend work, it tries to
grab inode A's i_mutex lock.  Since the i_mutex lock of inode A is
still hold before the page fault happened, we enter a deadlock.

Signed-off-by: Jiaying Zhang &lt;jiayingz@google.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>block: separate priority boosting from REQ_META</title>
<updated>2011-08-23T12:50:29Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@infradead.org</email>
</author>
<published>2011-08-23T12:50:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=65299a3b788bd274bed92f9fa3232082c9f3ea70'/>
<id>urn:sha1:65299a3b788bd274bed92f9fa3232082c9f3ea70</id>
<content type='text'>
Add a new REQ_PRIO to let requests preempt others in the cfq I/O schedule,
and lave REQ_META purely for marking requests as metadata in blktrace.

All existing callers of REQ_META except for XFS are updated to also
set REQ_PRIO for now.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
</entry>
<entry>
<title>block: remove READ_META and WRITE_META</title>
<updated>2011-08-23T12:49:55Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@infradead.org</email>
</author>
<published>2011-08-23T12:49:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5dc06c5a70b79a323152bec7e55783e705767e63'/>
<id>urn:sha1:5dc06c5a70b79a323152bec7e55783e705767e63</id>
<content type='text'>
Replace all occurnanced of the undocumented READ_META with READ | REQ_META
and remove the unused WRITE_META define.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4</title>
<updated>2011-08-21T13:59:41Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-08-21T13:59:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c063d8a60fc912ae198f054608ad461a69dc9a19'/>
<id>urn:sha1:c063d8a60fc912ae198f054608ad461a69dc9a19</id>
<content type='text'>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: flush any pending end_io requests before DIO reads w/dioread_nolock
  ext4: fix nomblk_io_submit option so it correctly converts uninit blocks
  ext4: Resolve the hang of direct i/o read in handling EXT4_IO_END_UNWRITTEN.
  ext4: call ext4_ioend_wait and ext4_flush_completed_IO in ext4_evict_inode
  ext4: Fix ext4_should_writeback_data() for no-journal mode
</content>
</entry>
<entry>
<title>ext4: flush any pending end_io requests before DIO reads w/dioread_nolock</title>
<updated>2011-08-19T23:13:32Z</updated>
<author>
<name>Jiaying Zhang</name>
<email>jiayingz@google.com</email>
</author>
<published>2011-08-19T23:13:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dccaf33fa37a1bc5d651baeb3bfeb6becb86597b'/>
<id>urn:sha1:dccaf33fa37a1bc5d651baeb3bfeb6becb86597b</id>
<content type='text'>
There is a race between ext4 buffer write and direct_IO read with
dioread_nolock mount option enabled. The problem is that we clear
PageWriteback flag during end_io time but will do
uninitialized-to-initialized extent conversion later with dioread_nolock.
If an O_direct read request comes in during this period, ext4 will return
zero instead of the recently written data.

This patch checks whether there are any pending uninitialized-to-initialized
extent conversion requests before doing O_direct read to close the race.
Note that this is just a bandaid fix. The fundamental issue is that we
clear PageWriteback flag before we really complete an IO, which is
problem-prone. To fix the fundamental issue, we may need to implement an
extent tree cache that we can use to look up pending to-be-converted extents.

Signed-off-by: Jiaying Zhang &lt;jiayingz@google.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
</content>
</entry>
<entry>
<title>ext4: fix nomblk_io_submit option so it correctly converts uninit blocks</title>
<updated>2011-08-13T16:58:21Z</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2011-08-13T16:58:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9dd75f1f1a02d656a11a7b9b9e6c2759b9c1e946'/>
<id>urn:sha1:9dd75f1f1a02d656a11a7b9b9e6c2759b9c1e946</id>
<content type='text'>
Bug discovered by Jan Kara:

Finally, commit 1449032be17abb69116dbc393f67ceb8bd034f92 returned back
the old IO submission code but apparently it forgot to return the old
handling of uninitialized buffers so we unconditionnaly call
block_write_full_page() without specifying end_io function. So AFAICS
we never convert unwritten extents to written in some cases. For
example when I mount the fs as: mount -t ext4 -o
nomblk_io_submit,dioread_nolock /dev/ubdb /mnt and do
        int fd = open(argv[1], O_RDWR | O_CREAT | O_TRUNC, 0600);
        char buf[1024];
        memset(buf, 'a', sizeof(buf));
        fallocate(fd, 0, 0, 16384);
        write(fd, buf, sizeof(buf));

I get a file full of zeros (after remounting the filesystem so that
pagecache is dropped) instead of seeing the first KB contain 'a's.

Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
</content>
</entry>
<entry>
<title>ext4: Resolve the hang of direct i/o read in handling EXT4_IO_END_UNWRITTEN.</title>
<updated>2011-08-13T16:30:59Z</updated>
<author>
<name>Tao Ma</name>
<email>boyu.mt@taobao.com</email>
</author>
<published>2011-08-13T16:30:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=32c80b32c053dc52712dedac5e4d0aa7c93fc353'/>
<id>urn:sha1:32c80b32c053dc52712dedac5e4d0aa7c93fc353</id>
<content type='text'>
EXT4_IO_END_UNWRITTEN flag set and the increase of i_aiodio_unwritten
should be done simultaneously since ext4_end_io_nolock always clear
the flag and decrease the counter in the same time.

We don't increase i_aiodio_unwritten when setting
EXT4_IO_END_UNWRITTEN so it will go nagative and causes some process
to wait forever.

Part of the patch came from Eric in his e-mail, but it doesn't fix the
problem met by Michael actually.

http://marc.info/?l=linux-ext4&amp;m=131316851417460&amp;w=2

Reported-and-Tested-by: Michael Tokarev&lt;mjt@tls.msk.ru&gt;
Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Signed-off-by: Tao Ma &lt;boyu.mt@taobao.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
</content>
</entry>
<entry>
<title>ext4: call ext4_ioend_wait and ext4_flush_completed_IO in ext4_evict_inode</title>
<updated>2011-08-13T16:17:13Z</updated>
<author>
<name>Jiaying Zhang</name>
<email>jiayingz@google.com</email>
</author>
<published>2011-08-13T16:17:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2581fdc810889fdea97689cb62481201d579c796'/>
<id>urn:sha1:2581fdc810889fdea97689cb62481201d579c796</id>
<content type='text'>
Flush inode's i_completed_io_list before calling ext4_io_wait to
prevent the following deadlock scenario: A page fault happens while
some process is writing inode A. During page fault,
shrink_icache_memory is called that in turn evicts another inode
B. Inode B has some pending io_end work so it calls ext4_ioend_wait()
that waits for inode B's i_ioend_count to become zero. However, inode
B's ioend work was queued behind some of inode A's ioend work on the
same cpu's ext4-dio-unwritten workqueue. As the ext4-dio-unwritten
thread on that cpu is processing inode A's ioend work, it tries to
grab inode A's i_mutex lock. Since the i_mutex lock of inode A is
still hold before the page fault happened, we enter a deadlock.

Also moves ext4_flush_completed_IO and ext4_ioend_wait from
ext4_destroy_inode() to ext4_evict_inode(). During inode deleteion,
ext4_evict_inode() is called before ext4_destroy_inode() and in
ext4_evict_inode(), we may call ext4_truncate() without holding
i_mutex lock. As a result, there is a race between flush_completed_IO
that is called from ext4_ext_truncate() and ext4_end_io_work, which
may cause corruption on an io_end structure. This change moves
ext4_flush_completed_IO and ext4_ioend_wait from ext4_destroy_inode()
to ext4_evict_inode() to resolve the race between ext4_truncate() and
ext4_end_io_work during inode deletion.

Signed-off-by: Jiaying Zhang &lt;jiayingz@google.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
</content>
</entry>
<entry>
<title>ext4: Fix ext4_should_writeback_data() for no-journal mode</title>
<updated>2011-08-13T15:25:18Z</updated>
<author>
<name>Curt Wohlgemuth</name>
<email>curtw@google.com</email>
</author>
<published>2011-08-13T15:25:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=441c850857148935babe000fc2ba1455fe54a6a9'/>
<id>urn:sha1:441c850857148935babe000fc2ba1455fe54a6a9</id>
<content type='text'>
ext4_should_writeback_data() had an incorrect sequence of
tests to determine if it should return 0 or 1: in
particular, even in no-journal mode, 0 was being returned
for a non-regular-file inode.

This meant that, in non-journal mode, we would use
ext4_journalled_aops for directories, symlinks, and other
non-regular files.  However, calling journalled aop
callbacks when there is no valid handle, can cause problems.

This would cause a kernel crash with Jan Kara's commit
2d859db3e4 ("ext4: fix data corruption in inodes with
journalled data"), because we now dereference 'handle' in
ext4_journalled_write_end().

I also added BUG_ONs to check for a valid handle in the
obviously journal-only aops callbacks.

I tested this running xfstests with a scratch device in
these modes:

   - no-journal
   - data=ordered
   - data=writeback
   - data=journal

All work fine; the data=journal run has many failures and a
crash in xfstests 074, but this is no different from a
vanilla kernel.

Signed-off-by: Curt Wohlgemuth &lt;curtw@google.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
</content>
</entry>
</feed>
