<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/sync.c, branch v2.6.34</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v2.6.34</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v2.6.34'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2010-04-25T06:54:42Z</updated>
<entry>
<title>Catch filesystems lacking s_bdi</title>
<updated>2010-04-25T06:54:42Z</updated>
<author>
<name>Jörn Engel</name>
<email>joern@logfs.org</email>
</author>
<published>2010-04-25T06:54:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5129a469a91a91427334c40e29e64c6d0ab68caf'/>
<id>urn:sha1:5129a469a91a91427334c40e29e64c6d0ab68caf</id>
<content type='text'>
noop_backing_dev_info is used only as a flag to mark filesystems that
don't have any backing store, like tmpfs, procfs, spufs, etc.

Signed-off-by: Joern Engel &lt;joern@logfs.org&gt;

Changed the BUG_ON() to a WARN_ON(). Note that adding dirty inodes
to the noop_backing_dev_info is not legal and will not result in
them being flushed, but we already catch this condition in
__mark_inode_dirty() when checking for a registered bdi.

Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
</content>
</entry>
<entry>
<title>include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h</title>
<updated>2010-03-30T13:02:32Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2010-03-24T08:04:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5a0e3ad6af8660be21ca98a971cd00f331318c05'/>
<id>urn:sha1:5a0e3ad6af8660be21ca98a971cd00f331318c05</id>
<content type='text'>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -&gt; slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Guess-its-ok-by: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Lee Schermerhorn &lt;Lee.Schermerhorn@hp.com&gt;
</content>
</entry>
<entry>
<title>quota: move code from sync_quota_sb into vfs_quota_sync</title>
<updated>2010-03-04T23:20:24Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@infradead.org</email>
</author>
<published>2010-02-16T08:44:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5fb324ad24febe57a8a2e62903dcb7bad546ea71'/>
<id>urn:sha1:5fb324ad24febe57a8a2e62903dcb7bad546ea71</id>
<content type='text'>
Currenly sync_quota_sb does a lot of sync and truncate action that only
applies to "VFS" style quotas and is actively harmful for the sync
performance in XFS.  Move it into vfs_quota_sync and add a wait parameter
to -&gt;quota_sync to tell if we need it or not.

My audit of the GFS2 code says it's also not needed given the way GFS2
implements quotas, but I'd be happy if this can get a detailed review.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>fold do_sync_file_range into sys_sync_file_range</title>
<updated>2009-12-17T16:03:25Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2009-12-17T13:24:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7a0ad10c367ab57c899d340372f37880cbe6ab52'/>
<id>urn:sha1:7a0ad10c367ab57c899d340372f37880cbe6ab52</id>
<content type='text'>
We recently go rid of all callers of do_sync_file_range as they're better
served with vfs_fsync or the filemap_write_and_wait.  Now that
do_sync_file_range is down to a single caller fold it into it so that people
don't start using it again accidentally.  While at it also switch it from
using __filemap_fdatawrite_range(..., WB_SYNC_ALL) to the more clear
filemap_fdatawrite_range().

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>kill wait_on_page_writeback_range</title>
<updated>2009-12-10T14:02:50Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2009-09-30T20:16:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=94004ed726f38a841cc51f97c4a3f9eda9fbd0d9'/>
<id>urn:sha1:94004ed726f38a841cc51f97c4a3f9eda9fbd0d9</id>
<content type='text'>
All callers really want the more logical filemap_fdatawait_range interface,
so convert them to use it and merge wait_on_page_writeback_range into
filemap_fdatawait_range.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>vfs: Implement proper O_SYNC semantics</title>
<updated>2009-12-10T14:02:50Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2009-10-27T10:05:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6b2f3d1f769be5779b479c37800229d9a4809fc3'/>
<id>urn:sha1:6b2f3d1f769be5779b479c37800229d9a4809fc3</id>
<content type='text'>
While Linux provided an O_SYNC flag basically since day 1, it took until
Linux 2.4.0-test12pre2 to actually get it implemented for filesystems,
since that day we had generic_osync_around with only minor changes and the
great "For now, when the user asks for O_SYNC, we'll actually give
O_DSYNC" comment.  This patch intends to actually give us real O_SYNC
semantics in addition to the O_DSYNC semantics.  After Jan's O_SYNC
patches which are required before this patch it's actually surprisingly
simple, we just need to figure out when to set the datasync flag to
vfs_fsync_range and when not.

This patch renames the existing O_SYNC flag to O_DSYNC while keeping it's
numerical value to keep binary compatibility, and adds a new real O_SYNC
flag.  To guarantee backwards compatiblity it is defined as expanding to
both the O_DSYNC and the new additional binary flag (__O_SYNC) to make
sure we are backwards-compatible when compiled against the new headers.

This also means that all places that don't care about the differences can
just check O_DSYNC and get the right behaviour for O_SYNC, too - only
places that actuall care need to check __O_SYNC in addition.  Drivers and
network filesystems have been updated in a fail safe way to always do the
full sync magic if O_DSYNC is set.  The few places setting O_SYNC for
lower layers are kept that way for now to stay failsafe.

We enforce that O_DSYNC is set when __O_SYNC is set early in the open path
to make sure we always get these sane options.

Note that parisc really screwed up their headers as they already define a
O_DSYNC that has always been a no-op.  We try to repair it by using it for
the new O_DSYNC and redefinining O_SYNC to send both the traditional
O_SYNC numerical value _and_ the O_DSYNC one.

Cc: Richard Henderson &lt;rth@twiddle.net&gt;
Cc: Ivan Kokshaysky &lt;ink@jurassic.park.msu.ru&gt;
Cc: Grant Grundler &lt;grundler@parisc-linux.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Andreas Dilger &lt;adilger@sun.com&gt;
Acked-by: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Acked-by: Kyle McMartin &lt;kyle@mcmartin.ca&gt;
Acked-by: Ulrich Drepper &lt;drepper@redhat.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>fs/buffer.c: clean up EXPORT* macros</title>
<updated>2009-09-23T14:39:29Z</updated>
<author>
<name>H Hartley Sweeten</name>
<email>hartleys@visionengravers.com</email>
</author>
<published>2009-09-22T23:43:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1fe72eaa0f46a0fa4cdcd8f3f7853b6d39469784'/>
<id>urn:sha1:1fe72eaa0f46a0fa4cdcd8f3f7853b6d39469784</id>
<content type='text'>
According to Documentation/CodingStyle the EXPORT* macro should follow
immediately after the closing function brace line.

Also, mark_buffer_async_write_endio() and do_thaw_all() are not used
elsewhere so they should be marked as static.

In addition, file_fsync() is actually in fs/sync.c so move the EXPORT* to
that file.

Signed-off-by: H Hartley Sweeten &lt;hsweeten@visionengravers.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs: Assign bdi in super_block</title>
<updated>2009-09-16T13:18:51Z</updated>
<author>
<name>Jens Axboe</name>
<email>jens.axboe@oracle.com</email>
</author>
<published>2009-09-16T13:02:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=32a88aa1b6dfb901cec64e1898cac78d0f25028a'/>
<id>urn:sha1:32a88aa1b6dfb901cec64e1898cac78d0f25028a</id>
<content type='text'>
We do this automatically in get_sb_bdev() from the set_bdev_super()
callback. Filesystems that have their own private backing_dev_info
must assign that in -&gt;fill_super().

Note that -&gt;s_bdi assignment is required for proper writeback!

Acked-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
</content>
</entry>
<entry>
<title>fsync: wait for data writeout completion before calling -&gt;fsync</title>
<updated>2009-09-14T15:08:17Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2009-09-03T10:39:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2daea67e966dc0c42067ebea015ddac6834cef88'/>
<id>urn:sha1:2daea67e966dc0c42067ebea015ddac6834cef88</id>
<content type='text'>
Currenly vfs_fsync(_range) first calls filemap_fdatawrite to write out
the data, the calls into -&gt;fsync to write out the metadata and then finally
calls filemap_fdatawait to wait for the data I/O to complete.  What sounds
like a clever micro-optimization actually is nast trap for many filesystems.

For many modern filesystems i_size or other inode information is only
updated on I/O completion and we need to wait for I/O to finish before
we can write out the metadata.  For old fashionen filesystems that
instanciate blocks during the actual write and also update the metadata
at that point it opens up a large window were we could expose uninitialized
blocks after a crash.  While a few filesystems that need it already wait
for the I/O to finish inside their -&gt;fsync methods it is rather suboptimal
as it is done under the i_mutex and also always for the whole file instead
of just a part as we could do for O_SYNC handling.

Here is a small audit of all fsync instances in the tree:

 - spufs_mfc_fsync:
 - ps3flash_fsync:
 - vol_cdev_fsync:
 - printer_fsync:
 - fb_deferred_io_fsync:
 - bad_file_fsync:
 - simple_sync_file:

	don't care - filesystems/drivers do't use the page cache or are
	purely in-memory.

 - simple_fsync:
 - file_fsync:
 - affs_file_fsync:
 - fat_file_fsync:
 - jfs_fsync:
 - ubifs_fsync:
 - reiserfs_dir_fsync:
 - reiserfs_sync_file:

	never touch pagecache themselves.  We need to wait before if we do
	not want to expose stale data after an allocation.

 - afs_fsync:
 - fuse_fsync_common:

	do the waiting writeback itself in awkward ways, would benefit from
	proper semantics

 - block_fsync:

	Does a filemap_write_and_wait on the block device inode.  Because we
	now have f_mapping that is the same inode we call it on in vfs_fsync.
	So just removing it and letting the VFS do the work in one go would
	be an improvement.

 - btrfs_sync_file:
 - cifs_fsync:
 - xfs_file_fsync:

	need the wait first and currently do it themselves. would benefit from
	doing it outside i_mutex.

 - coda_fsync:
 - ecryptfs_fsync:
 - exofs_file_fsync:
 - shm_fsync:

	only passes the fsync through to the lower layer

 - ext3_sync_file:

	doesn't seem to care, comments are confusing.

 - ext4_sync_file:

	would need the wait to work correctly for delalloc mode with late
	i_size updates.  Otherwise the ext3 comment applies.

	currently implemens it's own writeback and wait in an odd way,
	could benefit from doing it properly.

 - gfs2_fsync:

	not needed for journaled data mode, but probably harmless there.
	Currently writes back data asynchronously itself.  Needs some
	major audit.

 - hostfs_fsync:

	just calls fsync/datasync on the host FD.  Without the wait before
	data might not even be inflight yet if we're unlucky.

 - hpfs_file_fsync:
 - ncp_fsync:

	no-ops.  Dangerous before and after.

 - jffs2_fsync:

	just calls jffs2_flush_wbuf_gc, not sure how this relates to data.

 - nfs_fsync_dir:

	just increments stats, claims all directory operations are synchronous

 - nfs_file_fsync:

	only writes out data???  Looks very odd.

 - nilfs_sync_file:

	looks like it expects all data done, but not sure from the code

 - ntfs_dir_fsync:
 - ntfs_file_fsync:

	appear to do their own data writeback.  Very convoluted code.

 - ocfs2_sync_file:

	does it's own data writeback, but no wait.  probably needs the wait.

 - smb_fsync:

	according to a comment expects all pages written already, probably needs
	the wait before.

This patch only changes vfs_fsync_range, removal of the wait in the methods
that have it is left to the filesystem maintainers.  Note that most
filesystems really do need an audit for their fsync methods given the
gems found in this very brief audit.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode</title>
<updated>2009-09-14T15:08:15Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2009-08-17T17:52:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=148f948ba877f4d3cdef036b1ff6d9f68986706a'/>
<id>urn:sha1:148f948ba877f4d3cdef036b1ff6d9f68986706a</id>
<content type='text'>
Introduce new function for generic inode syncing (vfs_fsync_range) and use
it from fsync() path. Introduce also new helper for syncing after a sync
write (generic_write_sync) using the generic function.

Use these new helpers for syncing from generic VFS functions. This makes
O_SYNC writes to block devices acquire i_mutex for syncing. If we really
care about this, we can make block_fsync() drop the i_mutex and reacquire
it before it returns.

CC: Evgeniy Polyakov &lt;zbr@ioremap.net&gt;
CC: ocfs2-devel@oss.oracle.com
CC: Joel Becker &lt;joel.becker@oracle.com&gt;
CC: Felix Blyakher &lt;felixb@sgi.com&gt;
CC: xfs@oss.sgi.com
CC: Anton Altaparmakov &lt;aia21@cantab.net&gt;
CC: linux-ntfs-dev@lists.sourceforge.net
CC: OGAWA Hirofumi &lt;hirofumi@mail.parknet.co.jp&gt;
CC: linux-ext4@vger.kernel.org
CC: tytso@mit.edu
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
</feed>
