<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/dir.h, branch v2.7.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.7.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.7.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2015-05-26T20:24:46Z</updated>
<entry>
<title>Merge branch 'nd/untracked-cache'</title>
<updated>2015-05-26T20:24:46Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2015-05-26T20:24:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=38ccaf93bbf5a99dbff908068292ffaa5bafe25e'/>
<id>urn:sha1:38ccaf93bbf5a99dbff908068292ffaa5bafe25e</id>
<content type='text'>
Teach the index to optionally remember already seen untracked files
to speed up "git status" in a working tree with tons of cruft.

* nd/untracked-cache: (24 commits)
  git-status.txt: advertisement for untracked cache
  untracked cache: guard and disable on system changes
  mingw32: add uname()
  t7063: tests for untracked cache
  update-index: test the system before enabling untracked cache
  update-index: manually enable or disable untracked cache
  status: enable untracked cache
  untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHE
  untracked cache: mark index dirty if untracked cache is updated
  untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATS
  untracked cache: avoid racy timestamps
  read-cache.c: split racy stat test to a separate function
  untracked cache: invalidate at index addition or removal
  untracked cache: load from UNTR index extension
  untracked cache: save to an index extension
  ewah: add convenient wrapper ewah_serialize_strbuf()
  untracked cache: don't open non-existent .gitignore
  untracked cache: mark what dirs should be recursed/saved
  untracked cache: record/validate dir mtime and reuse cached output
  untracked cache: make a wrapper around {open,read,close}dir()
  ...
</content>
</entry>
<entry>
<title>Merge branch 'jc/report-path-error-to-dir'</title>
<updated>2015-03-26T18:57:13Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2015-03-26T18:57:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=574ee8ae8636ffad8146e0f8e648b866dad725e6'/>
<id>urn:sha1:574ee8ae8636ffad8146e0f8e648b866dad725e6</id>
<content type='text'>
Code clean-up.

* jc/report-path-error-to-dir:
  report_path_error(): move to dir.c
</content>
</entry>
<entry>
<title>report_path_error(): move to dir.c</title>
<updated>2015-03-24T21:12:10Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2015-03-24T21:12:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=777c55a61615837d4391facd75cf334b96635801'/>
<id>urn:sha1:777c55a61615837d4391facd75cf334b96635801</id>
<content type='text'>
The expected call sequence is for the caller to use match_pathspec()
repeatedly on a set of pathspecs, accumulating the "hits" in a
separate array, and then call this function to diagnose a pathspec
that never matched anything, as that can indicate a typo from the
command line, e.g. "git commit Maekfile".

Many builtin commands use this function from builtin/ls-files.c,
which is not a very healthy arrangement.  ls-files might have been
the first command to feel the need for such a helper, but the need
is shared by everybody who uses the "match and then report" pattern.

Move it to dir.c where match_pathspec() is defined.
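
A minimal sketch of the "match and then report" pattern described
above, in Python rather than git's C. The two functions only borrow
the names match_pathspec() and report_path_error(); their signatures,
the fnmatch-based matching and the error text are assumptions made
for illustration, not git's actual code.

```python
import fnmatch


def match_pathspec(pathspecs, path, seen):
    """Return True if path matches any pathspec, recording hits in seen."""
    matched = False
    for i, spec in enumerate(pathspecs):
        if fnmatch.fnmatchcase(path, spec):
            seen[i] = True
            matched = True
    return matched


def report_path_error(seen, pathspecs):
    """Return every pathspec that never matched anything."""
    return [spec for spec, hit in zip(pathspecs, seen) if not hit]


# Hypothetical input: one misspelled pathspec and one glob.
pathspecs = ["Maekfile", "*.c"]
seen = [False] * len(pathspecs)
for path in ["Makefile", "dir.c", "dir.h"]:
    match_pathspec(pathspecs, path, seen)

unmatched = report_path_error(seen, pathspecs)
for spec in unmatched:
    print("error: pathspec '%s' did not match any files" % spec)
```

The caller owns the "seen" array, so the matching loop and the final
diagnosis stay independent of whichever command drives them.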

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>untracked cache: guard and disable on system changes</title>
<updated>2015-03-12T20:45:18Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2015-03-08T10:12:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1e8fef609e78110e276df633c5ba1fb1f1589fa5'/>
<id>urn:sha1:1e8fef609e78110e276df633c5ba1fb1f1589fa5</id>
<content type='text'>
If the user enables the untracked cache and then

 - moves the worktree to an unsupported filesystem
 - or simply upgrades the OS
 - or moves the whole (portable) disk from one machine to another
 - or accesses a shared fs from another machine

there's no guarantee that untracked cache can still function properly.
Record the worktree location and OS footprint in the cache. If it
changes, err on the safe side and disable the cache. The user can
'update-index --untracked-cache' again to make sure all conditions are
met.
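
A rough sketch of such a guard, in Python rather than git's C. The
fingerprint format, the dict-based cache and the helper names
current_ident() and load_cache() are all hypothetical; only the idea
of recording location plus OS details and dropping the cache on a
mismatch comes from this change.

```python
import os
import platform


def current_ident(worktree):
    """Build a fingerprint from the worktree location and OS details."""
    u = platform.uname()
    return "%s|%s|%s" % (os.path.abspath(worktree), u.system, u.release)


def load_cache(cache, worktree):
    """Return the cache only if it was recorded under the same fingerprint."""
    if cache is None or cache.get("ident") != current_ident(worktree):
        return None  # err on the safe side: drop the cache
    return cache


# The cache is recorded for the current directory ...
saved = {"ident": current_ident("."), "dirs": {}}
# ... so it is reused there, but dropped for any other location.
same = load_cache(saved, ".")
moved = load_cache(saved, os.path.join(".", "elsewhere"))
print(same is saved, moved is None)
```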

This adds a new requirement that setup_git_directory* must be called
before read_cache() because we need worktree location by then, or the
cache is dropped.

This change does not cover all bases; you can fool it if you try
hard. The point is to stop accidents.

Helped-by: Eric Sunshine &lt;sunshine@sunshineco.com&gt;
Helped-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Helped-by: Torsten Bögershausen &lt;tboegi@web.de&gt;
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>untracked cache: invalidate at index addition or removal</title>
<updated>2015-03-12T20:45:16Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2015-03-08T10:12:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e931371a8f1164185486a1f5fdaaa708b4a6217c'/>
<id>urn:sha1:e931371a8f1164185486a1f5fdaaa708b4a6217c</id>
<content type='text'>
Ideally we should implement untracked_cache_remove_from_index() and
untracked_cache_add_to_index() so that they update untracked cache
right away instead of invalidating it and waiting for read_directory()
to deal with it next time. But that may need some more work in
unpack-trees.c. So keep it simple as a first step.

The new call in add_index_entry_with_check() may look strange because
new calls usually stay close to cache_tree_invalidate_path(). We do it
a bit later than c_t_i_p() in this function because if it's about
replacing the entry with the same name, we don't care (but cache-tree
does).

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>untracked cache: load from UNTR index extension</title>
<updated>2015-03-12T20:45:16Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2015-03-08T10:12:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f9e6c649589e0940ccb82821107fb658277ed86b'/>
<id>urn:sha1:f9e6c649589e0940ccb82821107fb658277ed86b</id>
<content type='text'>
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>untracked cache: save to an index extension</title>
<updated>2015-03-12T20:45:16Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2015-03-08T10:12:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=83c094ad0dd2104adbbec034f802dceb1d052981'/>
<id>urn:sha1:83c094ad0dd2104adbbec034f802dceb1d052981</id>
<content type='text'>
Helped-by: Stefan Beller &lt;sbeller@google.com&gt;
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>untracked cache: mark what dirs should be recursed/saved</title>
<updated>2015-03-12T20:45:16Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2015-03-08T10:12:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=26cb0182b8b2e119f469750b3511fac4624f6667'/>
<id>urn:sha1:26cb0182b8b2e119f469750b3511fac4624f6667</id>
<content type='text'>
If we redo this thing in a functional style, we would have one struct
untracked_dir as input tree and another as output. The input is used
for verification. The output is a brand new tree, reflecting current
worktree.

But that means recreating a lot of dir nodes even if a lot could be
shared between input and output trees in good cases. So we go with the
messy but efficient way, combining both input and output trees into
one. We need a way to know which node in this combined tree belongs to
the output. This is the purpose of this "recurse" flag.

The "valid" bit can't be used for this because it's about data of the
node excluding the subdirs. When we invalidate a directory, we want to
keep cached data of the subdirs intact even though we don't really know
which subdirs still exist (yet). Then we check the worktree to see which
subdirs actually remain on disk. Those will have the 'recurse' bit set
again. If cached data for those are still valid, we may be able to
avoid computing exclude files for them. Subdirs that were deleted
will have 'recurse' remain clear, and their 'valid' bits do not
matter.
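
The interplay of the two bits can be sketched as follows, in Python
rather than git's C. The UntrackedDir class and the rescan() helper
are hypothetical stand-ins for struct untracked_dir and the directory
walk; only the roles of "valid" and "recurse" are taken from the
description above.

```python
class UntrackedDir:
    """One node of the combined input/output tree."""

    def __init__(self, name):
        self.name = name
        self.valid = False    # cached data of this node (not its subdirs) is usable
        self.recurse = False  # node belongs to the output of the current scan
        self.untracked = []
        self.subdirs = {}


def rescan(node, on_disk_subdirs):
    """Mark which cached subdirs still exist, keeping their cached data."""
    for sub in node.subdirs.values():
        sub.recurse = False
    for name in on_disk_subdirs:
        if name not in node.subdirs:
            node.subdirs[name] = UntrackedDir(name)
        node.subdirs[name].recurse = True
    return [s.name for s in node.subdirs.values() if s.recurse]


root = UntrackedDir("")
for name in ("a", "b"):
    root.subdirs[name] = UntrackedDir(name)
    root.subdirs[name].valid = True

# Suppose "b" was deleted and "c" appeared since the cache was written.
output = rescan(root, ["a", "c"])
print(output)
```

Here "a" keeps its valid cached data and is part of the output, "b"
stays in the tree with 'recurse' clear, and "c" is new and not yet
valid.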

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>untracked cache: record/validate dir mtime and reuse cached output</title>
<updated>2015-03-12T20:45:15Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2015-03-08T10:12:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=91a2288b5f63fba82e912dca475154d5b9dd233a'/>
<id>urn:sha1:91a2288b5f63fba82e912dca475154d5b9dd233a</id>
<content type='text'>
The main readdir loop in read_directory_recursive() is replaced with a
new one that checks whether the cached results of a directory are still
valid.

If a file is added to or removed from the index, the containing directory
is invalidated (but not its subdirs). If a directory's mtime changes,
the same happens. If a .gitignore is updated, the containing directory
and all subdirs are invalidated recursively. If dir_struct#flags or
other conditions change, the cache is ignored.

If a directory is invalidated, we opendir/readdir/closedir and run the
exclude machinery on that directory listing as usual. If untracked
cache is also enabled, we'll update the cache along the way. If a
directory is validated, we simply pull the untracked listing out from
the cache. The cache also records the list of direct subdirs that we
have to recurse in. Fully excluded directories are seen as "untracked
files".

In the best case when no dirs are invalidated, read_directory()
becomes a series of

  stat(dir), open(.gitignore), fstat(), read(), close() and optionally
  hash_sha1_file()

For comparison, standard read_directory() is a sequence of

  opendir(), readdir(), open(.gitignore), fstat(), read(), close(), the
  expensive last_exclude_matching() and closedir().

We already try not to open(.gitignore) if we know it does not exist,
so open/fstat/read/close sequence does not apply to every
directory. The sequence could be reduced further, as noted in
prep_exclude() in another patch. So in theory, the entire best-case
read_directory sequence could be reduced to a series of stat() and
nothing else.

This is not a silver bullet approach. When you compile a C file, for
example, the old .o file is removed and a new one with the same name
created, effectively invalidating the containing directory's cache
(but not its subdirectories). If your build process touches every
directory, this cache adds extra overhead for nothing, so it's a good
idea to separate generated files from tracked files. Editors may use
the same strategy for saving files. And of course you're out of luck
running your repo on an unsupported filesystem and/or operating system.
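
The mtime check can be sketched for a single directory as follows,
in Python rather than git's C. list_untracked() and the dict-based
cache are hypothetical; the sketch ignores .gitignore handling, index
invalidation, subdirectories and racy timestamps, and only shows how
an unchanged mtime lets the cached listing replace a readdir pass.

```python
import os
import tempfile


def list_untracked(path, tracked, cache):
    """Return (untracked names, cache_hit) for one directory."""
    mtime = os.stat(path).st_mtime_ns
    entry = cache.get(path)
    if entry is not None and entry["mtime"] == mtime:
        return entry["untracked"], True  # cache hit: no directory read
    names = sorted(n for n in os.listdir(path) if n not in tracked)
    cache[path] = {"mtime": mtime, "untracked": names}
    return names, False


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "tracked.c"), "w").close()
    open(os.path.join(d, "cruft.o"), "w").close()
    cache = {}
    first = list_untracked(d, {"tracked.c"}, cache)
    second = list_untracked(d, {"tracked.c"}, cache)

    # A new file changes the directory; the mtime is also set by hand
    # so the sketch does not depend on filesystem timestamp granularity.
    open(os.path.join(d, "new.o"), "w").close()
    os.utime(d, ns=(10**9, 10**9))
    third = list_untracked(d, {"tracked.c"}, cache)

print(first, second, third)
```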

Helped-by: Eric Sunshine &lt;sunshine@sunshineco.com&gt;
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>untracked cache: initial untracked cache validation</title>
<updated>2015-03-12T20:45:15Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2015-03-08T10:12:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ccad261f07900b55029f3fd42a9ec8f17229808f'/>
<id>urn:sha1:ccad261f07900b55029f3fd42a9ec8f17229808f</id>
<content type='text'>
Make sure the starting conditions and all global exclude files are
good to go. If not, either disable untracked cache completely, or wipe
out the cache and start fresh.

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
