<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/dir.h, branch v2.36.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.36.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.36.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2021-12-09T21:33:13Z</updated>
<entry>
<title>dir: new flag to remove_dir_recurse() to spare the original_cwd</title>
<updated>2021-12-09T21:33:13Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-12-09T05:08:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=580a5d7f75fc7b6c4c369ef429742d9d417acddd'/>
<id>urn:sha1:580a5d7f75fc7b6c4c369ef429742d9d417acddd</id>
<content type='text'>
remove_dir_recurse(), and its non-static wrapper called
remove_dir_recursively(), both take flags for modifying its behavior.
As with the previous commits, we would generally like to protect
the original_cwd, but we want to forced user commands (e.g. 'git rm -rf
...') or other special cases to remove it.  Add a flag for this purpose.
After reading through every caller of remove_dir_recursively() in the
current codebase, there was only one that should be adjusted and that
one only in a very unusual circumstance.  Add a pair of new testcases to
highlight that very specific case involving submodules &amp;&amp; --git-dir &amp;&amp;
--work-tree.

Acked-by: Derrick Stolee &lt;stolee@gmail.com&gt;
Acked-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>dir: avoid incidentally removing the original_cwd in remove_path()</title>
<updated>2021-12-09T21:33:13Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-12-09T05:08:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=63bbe8beb78ee8af5a7faeee4be747a82d8e2dc7'/>
<id>urn:sha1:63bbe8beb78ee8af5a7faeee4be747a82d8e2dc7</id>
<content type='text'>
Modern git often tries to avoid leaving empty directories around when
removing files.  Originally, it did not bother.  This behavior started
with commit 80e21a9ed809 (merge-recursive::removeFile: remove empty
directories, 2005-11-19), stating the reason simply as:

    When the last file in a directory is removed as the result of a
    merge, try to rmdir the now-empty directory.

This was reimplemented in C and renamed to remove_path() in commit
e1b3a2cad7 ("Build-in merge-recursive", 2008-02-07), but was still
internal to merge-recursive.

This trend towards removing leading empty directories continued with
commit d9b814cc97f1 (Add builtin "git rm" command, 2006-05-19), which
stated the reasoning as:

    The other question is what to do with leading directories. The old
    "git rm" script didn't do anything, which is somewhat inconsistent.
    This one will actually clean up directories that have become empty
    as a result of removing the last file, but maybe we want to have a
    flag to decide the behaviour?

remove_path() in dir.c was added in 4a92d1bfb784 (Add remove_path: a
function to remove as much as possible of a path, 2008-09-27), because
it was noted that we had two separate implementations of the same idea
AND both were buggy.  It described the purpose of the function as

    a function to remove as much as possible of a path

Why remove as much as possible?  Well, at the time we probably would
have said something like:

  * removing leading directories makes things feel tidy
  * removing leading directories doesn't hurt anything so long as they
    had no files in them.

But I don't believe those reasons hold when the empty directory happens
to be the current working directory we inherited from our parent
process.  Leaving the parent process in a deleted directory can cause
user confusion when subsequent processes fail: any git command, for
example, will immediately fail with

    fatal: Unable to read current working directory: No such file or directory

Other commands may similarly get confused.  Modify remove_path() so that
the empty leading directories it also deletes does not include the
current working directory we inherited from our parent process.  I have
looked through every caller of remove_path() in the current codebase to
make sure that all should take this change.

Acked-by: Derrick Stolee &lt;stolee@gmail.com&gt;
Acked-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ds/sparse-index-ignored-files'</title>
<updated>2021-09-20T22:20:44Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-09-20T22:20:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=dc89c34d9e9237293d0ed73adc454fedfc620f74'/>
<id>urn:sha1:dc89c34d9e9237293d0ed73adc454fedfc620f74</id>
<content type='text'>
In cone mode, the sparse-index code path learned to remove ignored
files (like build artifacts) outside the sparse cone, allowing the
entire directory outside the sparse cone to be removed, which is
especially useful when the sparse patterns change.

* ds/sparse-index-ignored-files:
  sparse-checkout: clear tracked sparse dirs
  sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag
  attr: be careful about sparse directories
  sparse-checkout: create helper methods
  sparse-index: use WRITE_TREE_MISSING_OK
  sparse-index: silently return when cache tree fails
  unpack-trees: fix nested sparse-dir search
  sparse-index: silently return when not using cone-mode patterns
  t7519: rewrite sparse index test
</content>
</entry>
<entry>
<title>sparse-checkout: create helper methods</title>
<updated>2021-09-08T05:41:10Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2021-09-08T01:42:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=02155c8c005d0f8ee20a38245e9a065e8b5cc7dc'/>
<id>urn:sha1:02155c8c005d0f8ee20a38245e9a065e8b5cc7dc</id>
<content type='text'>
As we integrate the sparse index into more builtins, we occasionally
need to check the sparse-checkout patterns to see if a path is within
the sparse-checkout cone. Create some helper methods that help
initialize the patterns and check for pattern matching to make this
easier.

The existing callers of commands like get_sparse_checkout_patterns() use
a custom 'struct pattern_list' that is not necessarily the one in the
'struct index_state', so there are not many previous uses that could
adopt these helpers. There are just two in builtin/add.c and
sparse-index.c that can use path_in_sparse_checkout().

We add a path_in_cone_mode_sparse_checkout() as well that will only
return false if the path is outside of the sparse-checkout definition
_and_ the sparse-checkout patterns are in cone mode.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Reviewed-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>dir: libify and export helper functions from clone.c</title>
<updated>2021-08-10T18:45:11Z</updated>
<author>
<name>Atharva Raykar</name>
<email>raykar.ath@gmail.com</email>
</author>
<published>2021-08-10T11:46:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ed86301f68fcbb17c5d1c7a3258e4705b3b1da9c'/>
<id>urn:sha1:ed86301f68fcbb17c5d1c7a3258e4705b3b1da9c</id>
<content type='text'>
These functions can be useful to other parts of Git. Let's move them to
dir.c, while renaming them to be make their functionality more explicit.

Signed-off-by: Atharva Raykar &lt;raykar.ath@gmail.com&gt;
Mentored-by: Christian Couder &lt;christian.couder@gmail.com&gt;
Mentored-by: Shourya Shukla &lt;periperidip@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ew/many-alternate-optim'</title>
<updated>2021-07-28T20:17:57Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-07-28T20:17:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e5cc59c77c875aeda93a3cdfe70c8254164324ab'/>
<id>urn:sha1:e5cc59c77c875aeda93a3cdfe70c8254164324ab</id>
<content type='text'>
Optimization for repositories with many alternate object store.

* ew/many-alternate-optim:
  oidtree: a crit-bit tree for odb_loose_cache
  oidcpy_with_padding: constify `src' arg
  make object_directory.loose_objects_subdir_seen a bitmap
  avoid strlen via strbuf_addstr in link_alt_odb_entry
  speed up alt_odb_usable() with many alternates
</content>
</entry>
<entry>
<title>speed up alt_odb_usable() with many alternates</title>
<updated>2021-07-08T00:21:12Z</updated>
<author>
<name>Eric Wong</name>
<email>e@80x24.org</email>
</author>
<published>2021-07-07T23:10:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=cf2dc1c238c6fd5f93c315a3045ccf95459701cd'/>
<id>urn:sha1:cf2dc1c238c6fd5f93c315a3045ccf95459701cd</id>
<content type='text'>
With many alternates, the duplicate check in alt_odb_usable()
wastes many cycles doing repeated fspathcmp() on every existing
alternate.  Use a khash to speed up lookups by odb-&gt;path.

Since the kh_put_* API uses the supplied key without
duplicating it, we also take advantage of it to replace both
xstrdup() and strbuf_release() in link_alt_odb_entry() with
strbuf_detach() to avoid the allocation and copy.

In a test repository with 50K alternates and each of those 50K
alternates having one alternate each (for a total of 100K total
alternates); this speeds up lookup of a non-existent blob from
over 16 minutes to roughly 2.7 seconds on my busy workstation.

Note: all underlying git object directories were small and
unpacked with only loose objects and no packs.  Having to load
packs increases times significantly.

Signed-off-by: Eric Wong &lt;e@80x24.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>dir.[ch]: replace dir_init() with DIR_INIT</title>
<updated>2021-07-01T19:32:22Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-07-01T10:51:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ce93a4c6127abdf1ad9eacd537edd1c571a18e41'/>
<id>urn:sha1:ce93a4c6127abdf1ad9eacd537edd1c571a18e41</id>
<content type='text'>
Remove the dir_init() function and replace it with a DIR_INIT
macro. In many cases in the codebase we need to initialize things with
a function for good reasons, e.g. needing to call another function on
initialization. The "dir_init()" function was not one such case, and
could trivially be replaced with a more idiomatic macro initialization
pattern.

The only place where we made use of its use of memset() was in
dir_clear() itself, which resets the contents of an an existing struct
pointer. Let's use the new "memcpy() a 'blank' struct on the stack"
idiom to do that reset.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'en/dir-traversal'</title>
<updated>2021-05-19T23:54:59Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-05-19T23:54:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=33be431c0c7284c1adf0fe49f7838dbc8aee6ea9'/>
<id>urn:sha1:33be431c0c7284c1adf0fe49f7838dbc8aee6ea9</id>
<content type='text'>
"git clean" and "git ls-files -i" had confusion around working on
or showing ignored paths inside an ignored directory, which has
been corrected.

* en/dir-traversal:
  dir: introduce readdir_skip_dot_and_dotdot() helper
  dir: update stale description of treat_directory()
  dir: traverse into untracked directories if they may have ignored subfiles
  dir: avoid unnecessary traversal into ignored directory
  t3001, t7300: add testcase showcasing missed directory traversal
  t7300: add testcase showing unnecessary traversal into ignored directory
  ls-files: error out on -i unless -o or -c are specified
  dir: report number of visited directories and paths with trace2
  dir: convert trace calls to trace2 equivalents
</content>
</entry>
<entry>
<title>dir: introduce readdir_skip_dot_and_dotdot() helper</title>
<updated>2021-05-12T23:45:03Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-05-12T17:28:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b548f0f1568f6b01e55ca69c24d3cb19489f92aa'/>
<id>urn:sha1:b548f0f1568f6b01e55ca69c24d3cb19489f92aa</id>
<content type='text'>
Many places in the code were doing
    while ((d = readdir(dir)) != NULL) {
        if (is_dot_or_dotdot(d-&gt;d_name))
            continue;
        ...process d...
    }
Introduce a readdir_skip_dot_and_dotdot() helper to make that a one-liner:
    while ((d = readdir_skip_dot_and_dotdot(dir)) != NULL) {
        ...process d...
    }

This helper particularly simplifies checks for empty directories.

Also use this helper in read_cached_dir() so that our statistics are
consistent across platforms.  (In other words, read_cached_dir() should
have been using is_dot_or_dotdot() and skipping such entries, but did
not and left it to treat_path() to detect and mark such entries as
path_none.)

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
