<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/dir.c, branch v2.32.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.32.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.32.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2021-05-28T04:03:00Z</updated>
<entry>
<title>Merge branch 'en/dir-traversal'</title>
<updated>2021-05-28T04:03:00Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-05-28T04:03:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=329d63e7be044524d29dcb43e8ed52bc1f3b9241'/>
<id>urn:sha1:329d63e7be044524d29dcb43e8ed52bc1f3b9241</id>
<content type='text'>
Fix-up to a topic that is already in 'master'.

* en/dir-traversal:
  dir: introduce readdir_skip_dot_and_dotdot() helper
  dir: update stale description of treat_directory()
  Revert "dir: update stale description of treat_directory()"
  Revert "dir: introduce readdir_skip_dot_and_dotdot() helper"
</content>
</entry>
<entry>
<title>dir: introduce readdir_skip_dot_and_dotdot() helper</title>
<updated>2021-05-27T05:02:37Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-05-27T04:53:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=906fc557b70b2b2995785c9b37e212d2f86b469e'/>
<id>urn:sha1:906fc557b70b2b2995785c9b37e212d2f86b469e</id>
<content type='text'>
Many places in the code were doing
    while ((d = readdir(dir)) != NULL) {
        if (is_dot_or_dotdot(d-&gt;d_name))
            continue;
        ...process d...
    }
Introduce a readdir_skip_dot_and_dotdot() helper to make that a one-liner:
    while ((d = readdir_skip_dot_and_dotdot(dir)) != NULL) {
        ...process d...
    }

This helper particularly simplifies checks for empty directories.

Also use this helper in read_cached_dir() so that our statistics are
consistent across platforms.  (In other words, read_cached_dir() should
have been using is_dot_or_dotdot() and skipping such entries, but did
not and left it to treat_path() to detect and mark such entries as
path_none.)

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>dir: update stale description of treat_directory()</title>
<updated>2021-05-27T05:02:37Z</updated>
<author>
<name>Derrick Stolee</name>
<email>stolee@gmail.com</email>
</author>
<published>2021-05-27T04:53:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=eef814828f175290ca1fcd17ad74faff95346c85'/>
<id>urn:sha1:eef814828f175290ca1fcd17ad74faff95346c85</id>
<content type='text'>
The documentation comment for treat_directory() was originally written
in 095952 (Teach directory traversal about subprojects, 2007-04-11)
which was before the 'struct dir_struct' split its bitfield of named
options into a 'flags' enum in 7c4c97c0 (Turn the flags in struct
dir_struct into a single variable, 2009-02-16). When those flags
changed, the comment became stale, since members like
'show_other_directories' transitioned into flags like
DIR_SHOW_OTHER_DIRECTORIES.

Update the comments for treat_directory() to use these flag names rather
than the old member names.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Revert "dir: update stale description of treat_directory()"</title>
<updated>2021-05-27T05:02:37Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-05-27T05:00:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2c9f1bfdb47acde21052bb33ff083347cbcb574d'/>
<id>urn:sha1:2c9f1bfdb47acde21052bb33ff083347cbcb574d</id>
<content type='text'>
This reverts commit 4e689d81718eb6e939cace317ea3e33cb994dcbb,
to be replaced with a reworked version.
</content>
</entry>
<entry>
<title>Revert "dir: introduce readdir_skip_dot_and_dotdot() helper"</title>
<updated>2021-05-27T05:02:37Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-05-27T04:59:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1df046bcff6085f3800088c51e344594f822df57'/>
<id>urn:sha1:1df046bcff6085f3800088c51e344594f822df57</id>
<content type='text'>
This reverts commit b548f0f1568f6b01e55ca69c24d3cb19489f92aa,
to be replaced with a reworked version.
</content>
</entry>
<entry>
<title>Merge branch 'en/dir-traversal'</title>
<updated>2021-05-19T23:54:59Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-05-19T23:54:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=33be431c0c7284c1adf0fe49f7838dbc8aee6ea9'/>
<id>urn:sha1:33be431c0c7284c1adf0fe49f7838dbc8aee6ea9</id>
<content type='text'>
"git clean" and "git ls-files -i" had confusion around working on
or showing ignored paths inside an ignored directory, which has
been corrected.

* en/dir-traversal:
  dir: introduce readdir_skip_dot_and_dotdot() helper
  dir: update stale description of treat_directory()
  dir: traverse into untracked directories if they may have ignored subfiles
  dir: avoid unnecessary traversal into ignored directory
  t3001, t7300: add testcase showcasing missed directory traversal
  t7300: add testcase showing unnecessary traversal into ignored directory
  ls-files: error out on -i unless -o or -c are specified
  dir: report number of visited directories and paths with trace2
  dir: convert trace calls to trace2 equivalents
</content>
</entry>
<entry>
<title>dir: introduce readdir_skip_dot_and_dotdot() helper</title>
<updated>2021-05-12T23:45:03Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-05-12T17:28:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b548f0f1568f6b01e55ca69c24d3cb19489f92aa'/>
<id>urn:sha1:b548f0f1568f6b01e55ca69c24d3cb19489f92aa</id>
<content type='text'>
Many places in the code were doing
    while ((d = readdir(dir)) != NULL) {
        if (is_dot_or_dotdot(d-&gt;d_name))
            continue;
        ...process d...
    }
Introduce a readdir_skip_dot_and_dotdot() helper to make that a one-liner:
    while ((d = readdir_skip_dot_and_dotdot(dir)) != NULL) {
        ...process d...
    }

This helper particularly simplifies checks for empty directories.

Also use this helper in read_cached_dir() so that our statistics are
consistent across platforms.  (In other words, read_cached_dir() should
have been using is_dot_or_dotdot() and skipping such entries, but did
not and left it to treat_path() to detect and mark such entries as
path_none.)

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>dir: update stale description of treat_directory()</title>
<updated>2021-05-12T23:45:03Z</updated>
<author>
<name>Derrick Stolee</name>
<email>stolee@gmail.com</email>
</author>
<published>2021-05-12T17:28:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4e689d81718eb6e939cace317ea3e33cb994dcbb'/>
<id>urn:sha1:4e689d81718eb6e939cace317ea3e33cb994dcbb</id>
<content type='text'>
The documentation comment for treat_directory() was originally written
in 095952 (Teach directory traversal about subprojects, 2007-04-11)
which was before the 'struct dir_struct' split its bitfield of named
options into a 'flags' enum in 7c4c97c0 (Turn the flags in struct
dir_struct into a single variable, 2009-02-16). When those flags
changed, the comment became stale, since members like
'show_other_directories' transitioned into flags like
DIR_SHOW_OTHER_DIRECTORIES.

Update the comments for treat_directory() to use these flag names rather
than the old member names.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Reviewed-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>dir: traverse into untracked directories if they may have ignored subfiles</title>
<updated>2021-05-12T23:45:03Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-05-12T17:28:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=dd55fc0df15c338a8dbec6dec6ca6b58edc8acef'/>
<id>urn:sha1:dd55fc0df15c338a8dbec6dec6ca6b58edc8acef</id>
<content type='text'>
A directory that is untracked does not imply that all files under it
should be categorized as untracked; in particular, if the caller is
interested in ignored files, many files or directories underneath the
untracked directory may be ignored.  We previously partially handled
this right with DIR_SHOW_IGNORED_TOO, but missed DIR_SHOW_IGNORED.  It
was not obvious, though, because the logic for untracked and excluded
files had been fused together making it harder to reason about.  The
previous commit split that logic out, making it easier to notice that
DIR_SHOW_IGNORED was missing.  Add it.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>dir: avoid unnecessary traversal into ignored directory</title>
<updated>2021-05-12T23:45:03Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-05-12T17:28:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=aa6e1b21e5de8fae7121b8c6543e1482144c1ed3'/>
<id>urn:sha1:aa6e1b21e5de8fae7121b8c6543e1482144c1ed3</id>
<content type='text'>
The show_other_directories case in treat_directory() tried to handle
both excludes and untracked files with the same logic, and mishandled
both the excludes and the untracked files in the process, in different
ways.  Split that logic apart, and then focus on the logic for the
excludes; a subsequent commit will address the logic for untracked
files.

For show_other_directories, an excluded directory means that
every path underneath that directory will also be excluded.  Given that
the calling code requested to just show directories when everything
under a directory had the same state (that's what the
"DIR_SHOW_OTHER_DIRECTORIES" flag means), we generally do not need to
traverse into such directories and can just immediately mark them as
ignored (i.e. as path_excluded).  The only reason we cannot just
immediately return path_excluded is the DIR_HIDE_EMPTY_DIRECTORIES flag
and the possibility that the ignored directory is an empty directory.
The code previously treated DIR_SHOW_IGNORED_TOO in most cases as an
exception as well, which was wrong.  It can sometimes reduce the number
of cases where we need to recurse (namely if
DIR_SHOW_IGNORED_TOO_MODE_MATCHING is also set), but should not be able
to increase the number of cases where we need to recurse.  Fix the logic
accordingly.

Some sidenotes about possible confusion with dir.c:

* "ignored" often refers to an untracked ignore", i.e. a file which is
  not tracked which matches one of the ignore/exclusion rules.  But you
  can also have a "tracked ignore", a tracked file that happens to match
  one of the ignore/exclusion rules and which dir.c has to worry about
  since "git ls-files -c -i" is supposed to list them.

* The dir code often uses "ignored" and "excluded" interchangeably,
  which you need to keep in mind while reading the code.

* "exclude" is used multiple ways in the code:

  * As noted above, "exclude" is often a synonym for "ignored".

  * The logic for parsing .gitignore files was re-used in
    .git/info/sparse-checkout, except there it is used to mark paths that
    the user wants to *keep*.  This was mostly addressed by commit
    65edd96aec ("treewide: rename 'exclude' methods to 'pattern'",
    2019-09-03), but every once in a while you'll find a comment about
    "exclude" referring to these patterns that might in fact be in use
    by the sparse-checkout machinery for inclusion rules.

  * The word "EXCLUDE" is also used for pathspec negation, as in
      (pathspec-&gt;items[3].magic &amp; PATHSPEC_EXCLUDE)
    Thus if a user had a .gitignore file containing
      *~
      *.log
      !settings.log
    And then ran
      git add -- 'settings.*' ':^settings.log'
    Then :^settings.log is a pathspec negation making settings.log not
    be requested to be added even though all other settings.* files are
    being added.  Also, !settings.log in the gitignore file is a negative
    exclude pattern meaning that settings.log is normally a file we
    want to track even though all other *.log files are ignored.

Sometimes it feels like dir.c needs its own glossary with its many
definitions, including the multiply-defined terms.

Reported-by: Jason Gore &lt;Jason.Gore@microsoft.com&gt;
Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
