<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/entry.c, branch v2.40.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.40.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.40.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2024-04-19T10:38:37Z</updated>
<entry>
<title>Sync with 2.39.4</title>
<updated>2024-04-19T10:38:37Z</updated>
<author>
<name>Johannes Schindelin</name>
<email>johannes.schindelin@gmx.de</email>
</author>
<published>2024-04-12T07:45:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=93a88f42db7ed9a975768df0e5f4516317c50dda'/>
<id>urn:sha1:93a88f42db7ed9a975768df0e5f4516317c50dda</id>
<content type='text'>
* maint-2.39: (38 commits)
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  t7423: add tests for symlinked submodule directories
  has_dir_name(): do not get confused by characters &lt; '/'
  docs: document security issues around untrusted .git dirs
  upload-pack: disable lazy-fetching by default
  fetch/clone: detect dubious ownership of local repositories
  ...
</content>
</entry>
<entry>
<title>clone: when symbolic links collide with directories, keep the latter</title>
<updated>2024-04-17T20:30:08Z</updated>
<author>
<name>Johannes Schindelin</name>
<email>johannes.schindelin@gmx.de</email>
</author>
<published>2024-03-28T09:55:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=31572dc420afee36db8fbbbe060dd78c9a48778c'/>
<id>urn:sha1:31572dc420afee36db8fbbbe060dd78c9a48778c</id>
<content type='text'>
When recursively cloning a repository with submodules, we must ensure
that the submodules paths do not suddenly contain symbolic links that
would let Git write into unintended locations. We just plugged that
vulnerability, but let's add some more defense-in-depth.

Since we can only keep one item on disk if multiple index entries' paths
collide, we may just as well avoid keeping a symbolic link (because that
would allow attack vectors where Git follows those links by mistake).

Technically, we handle more situations than cloning submodules into
paths that were (partially) replaced by symbolic links. This provides
defense-in-depth in case someone finds a case-folding confusion
vulnerability in the future that does not even involve submodules.

Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
</content>
</entry>
<entry>
<title>entry: report more colliding paths</title>
<updated>2024-04-17T20:30:07Z</updated>
<author>
<name>Johannes Schindelin</name>
<email>johannes.schindelin@gmx.de</email>
</author>
<published>2024-03-28T08:44:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=850c3a220e7a0b1bf740fba9ac8f3f2b0486a1af'/>
<id>urn:sha1:850c3a220e7a0b1bf740fba9ac8f3f2b0486a1af</id>
<content type='text'>
In b878579ae7 (clone: report duplicate entries on case-insensitive
filesystems, 2018-08-17) code was added to warn about index entries that
resolve to the same file system entity (usually the cause is a
case-insensitive filesystem).

In Git for Windows, where inodes are not trusted (because of a
performance trade-off, inodes are equal to 0 by default), that check
does not compare inode numbers but the verbatim path.

This logic works well when index entries' paths differ only in case.

However, for file/directory conflicts only the file's path was reported,
leaving the user puzzled with what that path collides.

Let's try ot catch colliding paths even if one path is the prefix of the
other. We do this also in setups where the file system is case-sensitive
because the inode check would not be able to catch those collisions.

While not a complete solution (for example, on macOS, Unicode
normalization could also lead to file/directory conflicts but be missed
by this logic), it is at least another defensive layer on top of what
the previous commits added.

Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
</content>
</entry>
<entry>
<title>read-tree: add "--super-prefix" option, eliminate global</title>
<updated>2022-12-26T01:21:44Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2022-12-20T12:39:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4002ec3dcf0f89db46fbdf56549218c573a9c0f2'/>
<id>urn:sha1:4002ec3dcf0f89db46fbdf56549218c573a9c0f2</id>
<content type='text'>
The "--super-prefix" option to "git" was initially added in [1] for
use with "ls-files"[2], and shortly thereafter "submodule--helper"[3]
and "grep"[4]. It wasn't until [5] that "read-tree" made use of it.

At the time [5] made sense, but since then we've made "ls-files"
recurse in-process in [6], "grep" in [7], and finally
"submodule--helper" in the preceding commits.

Let's also remove it from "read-tree", which allows us to remove the
option to "git" itself.

We can do this because the only remaining user of it is the submodule
API, which will now invoke "read-tree" with its new "--super-prefix"
option. It will only do so when the "submodule_move_head()" function
is called.

That "submodule_move_head()" function was then only invoked by
"read-tree" itself, but now rather than setting an environment
variable to pass "--super-prefix" between cmd_read_tree() we:

- Set a new "super_prefix" in "struct unpack_trees_options". The
  "super_prefixed()" function in "unpack-trees.c" added in [5] will now
  use this, rather than get_super_prefix() looking up the environment
  variable we set earlier in the same process.

- Add the same field to the "struct checkout", which is only needed to
  ferry the "super_prefix" in the "struct unpack_trees_options" all the
  way down to the "entry.c" callers of "submodule_move_head()".

  Those calls which used the super prefix all originated in
  "cmd_read_tree()". The only other caller is the "unlink_entry()"
  caller in "builtin/checkout.c", which now passes a "NULL".

1. 74866d75793 (git: make super-prefix option, 2016-10-07)
2. e77aa336f11 (ls-files: optionally recurse into submodules, 2016-10-07)
3. 89c86265576 (submodule helper: support super prefix, 2016-12-08)
4. 0281e487fd9 (grep: optionally recurse into submodules, 2016-12-16)
5. 3d415425c7b (unpack-trees: support super-prefix option, 2017-01-17)
6. 188dce131fa (ls-files: use repository object, 2017-06-22)
7. f9ee2fcdfa0 (grep: recurse in-process using 'struct repository', 2017-08-02)

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>checkout: fix two bugs on the final count of updated entries</title>
<updated>2022-07-14T17:19:28Z</updated>
<author>
<name>Matheus Tavares</name>
<email>matheus.bernardino@usp.br</email>
</author>
<published>2022-07-14T11:49:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=611c7785e8e22637e183333c54ed266e6e83e163'/>
<id>urn:sha1:611c7785e8e22637e183333c54ed266e6e83e163</id>
<content type='text'>
At the end of `git checkout &lt;pathspec&gt;`, we get a message informing how
many entries were updated in the working tree. However, this number can
be inaccurate for two reasons:

1) Delayed entries currently get counted twice.
2) Failed entries are included in the count.

The first problem happens because the counter is first incremented
before inserting the entry in the delayed checkout queue, and once again
when finish_delayed_checkout() calls checkout_entry(). And the second
happens because the counter is incremented too early in
checkout_entry(), before the entry was in fact checked out. Fix that by
moving the count increment further down in the call stack and removing
the duplicate increment on delayed entries. Note that we have to keep
a per-entry reference for the counter (both on parallel checkout and
delayed checkout) because not all entries are always accumulated at the
same counter. See checkout_worktree(), at builtin/checkout.c for an
example.

Signed-off-by: Matheus Tavares &lt;matheus.bernardino@usp.br&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'mc/clean-smudge-with-llp64'</title>
<updated>2021-11-29T23:41:51Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-11-29T23:41:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f9ba6acaa9348ea7b733bf78adc2f084247a912f'/>
<id>urn:sha1:f9ba6acaa9348ea7b733bf78adc2f084247a912f</id>
<content type='text'>
The clean/smudge conversion code path has been prepared to better
work on platforms where ulong is narrower than size_t.

* mc/clean-smudge-with-llp64:
  clean/smudge: allow clean filters to process extremely large files
  odb: guard against data loss checking out a huge file
  git-compat-util: introduce more size_t helpers
  odb: teach read_blob_entry to use size_t
  t1051: introduce a smudge filter test for extremely large files
  test-lib: add prerequisite for 64-bit platforms
  test-tool genzeros: generate large amounts of data more efficiently
  test-genzeros: allow more than 2G zeros in Windows
</content>
</entry>
<entry>
<title>odb: teach read_blob_entry to use size_t</title>
<updated>2021-11-03T18:22:27Z</updated>
<author>
<name>Matt Cooper</name>
<email>vtbassmatt@gmail.com</email>
</author>
<published>2021-11-02T15:46:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e9aa762cc72e6cf8fd76fefe5ca2b5064be1a821'/>
<id>urn:sha1:e9aa762cc72e6cf8fd76fefe5ca2b5064be1a821</id>
<content type='text'>
There is mixed use of size_t and unsigned long to deal with sizes in the
codebase. Recall that Windows defines unsigned long as 32 bits even on
64-bit platforms, meaning that converting size_t to unsigned long narrows
the range. This mostly doesn't cause a problem since Git rarely deals
with files larger than 2^32 bytes.

But adjunct systems such as Git LFS, which use smudge/clean filters to
keep huge files out of the repository, may have huge file contents passed
through some of the functions in entry.c and convert.c. On Windows, this
results in a truncated file being written to the workdir. I traced this to
one specific use of unsigned long in write_entry (and a similar instance
in write_pc_item_to_fd for parallel checkout). That appeared to be for
the call to read_blob_entry, which expects a pointer to unsigned long.

By altering the signature of read_blob_entry to expect a size_t,
write_entry can be switched to use size_t internally (which all of its
callers and most of its callees already used). To avoid touching dozens of
additional files, read_blob_entry uses a local unsigned long to call a
chain of functions which aren't prepared to accept size_t.

Helped-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Matt Cooper &lt;vtbassmatt@gmail.com&gt;
Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>entry: show finer-grained counter in "Filtering content" progress line</title>
<updated>2021-09-09T16:58:19Z</updated>
<author>
<name>SZEDER Gábor</name>
<email>szeder.dev@gmail.com</email>
</author>
<published>2021-09-09T01:10:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=bf6d819bc1589a41fbd4c8d9f55f0a6ebd6ef86f'/>
<id>urn:sha1:bf6d819bc1589a41fbd4c8d9f55f0a6ebd6ef86f</id>
<content type='text'>
The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop.  It works basically like this:

  start_delayed_progress(p, nr_of_paths_to_filter)
  for_each_filter {
      display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
      for_each_path_handled_by_the_current_filter {
          checkout_entry()
      }
  }
  stop_progress(p)

There are two issues with this approach:

  - The work done by the last filter (or the only filter if there is
    only one) is never counted, so if the last filter still has some
    paths to process, then the counter shown in the "done" progress
    line will not match the expected total.

    The partially-RFC series to add a GIT_TEST_CHECK_PROGRESS=1
    mode[1] helps spot this issue. Under it the 'missing file in
    delayed checkout' and 'invalid file in delayed checkout' tests in
    't0021-conversion.sh' fail, because both use only one
    filter.  (The test 'delayed checkout in process filter' uses two
    filters but the first one does all the work, so that test already
    happens to succeed even with GIT_TEST_CHECK_PROGRESS=1.)

  - The progress counter is updated only once per filter, not once per
    processed path, so if a filter has a lot of paths to process, then
    the counter might stay unchanged for a long while and then make a
    big jump (though the user still gets a sense of progress, because
    we call display_throughput() after each processed path to show the
    amount of processed data).

Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.

After this change the 'invalid file in delayed checkout' in
't0021-conversion.sh' would succeed with the GIT_TEST_CHECK_PROGRESS=1
assertion discussed above, but the 'missing file in delayed checkout'
test would still fail.

It'll fail because its purposefully buggy filter doesn't process any
paths, so we won't execute that inner loop at all, see [2] for how to
spot that issue without GIT_TEST_CHECK_PROGRESS=1. It's not
straightforward to fix it with the current progress.c library (see [3]
for an attempt), so let's leave it for now.

Let's also initialize the *progress to "NULL" while we're at it. Since
7a132c628e5 (checkout: make delayed checkout respect --quiet and
--no-progress, 2021-08-26) we have had progress conditional on
"show_progress", usually we use the idiom of a "NULL" initialization
of the "*progress", rather than the more verbose ternary added in
7a132c628e5.

1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
2. http://lore.kernel.org/git/20210802214827.GE23408@szeder.dev
3. https://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com/

Signed-off-by: SZEDER Gábor &lt;szeder.dev@gmail.com&gt;
Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>checkout: make delayed checkout respect --quiet and --no-progress</title>
<updated>2021-08-27T06:15:33Z</updated>
<author>
<name>Matheus Tavares</name>
<email>matheus.bernardino@usp.br</email>
</author>
<published>2021-08-26T19:10:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7a132c628e57b9bceeb88832ea051395c0637b16'/>
<id>urn:sha1:7a132c628e57b9bceeb88832ea051395c0637b16</id>
<content type='text'>
The 'Filtering contents...' progress report from delayed checkout is
displayed even when checkout and clone are invoked with --quiet or
--no-progress. Furthermore, it is displayed unconditionally, without
first checking whether stdout is a tty. Let's fix these issues and also
add some regression tests for the two code paths that currently use
delayed checkout: unpack_trees.c:check_updates() and
builtin/checkout.c:checkout_worktree().

Signed-off-by: Matheus Tavares &lt;matheus.bernardino@usp.br&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>string-list.h users: change to use *_{nodup,dup}()</title>
<updated>2021-07-01T19:32:22Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-07-01T10:51:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=bc40dfb10a06612813a3f9b733d65af0732208b2'/>
<id>urn:sha1:bc40dfb10a06612813a3f9b733d65af0732208b2</id>
<content type='text'>
Change all in-tree users of the string_list_init(LIST, BOOL) API to
use string_list_init_{nodup,dup}(LIST) instead.

As noted in the preceding commit let's leave the now-unused
string_list_init() wrapper in-place for any in-flight users, it can be
removed at some later date.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
