<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/cache-tree.c, branch jch</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/</subtitle>
<id>https://git.shady.money/git/atom?h=jch</id>
<link rel='self' href='https://git.shady.money/git/atom?h=jch'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2026-04-17T04:27:18Z</updated>
<entry>
<title>Merge branch 'dl/cache-tree-fully-valid-fix' into jch</title>
<updated>2026-04-17T04:27:18Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2026-04-17T04:27:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=60f76338538b8ceea66521a66ce43281c91762df'/>
<id>urn:sha1:60f76338538b8ceea66521a66ce43281c91762df</id>
<content type='text'>
The check that implements the logic to see if an in-core cache-tree
is fully ready to write out a tree object was broken, which has
been corrected.

* dl/cache-tree-fully-valid-fix:
  cache-tree: fix inverted object existence check in cache_tree_fully_valid
</content>
</entry>
<entry>
<title>Merge branch 'jd/cache-tree-trace-wo-the-repository'</title>
<updated>2026-04-08T17:19:17Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2026-04-08T17:19:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0c0cbd8ab7c8e1884d6f0bdf1f36f5f9d4732553'/>
<id>urn:sha1:0c0cbd8ab7c8e1884d6f0bdf1f36f5f9d4732553</id>
<content type='text'>
Code cleanup.

* jd/cache-tree-trace-wo-the-repository:
  cache-tree: use index state repository in trace2 calls
</content>
</entry>
<entry>
<title>cache-tree: fix inverted object existence check in cache_tree_fully_valid</title>
<updated>2026-04-06T21:21:03Z</updated>
<author>
<name>David Lin</name>
<email>davidzylin@gmail.com</email>
</author>
<published>2026-04-06T19:27:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=521731213c905f0dfec6a55393f010d185492c85'/>
<id>urn:sha1:521731213c905f0dfec6a55393f010d185492c85</id>
<content type='text'>
The negation in front of the object existence check in
cache_tree_fully_valid() was lost in 062b914c84 (treewide: convert
users of `repo_has_object_file()` to `has_object()`, 2025-04-29),
turning `!repo_has_object_file(...)` into `has_object(...)` instead
of `!has_object(...)`.

This makes cache_tree_fully_valid() always report the cache tree as
invalid when objects exist (the common case), forcing callers like
write_index_as_tree() to call cache_tree_update() on every
invocation.  An odb_has_object() check inside update_one() avoids a
full tree rebuild, but the unnecessary call still pays the cost of
opening an ODB transaction and, in partial clones, a promisor remote
check.

Restore the missing negation and add a test that verifies write-tree
takes the cache-tree shortcut when the cache tree is valid.
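
<![CDATA[For illustration only, here is a minimal sketch of how a lost
negation flips the result. toy_has_object() and both fully_valid_*()
functions are hypothetical stand-ins, not Git's code; the real
cache_tree_fully_valid() does more work than this (for example, it
also has to consider subtrees).

```c
#include <stdbool.h>

/* Hypothetical stand-in for an object lookup; not Git's has_object(). */
static bool toy_has_object(const bool *present, int id)
{
	return present[id];
}

/* Intended logic: a node is unusable only when its object is missing. */
static bool fully_valid_fixed(const bool *present, int id)
{
	if (!toy_has_object(present, id))
		return false;
	return true;
}

/* Negation dropped: an object that exists makes the node look invalid. */
static bool fully_valid_inverted(const bool *present, int id)
{
	if (toy_has_object(present, id))
		return false;
	return true;
}
```

With the object present, fully_valid_fixed() returns true while
fully_valid_inverted() returns false, which matches the "invalid in
the common case" behavior described above.]]>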

Helped-by: Derrick Stolee &lt;stolee@gmail.com&gt;
Signed-off-by: David Lin &lt;davidlin@stripe.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>odb: rename `odb_has_object()` flags</title>
<updated>2026-04-01T03:43:14Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2026-03-31T23:57:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c63911b052dc286de5daddba8d4a20fd59348cee'/>
<id>urn:sha1:c63911b052dc286de5daddba8d4a20fd59348cee</id>
<content type='text'>
Rename `odb_has_object()` flags to be properly prefixed with the
function name.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>odb: rename `odb_write_object()` flags</title>
<updated>2026-04-01T03:43:13Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2026-03-31T23:57:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ff2e9d85d61f2f51793acbdb4bad68d48cc8bb85'/>
<id>urn:sha1:ff2e9d85d61f2f51793acbdb4bad68d48cc8bb85</id>
<content type='text'>
Rename `odb_write_object()` flags to be properly prefixed with the
function name.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>cache-tree: use index state repository in trace2 calls</title>
<updated>2026-03-31T16:39:03Z</updated>
<author>
<name>Jayesh Daga</name>
<email>jayeshdaga99@gmail.com</email>
</author>
<published>2026-03-31T10:02:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=882c8e351d700e1738e696dfbc6312617f394570'/>
<id>urn:sha1:882c8e351d700e1738e696dfbc6312617f394570</id>
<content type='text'>
trace2 calls in cache-tree.c use the global 'the_repository',
even though cache_tree_update() has access to an explicit
repository pointer via 'istate-&gt;repo'.

Using the global repository can result in incorrect trace2
output when multiple repository instances are in use, as
events may be attributed to the wrong repository.

Use 'istate-&gt;repo' in cache_tree_update() to ensure correct
repository attribution.

Other call sites are left unchanged as they do not have
access to a repository instance.

Signed-off-by: Jayesh Daga &lt;jayeshdaga99@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>cache-tree: allow writing in-memory index as tree</title>
<updated>2026-03-03T23:09:36Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2026-03-02T12:13:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a021e4f92cbacdb028b3efa49f619b076e72c9a6'/>
<id>urn:sha1:a021e4f92cbacdb028b3efa49f619b076e72c9a6</id>
<content type='text'>
The function `write_in_core_index_as_tree()` takes a repository and
writes its index into a tree object. What this function cannot do though
is to take an _arbitrary_ in-memory index.

Introduce a new `struct index_state` parameter so that the caller can
pass a different index than the one belonging to the repository. This
will be used in a subsequent commit.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>cocci: convert parse_tree functions to repo_ variants</title>
<updated>2026-01-10T02:36:18Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2026-01-09T21:30:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ec7a16b14551fed736ecfe0a9d4f6d6f9e03be79'/>
<id>urn:sha1:ec7a16b14551fed736ecfe0a9d4f6d6f9e03be79</id>
<content type='text'>
Add and apply a semantic patch to convert calls to parse_tree() and
friends to the corresponding variant that takes a repository argument,
to allow the functions that implicitly use the_repository to be retired
once all potential in-flight topics are settled and converted as well.

The changes in .c files were generated by Coccinelle, but I fixed a
whitespace bug it would have introduced to builtin/commit.c.

Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/asan-bonanza'</title>
<updated>2025-12-01T02:31:41Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-12-01T02:31:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=aea8cc3a10c325a22a75e2d4f582db959d3854ae'/>
<id>urn:sha1:aea8cc3a10c325a22a75e2d4f582db959d3854ae</id>
<content type='text'>
Various issues detected by Asan have been corrected.

* jk/asan-bonanza:
  t: enable ASan's strict_string_checks option
  fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated
  fsck: remove redundant date timestamp check
  fsck: avoid strcspn() in fsck_ident()
  fsck: assert newline presence in fsck_ident()
  cache-tree: avoid strtol() on non-string buffer
  Makefile: turn on NO_MMAP when building with ASan
  pack-bitmap: handle name-hash lookups in incremental bitmaps
  compat/mmap: mark unused argument in git_munmap()
</content>
</entry>
<entry>
<title>cache-tree: avoid strtol() on non-string buffer</title>
<updated>2025-11-18T17:36:06Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2025-11-18T09:12:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c4c9089584d0ed04978e8d0945b2ba2985e67bd3'/>
<id>urn:sha1:c4c9089584d0ed04978e8d0945b2ba2985e67bd3</id>
<content type='text'>
A cache-tree extension entry in the index looks like this:

  &lt;name&gt; NUL &lt;entry_nr&gt; SPACE &lt;subtree_nr&gt; NEWLINE &lt;binary_oid&gt;

where the "_nr" items are human-readable base-10 ASCII. We parse them
with strtol(), even though we do not have a NUL-terminated string (we'd
generally have an mmap() of the on-disk index file). For a well-formed
entry, this is not a problem; strtol() will stop when it sees the
newline. But there are two problems:

  1. A corrupted entry could omit the newline, causing us to read
     further. You'd mostly get stopped by seeing non-digits in the oid
     field (and if it is likewise truncated, there will still be 20 or
     more bytes of the index checksum). So it's possible, though
     unlikely, to read off the end of the mmap'd buffer. Of course a
     malicious index file can fake the oid and the index checksum to all
     (ASCII) 0's.

     This is further complicated by the fact that mmap'd buffers tend to
     be zero-padded up to the page boundary. So to run off the end, the
     index size also has to be a multiple of the page size. This is also
     unlikely, though you can construct a malicious index file that
     matches this.

     The security implications aren't too interesting. The index file is
     a local file anyway, so you can't attack somebody through a clone;
     you would have to convince them to operate in a .git directory you
     made, at which point attacking .git/config is much easier. And it's
     just a read overflow via strtol(), which is unlikely to buy you
     much beyond a crash.

  2. ASan has a strict_string_checks option, which tells it to make sure
     that arguments to string functions (like strtol) have some eventual
     NUL, without regard to what the function would actually do (like
     stopping at a newline here). This option sometimes has false
     positives, but it can point to sketchy areas (like this one) where
     the input we use doesn't exhibit a problem, but different input
     _could_ cause us to misbehave.

Let's fix it by just parsing the values ourselves with a helper function
that is careful not to go past the end of the buffer. There are a few
behavior changes here that should not matter:

  - We do not consider overflow, as strtol() would. But nor did the
    original code. However, we don't trust the value we get from the
    on-disk file, and if it says to read 2^30 entries, we would notice
    that we do not have that many and bail before reading off the end of
    the buffer.

  - Our helper does not skip past extra leading whitespace as strtol()
    would, but according to gitformat-index(5) there should not be any.

  - The original quit parsing at a newline or a NUL byte, but now we
    insist on a newline (which is what the documentation says, and what
    Git has always produced).

Since we are providing our own helper function, we can tweak the
interface a bit to make our lives easier. The original code does not use
strtol's "end" pointer to find the end of the parsed data, but rather
uses a separate loop to advance our "buf" pointer to the trailing
newline. We can instead provide a helper that advances "buf" as it
parses, letting us read strictly left-to-right through the buffer.
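
<![CDATA[A rough sketch of what such a helper could look like follows.
It is not the function from the actual patch: the names
parse_int_bounded() and parse_counts() are hypothetical, as is the
explicit "end" pointer interface. Accepting a leading '-' is an
assumption made here on the basis that a count field may be negative.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not the one from the actual patch): parse an
 * optionally negative base-10 number from *buf without ever reading at
 * or past "end", and require "term" right after the digits.
 *
 * On success, store the value in *out, advance *buf past the
 * terminator, and return 0. On malformed or truncated input, return -1
 * and leave *buf untouched. Overflow is not detected.
 */
static int parse_int_bounded(const char **buf, const char *end,
			     char term, int *out)
{
	const char *p = *buf;
	unsigned int val = 0;
	int negative = 0;

	if (p < end && *p == '-') {
		negative = 1;
		p++;
	}
	if (p == end || *p < '0' || *p > '9')
		return -1; /* need a digit; leading whitespace is rejected */
	while (p < end && *p >= '0' && *p <= '9')
		val = val * 10 + (unsigned int)(*p++ - '0');
	if (p == end || *p != term)
		return -1; /* truncated input or unexpected terminator */

	*out = negative ? -(int)val : (int)val;
	*buf = p + 1;
	return 0;
}

/*
 * Parse "<entry_nr> SPACE <subtree_nr> NEWLINE" strictly left to right.
 * Returns 0 on success, -1 otherwise.
 */
static int parse_counts(const char **buf, const char *end,
			int *entry_nr, int *subtree_nr)
{
	if (parse_int_bounded(buf, end, ' ', entry_nr))
		return -1;
	return parse_int_bounded(buf, end, '\n', subtree_nr);
}
```

Because every read is guarded by a comparison against "end", a buffer
that stops right after the digits is rejected rather than read past,
and no NUL terminator is needed.]]>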

I didn't add a new test here. It's surprisingly difficult to construct
an index of exactly the right size due to the way we pad entries. But it
is easy to trigger the problem in existing tests when using ASan's
strict string checking, coupled with a recent change to use NO_MMAP with
ASan builds. So:

  make SANITIZE=address
  cd t
  ASAN_OPTIONS=strict_string_checks=1 ./t0090-cache-tree.sh

triggers it reliably. Technically it is not deterministic because there
is a ~8% chance (it's 1-(255/256)^20; with sha256 the exponent is 32,
giving ~12%) that the trailing
checksum hash has a NUL byte in it. But we compute enough cache-trees in
the course of that script that we are very likely to hit the problem in
one of them.
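
<![CDATA[The estimate can be checked with a few lines of C. This is
only a sketch: prob_any_nul() is a hypothetical name, and it assumes
the hash bytes are independent and uniformly distributed.

```c
/*
 * Probability that at least one of n independent, uniformly random
 * bytes is NUL: 1 - (255/256)^n.
 */
static double prob_any_nul(int n)
{
	double none = 1.0;

	for (int i = 0; i < n; i++)
		none *= 255.0 / 256.0;
	return 1.0 - none;
}
```

prob_any_nul(20) comes out at roughly 0.075 and prob_any_nul(32) at
roughly 0.118.]]>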

We can look at making strict_string_checks the default for ASan builds,
but there are some other cases we'd want to fix first.

Reported-by: correctmost &lt;cmlists@sent.com&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
