<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/refs/ref-cache.c, branch v2.50.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.50.0</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.50.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2025-03-12T18:31:20Z</updated>
<entry>
<title>refs/iterator: implement seeking for ref-cache iterators</title>
<updated>2025-03-12T18:31:20Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-12T15:56:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=84e656919cb7237f1b11a948974d0591d9d3434f'/>
<id>urn:sha1:84e656919cb7237f1b11a948974d0591d9d3434f</id>
<content type='text'>
Implement seeking of ref-cache iterators. This is done by splitting most
of the logic to seek iterators out of `cache_ref_iterator_begin()` and
putting it into `cache_ref_iterator_seek()` so that we can reuse the
logic.

Note that we cannot use the optimization anymore where we return an
empty ref iterator when there aren't any references, as otherwise it
wouldn't be possible to reseek the iterator to a different prefix that
may exist. This shouldn't be much of a performance concern though as we
now start to bail out early in case `advance()` sees that there are no
more directories to be searched.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs/iterator: separate lifecycle from iteration</title>
<updated>2025-03-12T18:31:18Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-12T15:56:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=cec2b6f55a805c010d2acc81abf4cbc41b712130'/>
<id>urn:sha1:cec2b6f55a805c010d2acc81abf4cbc41b712130</id>
<content type='text'>
The ref and reflog iterators have their lifecycle attached to iteration:
once the iterator reaches its end, it is automatically released and the
caller doesn't have to care about that anymore. When the iterator should
be released before it has been exhausted, callers must explicitly abort
the iterator via `ref_iterator_abort()`.

This lifecycle is somewhat unusual in the Git codebase and creates two
problems:

  - Callsites need to be very careful about when exactly they call
    `ref_iterator_abort()`, as calling the function is only valid when
    the iterator itself still is. This leads to somewhat awkward calling
    patterns in some situations.

  - It is impossible to reuse iterators and re-seek them to a different
    prefix. This feature isn't supported by any iterator implementation
    except for the reftable iterators anyway, but if it was implemented
    it would allow us to optimize cases where we need to search for
    specific references repeatedly by reusing internal state.

Detangle the lifecycle from iteration so that we don't deallocate the
iterator anymore once it is exhausted. Instead, callers are now expected
to always call a newly introduce `ref_iterator_free()` function that
deallocates the iterator and its internal state.

Note that the `dir_iterator` is somewhat special because it does not
implement the `ref_iterator` interface, but is only used to implement
other iterators. Consequently, we have to provide `dir_iterator_free()`
instead of `dir_iterator_release()` as the allocated structure itself is
managed by the `dir_iterator` interfaces, as well, and not freed by
`ref_iterator_free()` like in all the other cases.

While at it, drop the return value of `ref_iterator_abort()`, which
wasn't really required by any of the iterator implementations anyway.
Furthermore, stop calling `base_ref_iterator_free()` in any of the
backends, but instead call it in `ref_iterator_free()`.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>ref-cache: fix invalid free operation in `free_ref_entry`</title>
<updated>2024-11-26T19:34:37Z</updated>
<author>
<name>shejialuo</name>
<email>shejialuo@gmail.com</email>
</author>
<published>2024-11-26T14:40:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b6318cf23a21b4d1917438c642d8877fb31e7c3e'/>
<id>urn:sha1:b6318cf23a21b4d1917438c642d8877fb31e7c3e</id>
<content type='text'>
In cfd971520e (refs: keep track of unresolved reference value in
iterators, 2024-08-09), we added a new field "referent" into the "struct
ref" structure. In order to free the "referent", we unconditionally
freed the "referent" by simply adding a "free" statement.

However, this is a bad usage. Because when ref entry is either directory
or loose ref, we will always execute the following statement:

  free(entry-&gt;u.value.referent);

This does not make sense. We should never access the "entry-&gt;u.value"
field when "entry" is a directory. However, the change obviously doesn't
break the tests. Let's analysis why.

The anonymous union in the "ref_entry" has two members: one is "struct
ref_value", another is "struct ref_dir". On a 64-bit machine, the size
of "struct ref_dir" is 32 bytes, which is smaller than the 48-byte size
of "struct ref_value". And the offset of "referent" field in "struct
ref_value" is 40 bytes. So, whenever we create a new "ref_entry" for a
directory, we will leave the offset from 40 bytes to 48 bytes untouched,
which means the value for this memory is zero (NULL). It's OK to free a
NULL pointer, but this is merely a coincidence of memory layout.

To fix this issue, we now ensure that "free(entry-&gt;u.value.referent)" is
only called when "entry-&gt;flag" indicates that it represents a loose
reference and not a directory to avoid the invalid memory operation.

Signed-off-by: shejialuo &lt;shejialuo@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: keep track of unresolved reference value in iterators</title>
<updated>2024-08-09T15:47:33Z</updated>
<author>
<name>John Cai</name>
<email>johncai86@gmail.com</email>
</author>
<published>2024-08-09T15:37:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=cfd971520ed11096aaa11f6bd1ee99b307f3146c'/>
<id>urn:sha1:cfd971520ed11096aaa11f6bd1ee99b307f3146c</id>
<content type='text'>
Since ref iterators do not hold onto the direct value of a reference
without resolving it, the only way to get ahold of a direct value of a
symbolic ref is to make a separate call to refs_read_symbolic_ref.

To make accessing the direct value of a symbolic ref more efficient,
let's save the direct value of the ref in the iterators for both the
files backend and the reftable backend.

Signed-off-by: John Cai &lt;johncai86@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs/files: fix NULL pointer deref when releasing ref store</title>
<updated>2024-06-06T16:04:32Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-06-06T05:29:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b3e098d6e77db87946c50c3517fbdeffe9168ca9'/>
<id>urn:sha1:b3e098d6e77db87946c50c3517fbdeffe9168ca9</id>
<content type='text'>
The `free_ref_cache()` function is not `NULL` safe and will thus
segfault when being passed such a pointer. This can easily happen when
trying to release a partially initialized "files" ref store. Fix this.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: pass repo when peeling objects</title>
<updated>2024-05-17T17:33:39Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-05-17T08:19:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=30aaff437fddd889ba429b50b96ea4c151c502c5'/>
<id>urn:sha1:30aaff437fddd889ba429b50b96ea4c151c502c5</id>
<content type='text'>
Both `peel_object()` and `peel_iterated_oid()` implicitly rely on
`the_repository` to look up objects. Despite the fact that we want to
get rid of `the_repository`, it also leads to some restrictions in our
ref iterators when trying to retrieve the peeled value for a repository
other than `the_repository`.

Refactor these functions such that both take a repository as argument
and remove the now-unnecessary restrictions.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: always treat iterators as ordered</title>
<updated>2024-02-21T17:58:06Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-02-21T12:37:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5e01d838412d6679c40c929bbb2591669ae393d4'/>
<id>urn:sha1:5e01d838412d6679c40c929bbb2591669ae393d4</id>
<content type='text'>
In the preceding commit we have converted the reflog iterator of the
"files" backend to be ordered, which was the only remaining ref iterator
that wasn't ordered. Refactor the ref iterator infrastructure so that we
always assume iterators to be ordered, thus simplifying the code.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>treewide: remove unnecessary includes in source files</title>
<updated>2023-12-26T20:04:31Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-12-23T17:14:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=eea0e59ffbed6e33d171ace5be13cde9faa41639'/>
<id>urn:sha1:eea0e59ffbed6e33d171ace5be13cde9faa41639</id>
<content type='text'>
Each of these were checked with
   gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).

...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file.  These cases were:
  * builtin/credential-cache.c
  * builtin/pull.c
  * builtin/send-pack.c

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>ref-cache.c: fix prefix matching in ref iteration</title>
<updated>2023-10-09T22:53:13Z</updated>
<author>
<name>Victoria Dye</name>
<email>vdye@github.com</email>
</author>
<published>2023-10-09T21:58:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5305474ec4650755f2c0437c004cabfb1c60cf6a'/>
<id>urn:sha1:5305474ec4650755f2c0437c004cabfb1c60cf6a</id>
<content type='text'>
Update 'cache_ref_iterator_advance' to skip over refs that are not matched
by the given prefix.

Currently, a ref entry is considered "matched" if the entry name is fully
contained within the prefix:

* prefix: "refs/heads/v1"
* entry: "refs/heads/v1.0"

OR if the prefix is fully contained in the entry name:

* prefix: "refs/heads/v1.0"
* entry: "refs/heads/v1"

The first case is always correct, but the second is only correct if the ref
cache entry is a directory, for example:

* prefix: "refs/heads/example"
* entry: "refs/heads/"

Modify the logic in 'cache_ref_iterator_advance' to reflect these
expectations:

1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and
   ref cache entry do not overlap at all. Skip this entry.
2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches
   inside this entry if it is a directory. Skip if the entry is not a
   directory, otherwise iterate over it.
3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating
   that the cache entry (directory or not) is fully contained by or equal to
   the prefix. Iterate over this entry.

Note that condition 2 relies on the names of directory entries having the
appropriate trailing slash. The existing function documentation of
'create_dir_entry' explicitly calls out the trailing slash requirement, so
this is a safe assumption to make.

This bug generally doesn't have any user-facing impact, since it requires:

1. using a non-empty prefix without a trailing slash in an iteration like
   'for_each_fullref_in',
2. the callback to said iteration not reapplying the original filter (as
   for-each-ref does) to ensure unmatched refs are skipped, and
3. the repository having one or more refs that match part of, but not all
   of, the prefix.

However, there are some niche scenarios that meet those criteria
(specifically, 'rev-parse --bisect' and '(log|show|shortlog) --bisect'). Add
tests covering those cases to demonstrate the fix in this patch.

Signed-off-by: Victoria Dye &lt;vdye@github.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hash-ll.h: split out of hash.h to remove dependency on repository.h</title>
<updated>2023-04-24T19:47:32Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-04-22T20:17:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d1cbe1e6d8a9cab2b4ffe8a17d34db214dce1e49'/>
<id>urn:sha1:d1cbe1e6d8a9cab2b4ffe8a17d34db214dce1e49</id>
<content type='text'>
hash.h depends upon and includes repository.h, due to the definition and
use of the_hash_algo (defined as the_repository-&gt;hash_algo).  However,
most headers trying to include hash.h are only interested in the layout
of the structs like object_id.  Move the parts of hash.h that do not
depend upon repository.h into a new file hash-ll.h (the "low level"
parts of hash.h), and adjust other files to use this new header where
the convenience inline functions aren't needed.

This allows hash.h and object.h to be fairly small, minimal headers.  It
also exposes a lot of hidden dependencies on both path.h (which was
brought in by repository.h) and repository.h (which was previously
implicitly brought in by object.h), so also adjust other files to be
more explicit about what they depend upon.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
