<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/refs/ref-cache.c, branch v2.51.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.51.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.51.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2025-07-28T21:16:36Z</updated>
<entry>
<title>ref-cache: use 'size_t' instead of int for length</title>
<updated>2025-07-28T21:16:36Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2025-07-28T20:20:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=65855751d190d66d15bc2c7e17e93fe00d04c539'/>
<id>urn:sha1:65855751d190d66d15bc2c7e17e93fe00d04c539</id>
<content type='text'>
The commit 090eb5336c (refs: selectively set prefix in the seek
functions, 2025-07-15) modified the ref-cache iterator to support
seeking to a specified marker without setting the prefix.

The commit adds and uses an integer 'len' to capture the length of the
seek marker to compare with the entries of a given directory. Since the
type of the variable is 'int', this is met with a typecast of converting
a `strlen` to 'int' so it can be assigned to the 'len' variable.

This is whole operation is a bit wrong:
1. Since the 'len' variable is eventually used in a 'strncmp', it should
have been of type 'size_t'.
2. This also truncates the value provided from 'strlen' to an int, which
could cause a large refname to produce a negative number.

Let's do the correct thing here and simply use 'size_t' for `len`.

Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>ref-cache: set prefix_state when seeking</title>
<updated>2025-07-24T22:31:09Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2025-07-24T22:11:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9201261a70b5d325b344036de15901a4064d66c0'/>
<id>urn:sha1:9201261a70b5d325b344036de15901a4064d66c0</id>
<content type='text'>
In 090eb5336c (refs: selectively set prefix in the seek functions,
2025-07-15) we separated the seeking functionality of reference
iterators from the functionality to set prefix to an iterator. This
allows users of ref iterators to seek to a particular reference to
provide pagination support.

The files-backend, uses the ref-cache iterator to iterate over loose
refs. The iterator tracks directories and entries already processed via
a stack of levels. Each level corresponds to a directory under the files
backend. New levels are added to the stack, and when all entries from a
level is yielded, the corresponding level is popped from the stack.

To accommodate seeking, we need to populate and traverse the levels to
stop the requested seek marker at the appropriate level and its entry
index. Each level also contains a 'prefix_state' which is used for
prefix matching, this allows the iterator to skip levels/entries which
don't match a prefix. The default value of 'prefix_state' is
PREFIX_CONTAINS_DIR, which yields all entries within a level. When
purely seeking without prefix matching, we want to yield all entries.
The commit however, skips setting the value explicitly. This causes the
MemorySanitizer to issue a 'use-of-uninitialized-value' error when
running 't/t6302-for-each-ref-filter'.

Set the value explicitly to avoid to fix the issue.

Reported-by: Kyle Lippincott &lt;spectral@google.com&gt;
Helped-by: Kyle Lippincott &lt;spectral@google.com&gt;
Helped-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: selectively set prefix in the seek functions</title>
<updated>2025-07-15T18:54:20Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2025-07-15T11:28:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2b4648b9190b552e942d93363482bd617a510fc1'/>
<id>urn:sha1:2b4648b9190b552e942d93363482bd617a510fc1</id>
<content type='text'>
The ref iterator exposes a `ref_iterator_seek()` function. The name
suggests that this would seek the iterator to a specific reference in
some ways similar to how `fseek()` works for the filesystem.

However, the function actually sets the prefix for refs iteration. So
further iteration would only yield references which match the particular
prefix. This is a bit confusing.

Let's add a 'flags' field to the function, which when set with the
'REF_ITERATOR_SEEK_SET_PREFIX' flag, will set the prefix for the
iteration in-line with the existing behavior. Otherwise, the reference
backends will simply seek to the specified reference and clears any
previously set prefix. This allows users to start iteration from a
specific reference.

In the packed and reftable backend, since references are available in a
sorted list, the changes are simply setting the prefix if needed. The
changes on the files-backend are a little more involved, since the files
backend uses the 'ref-cache' mechanism. We move out the existing logic
within `cache_ref_iterator_seek()` to `cache_ref_iterator_set_prefix()`
which is called when the 'REF_ITERATOR_SEEK_SET_PREFIX' flag is set. We
then parse the provided seek string and set the required levels and
their indexes to ensure that seeking is possible.

Helped-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>ref-cache: remove unused function 'find_ref_entry()'</title>
<updated>2025-07-15T18:54:19Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2025-07-15T11:28:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=883a7ea0543426d0ecb172c72c8f706348961605'/>
<id>urn:sha1:883a7ea0543426d0ecb172c72c8f706348961605</id>
<content type='text'>
The 'find_ref_entry' function is no longer used, so remove it.

Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs/iterator: implement seeking for ref-cache iterators</title>
<updated>2025-03-12T18:31:20Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-12T15:56:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=84e656919cb7237f1b11a948974d0591d9d3434f'/>
<id>urn:sha1:84e656919cb7237f1b11a948974d0591d9d3434f</id>
<content type='text'>
Implement seeking of ref-cache iterators. This is done by splitting most
of the logic to seek iterators out of `cache_ref_iterator_begin()` and
putting it into `cache_ref_iterator_seek()` so that we can reuse the
logic.

Note that we cannot use the optimization anymore where we return an
empty ref iterator when there aren't any references, as otherwise it
wouldn't be possible to reseek the iterator to a different prefix that
may exist. This shouldn't be much of a performance concern though as we
now start to bail out early in case `advance()` sees that there are no
more directories to be searched.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs/iterator: separate lifecycle from iteration</title>
<updated>2025-03-12T18:31:18Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-12T15:56:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=cec2b6f55a805c010d2acc81abf4cbc41b712130'/>
<id>urn:sha1:cec2b6f55a805c010d2acc81abf4cbc41b712130</id>
<content type='text'>
The ref and reflog iterators have their lifecycle attached to iteration:
once the iterator reaches its end, it is automatically released and the
caller doesn't have to care about that anymore. When the iterator should
be released before it has been exhausted, callers must explicitly abort
the iterator via `ref_iterator_abort()`.

This lifecycle is somewhat unusual in the Git codebase and creates two
problems:

  - Callsites need to be very careful about when exactly they call
    `ref_iterator_abort()`, as calling the function is only valid when
    the iterator itself still is. This leads to somewhat awkward calling
    patterns in some situations.

  - It is impossible to reuse iterators and re-seek them to a different
    prefix. This feature isn't supported by any iterator implementation
    except for the reftable iterators anyway, but if it was implemented
    it would allow us to optimize cases where we need to search for
    specific references repeatedly by reusing internal state.

Detangle the lifecycle from iteration so that we don't deallocate the
iterator anymore once it is exhausted. Instead, callers are now expected
to always call a newly introduce `ref_iterator_free()` function that
deallocates the iterator and its internal state.

Note that the `dir_iterator` is somewhat special because it does not
implement the `ref_iterator` interface, but is only used to implement
other iterators. Consequently, we have to provide `dir_iterator_free()`
instead of `dir_iterator_release()` as the allocated structure itself is
managed by the `dir_iterator` interfaces, as well, and not freed by
`ref_iterator_free()` like in all the other cases.

While at it, drop the return value of `ref_iterator_abort()`, which
wasn't really required by any of the iterator implementations anyway.
Furthermore, stop calling `base_ref_iterator_free()` in any of the
backends, but instead call it in `ref_iterator_free()`.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>ref-cache: fix invalid free operation in `free_ref_entry`</title>
<updated>2024-11-26T19:34:37Z</updated>
<author>
<name>shejialuo</name>
<email>shejialuo@gmail.com</email>
</author>
<published>2024-11-26T14:40:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b6318cf23a21b4d1917438c642d8877fb31e7c3e'/>
<id>urn:sha1:b6318cf23a21b4d1917438c642d8877fb31e7c3e</id>
<content type='text'>
In cfd971520e (refs: keep track of unresolved reference value in
iterators, 2024-08-09), we added a new field "referent" into the "struct
ref" structure. In order to free the "referent", we unconditionally
freed the "referent" by simply adding a "free" statement.

However, this is a bad usage. Because when ref entry is either directory
or loose ref, we will always execute the following statement:

  free(entry-&gt;u.value.referent);

This does not make sense. We should never access the "entry-&gt;u.value"
field when "entry" is a directory. However, the change obviously doesn't
break the tests. Let's analysis why.

The anonymous union in the "ref_entry" has two members: one is "struct
ref_value", another is "struct ref_dir". On a 64-bit machine, the size
of "struct ref_dir" is 32 bytes, which is smaller than the 48-byte size
of "struct ref_value". And the offset of "referent" field in "struct
ref_value" is 40 bytes. So, whenever we create a new "ref_entry" for a
directory, we will leave the offset from 40 bytes to 48 bytes untouched,
which means the value for this memory is zero (NULL). It's OK to free a
NULL pointer, but this is merely a coincidence of memory layout.

To fix this issue, we now ensure that "free(entry-&gt;u.value.referent)" is
only called when "entry-&gt;flag" indicates that it represents a loose
reference and not a directory to avoid the invalid memory operation.

Signed-off-by: shejialuo &lt;shejialuo@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: keep track of unresolved reference value in iterators</title>
<updated>2024-08-09T15:47:33Z</updated>
<author>
<name>John Cai</name>
<email>johncai86@gmail.com</email>
</author>
<published>2024-08-09T15:37:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=cfd971520ed11096aaa11f6bd1ee99b307f3146c'/>
<id>urn:sha1:cfd971520ed11096aaa11f6bd1ee99b307f3146c</id>
<content type='text'>
Since ref iterators do not hold onto the direct value of a reference
without resolving it, the only way to get ahold of a direct value of a
symbolic ref is to make a separate call to refs_read_symbolic_ref.

To make accessing the direct value of a symbolic ref more efficient,
let's save the direct value of the ref in the iterators for both the
files backend and the reftable backend.

Signed-off-by: John Cai &lt;johncai86@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs/files: fix NULL pointer deref when releasing ref store</title>
<updated>2024-06-06T16:04:32Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-06-06T05:29:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b3e098d6e77db87946c50c3517fbdeffe9168ca9'/>
<id>urn:sha1:b3e098d6e77db87946c50c3517fbdeffe9168ca9</id>
<content type='text'>
The `free_ref_cache()` function is not `NULL` safe and will thus
segfault when being passed such a pointer. This can easily happen when
trying to release a partially initialized "files" ref store. Fix this.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: pass repo when peeling objects</title>
<updated>2024-05-17T17:33:39Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-05-17T08:19:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=30aaff437fddd889ba429b50b96ea4c151c502c5'/>
<id>urn:sha1:30aaff437fddd889ba429b50b96ea4c151c502c5</id>
<content type='text'>
Both `peel_object()` and `peel_iterated_oid()` implicitly rely on
`the_repository` to look up objects. Despite the fact that we want to
get rid of `the_repository`, it also leads to some restrictions in our
ref iterators when trying to retrieve the peeled value for a repository
other than `the_repository`.

Refactor these functions such that both take a repository as argument
and remove the now-unnecessary restrictions.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
