<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/ref-filter.c, branch jch</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/</subtitle>
<id>https://git.shady.money/git/atom?h=jch</id>
<link rel='self' href='https://git.shady.money/git/atom?h=jch'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2026-03-09T21:36:55Z</updated>
<entry>
<title>Merge branch 'ps/refs-for-each'</title>
<updated>2026-03-09T21:36:55Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2026-03-09T21:36:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d445aecfb013ae7b45e946f9aea06464aee69ed8'/>
<id>urn:sha1:d445aecfb013ae7b45e946f9aea06464aee69ed8</id>
<content type='text'>
Code refactoring around refs-for-each-* API functions.

* ps/refs-for-each:
  refs: replace `refs_for_each_fullref_in()`
  refs: replace `refs_for_each_namespaced_ref()`
  refs: replace `refs_for_each_glob_ref()`
  refs: replace `refs_for_each_glob_ref_in()`
  refs: replace `refs_for_each_rawref_in()`
  refs: replace `refs_for_each_rawref()`
  refs: replace `refs_for_each_ref_in()`
  refs: improve verification for-each-ref options
  refs: generalize `refs_for_each_fullref_in_prefixes()`
  refs: generalize `refs_for_each_namespaced_ref()`
  refs: speed up `refs_for_each_glob_ref_in()`
  refs: introduce `refs_for_each_ref_ext`
  refs: rename `each_ref_fn`
  refs: rename `do_for_each_ref_flags`
  refs: move `do_for_each_ref_flags` further up
  refs: move `refs_head_ref_namespaced()`
  refs: remove unused `refs_for_each_include_root_ref()`
</content>
</entry>
<entry>
<title>Merge branch 'bk/mailmap-wo-the-repository'</title>
<updated>2026-03-04T18:53:00Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2026-03-04T18:53:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2d843a2d3d6c2d5e7861e6aa99743d15d36746b9'/>
<id>urn:sha1:2d843a2d3d6c2d5e7861e6aa99743d15d36746b9</id>
<content type='text'>
Wean the mailmap code off of the_repository dependency.

* bk/mailmap-wo-the-repository:
  mailmap: drop global config variables
  mailmap: stop using the_repository
</content>
</entry>
<entry>
<title>Merge branch 'jk/ref-filter-lrstrip-optim'</title>
<updated>2026-02-27T23:11:51Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2026-02-27T23:11:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ce4530ac10eecfcd80bd54c6993f2279a377e193'/>
<id>urn:sha1:ce4530ac10eecfcd80bd54c6993f2279a377e193</id>
<content type='text'>
Code clean-up.

* jk/ref-filter-lrstrip-optim:
  ref-filter: clarify lstrip/rstrip component counting
  ref-filter: avoid strrchr() in rstrip_ref_components()
  ref-filter: simplify rstrip_ref_components() memory handling
  ref-filter: simplify lstrip_ref_components() memory handling
  ref-filter: factor out refname component counting
</content>
</entry>
<entry>
<title>refs: generalize `refs_for_each_fullref_in_prefixes()`</title>
<updated>2026-02-23T21:21:18Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2026-02-23T11:59:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f503bb7dc96ee92623ade8d60eed401ecfddae0f'/>
<id>urn:sha1:f503bb7dc96ee92623ade8d60eed401ecfddae0f</id>
<content type='text'>
The function `refs_for_each_fullref_in_prefixes()` can be used to
iterate over all references that match any of the user-provided prefixes.
In contrast to the `prefix` parameter of `refs_for_each_ref_ext()`, it
handles the case where several of the passed-in prefixes share a common
prefix: it computes the longest common prefixes and then iterates over
those.

While we could move this logic into `refs_for_each_ref_ext()`, this one
feels somewhat special as we perform multiple iterations. But what we
_can_ do is to generalize how this function works: instead of accepting
only a small handful of parameters, we can have it accept the full
options structure.

One obvious exception is that the caller must not provide a prefix via
the options. But this case can be easily detected.

Refactor the code accordingly.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: rename `each_ref_fn`</title>
<updated>2026-02-23T21:21:18Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2026-02-23T11:59:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=635f08b7394b9dda013a0b78f4db11348dc7717b'/>
<id>urn:sha1:635f08b7394b9dda013a0b78f4db11348dc7717b</id>
<content type='text'>
Similar to the preceding commit, rename `each_ref_fn` to better match
our current best practices around how we name things.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: rename `do_for_each_ref_flags`</title>
<updated>2026-02-23T21:21:18Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2026-02-23T11:59:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8f0720a5a781562fb1f750b351e14129fc8930ea'/>
<id>urn:sha1:8f0720a5a781562fb1f750b351e14129fc8930ea</id>
<content type='text'>
The enum `do_for_each_ref_flags` and its individual values don't match
our current best practices when it comes to naming things. Rename it
to `refs_for_each_flag`.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>ref-filter: clarify lstrip/rstrip component counting</title>
<updated>2026-02-20T16:40:06Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2026-02-20T06:00:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8b0061b5c5008375ef0986b7aafedbd7d79da0f6'/>
<id>urn:sha1:8b0061b5c5008375ef0986b7aafedbd7d79da0f6</id>
<content type='text'>
When a strip option to the %(refname) placeholder is asked to leave N
path components, we first count up the path components to know how many
to remove. That happens with a loop like this:

	/* Find total no of '/' separated path-components */
	for (i = 0; p[i]; p[i] == '/' ? i++ : *p++)
		;

which is a little hard to understand for two reasons.

First, the dereference in "*p++" is seemingly useless, since nobody
looks at the result. And static analyzers like Coverity will complain
about that. But removing the "*" will cause gcc to complain with
-Wint-conversion, since the two sides of the ternary do not match (one
is a pointer and the other an int).

Second, it is not clear what the meaning of "p" is at each iteration of
the loop, as its position with respect to our walk over the string
depends on how many slashes we've seen. The answer is that by itself, it
doesn't really mean anything: "p + i" represents the current state of
our walk, with "i" counting up slashes, and "p" by itself essentially
meaningless.

None of this behaves incorrectly, but ultimately the loop is just
counting the slashes in the refname. We can do that much more simply
with a for-loop iterating over the string and a separate slash counter.

We can also drop the comment, which is somewhat misleading. We are
counting slashes, not components (and a comment later in the function
makes it clear that we must add one to compensate). In the new code it
is obvious that we are counting slashes here.
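
As a rough illustration of the simpler form described above (a sketch
only, assuming a NUL-terminated refname; the name count_slashes() is
made up here and this is not necessarily the code from the patch):

```c
/* Hypothetical helper: count the '/' characters in a refname. */
int count_slashes(const char *refname)
{
	int slashes = 0;

	/* One pointer walks the string; a separate counter tracks slashes. */
	for (const char *p = refname; *p; p++)
		if (*p == '/')
			slashes++;
	return slashes;
}
```

For "refs/heads/main" this counts two slashes; the caller adds one to
get the three path components.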

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>mailmap: stop using the_repository</title>
<updated>2026-02-20T16:13:58Z</updated>
<author>
<name>Burak Kaan Karaçay</name>
<email>bkkaracay@gmail.com</email>
</author>
<published>2026-02-20T06:04:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=999b09348d6302d018165b4b3d289d4579d08e9e'/>
<id>urn:sha1:999b09348d6302d018165b4b3d289d4579d08e9e</id>
<content type='text'>
The 'read_mailmap' and 'read_mailmap_blob' functions rely on the global
'the_repository' variable. Update both functions to accept a
'struct repository' parameter.

Update all callers to pass 'the_repository' to retain the current
behavior.

Signed-off-by: Burak Kaan Karaçay &lt;bkkaracay@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>ref-filter: avoid strrchr() in rstrip_ref_components()</title>
<updated>2026-02-17T17:45:29Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2026-02-15T09:07:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=fe732a8b9f1837189eb3533a262548a652c11e61'/>
<id>urn:sha1:fe732a8b9f1837189eb3533a262548a652c11e61</id>
<content type='text'>
To strip path components from our refname string, we repeatedly call
strrchr() to find the trailing slash, shortening the string each time by
assigning NUL over it. This has two downsides:

  1. Calling strrchr() in a loop is quadratic, since each call has to
     call strlen() under the hood to find the end of the string (even
     though we know exactly where it is from the last loop iteration).

  2. We need a temporary buffer, since we're munging the string with NUL
     as we shorten it (which we must do, because strrchr() has no other
     way of knowing what we consider the end of the string).

Using memrchr() would let us fix both of these, but it isn't portable.
So instead, let's just open-code the string traversal from back to
front as we loop.
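
A back-to-front traversal of this kind might look roughly like the
sketch below. It is an illustration under assumptions, not the code
from this patch: the name rstrip_len() and its signature are made up,
and it only computes the length of the prefix to keep rather than
returning an allocated string.

```c
/*
 * Hypothetical helper: return how many leading bytes of "refname"
 * (of length "len") remain after dropping "to_strip" trailing path
 * components. No NUL is written and no temporary copy is needed.
 */
unsigned long rstrip_len(const char *refname, unsigned long len,
			 unsigned int to_strip)
{
	for (; to_strip; to_strip--) {
		/* Step back until "len" is the index of the previous '/'. */
		do {
			if (!len)
				return 0; /* fewer components than requested */
			len--;
		} while (refname[len] != '/');
	}
	return len;
}
```

Each pass resumes where the previous one stopped, so every byte is
visited at most once and the whole walk is linear in the length of the
refname.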

I doubt that the quadratic nature is a serious concern. You can see it
in practice with something like:

  git init
  git commit --allow-empty -m foo
  echo "$(git rev-parse HEAD) refs/heads$(perl -e 'print "/a" x 500_000')" &gt;.git/packed-refs
  time git for-each-ref --format='%(refname:rstrip=-1)'

That takes ~5.5s to run on my machine before this patch, and ~11ms
after. But I don't think there's a reasonable way for somebody to infect
you with such a garbage ref, as the wire protocol is limited to 64k
pkt-lines. The difference is measurable for me for a 32k-component ref
(about 19ms vs 7ms), so perhaps you could create some chaos by pushing a
lot of them. But we also run into filesystem limits (if the loose
backend is in use), and in practice it seems like there are probably
simpler and more effective ways to waste CPU.

Likewise the extra allocation probably isn't really measurable. In fact,
since our goal is to return an allocated string, we end up having to
make the same allocation anyway (though it is sized to the result,
rather than the input). My main goal was simplicity in avoiding the need
to handle cleaning it up in the early return path.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>ref-filter: simplify rstrip_ref_components() memory handling</title>
<updated>2026-02-17T17:45:29Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2026-02-15T09:05:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2ec30e71f44233b9afceda0f2992029674187674'/>
<id>urn:sha1:2ec30e71f44233b9afceda0f2992029674187674</id>
<content type='text'>
We're stripping path components from the end of a string, which we do by
assigning a NUL as we parse each component, shortening the string. This
requires an extra temporary buffer to avoid munging our input string.

But the way that we allocate the buffer is unusual. We have an extra
"to_free" variable. Usually this is used when the access variable is
conceptually const, like:

   const char *foo;
   char *to_free = NULL;

   if (...)
           foo = to_free = xstrdup(...);
   else
           foo = some_const_string;
   ...
   free(to_free);

But that's not what's happening here. Our "start" variable always points
to the allocated buffer, and to_free is redundant. Worse, it is marked
as const itself, requiring a cast when we free it.

Let's drop to_free entirely, and mark "start" as non-const, making the
memory handling clearer. As a bonus, this also silences a warning
from glibc-2.43 that our call to strrchr() implicitly strips away the
const-ness of "start".

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
