<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/refs, branch v2.42.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.42.3</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.42.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2023-07-21T20:47:26Z</updated>
<entry>
<title>Merge branch 'tb/refs-exclusion-and-packed-refs'</title>
<updated>2023-07-21T20:47:26Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-07-21T20:47:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=39fe402d6727efb6c98ddca19fae1f094ffaa6b3'/>
<id>urn:sha1:39fe402d6727efb6c98ddca19fae1f094ffaa6b3</id>
<content type='text'>
Enumerating refs in the packed-refs file, while excluding refs that
match certain patterns, has been optimized.

* tb/refs-exclusion-and-packed-refs:
  ls-refs.c: avoid enumerating hidden refs where possible
  upload-pack.c: avoid enumerating hidden refs where possible
  builtin/receive-pack.c: avoid enumerating hidden references
  refs.h: implement `hidden_refs_to_excludes()`
  refs.h: let `for_each_namespaced_ref()` take excluded patterns
  revision.h: store hidden refs in a `strvec`
  refs/packed-backend.c: add trace2 counters for jump list
  refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)
  refs/packed-backend.c: refactor `find_reference_location()`
  refs: plumb `exclude_patterns` argument throughout
  builtin/for-each-ref.c: add `--exclude` option
  ref-filter.c: parameterize match functions over patterns
  ref-filter: add `ref_filter_clear()`
  ref-filter: clear reachable list pointers after freeing
  ref-filter.h: provide `REF_FILTER_INIT`
  refs.c: rename `ref_filter`
</content>
</entry>
<entry>
<title>refs/packed-backend.c: add trace2 counters for jump list</title>
<updated>2023-07-10T21:48:55Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-07-10T21:12:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c489f47a649da8984a2240032e1d819e9a80ea2a'/>
<id>urn:sha1:c489f47a649da8984a2240032e1d819e9a80ea2a</id>
<content type='text'>
The previous commit added low-level tests to ensure that the packed-refs
iterator did not enumerate excluded sections of the refspace.

However, there was no guarantee that these sections weren't being
visited, only that they were being suppressed from the output. To harden
these tests, add a trace2 counter which tracks the number of regions
skipped by the packed-refs iterator, and assert on its value.

Suggested-by: Derrick Stolee &lt;derrickstolee@github.com&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)</title>
<updated>2023-07-10T21:48:55Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-07-10T21:12:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=59c35fac54054b3f781b0275eac7d7c54468f0d5'/>
<id>urn:sha1:59c35fac54054b3f781b0275eac7d7c54468f0d5</id>
<content type='text'>
When iterating through the `packed-refs` file in order to answer a query
like:

    $ git for-each-ref --exclude=refs/__hidden__

it would be useful to avoid walking over all of the entries in
`refs/__hidden__/*` when possible, since we know that the ref-filter
code is going to throw them away anyways.

In certain circumstances, doing so is possible. The algorithm for doing
so is as follows:

  - For each excluded pattern, find the first record that matches it,
    and the first record that *doesn't* match it (i.e. the location
    you'd next want to consider when excluding that pattern).

  - Sort the set of excluded regions from the previous step in ascending
    order of the first location within the `packed-refs` file that
    matches.

  - Clean up the results from the previous step: discard empty regions,
    and combine adjacent regions. The set of regions which remains is
    referred to as the "jump list", and never contains any references
    which should be included in the result set.

Then when iterating through the `packed-refs` file, if `iter-&gt;pos` is
ever contained in one of the regions from the previous steps, advance
`iter-&gt;pos` past the end of that region, and continue enumeration.

Note that we only perform this optimization when none of the excluded
pattern(s) have special meta-characters in them. For a pattern like
"refs/foo[ac]", the excluded regions ("refs/fooa", "refs/fooc", and
everything underneath them) are not connected. A future implementation
that handles this case may split the character class (pretending as if
two patterns were excluded: "refs/fooa", and "refs/fooc").

There are a few other gotchas worth considering. First, note that the
jump list is sorted, so once we jump past a region, we can avoid
considering it (or any regions preceding it) again. The member
`jump_pos` is used to track the first next-possible region to jump
through.

Second, note that the jump list is best-effort, since we do not handle
loose references, and because of the meta-character issue above. The
jump list may not skip past all references which won't appear in the
results, but will never skip over a reference which does appear in the
result set.

In repositories with a large number of hidden references, the speed-up
can be significant. Tests here are done with a copy of linux.git with a
reference "refs/pull/N" pointing at every commit, as in:

    $ git rev-list HEAD | awk '{ print "create refs/pull/" NR " " $0 }' |
        git update-ref --stdin
    $ git pack-refs --all

, it is significantly faster to have `for-each-ref` jump over the
excluded references, as opposed to filtering them out after the fact:

    $ hyperfine \
      'git for-each-ref --format="%(objectname) %(refname)" | grep -vE "^[0-9a-f]{40} refs/pull/"' \
      'git.prev for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"' \
      'git.compile for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"'
    Benchmark 1: git for-each-ref --format="%(objectname) %(refname)" | grep -vE "^[0-9a-f]{40} refs/pull/"
      Time (mean ± σ):     798.1 ms ±   3.3 ms    [User: 687.6 ms, System: 146.4 ms]
      Range (min … max):   794.5 ms … 805.5 ms    10 runs

    Benchmark 2: git.prev for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"
      Time (mean ± σ):      98.9 ms ±   1.4 ms    [User: 93.1 ms, System: 5.7 ms]
      Range (min … max):    97.0 ms … 104.0 ms    29 runs

    Benchmark 3: git.compile for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"
      Time (mean ± σ):       4.5 ms ±   0.2 ms    [User: 0.7 ms, System: 3.8 ms]
      Range (min … max):     4.1 ms …   5.8 ms    524 runs

    Summary
      'git.compile for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"' ran
       21.87 ± 1.05 times faster than 'git.prev for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"'
      176.52 ± 8.19 times faster than 'git for-each-ref --format="%(objectname) %(refname)" | grep -vE "^[0-9a-f]{40} refs/pull/"'

(Comparing stock git and this patch isn't quite fair, since an earlier
commit in this series adds a naive implementation of the `--exclude`
option. `git.prev` is built from the previous commit and includes this
naive implementation).

Using the jump list is fairly straightforward (see the changes to
`refs/packed-backend.c::next_record()`), but constructing the list is
not. To ensure that the construction is correct, add a new suite of
tests in t1419 covering various corner cases (overlapping regions,
partially overlapping regions, adjacent regions, etc.).

Co-authored-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs/packed-backend.c: refactor `find_reference_location()`</title>
<updated>2023-07-10T21:48:55Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-07-10T21:12:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d22d941ac08aa31266bb0b38d78c79c3ffdc4f86'/>
<id>urn:sha1:d22d941ac08aa31266bb0b38d78c79c3ffdc4f86</id>
<content type='text'>
The function `find_reference_location()` is used to perform a
binary search-like function over the contents of a repository's
`$GIT_DIR/packed-refs` file.

The search it implements is unlike a standard binary search in that the
records it searches over are not of a fixed width, so the comparison
must locate the end of a record before comparing it.

Extract the core routine of `find_reference_location()` in order to
implement a function in the following patch which will find the first
location in the `packed-refs` file that *doesn't* match the given
pattern.

The behavior of `find_reference_location()` is unchanged.

Co-authored-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: plumb `exclude_patterns` argument throughout</title>
<updated>2023-07-10T21:48:55Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-07-10T21:12:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b269ac53c07aa46d5a88d05dac8216d189e69a50'/>
<id>urn:sha1:b269ac53c07aa46d5a88d05dac8216d189e69a50</id>
<content type='text'>
The subsequent patch will want to access an optional `excluded_patterns`
array within `refs/packed-backend.c` that will cull out certain
references matching any of the given patterns on a best-effort basis.

To do so, the refs subsystem needs to be updated to pass this value
across a number of different locations.

Prepare for a future patch by introducing this plumbing now, passing
NULLs at top-level APIs in order to make that patch less noisy and more
easily readable.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.co&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>repository: remove unnecessary include of path.h</title>
<updated>2023-06-21T20:39:53Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-05-16T06:33:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c339932bd858e84490c8690d393307a58764d6ed'/>
<id>urn:sha1:c339932bd858e84490c8690d393307a58764d6ed</id>
<content type='text'>
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>cache.h: remove this no-longer-used header</title>
<updated>2023-06-21T20:39:53Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-05-16T06:33:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=bc5c5ec0446895f5c4139cd470066beb3c4ac6d5'/>
<id>urn:sha1:bc5c5ec0446895f5c4139cd470066beb3c4ac6d5</id>
<content type='text'>
Since this header showed up in some places besides just #include
statements, update/clean-up/remove those other places as well.

Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got
away with violating the rule that all files must start with an include
of git-compat-util.h (or a short-list of alternate headers that happen
to include it first).  This change exposed the violation and caused it
to stop building correctly; fix it by having it include
git-compat-util.h first, as per policy.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>statinfo: move stat_{data,validity} functions from cache/read-cache</title>
<updated>2023-06-21T20:39:53Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-05-16T06:33:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=90cbae9ce5d22df29867be9026c514b8c79e3d31'/>
<id>urn:sha1:90cbae9ce5d22df29867be9026c514b8c79e3d31</id>
<content type='text'>
These functions do not depend upon struct cache_entry or struct
index_state in any way, and it seems more logical to break them out into
this file, especially since statinfo.h already has the struct stat_data
declaration.

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jc/pack-ref-exclude-include'</title>
<updated>2023-06-13T19:29:45Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-06-13T19:29:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=cbc882ea388143bd6bbed139f97f67589777be60'/>
<id>urn:sha1:cbc882ea388143bd6bbed139f97f67589777be60</id>
<content type='text'>
"git pack-refs" learns "--include" and "--exclude" to tweak the ref
hierarchy to be packed using pattern matching.

* jc/pack-ref-exclude-include:
  pack-refs: teach pack-refs --include option
  pack-refs: teach --exclude option to exclude refs from being packed
  docs: clarify git-pack-refs --all will pack all refs
</content>
</entry>
<entry>
<title>pack-refs: teach pack-refs --include option</title>
<updated>2023-05-12T21:54:14Z</updated>
<author>
<name>John Cai</name>
<email>johncai86@gmail.com</email>
</author>
<published>2023-05-12T21:34:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4fe42f326e10a547dc65dfe9e5ceaeeee02b98db'/>
<id>urn:sha1:4fe42f326e10a547dc65dfe9e5ceaeeee02b98db</id>
<content type='text'>
Allow users to be more selective over which refs to pack by adding an
--include option to git-pack-refs.

The existing options allow some measure of selectivity. By default
git-pack-refs packs all tags. --all can be used to include all refs,
and the previous commit added the ability to exclude certain refs with
--exclude.

While these knobs give the user some selection over which refs to pack,
it could be useful to give more control. For instance, a repository may
have a set of branches that are rarely updated and would benefit from
being packed. --include would allow the user to easily include a set of
branches to be packed while leaving everything else unpacked.

Signed-off-by: John Cai &lt;johncai86@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
