<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/t/perf, branch jch</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/</subtitle>
<id>https://git.shady.money/git/atom?h=jch</id>
<link rel='self' href='https://git.shady.money/git/atom?h=jch'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2026-04-06T19:02:30Z</updated>
<entry>
<title>p6011: add perf test for rev-list --maximal-only</title>
<updated>2026-04-06T19:02:30Z</updated>
<author>
<name>Derrick Stolee</name>
<email>stolee@gmail.com</email>
</author>
<published>2026-04-06T13:27:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e8e5453ab8794cf29afe0a616d74319442b676bd'/>
<id>urn:sha1:e8e5453ab8794cf29afe0a616d74319442b676bd</id>
<content type='text'>
Add a performance test that compares 'git rev-list --maximal-only'
against 'git merge-base --independent'. The two commands answer
essentially the same question, but the rev-list implementation is more
generic and hence slower. These performance tests demonstrate that gap
in the current state and can also be used to show the equivalence in
the future.

We also add a case with '--since' to force the generic walk logic for
rev-list even when we make that future change to use the merge-base
algorithm on a simple walk.

When run on my copy of git.git, I see these results:

  Test                                      HEAD
  ----------------------------------------------
  6011.2: merge-base --independent          0.03
  6011.3: rev-list --maximal-only           0.06
  6011.4: rev-list --maximal-only --since   0.06

These numbers are low, but the --independent calculation is interesting
because my copy has a lot of local branches that are actually independent.

Running the same test on a fresh clone of the Linux kernel repository
shows a larger difference between the algorithms, especially because the
--independent algorithm is extremely fast when there are no independent
references selected:

  Test                                      HEAD
  ----------------------------------------------
  6011.2: merge-base --independent          0.00
  6011.3: rev-list --maximal-only           0.70
  6011.4: rev-list --maximal-only --since   0.70

Signed-off-by: Derrick Stolee &lt;stolee@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>t/perf/p3400: speed up setup using fast-import</title>
<updated>2026-01-30T17:13:47Z</updated>
<author>
<name>Tian Yuchen</name>
<email>a3205153416@gmail.com</email>
</author>
<published>2026-01-30T17:01:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8466efa4bd92b970f7a37159404aef33296d9d46'/>
<id>urn:sha1:8466efa4bd92b970f7a37159404aef33296d9d46</id>
<content type='text'>
The setup phase in 't/perf/p3400-rebase.sh' generates 100 commits to
simulate a noisy history. It currently uses a shell loop that invokes
'git add', 'git commit', 'test_seq', and 'sort' in each iteration.
This incurs significant overhead due to repeated process spawning.

Optimize the setup by using 'git fast-import' to generate the commit
history. Additionally, pre-compute the forward and reversed file contents
to avoid repetitive execution of 'seq' and 'sort'.

To ensure the test measures rebase performance against a consistent
object layout (rather than the suboptimal pack/loose objects created
by the raw import), perform a full repack (`git repack -a -d`) at the
end of the setup.

This reduces the setup time significantly while maintaining the validity
of the subsequent performance tests.

Performance improvement (average of 5 test runs):
            Real        Rebase
  Before:  29.045s      13.34s
   After:  21.989s      12.84s

Measured on Lenovo Yoga 2020, Ubuntu 24.04.

Signed-off-by: Tian Yuchen &lt;a3205153416@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/cat-file-avoid-bitmap-when-unneeded'</title>
<updated>2026-01-16T20:40:27Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2026-01-16T20:40:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8c5f0adf21df709a17866bfa36bb0b1df6344ee6'/>
<id>urn:sha1:8c5f0adf21df709a17866bfa36bb0b1df6344ee6</id>
<content type='text'>
Fix for a performance regression in "git cat-file".

* jk/cat-file-avoid-bitmap-when-unneeded:
  cat-file: only use bitmaps when filtering
</content>
</entry>
<entry>
<title>Merge branch 'jk/t-perf-fixes'</title>
<updated>2026-01-16T20:40:26Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2026-01-16T20:40:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=18d7d02088792a1571db1d7677f89e398a71ad44'/>
<id>urn:sha1:18d7d02088792a1571db1d7677f89e398a71ad44</id>
<content type='text'>
Perf-test fixes.

* jk/t-perf-fixes:
  t/perf/run: preserve GIT_PERF_* from environment
  t/perf/perf-lib: fix assignment of TEST_OUTPUT_DIRECTORY
</content>
</entry>
<entry>
<title>cat-file: only use bitmaps when filtering</title>
<updated>2026-01-07T00:05:12Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2026-01-06T10:25:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9e8b448dd83297ac85f6a62a0d2408629fb45cc0'/>
<id>urn:sha1:9e8b448dd83297ac85f6a62a0d2408629fb45cc0</id>
<content type='text'>
Commit 8002e8ee18 (builtin/cat-file: use bitmaps to efficiently filter
by object type, 2025-04-02) introduced a performance regression when we
are not filtering objects: it uses bitmaps even when they won't help,
incurring extra costs. For example, running the new perf tests from this
commit, which check the performance of listing objects by oid:

  $ export GIT_PERF_LARGE_REPO=/path/to/linux.git
  $ git -C "$GIT_PERF_LARGE_REPO" repack -adb
  $ GIT_SKIP_TESTS=p1006.1 ./run 8002e8ee18^ 8002e8ee18 p1006-cat-file.sh
  [...]
  Test                                  8002e8ee18^       8002e8ee18
  -------------------------------------------------------------------------------
  1006.2: list all objects (sorted)     1.48(1.44+0.04)   6.39(6.35+0.04) +331.8%
  1006.3: list all objects (unsorted)   3.01(2.97+0.04)   3.40(3.29+0.10) +13.0%
  1006.4: list blobs                    4.85(4.67+0.17)   1.68(1.58+0.10) -65.4%

An invocation that filters, like listing all blobs (1006.4), does
benefit from using the bitmaps; it now doesn't have to check the type of
each object from the pack data, so the tradeoff is worth it.

But for listing all objects in sorted idx order (1006.2), we otherwise
would never open the bitmap or the revindex file. Worse, our sorting
step gets much slower. Normally we append into an array in pack .idx
order, and the sort step is trivial. But with bitmaps, we get the
objects in pack order, which is apparently random with respect to oid,
and have to sort the whole thing. (Note that this freshly-packed state
represents the best case for .idx sorting; if we had two packs, then
we'd have their objects one after the other and qsort would have to
interleave them).

The unsorted test in 1006.3 is interesting: there we are going in pack
order, so we load the revindex for the pack anyway. And though we don't
sort the result, we do use an oidset to check for duplicates. So we can
see in the 8002e8ee18^ timings that those two things cost ~1.5s over the
sorted case (mostly the oidset hash cost). We also incur the extra cost
to open the bitmap file as of 8002e8ee18, which seems to be ~400ms.
(This would probably be faster with a bitmap lookup table, but writing
that out is not yet the default).

So we know that bitmaps help when there's filtering to be done, but
otherwise make things worse. Let's only use them when there's a filter.

The perf script shows that we've fixed the regressions without hurting
the bitmap case:

  Test                                  8002e8ee18^       8002e8ee18                HEAD
  --------------------------------------------------------------------------------------------------------
  1006.2: list all objects (sorted)     1.56(1.53+0.03)   6.44(6.37+0.06) +312.8%   1.62(1.54+0.06) +3.8%
  1006.3: list all objects (unsorted)   3.04(2.98+0.06)   3.45(3.38+0.07) +13.5%    3.04(2.99+0.04) +0.0%
  1006.4: list blobs                    5.14(4.98+0.15)   1.76(1.68+0.06) -65.8%    1.73(1.64+0.09) -66.3%

Note that there's another related case: we might have a filter that
cannot be used with bitmaps. That check is handled already for us in
for_each_bitmapped_object(), though we'd still load the bitmap and
revindex files pointlessly in that case. I don't think it can happen in
practice for cat-file, though, since it allows only blob:none,
blob:limit, and object:type filters, all of which work with bitmaps.

It would be easy-ish to insert an extra check like:

  can_filter_bitmap(&amp;opt-&gt;objects_filter);

into the conditional, but I didn't bother here. It would be redundant
with the call in for_each_bitmapped_object(), and the can_filter helper
function is static local in the bitmap code (so we'd have to make it
public).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>t/perf/run: preserve GIT_PERF_* from environment</title>
<updated>2026-01-06T23:56:36Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2026-01-06T10:16:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=79d301c7676e79a09e7a1c65ca754132e1770828'/>
<id>urn:sha1:79d301c7676e79a09e7a1c65ca754132e1770828</id>
<content type='text'>
If you run:

  GIT_PERF_LARGE_REPO=/some/path ./p1006-cat-file.sh

it will use the repo in /some/path. But if you use the "run" helper
script to aggregate and compare results, like this:

  GIT_PERF_LARGE_REPO=/some/path ./run HEAD^ HEAD p1006-cat-file.sh

it will ignore that variable. This is because the presence of the
LARGE_REPO variable in GIT-BUILD-OPTIONS overrides what's in the
environment. This started with 4638e8806e (Makefile: use common template
for GIT-BUILD-OPTIONS, 2024-12-06), which now writes even empty
variables (though arguably it was wrong even before with a non-empty
value, as we generally prefer the environment to take precedence over
on-disk config).

We had the same problem in perf-lib.sh itself, and we hacked around it
with 32b74b9809 (perf: do allow `GIT_PERF_*` to be overridden again,
2025-04-04). That's what lets the direct invocation of "./p1006" work
above.

And in fact that was sufficient for "./run", too, until it started
loading GIT-BUILD-OPTIONS itself in 5756ccd181 (t/perf: fix benchmarks
with out-of-tree builds, 2025-04-28). Now it has the same problem: it
clobbers any incoming GIT_PERF options from the environment.

We can use the same hack here in the "run" script. It's quite ugly, but
it's just short enough that I don't think it's worth trying to factor it
out into a common shell library.

In the long run, we might consider teaching GIT-BUILD-OPTIONS to be more
gentle in overwriting existing entries. There are probably other
GIT_TEST_* variables which would need the same treatment. And if and
when we come up with a more complete solution, we can use it in both
spots.
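
As a rough illustration of the save-and-restore idea (not the actual
code in "run" or perf-lib.sh), here is a minimal sketch for a single
variable. The temporary file is a hypothetical stand-in for
GIT-BUILD-OPTIONS, and the names was_set and saved_value are made up
for this example:

```shell
# Hypothetical stand-in for GIT-BUILD-OPTIONS, which the build writes
# with an empty value when the variable was not set at build time.
options_file=$(mktemp)
printf '%s\n' "GIT_PERF_LARGE_REPO=''" > "$options_file"

# Value supplied by the caller's environment.
GIT_PERF_LARGE_REPO=/some/path
export GIT_PERF_LARGE_REPO

# Remember whether the caller set the variable, and to what.
was_set=${GIT_PERF_LARGE_REPO+yes}
saved_value=${GIT_PERF_LARGE_REPO-}

# Loading the build options clobbers the variable...
. "$options_file"

# ...so put the caller's value back if there was one.
if [ "$was_set" = yes ]
then
	GIT_PERF_LARGE_REPO=$saved_value
fi

rm -f "$options_file"
echo "GIT_PERF_LARGE_REPO=$GIT_PERF_LARGE_REPO"
```

A real version would have to cover every GIT_PERF_* variable rather
than one hard-coded name, which is where the ugliness comes from.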

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>t/perf/perf-lib: fix assignment of TEST_OUTPUT_DIRECTORY</title>
<updated>2026-01-06T23:56:36Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2026-01-06T10:13:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=aad1d1c0d58f36d5cc7822c9ff6f0064708a1f20'/>
<id>urn:sha1:aad1d1c0d58f36d5cc7822c9ff6f0064708a1f20</id>
<content type='text'>
Using the perf suite's "run" helper in a vanilla build fails like this:

  $ make &amp;&amp; (cd t/perf &amp;&amp; ./run p0000-perf-lib-sanity.sh)
  === Running 1 tests in this tree ===
  perf 1 - test_perf_default_repo works: 1 2 3 ok
  perf 2 - test_checkout_worktree works: 1 2 3 ok
  ok 3 - test_export works
  perf 4 - export a weird var: 1 2 3 ok
  perf 5 - éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś: 1 2 3 ok
  ok 6 - test_export works with weird vars
  perf 7 - important variables available in subshells: 1 2 3 ok
  perf 8 - test-lib-functions correctly loaded in subshells: 1 2 3 ok
  # passed all 8 test(s)
  1..8
  cannot open test-results/p0000-perf-lib-sanity.subtests: No such file or directory at ./aggregate.perl line 159.

It is trying to aggregate results written into t/perf/test-results, but
the p0000 script did not write anything there.

The "run" script looks in $TEST_OUTPUT_DIRECTORY/test-results, or if
that variable is not set, in test-results in the current working
directory (which should be t/perf itself). It pulls the value of
$TEST_OUTPUT_DIRECTORY from the GIT-BUILD-OPTIONS file.

But that doesn't quite match the setup in perf-lib.sh (which is what
scripts like p0000 use). There we do this at the top of the script:

  TEST_OUTPUT_DIRECTORY=$(pwd)

and then let test-lib.sh append "/test-results" to that. Historically,
that made the vanilla case work: we'd always use t/perf/test-results.
But when $TEST_OUTPUT_DIRECTORY was set, it would break.

Commit 5756ccd181 (t/perf: fix benchmarks with out-of-tree builds,
2025-04-28) fixed that second case by loading GIT-BUILD-OPTIONS
ourselves. But that broke the vanilla case!

Now our setting of $TEST_OUTPUT_DIRECTORY in perf-lib.sh is ignored,
because it is overwritten by GIT-BUILD-OPTIONS. And when test-lib.sh
sees that the output directory is empty, it defaults to t/test-results,
rather than t/perf/test-results.

Nobody seems to have noticed, probably for two reasons:

  1. It only matters if you're trying to aggregate results (like the
     "run" script does). Just running "./p0000-perf-lib-sanity.sh"
     manually still produces useful output; the stored result files are
     just in an unexpected place.

  2. There might be leftover files in t/perf/test-results from previous
     runs (before 5756ccd181). In particular, the ".subtests" files
     don't tend to change, and the lack of that file is what causes it
     to barf completely. So it's possible that the aggregation could
     have been showing stale results that did not match the run that
     just happened.

We can fix it by setting TEST_OUTPUT_DIRECTORY only after we've loaded
GIT-BUILD-OPTIONS, so that we override its value and not the other way
around. And we'll do so only when the variable is not set, which should
retain the fix for that case from 5756ccd181.
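
The ordering can be sketched roughly as follows. This is an
illustration only, not the code in perf-lib.sh: the temporary file is a
hypothetical stand-in for a vanilla GIT-BUILD-OPTIONS, and the sketch
assumes that an empty value counts as "not set":

```shell
# Hypothetical stand-in for a vanilla GIT-BUILD-OPTIONS, which leaves
# TEST_OUTPUT_DIRECTORY empty.
options_file=$(mktemp)
printf '%s\n' "TEST_OUTPUT_DIRECTORY=''" > "$options_file"

# Load the build options first...
. "$options_file"

# ...then fall back to the current directory only if nothing was
# configured (assumption: empty is treated the same as unset).
if [ -z "$TEST_OUTPUT_DIRECTORY" ]
then
	TEST_OUTPUT_DIRECTORY=$(pwd)
fi

rm -f "$options_file"
echo "results go to $TEST_OUTPUT_DIRECTORY/test-results"
```

If the stand-in file set a non-empty directory instead, the fallback
would be skipped and that configured value would be kept.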

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>show-branch: use prio_queue</title>
<updated>2025-12-28T05:01:23Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2025-12-26T07:44:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=abf05d856f50fbd8f0390b31e7187d78930dbaf5'/>
<id>urn:sha1:abf05d856f50fbd8f0390b31e7187d78930dbaf5</id>
<content type='text'>
Building a list using commit_list_insert_by_date() has quadratic worst
case complexity.  Avoid it by using prio_queue.

Use prio_queue_peek()+prio_queue_replace() instead of prio_queue_get()+
prio_queue_put() if possible, as the former only rebalance the
prio_queue heap once instead of twice.

In sane repositories this won't make much of a difference because the
number of items in the list or queue won't be very high:

Benchmark 1: ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo
  Time (mean ± σ):     538.2 ms ±   0.8 ms    [User: 527.6 ms, System: 9.6 ms]
  Range (min … max):   537.0 ms … 539.2 ms    10 runs

Benchmark 2: ./git show-branch origin/main origin/next origin/seen origin/todo
  Time (mean ± σ):     530.6 ms ±   0.4 ms    [User: 519.8 ms, System: 9.8 ms]
  Range (min … max):   530.1 ms … 531.3 ms    10 runs

Summary
  ./git show-branch origin/main origin/next origin/seen origin/todo ran
    1.01 ± 0.00 times faster than ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo

That number is not limited, though, and in pathological cases like the
one in p6010 we see a sizable improvement:

Test                      v2.52.0           HEAD
------------------------------------------------------------------
6010.4: git show-branch   2.19(2.19+0.00)   0.03(0.02+0.00) -98.6%

Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'rs/merge-base-optim'</title>
<updated>2025-11-03T14:49:55Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-11-03T14:49:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a4b1a1478b2a1e9f2f0b65e913baeed03f345a26'/>
<id>urn:sha1:a4b1a1478b2a1e9f2f0b65e913baeed03f345a26</id>
<content type='text'>
The code that walks the revision graph to compute merge bases has been
optimized.

* rs/merge-base-optim:
  commit-reach: avoid commit_list_insert_by_date()
</content>
</entry>
<entry>
<title>commit-reach: avoid commit_list_insert_by_date()</title>
<updated>2025-10-24T17:13:17Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2025-10-24T16:47:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=134ec330d2945002d0ceb7de2ac6cd7ab0af762d'/>
<id>urn:sha1:134ec330d2945002d0ceb7de2ac6cd7ab0af762d</id>
<content type='text'>
Building a list using commit_list_insert_by_date() has quadratic worst
case complexity.  Avoid it by just appending in the loop and sorting at
the end.

The number of merge bases is usually small, so don't expect speedups in
normal repositories.  It has no limit, though.  The added perf test
shows a nice improvement when dealing with 16384 merge bases:

Test                     v2.51.1           HEAD
-----------------------------------------------------------------
6010.2: git merge-base   0.55(0.54+0.00)   0.03(0.02+0.00) -94.5%

Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
