git/t/perf, branch jch

p6011: add perf test for rev-list --maximal-only

2026-04-06T19:02:30Z

Add a performance test that compares 'git rev-list --maximal-only' against 'git merge-base --independent'. These two commands are asking essentially the same thing, but the rev-list implementation is more generic and hence slower. These performance tests will demonstrate that in the current state and also be used to show the equivalence in the future. We also add a case with '--since' to force the generic walk logic for rev-list even when we make that future change to use the merge-base algorithm on a simple walk. When run on my copy of git.git, I see these results: Test HEAD ---------------------------------------------- 6011.2: merge-base --independent 0.03 6011.3: rev-list --maximal-only 0.06 6011.4: rev-list --maximal-only --since 0.06 These numbers are low, but the --independent calculation is interesting due to having a lot of local branches that are actually independent. Running the same test on a fresh clone of the Linux kernel repository shows a larger difference between the algorithms, especially because the --independent algorithm is extremely fast when there are no independent references selected: Test HEAD ---------------------------------------------- 6011.2: merge-base --independent 0.00 6011.3: rev-list --maximal-only 0.70 6011.4: rev-list --maximal-only --since 0.70 Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano

t/perf/p3400: speed up setup using fast-import

2026-01-30T17:13:47Z

The setup phase in 't/perf/p3400-rebase.sh' generates 100 commits to simulate a noisy history. It currently uses a shell loop that invokes 'git add', 'git commit', 'test_seq', and 'sort' in each iteration. This incurs significant overhead due to repeated process spawning. Optimize the setup by using 'git fast-import' to generate the commit history. Additionally, pre-compute the forward and reversed file contents to avoid repetitive execution of 'seq' and 'sort'. To ensure the test measures rebase performance against a consistent object layout (rather than the suboptimal pack/loose objects created by the raw import), perform a full repack (`git repack -a -d`) at the end of the setup. This reduces the setup time significantly while maintaining the validity of the subsequent performance tests. Performance enhancement (Average value of 5 tests): Real Rebase Before: 29.045s 13.34s After: 21.989s 12.84s Measured on Lenovo Yoga 2020, Ubuntu 24.04. Signed-off-by: Tian Yuchen Signed-off-by: Junio C Hamano

Merge branch 'jk/cat-file-avoid-bitmap-when-unneeded'

2026-01-16T20:40:27Z

Fix for a performance regression in "git cat-file". * jk/cat-file-avoid-bitmap-when-unneeded: cat-file: only use bitmaps when filtering

Merge branch 'jk/t-perf-fixes'

2026-01-16T20:40:26Z

Perf-test fixes. * jk/t-perf-fixes: t/perf/run: preserve GIT_PERF_* from environment t/perf/perf-lib: fix assignment of TEST_OUTPUT_DIRECTORY

cat-file: only use bitmaps when filtering

2026-01-07T00:05:12Z

Commit 8002e8ee18 (builtin/cat-file: use bitmaps to efficiently filter by object type, 2025-04-02) introduced a performance regression when we are not filtering objects: it uses bitmaps even when they won't help, incurring extra costs. For example, running the new perf tests from this commit, which check the performance of listing objects by oid: $ export GIT_PERF_LARGE_REPO=/path/to/linux.git $ git -C "$GIT_PERF_LARGE_REPO" repack -adb $ GIT_SKIP_TESTS=p1006.1 ./run 8002e8ee18^ 8002e8ee18 p1006-cat-file.sh [...] Test 8002e8ee18^ 8002e8ee18 ------------------------------------------------------------------------------- 1006.2: list all objects (sorted) 1.48(1.44+0.04) 6.39(6.35+0.04) +331.8% 1006.3: list all objects (unsorted) 3.01(2.97+0.04) 3.40(3.29+0.10) +13.0% 1006.4: list blobs 4.85(4.67+0.17) 1.68(1.58+0.10) -65.4% An invocation that filters, like listing all blobs (1006.4), does benefit from using the bitmaps; it now doesn't have to check the type of each object from the pack data, so the tradeoff is worth it. But for listing all objects in sorted idx order (1006.2), we otherwise would never open the bitmap nor the revindex file. Worse, our sorting step gets much worse. Normally we append into an array in pack .idx order, and the sort step is trivial. But with bitmaps, we get the objects in pack order, which is apparently random with respect to oid, and have to sort the whole thing. (Note that this freshly-packed state represents the best case for .idx sorting; if we had two packs, then we'd have their objects one after the other and qsort would have to interleave them). The unsorted test in 1006.3 is interesting: there we are going in pack order, so we load the revindex for the pack anyway. And though we don't sort the result, we do use an oidset to check for duplicates. So we can see in the 8002e8ee18^ timings that those two things cost ~1.5s over the sorted case (mostly the oidset hash cost). We also incur the extra cost to open the bitmap file as of 8002e8ee18, which seems to be ~400ms. (This would probably be faster with a bitmap lookup table, but writing that out is not yet the default). So we know that bitmaps help when there's filtering to be done, but otherwise make things worse. Let's only use them when there's a filter. The perf script shows that we've fixed the regressions without hurting the bitmap case: Test 8002e8ee18^ 8002e8ee18 HEAD -------------------------------------------------------------------------------------------------------- 1006.2: list all objects (sorted) 1.56(1.53+0.03) 6.44(6.37+0.06) +312.8% 1.62(1.54+0.06) +3.8% 1006.3: list all objects (unsorted) 3.04(2.98+0.06) 3.45(3.38+0.07) +13.5% 3.04(2.99+0.04) +0.0% 1006.4: list blobs 5.14(4.98+0.15) 1.76(1.68+0.06) -65.8% 1.73(1.64+0.09) -66.3% Note that there's another related case: we might have a filter that cannot be used with bitmaps. That check is handled already for us in for_each_bitmapped_object(), though we'd still load the bitmap and revindex files pointlessly in that case. I don't think it can happen in practice for cat-file, though, since it allows only blob:none, blob:limit, and object:type filters, all of which work with bitmaps. It would be easy-ish to insert an extra check like: can_filter_bitmap(&opt->objects_filter); into the conditional, but I didn't bother here. It would be redundant with the call in for_each_bitmapped_object(), and the can_filter helper function is static local in the bitmap code (so we'd have to make it public). Signed-off-by: Jeff King Signed-off-by: Junio C Hamano

t/perf/run: preserve GIT_PERF_* from environment

2026-01-06T23:56:36Z

If you run: GIT_PERF_LARGE_REPO=/some/path ./p1006-cat-file.sh it will use the repo in /some/path. But if you use the "run" helper script to aggregate and compare results, like this: GIT_PERF_LARGE_REPO=/some/path ./run HEAD^ HEAD p1006-cat-file.sh it will ignore that variable. This is because the presence of the LARGE_REPO variable in GIT-BUILD-OPTIONS overrides what's in the environment. This started with 4638e8806e (Makefile: use common template for GIT-BUILD-OPTIONS, 2024-12-06), which now writes even empty variables (though arguably it was wrong even before with a non-empty value, as we generally prefer the environment to take precedence over on-disk config). We had the same problem in perf-lib.sh itself, and we hacked around it with 32b74b9809 (perf: do allow `GIT_PERF_*` to be overridden again, 2025-04-04). That's what lets the direct invocation of "./p1006" work above. And in fact that was sufficient for "./run", too, until it started loading GIT-BUILD-OPTIONS itself in 5756ccd181 (t/perf: fix benchmarks with out-of-tree builds, 2025-04-28). Now it has the same problem: it clobbers any incoming GIT_PERF options from the environment. We can use the same hack here in the "run" script. It's quite ugly, but it's just short enough that I don't think it's worth trying to factor it out into a common shell library. In the long run, we might consider teaching GIT-BUILD-OPTIONS to be more gentle in overwriting existing entries. There are probably other GIT_TEST_* variables which would need the same treatment. And if and when we come up with a more complete solution, we can use it in both spots. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano

t/perf/perf-lib: fix assignment of TEST_OUTPUT_DIRECTORY

2026-01-06T23:56:36Z

Using the perf suite's "run" helper in a vanilla build fails like this: $ make && (cd t/perf && ./run p0000-perf-lib-sanity.sh) === Running 1 tests in this tree === perf 1 - test_perf_default_repo works: 1 2 3 ok perf 2 - test_checkout_worktree works: 1 2 3 ok ok 3 - test_export works perf 4 - export a weird var: 1 2 3 ok perf 5 - éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś: 1 2 3 ok ok 6 - test_export works with weird vars perf 7 - important variables available in subshells: 1 2 3 ok perf 8 - test-lib-functions correctly loaded in subshells: 1 2 3 ok # passed all 8 test(s) 1..8 cannot open test-results/p0000-perf-lib-sanity.subtests: No such file or directory at ./aggregate.perl line 159. It is trying to aggregate results written into t/perf/test-results, but the p0000 script did not write anything there. The "run" script looks in $TEST_OUTPUT_DIRECTORY/test-results, or if that variable is not set, in test-results in the current working directory (which should be t/perf itself). It pulls the value of $TEST_OUTPUT_DIRECTORY from the GIT-BUILD-OPTIONS file. But that doesn't quite match the setup in perf-lib.sh (which is what scripts like p0000 use). There we do this at the top of the script: TEST_OUTPUT_DIRECTORY=$(pwd) and then let test-lib.sh append "/test-results" to that. Historically, that made the vanilla case work: we'd always use t/perf/test-results. But when $TEST_OUTPUT_DIRECTORY was set, it would break. Commit 5756ccd181 (t/perf: fix benchmarks with out-of-tree builds, 2025-04-28) fixed that second case by loading GIT-BUILD-OPTIONS ourselves. But that broke the vanilla case! Now our setting of $TEST_OUTPUT_DIRECTORY in perf-lib.sh is ignored, because it is overwritten by GIT-BUILD-OPTIONS. And when test-lib.sh sees that the output directory is empty, it defaults to t/test-results, rather than t/perf/test-results. Nobody seems to have noticed, probably for two reasons: 1. It only matters if you're trying to aggregate results (like the "run" script does). Just running "./p0000-perf-lib-sanity.sh" manually still produces useful output; the stored result files are just in an unexpected place. 2. There might be leftover files in t/perf/test-results from previous runs (before 5756ccd181). In particular, the ".subtests" files don't tend to change, and the lack of that file is what causes it to barf completely. So it's possible that the aggregation could have been showing stale results that did not match the run that just happened. We can fix it by setting TEST_OUTPUT_DIRECTORY only after we've loaded GIT-BUILD-OPTIONS, so that we override its value and not the other way around. And we'll do so only when the variable is not set, which should retain the fix for that case from 5756ccd181. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano

show-branch: use prio_queue

2025-12-28T05:01:23Z

Building a list using commit_list_insert_by_date() has quadratic worst case complexity. Avoid it by using prio_queue. Use prio_queue_peek()+prio_queue_replace() instead of prio_queue_get()+ prio_queue_put() if possible, as the former only rebalance the prio_queue heap once instead of twice. In sane repositories this won't make much of a difference because the number of items in the list or queue won't be very high: Benchmark 1: ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo Time (mean ± σ): 538.2 ms ± 0.8 ms [User: 527.6 ms, System: 9.6 ms] Range (min … max): 537.0 ms … 539.2 ms 10 runs Benchmark 2: ./git show-branch origin/main origin/next origin/seen origin/todo Time (mean ± σ): 530.6 ms ± 0.4 ms [User: 519.8 ms, System: 9.8 ms] Range (min … max): 530.1 ms … 531.3 ms 10 runs Summary ./git show-branch origin/main origin/next origin/seen origin/todo ran 1.01 ± 0.00 times faster than ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo That number is not limited, though, and in pathological cases like the one in p6010 we see a sizable improvement: Test v2.52.0 HEAD ------------------------------------------------------------------ 6010.4: git show-branch 2.19(2.19+0.00) 0.03(0.02+0.00) -98.6% Signed-off-by: René Scharfe Signed-off-by: Junio C Hamano

Merge branch 'rs/merge-base-optim'

2025-11-03T14:49:55Z

The code to walk revision graph to compute merge base has been optimized. * rs/merge-base-optim: commit-reach: avoid commit_list_insert_by_date()

commit-reach: avoid commit_list_insert_by_date()

2025-10-24T17:13:17Z

Building a list using commit_list_insert_by_date() has quadratic worst case complexity. Avoid it by just appending in the loop and sorting at the end. The number of merge bases is usually small, so don't expect speedups in normal repositories. It has no limit, though. The added perf test shows a nice improvement when dealing with 16384 merge bases: Test v2.51.1 HEAD ----------------------------------------------------------------- 6010.2: git merge-base 0.55(0.54+0.00) 0.03(0.02+0.00) -94.5% Signed-off-by: René Scharfe Signed-off-by: Junio C Hamano