git/t/perf, branch maint

Merge branch 'jk/cat-file-avoid-bitmap-when-unneeded'

2026-01-16T20:40:27Z

Fix for a performance regression in "git cat-file".

* jk/cat-file-avoid-bitmap-when-unneeded:
  cat-file: only use bitmaps when filtering

Merge branch 'jk/t-perf-fixes'

2026-01-16T20:40:26Z

Perf-test fixes.

* jk/t-perf-fixes:
  t/perf/run: preserve GIT_PERF_* from environment
  t/perf/perf-lib: fix assignment of TEST_OUTPUT_DIRECTORY

cat-file: only use bitmaps when filtering

2026-01-07T00:05:12Z

Commit 8002e8ee18 (builtin/cat-file: use bitmaps to efficiently filter
by object type, 2025-04-02) introduced a performance regression when we
are not filtering objects: it uses bitmaps even when they won't help,
incurring extra costs. For example, running the new perf tests from this
commit, which check the performance of listing objects by oid:

  $ export GIT_PERF_LARGE_REPO=/path/to/linux.git
  $ git -C "$GIT_PERF_LARGE_REPO" repack -adb
  $ GIT_SKIP_TESTS=p1006.1 ./run 8002e8ee18^ 8002e8ee18 p1006-cat-file.sh
  [...]
  Test                                  8002e8ee18^       8002e8ee18
  -------------------------------------------------------------------------------
  1006.2: list all objects (sorted)     1.48(1.44+0.04)   6.39(6.35+0.04) +331.8%
  1006.3: list all objects (unsorted)   3.01(2.97+0.04)   3.40(3.29+0.10) +13.0%
  1006.4: list blobs                    4.85(4.67+0.17)   1.68(1.58+0.10) -65.4%

An invocation that filters, like listing all blobs (1006.4), does
benefit from using the bitmaps; it now doesn't have to check the type of
each object from the pack data, so the tradeoff is worth it.

But for listing all objects in sorted idx order (1006.2), we otherwise
would never open the bitmap nor the revindex file. Worse, our sorting
step gets much worse. Normally we append into an array in pack .idx
order, and the sort step is trivial. But with bitmaps, we get the
objects in pack order, which is apparently random with respect to oid,
and have to sort the whole thing. (Note that this freshly-packed state
represents the best case for .idx sorting; if we had two packs, then
we'd have their objects one after the other and qsort would have to
interleave them).

The unsorted test in 1006.3 is interesting: there we are going in pack
order, so we load the revindex for the pack anyway. And though we don't
sort the result, we do use an oidset to check for duplicates. So we can
see in the 8002e8ee18^ timings that those two things cost ~1.5s over the
sorted case (mostly the oidset hash cost). We also incur the extra cost
to open the bitmap file as of 8002e8ee18, which seems to be ~400ms.
(This would probably be faster with a bitmap lookup table, but writing
that out is not yet the default).

So we know that bitmaps help when there's filtering to be done, but
otherwise make things worse. Let's only use them when there's a filter.

The perf script shows that we've fixed the regressions without hurting
the bitmap case:

  Test                                  8002e8ee18^       8002e8ee18                HEAD
  --------------------------------------------------------------------------------------------------------
  1006.2: list all objects (sorted)     1.56(1.53+0.03)   6.44(6.37+0.06) +312.8%   1.62(1.54+0.06) +3.8%
  1006.3: list all objects (unsorted)   3.04(2.98+0.06)   3.45(3.38+0.07) +13.5%    3.04(2.99+0.04) +0.0%
  1006.4: list blobs                    5.14(4.98+0.15)   1.76(1.68+0.06) -65.8%    1.73(1.64+0.09) -66.3%

Note that there's another related case: we might have a filter that
cannot be used with bitmaps. That check is handled already for us in
for_each_bitmapped_object(), though we'd still load the bitmap and
revindex files pointlessly in that case. I don't think it can happen in
practice for cat-file, though, since it allows only blob:none,
blob:limit, and object:type filters, all of which work with bitmaps.

It would be easy-ish to insert an extra check like:

  can_filter_bitmap(&opt->objects_filter);

into the conditional, but I didn't bother here. It would be redundant
with the call in for_each_bitmapped_object(), and the can_filter helper
function is static local in the bitmap code (so we'd have to make it
public).

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
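
The percentage column in the perf tables of this message appears to be the relative change of elapsed time against the first column. The sketch below recomputes the three figures from the first table under that assumption; rel_change is a hypothetical helper written for this note, not something provided by t/perf.

```shell
#!/bin/sh
# rel_change OLD NEW: print the relative change of NEW against OLD as a
# signed percentage with one decimal place.
# (Hypothetical helper for illustration; not part of t/perf.)
rel_change () {
	awk -v old="$1" -v new="$2" 'BEGIN {
		printf "%+.1f%%\n", (new - old) / old * 100
	}'
}

# Elapsed times of 8002e8ee18^ and 8002e8ee18 from the first table.
rel_change 1.48 6.39	# 1006.2: list all objects (sorted)
rel_change 3.01 3.40	# 1006.3: list all objects (unsorted)
rel_change 4.85 1.68	# 1006.4: list blobs
```

Read this way, a positive value is a slowdown and a negative value a speedup relative to the first column.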

t/perf/run: preserve GIT_PERF_* from environment

2026-01-06T23:56:36Z

If you run:

  GIT_PERF_LARGE_REPO=/some/path ./p1006-cat-file.sh

it will use the repo in /some/path. But if you use the "run" helper
script to aggregate and compare results, like this:

  GIT_PERF_LARGE_REPO=/some/path ./run HEAD^ HEAD p1006-cat-file.sh

it will ignore that variable. This is because the presence of the
LARGE_REPO variable in GIT-BUILD-OPTIONS overrides what's in the
environment. This started with 4638e8806e (Makefile: use common template
for GIT-BUILD-OPTIONS, 2024-12-06), which now writes even empty
variables (though arguably it was wrong even before with a non-empty
value, as we generally prefer the environment to take precedence over
on-disk config).

We had the same problem in perf-lib.sh itself, and we hacked around it
with 32b74b9809 (perf: do allow `GIT_PERF_*` to be overridden again,
2025-04-04). That's what lets the direct invocation of "./p1006" work
above.

And in fact that was sufficient for "./run", too, until it started
loading GIT-BUILD-OPTIONS itself in 5756ccd181 (t/perf: fix benchmarks
with out-of-tree builds, 2025-04-28). Now it has the same problem: it
clobbers any incoming GIT_PERF options from the environment.

We can use the same hack here in the "run" script. It's quite ugly, but
it's just short enough that I don't think it's worth trying to factor it
out into a common shell library.

In the long run, we might consider teaching GIT-BUILD-OPTIONS to be more
gentle in overwriting existing entries. There are probably other
GIT_TEST_* variables which would need the same treatment. And if and
when we come up with a more complete solution, we can use it in both
spots.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
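
The general idea of letting the environment win over a sourced options file can be sketched as below. This is an illustration only and not the code from "run" or perf-lib.sh: the function load_options_preserving_env, the saved_ prefix, and the temporary stand-in for GIT-BUILD-OPTIONS are all assumptions made for the example. It also assumes that no variable value contains a line starting with "GIT_PERF_".

```shell
#!/bin/sh
# Stand-in for GIT-BUILD-OPTIONS (hypothetical contents). Note the empty
# assignment, which is what clobbers the caller's value.
options_file=$(mktemp)
cat >"$options_file" <<\EOF
GIT_PERF_LARGE_REPO=''
GIT_PERF_REPEAT_COUNT='3'
EOF

# Source the options file, then restore every GIT_PERF_* variable that
# was present in the environment beforehand.
load_options_preserving_env () {
	# Names of the exported GIT_PERF_* variables.
	names=$(env | sed -n 's/^\(GIT_PERF_[A-Za-z0-9_]*\)=.*/\1/p')
	for name in $names
	do
		# Stash the current value in saved_<name>.
		eval "saved_$name=\$$name"
	done
	. "$1"
	for name in $names
	do
		# Put the caller's value back over the file's value.
		eval "$name=\$saved_$name"
	done
}

GIT_PERF_LARGE_REPO=/some/path
export GIT_PERF_LARGE_REPO
load_options_preserving_env "$options_file"
echo "large repo: $GIT_PERF_LARGE_REPO"
echo "repeat count: $GIT_PERF_REPEAT_COUNT"
rm -f "$options_file"
```

Values that were only set by the options file are left alone, so defaults from the build still apply when the caller did not override them.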

t/perf/perf-lib: fix assignment of TEST_OUTPUT_DIRECTORY

2026-01-06T23:56:36Z

Using the perf suite's "run" helper in a vanilla build fails like this:

  $ make && (cd t/perf && ./run p0000-perf-lib-sanity.sh)
  === Running 1 tests in this tree ===
  perf 1 - test_perf_default_repo works: 1 2 3 ok
  perf 2 - test_checkout_worktree works: 1 2 3 ok
  ok 3 - test_export works
  perf 4 - export a weird var: 1 2 3 ok
  perf 5 - éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś: 1 2 3 ok
  ok 6 - test_export works with weird vars
  perf 7 - important variables available in subshells: 1 2 3 ok
  perf 8 - test-lib-functions correctly loaded in subshells: 1 2 3 ok
  # passed all 8 test(s)
  1..8
  cannot open test-results/p0000-perf-lib-sanity.subtests: No such file or directory at ./aggregate.perl line 159.

It is trying to aggregate results written into t/perf/test-results, but
the p0000 script did not write anything there.

The "run" script looks in $TEST_OUTPUT_DIRECTORY/test-results, or if
that variable is not set, in test-results in the current working
directory (which should be t/perf itself). It pulls the value of
$TEST_OUTPUT_DIRECTORY from the GIT-BUILD-OPTIONS file.

But that doesn't quite match the setup in perf-lib.sh (which is what
scripts like p0000 use). There we do this at the top of the script:

  TEST_OUTPUT_DIRECTORY=$(pwd)

and then let test-lib.sh append "/test-results" to that. Historically,
that made the vanilla case work: we'd always use t/perf/test-results.
But when $TEST_OUTPUT_DIRECTORY was set, it would break.

Commit 5756ccd181 (t/perf: fix benchmarks with out-of-tree builds,
2025-04-28) fixed that second case by loading GIT-BUILD-OPTIONS
ourselves. But that broke the vanilla case! Now our setting of
$TEST_OUTPUT_DIRECTORY in perf-lib.sh is ignored, because it is
overwritten by GIT-BUILD-OPTIONS. And when test-lib.sh sees that the
output directory is empty, it defaults to t/test-results, rather than
t/perf/test-results.

Nobody seems to have noticed, probably for two reasons:

  1. It only matters if you're trying to aggregate results (like the
     "run" script does). Just running "./p0000-perf-lib-sanity.sh"
     manually still produces useful output; the stored result files are
     just in an unexpected place.

  2. There might be leftover files in t/perf/test-results from previous
     runs (before 5756ccd181). In particular, the ".subtests" files
     don't tend to change, and the lack of that file is what causes it
     to barf completely. So it's possible that the aggregation could
     have been showing stale results that did not match the run that
     just happened.

We can fix it by setting TEST_OUTPUT_DIRECTORY only after we've loaded
GIT-BUILD-OPTIONS, so that we override its value and not the other way
around. And we'll do so only when the variable is not set, which should
retain the fix for that case from 5756ccd181.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
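
The ordering described in the last paragraph of this message (load the options first, then fall back to the current directory only when the value is empty) can be sketched as follows. The function resolve_results_dir and the two temporary options files are hypothetical stand-ins for perf-lib.sh and GIT-BUILD-OPTIONS; the real scripts are not reproduced here.

```shell
#!/bin/sh
# Print the results directory that would be used after loading the
# options file given as $1. Runs in a subshell so the caller's
# variables are untouched.
resolve_results_dir () {
	(
		unset TEST_OUTPUT_DIRECTORY
		# Load the build options first ...
		. "$1"
		# ... and only then apply the default, so a value from the
		# options file survives and an empty one is replaced.
		if test -z "$TEST_OUTPUT_DIRECTORY"
		then
			TEST_OUTPUT_DIRECTORY=$(pwd)
		fi
		printf '%s\n' "$TEST_OUTPUT_DIRECTORY/test-results"
	)
}

# Vanilla build: the variable is written out, but empty.
vanilla=$(mktemp)
echo "TEST_OUTPUT_DIRECTORY=''" >"$vanilla"

# Build with an explicit output directory (example path).
custom=$(mktemp)
echo "TEST_OUTPUT_DIRECTORY='/tmp/perf-out'" >"$custom"

vanilla_dir=$(resolve_results_dir "$vanilla")
custom_dir=$(resolve_results_dir "$custom")
echo "vanilla: $vanilla_dir"
echo "custom:  $custom_dir"
rm -f "$vanilla" "$custom"
```

Assigning the default before sourcing the file would reproduce the breakage described above, because the empty assignment in the file would then win.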

show-branch: use prio_queue

2025-12-28T05:01:23Z

Building a list using commit_list_insert_by_date() has quadratic worst
case complexity. Avoid it by using prio_queue.

Use prio_queue_peek()+prio_queue_replace() instead of
prio_queue_get()+prio_queue_put() if possible, as the former only
rebalance the prio_queue heap once instead of twice.

In sane repositories this won't make much of a difference because the
number of items in the list or queue won't be very high:

  Benchmark 1: ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo
    Time (mean ± σ):     538.2 ms ±   0.8 ms    [User: 527.6 ms, System: 9.6 ms]
    Range (min … max):   537.0 ms … 539.2 ms    10 runs

  Benchmark 2: ./git show-branch origin/main origin/next origin/seen origin/todo
    Time (mean ± σ):     530.6 ms ±   0.4 ms    [User: 519.8 ms, System: 9.8 ms]
    Range (min … max):   530.1 ms … 531.3 ms    10 runs

  Summary
    ./git show-branch origin/main origin/next origin/seen origin/todo ran
      1.01 ± 0.00 times faster than ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo

That number is not limited, though, and in pathological cases like the
one in p6010 we see a sizable improvement:

  Test                     v2.52.0           HEAD
  ------------------------------------------------------------------
  6010.4: git show-branch  2.19(2.19+0.00)   0.03(0.02+0.00) -98.6%

Signed-off-by: René Scharfe
Signed-off-by: Junio C Hamano

Merge branch 'rs/merge-base-optim'

2025-11-03T14:49:55Z

The code to walk revision graph to compute merge base has been
optimized.

* rs/merge-base-optim:
  commit-reach: avoid commit_list_insert_by_date()

commit-reach: avoid commit_list_insert_by_date()

2025-10-24T17:13:17Z

Building a list using commit_list_insert_by_date() has quadratic worst
case complexity. Avoid it by just appending in the loop and sorting at
the end.

The number of merge bases is usually small, so don't expect speedups in
normal repositories. It has no limit, though. The added perf test shows
a nice improvement when dealing with 16384 merge bases:

  Test                   v2.51.1           HEAD
  -----------------------------------------------------------------
  6010.2: git merge-base 0.55(0.54+0.00)   0.03(0.02+0.00) -94.5%

Signed-off-by: René Scharfe
Signed-off-by: Junio C Hamano

t/perf: add last-modified perf script

2025-08-28T23:44:58Z

This just runs some simple last-modified commands. We already test
correctness in the regular suite, so this is just about finding
performance regressions from one version to another.

Based-on-patch-by: Jeff King
Signed-off-by: Toon Claes
Signed-off-by: Junio C Hamano

Merge branch 'rs/pop-recent-commit-with-prio-queue'

2025-07-28T19:02:34Z

The pop_most_recent_commit() function can have quite expensive worst
case performance characteristics, which has been optimized by using
prio-queue data structure.

* rs/pop-recent-commit-with-prio-queue:
  commit: use prio_queue_replace() in pop_most_recent_commit()
  prio-queue: add prio_queue_replace()
  commit: convert pop_most_recent_commit() to prio_queue