git/refs, branch v2.45.3

reftable/stack: add env to disable autocompaction

2024-04-08T19:11:10Z

In future tests it will be neccesary to create repositories with a set number of tables. To make this easier, introduce the `GIT_TEST_REFTABLE_AUTOCOMPACTION` environment variable that, when set to false, disables autocompaction of reftables. Signed-off-by: Justin Tobler Acked-by: Patrick Steinhardt Signed-off-by: Junio C Hamano

Merge branch 'ps/pack-refs-auto' into jt/reftable-geometric-compaction

2024-04-05T17:34:23Z

* ps/pack-refs-auto: builtin/gc: pack refs when using `git maintenance run --auto` builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs t6500: extract objects with "17" prefix builtin/gc: move `struct maintenance_run_opts` builtin/pack-refs: introduce new "--auto" flag builtin/pack-refs: release allocated memory refs/reftable: expose auto compaction via new flag refs: remove `PACK_REFS_ALL` flag refs: move `struct pack_refs_opts` to where it's used t/helper: drop pack-refs wrapper refs/reftable: print errors on compaction failure reftable/stack: gracefully handle failed auto-compaction due to locks reftable/stack: use error codes when locking fails during compaction reftable/error: discern locked/outdated errors reftable/stack: fix error handling in `reftable_stack_init_addition()`

refs/reftable: expose auto compaction via new flag

2024-03-25T16:54:07Z

Under normal circumstances, the "reftable" backend will automatically perform compaction after appending to the stack. It is thus not necessary and may even be considered wasteful to run git-pack-refs(1) in "reftable"-backed repositories as it will cause the backend to compact all tables into a single one. We do exactly that though when running `git maintenance run --auto` or `git gc --auto`, which gets spawned by Git after running some specific commands. The `--auto` mode is typically only executing optimizations as needed. To do so, we already use several heuristics for the various different data structures in Git to determine whether to optimize them or not. We do not use any heuristics for refs though and instead always optimize them. Introduce a new `PACK_REFS_AUTO` flag that can be passed to the backend. When not handled by the backend we will continue to behave the exact same as we do right now, that is we optimize refs unconditionally. This is done for the "files" backend for now to retain current behaviour, even though we may eventually also want to introduce heuristics here. For the "reftable" backend though we already do have auto-compaction, so we can easily reuse that logic to implement the new auto-packing flag. Note that under normal circumstances, this should always end up being a no-op. After all, we already invoke the code for every single addition to the stack. But there are special cases where it can still be helpful to execute the auto-compaction code explicitly: - Concurrent writers may cause compaction to not run due to locks. - Callers may decide to disable compaction altogether and then pack refs at a later point due to various reasons. - Other implementations of the reftable format may do compaction differently or even not at all. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano

refs/reftable: print errors on compaction failure

2024-03-25T16:54:07Z

When git-pack-refs(1) fails in the reftable backend we end up printing no error message at all, leaving the caller puzzled as to why compaction has failed. Fix this. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano

Merge branch 'ps/reftable-reflog-iteration-perf'

2024-03-21T21:55:13Z

The code to iterate over reflogs in the reftable has been optimized to reduce memory allocation and deallocation. Reviewed-by: Josh Steadmon cf. * ps/reftable-reflog-iteration-perf: refs/reftable: track last log record name via strbuf reftable/record: use scratch buffer when decoding records reftable/record: reuse message when decoding log records reftable/record: reuse refnames when decoding log records reftable/record: avoid copying author info reftable/record: convert old and new object IDs to arrays refs/reftable: reload correct stack when creating reflog iter

Merge branch 'ps/reftable-iteration-perf-part2'

2024-03-14T21:05:23Z

The code to iterate over refs with the reftable backend has seen some optimization. * ps/reftable-iteration-perf-part2: refs/reftable: precompute prefix length reftable: allow inlining of a few functions reftable/record: decode keys in place reftable/record: reuse refname when copying reftable/record: reuse refname when decoding reftable/merged: avoid duplicate pqueue emptiness check reftable/merged: circumvent pqueue with single subiter reftable/merged: handle subiter cleanup on close only reftable/merged: remove unnecessary null check for subiters reftable/merged: make subiters own their records reftable/merged: advance subiter on subsequent iteration reftable/merged: make `merged_iter` structure private reftable/pq: use `size_t` to track iterator index

Merge branch 'ps/reftable-repo-init-fix'

2024-03-07T23:59:42Z

Clear the fallout from a fix for 2.44 regression. * ps/reftable-repo-init-fix: t0610: remove unused variable assignment refs/reftable: don't fail empty transactions in repo without HEAD

Merge branch 'kn/for-all-refs'

2024-03-05T17:44:44Z

"git for-each-ref" learned "--include-root-refs" option to show even the stuff outside the 'refs/' hierarchy. * kn/for-all-refs: for-each-ref: add new option to include root refs ref-filter: rename 'FILTER_REFS_ALL' to 'FILTER_REFS_REGULAR' refs: introduce `refs_for_each_include_root_refs()` refs: extract out `loose_fill_ref_dir_regular_file()` refs: introduce `is_pseudoref()` and `is_headref()`

refs/reftable: track last log record name via strbuf

2024-03-05T17:10:07Z

The reflog iterator enumerates all reflogs known to a ref backend. In the "reftable" backend there is no way to list all existing reflogs directly. Instead, we have to iterate through all reflog entries and discard all those redundant entries for which we have already returned a reflog entry. This logic is implemented by tracking the last reflog name that we have emitted to the iterator's user. If the next log record has the same name we simply skip it until we find another record with a different refname. This last reflog name is stored in a simple C string, which requires us to free and reallocate it whenever we need to update the reflog name. Convert it to use a `struct strbuf` instead, which reduces the number of allocations. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 1,068,485 allocs, 1,068,363 frees, 281,122,886 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 68,485 allocs, 68,363 frees, 256,234,072 bytes allocated Note that even after this change we still allocate quite a lot of data, even though the number of allocations does not scale with the number of log records anymore. This remainder comes mostly from decompressing the log blocks, where we decompress each block into newly allocated memory. This will be addressed at a later point in time. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano

reftable/record: convert old and new object IDs to arrays

2024-03-05T17:10:06Z

In 7af607c58d (reftable/record: store "val1" hashes as static arrays, 2024-01-03) and b31e3cc620 (reftable/record: store "val2" hashes as static arrays, 2024-01-03) we have converted ref records to store their object IDs in a static array. Convert log records to do the same so that their old and new object IDs are arrays, too. This change results in two allocations less per log record that we're iterating over. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 8,068,495 allocs, 8,068,373 frees, 401,011,862 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 6,068,489 allocs, 6,068,367 frees, 361,011,822 bytes allocated Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano