git/reftable, branch v2.45.4

Merge branch 'ps/reftable-block-iteration-optim'

2024-04-23T18:52:37Z

The code to iterate over reftable blocks has seen some optimization to reduce memory allocation and deallocation. * ps/reftable-block-iteration-optim: reftable/block: avoid copying block iterators on seek reftable/block: reuse `zstream` state on inflation reftable/block: open-code call to `uncompress2()` reftable/block: reuse uncompressed blocks reftable/reader: iterate to next block in place reftable/block: move ownership of block reader into `struct table_iter` reftable/block: introduce `block_reader_release()` reftable/block: better grouping of functions reftable/block: merge `block_iter_seek()` and `block_reader_seek()` reftable/block: rename `block_reader_start()`

Merge branch 'jt/reftable-geometric-compaction'

2024-04-16T21:50:30Z

The strategy to compact multiple tables of reftables after many operations accumulate many entries has been improved to avoid accumulating too many tables uncollected. * jt/reftable-geometric-compaction: reftable/stack: use geometric table compaction reftable/stack: add env to disable autocompaction reftable/stack: expose option to disable auto-compaction

reftable/block: avoid copying block iterators on seek

2024-04-15T17:37:59Z

When seeking a reftable record in a block we need to position the iterator _before_ the sought-after record so that the next call to `block_iter_next()` would yield that record. To achieve this, the loop that performs the linear seeks to restore the previous position once it has found the record. This is done by advancing two `block_iter`s: one to check whether the next record is our sought-after record, and one that we update after every iteration. This of course involves quite a lot of copying and also leads to needless memory allocations. Refactor the code to get rid of the `next` iterator and the copying this involves. Instead, we can restore the previous offset such that the call to `next` will return the correct record. Next to being simpler conceptually this also leads to a nice speedup. The following benchmark parser 10k refs out of 100k existing refs via `git-rev-list --no-walk`: Benchmark 1: rev-list: print many refs (HEAD~) Time (mean ± σ): 170.2 ms ± 1.7 ms [User: 86.1 ms, System: 83.6 ms] Range (min … max): 166.4 ms … 180.3 ms 500 runs Benchmark 2: rev-list: print many refs (HEAD~) Time (mean ± σ): 161.6 ms ± 1.6 ms [User: 78.1 ms, System: 83.0 ms] Range (min … max): 158.4 ms … 172.3 ms 500 runs Summary rev-list: print many refs (HEAD) ran 1.05 ± 0.01 times faster than rev-list: print many refs (HEAD~) Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano

reftable/block: reuse `zstream` state on inflation

2024-04-15T17:36:09Z

When calling `inflateInit()` and `inflate()`, the zlib library will allocate several data structures for the underlying `zstream` to keep track of various information. Thus, when inflating repeatedly, it is possible to optimize memory allocation patterns by reusing the `zstream` and then calling `inflateReset()` on it to prepare it for the next chunk of data to inflate. This is exactly what the reftable code is doing: when iterating through reflogs we need to potentially inflate many log blocks, but we discard the `zstream` every single time. Instead, as we reuse the `block_reader` for each of the blocks anyway, we can initialize the `zstream` once and then reuse it for subsequent inflations. Refactor the code to do so, which leads to a significant reduction in the number of allocations. The following measurements were done when iterating through 1 million reflog entries. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 23,028 allocs, 22,906 frees, 162,813,552 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 302 allocs, 180 frees, 88,352 bytes allocated Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano

reftable/block: open-code call to `uncompress2()`

2024-04-15T17:36:09Z

The reftable format stores log blocks in a compressed format. Thus, whenever we want to read such a block we first need to decompress it. This is done by calling the convenience function `uncompress2()` of the zlib library, which is a simple wrapper that manages the lifecycle of the `zstream` structure for us. While nice for one-off inflation of data, when iterating through reflogs we will likely end up inflating many such log blocks. This requires us to reallocate the state of the `zstream` every single time, which adds up over time. It would thus be great to reuse the `zstream` instead of discarding it after every inflation. Open-code the call to `uncompress2()` such that we can start reusing the `zstream` in the subsequent commit. Note that our open-coded variant is different from `uncompress2()` in two ways: - We do not loop around `inflate()` until we have processed all input. As our input is limited by the maximum block size, which is 16MB, we should not hit limits of `inflate()`. - We use `Z_FINISH` instead of `Z_NO_FLUSH`. Quoting the `inflate()` documentation: "inflate() should normally be called until it returns Z_STREAM_END or an error. However if all decompression is to be performed in a single step (a single call of inflate), the parameter flush should be set to Z_FINISH." Furthermore, "Z_FINISH also informs inflate to not maintain a sliding window if the stream completes, which reduces inflate's memory footprint." Other than that this commit is expected to be functionally equivalent and does not yet reuse the `zstream`. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano

reftable/block: reuse uncompressed blocks

2024-04-15T17:36:09Z

The reftable backend stores reflog entries in a compressed format and thus needs to uncompress blocks before one can read records from it. For each reflog block we thus have to allocate an array that we can decompress the block contents into. This block is being discarded whenever the table iterator moves to the next block. Consequently, we reallocate a new array on every block, which is quite wasteful. Refactor the code to reuse the uncompressed block data when moving the block reader to a new block. This significantly reduces the number of allocations when iterating through many compressed blocks. The following measurements are done with `git reflog list` when listing 100k reflogs. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 45,755 allocs, 45,633 frees, 254,779,456 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 23,028 allocs, 22,906 frees, 162,813,547 bytes allocated Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano

reftable/reader: iterate to next block in place

2024-04-15T17:36:09Z

The table iterator has to iterate towards the next block once it has yielded all records of the current block. This is done by creating a new table iterator, initializing it to the next block, releasing the old iterator and then copying over the data. Refactor the code to instead advance the table iterator in place. This is simpler and unlocks some optimizations in subsequent patches. Also, it allows us to avoid some allocations. The following measurements show a single matching ref out of 1 million refs. Before this change: HEAP SUMMARY: in use at exit: 13,603 bytes in 125 blocks total heap usage: 7,235 allocs, 7,110 frees, 301,481 bytes allocated After: HEAP SUMMARY: in use at exit: 13,603 bytes in 125 blocks total heap usage: 315 allocs, 190 frees, 107,027 bytes allocated Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano

reftable/block: move ownership of block reader into `struct table_iter`

2024-04-15T17:36:09Z

The table iterator allows the caller to iterate through all records in a reftable table. To do so it iterates through all blocks of the desired type one by one, where for each block it creates a new block iterator and yields all its entries. One of the things that is somewhat confusing in this context is who owns the block reader that is being used to read the blocks and pass them to the block iterator. Intuitively, as the table iterator is responsible for iterating through the blocks, one would assume that this iterator is also responsible for managing the lifecycle of the reader. And while it somewhat is, the block reader is ultimately stored inside of the block iterator. Refactor the code such that the block reader is instead fully managed by the table iterator. Instead of passing the reader to the block iterator, we now only end up passing the block data to it. Despite clearing up the lifecycle of the reader, it will also allow for better reuse of the reader in subsequent patches. The following benchmark prints a single matching ref out of 1 million refs. Before: HEAP SUMMARY: in use at exit: 13,603 bytes in 125 blocks total heap usage: 6,607 allocs, 6,482 frees, 509,635 bytes allocated After: HEAP SUMMARY: in use at exit: 13,603 bytes in 125 blocks total heap usage: 7,235 allocs, 7,110 frees, 301,481 bytes allocated Note that while there are more allocation and free calls now, the overall number of bytes allocated is significantly lower. The number of allocations will be reduced significantly by the next patch though. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano

reftable/block: introduce `block_reader_release()`

2024-04-15T17:36:09Z

Introduce a new function `block_reader_release()` that releases resources acquired by the block reader. This function will be extended in a subsequent commit. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano

reftable/block: better grouping of functions

2024-04-15T17:36:09Z

Function definitions and declaration of `struct block_reader` and `struct block_iter` are somewhat mixed up, making it hard to see which functions belong together. Rearrange them. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano