<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/pack-bitmap.h, branch v2.41.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/</subtitle>
<id>https://git.shady.money/git/atom?h=v2.41.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.41.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2023-05-02T15:48:22Z</updated>
<entry>
<title>fsck: verify checksums of all .bitmap files</title>
<updated>2023-05-02T15:48:22Z</updated>
<author>
<name>Derrick Stolee</name>
<email>derrickstolee@github.com</email>
</author>
<published>2023-05-02T13:27:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=756f1bcd29a711075cbaecb02c923328e86a41a4'/>
<id>urn:sha1:756f1bcd29a711075cbaecb02c923328e86a41a4</id>
<content type='text'>
If a filesystem-level corruption occurs in a .bitmap file, Git can react
poorly. This could take the form of a run-time error due to failing to
parse an EWAH bitmap or be more subtle such as returning the wrong set
of objects to a fetch or clone.

A natural first response to either of these kinds of errors is to run
'git fsck' to see if any files are corrupt. This currently ignores all
.bitmap files.

Add checks to 'git fsck' for all .bitmap files that are currently
associated with a multi-pack-index or pack file. Verify their checksums
using the hashfile API.
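
As a rough illustration of what such a checksum check amounts to, here
is a minimal Python sketch. It is not the hashfile API itself. It
assumes a SHA-1 repository, where the file ends with a 20-byte trailer
holding the hash of everything before it; `trailer_matches` and
`make_file` are hypothetical names used only for this example.

```python
import hashlib

HASH_LEN = 20  # assumption: SHA-1 repository, so a 20-byte trailer


def trailer_matches(data):
    """Return True if the last HASH_LEN bytes are the SHA-1 of the rest."""
    body, trailer = data[:-HASH_LEN], data[-HASH_LEN:]
    if len(trailer) != HASH_LEN:
        return False  # too short to hold a checksum at all
    return hashlib.sha1(body).digest() == trailer


def make_file(body):
    """Hypothetical helper: append a valid trailer to some contents."""
    return body + hashlib.sha1(body).digest()
```

A single flipped byte anywhere in the body changes the computed hash,
so the comparison with the stored trailer fails and the corruption is
reported without parsing any bitmap.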

We iterate through all multi-pack-indexes and pack-files to be sure to
check all .bitmap files, not just the one that would be read by the
process. For example, a multi-pack-index bitmap overrules a pack-bitmap.
However, if the multi-pack-index is removed, the pack-bitmap may be
selected instead. Be thorough to include every file that could become
active in such a way. This includes checking files in alternates.

We could extend this effort to check the structure of the reachability
bitmaps themselves, but doing so is very expensive. At minimum, it's as
expensive as generating the
bitmaps in the first place, and that's assuming that we don't use the
trivial algorithm of verifying each bitmap individually. The trivial
algorithm will result in quadratic behavior (number of objects times
number of bitmapped commits) while the bitmap building operation
constructs a lattice of commits to build bitmaps incrementally and then
generate the final bitmaps from a subset of those commits.

If we were to extend 'git fsck' to check .bitmap file contents more
closely like this, then we would likely want to hide it behind an option
that signals the user is more willing to do expensive operations such as
this.

For testing, set up a repository with a pack-bitmap _and_ a
multi-pack-index bitmap. This requires some file movement to avoid
deleting the pack-bitmap during the repack that creates the
multi-pack-index bitmap. We can then verify that 'git fsck' is checking
all files, not just the "active" bitmap.

Signed-off-by: Derrick Stolee &lt;derrickstolee@github.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap: prepare to read lookup table extension</title>
<updated>2022-08-26T17:13:58Z</updated>
<author>
<name>Abhradeep Chakraborty</name>
<email>chakrabortyabhradeep79@gmail.com</email>
</author>
<published>2022-08-14T16:55:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=28cd730680dd7d5c0e0971827315bf8ae32f47b7'/>
<id>urn:sha1:28cd730680dd7d5c0e0971827315bf8ae32f47b7</id>
<content type='text'>
An earlier change taught Git to write the bitmap lookup table, but Git
does not yet know how to parse it.

Teach Git to parse the bitmap lookup table. Older versions of Git are
not affected by it; those versions ignore the lookup table.

Mentored-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Co-Mentored-by: Kaartic Sivaraam &lt;kaartic.sivaraam@gmail.com&gt;
Signed-off-by: Abhradeep Chakraborty &lt;chakrabortyabhradeep79@gmail.com&gt;
Reviewed-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap-write.c: write lookup table extension</title>
<updated>2022-08-26T17:13:50Z</updated>
<author>
<name>Abhradeep Chakraborty</name>
<email>chakrabortyabhradeep79@gmail.com</email>
</author>
<published>2022-08-14T16:55:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=93eb41e2403788fa9105211956e87b6b2c22c68c'/>
<id>urn:sha1:93eb41e2403788fa9105211956e87b6b2c22c68c</id>
<content type='text'>
The bitmap lookup table extension was documented by an earlier
change, but Git does not yet know how to write that extension.

Teach Git to write the bitmap lookup table extension. The table contains
a list of `N` `&lt;commit_pos, offset, xor_row&gt;` triplets, sorted by
commit position in ascending order. The fields of the i'th triplet are
described below:

  - commit_pos stores commit position (in the pack-index or midx).
    It is a 4 byte network byte order unsigned integer.

  - offset is the position (in the bitmap file) from which that
    commit's bitmap can be read.

  - xor_row is the position of the triplet in the lookup table
    whose bitmap is used to compress this bitmap, or `0xffffffff`
    if no such bitmap exists.
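
The layout above can be sketched in Python as follows. This is only an
illustration, not the writer added by this change. It assumes that
offset is an 8-byte and xor_row a 4-byte unsigned integer, both in
network byte order like commit_pos; `parse_lookup_table` and
`find_commit` are hypothetical names.

```python
import bisect
import struct

# Assumed on-disk widths: 4-byte commit_pos, 8-byte offset, 4-byte xor_row,
# all unsigned and in network byte order ("!" in struct notation).
TRIPLET = struct.Struct("!IQI")
NO_XOR = 0xFFFFFFFF  # xor_row value meaning "not compressed against another"


def parse_lookup_table(raw, n):
    """Decode n consecutive triplets from raw bytes into a list of tuples."""
    return [TRIPLET.unpack_from(raw, i * TRIPLET.size) for i in range(n)]


def find_commit(table, commit_pos):
    """Binary-search the table (sorted by commit_pos) for one commit."""
    positions = [entry[0] for entry in table]
    i = bisect.bisect_left(positions, commit_pos)
    if i != len(positions) and positions[i] == commit_pos:
        return table[i]
    return None  # this commit has no bitmap
```

Because the triplets are sorted by commit_pos, a reader can locate one
commit's bitmap by binary search and then seek to its offset, instead
of scanning every bitmap in the file.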

Mentored-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Co-mentored-by: Kaartic Sivaraam &lt;kaartic.sivaraam@gmail.com&gt;
Co-authored-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Abhradeep Chakraborty &lt;chakrabortyabhradeep79@gmail.com&gt;
Reviewed-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap-write: use const for hashes</title>
<updated>2022-07-19T15:38:17Z</updated>
<author>
<name>Derrick Stolee</name>
<email>derrickstolee@github.com</email>
</author>
<published>2022-07-19T15:26:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5766524956714d51b131d826a2894c85949e5770'/>
<id>urn:sha1:5766524956714d51b131d826a2894c85949e5770</id>
<content type='text'>
The next change will use a const array when calling this method. There
is no need for the non-const version, so let's do this cleanup quickly.

Signed-off-by: Derrick Stolee &lt;derrickstolee@github.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap: drop filter in prepare_bitmap_walk()</title>
<updated>2022-03-09T18:25:27Z</updated>
<author>
<name>Derrick Stolee</name>
<email>derrickstolee@github.com</email>
</author>
<published>2022-03-09T16:01:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=09d4a79effac002399557392e21c9f8829056ca3'/>
<id>urn:sha1:09d4a79effac002399557392e21c9f8829056ca3</id>
<content type='text'>
Now that all consumers of prepare_bitmap_walk() have populated the
'filter' member of 'struct rev_info', we can drop that extra parameter
from the method and access it directly from the 'struct rev_info'.

Signed-off-by: Derrick Stolee &lt;derrickstolee@github.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'tb/repack-write-midx'</title>
<updated>2021-10-18T22:47:57Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-10-18T22:47:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0b69bb0fb1ebe1a9ab7a3f4bfde5cad82eb892e3'/>
<id>urn:sha1:0b69bb0fb1ebe1a9ab7a3f4bfde5cad82eb892e3</id>
<content type='text'>
"git repack" has been taught to generate multi-pack reachability
bitmaps.

* tb/repack-write-midx:
  test-read-midx: fix leak of bitmap_index struct
  builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
  builtin/repack.c: make largest pack preferred
  builtin/repack.c: support writing a MIDX while repacking
  builtin/repack.c: extract showing progress to a variable
  builtin/repack.c: rename variables that deal with non-kept packs
  builtin/repack.c: keep track of existing packs unconditionally
  midx: preliminary support for `--refs-snapshot`
  builtin/multi-pack-index.c: support `--stdin-packs` mode
  midx: expose `write_midx_file_only()` publicly
</content>
</entry>
<entry>
<title>builtin/repack.c: make largest pack preferred</title>
<updated>2021-09-29T04:20:56Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-09-29T01:55:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6d08b9d4caa230441b7d9e2b4f23deaf9ff74c13'/>
<id>urn:sha1:6d08b9d4caa230441b7d9e2b4f23deaf9ff74c13</id>
<content type='text'>
When repacking into a geometric series and writing a multi-pack bitmap,
it is beneficial to have the largest resulting pack be the preferred
object source in the bitmap's MIDX, since selecting the large packs can
lead to fewer broken delta chains and better compression.

Teach 'git repack' to identify this pack and pass it to the MIDX write
machinery in order to mark it as preferred.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>t/helper/test-bitmap.c: add 'dump-hashes' mode</title>
<updated>2021-09-14T23:34:17Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-09-14T22:06:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a05f02b1d9a1253e11a327c95cd47cbd24317ba6'/>
<id>urn:sha1:a05f02b1d9a1253e11a327c95cd47cbd24317ba6</id>
<content type='text'>
The pack-bitmap writer code is about to learn how to propagate values
from an existing hash-cache. To prepare, teach the test-bitmap helper to
dump the values from a bitmap's hash-cache extension in order to test
those changes.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap: drop repository argument from prepare_midx_bitmap_git()</title>
<updated>2021-09-10T00:32:37Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-09-09T19:56:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=bfbb60d328426f0fcc708e2da13d0063ba63e9db'/>
<id>urn:sha1:bfbb60d328426f0fcc708e2da13d0063ba63e9db</id>
<content type='text'>
We never look at the repository argument which is passed. This makes
sense, since the multi_pack_index struct already tells us everything we
need to access the files in its associated object directory.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Reviewed-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap: read multi-pack bitmaps</title>
<updated>2021-09-01T20:56:43Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-08-31T20:52:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0f533c728418fd3ef6ebcae5240e8df566cdaa72'/>
<id>urn:sha1:0f533c728418fd3ef6ebcae5240e8df566cdaa72</id>
<content type='text'>
This prepares the code in pack-bitmap to interpret the new multi-pack
bitmaps described in Documentation/technical/bitmap-format.txt, which
mostly involves converting bit positions to accommodate looking them up
in a MIDX.

Note that nothing currently writes multi-pack bitmaps; a writer will be
implemented in the subsequent commit. Note also that
get_midx_checksum() and get_midx_filename() are made non-static so they
can be called from pack-bitmap.c.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
