aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/gitformat-commit-graph.txt
diff options
context:
space:
mode:
authorJunio C Hamano <gitster@pobox.com>2024-07-08 14:53:09 -0700
committerJunio C Hamano <gitster@pobox.com>2024-07-08 14:53:10 -0700
commitecf7fc600a5218c9ee3863ee70d5a6e312164f30 (patch)
treebcea9bd3948c998759ea848ee55592ac1e1f8a16 /Documentation/gitformat-commit-graph.txt
parentMerge branch 'db/date-underflow-fix' (diff)
parentbloom: introduce `deinit_bloom_filters()` (diff)
downloadgit-ecf7fc600a5218c9ee3863ee70d5a6e312164f30.tar.gz
git-ecf7fc600a5218c9ee3863ee70d5a6e312164f30.zip
Merge branch 'tb/path-filter-fix'
The Bloom filter used for path limited history traversal was broken on systems whose "char" is unsigned; update the implementation and bump the format version to 2. * tb/path-filter-fix: bloom: introduce `deinit_bloom_filters()` commit-graph: reuse existing Bloom filters where possible object.h: fix mis-aligned flag bits table commit-graph: new Bloom filter version that fixes murmur3 commit-graph: unconditionally load Bloom filters bloom: prepare to discard incompatible Bloom filters bloom: annotate filters with hash version repo-settings: introduce commitgraph.changedPathsVersion t4216: test changed path filters with high bit paths t/helper/test-read-graph: implement `bloom-filters` mode bloom.h: make `load_bloom_filter_from_graph()` public t/helper/test-read-graph.c: extract `dump_graph_info()` gitformat-commit-graph: describe version 2 of BDAT commit-graph: ensure Bloom filters are read with consistent settings revision.c: consult Bloom filters for root commits t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
Diffstat (limited to 'Documentation/gitformat-commit-graph.txt')
-rw-r--r--Documentation/gitformat-commit-graph.txt9
1 files changed, 6 insertions, 3 deletions
diff --git a/Documentation/gitformat-commit-graph.txt b/Documentation/gitformat-commit-graph.txt
index 31cad585e2..3e906e8030 100644
--- a/Documentation/gitformat-commit-graph.txt
+++ b/Documentation/gitformat-commit-graph.txt
@@ -142,13 +142,16 @@ All multi-byte numbers are in network byte order.
==== Bloom Filter Data (ID: {'B', 'D', 'A', 'T'}) [Optional]
* It starts with header consisting of three unsigned 32-bit integers:
- - Version of the hash algorithm being used. We currently only support
- value 1 which corresponds to the 32-bit version of the murmur3 hash
+ - Version of the hash algorithm being used. We currently support
+ value 2 which corresponds to the 32-bit version of the murmur3 hash
implemented exactly as described in
https://en.wikipedia.org/wiki/MurmurHash#Algorithm and the double
hashing technique using seed values 0x293ae76f and 0x7e646e2 as
described in https://doi.org/10.1007/978-3-540-30494-4_26 "Bloom Filters
- in Probabilistic Verification"
+ in Probabilistic Verification". Version 1 Bloom filters have a bug that appears
+ when char is signed and the repository has path names that have characters >=
+ 0x80; Git supports reading and writing them, but this ability will be removed
+ in a future version of Git.
- The number of times a path is hashed and hence the number of bit positions
that cumulatively determine whether a file is present in the commit.
- The minimum number of bits 'b' per entry in the Bloom filter. If the filter