<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/Documentation/gitformat-pack.txt, branch v2.45.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.45.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.45.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2023-12-14T22:38:07Z</updated>
<entry>
<title>midx: implement `BTMP` chunk</title>
<updated>2023-12-14T22:38:07Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-12-14T22:23:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5f5ccd959573f88e126d53df16b149c64e6e9091'/>
<id>urn:sha1:5f5ccd959573f88e126d53df16b149c64e6e9091</id>
<content type='text'>
When a multi-pack bitmap is used to implement verbatim pack reuse (that
is, when verbatim chunks from an on-disk packfile are copied
directly[^1]), it does so by using its "preferred pack" as the source
for pack-reuse.

This allows repositories to pack the majority of their objects into a
single (often large) pack, and then use it as the single source for
verbatim pack reuse. This increases the number of objects that are
reused verbatim (and consequently, decreases the amount of time it takes
to generate many packs). But this performance comes at a cost, which is
that the preferred packfile must pace its growth with that of the entire
repository in order to maintain the utility of verbatim pack reuse.

As repositories grow beyond what we can reasonably store in a single
packfile, the utility of verbatim pack reuse diminishes. Or, at the very
least, it becomes increasingly expensive to maintain as the pack
grows larger and larger.

It would be beneficial to be able to perform this same optimization over
multiple packs, provided some modest constraints (most importantly, that
the packs eligible for verbatim reuse are disjoint with respect
to the subset of their objects being sent).

If we assume that the packs which we treat as candidates for verbatim
reuse are disjoint with respect to any of their objects we may output,
we need to make only modest modifications to the verbatim pack-reuse
code itself. Most notably, we need to remove the assumption that the
bits in the reachability bitmap corresponding to objects from the single
reuse pack begin at the first bit position.

Future patches will unwind these assumptions and reimplement their
existing functionality as special cases of the more general assumptions
(e.g. that reuse bits can start anywhere within the bitset, but happen
to start at 0 for all existing cases).

This patch does not yet relax any of those assumptions. Instead, it
implements a foundational data-structure, the "Bitmapped Packs" (`BTMP`)
chunk of the multi-pack index. The `BTMP` chunk's contents are described
in detail here. Importantly, the `BTMP` chunk contains information to
map regions of a multi-pack index's reachability bitmap to the packs
whose objects they represent.
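
As a rough illustration of that mapping (not the code from this patch),
the sketch below assumes each pack contributes one table entry made of
two 4-byte network-order integers: the first bit position and the number
of bits that belong to that pack. The layout and the names `parse_btmp`
and `pack_for_bit` are assumptions chosen for this example; the format
documentation remains the authoritative description.

```python
import struct

def parse_btmp(chunk, num_packs):
    """Decode an assumed BTMP-style table: one (bitmap_pos, bitmap_nr) pair per pack."""
    entries = []
    for i in range(num_packs):
        # "!II" reads two unsigned 32-bit integers in network byte order.
        pos, nr = struct.unpack_from("!II", chunk, i * 8)
        entries.append((pos, nr))
    return entries

def pack_for_bit(entries, bit):
    """Return the index of the pack whose bitmap region contains `bit`, or None."""
    for pack_id, (pos, nr) in enumerate(entries):
        if bit in range(pos, pos + nr):
            return pack_id
    return None
```

With two packs covering bits 0..2 and 3..4, bit 3 would resolve to the
second pack, which is the kind of lookup a reuse region that does not
start at bit 0 would need.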

For now, this chunk is only written, not read (outside of the test-tool
used in this patch to test the new chunk's behavior). Future patches
will begin to make use of this new chunk.

[^1]: Modulo patching any `OFS_DELTA`'s that cross over a region of the
  pack that wasn't used verbatim.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'tb/format-pack-doc-update'</title>
<updated>2023-11-08T02:04:00Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-11-08T02:04:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ed14fa1c2aa0788c1e7bbd7fdbac5edae09679bf'/>
<id>urn:sha1:ed14fa1c2aa0788c1e7bbd7fdbac5edae09679bf</id>
<content type='text'>
Doc update.

* tb/format-pack-doc-update:
  Documentation/gitformat-pack.txt: fix incorrect MIDX documentation
  Documentation/gitformat-pack.txt: fix typo
</content>
</entry>
<entry>
<title>Documentation/gitformat-pack.txt: fix incorrect MIDX documentation</title>
<updated>2023-11-01T04:25:04Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-10-31T19:24:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1bd809938a470ff92bc8963d59381c74a62839cc'/>
<id>urn:sha1:1bd809938a470ff92bc8963d59381c74a62839cc</id>
<content type='text'>
Back in 32f3c541e3 (multi-pack-index: write pack names in chunk,
2018-07-12) the MIDX's "Packfile Names" (or "PNAM", for short) chunk was
described as containing an array of string entries. e0d1bcf825 notes
that this is the only chunk in the MIDX format's specification that is
not guaranteed to be 4-byte aligned, and so should be placed last.

This isn't quite accurate: the entries within the PNAM chunk are not
guaranteed to be 4-byte aligned since they are arbitrary strings, but
the chunk itself is 4-byte aligned since the ending is padded with NUL
bytes.

That padding has always been there since 32f3c541e3 via
midx.c::write_midx_pack_names(), which ended with:

    i = MIDX_CHUNK_ALIGNMENT - (written % MIDX_CHUNK_ALIGNMENT);
    if (i &lt; MIDX_CHUNK_ALIGNMENT) {
      unsigned char padding[MIDX_CHUNK_ALIGNMENT];
      memset(padding, 0, sizeof(padding));
      hashwrite(f, padding, i);
      written += i;
    }
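
To make the arithmetic concrete, here is a small sketch of the same
padding rule. It assumes MIDX_CHUNK_ALIGNMENT is 4, in line with the
4-byte alignment discussed above; the helper names `pnam_padding` and
`write_pack_names` are hypothetical and only mirror the shape of the
snippet, not the actual midx.c code.

```python
MIDX_CHUNK_ALIGNMENT = 4  # assumed value, matching the 4-byte alignment above

def pnam_padding(written):
    """Number of NUL bytes needed after `written` bytes to reach alignment."""
    i = MIDX_CHUNK_ALIGNMENT - (written % MIDX_CHUNK_ALIGNMENT)
    # When `written` is already aligned, i equals the alignment: pad nothing.
    return 0 if i == MIDX_CHUNK_ALIGNMENT else i

def write_pack_names(names):
    """Concatenate NUL-terminated names, then pad the end of the chunk."""
    out = b"".join(name.encode() + b"\0" for name in names)
    return out + b"\0" * pnam_padding(len(out))
```

A single 10-byte name plus its NUL terminator is 11 bytes, so one byte
of padding brings the chunk to 12. Entries inside the chunk can still
start at unaligned offsets; only the chunk's end is aligned.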

In fact, 32f3c541e3's log message itself describes the chunk in its
first paragraph with:

    Since filenames are not well structured, add padding to keep good
    alignment in later chunks.

So these have always been externally aligned. Correct the corresponding
part of our documentation to reflect that.

Helped-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Documentation/gitformat-pack.txt: fix typo</title>
<updated>2023-11-01T04:25:02Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-10-31T19:24:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=530a9f183f03cfa49d1951a5c204ef44f41f8c9d'/>
<id>urn:sha1:530a9f183f03cfa49d1951a5c204ef44f41f8c9d</id>
<content type='text'>
e0d1bcf825 (multi-pack-index: add format details, 2018-07-12) describes
the MIDX's "PNAM" chunk as having entries which are "null-terminated
strings".

This is a typo, as strings are terminated with a NUL character, which is
a distinct concept from "NULL" or "null", which we typically reserve for
the void pointer to address 0.

Correct the documentation accordingly.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>documentation: add some commas where they are helpful</title>
<updated>2023-10-09T19:06:44Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-10-08T06:45:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4d542687fcea27c6cce9a79415ad8cb1a817697c'/>
<id>urn:sha1:4d542687fcea27c6cce9a79415ad8cb1a817697c</id>
<content type='text'>
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>documentation: add missing article</title>
<updated>2023-10-09T19:06:29Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-10-08T06:45:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0a4f051f9318c3dd9db69c4bebecdc6d160a5fc6'/>
<id>urn:sha1:0a4f051f9318c3dd9db69c4bebecdc6d160a5fc6</id>
<content type='text'>
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>documentation: fix verb tense</title>
<updated>2023-10-09T19:06:29Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-10-08T06:45:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5676b04a44935752314483182114069ecabe230a'/>
<id>urn:sha1:5676b04a44935752314483182114069ecabe230a</id>
<content type='text'>
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>documentation: fix typos</title>
<updated>2023-10-09T19:06:24Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-10-08T06:45:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=384f7d17d2f0d0ca689d8dda16f752c75a8ac634'/>
<id>urn:sha1:384f7d17d2f0d0ca689d8dda16f752c75a8ac634</id>
<content type='text'>
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Documentation/gitformat-pack.txt: drop mixed version section</title>
<updated>2023-08-29T18:58:26Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-08-28T22:49:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c0b5d46ded46bf6e2cf4bb5325e4bf43374dd1ed'/>
<id>urn:sha1:c0b5d46ded46bf6e2cf4bb5325e4bf43374dd1ed</id>
<content type='text'>
This section was added in 3d89a8c118 (Documentation/technical: add
cruft-packs.txt, 2022-05-20) to highlight a potential pitfall when
deploying cruft packs in an environment where multiple versions of Git
are GC-ing the same repository.

Now that it has been more than a year since 3d89a8c118 was written,
let's drop this section as it is no longer relevant.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Documentation/gitformat-pack.txt: remove multi-cruft packs alternative</title>
<updated>2023-08-29T18:58:26Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-08-28T22:49:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3843ef89312593a1e4d70bcdb29fd6ffa87134d6'/>
<id>urn:sha1:3843ef89312593a1e4d70bcdb29fd6ffa87134d6</id>
<content type='text'>
This text, originally from 3d89a8c118 (Documentation/technical: add
cruft-packs.txt, 2022-05-20), lists multiple cruft packs as a potential
alternative to the design of cruft packs.

We have always supported multiple cruft packs (i.e. we use the most
recent mtime for a given object among all cruft packs which contain it,
etc.), but haven't encouraged their use.
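
The "most recent mtime wins" rule can be sketched as follows. This is an
illustration only: each cruft pack is modeled as a plain mapping from
object ID to mtime, and `effective_mtimes` is a hypothetical helper, not
a function in Git.

```python
def effective_mtimes(cruft_packs):
    """Merge per-pack {object_id: mtime} mappings, keeping the newest mtime."""
    merged = {}
    for pack in cruft_packs:
        for oid, mtime in pack.items():
            # Keep the most recent mtime seen for this object so far.
            merged[oid] = max(mtime, merged.get(oid, mtime))
    return merged
```

An object that appears in two cruft packs with mtimes 10 and 7 would be
treated as having mtime 10.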

We still aren't encouraging users to go out and generate multiple cruft
packs, but let's take a step in that direction by dropping language that
suggests we aren't capable of working with multiple cruft packs.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
