<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/packfile.c, branch v2.50.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.50.0</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.50.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2025-05-30T18:59:18Z</updated>
<entry>
<title>Merge branch 'ps/midx-negative-packfile-cache'</title>
<updated>2025-05-30T18:59:18Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-05-30T18:59:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9a43523dc39f88567e7e4e01fcde72194df7e467'/>
<id>urn:sha1:9a43523dc39f88567e7e4e01fcde72194df7e467</id>
<content type='text'>
When a stale .midx file refers to .pack files that no longer exist,
we ended up checking for these non-existent files repeatedly, which
has been optimized by memoizing the non-existence.
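
The memoization idea can be sketched as follows. This is a minimal
Python model, not Git's C code; the `PackLookup` class and its field
names are hypothetical and exist only for illustration.

```python
import os


class PackLookup:
    """Remembers pack paths that were already found to be missing."""

    def __init__(self):
        self.missing = set()   # negative cache of nonexistent paths
        self.probes = 0        # number of real filesystem checks

    def exists(self, path):
        if path in self.missing:
            return False       # answered from the cache, no syscall
        self.probes += 1
        if os.path.exists(path):
            return True
        self.missing.add(path)
        return False
```

Asking twice about the same missing path touches the filesystem only
once; the second answer comes from the `missing` set.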

* ps/midx-negative-packfile-cache:
  midx: stop repeatedly looking up nonexistent packfiles
  packfile: explain ordering of how we look up auxiliary pack files
</content>
</entry>
<entry>
<title>packfile: explain ordering of how we look up auxiliary pack files</title>
<updated>2025-05-28T14:56:29Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-05-28T12:24:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=320572c43d7bc5afbcb8e5faf83b6eccfe6f4e32'/>
<id>urn:sha1:320572c43d7bc5afbcb8e5faf83b6eccfe6f4e32</id>
<content type='text'>
When adding a packfile to an object database we perform four syscalls:

  - Three calls to access(3p) are done to check for auxiliary data
    structures.

  - One call to stat(3p) is done to check for the ".pack" itself.

One curious bit is that we perform the access(3p) calls before checking
for the packfile itself, but if the packfile doesn't exist we discard
all results. The access(3p) calls are thus essentially wasted, so one
may be tempted to reorder those calls so that we can short-circuit the
other syscalls in case the packfile does not exist.

The order in which we look up files is quite important though to help
avoid races:

  - When installing a packfile we move auxiliary data structures into
    place before we install the ".idx" file.

  - When deleting a packfile we first delete the ".idx" and ".pack"
    files before deleting auxiliary data structures.

As such, to avoid any races with concurrently created or deleted packs
we need to make sure that we _first_ read auxiliary data structures
before we read the corresponding ".idx" or ".pack" file. Otherwise it
may easily happen that we return a populated but misclassified pack.

Add a comment to `add_packed_git()` to make future readers aware of this
ordering requirement.
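
The lookup order described above can be sketched roughly like this. It
is an illustrative Python model, not the code in `add_packed_git()`;
the `probe_pack()` name and the list of auxiliary suffixes are
assumptions made for the example.

```python
import os

# Assumed auxiliary suffixes, for illustration only.
AUX_SUFFIXES = (".keep", ".promisor", ".mtimes")


def probe_pack(base):
    """Probe auxiliary files first, then the ".pack" itself."""
    # Step 1: read auxiliary state before looking at the pack.
    aux = {s: os.access(base + s, os.F_OK) for s in AUX_SUFFIXES}
    # Step 2: only now check the pack; discard everything if it is gone.
    try:
        st = os.stat(base + ".pack")
    except FileNotFoundError:
        return None
    return {"size": st.st_size, "aux": aux}
```

The three access checks are still wasted when the pack is missing, which
is the cost accepted in exchange for the race-free ordering.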

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>oid_object_info(): drop type_name strbuf</title>
<updated>2025-05-16T16:43:10Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2025-05-16T04:49:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d2956385a9319155928e2d7bc5f9d90eeac5d0a5'/>
<id>urn:sha1:d2956385a9319155928e2d7bc5f9d90eeac5d0a5</id>
<content type='text'>
We provide a mechanism for callers to get the object type as a raw
string, rather than an object_type enum. This was in theory useful for
returning types that are not representable in the enum, but we consider
any such type to be an error, and there are no callers that use the
strbuf anymore.

Let's drop support to simplify the code a bit.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>object-store: merge "object-store-ll.h" and "object-store.h"</title>
<updated>2025-04-15T15:24:37Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-04-15T09:38:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=68cd492a3e662c75dec364986c81e94716d4ac56'/>
<id>urn:sha1:68cd492a3e662c75dec364986c81e94716d4ac56</id>
<content type='text'>
The "object-store-ll.h" header has been introduced to keep transitive
header dependencies and compile times at bay. Now that we have created
a new "object-store.c" file though we can easily move the last remaining
additional bit of "object-store.h", the `odb_path_map`, out of the
header.

Do so. As the "object-store.h" header is now equivalent to its low-level
alternative we drop the latter and inline it into the former.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>builtin/pack-objects.c: freshen objects from existing cruft packs</title>
<updated>2025-03-13T18:48:04Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2025-03-13T18:09:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=08f612ba7000bf181ef6d8baed9ece322e567efd'/>
<id>urn:sha1:08f612ba7000bf181ef6d8baed9ece322e567efd</id>
<content type='text'>
Once an object is written into a cruft pack, we can only freshen it by
writing a new loose or packed copy of that object with a more recent
mtime.

Prior to 61568efa95 (builtin/pack-objects.c: support `--max-pack-size`
with `--cruft`, 2023-08-28), we typically had at most one cruft pack in
a repository at any given time. So freshening unreachable objects was
straightforward when already rewriting the cruft pack (and its *.mtimes
file).

But 61568efa95 changes things: 'pack-objects' now supports writing
multiple cruft packs when invoked with `--cruft` and the
`--max-pack-size` flag. Cruft packs are rewritten until they reach some
size threshold, at which point they are considered "frozen", and will
only be modified in a pruning GC, or if the threshold itself is
adjusted.

Prior to this patch, however, this process breaks down when we attempt
to freshen an object packed in an earlier cruft pack, and that cruft
pack is larger than the threshold and thus will survive the repack.

When this is the case, it is impossible to freshen objects in cruft
pack(s) that are larger than the threshold. This is because we would
avoid writing them in the new cruft pack entirely, for a couple of
reasons.

 1. When enumerating packed objects via 'add_objects_in_unpacked_packs()'
    we pass the SKIP_IN_CORE_KEPT_PACKS flag, which is used to avoid looping
    over the packs we're going to retain (which are marked as kept
    in-core by 'read_cruft_objects()').

    This means that we will avoid enumerating additional packed copies
    of objects found in any cruft packs which are larger than the given
    size threshold. Thus there is no opportunity to call
    'create_object_entry()' whatsoever.

 2. We likewise will discard the loose copy (if one exists) of any
    unreachable object packed in a cruft pack that is larger than the
    threshold. Here our call path is 'add_unreachable_loose_objects()',
    which uses the 'add_loose_object()' callback.

    That function will eventually land us in 'want_object_in_pack()'
    (via 'add_cruft_object_entry()'), and we'll discard the object as it
    appears in one of the packs which we marked as kept in-core.

This means in effect that it is impossible to freshen an unreachable
object once it appears in a cruft pack larger than the given threshold.

Instead, we should pack an additional copy of an unreachable object we
want to freshen even if it appears in a cruft pack, provided that the
cruft copy has an mtime which is before the mtime of the copy we are
trying to pack/freshen. This is sub-optimal in the sense that it
requires keeping an additional copy of unreachable objects upon
freshening, but we don't have a better alternative without the ability
to make in-place modifications to existing *.mtimes files.

In order to implement this, we have to adjust the behavior of
'want_found_object()'. When 'pack-objects' is told that we're *not*
going to retain any cruft packs (i.e. the set of packs marked as kept
in-core does not contain a cruft pack), the behavior is unchanged.

But when there *is* at least one cruft pack that we're holding onto, it
is no longer sufficient to reject a copy of an object found in that
cruft pack for that reason alone. In this case, we only want to reject a
candidate object when a copy of that object either:

 - exists in a non-cruft pack that we are retaining, regardless of that
   pack's mtime, or

 - exists in a cruft pack with an mtime at least as recent as the copy
   we are debating whether or not to pack, in which case freshening
   would be redundant.

To do this, keep track of whether or not we have any cruft packs in our
in-core kept list with a new 'ignore_packed_keep_in_core_has_cruft'
flag. When we end up in this new special case, we replace the call to
'has_object_kept_pack()' with one to 'want_cruft_object_mtime()', and only reject
objects when we have a copy in an existing cruft pack with at least as
recent an mtime as our candidate (in which case "freshening" would be
redundant).
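
The rejection rule for the new special case can be modelled as below.
This is a hedged Python sketch of the two conditions listed above, not
the C implementation; `PackedCopy` and `reject_candidate()` are
hypothetical names, and mtimes are plain integers here.

```python
from dataclasses import dataclass


@dataclass
class PackedCopy:
    is_cruft: bool   # copy lives in a cruft pack
    mtime: int       # mtime recorded for that copy


def reject_candidate(kept_copies, candidate_mtime):
    """Return True when packing another copy would be redundant."""
    for copy in kept_copies:
        if not copy.is_cruft:
            return True    # retained non-cruft pack already has it
        if copy.mtime >= candidate_mtime:
            return True    # cruft copy is at least as recent
    return False           # every kept copy is an older cruft copy
```

A candidate with mtime 100 is kept when the only retained copy is a
cruft copy with mtime 50, and rejected when that cruft copy has mtime
100 or later.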

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/pack-header-parse-alignment-fix'</title>
<updated>2025-01-28T21:02:23Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-01-28T21:02:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f8b9821f7d4f5239911794d03ac74edf27d711e6'/>
<id>urn:sha1:f8b9821f7d4f5239911794d03ac74edf27d711e6</id>
<content type='text'>
It was possible for "git unpack-objects" and "git index-pack" to
make an unaligned access, which has been corrected.

* jk/pack-header-parse-alignment-fix:
  index-pack, unpack-objects: use skip_prefix to avoid magic number
  index-pack, unpack-objects: use get_be32() for reading pack header
  parse_pack_header_option(): avoid unaligned memory writes
  packfile: factor out --pack_header argument parsing
  bswap.h: squelch potential sparse -Wcast-truncate warnings
</content>
</entry>
<entry>
<title>parse_pack_header_option(): avoid unaligned memory writes</title>
<updated>2025-01-21T16:42:55Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2025-01-19T13:23:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4f02f4d68d8eefe728008974640839ef6e1b2182'/>
<id>urn:sha1:4f02f4d68d8eefe728008974640839ef6e1b2182</id>
<content type='text'>
In order to recreate a pack header in our in-memory buffer, we cast the
buffer to a "struct pack_header" and assign the individual fields. This
is reported to cause SIGBUS on sparc64 due to alignment issues.

We can work around this by using put_be32() which will write individual
bytes into the buffer.
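
As a rough illustration of the byte-wise approach, here is a Python
model. Python has no alignment faults, so this only mirrors the idea of
storing each byte separately. The `put_be32()` here is a stand-in named
after the C helper, `write_pack_header()` is hypothetical, and the
12-byte layout (signature, version, entry count) is stated as an
assumption of the example.

```python
def put_be32(buf, offset, value):
    """Store a 32-bit value big-endian, one byte at a time."""
    for i, byte in enumerate(value.to_bytes(4, "big")):
        buf[offset + i] = byte


def write_pack_header(buf, offset, version, entries):
    """Write signature, version and entry count at any offset."""
    buf[offset:offset + 4] = b"PACK"
    put_be32(buf, offset + 4, version)
    put_be32(buf, offset + 8, entries)
```

Because every store touches a single byte, the header can start at any
offset in the buffer, including an odd one.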

Reported-by: Koakuma &lt;koachan@protonmail.com&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>packfile: factor out --pack_header argument parsing</title>
<updated>2025-01-21T16:42:55Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2025-01-19T13:23:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=798e0f451661f81f4568dce4033cf1c9076f224f'/>
<id>urn:sha1:798e0f451661f81f4568dce4033cf1c9076f224f</id>
<content type='text'>
Both index-pack and unpack-objects accept a --pack_header argument. This
is an undocumented internal argument used by receive-pack and fetch to
pass along information about the header of the pack, which they've
already read from the incoming stream.

In preparation for a bugfix, let's factor the duplicated code into a
common helper.

The callers are still responsible for identifying the option. While this
could likewise be factored out, it is more flexible this way (e.g., if
they ever started using parse-options and wanted to handle both the
stuck and unstuck forms).

Likewise, the callers are responsible for reporting errors, though they
both just call die(). I've tweaked unpack-objects to match index-pack in
marking the error for translation.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ps/build-sign-compare'</title>
<updated>2024-12-23T17:32:11Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-12-23T17:32:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4156b6a741c7fb15a4eccb320612fb6e453f439c'/>
<id>urn:sha1:4156b6a741c7fb15a4eccb320612fb6e453f439c</id>
<content type='text'>
Start working to make the codebase buildable with -Wsign-compare.

* ps/build-sign-compare:
  t/helper: don't depend on implicit wraparound
  scalar: address -Wsign-compare warnings
  builtin/patch-id: fix type of `get_one_patchid()`
  builtin/blame: fix type of `length` variable when emitting object ID
  gpg-interface: address -Wsign-comparison warnings
  daemon: fix type of `max_connections`
  daemon: fix loops that have mismatching integer types
  global: trivial conversions to fix `-Wsign-compare` warnings
  pkt-line: fix -Wsign-compare warning on 32 bit platform
  csum-file: fix -Wsign-compare warning on 32-bit platform
  diff.h: fix index used to loop through unsigned integer
  config.mak.dev: drop `-Wno-sign-compare`
  global: mark code units that generate warnings with `-Wsign-compare`
  compat/win32: fix -Wsign-compare warning in "wWinMain()"
  compat/regex: explicitly ignore "-Wsign-compare" warnings
  git-compat-util: introduce macros to disable "-Wsign-compare" warnings
</content>
</entry>
<entry>
<title>global: mark code units that generate warnings with `-Wsign-compare`</title>
<updated>2024-12-06T11:20:02Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-12-06T10:27:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=41f43b8243f42b9df2e98be8460646d4c0100ad3'/>
<id>urn:sha1:41f43b8243f42b9df2e98be8460646d4c0100ad3</id>
<content type='text'>
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
