<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/midx-write.c, branch v2.50.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.50.0</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.50.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2025-05-22T21:48:37Z</updated>
<entry>
<title>midx: avoid negative array index</title>
<updated>2025-05-22T21:48:37Z</updated>
<author>
<name>Phillip Wood</name>
<email>phillip.wood@dunelm.org.uk</email>
</author>
<published>2025-05-22T15:55:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3aa98a61da6e1403081b4dfaa0c644614d228bac'/>
<id>urn:sha1:3aa98a61da6e1403081b4dfaa0c644614d228bac</id>
<content type='text'>
nth_midxed_pack_int_id() returns the index of the pack file in the
multi-pack index's list of packfiles that contains the specified object.
The index is returned as a uint32_t. Storing this in an int will make
the index negative if the most significant bit is set. Fix this by using
uint32_t as the rest of the code does. This is unlikely to be a
practical problem as it requires the multi-pack index to reference 2^31
packfiles.
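
The sign problem can be reproduced outside of Git. The following is a
minimal Python sketch, not code from midx-write.c; the variable names
are hypothetical and ctypes is used only to mimic 32-bit C integers.

```python
import ctypes

# Hypothetical pack index with the most significant of 32 bits set (2^31).
pack_int_id = 2**31

# The same 32 bits read as a signed int become negative, which is what
# happens when a uint32_t return value is stored in an int.
as_signed = ctypes.c_int32(pack_int_id).value

# Read as an unsigned 32-bit value, the index keeps its real magnitude.
as_unsigned = ctypes.c_uint32(pack_int_id).value

print(as_signed)    # -2147483648
print(as_unsigned)  # 2147483648
```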

Signed-off-by: Phillip Wood &lt;phillip.wood@dunelm.org.uk&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>midx repack: avoid potential integer overflow on 64 bit systems</title>
<updated>2025-05-22T21:48:36Z</updated>
<author>
<name>Phillip Wood</name>
<email>phillip.wood@dunelm.org.uk</email>
</author>
<published>2025-05-22T15:55:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f874c0ed90c63276e0ebc445ad6fee5dcbfacb86'/>
<id>urn:sha1:f874c0ed90c63276e0ebc445ad6fee5dcbfacb86</id>
<content type='text'>
On a 64 bit system the calculation

    p-&gt;pack_size * pack_info[i].referenced_objects

could overflow. If a pack file contains 2^28 objects with an average
compressed size of 1KB then the pack size will be 2^38B. If all of the
objects are referenced by the multi-pack index the product above will
overflow. Avoid this by using shifted integer arithmetic and changing
the order of the calculation so that the pack size is divided by the
total number of objects in the pack before multiplying by the number of
objects referenced by the multi-pack index. Using a shift of 14 bits
should give reasonable accuracy while avoiding overflow for pack sizes
less than 1PB.
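
The arithmetic described above can be sketched in Python. This is an
illustration of the description only, not the code in midx-write.c: the
function names are hypothetical, and because Python integers never
overflow the sketch compares results against 2^64 to show where a
64-bit product would no longer fit.

```python
SHIFT = 14  # number of fractional bits, as described in the message

def naive_product(pack_size, referenced_objects):
    # The original order: multiply first, divide later.
    return pack_size * referenced_objects

def estimate_referenced_size(pack_size, num_objects, referenced_objects):
    # Scale the pack size up by 2^14, divide by the object count first,
    # then multiply by the referenced objects and scale back down.
    per_object = (pack_size * 2**SHIFT) // num_objects
    return (per_object * referenced_objects) // 2**SHIFT

pack_size = 2**38    # 2^28 objects at about 1KB each
num_objects = 2**28

# A nonzero quotient means the product needs more than 64 bits.
overflow_factor = naive_product(pack_size, num_objects) // 2**64

# Largest intermediate value of the reordered calculation.
largest_intermediate = pack_size * 2**SHIFT

estimate_all = estimate_referenced_size(pack_size, num_objects, num_objects)
estimate_half = estimate_referenced_size(pack_size, num_objects, num_objects // 2)

print(overflow_factor)                        # 4
print(largest_intermediate == 2**52)          # True
print(estimate_all == pack_size)              # True
print(estimate_half == pack_size // 2)        # True
```

With a 14-bit shift the scaled pack size stays below 2^64 as long as
the pack is smaller than 2^50 bytes, which matches the 1PB limit stated
above.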

Signed-off-by: Phillip Wood &lt;phillip.wood@dunelm.org.uk&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>midx repack: avoid integer overflow on 32 bit systems</title>
<updated>2025-05-22T21:48:36Z</updated>
<author>
<name>Phillip Wood</name>
<email>phillip.wood@dunelm.org.uk</email>
</author>
<published>2025-05-22T15:55:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b103881d4f4b157d86813ba5f91acd7ed6c888d0'/>
<id>urn:sha1:b103881d4f4b157d86813ba5f91acd7ed6c888d0</id>
<content type='text'>
On a 32 bit system "git multi-pack-index --repack --batch-size=120M"
failed with

    fatal: size_t overflow: 6038786 * 1289

The calculation to estimate the size of the objects in the pack
referenced by the multi-pack-index uses st_mult() to multiply the pack
size by the number of referenced objects before dividing by the total
number of objects in the pack. As size_t is 32 bits on 32 bit systems
this calculation easily overflows. Fix this by using 64-bit arithmetic
instead.

Also fix a potential overflow when calculating the total size of the
objects referenced by the multi-pack index with a batch size larger
than SIZE_MAX / 2. In that case

    total_size += estimated_size

can overflow as both total_size and estimated_size can be greater than
SIZE_MAX / 2. This is addressed by using saturating arithmetic for the
addition. Although estimated_size is of type uint64_t by the time we
reach this sum it is bounded by the batch size which is of type size_t
and so casting estimated_size to size_t does not truncate the value.
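
Both points can be checked with a short Python sketch. It is not the
code in midx-write.c: saturating_add() and SIZE_MAX_32 are hypothetical
names, and the 32-bit limit is modelled explicitly because Python
integers do not overflow.

```python
SIZE_MAX_32 = 2**32 - 1  # SIZE_MAX when size_t is 32 bits wide

def saturating_add(a, b, limit=SIZE_MAX_32):
    # Clamp the result at the limit instead of wrapping around.
    return min(a + b, limit)

# The product from the error message does not fit in 32 bits.
product = 6038786 * 1289
fits_in_size_t = product == min(product, SIZE_MAX_32)
print(fits_in_size_t)  # False

# Two operands that are each larger than SIZE_MAX / 2 saturate.
total_size = saturating_add(3_000_000_000, 3_000_000_000)
print(total_size == SIZE_MAX_32)  # True

# Small operands are added as usual.
print(saturating_add(1, 2))  # 3
```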

Signed-off-by: Phillip Wood &lt;phillip.wood@dunelm.org.uk&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ps/object-file-cleanup'</title>
<updated>2025-04-25T00:25:33Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-04-25T00:25:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=36d8035d27d6100a525a0e25619868b9542a4f35'/>
<id>urn:sha1:36d8035d27d6100a525a0e25619868b9542a4f35</id>
<content type='text'>
Code clean-up.

* ps/object-file-cleanup:
  object-store: merge "object-store-ll.h" and "object-store.h"
  object-store: remove global array of cached objects
  object: split out functions relating to object store subsystem
  object-file: drop `index_blob_stream()`
  object-file: split up concerns of `HASH_*` flags
  object-file: split out functions relating to object store subsystem
  object-file: move `xmmap()` into "wrapper.c"
  object-file: move `git_open_cloexec()` to "compat/open.c"
  object-file: move `safe_create_leading_directories()` into "path.c"
  object-file: move `mkdir_in_gitdir()` into "path.c"
</content>
</entry>
<entry>
<title>Merge branch 'ps/object-wo-the-repository'</title>
<updated>2025-04-15T20:50:15Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-04-15T20:50:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ee847e0034dbfde11f901fbfb74d210c1edad496'/>
<id>urn:sha1:ee847e0034dbfde11f901fbfb74d210c1edad496</id>
<content type='text'>
The object layer has been updated to take an explicit repository
instance as a parameter in more code paths.

* ps/object-wo-the-repository:
  hash: stop depending on `the_repository` in `null_oid()`
  hash: fix "-Wsign-compare" warnings
  object-file: split out logic regarding hash algorithms
  delta-islands: stop depending on `the_repository`
  object-file-convert: stop depending on `the_repository`
  pack-bitmap-write: stop depending on `the_repository`
  pack-revindex: stop depending on `the_repository`
  pack-check: stop depending on `the_repository`
  environment: move access to "core.bigFileThreshold" into repo settings
  pack-write: stop depending on `the_repository` and `the_hash_algo`
  object: stop depending on `the_repository`
  csum-file: stop depending on `the_repository`
</content>
</entry>
<entry>
<title>object-file: move `safe_create_leading_directories()` into "path.c"</title>
<updated>2025-04-15T15:24:35Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-04-15T09:38:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1a99fe8010642a71063536510c578c1543d763b4'/>
<id>urn:sha1:1a99fe8010642a71063536510c578c1543d763b4</id>
<content type='text'>
The `safe_create_leading_directories()` function and its relatives are
located in "object-file.c", which is not a good fit as they provide
generic functionality not related to objects at all. Move them into
"path.c", which already hosts `safe_create_dir()` and its relative
`safe_create_dir_in_gitdir()`.

"path.c" is free of `the_repository`, but the moved functions depend on
`the_repository` to read the "core.sharedRepository" config. Adapt the
function signature to accept a repository as argument to fix the issue
and adjust callers accordingly.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>midx: implement writing incremental MIDX bitmaps</title>
<updated>2025-03-21T11:34:16Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2025-03-20T17:57:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=27afc272c49137460fe9e58e1fcbe4c1d377b304'/>
<id>urn:sha1:27afc272c49137460fe9e58e1fcbe4c1d377b304</id>
<content type='text'>
Now that the pack-bitmap machinery has learned how to read and interact
with an incremental MIDX bitmap, teach the pack-bitmap-write.c machinery
(and relevant callers from within the MIDX machinery) to write such
bitmaps.

The details for doing so are mostly straightforward. The main changes
are as follows:

  - find_object_pos() now makes use of an extra MIDX parameter which is
    used to locate the bit positions of objects which are from previous
    layers (and thus do not exist in the current layer's pack_order
    field).

    (Note also that the pack_order field is moved into struct
    write_midx_context to further simplify the callers for
    write_midx_bitmap()).

  - bitmap_writer_build_type_index() first determines how many objects
    precede the current bitmap layer and offsets the bits it sets in
    each respective type-level bitmap by that amount so they can be OR'd
    together.
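
The offsetting described in the second item can be pictured with a small
Python sketch. It is an illustration under simplifying assumptions, not
the pack-bitmap-write.c implementation: type_bitmap() and the two object
lists are hypothetical, and a plain integer stands in for a bitmap.

```python
def type_bitmap(object_types, wanted_type, objects_in_previous_layers):
    # Set one bit per object of the wanted type, shifted past the bit
    # positions already used by objects from earlier layers.
    bits = 0
    for i, obj_type in enumerate(object_types):
        if obj_type == wanted_type:
            bits |= 2 ** (objects_in_previous_layers + i)
    return bits

base_layer = ["commit", "tree", "blob"]  # occupies bit positions 0..2
new_layer = ["blob", "commit"]           # occupies bit positions 3..4

base_commits = type_bitmap(base_layer, "commit", 0)
new_commits = type_bitmap(new_layer, "commit", len(base_layer))

# Because the layers use disjoint bit ranges, OR-ing them is safe.
combined = base_commits | new_commits
print(bin(combined))  # 0b10001
```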

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Acked-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-write: stop depending on `the_repository` and `the_hash_algo`</title>
<updated>2025-03-10T20:16:18Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-10T07:13:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2582846f2fe21b23fe7c567e030510960f135160'/>
<id>urn:sha1:2582846f2fe21b23fe7c567e030510960f135160</id>
<content type='text'>
There are a couple of functions in "pack-write.c" that implicitly depend
on `the_repository` or `the_hash_algo`. Remove this dependency by
injecting the repository via a parameter and adapt callers accordingly.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>object: stop depending on `the_repository`</title>
<updated>2025-03-10T20:16:18Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-10T07:13:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=74d414c9f14a91a3b7bd04972bf3eb9bbe6fd81b'/>
<id>urn:sha1:74d414c9f14a91a3b7bd04972bf3eb9bbe6fd81b</id>
<content type='text'>
There are a couple of functions exposed by "object.c" that implicitly
depend on `the_repository`. Remove this dependency by injecting the
repository via a parameter. Adapt callers accordingly by simply using
`the_repository`, except in cases where the subsystem is already free of
the repository. In that case, we instead pass the repository provided by
the caller's context.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>csum-file: stop depending on `the_repository`</title>
<updated>2025-03-10T20:16:18Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-10T07:13:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=228457c9d9f32f000f5c04c36fcce9002f72965a'/>
<id>urn:sha1:228457c9d9f32f000f5c04c36fcce9002f72965a</id>
<content type='text'>
There are multiple sites in "csum-file.c" where we use the global
`the_repository` variable, either explicitly or implicitly by using
`the_hash_algo`.

Refactor the code to stop using `the_repository` by adapting functions
to receive required data as parameters. Adapt callsites accordingly by
either using `the_repository-&gt;hash_algo`, or by using a context-provided
hash algorithm in case the subsystem already got rid of its dependency
on `the_repository`.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
