aboutsummaryrefslogtreecommitdiffstats
path: root/pack-bitmap.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-10-16packfile: introduce macro to iterate through packsPatrick Steinhardt1-3/+3
We have a bunch of different sites that want to iterate through all packs of a given `struct packfile_store`. This pattern is somewhat verbose and repetitive, which makes it somewhat cumbersome. Introduce a new macro `repo_for_each_pack()` that removes some of the boilerplate. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24packfile: refactor `get_all_packs()` to work on packfile storePatrick Steinhardt1-2/+2
The `get_all_packs()` function prepares the packfile store and then returns its packfiles. Refactor it to accept a packfile store instead of a repository to clarify its scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11midx: compute paths via their sourcePatrick Steinhardt1-6/+4
With the preceding commits we started to always have the object database source available when we load, write or access multi-pack indices. With this in place we can change how MIDX paths are computed so that we don't have to pass in the combination of a hash algorithm and object directory anymore, but only the object database source. Refactor the code accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11midx: stop duplicating info redundant with its owning sourcePatrick Steinhardt1-6/+7
Multi-pack indices store some information that is redundant with their owning source: - The locality bit that tracks whether the source is the primary object source or an alternate. - The object directory path the multi-pack index is located in. - The pointer to the owning parent directory. All of this information is already contained in `struct odb_source`. So now that we always have that struct available when loading a multi-pack index we have it readily accessible. Drop the redundant information and instead store a pointer to the object source. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11midx: drop redundant `struct repository` parameterPatrick Steinhardt1-2/+2
There are a couple of functions that take both a `struct repository` and a `struct multi_pack_index`. This provides redundant information though without much benefit given that the multi-pack index already has a pointer to its owning repository. Drop the `struct repository` parameter from such functions. While at it, reorder the list of parameters of `fill_midx_entry()` so that the MIDX comes first to better align with our coding guidelines. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-29Merge branch 'ps/object-store-midx' into ps/object-store-midx-dedup-infoJunio C Hamano1-6/+15
* ps/object-store-midx: midx: remove now-unused linked list of multi-pack indices packfile: stop using linked MIDX list in `get_all_packs()` packfile: stop using linked MIDX list in `find_pack_entry()` packfile: refactor `get_multi_pack_index()` to work on sources midx: stop using linked list when closing MIDX packfile: refactor `prepare_packed_git_one()` to work on sources midx: start tracking per object database source
2025-07-15Merge branch 'ly/load-bitmap-leakfix'Junio C Hamano1-24/+64
Leakfix with a new and a bit invasive test. * ly/load-bitmap-leakfix: pack-bitmap: add load corrupt bitmap test pack-bitmap: reword comments in test_bitmap_commits() pack-bitmap: fix memory leak if load_bitmap() failed
2025-07-15Merge branch 'ps/object-store'Junio C Hamano1-5/+5
Code clean-up around object access API. * ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-15packfile: refactor `get_multi_pack_index()` to work on sourcesPatrick Steinhardt1-6/+15
The function `get_multi_pack_index()` loads multi-pack indices via `prepare_packed_git()` and then returns the linked list of multi-pack indices that is stored in `struct object_database`. That list is in the process of being removed though in favor of storing the MIDX as part of the object database source it belongs to. Refactor `get_multi_pack_index()` so that it returns the multi-pack index for a single object source. Callers are now expected to call this function for each source they are interested in. This requires them to iterate through alternates, so we have to prepare alternate object sources before doing so. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-09Merge branch 'ps/object-store' into ps/object-store-midxJunio C Hamano1-5/+5
* ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-01odb: rename `oid_object_info()`Patrick Steinhardt1-4/+4
Rename `oid_object_info()` to `odb_read_object_info()` as well as their `_extended()` variant to match other functions related to the object database and our modern coding guidelines. Introduce compatibility wrappers so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt1-1/+1
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01pack-bitmap: add load corrupt bitmap testLidong Yan1-5/+57
t5310 lacks a test to ensure git works correctly when commit bitmap data is corrupted. So this patch add test helper in pack-bitmap.c to list each commit bitmap position in bitmap file and `load corrupt bitmap` test case in t/t5310 to corrupt a commit bitmap before loading it. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01pack-bitmap: reword comments in test_bitmap_commits()Lidong Yan1-2/+3
The comment in pack-bitmap.c:test_bitmap_commits(), suggests that we can avoid reading the commit table altogether. However, this comment is misleading. The reason we load bitmap entries here is because test_bitmap_commits() needs to print the commit IDs from the bitmap, and we must read the bitmap entries to obtain those commit IDs. So reword this comment. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01pack-bitmap: fix memory leak if load_bitmap() failedTaylor Blau1-17/+4
After going through the "failed" label, load_bitmap() will return -1, and its caller (either prepare_bitmap_walk() or prepare_bitmap_git()) will then call free_bitmap_index(). That function would have done: struct stored_bitmap *sb; kh_foreach_value(b->bitmaps, sb { ewah_pool_free(sb->root); free(sb); }); , but won't since load_bitmap() already called kh_destroy_oid_map() and NULL'd the "bitmaps" pointer from within its "failed" label. Thus if you got part of the way through loading bitmap entries and then failed, you would leak all of the previous entries that you were able to load successfully. The solution is to remove the error handling code in load_bitmap(), because its caller will always call free_bitmap_index() in case of an error. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-09pack-bitmap: remove checks before bitmap_freeLidong Yan1-2/+2
In pack-bitmap.c:find_boundary_objects(), the roots_bitmap is only freed if cascade_pseudo_merges_1() fails. However, cascade_pseudo_merges_1() uses roots_bitmap as a mutable reference without taking ownership of it. As a result, if cascade_pseudo_merges_1() succeeds, roots_bitmap is leaked. And this leak currently lacks a dedicated test to detect it. To fix this leak, remove if cascade_pseudo_merges_1() succeed check and always calling bitmap_free(roots_bitmap); To trigger this leak, we need roots_bitmap that contains at least one pseudo merge. So that we can use pseudo merge bitmap when we compute roots reachable bitmap. Here we create two commits: first A then B. Add A to the pseudo-merge and perform a traversal over the range A..B. In this scenario, the "haves" set will be {A}, and cascade_pseudo_merges_1 will succeed, thereby exposing the leak due to the missing roots_bitmap cleanup. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12pack-bitmap: fix memory leak if `load_bitmap_entries_v1` failedLidong Yan1-4/+4
In pack-bitmap.c:load_bitmap_entries_v1, the function `read_bitmap_1` allocates a bitmap and reads index data into it. However, if any of the validation checks following the allocation fail, the allocated bitmap is not freed, resulting in a memory leak. To avoid this, the validation checks should be performed before the bitmap is allocated. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-24Merge branch 'ps/object-file-cleanup'Junio C Hamano1-2/+1
Code clean-up. * ps/object-file-cleanup: object-store: merge "object-store-ll.h" and "object-store.h" object-store: remove global array of cached objects object: split out functions relating to object store subsystem object-file: drop `index_blob_stream()` object-file: split up concerns of `HASH_*` flags object-file: split out functions relating to object store subsystem object-file: move `xmmap()` into "wrapper.c" object-file: move `git_open_cloexec()` to "compat/open.c" object-file: move `safe_create_leading_directories()` into "path.c" object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-16Merge branch 'ps/cat-file-filter-batch'Junio C Hamano1-9/+72
"git cat-file --batch" and friends learned to allow "--filter=" to omit certain objects, just like the transport layer does. * ps/cat-file-filter-batch: builtin/cat-file: use bitmaps to efficiently filter by object type builtin/cat-file: deduplicate logic to iterate over all objects pack-bitmap: introduce function to check whether a pack is bitmapped pack-bitmap: add function to iterate over filtered bitmapped objects pack-bitmap: allow passing payloads to `show_reachable_fn()` builtin/cat-file: support "object:type=" objects filter builtin/cat-file: support "blob:limit=" objects filter builtin/cat-file: support "blob:none" objects filter builtin/cat-file: wire up an option to filter objects builtin/cat-file: introduce function to report object status builtin/cat-file: rename variable that tracks usage
2025-04-15Merge branch 'ps/object-wo-the-repository'Junio C Hamano1-7/+8
The object layer has been updated to take an explicit repository instance as a parameter in more code paths. * ps/object-wo-the-repository: hash: stop depending on `the_repository` in `null_oid()` hash: fix "-Wsign-compare" warnings object-file: split out logic regarding hash algorithms delta-islands: stop depending on `the_repository` object-file-convert: stop depending on `the_repository` pack-bitmap-write: stop depending on `the_repository` pack-revindex: stop depending on `the_repository` pack-check: stop depending on `the_repository` environment: move access to "core.bigFileThreshold" into repo settings pack-write: stop depending on `the_repository` and `the_hash_algo` object: stop depending on `the_repository` csum-file: stop depending on `the_repository`
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt1-1/+1
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-file: move `git_open_cloexec()` to "compat/open.c"Patrick Steinhardt1-1/+0
The `git_open_cloexec()` wrapper function provides the ability to open a file with `O_CLOEXEC` in a platform-agnostic way. This function is provided by "object-file.c" even though it is not specific to the object subsystem at all. Move the file into "compat/open.c". This file already exists before this commit, but has only been compiled conditionally depending on whether or not open(3p) may return EINTR. With this change we now unconditionally compile the object, but wrap `git_open_with_retry()` in an ifdef. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07pack-bitmap: introduce function to check whether a pack is bitmappedPatrick Steinhardt1-0/+15
Introduce a function that allows us to verify whether a pack is bitmapped or not. This functionality will be used in a subsequent commit. Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07pack-bitmap: add function to iterate over filtered bitmapped objectsPatrick Steinhardt1-6/+53
Introduce a function that allows the caller to iterate over all bitmapped objects that match a given filter. This mechanism will be used in a subsequent commit to optimize object filters in git-cat-file(1). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07pack-bitmap: allow passing payloads to `show_reachable_fn()`Patrick Steinhardt1-7/+8
The `show_reachable_fn` callback is used by a couple of functions to present reachable objects to the caller. The function does not provide a way for the caller to pass a payload though, which is functionality that we'll require in a subsequent commit. Change the callback type to accept a payload and adapt all callsites accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21pack-bitmap.c: use `ewah_or_iterator` for type bitmap iteratorsTaylor Blau1-15/+27
Now that we have initialized arrays for each bitmap layer's type bitmaps in the previous commit, adjust existing callers to use them in preparation for multi-layered bitmaps. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21pack-bitmap.c: keep track of each layer's type bitmapsTaylor Blau1-4/+53
Prepare for reading the type-level bitmaps from previous bitmap layers by maintaining an array for each type, where each element in that type's array corresponds to one layer's bitmap for that type. These fields will be used in a later commit to instantiate the 'struct ewah_or_iterator' for each type. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21pack-bitmap.c: apply pseudo-merge commits with incremental MIDXsTaylor Blau1-3/+8
Prepare for using pseudo-merges with incremental MIDX bitmaps by attempting to apply pseudo-merges from each layer when encountering a given commit during a walk. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21pack-bitmap.c: compute disk-usage with incremental MIDXsTaylor Blau1-2/+2
In a similar fashion as previous commits, use nth_midxed_pack() instead of accessing the MIDX's ->packs array directly to support incremental MIDXs. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21pack-bitmap.c: teach `rev-list --test-bitmap` about incremental MIDXsTaylor Blau1-21/+86
Implement support for the special `--test-bitmap` mode of `git rev-list` when using incremental MIDXs. The bitmap_test_data structure is extended to contain a "base" pointer that mirrors the structure of the bitmap chain that it is being used to test. When we find a commit to test, we first chase down the ->base pointer to find the appropriate bitmap_test_data for the bitmap layer that the given commit is contained within, and then perform the test on that bitmap. In order to implement this, light modifications are made to bitmap_for_commit() to reimplement it in terms of a new function, find_bitmap_for_commit(), which fills out a pointer which indicates the bitmap layer which contains the given commit. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21pack-bitmap.c: support bitmap pack-reuse with incremental MIDXsTaylor Blau1-3/+8
In a similar fashion as previous commits in the first phase of incremental MIDXs, enumerate not just the packs in the current incremental MIDX layer, but previous ones as well. Likewise, in reuse_partial_packfile_from_bitmap(), when reusing only a single pack from a MIDX, use the oldest layer's preferred pack as it is likely to contain the largest number of reusable sections. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21pack-bitmap.c: teach `show_objects_for_type()` about incremental MIDXsTaylor Blau1-1/+1
Since we may ask for a pack_id that is in an earlier MIDX layer relative to the one corresponding to our bitmap, use nth_midxed_pack() instead of accessing the ->packs array directly. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXsTaylor Blau1-4/+7
The pack-bitmap machinery uses `bitmap_for_commit()` to locate the EWAH-compressed bitmap corresponding to some given commit object. Teach this function about incremental MIDX bitmaps by teaching it to recur on earlier bitmap layers when it fails to find a given commit in the current layer. The changes to do so are as follows: - Avoid initializing hash_pos at its declaration, since bitmap_for_commit() is now a recursive function and may receive a NULL bitmap_index pointer as its first argument. - In cases where we would previously return NULL (to indicate that a lookup failed and the given bitmap_index does not contain an entry corresponding to the given commit), recursively call the function on the previous bitmap layer. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21pack-bitmap.c: open and store incremental bitmap layersTaylor Blau1-14/+48
Prepare the pack-bitmap machinery to work with incremental MIDXs by adding a new "base" field to keep track of the bitmap index associated with the previous MIDX layer. The changes in this commit are mostly boilerplate to open the correct bitmap(s), add them to the chain of bitmap layers along the "base" pointer, ensure that the correct packs and their reverse indexes are loaded across MIDX layers, etc. While we're at it, keep track of a base_nr field to indicate how many bitmap layers (including the current bitmap) exist. This will be used in a future commit to allocate an array of 'struct ewah_bitmap' pointers to collect all of the respective type bitmaps among all layers to initialize a multi-EWAH iterator. Subsequent commits will teach the functions within the pack-bitmap machinery how to interact with these new fields. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21pack-revindex: prepare for incremental MIDX bitmapsTaylor Blau1-12/+31
Prepare the reverse index machinery to handle object lookups in an incremental MIDX bitmap. These changes are broken out across a few functions: - load_midx_revindex() learns to use the appropriate MIDX filename depending on whether the given 'struct multi_pack_index *' is incremental or not. - pack_pos_to_midx() and midx_to_pack_pos() now both take in a global object position in the MIDX pseudo-pack order, and find the earliest containing MIDX (similar to midx.c::midx_for_object(). - midx_pack_order_cmp() adjusts its call to pack_pos_to_midx() by the number of objects in the base (since 'vb - midx->revindx_data' is relative to the containing MIDX, and pack_pos_to_midx() expects a global position). Likewise, this function adjusts its output by adding m->num_objects_in_base to return a global position out through the `*pos` pointer. Together, these changes are sufficient to use the multi-pack index's reverse index format for incremental multi-pack reachability bitmaps. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10object: stop depending on `the_repository`Patrick Steinhardt1-3/+3
There are a couple of functions exposed by "object.c" that implicitly depend on `the_repository`. Remove this dependency by injecting the repository via a parameter. Adapt callers accordingly by simply using `the_repository`, except in cases where the subsystem is already free of the repository. In that case, we instead pass the repository provided by the caller's context. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10csum-file: stop depending on `the_repository`Patrick Steinhardt1-4/+5
There are multiple sites in "csum-file.c" where we use the global `the_repository` variable, either explicitly or implicitly by using `the_hash_algo`. Refactor the code to stop using `the_repository` by adapting functions to receive required data as parameters. Adapt callsites accordingly by either using `the_repository->hash_algo`, or by using a context-provided hash algorithm in case the subsystem already got rid of its dependency on `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21Merge branch 'ps/the-repository'Junio C Hamano1-1/+3
More code paths have a repository passed through the callchain, instead of assuming the primary the_repository object. * ps/the-repository: match-trees: stop using `the_repository` graph: stop using `the_repository` add-interactive: stop using `the_repository` tmp-objdir: stop using `the_repository` resolve-undo: stop using `the_repository` credential: stop using `the_repository` mailinfo: stop using `the_repository` diagnose: stop using `the_repository` server-info: stop using `the_repository` send-pack: stop using `the_repository` serve: stop using `the_repository` trace: stop using `the_repository` pager: stop using `the_repository` progress: stop using `the_repository`
2024-12-23Merge branch 'tb/bitmap-fix-pack-reuse'Junio C Hamano1-23/+18
Code to reuse objects based on bitmap contents have been tightened to avoid race condition even when multiple packs are involved. * tb/bitmap-fix-pack-reuse: pack-bitmap.c: ensure pack validity for all reuse packs
2024-12-23Merge branch 'ps/build-sign-compare'Junio C Hamano1-0/+2
Start working to make the codebase buildable with -Wsign-compare. * ps/build-sign-compare: t/helper: don't depend on implicit wraparound scalar: address -Wsign-compare warnings builtin/patch-id: fix type of `get_one_patchid()` builtin/blame: fix type of `length` variable when emitting object ID gpg-interface: address -Wsign-comparison warnings daemon: fix type of `max_connections` daemon: fix loops that have mismatching integer types global: trivial conversions to fix `-Wsign-compare` warnings pkt-line: fix -Wsign-compare warning on 32 bit platform csum-file: fix -Wsign-compare warning on 32-bit platform diff.h: fix index used to loop through unsigned integer config.mak.dev: drop `-Wno-sign-compare` global: mark code units that generate warnings with `-Wsign-compare` compat/win32: fix -Wsign-compare warning in "wWinMain()" compat/regex: explicitly ignore "-Wsign-compare" warnings git-compat-util: introduce macros to disable "-Wsign-compare" warnings
2024-12-18progress: stop using `the_repository`Patrick Steinhardt1-1/+3
Stop using `the_repository` in the "progress" subsystem by passing in a repository when initializing `struct progress`. Furthermore, store a pointer to the repository in that struct so that we can pass it to the trace2 API when logging information. Adjust callers accordingly by using `the_repository`. While there may be some callers that have a repository available in their context, this trivial conversion allows for easier verification and bubbles up the use of `the_repository` by one level. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18Merge branch 'ps/build-sign-compare' into ps/the-repositoryJunio C Hamano1-0/+2
* ps/build-sign-compare: t/helper: don't depend on implicit wraparound scalar: address -Wsign-compare warnings builtin/patch-id: fix type of `get_one_patchid()` builtin/blame: fix type of `length` variable when emitting object ID gpg-interface: address -Wsign-comparison warnings daemon: fix type of `max_connections` daemon: fix loops that have mismatching integer types global: trivial conversions to fix `-Wsign-compare` warnings pkt-line: fix -Wsign-compare warning on 32 bit platform csum-file: fix -Wsign-compare warning on 32-bit platform diff.h: fix index used to loop through unsigned integer config.mak.dev: drop `-Wno-sign-compare` global: mark code units that generate warnings with `-Wsign-compare` compat/win32: fix -Wsign-compare warning in "wWinMain()" compat/regex: explicitly ignore "-Wsign-compare" warnings git-compat-util: introduce macros to disable "-Wsign-compare" warnings
2024-12-18pack-bitmap.c: ensure pack validity for all reuse packsTaylor Blau1-23/+18
Commit 44f9fd6496 (pack-bitmap.c: check preferred pack validity when opening MIDX bitmap, 2022-05-24) prevents a race condition whereby the preferred pack disappears between opening the MIDX bitmap and attempting verbatim reuse out of its packs. That commit forces open_midx_bitmap_1() to ensure the validity of the MIDX's preferred pack, meaning that we have an open file handle on the *.pack, ensuring that we can reuse bytes out of verbatim later on in the process[^1]. But 44f9fd6496 was not extended to cover multi-pack reuse, meaning that this same race condition exists for non-preferred packs during verbatim reuse. Work around that race in the same way by only marking valid packs as reuse-able. For packs that aren't reusable, skip over them but include the number of objects they have to ensure we allocate a large enough 'reuse' bitmap (e.g. if a pack in the middle of the MIDX disappeared but we still want to reuse later packs). Since we're ensuring the validity of these packs within the verbatim reuse code, we no longer have to special-case the preferred pack and open it within the open_midx_bitmap_1() function. An alternative approach to the one taken here would be to open all MIDX'd packs from within open_midx_bitmap_1(). But that would be both slower and make the bitmaps less useful, since we can still perform some pack reuse among the packs that still exist when the *.bitmap is opened. After applying this patch, we can simulate the new behavior after instrumenting Git like so: diff --git a/packfile.c b/packfile.c index 9560f0a33c..aedce72524 100644 --- a/packfile.c +++ b/packfile.c @@ -557,6 +557,11 @@ static int open_packed_git_1(struct packed_git *p) ; /* nothing */ p->pack_fd = git_open(p->pack_name); + { + const char *delete = getenv("GIT_RACILY_DELETE"); + if (delete && !strcmp(delete, pack_basename(p))) + return -1; + } if (p->pack_fd < 0 || fstat(p->pack_fd, &st)) return -1; pack_open_fds++; and adding the following test: test_expect_success 'disappearing packs' ' git init disappearing-packs && ( cd disappearing-packs && git config pack.allowPackReuse multi && test_commit A && test_commit B && test_commit C && A="$(echo "A" | git pack-objects --revs $packdir/pack-A)" && B="$(echo "A..B" | git pack-objects --revs $packdir/pack-B)" && C="$(echo "B..C" | git pack-objects --revs $packdir/pack-C)" && git multi-pack-index write --bitmap --preferred-pack=pack-A-$A.idx && test_pack_objects_reused_all 9 3 && test_env GIT_RACILY_DELETE=pack-A-$A.pack \ test_pack_objects_reused_all 6 2 && test_env GIT_RACILY_DELETE=pack-B-$B.pack \ test_pack_objects_reused_all 6 2 && test_env GIT_RACILY_DELETE=pack-C-$C.pack \ test_pack_objects_reused_all 6 2 ) ' Note that we could relax the single-pack version of this which was most recently addressed in dc1daacdcc (pack-bitmap: check pack validity when opening bitmap, 2021-07-23), but only partially. Because we still need to know the object count in the pack, we'd still have to open the pack's *.idx, so the savings there are marginal. Note likewise that we add a new "if (!packs_nr)" early return in the pack reuse code to avoid a potentially expensive allocation on the 'reuse' bitmap in the case that no packs are available for reuse. [^1]: Unless we run out of open file handles. If that happens and we are forced to close the only open file handle of a file that has been removed from underneath us, there is nothing we can do. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13Merge branch 'kn/midx-wo-the-repository'Junio C Hamano1-38/+58
Yet another "pass the repository through the callchain" topic. * kn/midx-wo-the-repository: midx: inline the `MIDX_MIN_SIZE` definition midx: pass down `hash_algo` to functions using global variables midx: pass `repository` to `load_multi_pack_index` midx: cleanup internal usage of `the_repository` and `the_hash_algo` midx-write: pass down repository to `write_midx_file[_only]` write-midx: add repository field to `write_midx_context` midx-write: use `revs->repo` inside `read_refs_snapshot` midx-write: pass down repository to static functions packfile.c: remove unnecessary prepare_packed_git() call midx: add repository to `multi_pack_index` struct config: make `packed_git_(limit|window_size)` non-global variables config: make `delta_base_cache_limit` a non-global variable packfile: pass down repository to `for_each_packed_object` packfile: pass down repository to `has_object[_kept]_pack` packfile: pass down repository to `odb_pack_name` packfile: pass `repository` to static function in the file packfile: use `repository` from `packed_git` directly packfile: add repository to struct `packed_git`
2024-12-06global: mark code units that generate warnings with `-Wsign-compare`Patrick Steinhardt1-0/+1
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04midx: pass down `hash_algo` to functions using global variablesKarthik Nayak1-3/+3
The functions `get_split_midx_filename_ext()`, `get_midx_filename()` and `get_midx_filename_ext()` use `hash_to_hex()` which internally uses the `the_hash_algo` global variable. Remove this dependency on global variables by passing down the `hash_algo` through to the functions mentioned and instead calling `hash_to_hex_algop()` along with the obtained `hash_algo`. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04Merge branch 'tb/boundary-traversal-fix'Junio C Hamano1-1/+1
A trivial "correctness" fix that does not yet matter in practice. * tb/boundary-traversal-fix: pack-bitmap.c: typofix in `find_boundary_objects()`
2024-12-04midx: add repository to `multi_pack_index` structKarthik Nayak1-35/+55
The `multi_pack_index` struct represents the MIDX for a repository. Here, we add a pointer to the repository in this struct, allowing direct use of the repository variable without relying on the global `the_repository` struct. With this addition, we can determine the repository associated with a `bitmap_index` struct. A `bitmap_index` points to either a `packed_git` or a `multi_pack_index`, both of which have direct repository references. To support this, we introduce a static helper function, `bitmap_repo`, in `pack-bitmap.c`, which retrieves a repository given a `bitmap_index`. With this, we clear up all usages of `the_repository` within `pack-bitmap.c` and also remove the `USE_THE_REPOSITORY_VARIABLE` definition. Bringing us another step closer to remove all global variable usage. Although this change also opens up the potential to clean up `midx.c`, doing so would require additional refactoring to pass the repository struct to functions where the MIDX struct is created: a task better suited for future patches. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04packfile: pass down repository to `has_object[_kept]_pack`Karthik Nayak1-1/+1
The functions `has_object[_kept]_pack` currently rely on the global variable `the_repository`. To eliminate global variable usage in `packfile.c`, we should progressively shift the dependency on the_repository to higher layers. Let's remove its usage from these functions and any related ones. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-22pack-bitmap.c: typofix in `find_boundary_objects()`Taylor Blau1-1/+1
In the boundary-based bitmap traversal, we use the given 'rev_info' structure to first do a commit-only walk in order to determine the boundary between interesting and uninteresting objects. That walk only looks at commit objects, regardless of the state of revs->blob_objects, revs->tree_objects, and so on. In order to do this, we store the state of these variables in temporary fields before setting them back to zero, performing the traversal, and then setting them back. But there is a typo here that dates back to b0afdce5da (pack-bitmap.c: use commit boundary during bitmap traversal, 2023-05-08), where we incorrectly store the value of the "tags" field as "revs->blob_objects". This could lead to problems later on if, say, the caller wants tag objects but *not* blob objects. In the pre-image behavior, we'd set revs->tag_objects back to the old value of revs->blob_objects, thus emitting fewer objects than expected back to the caller. Fix that by correctly assigning the value of 'revs->tag_objects' to the 'tmp_tags' field. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>