summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorLines
2024-09-03rebase: apply and cleanup autostash when rebase fails to startPhillip Wood-10/+81
If "git rebase" fails to start after stashing the user's uncommitted changes then it forgets to restore the stashed changes and remove the state directory. To make matters worse, running "git rebase --abort" to apply the stashed changes and cleanup the state directory fails because the state directory only contains the "autostash" file and is missing the "head-name" and "onto" files required by read_basic_state(). Fix this by applying the autostash and removing the state directory if the pre-rebase hook or initial checkout fail. This matches what finish_rebase() does at the end of a successful rebase. If the user modifies any files after the autostash is created it is possible there will be conflicts when the autostash is applied. In that case apply_autostash() saves the stash in a new entry under refs/stash and so it is safe to remove the state directory containing the autostash file. New tests are added to check the autostash is applied and the state directory is removed if the rebase fails to start. Checks are also added to some existing tests in order to ensure there is no state directory left behind when a rebase fails to start and no autostash has been created. Reported-by: Brian Lyles <brianmlyles@gmail.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-03Documentation/BreakingChanges: announce removal of git-pack-redundant(1)Patrick Steinhardt-0/+20
The git-pack-redundant(1) command is already in the process of being phased out and dies unless the user passes the `--i-still-use-this` flag since 4406522b76 (pack-redundant: escalate deprecation warning to an error, 2023-03-23). We haven't heard any complaints, so let's announce the removal of this command in Git 3.0 in our breaking changes document. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-03The twelfth batchJunio C Hamano-0/+11
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-03Merge branch 'jc/config-doc-update'Junio C Hamano-3/+3
Docfix. * jc/config-doc-update: git-config.1: fix description of --regexp in synopsis git-config.1: --get-all description update
2024-09-03Merge branch 'rs/remote-leakfix'Junio C Hamano-8/+15
Leakfix. * rs/remote-leakfix: remote: plug memory leaks at early returns
2024-09-03Merge branch 'ps/reftable-concurrent-compaction'Junio C Hamano-194/+316
The code path for compacting reftable files saw some bugfixes against concurrent operation. * ps/reftable-concurrent-compaction: reftable/stack: fix segfault when reload with reused readers fails reftable/stack: reorder swapping in the reloaded stack contents reftable/reader: keep readers alive during iteration reftable/reader: introduce refcounting reftable/stack: fix broken refnames in `write_n_ref_tables()` reftable/reader: inline `reader_close()` reftable/reader: inline `init_reader()` reftable/reader: rename `reftable_new_reader()` reftable/stack: inline `stack_compact_range_stats()` reftable/blocksource: drop malloc block source
2024-09-03Merge branch 'js/fetch-push-trace2-annotation'Junio C Hamano-8/+67
More trace2 events at key points on push and fetch code paths have been added. * js/fetch-push-trace2-annotation: send-pack: add new tracing regions for push fetch: add top-level trace2 regions trace2: implement trace2_printf() for event target
2024-09-03Merge branch 'aa/cat-file-batch-output-doc'Junio C Hamano-3/+3
Docfix. * aa/cat-file-batch-output-doc: docs: explain the order of output in the batched mode of git-cat-file(1)
2024-09-03Merge branch 'dh/runtime-prefix-on-zos'Junio C Hamano-0/+32
Support for the RUNTIME_PREFIX feature has been added to z/OS port. * dh/runtime-prefix-on-zos: exec_cmd: RUNTIME_PREFIX on z/OS systems
2024-09-03Merge branch 'ps/leakfixes-part-5'Junio C Hamano-73/+237
Even more leak fixes. * ps/leakfixes-part-5: transport: fix leaking negotiation tips transport: fix leaking arguments when fetching from bundle builtin/fetch: fix leaking transaction with `--atomic` remote: fix leaking peer ref when expanding refmap remote: fix leaks when matching refspecs remote: fix leaking config strings builtin/fetch-pack: fix leaking refs sideband: fix leaks when configuring sideband colors builtin/send-pack: fix leaking refspecs transport: fix leaking OID arrays in git:// transport data t/helper: fix leaking multi-pack-indices in "read-midx" builtin/repack: fix leaks when computing packs to repack midx-write: fix leaking hashfile on error cases builtin/archive: fix leaking `OPT_FILENAME()` value builtin/upload-archive: fix leaking args passed to `write_archive()` builtin/merge-tree: fix leaking `-X` strategy options pretty: fix leaking key/value separator buffer pretty: fix memory leaks when parsing pretty formats convert: fix leaks when resetting attributes mailinfo: fix leaking header data
2024-09-03Merge branch 'cl/config-regexp-docfix'Junio C Hamano-1/+1
Docfix. * cl/config-regexp-docfix: doc: replace 3 dash with correct 2 dash in git-config(1)
2024-09-01mergetools: vscode: new toolAlex Henrie-0/+19
VSCode has supported three-way merges since 2022, see <https://github.com/microsoft/vscode/issues/5770#issuecomment-1188658476>. Although the program binary is located at /usr/bin/code, name the mergetool "vscode" because the word "code" is too generic and would lead to confusion. The name "vscode" also matches Git's existing contrib/vscode directory. On Windows, VSCode adds the directory that contains code.cmd to %PATH%, so there is no need to invoke mergetool_find_win32_cmd to search for the program. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-01t: port helper/test-oid-array.c to unit-tests/t-oid-array.cGhanshyam Thakkar-175/+136
helper/test-oid-array.c along with t0064-oid-array.sh test the oid-array.h API, which provides storage and processing efficiency over large lists of object identifiers. Migrate them to the unit testing framework for better runtime performance and efficiency. As we don't initialize a repository in these tests, the hash algo that functions like oid_array_lookup() use is not initialized, therefore call repo_set_hash_algo() to initialize it. And init_hash_algo():lib-oid.c can aid in this process, so make it public. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-01compat/terminal: mark parameter of git_terminal_prompt() UNUSEDRamsay Jones-1/+1
If neither HAVE_DEV_TTY nor GIT_WINDOWS_NATIVE is set, the fallback code calls the system getpass(). This unfortunately ignores the "echo" boolean parameter, as we have no way to implement that functionality. But we still have to keep the unused parameter, since our interface has to match the other implementations. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-30revision: free commit buffers for skipped commitsJeff King-0/+1
In git-log we leave the save_commit_buffer flag set to "1", which tells the commit parsing code to store the object content after it has parsed it to find parents, tree, etc. That lets us reuse the contents for pretty-printing the commit in the output. And then after printing each commit, we call free_commit_buffer(), since we don't need it anymore. But some options may cause us to traverse commits which are not part of the output. And so git-log does not see them at all, and doesn't free them. One such case is something like: git log -n 1000 --skip=1000000 which will churn through a million commits, before showing only a thousand. We loop through these inside get_revision(), without freeing the contents. As a result, we end up storing the object data for those million commits simultaneously. We should free the stored buffers (if any) for those commits as we skip over them, which is what this patch does. Running the above command in linux.git drops the peak heap usage from ~1.1GB to ~200MB, according to valgrind/massif. (I thought we might get an even bigger improvement, but the remaining memory is going to commit/tree structs, which we do hold on to forever). Note that this problem doesn't occur if: - you're running a git-rev-list without a --format parameter; it turns off save_commit_buffer by default, since it only output the object id - you've built a commit-graph file, since in that case we'd use the optimized graph data instead of the initial parse, and then do a lazy parse for commits we're actually going to output There are probably some other option combinations that can likewise end up with useless stored commit buffers. For example, if you ask for "foo..bar", then we'll have to walk down to the merge base, and everything on the "foo" side won't be shown. Tuning the "save" behavior to handle that might be tricky (I guess maybe drop buffers for anything we mark as UNINTERESTING?). And in the long run, the right solution here is probably to make sure the commit-graph is built (since it fixes the memory problem _and_ drastically reduces CPU usage). But since this "--skip" case is an easy one-liner, it's worth fixing in the meantime. It should be OK to make this call even if there is no saved buffer (e.g., because save_commit_buffer=0, or because a commit-graph was used), since it's O(1) to look up the buffer and is a noop if it isn't present. I verified by running the above command after "git commit-graph write --reachable", and it takes the same time with and without this patch. Reported-by: Yuri Karnilaev <karnilaev@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-30refs/files-backend: work around -Wunused-parameterJunio C Hamano-2/+5
This is needed to build things with -Werror=unused-parameter on a platform without symbolic link support. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-29grep: prefer UNUSED to MAYBE_UNUSED for pcre allocatorsJeff King-2/+2
We provide custom malloc/free callbacks for the pcre library to use. Those take an extra "data" parameter, but we don't use it. Back when these were added in 513f2b0bbd (grep: make PCRE2 aware of custom allocator, 2019-10-16), we only had MAYBE_UNUSED. But these days we have UNUSED, which we should prefer, as it will let the compiler inform us if the code changes to actually use the parameters. I also moved the annotations to come after the variable name, which is how we typically spell it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-29gc: drop MAYBE_UNUSED annotation from used parameterJeff King-1/+1
The "opts" parameter is always used, so marking it with MAYBE_UNUSED is just confusing. This annotation goes back to 41abfe15d9 (maintenance: add pack-refs task, 2021-02-09), when it really was unused. Back then we did not have the UNUSED macro that would complain if the code changed to use the parameter. So when we started using it in bfc2f9eb8e (builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs, 2024-03-25), nobody noticed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-29CodingGuidelines: also mention MAYBE_UNUSEDJunio C Hamano-2/+27
A function that uses a parameter in one build may lose all uses of the parameter in another build, depending on the configuration. A workaround for such a case, MAYBE_UNUSED, should also be mentioned when we recommend the use of UNUSED to our developers. Keep the addition to the guideline short and document the criteria to choose between UNUSED and MAYBE_UNUSED near their definition. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-29Merge branch 'jk/unused-parameters' into jc/maybe-unusedJunio C Hamano-33/+46
* jk/unused-parameters: CodingGuidelines: mention -Wunused-parameter and UNUSED config.mak.dev: enable -Wunused-parameter by default compat: mark unused parameters in win32/mingw functions compat: disable -Wunused-parameter in win32/headless.c compat: disable -Wunused-parameter in 3rd-party code t-reftable-readwrite: mark unused parameter in callback function gc: mark unused config parameter in virtual functions
2024-08-29The eleventh batchJunio C Hamano-2/+8
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-29Merge branch 'ds/sparse-diff-index'Junio C Hamano-3/+16
The underlying machinery for "git diff-index" has long been made to expand the sparse index as needed, but the command fully expanded the sparse index upfront, which now has been taught not to do. * ds/sparse-diff-index: diff-index: integrate with the sparse index
2024-08-29Merge branch 'cp/unit-test-reftable-block'Junio C Hamano-126/+371
Another test for reftable library ported to the unit test framework. * cp/unit-test-reftable-block: t-reftable-block: mark unused argv/argc t-reftable-block: add tests for index blocks t-reftable-block: add tests for obj blocks t-reftable-block: add tests for log blocks t-reftable-block: remove unnecessary variable 'j' t-reftable-block: use xstrfmt() instead of xstrdup() t-reftable-block: use block_iter_reset() instead of block_iter_close() t-reftable-block: use reftable_record_key() instead of strbuf_addstr() t-reftable-block: use reftable_record_equal() instead of check_str() t-reftable-block: release used block reader t: harmonize t-reftable-block.c with coding guidelines t: move reftable/block_test.c to the unit testing framework
2024-08-29Merge branch 'ps/reftable-drop-generic'Junio C Hamano-816/+422
The code in the reftable library has been cleaned up by discarding unused "generic" interface. * ps/reftable-drop-generic: reftable: mark unused parameters in empty iterator functions reftable/generic: drop interface t/helper: refactor to not use `struct reftable_table` t/helper: use `hash_to_hex_algop()` to print hashes t/helper: inline printing of reftable records t/helper: inline `reftable_table_print()` t/helper: inline `reftable_stack_print_directory()` t/helper: inline `reftable_reader_print_file()` t/helper: inline `reftable_dump_main()` reftable/dump: drop unused `compact_stack()` reftable/generic: move generic iterator code into iterator interface reftable/iter: drop double-checking logic reftable/stack: open-code reading refs reftable/merged: stop using generic tables in the merged table reftable/merged: rename `reftable_new_merged_table()` reftable/merged: expose functions to initialize iterators
2024-08-28The tenth batchJunio C Hamano-0/+5
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28Merge branch 'ah/git-prompt-portability'Junio C Hamano-65/+126
The command line prompt support used to be littered with bash-isms, which has been corrected to work with more shells. * ah/git-prompt-portability: git-prompt: support custom 0-width PS1 markers git-prompt: ta-da! document usage in other shells git-prompt: don't use shell $'...' git-prompt: add some missing quotes git-prompt: replace [[...]] with standard code git-prompt: don't use shell arrays git-prompt: fix uninitialized variable git-prompt: use here-doc instead of here-string
2024-08-28Merge branch 'gt/unit-test-urlmatch-normalization'Junio C Hamano-261/+272
Another rewrite of test. * gt/unit-test-urlmatch-normalization: t: migrate t0110-urlmatch-normalization to the new framework
2024-08-28Merge branch 'mt/rebase-x-quiet'Junio C Hamano-3/+12
"git rebase -x --quiet" was not quiet, which was corrected. * mt/rebase-x-quiet: rebase --exec: respect --quiet
2024-08-28reftable: mark unused parameters in empty iterator functionsJeff King-3/+3
These unused parameters were marked in a68ec8683a (reftable: mark unused parameters in virtual functions, 2024-08-17), but the functions were moved to a new file in a parallel branch via f2406c81b9 (reftable/generic: move generic iterator code into iterator interface, 2024-08-22). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28t-reftable-block: mark unused argv/argcJeff King-1/+1
This is conceptually the same as the cases in df9d638c24 (unit-tests: ignore unused argc/argv, 2024-08-17), but this unit test was migrated from the reftable tests in a parallel branch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28CodingGuidelines: mention -Wunused-parameter and UNUSEDJeff King-0/+7
Now that -Wunused-parameter is on by default for DEVELOPER=1 builds, people may trigger it, blocking their build. When it's a mistake for the parameter to exist, the path forward is obvious: remove it. But sometimes you need to suppress the warning, and the "UNUSED" mechanism for that is specific to our project, so people may not know about it. Let's put some advice in CodingGuidelines, including an example warning message. That should help people who grep for the warning text after seeing it from the compiler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28config.mak.dev: enable -Wunused-parameter by defaultJeff King-1/+0
Having now removed or annotated all of the unused function parameters in our code base, I found that each instance falls into one of three categories: 1. ignoring the parameter is a bug (e.g., a function takes a ptr/len pair, but ignores the length). Detecting these helps us find the bugs. 2. the parameter is unnecessary (and usually left over from a refactoring or earlier iteration of a patches series). Removing these cleans up the code. 3. the function has to conform to a specific interface (because it's used via a function pointer, or matches something on the other side of an #ifdef). These ones are annoying, but annotating them with UNUSED is not too bad (especially if the compiler tells you about the problem promptly). Certainly instances of (3) are more common than (1), but after finding all of these, I think there were enough cases of (1) that it justifies the work in annotating all of the (3)s. And since the code base is now at a spot where we compile cleanly with -Wunused-parameter, turning it on will make it the responsibility of individual patch writers going forward. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28compat: mark unused parameters in win32/mingw functionsJeff King-23/+24
The compat/ directory contains many stub functions, wrappers, and so on that have to conform to a specific interface, but don't necessarily need to use all of their parameters. Let's mark them to avoid complaints from -Wunused-parameter. This was done mostly via guess-and-check with the Windows build in GitHub CI. I also confirmed that the win+VS build is similarly happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28compat: disable -Wunused-parameter in win32/headless.cJeff King-0/+2
As with the files touched in the previous commit, win32/headless.c does not include git-compat-util.h, so it doesn't have our UNUSED macro. Unlike those ones, this is not third-party code, so it would not be a big deal to modify it. However, I'm not sure if including git-compat-util.h would create other headaches (and I don't even have a machine to test this on; I'm relying on Windows CI to compile it at all). Given how trivial the file is, and that the unused parameters are not interesting (they are just boilerplate for the wWinMain() function), we can just use the same trick as the previous commit and disable the warnings via pragma. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28compat: disable -Wunused-parameter in 3rd-party codeJeff King-0/+4
We carry some vendored 3rd-party code in compat/ that does not build cleanly with -Wunused-parameters. We could mark these with UNUSED, but there are two reasons not to: 1. This is code imported from elsewhere, so we'd prefer to avoid modifying it in an invasive way that could create conflicts if we tried to pull in a new version. 2. These files don't include git-compat-util.h at all, so we'd need to factor out (or repeat) our UNUSED macro. In theory we could modify the build process to invoke the compiler with the extra warning disabled for these files, but there are tricky corner cases there (e.g., for NO_REGEX we cannot assume that the compiler understands -Wno-unused-parameter as an option, so we'd have to use our detect-compiler script). Instead, let's rely on the gcc diagnostic #pragma. This is horribly unportable, of course, but it should do what we want. Compilers which don't understand this particular pragma should ignore it (per the standard), and compilers which do care about "-Wunused-parameter" will hopefully respect it, even if they are not gcc (e.g., clang does). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28t-reftable-readwrite: mark unused parameter in callback functionJeff King-1/+1
This spot was originally marked in in 4695c3f3a9 (reftable: mark unused parameters in virtual functions, 2024-08-17), but was copied in 5b539a5361 (t: move reftable/readwrite_test.c to the unit testing framework, 2024-08-13). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28gc: mark unused config parameter in virtual functionsJeff King-8/+8
Commit d1ae15d68b (builtin/gc: refactor to read config into structure, 2024-08-16) added a new parameter to the maintenance_task virtual functions, but most of them don't need to look at it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27send-email: add mailmap support via sendemail.mailmap and --mailmapJacob Keller-0/+142
In some cases, a user may be generating a patch for an old commit which now has an out-of-date author or other identity. For example, consider a team member who contributes to an internal fork of an upstream project, but leaves before this change is submitted upstream. In this case, the team members company address may no longer be valid, and will thus bounce when sending email. This can be manually avoided by editing the generated patch files, or by carefully using --suppress-<cc|to> options. This requires a lot of manual intervention and is easy to forget. Git has support for mapping old email addresses and names to a canonical name and address via the .mailmap file (and its associated mailmap.file, mailmap.blob, and log.mailmap options). Teach git send-email to enable mailmap support for all addresses. This ensures that addresses point to the canonical real name and email address. Add the sendemail.mailmap configuration option and its associated --mailmap (and --use-mailmap for compatibility with git log) options. For now, the default behavior is to disable the mailmap in order to avoid any surprises or breaking any existing setups. These options support per-identity configuration via the sendemail.identity configuration blocks. This enables identity-specific configuration in cases where users may not want to enable support. In addition, support send-email specific mailmap data via sendemail.mailmap.file, sendemail.mailmap.blob and their identity-specific variants. The intention of these options is to enable mapping addresses which are no longer valid to a current project or team maintainer. Such mappings may change the actual person being referred to, and may not make sense in a traditional mailmap file which is intended for updating canonical name and address for the same individual. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27check-mailmap: add options for additional mailmap sourcesJacob Keller-6/+27
The git check-mailmap command reads the mailmap from either the default .mailmap location and then from the mailmap.blob and mailmap.file configurations. A following change to git send-email will want to support new configuration options based on the configured identity. The identity-based configuration and options only make sense in the context of git send-email. Expose the read_mailmap_file and read_mailmap_blob functions from mailmap.c. Teach git check-mailmap the --mailmap-file and --mailmap-blob options which load the additional mailmap sources. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27check-mailmap: accept "user@host" contactsJacob Keller-15/+53
git check-mailmap splits each provided contact using split_ident_line. This function requires that the contact either be of the form "Name <user@host>" or of the form "<user@host>". In particular, if the mail portion of the contact is not surrounded by angle brackets, split_ident_line will reject it. This results in git check-mailmap rejecting attempts to translate simple email addresses: $ git check-mailmap user@host fatal: unable to parse contact: user@host This limits the usability of check-mailmap as it requires placing angle brackets around plain email addresses. In particular, attempting to use git check-mailmap to support mapping addresses in git send-email is not straight forward. The sanitization and validation functions in git send-email strip angle brackets from plain email addresses. It is not trivial to add brackets prior to invoking git check-mailmap. Instead, modify check_mailmap() to allow such strings as contacts. In particular, treat any line which cannot be split by split_ident_line as a simple email address. No attempt is made to actually parse the address line, or validate that it is actually an email address. Implementing such validation is not trivial. Besides, we weren't validating the address between angle brackets before anyways. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27builtin/pack-objects.c: do not open-code `MAX_PACK_OBJECT_HEADER`Taylor Blau-1/+1
The function `write_reused_pack_one()` defines an header to store the OFS_DELTA header, but uses the constant "10" instead of "MAX_PACK_OBJECT_HEADER" (as is done elsewhere in the same patch, circa bb514de356c (pack-objects: improve partial packfile reuse, 2019-12-18)). Declare the `ofs_header` field to be sized according to `MAX_PACK_OBJECT_HEADER` (which is 10, as defined in "pack.h") instead of the constant 10. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27pack-bitmap.c: avoid repeated `pack_pos_to_offset()` during reuseTaylor Blau-4/+7
When calling `try_partial_reuse()`, the (sole) caller from the function `reuse_partial_packfile_from_bitmap_1()` has to translate its bit position to a pack position. In the MIDX bitmap case, the caller translates from the bit position, to a position in the MIDX's pseudo-pack order (with `pack_pos_to_midx()`), then get a pack offset (with `nth_midxed_offset()`) before finally working backwards to get the pack position in the source pack by calling `offset_to_pack_pos()`. In the non-MIDX bitmap case, we can use the bit position as the pack position directly (see the comment at the beginning of the `reuse_partial_packfile_from_bitmap_1()` function for why). In either case, the first thing that `try_partial_reuse()` does after being called is determine the offset of the object at the given pack position by calling `pack_pos_to_offset()`. But we already have that information in the MIDX case! Avoid re-computing that information by instead passing it in. In the MIDX case, we already have that information stored. In the non-MIDX case, the call to `pack_pos_to_offset()` moves from the function `try_partial_reuse()` to its caller. In total, we'll save one call to `pack_pos_to_offset()` when processing MIDX bitmaps. (On my machine, there is a slight speed-up on the order of ~2ms, but it is within the margin of error over 10 runs, so I think you'd have to have a truly gigantic repository to confidently measure any significant improvement here). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27builtin/pack-objects.c: translate bit positions during pack-reuseTaylor Blau-12/+63
When reusing chunks verbatim from an existing source pack, the function write_reused_pack() first attempts to reuse whole words (via the function `write_reused_pack_verbatim()`), and then individual bits (via `write_reused_pack_one()`). In the non-MIDX case, all of this code works fine. Likewise, in the MIDX case, processing bits individually from the first (preferred) pack works fine. However, processing subsequent packs in the MIDX case is broken when there are duplicate objects among the set of MIDX'd packs. This is because we treat the individual bit positions as valid pack positions within the source pack(s), which does not account for gaps in the source pack, like we see when the MIDX must break ties between duplicate objects which appear in multiple packs. The broken code looks like: for (; i < reuse_packfile_bitmap->word_alloc; i++) { for (offset = 0; offset < BITS_IN_EWORD, offset++) { /* ... */ write_reused_pack_one(reuse_packfile->p, pos + offset - reuse_packfile->bitmap_pos, f, pack_start, &w_curs); } } , where the second argument is incorrect and does not account for gaps. Instead, make sure that we translate bit positions in the MIDX's pseudo-pack order to pack positions in the respective source packs by: - Translating the bit position (pseudo-pack order) to a MIDX position (lexical order). - Use the MIDX position to obtain the offset at which the given object occurs in the source pack. - Then translate that offset back into a pack relative position within the source pack by calling offset_to_pack_pos(). After doing this, then we can safely use the result as a pack position. Note that when doing single-pack reuse, as well as reusing objects from the MIDX's preferred pack, such translation is not necessary, since either ties are broken in favor of the preferred pack, or there are no ties to break at all (in the case of non-MIDX bitmaps). Failing to do this can result in strange failure modes. One example that can occur when misinterpreting bits in the above fashion is that Git thinks it's supposed to send a delta that the caller does not want. Under this (incorrect) assumption, we try to look up the delta's base (so that we can patch any OFS_DELTAs if necessary). We do this using find_reused_offset(). But if we try and call that function for an offset belonging to an object we did not send, we'll get back garbage. This can result in us computing a negative fixup value, which results in memory corruption when trying to write the (patched) OFS_DELTA header. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27pack-bitmap: tag bitmapped packs with their corresponding MIDXTaylor Blau-0/+3
The next commit will need to use the bitmap's MIDX (if one exists) to translate bit positions into pack-relative positions in the source pack. Ordinarily, we'd use the "midx" field of the bitmap_index struct. But since that struct is defined within pack-bitmap.c, and our caller is in a separate compilation unit, we do not have access to the MIDX field. Instead, add a "from_midx" field to the bitmapped_pack structure so that we can use that piece of data from outside of pack-bitmap.c. The caller that uses this new piece of information will be added in the following commit. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27t/t5332-multi-pack-reuse.sh: verify pack generation with --strictTaylor Blau-8/+12
In our tests for multi-pack reuse, we have two helper functions: - test_pack_objects_reused_all(), and - test_pack_objects_reused() which invoke pack-objects (either with `--all`, or the supplied tips via stdin, respectively) and ensure that (a) the number of reused objects, and (b) the number of packs which those objects were reused from both match the expected values. Both functions discard the output of pack-objects and assert only on the contents of the trace2 stream. However, if we store the pack and attempt to index it with `--strict`, we find that a number of our tests are broken, indicating a bug within multi-pack reuse. That bug will be addressed in a subsequent commit. But let's first harden these tests by trying to index the resulting pack, marking the tests which fail appropriately. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-26git-config.1: fix description of --regexp in synopsisJunio C Hamano-2/+2
The synopsis says --regexp=<regexp> but the --regexp option is a Boolean that says "the name given is not literal, but a pattern to match the name". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-26git-config.1: --get-all description updateJunio C Hamano-1/+1
"git config --get-all foo.bar" shows all values for the foo.bar variable, but does not give the variable name in each output entry. Hence it is equivalent to "git config get --all foo.bar", without "--show-names", in the more modern syntax. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-26Sync with 'maint'Junio C Hamano-15/+22
2024-08-26The ninth batchJunio C Hamano-0/+21
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-26Merge branch 'jc/coding-style-c-operator-with-spaces'Junio C Hamano-1/+10
Write down whitespacing rules around C opeators. * jc/coding-style-c-operator-with-spaces: CodingGuidelines: spaces around C operators