aboutsummaryrefslogtreecommitdiffstats
path: root/builtin (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-03-18am: switch from merge_recursive_generic() to merge_ort_generic()Elijah Newren1-2/+3
Switch from merge-recursive to merge-ort. Adjust the following testcases due to the switch: * t4151: This test left an untracked file in the way of the merge. merge-recursive could only sometimes tell when untracked files were in the way, and by the time it discovers others, it has already made too many changes to back out of the merge. So, instead of writing the results to e.g. 'file1' it would instead write them to 'file1~branch1'. This is confusing for users, because they might not notice 'file1~branch1' and accidentally add and commit 'file1'. In contrast, merge-ort correctly notices the file in the way before making any changes and aborts. Since this test didn't care about the file in the way, just remove it before calling git-am. * t4255: Usage of merge-ort allows us to change two known failures into successes. * t6427: As noted a few commits ago, the choice of conflict label for diff3 markers for the ancestor commit was previously handled by merge-recursive.c rather than by callers. Since that has now changed, `git am` needs to specify that label. Although the previous conflict label ("constructed merge base") was already fairly somewhat slanted towards `git am`, let's use wording more along the lines of the related command-line flag from `git apply` and function involved to tie it more closely to `git am`. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-17Merge branch 'tb/multi-cruft-pack-refresh-fix' into tb/combine-cruft-below-sizeJunio C Hamano1-17/+101
* tb/multi-cruft-pack-refresh-fix: builtin/pack-objects.c: freshen objects from existing cruft packs
2025-03-17reflog: implement subcommand to drop reflogsKarthik Nayak1-1/+65
While 'git-reflog(1)' currently allows users to expire reflogs and delete individual entries, it lacks functionality to completely remove reflogs for specific references. This becomes problematic in repositories where reflogs are not needed but continue to accumulate entries despite setting 'core.logAllRefUpdates=false'. Add a new 'drop' subcommand to git-reflog that allows users to delete the entire reflog for a specified reference. Include an '--all' flag to enable dropping all reflogs from all worktrees and an addon flag '--single-worktree', to only drop all reflogs from the current worktree. While here, remove an extraneous newline in the file. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-17reflog: improve error for when reflog is not foundKarthik Nayak1-1/+1
The 'git reflog expire' prints the error message '<ref> points nowhere!' when used with a non-existent ref. This message is a bit confusing and vague. Modify the message to be more clear and direct. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-17stash: remove merge-recursive.h includeElijah Newren1-1/+0
stash was modified to use merge_ort_nonrecursive() instead of merge_recursive_generic() back in commit 874cf2a60444 (stash: apply stash using 'merge_ort_nonrecursive()', 2022-05-10). That makes the inclusion of merge-recursive.h unnecessary. In preparation for the removal of merge-recursive.h, remove the unnecessary include. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-13builtin/pack-objects.c: freshen objects from existing cruft packsTaylor Blau1-17/+101
Once an object is written into a cruft pack, we can only freshen it by writing a new loose or packed copy of that object with a more recent mtime. Prior to 61568efa95 (builtin/pack-objects.c: support `--max-pack-size` with `--cruft`, 2023-08-28), we typically had at most one cruft pack in a repository at any given time. So freshening unreachable objects was straightforward when already rewriting the cruft pack (and its *.mtimes file). But 61568efa95 changes things: 'pack-objects' now supports writing multiple cruft packs when invoked with `--cruft` and the `--max-pack-size` flag. Cruft packs are rewritten until they reach some size threshold, at which point they are considered "frozen", and will only be modified in a pruning GC, or if the threshold itself is adjusted. Prior to this patch, however, this process breaks down when we attempt to freshen an object packed in an earlier cruft pack, and that cruft pack is larger than the threshold and thus will survive the repack. When this is the case, it is impossible to freshen objects in cruft pack(s) when those cruft packs are larger than the threshold. This is because we would avoid writing them in the new cruft pack entirely, for a couple of reasons. 1. When enumerating packed objects via 'add_objects_in_unpacked_packs()' we pass the SKIP_IN_CORE_KEPT_PACKS, which is used to avoid looping over the packs we're going to retain (which are marked as kept in-core by 'read_cruft_objects()'). This means that we will avoid enumerating additional packed copies of objects found in any cruft packs which are larger than the given size threshold. Thus there is no opportunity to call 'create_object_entry()' whatsoever. 2. We likewise will discard the loose copy (if one exists) of any unreachable object packed in a cruft pack that is larger than the threshold. Here our call path is 'add_unreachable_loose_objects()', which uses the 'add_loose_object()' callback. That function will eventually land us in 'want_object_in_pack()' (via 'add_cruft_object_entry()'), and we'll discard the object as it appears in one of the packs which we marked as kept in-core. This means in effect that it is impossible to freshen an unreachable object once it appears in a cruft pack larger than the given threshold. Instead, we should pack an additional copy of an unreachable object we want to freshen even if it appears in a cruft pack, provided that the cruft copy has an mtime which is before the mtime of the copy we are trying to pack/freshen. This is sub-optimal in the sense that it requires keeping an additional copy of unreachable objects upon freshening, but we don't have a better alternative without the ability to make in-place modifications to existing *.mtimes files. In order to implement this, we have to adjust the behavior of 'want_found_object()'. When 'pack-objects' is told that we're *not* going to retain any cruft packs (i.e. the set of packs marked as kept in-core does not contain a cruft pack), the behavior is unchanged. But when there *is* at least one cruft pack that we're holding onto, it is no longer sufficient to reject a copy of an object found in that cruft pack for that reason alone. In this case, we only want to reject a candidate object when copies of that object either: - exists in a non-cruft pack that we are retaining, regardless of that pack's mtime, or - exists in a cruft pack with an mtime at least as recent as the copy we are debating whether or not to pack, in which case freshening would be redundant. To do this, keep track of whether or not we have any cruft packs in our in-core kept list with a new 'ignore_packed_keep_in_core_has_cruft' flag. When we end up in this new special case, we replace a call to 'has_object_kept_pack()' to 'want_cruft_object_mtime()', and only reject objects when we have a copy in an existing cruft pack with at least as recent an mtime as our candidate (in which case "freshening" would be redundant). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-12refs/iterator: separate lifecycle from iterationPatrick Steinhardt1-0/+2
The ref and reflog iterators have their lifecycle attached to iteration: once the iterator reaches its end, it is automatically released and the caller doesn't have to care about that anymore. When the iterator should be released before it has been exhausted, callers must explicitly abort the iterator via `ref_iterator_abort()`. This lifecycle is somewhat unusual in the Git codebase and creates two problems: - Callsites need to be very careful about when exactly they call `ref_iterator_abort()`, as calling the function is only valid when the iterator itself still is. This leads to somewhat awkward calling patterns in some situations. - It is impossible to reuse iterators and re-seek them to a different prefix. This feature isn't supported by any iterator implementation except for the reftable iterators anyway, but if it was implemented it would allow us to optimize cases where we need to search for specific references repeatedly by reusing internal state. Detangle the lifecycle from iteration so that we don't deallocate the iterator anymore once it is exhausted. Instead, callers are now expected to always call a newly introduce `ref_iterator_free()` function that deallocates the iterator and its internal state. Note that the `dir_iterator` is somewhat special because it does not implement the `ref_iterator` interface, but is only used to implement other iterators. Consequently, we have to provide `dir_iterator_free()` instead of `dir_iterator_release()` as the allocated structure itself is managed by the `dir_iterator` interfaces, as well, and not freed by `ref_iterator_free()` like in all the other cases. While at it, drop the return value of `ref_iterator_abort()`, which wasn't really required by any of the iterator implementations anyway. Furthermore, stop calling `base_ref_iterator_free()` in any of the backends, but instead call it in `ref_iterator_free()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-12builtin/update-ref: skip ambiguity checks when parsing object IDsPatrick Steinhardt1-5/+10
Most of the commands in git-update-ref(1) accept an old and/or new object ID to update a specific reference to. These object IDs get parsed via `repo_get_oid()`, which not only handles plain object IDs, but also those that have a suffix like "~" or "^2". More surprisingly though, it even knows to resolve arbitrary revisions, despite the fact that its manpage does not mention this fact even once. One consequence of this is that we also check for ambiguous references: when parsing a full object ID where the DWIM mechanism would also cause us to resolve it as a branch, we'd end up printing a warning. While this check makes sense to have in general, it is arguably less useful in the context of git-update-ref(1). This is due to multiple reasons: - The manpage is explicitly structured around object IDs. So if we see a fully blown object ID, the intent should be quite clear in general. - The command is part of our plumbing layer and not a tool that users would generally use in interactive workflows. As such, the warning will likely not be visible to anybody in the first place. - Users can and should use the fully-qualified refname in case there is any potential for ambiguity. And given that this command is part of our plumbing layer, one should always try to be as defensive as possible and use fully-qualified refnames. Furthermore, this check can be quite expensive when updating lots of references via `--stdin`, because we try to read multiple references per object ID that we parse according to the DWIM rules. This effect can be seen both with the "files" and "reftable" backend. The issue is not unique to git-update-ref(1), but was also an issue in git-cat-file(1), where it was addressed by disabling the ambiguity check in 25fba78d36b (cat-file: disable object/refname ambiguity check for batch mode, 2013-07-12). Disable the warning in git-update-ref(1), which provides a significant speedup with both backends. The user-visible outcome is unchanged even when ambiguity exists, except that we don't show the warning anymore. The following benchmark creates 10000 new references with a 100000 preexisting refs with the "files" backend: Benchmark 1: update-ref: create many refs (refformat = files, preexisting = 100000, new = 10000, revision = HEAD~) Time (mean ± σ): 467.3 ms ± 5.1 ms [User: 100.0 ms, System: 365.1 ms] Range (min … max): 461.9 ms … 479.3 ms 10 runs Benchmark 2: update-ref: create many refs (refformat = files, preexisting = 100000, new = 10000, revision = HEAD) Time (mean ± σ): 394.1 ms ± 5.8 ms [User: 63.3 ms, System: 327.6 ms] Range (min … max): 384.9 ms … 405.7 ms 10 runs Summary update-ref: create many refs (refformat = files, preexisting = 100000, new = 10000, revision = HEAD) ran 1.19 ± 0.02 times faster than update-ref: create many refs (refformat = files, preexisting = 100000, new = 10000, revision = HEAD~) And with the "reftable" backend: Benchmark 1: update-ref: create many refs (refformat = reftable, preexisting = 100000, new = 10000, revision = HEAD~) Time (mean ± σ): 146.9 ms ± 2.2 ms [User: 90.4 ms, System: 56.0 ms] Range (min … max): 142.7 ms … 150.8 ms 19 runs Benchmark 2: update-ref: create many refs (refformat = reftable, preexisting = 100000, new = 10000, revision = HEAD) Time (mean ± σ): 63.2 ms ± 1.1 ms [User: 41.0 ms, System: 21.8 ms] Range (min … max): 61.1 ms … 66.6 ms 41 runs Summary update-ref: create many refs (refformat = reftable, preexisting = 100000, new = 10000, revision = HEAD) ran 2.32 ± 0.05 times faster than update-ref: create many refs (refformat = reftable, preexisting = 100000, new = 10000, revision = HEAD~) Note that the absolute improvement with both backends is roughly in the same ballpark, but the relative improvement for the "reftable" backend is more significant because writing the new table to disk is faster in the first place. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-12name-rev: remove "--stdin" supportJunio C Hamano1-1/+9
As part of Git 3.0, remove the hidden synonym for "--annotate-stdin" for real. As this does not change the fact that it used to be called "--stdin" in older version of Git, keep that passage in the documentation for "--annotate-stdin". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10fast-export, fast-import: add support for signed-commitsLuke Shumaker2-20/+126
fast-export has a --signed-tags= option that controls how to handle tag signatures. However, there is no equivalent for commit signatures; it just silently strips the signature out of the commit (analogously to --signed-tags=strip). While signatures are generally problematic for fast-export/fast-import (because hashes are likely to change), if they're going to support tag signatures, there's no reason to not also support commit signatures. So, implement a --signed-commits= option that mirrors the --signed-tags= option. On the fast-export side, try to be as much like signed-tags as possible, in both implementation and in user-interface. This will change the default behavior to '--signed-commits=abort' from what is now '--signed-commits=strip'. In order to provide an escape hatch for users of third-party tools that call fast-export and do not yet know of the --signed-commits= option, add an environment variable 'FAST_EXPORT_SIGNED_COMMITS_NOABORT=1' that changes the default to '--signed-commits=warn-strip'. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10fast-export: do not modify memory from get_commit_bufferLuke Shumaker1-28/+33
fast-export's helper function find_encoding() takes a `const char *`, but modifies that memory despite the `const`. Ultimately, this memory came from get_commit_buffer(), and you're not supposed to modify the memory that you get from get_commit_buffer(). So, get rid of find_encoding() in favor of commit.h:find_commit_header(), which gives back a string length, rather than mutating the memory to insert a '\0' terminator. Because find_commit_header() detects the "\n\n" string that separates the headers and the commit message, move the call to be above the `message = strstr(..., "\n\n")` call. This helps readability, and allows for the value of `encoding` to be used for a better value of "..." so that the same memory doesn't need to be checked twice. Introduce a `commit_buffer_cursor` variable to avoid writing an awkward `encoding ? encoding + encoding_len : committer_end` expression. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10fast-export: rename --signed-tags='warn' to 'warn-verbatim'Luke Shumaker1-4/+4
The --signed-tags= option takes one of five arguments specifying how to handle signed tags during export. Among these arguments, 'strip' is to 'warn-strip' as 'verbatim' is to 'warn' (the unmentioned argument is 'abort', which stops the fast-export process entirely). That is, signatures are either stripped or copied verbatim while exporting, with or without a warning. Match the pattern and rename 'warn' to 'warn-verbatim' to make it clear that it instructs fast-export to copy signatures verbatim. To maintain backwards compatibility, 'warn' is still recognized as deprecated synonym of 'warn-verbatim'. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10fast-export: fix missing whitespace after switchChristian Couder1-4/+4
"Documentation/CodingGuidelines" says that there should be whitespaces around operators like 'if', 'switch', 'for', etc. Let's fix this in "builtin/fast-export.c". Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10hash: stop depending on `the_repository` in `null_oid()`Patrick Steinhardt14-38/+40
The `null_oid()` function returns the object ID that only consists of zeroes. Naturally, this ID also depends on the hash algorithm used, as the number of zeroes is different between SHA1 and SHA256. Consequently, the function returns the hash-algorithm-specific null object ID. This is currently done by depending on `the_hash_algo`, which implicitly makes us depend on `the_repository`. Refactor the function to instead pass in the hash algorithm for which we want to retrieve the null object ID. Adapt callsites accordingly by passing in `the_repository`, thus bubbling up the dependency on that global variable by one layer. There are a couple of trivial exceptions for subsystems that already got rid of `the_repository`. These subsystems instead use the repository that is available via the calling context: - "builtin/grep.c" - "grep.c" - "refs/debug.c" There are also two non-trivial exceptions: - "diff-no-index.c": Here we know that we may not have a repository initialized at all, so we cannot rely on `the_repository`. Instead, we adapt `diff_no_index()` to get a `struct git_hash_algo` as parameter. The only caller is located in "builtin/diff.c", where we know to call `repo_set_hash_algo()` in case we're running outside of a Git repository. Consequently, it is fine to continue passing `the_repository->hash_algo` even in this case. - "builtin/ls-files.c": There is an in-flight patch series that drops `USE_THE_REPOSITORY_VARIABLE` in this file, which causes a semantic conflict because we use `null_oid()` in `show_submodule()`. The value is passed to `repo_submodule_init()`, which may use the object ID to resolve a tree-ish in the superproject from which we want to read the submodule config. As such, the object ID should refer to an object in the superproject, and consequently we need to use its hash algorithm. This means that we could in theory just not bother about this edge case at all and just use `the_repository` in "diff-no-index.c". But doing so would feel misdesigned. Remove the `USE_THE_REPOSITORY_VARIABLE` preprocessor define in "hash.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10delta-islands: stop depending on `the_repository`Patrick Steinhardt1-1/+1
There are multiple sites in "delta-islands.c" where we use the global `the_repository` variable, either explicitly or implicitly by using `the_hash_algo`. Refactor the code to stop using `the_repository`. In most cases this is trivial because we already had a repository available in the calling context, with the only exception being `propagate_island_marks()`. Adapt it so that the repository gets passed in via a parameter. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10object-file-convert: stop depending on `the_repository`Patrick Steinhardt1-1/+1
There are multiple sites in "object-file-convert.c" where we use the global `the_repository` variable, either explicitly or implicitly by using `the_hash_algo`. All of these callsites are transitively called from `convert_object_file()`, which indeed has no repo as input. Refactor the function so that it receives a repository as a parameter and pass it through to all internal functions to get rid of the dependency. Remove the `USE_THE_REPOSITORY_VARIABLE` define. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10environment: move access to "core.bigFileThreshold" into repo settingsPatrick Steinhardt4-7/+12
The "core.bigFileThreshold" setting is stored in a global variable and populated via `git_default_core_config()`. This may cause issues in the case where one is handling multiple different repositories in a single process with different values for that config key, as we may or may not see the correct value in that case. Furthermore, global state blocks our path towards libification. Refactor the code so that we instead store the value in `struct repo_settings`, where the value is computed as-needed and cached. Note that this change requires us to adapt one test in t1050 that verifies that we die when parsing an invalid "core.bigFileThreshold" value. The exercised Git command doesn't use the value at all, and thus it won't hit the new code path that parses the value. This is addressed by using git-hash-object(1) instead, which does read the value. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10pack-write: stop depending on `the_repository` and `the_hash_algo`Patrick Steinhardt3-5/+5
There are a couple of functions in "pack-write.c" that implicitly depend on `the_repository` or `the_hash_algo`. Remove this dependency by injecting the repository via a parameter and adapt callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10object: stop depending on `the_repository`Patrick Steinhardt7-10/+10
There are a couple of functions exposed by "object.c" that implicitly depend on `the_repository`. Remove this dependency by injecting the repository via a parameter. Adapt callers accordingly by simply using `the_repository`, except in cases where the subsystem is already free of the repository. In that case, we instead pass the repository provided by the caller's context. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10csum-file: stop depending on `the_repository`Patrick Steinhardt3-3/+4
There are multiple sites in "csum-file.c" where we use the global `the_repository` variable, either explicitly or implicitly by using `the_hash_algo`. Refactor the code to stop using `the_repository` by adapting functions to receive required data as parameters. Adapt callsites accordingly by either using `the_repository->hash_algo`, or by using a context-provided hash algorithm in case the subsystem already got rid of its dependency on `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10fetch: use ref prefix list to skip ls-refsJeff King1-20/+7
In git-fetch we have an optimization to avoid issuing an ls-refs command to the server if we don't care about the value of any refs (e.g., because we are fetching exact object ids), saving a round-trip to the server. This comes from e70a3030e7 (fetch: do not list refs if fetching only hashes, 2018-09-27). It uses an explicit flag "must_list_refs" to decide when we need to do so. That was needed back then, because the list of ref-prefixes was not always complete. If it was empty, it did not necessarily mean that we were not interested in any refs). But that is no longer the case; an empty list of prefixes means that we truly do not care about any refs. And so rather than an explicit flag, we can just check whether we are interested in any ref prefixes. This simplifies the code slightly, as there is now a single source of truth for the decision. It also fixes a bug in / optimizes a very unlikely case, which is: git fetch $remote ^foo $oid I.e., a negative refspec combined with an exact oid fetch. This is somewhat nonsense, in that there are no positive refspecs mentioning refs to countermand with the negative one. But we should be able to do this without issuing an ls-refs command (excluding "foo" from the empty set will obviously still be the empty set). However, the current code does not do so. The negative refspec is not counted as a noop in un-setting the must_list_refs flag (hardly the fault of e70a3030e7, as negative refspecs did not appear until much later). But by using the prefix list as a source of truth, this naturally just works; the negative refspec does not add a prefix to ask about, and hence does not trigger the ls-refs call. This is esoteric enough that I didn't bother adding a test. The real value here is in the code simplification. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10fetch: avoid ls-refs only to ask for HEAD symref updateJeff King1-3/+2
When we fetch from a configured remote, we may try to update the local refs/remotes/<origin>/HEAD, and so we ask the server to advertise its HEAD to us. But if we aren't otherwise asking about any refs at all, then we know this HEAD update can never happen! To consider a new value for HEAD, the set_head() function uses guess_remote_head(). And even if it sees an explicit symref value for HEAD, it will only report that as a match if we also saw that remote ref advertised, and it mapped to a local tracking ref via get_fetch_map(). In other words, a fetch like this: git fetch origin $exact_oid:refs/heads/foo can never update HEAD, because we will never have fetched (nor even see the advertisement for) the ref that HEAD points to. Currently the command above will still call ls-refs to ask about the HEAD, even though it is pointless. This patch teaches it to skip the ls-refs call entirely in this case, which avoids a round-trip to the server. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10fetch: stop protecting additions to ref-prefix listJeff King1-6/+4
When using the ref-prefix feature of protocol v2, a client which sends no prefixes at all will get the full advertisement. And so the code in git-fetch was historically loose about setting up that list based on our refspecs. There were cases where we needed to know about some refs, so we just didn't add anything to the ref-prefix list. And hence further code, like that for tag-following and updating origin/HEAD, had to be careful about adding to an empty list. E.g., see the bug fixed by bd52d9a058 (fetch: fix following tags when fetching specific OID, 2025-03-07). But the previous commit removed the last such case, and now we know an empty ref-prefix list (at least inside git-fetch's do_fetch() function) means that we really don't need to see any refs. So we can drop those extra conditionals. This simplifies the code a little. But it also means that some cases can now use ref prefixes when they would not otherwise. As the test shows, fetching an exact oid into a local ref can now avoid enumerating all of the refs. The refspec itself doesn't need to know about any remote refs, and the tag auto-following can just ask about refs/tags/. The same is true for asking about HEAD to update the local origin/HEAD. I didn't add a test for that yet, though, as we can optimize it even further. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10fetch: ask server to advertise HEAD for config-less fetchJeff King1-0/+8
If we're not given any refspecs (either on the command line or via config) and we have no branch merge config, then we fetch the remote HEAD into our local FETCH_HEAD. In that case we do not send any ref-prefix option to the server at all, and we see the full advertisement. But this is sub-optimal. We only care about HEAD, so we can just ask for that, and ignore all of the other refs. The new test demonstrates a case where we see fewer refs (in this case only one less, but in theory we could be ignoring millions of them). This also removes the only case where we care about seeing some refs from the other side, but don't add anything to the ref_prefixes list. Cleaning this up means one less maintenance burden. Before this patch, any code which wanted to add to the list had to make sure the list was not empty, since an empty list meant "ask for everything". Now it really means "we are not interested in any refs". This should let us optimize a few more cases in subsequent patches. Note that we'll add "HEAD" to the list of prefixes, and later code for updating "refs/remotes/<remote>/HEAD" may likewise do so. In theory this could cause duplicates in the list, but in practice these can't both trigger. We hit our new case only if there are no refspecs, and the "<remote>/HEAD" feature is enabled only when we are fetching from a remote with configured refspecs. We could be defensive with a flag, but it didn't seem worth it to me (the absolute worse case is a useless redundant ref-prefix line sent to the server). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10Merge branch 'tb/fetch-follow-tags-fix'Junio C Hamano1-1/+3
* tb/fetch-follow-tags-fix: fetch: fix following tags when fetching specific OID
2025-03-07builtin/checkout-index: stop using `the_repository`Usman Akinyemi1-22/+21
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/checkout-index.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_checkout_index()` function with `repo` set to NULL and then early in the function, `show_usage_with_options_if_asked()` call will give the options help and exit. Pass an instance of "struct index_state" available in the calling context to both `checkout_all()` and `checkout_file()` to remove their dependency on the global `the_repository` variable. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07builtin/for-each-ref: stop using `the_repository`Usman Akinyemi1-3/+2
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/for-each-ref.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_for_each_ref()` function with `repo` set to NULL and then early in the function, `parse_options()` call will give the options help and exit. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07builtin/ls-files: stop using `the_repository`Usman Akinyemi1-16/+16
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/ls-files.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_ls_files()` function with `repo` set to NULL and then early in the function, `show_usage_with_options_if_asked()` call will give the options help and exit. Pass the repository available in the calling context to both `expand_objectsize()` and `show_ru_info()` to remove their dependency on the global `the_repository` variable. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07builtin/pack-refs: stop using `the_repository`Usman Akinyemi1-5/+3
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/pack-refs.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_pack_refs()` function with `repo` set to NULL and then early in the function, `parse_options()` call will give the options help and exit. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07builtin/send-pack: stop using `the_repository`Usman Akinyemi1-4/+3
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/send-pack.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_send_pack()` function with `repo` set to NULL and then early in the function, `parse_options()` call will give the options help and exit. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07builtin/verify-commit: stop using `the_repository`Usman Akinyemi1-7/+6
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/verify-commit.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_verify_commit()` function with `repo` set to NULL and then early in the function, `parse_options()` call will give the options help and exit. Pass the repository available in the calling context to `verify_commit()` to remove it's dependency on the global `the_repository` variable. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07builtin/verify-tag: stop using `the_repository`Usman Akinyemi1-4/+3
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/verify-tag.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_verify_tag()` function with `repo` set to NULL and then early in the function, `parse_options()` call will give the options help and exit. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07fetch: fix following tags when fetching specific OIDTaylor Blau1-1/+3
In 3f763ddf28 (fetch: set remote/HEAD if it does not exist, 2024-11-22), unconditionally adds "HEAD" to the list of ref prefixes we send to the server. This breaks a core assumption that the list of prefixes we send to the server is complete. We must either send all prefixes we care about, or none at all (in the latter case the server then advertises everything). The tag following code is careful to only add "refs/tags/" to the list of prefixes if there are already entries in the prefix list. But because the new code from 3f763ddf28 runs after the tag code, and because it unconditionally adds to the prefix list, we may end up with a prefix list that _should_ have "refs/tags/" in it, but doesn't. When that is the case, the server does not advertise any tags, and our auto-following breaks because we never learned about any tags in the first place. Fix this by only adding "HEAD" to the ref prefixes when we know that we are already limiting the advertisement. In either case we'll learn about HEAD (either through the limited advertisement, or implicitly through a full advertisement). Reported-by: Igor Todorovski <itodorov@ca.ibm.com> Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-05Merge branch 'kn/ref-migrate-skip-reflog'Junio C Hamano1-1/+1
Usage string of "git refs" has been corrected. * kn/ref-migrate-skip-reflog: refs: show --no-reflog in the help text
2025-03-05Merge branch 'ps/path-sans-the-repository'Junio C Hamano19-58/+106
The path.[ch] API takes an explicit repository parameter passed throughout the callchain, instead of relying on the_repository singleton instance. * ps/path-sans-the-repository: path: adjust last remaining users of `the_repository` environment: move access to "core.sharedRepository" into repo settings environment: move access to "core.hooksPath" into repo settings repo-settings: introduce function to clear struct path: drop `git_path()` in favor of `repo_git_path()` rerere: let `rerere_path()` write paths into a caller-provided buffer path: drop `git_common_path()` in favor of `repo_common_path()` worktree: return allocated string from `get_worktree_git_dir()` path: drop `git_path_buf()` in favor of `repo_git_path_replace()` path: drop `git_pathdup()` in favor of `repo_git_path()` path: drop unused `strbuf_git_path()` function path: refactor `repo_submodule_path()` family of functions submodule: refactor `submodule_to_gitdir()` to accept a repo path: refactor `repo_worktree_path()` family of functions path: refactor `repo_git_path()` family of functions path: refactor `repo_common_path()` family of functions
2025-03-03refs: show --no-reflog in the help textJunio C Hamano1-1/+1
We forgot that we must keep the documentation and help text in sync. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03builtin/diff-pairs: allow explicit diff queue flushJustin Tobler1-0/+14
The diffs queued from git-diff-pairs(1) are flushed when stdin is closed. To enable greater flexibility, allow control over when the diff queue is flushed by writing a single NUL byte on stdin between input file pairs. Diff output between flushes is separated by a single NUL byte. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03builtin: introduce diff-pairs commandJustin Tobler1-0/+193
Through git-diff(1), a single diff can be generated from a pair of blob revisions directly. Unfortunately, there is not a mechanism to compute batches of specific file pair diffs in a single process. Such a feature is particularly useful on the server-side where diffing between a large set of changes is not feasible all at once due to timeout concerns. To facilitate this, introduce git-diff-pairs(1) which acts as a backend passing its NUL-terminated raw diff format input from stdin through diff machinery to produce various forms of output such as patch or raw. The raw format was originally designed as an interchange format and represents the contents of the diff_queued_diff list making it possible to break the diff pipeline into separate stages. For example, git-diff-tree(1) can be used as a frontend to compute file pairs to queue and feed its raw output to git-diff-pairs(1) to compute patches. With this, batches of diffs can be progressively generated without having to recompute renames or retrieve object context. Something like the following: git diff-tree -r -z -M $old $new | git diff-pairs -p -z should generate the same output as `git diff-tree -p -M`. Furthermore, each line of raw diff formatted input can also be individually fed to a separate git-diff-pairs(1) process and still produce the same output. Based-on-patch-by: Jeff King <peff@peff.net> Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28path: adjust last remaining users of `the_repository`Patrick Steinhardt1-1/+1
With the preceding refactorings we now only have a couple of implicit users of `the_repository` left in the "path" subsystem, all of which depend on global state via `calc_shared_perm()`. Make the dependency on `the_repository` explicit by passing the repo as a parameter instead and adjust callers accordingly. Note that this change bubbles up into a couple of subsystems that were previously declared as free from `the_repository`. Instead of marking all of them as `the_repository`-dependent again, we instead use the repository that is available in the calling context. There are three exceptions though with "copy.c", "pack-write.c" and "tempfile.c". Adjusting these would require us to adapt callsites all over the place, so this is left for a future iteration. Mark "path.c" as free from `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28environment: move access to "core.sharedRepository" into repo settingsPatrick Steinhardt2-7/+7
Similar as with the preceding commit, we track "core.sharedRepository" via a pair of global variables. Move them into `struct repo_settings` so that we can instead track them per-repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28path: drop `git_path()` in favor of `repo_git_path()`Patrick Steinhardt7-18/+43
Remove `git_path()` in favor of the `repo_git_path()` family of functions, which makes the implicit dependency on `the_repository` go away. Note that `git_path()` returned a string allocated via `get_pathname()`, which uses a rotating set of statically allocated buffers. Consequently, callers didn't have to free the returned string. The same isn't true for `repo_common_path()`, so we also have to add logic to free the returned strings. This refactoring also allows us to remove `repo_common_pathv()` as well as `get_pathname()` from the public interface. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28rerere: let `rerere_path()` write paths into a caller-provided bufferPatrick Steinhardt1-4/+7
Same as with `get_worktree_git_dir()` a couple of commits ago, the `rerere_path()` function returns paths that need not be free'd by the caller because `git_path()` internally uses `get_pathname()`. Refactor the function to instead accept a caller-provided buffer that the path will be written into, passing on ownership to the caller. This refactoring prepares us for the removal of `git_path()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-27Merge branch 'kn/ref-migrate-skip-reflog'Junio C Hamano1-0/+3
"git refs migrate" can optionally be told not to migrate the reflog. * kn/ref-migrate-skip-reflog: builtin/refs: add '--no-reflog' flag to drop reflogs
2025-02-27Merge branch 'ua/os-version-capability'Junio C Hamano1-11/+2
The value of "uname -s" is by default sent over the wire as a part of the "version" capability. * ua/os-version-capability: agent: advertise OS name via agent capability t5701: add setup test to remove side-effect dependency version: extend get_uname_info() to hide system details version: refactor get_uname_info() version: refactor redact_non_printables() version: replace manual ASCII checks with isprint() for clarity
2025-02-27builtin/fsck: add `git refs verify` child processshejialuo1-1/+32
At now, we have already implemented the ref consistency checks for both "files-backend" and "packed-backend". Although we would check some redundant things, it won't cause trouble. So, let's integrate it into the "git-fsck(1)" command to get feedback from the users. And also by calling "git refs verify" in "git-fsck(1)", we make sure that the new added checks don't break. Introduce a new function "fsck_refs" that initializes and runs a child process to execute the "git refs verify" command. In order to provide the user interface create a progress which makes the total task be 1. It's hard to know how many loose refs we will check now. We might improve this later. Then, introduce the option to allow the user to disable checking ref database consistency. Put this function in the very first execution sequence of "git-fsck(1)" due to that we don't want the existing code of "git-fsck(1)" which would implicitly check the consistency of refs to die the program. Last, update the test to exercise the code. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-27builtin/refs: get worktrees without reading head informationshejialuo1-1/+1
In "packed-backend.c", there are some functions such as "create_snapshot" and "next_record" which would check the correctness of the content of the "packed-ref" file. When anything is bad, the program will die. It may seem that we have nothing relevant to above feature, because we are going to read and parse the raw "packed-ref" file without creating the snapshot and using the ref iterator to check the consistency. However, when using "get_worktrees" in "builtin/refs", we would parse the "HEAD" information. If the referent of the "HEAD" is inside the "packed-ref", we will call "create_snapshot" function to parse the "packed-ref" to get the information. No matter whether the entry of "HEAD" in "packed-ref" is correct, "create_snapshot" would call "verify_buffer_safe" to check whether there is a newline in the last line of the file. If not, the program will die. Although this behavior has no harm for the program, it will short-circuit the program. When the users execute "git refs verify" or "git fsck", we should avoid reading the head information, which may execute the read operation in packed backend with stricter checks to die the program. Instead, we should continue to check other parts of the "packed-refs" file completely. Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing worktrees, 2023-12-29), we have introduced a function "get_worktrees_internal" which allows us to get worktrees without reading head information. Create a new exposed function "get_worktrees_without_reading_head", then replace the "get_worktrees" in "builtin/refs" with the new created function. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26Merge branch 'jk/check-mailmap-wo-name-fix'Junio C Hamano1-1/+1
"git check-mailmap" segfault fix. * jk/check-mailmap-wo-name-fix: mailmap: fix check-mailmap with full mailmap line
2025-02-25Merge branch 'pw/merge-tree-stdin-deadlock-fix'Junio C Hamano1-6/+5
"git merge-tree --stdin" has been improved (including a workaround for a deadlock). * pw/merge-tree-stdin-deadlock-fix: merge-tree: fix link formatting in html docs merge-tree: improve docs for --stdin merge-tree: only use basic merge config merge-tree: remove redundant code merge-tree --stdin: flush stdout to avoid deadlock
2025-02-21mailmap: fix check-mailmap with full mailmap lineJacob Keller1-1/+1
I recently had reported to me a crash from a coworker using the recently added sendemail mailmap support: 3724814 Segmentation fault (core dumped) git check-mailmap "bugs@company.xx" This appears to happen because of the NULL pointer name passed into map_user(). Fix this by passing "" instead of NULL so that we have a valid pointer. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-21Merge branch 'ua/update-server-info-sans-the-repository'Junio C Hamano1-4/+4
Code clean-up. * ua/update-server-info-sans-the-repository: builtin/update-server-info: remove the_repository global variable