aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2025-07-07repository: move 'repository_format_precious_objects' to repo scopeAyush Chandekar8-7/+9
The 'extensions.preciousObjects' setting when set true, prevents operations that might drop objects from the object storage. This setting is populated in the global variable 'repository_format_precious_objects'. Move this global variable to repo scope by adding it to 'struct repository and also refactor all the occurences accordingly. This change is part of an ongoing effort to eliminate global variables, improve modularity and help libify the codebase. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07u-string-list: move "remove duplicates" test to "u-string-list.c"shejialuo4-67/+62
We use "test-tool string-list remove_duplicates" to test the "string_list_remove_duplicates" function. As we have introduced the unit test, we'd better remove the logic from shell script to C program to improve test speed and readability. As all the tests in shell script are removed, let's just delete the "t0063-string-list.sh" and update the "meson.build" file to align with this change. Also we could simply remove "DISABLE_SIGN_COMPARE_WARNINGS" due to we have already deleted related code. Unfortunately, we cannot totally remove "test-string-list.c" due to that we would test the performance of sorting about string list by executing "test-tool string-list sort" in "p0071-sort.sh". Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07u-string-list: move "filter string" test to "u-string-list.c"shejialuo3-32/+73
We use "test-tool string-list filter" to test the "filter_string_list" function. As we have introduced the unit test, we'd better remove the logic from shell script to C program to improve test speed and readability. Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07u-string-list: move "test_split_in_place" to "u-string-list.c"shejialuo3-73/+37
We use "test-tool string-list split_in_place" to test the "string_list_split_in_place" function. As we have introduced the unit test, we'd better remove the logic from shell script to C program to improve test speed and readability. Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07u-string-list: move "test_split" into "u-string-list.c"shejialuo5-67/+57
We rely on "test-tool string-list" command to test the functionality of the "string-list". However, as we have introduced clar test framework, we'd better move the shell script into C program to improve speed and readability. Create a new file "u-string-list.c" under "t/unit-tests", then update the Makefile and "meson.build" to build the file. And let's first move "test_split" into unit test and gradually convert the shell script into C program. In order to create `string_list` easily by simply specifying strings in the function call, create "t_vcreate_string_list_dup" function to do this. Then port the shell script tests to C program and remove unused "test-tool" code and tests. Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07string-list: enable sign compare warnings checkshejialuo1-11/+9
In "add_entry", we call "get_entry_index" function to get the inserted position. However, as the return type of "get_entry_index" function is `int`, there is a sign compare warning when comparing the `index` with the `list-nr` of unsigned type. "get_entry_index" would always return unsigned index. However, the current binary search algorithm initializes "left" to be "-1", which necessitates the use of signed `int` return type. The reason why we need to assign "left" to be "-1" is that in the `while` loop, we increment "left" by 1 to determine whether the loop should end. This design choice, while functional, forces us to use signed arithmetic throughout the function. To resolve this sign comparison issue, let's modify the binary search algorithm with the following approach: 1. Initialize "left" to 0 instead of -1 2. Use `left < right` as the loop termination condition instead of `left + 1 < right` 3. When searching the right part, set `left = middle + 1` instead of `middle` Then, we could delete "#define DISABLE_SIGN_COMPARE_WARNING" to enable sign warnings check for "string-list". Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07string-list: return index directly when inserting an existing elementshejialuo1-5/+1
When inserting an existing element, "add_entry" would convert "index" value to "-1-index" to indicate the caller that this element is in the list already. However, in "string_list_insert", we would simply convert this to the original positive index without any further action. In 8fd2cb4069 (Extract helper bits from c-merge-recursive work, 2006-07-25), we create "path-list.c" and then introduce above code path. Let's directly return the index as we don't care about whether the element is in the list by using "add_entry". In the future, if we want to let "add_entry" tell the caller, we may add "int *exact_match" parameter to "add_entry" instead of converting the index to negative to indicate. Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07string-list: remove unused "insert_at" parameter from add_entryshejialuo1-3/+3
In "add_entry", we accept "insert_at" parameter which must be either -1 (auto) or between 0 and `list->nr` inclusive. Any other value is invalid. When caller specify any invalid "insert_at" value, we won't check the range and move the element, which would definitely cause the trouble. However, we only use "add_entry" in "string_list_insert" function and we always pass the "-1" for "insert_at" parameter. So, we never use this parameter to insert element in a user specified position. And we should know why there is such code path in the first place. We used to have another function "string_list_insert_at_index()", which uses the extra "insert_at" parameter. And in f8c4ab611a (string_list: remove string_list_insert_at_index() from its API, 2014-11-24), we remove this function but we don't clean all the code path. Let's simply delete this parameter as we'd better use "strmap" for such functionality. Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07string-list: fix sign compare warnings for loop iteratorshejialuo1-12/+10
There are a couple of "-Wsign-compare" warnings in "string-list.c". Fix trivial ones that result from a mismatched loop iterator type. There is a single warning left after these fixes. This warning needs a bit more care and is thus handled in subsequent commits. Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07read-cache: report lock error when refreshing indexHan Young2-13/+8
In the repo_refresh_and_write_index of read-cache.c, we return -1 to indicate that writing the index to disk failed. However, callers do not use this information. Commands such as stash print "could not write index" and then exit, which does not help to discover the exact problem. We can let repo_hold_locked_index print the error message if the locking failed. Signed-off-by: Han Young <hanyang.tony@bytedance.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07apply docs: clarify wording for --intent-to-addRaymond E. Pasco1-4/+5
Avoid using a double negative, and keep in mind that --index and --cached are distinct modes of operation. Signed-off-by: Raymond E. Pasco <ray@ameretat.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07t4140: test apply --intent-to-add interactionsRaymond E. Pasco1-1/+30
Test that applying a new file creation patch with --intent-to-add to an existing index does not modify the index outside adding the correct intents-to-add, and that applying a patch with both modifications and new file creations with --intent-to-add correctly only adds intents-to-add to the index. Signed-off-by: Raymond E. Pasco <ray@ameretat.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07apply: only write intents to add for new filesRaymond E. Pasco1-1/+1
In the "apply only to files" mode (i.e., neither --index nor --cached mode), the index should not be touched except to record intents to add when --intent-to-add is on. Because having --intent-to-add on sets update_index, to indicate that we may touch the index, we can't rely only on that flag in create_file() (which is called to write both new files and updated files) to decide whether to write an index entry; if we did, we would write an index entry for every file being patched (which would moreover be an intent-to-add entry despite not being a new file, because we are going to turn on the CE_INTENT_TO_ADD flag in add_index_entry() if we enter it here and ita_only is true). To decide whether to touch the index, we need to check the specific reason the index would be updated, rather than merely their aggregate in the update_index flag. Because we have already entered write_out_results() and are performing writes, we know that state->apply is true. If state->check_index is additionally true, we are in --index or --cached mode, which updates the index and should always write, whereas if we are merely in ita_only mode we must only write if the patch is a new file creation patch. Signed-off-by: Raymond E. Pasco <ray@ameretat.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07apply: read in the index in --intent-to-add modeRaymond E. Pasco1-1/+1
There are three main modes of operation for apply: applying only to the worktree, applying to the worktree and index (--index), and applying only to the index (--cached). The --intent-to-add flag modifies the first of these modes, applying only to the worktree, in a way which touches the index, because intents to add are special index entries. However, since its introduction in cff5dc09ed (apply: add --intent-to-add, 2018-05-26), it has not worked correctly in any but the most trivial (empty repository) cases, because the index is never read in (in apply, this is done in read_apply_cache()) before writing to it. This causes the operation to clobber the old, correct index with a new empty-tree index before writing intent-to-add entries to this empty index; the final result is that the index now records every existing file in the repository as deleted, which is incorrect. This error can be corrected by first reading the index. The update_index flag is correctly set if ita_only is true, because this flag causes the index to be updated. However, if we merely gate the call to read_apply_cache() behind update_index, then it will not be read when state->apply is false, even if it must be checked due to being in --index or --cached mode. Therefore, we instead read the index if it will be either checked or updated, because reading the index is a prerequisite to either. Reported-by: Ryan Hodges <rhodges@cisco.com> Original-patch-by: Johannes Altmanninger <aclopte@gmail.com> Signed-off-by: Raymond E. Pasco <ray@ameretat.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07setup_revisions(): turn on diffs for all-negative diff filterJeff King2-1/+7
When the user gives us a diff filter like --diff-filter=D, we need to do a tree diff even if we're not planning to show the diff result itself, in order to decide whether to show the commit at all. So there's an explicit check of revs->diffopt.filter in setup_revisions(), and we set revs->diff if any bits are set. Originally that "filter" field covered both positive capital-letter filters (like "D") and also negative lowercase filters (like "d"), so it was sufficient for both cases. But later, 75408ca949 (diff-filter: be more careful when looking for negative bits, 2022-01-28) split the negative bits out into a "filter_not" field. We eventually fold those into "filter", but not until diff_setup_done() is called, which happens after our explicit check. As a result, a purely negative filter like: git log --diff-filter=d failed to turn on diffs at all. But rather than fail to filter by diff, because the filter variable is eventually set, we mistakenly show no commits at all, thinking that the empty diffs were cases where nothing passed through the filter. The smallest fix here is to just have our check look for any bits in either "filter" or "filter_not". I suspect it would also be OK to reorder the function a bit to call diff_setup_done() earlier, but that risks violating some other subtle ordering dependency. So I went with the simple and safe solution here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07setup: use "reftable" format when experimental features are enabledPatrick Steinhardt3-0/+52
With the preceding commit we have announced the switch to the "reftable" format in Git 3.0 for newly created repositories. The format is being battle tested by GitLab and a couple of other developers, and except for a small handful of issues exposed early after it has been merged it has been rock solid. Regardless of that though the test user base is still comparatively small, which increases the risk that we miss critical bugs. Address this by enabling the reftable format when experimental features are enabled. This should increase the test user base by some margin and thus give us more input before making the format the default. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07BreakingChanges: announce switch to "reftable" formatPatrick Steinhardt5-0/+68
The "reftable" format has come a long way and has matured nicely since it has been merged into git via 57db2a094d5 (refs: introduce reftable backend, 2024-02-07). It fixes longstanding issues that cannot be fixed with the "files" format in a backwards-compatible way and performs significantly better in many use cases. Announce that we will switch to the "reftable" format in Git 3.0 for newly created repositories and wire up the change, hidden behind the WITH_BREAKING_CHANGES preprocessor define. This switch is dependent on support in the larger Git ecosystem. Most importantly, libraries like JGit, libgit2 and Gitoxide should support the reftable backend so that we don't break all applications and tools built on top of those libraries. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-02The sixth batchJunio C Hamano1-0/+7
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-02Merge branch 'jt/imap-send-message-fix'Junio C Hamano1-2/+7
Update some error messages from "git imap-send". * jt/imap-send-message-fix: imap-send: improve error messages with configuration hints imap-send: fix confusing 'store' terminology in error message
2025-07-02Merge branch 'ps/contrib-sweep'Junio C Hamano52-6900/+0
Remove bunch of stuff from contrib/ hierarchy. * ps/contrib-sweep: contrib: remove some scripts in "stats" directory contrib: remove "git-new-workdir" contrib: remove "emacs" directory contrib: remove "git-resurrect.sh" contrib: remove "persistent-https" remote helper contrib: remove "mw-to-git" contrib: remove "hooks" directory contrib: remove "thunderbird-patch-inline" contrib: remove remote-helper stubs contrib: remove "examples" directory contrib: remove "remotes2config.sh"
2025-07-02Merge branch 'ag/imap-send-resurrection'Junio C Hamano3-77/+407
"git imap-send" has been broken for a long time, which has been resurrected and then taught to talk OAuth2.0 etc. * ag/imap-send-resurrection: imap-send: fix minor mistakes in the logs imap-send: display the destination mailbox when sending a message imap-send: display port alongwith host when git credential is invoked imap-send: add ability to list the available folders imap-send: enable specifying the folder using the command line imap-send: add PLAIN authentication method to OpenSSL imap-send: add support for OAuth2.0 authentication imap-send: gracefully fail if CRAM-MD5 authentication is requested without OpenSSL imap-send: fix memory leak in case auth_cram_md5 fails imap-send: fix bug causing cfg->folder being set to NULL
2025-07-02gitremote-helpers.adoc: fix formattingBrett A C Sheffield1-1/+1
Add missing colon to fix formatting. Signed-off-by: Brett A C Sheffield <bacs@librecast.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-02build: retire NO_UINTMAX_TCarlo Marcelo Arenas Belón3-24/+0
A previous commit removed the last user of it, and it is no longer useful with the codebase moving towards C99, which specifies its definition. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-02config.mak.uname: set NO_MEMMEM only for functional versionCarlo Marcelo Arenas Belón1-6/+3
FreeBSD 6 introduced memmem(), but the implementation diverged from what was standard everywhere else (including our "compat" fallback). FreeBSD 10.4 (went EOL in 2018) corrected the functionality bugs but kept a suboptimal implementation until FreeBSD 11.4 (the last version of FreeBSD 11, that went EOL in September 2021). Let's draw the line to require FreeBSD 12 or newer, which allows us to drop the special casing of FreeBSD 4.x and rely on the platform implementation of memmem() unconditionally for all versions that are still being supported. Suggested-by: Brad Smith <brad@comstyle.com> Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-02meson: add rule to run 'git clang-format'Karthik Nayak1-0/+12
The Makefile has a 'style' rule to run 'git clang-format'. While Meson intrinsically supports a 'clang-format' target, which can be run when using the ninja backend by running 'ninja clang-format', this runs the formatting on all existing files. Our Meson build doesn't yet support a way to run 'git clang-format', which runs the formatter between the working directory and commit provided. Add a new 'style' target to Meson to mimic the target in the Makefile. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-02clang-format: add 'RemoveBracesLLVM' to the main configKarthik Nayak2-17/+7
In 1b8f306612 (ci/style-check: add `RemoveBracesLLVM` in CI job, 2024-07-23) we added 'RemoveBracesLLVM' to the CI job of running the clang formatter. This rule checks and warns against using braces on simple single-statement bodies of statements. Since we haven't had any issues regarding this rule, we can now move it into the main clang-format config and remove it from being CI exclusive. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-02clang-format: set 'ColumnLimit' to 0Karthik Nayak1-12/+9
When clang-format was introduced to the Git project in 6134de6ac1 (clang-format: outline the git project's coding style, 2017-08-14), the 'ColumnLimit' was set to 80. This is inline with our recommendation in 'Documentation/CodingGuidelines', which states: We try to keep to at most 80 characters per line. However while this is recommended limit, this is not the enforced limit. In some cases in we do overflow this limit to prioritize readability. Setting the 'ColumnLimit' also means that shorter lines are concatenated to simply as the result would still be below 80 characters, which is undesirable. In the past, we tried to adjust the penalties around line wrapping, once in 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29) and another time in 5e9fa0f9fa (clang-format: re-adjust line break penalties, 2024-10-18). While these settings help tweak the line break penalties to be more in-line with the requirements of the Git project, using 'clang-format' still produces a lot of false positives. So to make 'clang-format' more usable, set the 'ColumnLimit' to 0. This means that line-wrapping is no-longer a concern of the formatter and something that the user needs to take care of. The previous commit also added a more flexible guideline to the '.editorconfig' setting a 'max_line_length' of 120 characters. This should provide some guidance to users. In the future, it would be nice to re-instate this limit with adequate penalties which would follow our guidelines, but currently, it makes more sense to have a working formatter which we can rely on and which doesn't create too many false positives. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01clean up interface for refs_warn_dangling_symrefsPhil Hord4-15/+15
The refs_warn_dangling_symrefs interface is a bit fragile as it passes in printf-formatting strings with expectations about the number of arguments. This patch series made it worse by adding a 2nd positional argument. But there are only two call sites, and they both use almost identical display options. Make this safer by moving the format strings into the function that uses them to make it easier to see when the arguments don't match. Pass a prefix string and a dry_run flag so the decision logic can be handled where needed. Signed-off-by: Phil Hord <phil.hord@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01refs: remove old refs_warn_dangling_symrefPhil Hord2-18/+1
The dangling warning function that takes a single ref to search for is no longer used. Remove it. Signed-off-by: Phil Hord <phil.hord@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01fetch-prune: optimize dangling-ref reportingPhil Hord3-13/+13
When pruning during `git fetch` we check each pruned ref against the ref_store one at a time to decide whether to report it as dangling. This causes every local ref to be scanned for each ref being pruned. If there are N refs in the repo and M refs being pruned, this code is O(M*N). However, `git remote prune` uses a very similar function that is only O(N*log(M)). Remove the wasteful ref scanning for each pruned ref and use the faster version already available in refs_warn_dangling_symrefs. Change the message to include the original refname since the message is no longer printed immediately after the line that did just print the refname. In a repo with 126,000 refs, where I was pruning 28,000 refs, this code made about 3.6 billion calls to strcmp and consumed 410 seconds of CPU. (Invariably in that time, my remote would timeout and the fetch would fail anyway.) After this change, the same operation completes in under a second. Signed-off-by: Phil Hord <phil.hord@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01Enable SHA-256 by default in breaking changes modebrian m. carlson2-2/+8
Our document on breaking changes indicates that we intend to default to SHA-256 in Git 3.0. Since most people choose the default option, this is an important security upgrade to our defaults. To allow people to test this case, when WITH_BREAKING_CHANGES is set in the configuration, build Git with SHA-256 as the default hash. Update the testsuite to use the build options information to automatically choose the right value. Note that if the command substitution for GIT_TEST_BUILTIN_HASH fails, so does the testsuite—and quite spectacularly at that. Thus, the case where the Git binary is somehow subtly broken will not go undetected. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01help: add a build option for default hashbrian m. carlson1-0/+1
We'd like users to be able to determine the hash algorithm that is the builtin default in their version of Git. This is useful for troubleshooting, especially when we decide to change the default. Add an entry for the default hash in the output of git version --build-options so that users can easily access that information and include it in bug reports. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01t5300: choose the built-in hash outside of a repobrian m. carlson1-3/+3
Right now, the built-in default hash is always SHA-1, but that will change in a future commit. Instead of assuming that operating outside of a repository will always use SHA-1, look up the default hash algorithm for operating outside of a repository using an appropriate environment variable, which will always be correct. Additionally, for operations outside of a repository, use the DEFAULT_HASH_ALGORITHM prerequisite rather than SHA1. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01t4042: choose the built-in hash outside of a repobrian m. carlson1-2/+10
Right now, the built-in default hash is always SHA-1, but that will change in a future commit. Instead of assuming that operating outside of a repository will always use SHA-1, provide constants for both algorithms and then simply ask test_oid for the built-in hash instead, which will always be correct. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01t1007: choose the built-in hash outside of a repobrian m. carlson1-2/+2
Right now, the built-in default hash is always SHA-1, but that will change in a future commit. Instead of assuming that operating outside of a repository will always use SHA-1, simply ask test_oid for the built-in hash instead, which will always be correct. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01t: default to compile-time default hash if not setbrian m. carlson2-2/+10
Right now, the default compile-time hash is SHA-1. However, in the future, this might change and it would be helpful to gracefully handle this case in our testsuite. To avoid making these assumptions, let's introduce a variable that contains the built-in default hash and use it in our setup code as the fallback value if no hash was explicitly set. For now, this is always SHA-1, but in a future commit, we'll allow adjusting this and the variable will be more useful. To allow us to make our tests more robust, allow test_oid to take the --hash=builtin option to specify this hash, whatever it is. Additionally, add a DEFAULT_HASH_ALGORITHM prerequisite to check for the compile-time hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01setup: use the default algorithm to initialize repo formatbrian m. carlson2-2/+5
When we define a new repository format with REPOSITORY_FORMAT_INIT, we always use GIT_HASH_SHA1, and this value ends up getting used as the default value to initialize a repository if none of the command line, environment, or config tell us to do otherwise. Because we might not always want to use SHA-1 as the default, let's instead specify the default hash algorithm constant so that we will use whatever the specified default is. However, we also need to continue to read older repositories. If we're in a v0 repository or extensions.objectformat is not set, then we must continue to default to the original hash algorithm: SHA-1. If an algorithm is set explicitly, however, it will override the hash_algo member of the repository_format struct and we'll get the right value. Similarly, if the repository was initialized before Git 0.99.3, then it may lack a core.repositoryformatversion key, and some repositories lack a config file altogether. In both cases, format->version is -1 and we need to assume that SHA-1 is in use. Because clear_repository_format reinitializes the struct repository_format and therefore sets the hash_algo member to the default (which could in the future not be SHA-1), we need to reset this member explicitly. We know, however, that at the point we call read_repository_format, we are actually reading an existing repository and not initializing a new one or operating outside of a repository, so we are not changing the default behavior back to SHA-1 if the default algorithm is different. It is potentially questionable that we ignore all repository configuration if there is a config file but it doesn't have core.repositoryformatversion set, in which case we reset all of the configuration to the default. However, it is unclear what the right thing to do instead with such an old repository is and a simple git init will add the missing entry, so for now, we simply honor what the existing code does and reset the value to the default, simply adding our initialization to SHA-1. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01Use legacy hash for legacy formatsbrian m. carlson9-13/+13
We have a large variety of data formats and protocols where no hash algorithm was defined and the default was assumed to always be SHA-1. Instead of explicitly stating SHA-1, let's use the constant to represent the legacy hash algorithm (which is still SHA-1) so that it's clear for documentary purposes that it's a legacy fallback option and not an intentional choice to use SHA-1. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01builtin: use default hash when outside a repositorybrian m. carlson8-8/+8
We have some commands that can operate inside or outside a repository. If we're operating outside a repository, we clearly cannot use the repository's hash algorithm as a default since it doesn't exist, so instead, let's pick the default instead of specifically SHA-1. Right now this results in no functional change since the default is SHA-1, but that may change in the future. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01hash: add a constant for the legacy hash algorithmbrian m. carlson1-0/+2
We have a a variety of uses of GIT_HASH_SHA1 littered throughout our code. Some of these really mean to represent specifically SHA-1, but some actually represent the original hash algorithm used in Git which is implied by older, legacy formats and protocols which do not contain hash information. For instance, the bundle v1 and v2 formats do not contain hash algorithm information, and thus SHA-1 is implied by the use of these formats. Add a constant for documentary purposes which indicates this value. It will always be the same as SHA-1, since this is an essential part of these formats, but its use indicates this particular reason and not any other reason why SHA-1 might be used. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01hash: add a constant for the default hash algorithmbrian m. carlson1-0/+2
Right now, SHA-1 is the default hash algorithm in Git. However, this may change in the future. We have many places in our code that use the SHA-1 constant to indicate the default hash if none is specified, but it will end up being more practical to specify this explicitly and clearly using a constant for whatever the default hash algorithm is. Then, if we decide to change it in the future, we can simply replace the constant representing the default with a new value. For these reasons, introduce GIT_HASH_DEFAULT to represent the default hash algorithm. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `read_object_with_reference()`Patrick Steinhardt8-45/+37
Rename `read_object_with_reference()` to `odb_read_object_peeled()` to match other functions related to the object database and our modern coding guidelines. Furthermore though, the old name didn't really describe very well what this function actually does, which is to walk down any commit and tag objects until an object of the required type has been found. This is generally referred to as "peeling", so the new name should be way more descriptive. No compatibility wrapper is introduced as the function is not used a lot throughout our codebase. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `pretend_object_file()`Patrick Steinhardt3-13/+14
Rename `pretend_object_file()` to `odb_pretend_object()` to match other functions related to the object database and our modern coding guidelines. No compatibility wrapper is introduced as the function is not used a lot throughout our codebase. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `has_object()`Patrick Steinhardt30-74/+87
Rename `has_object()` to `odb_has_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `repo_read_object_file()`Patrick Steinhardt47-150/+157
Rename `repo_read_object_file()` to `odb_read_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `oid_object_info()`Patrick Steinhardt48-170/+213
Rename `oid_object_info()` to `odb_read_object_info()` as well as their `_extended()` variant to match other functions related to the object database and our modern coding guidelines. Introduce compatibility wrappers so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: trivial refactorings to get rid of `the_repository`Patrick Steinhardt1-16/+16
All of the external functions provided by the object database subsystem don't depend on `the_repository` anymore, but some internal functions still do. Refactor those cases by plumbing through the repository that owns the object database. This change allows us to get rid of the `USE_THE_REPOSITORY_VARIABLE` preprocessor define. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` when handling submodule sourcesPatrick Steinhardt6-43/+50
The "--recursive" flag for git-grep(1) allows users to grep for a string across submodule boundaries. To make this work we add each submodule's object sources to our own object database so that the objects can be accessed directly. The infrastructure for this depends on a global string list of submodule paths. The caller is expected to call `add_submodule_odb_by_path()` for each source and the object database will then eventually register all submodule sources via `do_oid_object_info_extended()` in case it isn't able to look up a specific object. This reliance on global state is of course suboptimal with regards to our libification efforts. Refactor the logic so that the list of submodule sources is instead tracked in the object database itself. This allows us to lose the condition of `r == the_repository` before registering submodule sources as we only ever add submodule sources to `the_repository` anyway. As such, behaviour before and after this refactoring should always be the same. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` when handling the primary sourcePatrick Steinhardt3-27/+36
The functions `set_temporary_primary_odb()` and `restore_primary_odb()` are responsible for managing a temporary primary source for the database. Both of these functions implicitly rely on `the_repository`. Refactor them to instead take an explicit object database parameter as argument and adjust callers. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` in `for_each()` functionsPatrick Steinhardt8-28/+47
There are a couple of iterator-style functions that execute a callback for each instance of a given set, all of which currently depend on `the_repository`. Refactor them to instead take an object database as parameter so that we can get rid of this dependency. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>