aboutsummaryrefslogtreecommitdiffstats
path: root/builtin/fast-import.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-11-06Merge branch 'cc/fast-import-export-i18n-cleanup'Junio C Hamano1-140/+140
Messages from fast-import/export are now marked for i18n. * cc/fast-import-export-i18n-cleanup: gpg-interface: mark a string for translation fast-import: mark strings for translation fast-export: mark strings for translation gpg-interface: use left shift to define GPG_VERIFY_* gpg-interface: simplify ssh fingerprint parsing
2025-10-30Merge branch 'ps/remove-packfile-store-get-packs'Junio C Hamano1-2/+2
Two slightly different ways to get at "all the packfiles" in API has been cleaned up. * ps/remove-packfile-store-get-packs: packfile: rename `packfile_store_get_all_packs()` packfile: introduce macro to iterate through packs packfile: drop `packfile_store_get_packs()` builtin/grep: simplify how we preload packs builtin/gc: convert to use `packfile_store_get_all_packs()` object-name: convert to use `packfile_store_get_all_packs()`
2025-10-30fast-import: mark strings for translationChristian Couder1-140/+140
Some error or warning messages in "builtin/fast-import.c" are marked for translation, but many are not. To be more consistent and provide a better experience to people using a translated version, let's mark all the remaining error or warning messages for translation. While at it, let's make the following small changes: - replace "GIT" or "git" in a few error messages to just "Git", - replace "Expected from command, got %s" to "expected 'from' command, got '%s'", which makes it clearer that "from" is a command and should not be translated, - downcase error and warning messages that start with an uppercase, - fix test cases in "t9300-fast-import.sh" that broke because an error or warning message was downcased, - split error and warning messages that are too long, - adjust the indentation of some arguments of the error functions. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-28Merge branch 'cc/fast-import-strip-signed-tags'Junio C Hamano1-0/+43
"git fast-import" is taught to handle signed tags, just like it recently learned to handle signed commits, in different ways. * cc/fast-import-strip-signed-tags: fast-import: add '--signed-tags=<mode>' option fast-export: handle all kinds of tag signatures t9350: properly count annotated tags lib-gpg: allow tests with GPGSM or GPGSSH prereq first doc: git-tag: stop focusing on GPG signed tags
2025-10-16packfile: rename `packfile_store_get_all_packs()`Patrick Steinhardt1-2/+2
In a preceding commit we have removed `packfile_store_get_packs()`. With this function removed it's somewhat useless to still have the "all" infix in `packfile_store_get_all_packs()`. Rename the latter to drop that infix. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-13fast-import: add '--signed-tags=<mode>' optionChristian Couder1-0/+43
Recently, eaaddf5791 (fast-import: add '--signed-commits=<mode>' option, 2025-09-17) added support for controlling how signed commits are handled by `git fast-import`, but there is no option yet to decide about signed tags. To remediate that, let's add a '--signed-tags=<mode>' option to `git fast-import` too. With this, both `git fast-export` and `git fast-import` have both a '--signed-tags=<mode>' and a '--signed-commits=<mode>' supporting the same <mode>s. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-07Merge branch 'ps/packfile-store'Junio C Hamano1-4/+6
Code clean-up around the in-core list of all the pack files and object database(s). * ps/packfile-store: packfile: refactor `get_packed_git_mru()` to work on packfile store packfile: refactor `get_all_packs()` to work on packfile store packfile: refactor `get_packed_git()` to work on packfile store packfile: move `get_multi_pack_index()` into "midx.c" packfile: introduce function to load and add packfiles packfile: refactor `install_packed_git()` to work on packfile store packfile: split up responsibilities of `reprepare_packed_git()` packfile: refactor `prepare_packed_git()` to work on packfile store packfile: reorder functions to avoid function declaration odb: move kept cache into `struct packfile_store` odb: move MRU list of packfiles into `struct packfile_store` odb: move packfile map into `struct packfile_store` odb: move initialization bit into `struct packfile_store` odb: move list of packfiles into `struct packfile_store` packfile: introduce a new `struct packfile_store`
2025-09-24packfile: refactor `get_all_packs()` to work on packfile storePatrick Steinhardt1-2/+4
The `get_all_packs()` function prepares the packfile store and then returns its packfiles. Refactor it to accept a packfile store instead of a repository to clarify its scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24packfile: introduce function to load and add packfilesPatrick Steinhardt1-2/+2
We have a recurring pattern where we essentially perform an upsert of a packfile in case it isn't yet known by the packfile store. The logic to do so is non-trivial as we have to reconstruct the packfile's key, check the map of packfiles, then create the new packfile and finally add it to the store. Introduce a new function that does this dance for us. Refactor callsites to use it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24packfile: refactor `install_packed_git()` to work on packfile storePatrick Steinhardt1-1/+1
The `install_packed_git()` functions adds a packfile to a specific object store. Refactor it to accept a packfile store instead of a repository to clarify its scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17fast-import: add '--signed-commits=<mode>' optionChristian Couder1-10/+53
A '--signed-commits=<mode>' option is already available when using `git fast-export` to decide what should be done at export time about commit signatures. At import time though, there is no option, or other way, in `git fast-import` to decide about commit signatures. To remediate that, let's add a '--signed-commits=<mode>' option to `git fast-import` too. For now the supported <mode>s are the same as those supported by `git fast-export`. The code responsible for consuming a signature is refactored into the import_one_signature() and discard_one_signature() functions, which makes it easier to follow the logic and add new modes in the future. In the 'strip' and 'warn-strip' modes, we deliberately use discard_one_signature() to discard the signature without parsing it. This ensures that even malformed signatures, which would cause the parser to fail, can be successfully stripped from a commit. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-05Merge branch 'ps/object-file-wo-the-repository'Junio C Hamano1-2/+2
Reduce implicit assumption and dependence on the_repository in the object-file subsystem. * ps/object-file-wo-the-repository: object-file: get rid of `the_repository` in index-related functions object-file: get rid of `the_repository` in `force_object_loose()` object-file: get rid of `the_repository` in `read_loose_object()` object-file: get rid of `the_repository` in loose object iterators object-file: remove declaration for `for_each_file_in_obj_subdir()` object-file: inline `for_each_loose_file_in_objdir_buf()` object-file: get rid of `the_repository` when writing objects odb: introduce `odb_write_object()` loose: write loose objects map via their source object-file: get rid of `the_repository` in `finalize_object_file()` object-file: get rid of `the_repository` in `loose_object_info()` object-file: get rid of `the_repository` when freshening objects object-file: inline `check_and_freshen()` functions object-file: get rid of `the_repository` in `has_loose_object()` object-file: stop using `the_hash_algo` object-file: fix -Wsign-compare warnings
2025-08-04Merge branch 'ps/config-wo-the-repository'Junio C Hamano1-6/+6
The config API had a set of convenience wrapper functions that implicitly use the_repository instance; they have been removed and inlined at the calling sites. * ps/config-wo-the-repository: (21 commits) config: fix sign comparison warnings config: move Git config parsing into "environment.c" config: remove unused `the_repository` wrappers config: drop `git_config_set_multivar()` wrapper config: drop `git_config_get_multivar_gently()` wrapper config: drop `git_config_set_multivar_in_file_gently()` wrapper config: drop `git_config_set_in_file_gently()` wrapper config: drop `git_config_set()` wrapper config: drop `git_config_set_gently()` wrapper config: drop `git_config_set_in_file()` wrapper config: drop `git_config_get_bool()` wrapper config: drop `git_config_get_ulong()` wrapper config: drop `git_config_get_int()` wrapper config: drop `git_config_get_string()` wrapper config: drop `git_config_get_string()` wrapper config: drop `git_config_get_string_multi()` wrapper config: drop `git_config_get_value()` wrapper config: drop `git_config_get_value()` wrapper config: drop `git_config_get()` wrapper config: drop `git_config_clear()` wrapper ...
2025-07-23Merge branch 'cc/fast-import-export-signature-names'Junio C Hamano1-22/+91
Clean up the way how signature on commit objects are exported to and imported from fast-import stream. * cc/fast-import-export-signature-names: fast-(import|export): improve on commit signature output format
2025-07-23config: drop `git_config_get_ulong()` wrapperPatrick Steinhardt1-2/+2
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_ulong()`. All callsites are adjusted so that they use `repo_config_get_ulong(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_int()` wrapperPatrick Steinhardt1-3/+3
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_int()`. All callsites are adjusted so that they use `repo_config_get_int(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config()` wrapperPatrick Steinhardt1-1/+1
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config()`. All callsites are adjusted so that they use `repo_config(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16object-file: get rid of `the_repository` in `finalize_object_file()`Patrick Steinhardt1-2/+2
We implicitly depend on `the_repository` when moving an object file into place in `finalize_object_file()`. Get rid of this global dependency by passing in a repository. Note that one might be pressed to inject an object database instead of a repository. But the function doesn't really care about the ODB at all. All it does is to move a file into place while checking whether there is any collision. As such, the functionality it provides is independent of the object database and only needs the repository as parameter so that it can adjust permissions of the file we are about to finalize. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-09fast-(import|export): improve on commit signature output formatChristian Couder1-22/+91
A recent commit, d9cb0e6ff8 (fast-export, fast-import: add support for signed-commits, 2025-03-10), added support for signed commits to fast-export and fast-import. When a signed commit is processed, fast-export can output either "gpgsig sha1" or "gpgsig sha256" depending on whether the signed commit uses the SHA-1 or SHA-256 Git object format. However, this implementation has a number of limitations: - the output format was not properly described in the documentation, - the output format is not very informative as it doesn't even say if the signature is an OpenPGP, an SSH, or an X509 signature, - the implementation doesn't support having both one signature on the SHA-1 object and one on the SHA-256 object. Let's improve on these limitations by improving fast-export and fast-import so that: - all the signatures are exported, - at most one signature on the SHA-1 object and one on the SHA-256 are imported, - if there is more than one signature on the SHA-1 object or on the SHA-256 object, fast-import emits a warning for each additional signature, - the output format is "gpgsig <git-hash-algo> <signature-format>", where <git-hash-algo> is the Git object format as before, and <signature-format> is the signature type ("openpgp", "x509", "ssh" or "unknown"), - the output is properly documented. About the output format: - <git-hash-algo> allows to know which representation of the commit was signed (the SHA-1 or the SHA-256 version) which helps with both signature verification and interoperability between repos with different hash functions, - <signature-format> helps tools that process the fast-export stream, so they don't have to parse the ASCII armor to identify the signature type. It could be even better to be able to import more than one signature on the SHA-1 object and on the SHA-256 object, but other parts of Git don't handle that well for now, so this is left for future improvements. Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `read_object_with_reference()`Patrick Steinhardt1-11/+8
Rename `read_object_with_reference()` to `odb_read_object_peeled()` to match other functions related to the object database and our modern coding guidelines. Furthermore though, the old name didn't really describe very well what this function actually does, which is to walk down any commit and tag objects until an object of the required type has been found. This is generally referred to as "peeling", so the new name should be way more descriptive. No compatibility wrapper is introduced as the function is not used a lot throughout our codebase. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `repo_read_object_file()`Patrick Steinhardt1-4/+4
Rename `repo_read_object_file()` to `odb_read_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `oid_object_info()`Patrick Steinhardt1-8/+9
Rename `oid_object_info()` to `odb_read_object_info()` as well as their `_extended()` variant to match other functions related to the object database and our modern coding guidelines. Introduce compatibility wrappers so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` in `odb_mkstemp()`Patrick Steinhardt1-1/+2
Get rid of our dependency on `the_repository` in `odb_mkstemp()` by passing in the object database as a parameter and adjusting all callers. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt1-1/+1
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-29object-store: move and rename `odb_pack_keep()`Patrick Steinhardt1-1/+2
The function `odb_pack_keep()` creates a file at the passed-in path. If this fails, then the function re-tries by first creating any potentially missing leading directories and then trying to create the file once again. As such, this function doesn't host any kind of logic that is specific to the object store, but is rather a generic helper function. Rename the function to `safe_create_file_with_leading_directories()` and move it into "path.c". While at it, refactor it so that it loses its dependency on `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt1-1/+1
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-file: move `safe_create_leading_directories()` into "path.c"Patrick Steinhardt1-1/+1
The `safe_create_leading_directories()` function and its relatives are located in "object-file.c", which is not a good fit as they provide generic functionality not related to objects at all. Move them into "path.c", which already hosts `safe_create_dir()` and its relative `safe_create_dir_in_gitdir()`. "path.c" is free of `the_repository`, but the moved functions depend on `the_repository` to read the "core.sharedRepository" config. Adapt the function signature to accept a repository as argument to fix the issue and adjust callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08Merge branch 'ps/object-wo-the-repository' into ps/object-file-cleanupJunio C Hamano1-4/+4
* ps/object-wo-the-repository: hash: stop depending on `the_repository` in `null_oid()` hash: fix "-Wsign-compare" warnings object-file: split out logic regarding hash algorithms delta-islands: stop depending on `the_repository` object-file-convert: stop depending on `the_repository` pack-bitmap-write: stop depending on `the_repository` pack-revindex: stop depending on `the_repository` pack-check: stop depending on `the_repository` environment: move access to "core.bigFileThreshold" into repo settings pack-write: stop depending on `the_repository` and `the_hash_algo` object: stop depending on `the_repository` csum-file: stop depending on `the_repository`
2025-03-29Merge branch 'cc/signed-fast-export-import'Junio C Hamano1-0/+23
"git fast-export | git fast-import" learns to deal with commit and tag objects with embedded signatures a bit better. * cc/signed-fast-export-import: fast-export, fast-import: add support for signed-commits fast-export: do not modify memory from get_commit_buffer git-fast-export.adoc: clarify why 'verbatim' may not be a good idea fast-export: rename --signed-tags='warn' to 'warn-verbatim' fast-export: fix missing whitespace after switch git-fast-import.adoc: add missing LF in the BNF
2025-03-10fast-export, fast-import: add support for signed-commitsLuke Shumaker1-0/+23
fast-export has a --signed-tags= option that controls how to handle tag signatures. However, there is no equivalent for commit signatures; it just silently strips the signature out of the commit (analogously to --signed-tags=strip). While signatures are generally problematic for fast-export/fast-import (because hashes are likely to change), if they're going to support tag signatures, there's no reason to not also support commit signatures. So, implement a --signed-commits= option that mirrors the --signed-tags= option. On the fast-export side, try to be as much like signed-tags as possible, in both implementation and in user-interface. This will change the default behavior to '--signed-commits=abort' from what is now '--signed-commits=strip'. In order to provide an escape hatch for users of third-party tools that call fast-export and do not yet know of the --signed-commits= option, add an environment variable 'FAST_EXPORT_SIGNED_COMMITS_NOABORT=1' that changes the default to '--signed-commits=warn-strip'. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10environment: move access to "core.bigFileThreshold" into repo settingsPatrick Steinhardt1-2/+2
The "core.bigFileThreshold" setting is stored in a global variable and populated via `git_default_core_config()`. This may cause issues in the case where one is handling multiple different repositories in a single process with different values for that config key, as we may or may not see the correct value in that case. Furthermore, global state blocks our path towards libification. Refactor the code so that we instead store the value in `struct repo_settings`, where the value is computed as-needed and cached. Note that this change requires us to adapt one test in t1050 that verifies that we die when parsing an invalid "core.bigFileThreshold" value. The exercised Git command doesn't use the value at all, and thus it won't hit the new code path that parses the value. This is addressed by using git-hash-object(1) instead, which does read the value. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10pack-write: stop depending on `the_repository` and `the_hash_algo`Patrick Steinhardt1-1/+1
There are a couple of functions in "pack-write.c" that implicitly depend on `the_repository` or `the_hash_algo`. Remove this dependency by injecting the repository via a parameter and adapt callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10csum-file: stop depending on `the_repository`Patrick Steinhardt1-1/+1
There are multiple sites in "csum-file.c" where we use the global `the_repository` variable, either explicitly or implicitly by using `the_hash_algo`. Refactor the code to stop using `the_repository` by adapting functions to receive required data as parameters. Adapt callsites accordingly by either using `the_repository->hash_algo`, or by using a context-provided hash algorithm in case the subsystem already got rid of its dependency on `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-05Merge branch 'ps/path-sans-the-repository'Junio C Hamano1-2/+2
The path.[ch] API takes an explicit repository parameter passed throughout the callchain, instead of relying on the_repository singleton instance. * ps/path-sans-the-repository: path: adjust last remaining users of `the_repository` environment: move access to "core.sharedRepository" into repo settings environment: move access to "core.hooksPath" into repo settings repo-settings: introduce function to clear struct path: drop `git_path()` in favor of `repo_git_path()` rerere: let `rerere_path()` write paths into a caller-provided buffer path: drop `git_common_path()` in favor of `repo_common_path()` worktree: return allocated string from `get_worktree_git_dir()` path: drop `git_path_buf()` in favor of `repo_git_path_replace()` path: drop `git_pathdup()` in favor of `repo_git_path()` path: drop unused `strbuf_git_path()` function path: refactor `repo_submodule_path()` family of functions submodule: refactor `submodule_to_gitdir()` to accept a repo path: refactor `repo_worktree_path()` family of functions path: refactor `repo_git_path()` family of functions path: refactor `repo_common_path()` family of functions
2025-02-10Merge branch 'ps/hash-cleanup'Junio C Hamano1-8/+8
Further code clean-up on the use of hash functions. Now the context object knows what hash function it is working with. * ps/hash-cleanup: global: adapt callers to use generic hash context helpers hash: provide generic wrappers to update hash contexts hash: stop typedeffing the hash context hash: convert hashing context to a structure
2025-02-07path: drop `git_pathdup()` in favor of `repo_git_path()`Patrick Steinhardt1-2/+2
Remove `git_pathdup()` in favor of `repo_git_path()`. The latter does essentially the same, with the only exception that it does not rely on `the_repository` but takes the repo as separate parameter. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03Merge branch 'kn/pack-write-with-reduced-globals'Junio C Hamano1-5/+6
Code clean-up. * kn/pack-write-with-reduced-globals: pack-write: pass hash_algo to internal functions pack-write: pass hash_algo to `write_rev_file()` pack-write: pass hash_algo to `write_idx_file()` pack-write: pass repository to `index_pack_lockfile()` pack-write: pass hash_algo to `fixup_pack_header_footer()`
2025-02-03Merge branch 'tb/unsafe-hash-cleanup'Junio C Hamano1-1/+1
The API around choosing to use unsafe variant of SHA-1 implementation has been updated in an attempt to make it harder to abuse. * tb/unsafe-hash-cleanup: hash.h: drop unsafe_ function variants csum-file: introduce hashfile_checkpoint_init() t/helper/test-hash.c: use unsafe_hash_algo() csum-file.c: use unsafe_hash_algo() hash.h: introduce `unsafe_hash_algo()` csum-file.c: extract algop from hashfile_checksum_valid() csum-file: store the hash algorithm as a struct field t/helper/test-tool: implement sha1-unsafe helper
2025-01-31global: adapt callers to use generic hash context helpersPatrick Steinhardt1-6/+6
Adapt callers to use generic hash context helpers instead of using the hash algorithm to update them. This makes the callsites easier to reason about and removes the possibility that the wrong hash algorithm is used to update the hash context's state. And as a nice side effect this also gets rid of a bunch of users of `the_hash_algo`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31hash: stop typedeffing the hash contextPatrick Steinhardt1-2/+2
We generally avoid using `typedef` in the Git codebase. One exception though is the `git_hash_ctx`, likely because it used to be a union rather than a struct until the preceding commit refactored it. But now that it is a normal `struct` there isn't really a need for a typedef anymore. Drop the typedef and adapt all callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31Merge branch 'tb/unsafe-hash-cleanup' into ps/hash-cleanupJunio C Hamano1-1/+1
* tb/unsafe-hash-cleanup: hash.h: drop unsafe_ function variants csum-file: introduce hashfile_checkpoint_init() t/helper/test-hash.c: use unsafe_hash_algo() csum-file.c: use unsafe_hash_algo() hash.h: introduce `unsafe_hash_algo()` csum-file.c: extract algop from hashfile_checksum_valid() csum-file: store the hash algorithm as a struct field t/helper/test-tool: implement sha1-unsafe helper
2025-01-23csum-file: introduce hashfile_checkpoint_init()Taylor Blau1-1/+1
In 106140a99f (builtin/fast-import: fix segfault with unsafe SHA1 backend, 2024-12-30) and 9218c0bfe1 (bulk-checkin: fix segfault with unsafe SHA1 backend, 2024-12-30), we observed the effects of failing to initialize a hashfile_checkpoint with the same hash function implementation as is used by the hashfile it is used to checkpoint. While both 106140a99f and 9218c0bfe1 work around the immediate crash, changing the hash function implementation within the hashfile API to, for example, the non-unsafe variant would re-introduce the crash. This is a result of the tight coupling between initializing hashfiles and hashfile_checkpoints. Introduce and use a new function which ensures that both parts of a hashfile and hashfile_checkpoint pair use the same hash function implementation to avoid such crashes. A few things worth noting: - In the change to builtin/fast-import.c::stream_blob(), we can see that by removing the explicit reference to 'the_hash_algo->unsafe_init_fn()', we are hardened against the hashfile API changing away from the_hash_algo (or its unsafe variant) in the future. - The bulk-checkin code no longer needs to explicitly zero-initialize the hashfile_checkpoint, since it is now done as a result of calling 'hashfile_checkpoint_init()'. - Also in the bulk-checkin code, we add an additional call to prepare_to_stream() outside of the main loop in order to initialize 'state->f' so we know which hash function implementation to use when calling 'hashfile_checkpoint_init()'. This is OK, since subsequent 'prepare_to_stream()' calls are noops. However, we only need to call 'prepare_to_stream()' when we have the HASH_WRITE_OBJECT bit set in our flags. Without that bit, calling 'prepare_to_stream()' does not assign 'state->f', so we have nothing to initialize. - Other uses of the 'checkpoint' in 'deflate_blob_to_pack()' are appropriately guarded. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21pack-write: pass hash_algo to `write_idx_file()`Karthik Nayak1-2/+2
The `write_idx_file()` function uses the global `the_hash_algo` variable to access the repository's hash_algo. To avoid global variable usage, pass a hash_algo from the layers above. Since `stage_tmp_packfiles()` also resides in 'pack-write.c' and calls `write_idx_file()`, update it to accept a `struct git_hash_algo` as a parameter and pass it through to the callee. Altough the layers above could have access to the hash_algo internally, simply pass in `the_hash_algo`. This avoids any compatibility issues and bubbles up global variable usage to upper layers which can be eventually resolved. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21pack-write: pass hash_algo to `fixup_pack_header_footer()`Karthik Nayak1-3/+4
The `fixup_pack_header_footer()` function uses the global `the_hash_algo` variable to access the repository's hash function. To avoid global variable usage, pass a hash_algo from the layers above. Altough the layers above could have access to the hash_algo internally, simply pass in `the_hash_algo`. This avoids any compatibility issues and bubbles up global variable usage to upper layers which can be eventually resolved. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17builtin: send usage() help text to standard outputJunio C Hamano1-2/+1
Using the show_usage_and_exit_if_asked() helper we introduced earlier, fix callers of usage() that want to show the help text when explicitly asked by the end-user. The help text now goes to the standard output stream for them. These are the bog standard "if we got only '-h', then that is a request for help" callers. Their if (argc == 2 && !strcmp(argv[1], "-h")) usage(message); are simply replaced with show_usage_and_exit_if_asked(argc, argv, message); With this, the built-ins tested by t0012 all send their help text to their standard output stream, so the check in t0012 that was half tightened earlier is now fully tightened to insist on standard error stream being empty. Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-01Merge branch 'ps/weak-sha1-for-tail-sum-fix'Junio C Hamano1-1/+1
An earlier "csum-file checksum does not have to be computed with sha1dc" topic had a few code paths that had initialized an implementation of a hash function to be used by an unmatching hash by mistake, which have been corrected. * ps/weak-sha1-for-tail-sum-fix: ci: exercise unsafe OpenSSL backend builtin/fast-import: fix segfault with unsafe SHA1 backend bulk-checkin: fix segfault with unsafe SHA1 backend
2024-12-30builtin/fast-import: fix segfault with unsafe SHA1 backendPatrick Steinhardt1-1/+1
Same as with the preceding commit, git-fast-import(1) is using the safe variant to initialize a hashfile checkpoint. This leads to a segfault when passing the checkpoint into the hashfile subsystem because it would use the unsafe variants instead: ++ git --git-dir=R/.git fast-import --big-file-threshold=1 AddressSanitizer:DEADLYSIGNAL ================================================================= ==577126==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x5070000009c0 sp 0x7fffffff5b30 T0) ==577126==The signal is caused by a READ memory access. ==577126==Hint: address points to the zero page. #0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) #1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2 #2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2 #3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2 #4 0x5555559647d1 in stream_blob ../builtin/fast-import.c:1110:2 #5 0x55555596247b in parse_and_store_blob ../builtin/fast-import.c:2031:3 #6 0x555555967f91 in file_change_m ../builtin/fast-import.c:2408:5 #7 0x55555595d8a2 in parse_new_commit ../builtin/fast-import.c:2768:4 #8 0x55555595bb7a in cmd_fast_import ../builtin/fast-import.c:3614:4 #9 0x555555b1f493 in run_builtin ../git.c:480:11 #10 0x555555b1bfef in handle_builtin ../git.c:740:9 #11 0x555555b1e6f4 in run_argv ../git.c:807:4 #12 0x555555b1b87a in cmd_main ../git.c:947:19 #13 0x5555561649e6 in main ../common-main.c:64:11 #14 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4) #15 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4) #16 0x555555772c84 in _start (git+0x21ec84) ==577126==Register values: rax = 0x0000511000000cc0 rbx = 0x0000000000000000 rcx = 0x000000000000000c rdx = 0x0000000000000000 rdi = 0x0000000000000000 rsi = 0x00005070000009c0 rbp = 0x00005070000009c0 rsp = 0x00007fffffff5b30 r8 = 0x0000000000000000 r9 = 0x0000000000000000 r10 = 0x0000000000000000 r11 = 0x00007ffff7a01a30 r12 = 0x0000000000000000 r13 = 0x00007fffffff6b60 r14 = 0x00007ffff7ffd000 r15 = 0x00005555563b9910 AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) in EVP_MD_CTX_copy_ex ==577126==ABORTING ./test-lib.sh: line 1039: 577126 Aborted git --git-dir=R/.git fast-import --big-file-threshold=1 < input error: last command exited with $?=134 not ok 167 - R: blob bigger than threshold The segfault is only exposed in case the unsafe and safe backends are different from one another. Fix the issue by initializing the context with the unsafe SHA1 variant. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-23Merge branch 'ps/build-sign-compare'Junio C Hamano1-0/+2
Start working to make the codebase buildable with -Wsign-compare. * ps/build-sign-compare: t/helper: don't depend on implicit wraparound scalar: address -Wsign-compare warnings builtin/patch-id: fix type of `get_one_patchid()` builtin/blame: fix type of `length` variable when emitting object ID gpg-interface: address -Wsign-comparison warnings daemon: fix type of `max_connections` daemon: fix loops that have mismatching integer types global: trivial conversions to fix `-Wsign-compare` warnings pkt-line: fix -Wsign-compare warning on 32 bit platform csum-file: fix -Wsign-compare warning on 32-bit platform diff.h: fix index used to loop through unsigned integer config.mak.dev: drop `-Wno-sign-compare` global: mark code units that generate warnings with `-Wsign-compare` compat/win32: fix -Wsign-compare warning in "wWinMain()" compat/regex: explicitly ignore "-Wsign-compare" warnings git-compat-util: introduce macros to disable "-Wsign-compare" warnings
2024-12-13Merge branch 'kn/midx-wo-the-repository'Junio C Hamano1-7/+8
Yet another "pass the repository through the callchain" topic. * kn/midx-wo-the-repository: midx: inline the `MIDX_MIN_SIZE` definition midx: pass down `hash_algo` to functions using global variables midx: pass `repository` to `load_multi_pack_index` midx: cleanup internal usage of `the_repository` and `the_hash_algo` midx-write: pass down repository to `write_midx_file[_only]` write-midx: add repository field to `write_midx_context` midx-write: use `revs->repo` inside `read_refs_snapshot` midx-write: pass down repository to static functions packfile.c: remove unnecessary prepare_packed_git() call midx: add repository to `multi_pack_index` struct config: make `packed_git_(limit|window_size)` non-global variables config: make `delta_base_cache_limit` a non-global variable packfile: pass down repository to `for_each_packed_object` packfile: pass down repository to `has_object[_kept]_pack` packfile: pass down repository to `odb_pack_name` packfile: pass `repository` to static function in the file packfile: use `repository` from `packed_git` directly packfile: add repository to struct `packed_git`
2024-12-13Merge branch 'en/fast-import-verify-path'Junio C Hamano1-0/+6
"git fast-import" learned to reject paths with ".." and "." as their components to avoid creating invalid tree objects. * en/fast-import-verify-path: t9300: test verification of renamed paths fast-import: disallow more path components fast-import: disallow "." and ".." path components