aboutsummaryrefslogtreecommitdiffstats
path: root/builtin (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-04-17Merge branch 'en/merge-recursive-debug'Junio C Hamano5-34/+10
Remove remnants of the recursive merge strategy backend, which was superseded by the ort merge strategy. * en/merge-recursive-debug: builtin/{merge,rebase,revert}: remove GIT_TEST_MERGE_ALGORITHM tests: remove GIT_TEST_MERGE_ALGORITHM and test_expect_merge_algorithm merge-recursive.[ch]: thoroughly debug these merge, sequencer: switch recursive merges over to ort sequencer: switch non-recursive merges over to ort merge-ort: enable diff-algorithms other than histogram builtin/merge-recursive: switch to using merge_ort_generic() checkout: replace merge_trees() with merge_ort_nonrecursive()
2025-04-17Merge branch 'kn/blame-porcelain-unblamable'Junio C Hamano1-0/+15
"git blame --porcelain" mode now talks about unblamable lines and lines that are blamed to an ignored commit. * kn/blame-porcelain-unblamable: blame: print unblamable and ignored commits in porcelain mode
2025-04-17Merge branch 'jk/fetch-follow-remote-head-fix'Junio C Hamano1-25/+11
"git fetch [<remote>]" with only the configured fetch refspec should be the only thing to update refs/remotes/<remote>/HEAD, but the code was overly eager to do so in other cases. * jk/fetch-follow-remote-head-fix: fetch: make set_head() call easier to read fetch: don't ask for remote HEAD if followRemoteHEAD is "never" fetch: only respect followRemoteHEAD with configured refspecs
2025-04-17parse-options: detect mismatches in integer signednessPatrick Steinhardt3-5/+5
It was reported that "t5620-backfill.sh" fails on s390x and sparc64 in a test that exercises the "--min-batch-size" command line option. The symptom was that the option didn't seem to have an effect: we didn't fetch objects with a batch size of 20, but instead fetched all objects at once. As it turns out, the root cause is that `--min-batch-size` uses `OPT_INTEGER()` to parse the command line option. While this macro expects the caller to pass a pointer to an integer, we instead pass a pointer to a `size_t`. This coincidentally works on most platforms, but it breaks apart on the mentioned platforms because they are big endian. This issue isn't specific to git-backfill(1): there are a couple of other places where we have the same type confusion going on. This indicates that the issue really is the interface that the parse-options subsystem provides -- it is simply too easy to get this wrong as there isn't any kind of compiler warning, and things just work on the most common systems. Address the systemic issue by introducing two new build asserts `BARF_UNLESS_SIGNED()` and `BARF_UNLESS_UNSIGNED()`. As the names already hint at, those macros will cause a compiler error when passed a value that is not signed or unsigned, respectively. Adapt `OPT_INTEGER()`, `OPT_UNSIGNED()` as well as `OPT_MAGNITUDE()` to use those asserts. This uncovers a small set of sites where we indeed have the same bug as in git-backfill(1). Adapt all of them to use the correct option. Reported-by: Todd Zullinger <tmz@pobox.com> Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Helped-by: SZEDER Gábor <szeder.dev@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-17parse-options: introduce precision handling for `OPTION_INTEGER`Patrick Steinhardt4-0/+5
The `OPTION_INTEGER` option type accepts a signed integer. The type of the underlying integer is a simple `int`, which restricts the range of values accepted by such options. But there is a catch: because the caller provides a pointer to the value via the `.value` field, which is a simple void pointer. This has two consequences: - There is no check whether the passed value is sufficiently long to store the entire range of `int`. This can lead to integer wraparound in the best case and out-of-bounds writes in the worst case. - Even when a caller knows that they want to store a value larger than `INT_MAX` they don't have a way to do so. In practice this doesn't tend to be a huge issue because users typically don't end up passing huge values to most commands. But the parsing logic is demonstrably broken, and it is too easy to get the calling convention wrong. Improve the situation by introducing a new `precision` field into the structure. This field gets assigned automatically by `OPT_INTEGER_F()` and tracks the size of the passed value. Like this it becomes possible for the caller to pass arbitrarily-sized integers and the underlying logic knows to handle it correctly by doing range checks. Furthermore, convert the code to use `strtoimax()` intstead of `strtol()` so that we can also parse values larger than `LONG_MAX`. Note that we do not yet assert signedness of the passed variable, which is another source of bugs. This will be handled in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-17parse-options: rename `OPT_MAGNITUDE()` to `OPT_UNSIGNED()`Patrick Steinhardt4-11/+11
With the preceding commit, `OPT_INTEGER()` has learned to support unit factors. Consequently, the major differencen between `OPT_INTEGER()` and `OPT_MAGNITUDE()` isn't the support of unit factors anymore, as both of them do support them now. Instead, the difference is that one handles signed and the other handles unsigned integers. Adapt the name of `OPT_MAGNITUDE()` accordingly by renaming it to `OPT_UNSIGNED()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-17global: use designated initializers for optionsPatrick Steinhardt20-132/+368
While we expose macros for most of our different option types understood by the "parse-options" subsystem, not every combination of fields that has one as that would otherwise quickly lead to an explosion of macros. Instead, we just initialize structures manually for those variants of fields that don't have a macro. Callsites that open-code these structure initialization don't use designated initializers though and instead just provide values for each of the fields that they want to initialize. This has three significant downsides: - Callsites need to specify all values up to the last field that they care about. This often includes fields that should simply be left at their default zero-initialized state, which adds distraction. - Any reader not deeply familiar with the layout of the structure has a hard time figuring out what the respective initializers mean. - Reordering or introducing new fields in the middle of the structure is impossible without adapting all callsites. Convert all sites to instead use designated initializers, which we have started using in our codebase quite a while ago. This allows us to skip any default-initialized fields, gives the reader context by specifying the field names and allows us to reorder or introduce new fields where we want to. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-16builtin/gc.c: correct RAM calculation when using sysinfoRamsay Jones1-2/+7
The man page for sysinfo(2) on Linux states that (from v2.3.48) the sizes of the memory and swap fields, of the returned structure, are given as multiples of 'mem_unit' bytes. In earlier versions (prior to v2.3.23 on i386 in particular), the 'mem_unit' field was not part of the structure, and all sizes were measured in bytes. The man page does not discuss the motivation for this change, but it is possible that the change was intended for the, relatively rare, 32-bit platform with more than 4GB of memory. The total_ram() function makes the assumption that the 'totalram' field of the 'struct sysinfo' is measured in bytes, or alternatively that the 'mem_unit' field is always equal to one. Having writen a program to call the sysinfo() function and print the structure fields, it seems that, on Linux x84_64 and i686 anyway, the 'mem_unit' field is indeed set to one (note that the 32-bit system had only 2GB ram). However, cygwin also has an sysinfo() implementation, which gives the following values: $ ./sysinfo uptime: 21381 loads: 0, 0, 0 total ram: 2074637 free ram: 843237 shared ram: 0 buffer ram: 0 total swap: 327680 free swap: 306932 procs: 15 total high: 0 free high: 0 mem_unit: 4096 total ram: 8497713152 $ [This laptop has 8GB ram, so a little bit seems to be missing. ;) ] Modify the total_ram() function to allow for the possibility that the memory size is not specified in bytes (ie 'mem_unit' is greater than one). Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-16Merge branch 'ps/cat-file-filter-batch'Junio C Hamano3-71/+191
"git cat-file --batch" and friends learned to allow "--filter=" to omit certain objects, just like the transport layer does. * ps/cat-file-filter-batch: builtin/cat-file: use bitmaps to efficiently filter by object type builtin/cat-file: deduplicate logic to iterate over all objects pack-bitmap: introduce function to check whether a pack is bitmapped pack-bitmap: add function to iterate over filtered bitmapped objects pack-bitmap: allow passing payloads to `show_reachable_fn()` builtin/cat-file: support "object:type=" objects filter builtin/cat-file: support "blob:limit=" objects filter builtin/cat-file: support "blob:none" objects filter builtin/cat-file: wire up an option to filter objects builtin/cat-file: introduce function to report object status builtin/cat-file: rename variable that tracks usage
2025-04-16Merge branch 'kn/non-transactional-batch-updates'Junio C Hamano2-6/+62
Updating multiple references have only been possible in all-or-none fashion with transactions, but it can be more efficient to batch multiple updates even when some of them are allowed to fail in a best-effort manner. A new "best effort batches of updates" mode has been introduced. * kn/non-transactional-batch-updates: update-ref: add --batch-updates flag for stdin mode refs: support rejection in batch updates during F/D checks refs: implement batch reference update support refs: introduce enum-based transaction error types refs/reftable: extract code from the transaction preparation refs/files: remove duplicate duplicates check refs: move duplicate refname update check to generic layer refs/files: remove redundant check in split_symref_update()
2025-04-16Merge branch 'ps/maintenance-reflog-expire'Junio C Hamano2-148/+77
"git maintenance" learns a new task to expire reflog entries. * ps/maintenance-reflog-expire: builtin/maintenance: introduce "reflog-expire" task builtin/gc: split out function to expire reflog entries builtin/reflog: make functions regarding `reflog_expire_options` public builtin/reflog: stop storing per-reflog expiry dates globally builtin/reflog: stop storing default reflog expiry dates globally reflog: rename `cmd_reflog_expire_cb` to `reflog_expire_options`
2025-04-16Merge branch 'jt/rev-list-z'Junio C Hamano1-24/+69
"git rev-list" learns machine-parsable output format that delimits each field with NUL. * jt/rev-list-z: rev-list: support NUL-delimited --missing option rev-list: support NUL-delimited --boundary option rev-list: support delimiting objects with NUL bytes rev-list: refactor early option parsing rev-list: inline `show_object_with_name()` in `show_object()`
2025-04-16Merge branch 'ab/rm-sign-compare'Junio C Hamano1-12/+9
Some warnings from "-Wsign-compare" for builtin/rm.c have been squelched. * ab/rm-sign-compare: rm: fix sign comparison warnings
2025-04-16Merge branch 'jt/ref-transaction-abort-fix'Junio C Hamano1-1/+8
A ref transaction corner case fix. * jt/ref-transaction-abort-fix: builtin/fetch: avoid aborting closed reference transaction
2025-04-15Merge branch 'js/comma-semicolon-confusion'Junio C Hamano1-1/+1
Code clean-up. * js/comma-semicolon-confusion: detect-compiler: detect clang even if it found CUDA clang: warn when the comma operator is used compat/regex: explicitly mark intentional use of the comma operator wildmatch: avoid using of the comma operator diff-delta: avoid using the comma operator xdiff: avoid using the comma operator unnecessarily clar: avoid using the comma operator unnecessarily kwset: avoid using the comma operator unnecessarily rebase: avoid using the comma operator unnecessarily remote-curl: avoid using the comma operator unnecessarily
2025-04-15Merge branch 'jt/clone-guess-remote-head-fix'Junio C Hamano3-4/+7
"git clone" still gave the message about the default branch name; this message has been turned into an advice message that can be turned off. * jt/clone-guess-remote-head-fix: advice: allow disabling default branch name advice builtin/clone: suppress unexpected default branch advice remote: allow `guess_remote_head()` to suppress advice
2025-04-15Merge branch 'ds/maintenance-loose-objects-batchsize'Junio C Hamano1-0/+20
The job to coalesce loose objects into packfiles in "git maintenance" now has configurable batch size. * ds/maintenance-loose-objects-batchsize: maintenance: add loose-objects.batchSize config maintenance: force progress/no-quiet to children
2025-04-15Merge branch 'kn/reflog-drop'Junio C Hamano1-2/+66
"git reflog" learns "drop" subcommand, that discards the entire reflog data for a ref. * kn/reflog-drop: reflog: implement subcommand to drop reflogs reflog: improve error for when reflog is not found
2025-04-15Merge branch 'ps/object-wo-the-repository'Junio C Hamano21-65/+73
The object layer has been updated to take an explicit repository instance as a parameter in more code paths. * ps/object-wo-the-repository: hash: stop depending on `the_repository` in `null_oid()` hash: fix "-Wsign-compare" warnings object-file: split out logic regarding hash algorithms delta-islands: stop depending on `the_repository` object-file-convert: stop depending on `the_repository` pack-bitmap-write: stop depending on `the_repository` pack-revindex: stop depending on `the_repository` pack-check: stop depending on `the_repository` environment: move access to "core.bigFileThreshold" into repo settings pack-write: stop depending on `the_repository` and `the_hash_algo` object: stop depending on `the_repository` csum-file: stop depending on `the_repository`
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt38-38/+38
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-file: split up concerns of `HASH_*` flagsPatrick Steinhardt3-8/+19
The functions `hash_object_file()`, `write_object_file()` and `index_fd()` reuse the same set of flags to alter their behaviour. This not only adds confusion, but given that every function only supports a subset of the flags it becomes very hard to see which flags can be passed to what function. Last but not least, this entangles the implementation of all three function families. Split up concerns by creating separate flags for each of the function families. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-file: split out functions relating to object store subsystemPatrick Steinhardt8-0/+8
While we have the "object-store.h" header, most of the functionality for object stores is actually hosted in "object-file.c". This makes it hard to find relevant functions and causes us to mix up concerns. Split out functions relating to the object store subsystem into a new "object-store.c" file. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-file: move `safe_create_leading_directories()` into "path.c"Patrick Steinhardt14-40/+43
The `safe_create_leading_directories()` function and its relatives are located in "object-file.c", which is not a good fit as they provide generic functionality not related to objects at all. Move them into "path.c", which already hosts `safe_create_dir()` and its relative `safe_create_dir_in_gitdir()`. "path.c" is free of `the_repository`, but the moved functions depend on `the_repository` to read the "core.sharedRepository" config. Adapt the function signature to accept a repository as argument to fix the issue and adjust callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-file: move `mkdir_in_gitdir()` into "path.c"Patrick Steinhardt1-1/+2
The `mkdir_in_gitdir()` function is similar to `safe_create_dir()`, but the former is hosted in "object-file.c" whereas the latter is hosted in "path.c". The latter code unit makes way more sense though as the logic has nothing to do with object files in particular. Move the file into "path.c". While at it, we: - Rename the function to `safe_create_dir_in_gitdir()` so that the function names are similar to one another. - Remove the dependency on `the_repository` by making the callers pass the repository instead. Adjust callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-14doc: convert git-mv to new documentation formatJean-Noël Avila1-1/+1
- Switch the synopsis to a synopsis block which will automatically format placeholders in italics and keywords in monospace - Use _<placeholder>_ instead of <placeholder> in the description - Use `backticks` for keywords and more complex option descriptions. The new rendering engine will apply synopsis rules to these spans. Unfortunately, there's an inconsistency in the synopsis style, where the ellipsis is used to indicate that the option can be repeated, but it can also be used in Git's three-dot notation to indicate a range of commits. The rendering engine will not be able to distinguish between these two cases. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-14doc: move synopsis git-mv commands in the synopsis sectionJean-Noël Avila1-1/+2
This also entails changing the help output for the command to match the new synopsis. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-09fetch: make set_head() call easier to readJeff King1-4/+5
We ignore any error returned from set_head(), but 638060dcb9 (fetch set_head: refactor to use remote directly, 2025-01-26) left its call in a noop "if" conditional as a sort of note-to-self. When c834d1a7ce (fetch: only respect followRemoteHEAD with configured refspecs, 2025-03-18) added a "do_set_head" flag, it was rolled into the same conditional, putting set_head() on the right-hand side of a short-circuit AND. That's not wrong, but it really hides the point of the line, which is (maybe) calling the function. Instead, let's have a full if() block for the flag, and then our comment (with some rewording) will be sufficient to clarify the error handling. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08builtin/update-server-info: remove unnecessary if statementUsman Akinyemi1-2/+2
Since we already teach the `repo_config()` in f29f1990 (config: teach repo_config to allow `repo` to be NULL, 2025-03-08) to allow `repo` to be NULL, no need to check if `repo` is NULL before calling `repo_config()`. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08Merge branch 'ps/object-wo-the-repository' into ps/object-file-cleanupJunio C Hamano21-65/+73
* ps/object-wo-the-repository: hash: stop depending on `the_repository` in `null_oid()` hash: fix "-Wsign-compare" warnings object-file: split out logic regarding hash algorithms delta-islands: stop depending on `the_repository` object-file-convert: stop depending on `the_repository` pack-bitmap-write: stop depending on `the_repository` pack-revindex: stop depending on `the_repository` pack-check: stop depending on `the_repository` environment: move access to "core.bigFileThreshold" into repo settings pack-write: stop depending on `the_repository` and `the_hash_algo` object: stop depending on `the_repository` csum-file: stop depending on `the_repository`
2025-04-08builtin/{merge,rebase,revert}: remove GIT_TEST_MERGE_ALGORITHMElijah Newren3-20/+1
This environment variable existed to allow the testsuite to reuse all the merge-related tests in the testsuite while easily flipping between the 'recursive' and the 'ort' backends. Now that we have removed merge-recursive and remapped 'recursive' to mean 'ort', we don't need this scaffolding anymore. Remove it from these three builtins. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08merge, sequencer: switch recursive merges over to ortElijah Newren1-7/+2
More precisely, replace calls to merge_recursive() with merge_ort_recursive(). Also change t7615 to quit calling out recursive; it is not needed anymore, and we are in fact using ort now. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08builtin/merge-recursive: switch to using merge_ort_generic()Elijah Newren1-2/+2
Switch from merge-recursive to merge-ort. Adjust the following testcases due to the switch: * t6430: most of the test differences here were due to improved D/F conflict handling explained in more detail in ef527787089c (merge tests: expect improved directory/file conflict handling in ort, 2020-10-26). These changes weren't made to this test back in that commit simply because I had been looking at `git merge` rather than `git merge-recursive`. The final test in this testsuite, though, was expunged because it was looking for specific output, and the calls to output_commit_title() were discarded from merge_ort_internal() in its adaptation from merge_recursive_internal(); see 8119214f4e70 (merge-ort: implement merge_incore_recursive(), 2020-12-16). * t6434: This test is built entirely around rename/delete conflicts, which had a suboptimal handling under merge-recursive. As explained in more detail in commits 1f3c9ba707 ("t6425: be more flexible with rename/delete conflict messages", 2020-08-10) and 727c75b23f ("t6404, t6423: expect improved rename/delete handling in ort backend", 2020-10-26), rename/delete conflicts should each have two entries in the index rather than just one. Adjust the expectations for all the tests in this testcase to see the two entries per rename/delete conflict. * t6424: merge-recursive had a special check-if-toplevel-trees-match check that it ran at the beginning on both the merge-base and the other side being merged in. In such a case, it exited early and printed an "Already up to date." message. merge-ort got rid of this, and instead checks the merge base tree matching the other side throughout the tree instead of just at the toplevel, allowing it to avoid recursing into various subtrees. As part of that, it got rid of the specialty toplevel message. That message hasn't been missed for years from `git merge`, so I don't think it is necessary to keep it just for `git merge-recursive`, especially since the latter is rarely used. (git itself only references it in the testsuite, whereas it used to power one of the three rebase backends that existed once upon a time.) Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08checkout: replace merge_trees() with merge_ort_nonrecursive()Elijah Newren1-5/+5
Replace the use of merge_trees() from merge-recursive.[ch] with the merge-ort equivalent. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08Merge branch 'tb/incremental-midx-part-2'Junio C Hamano1-1/+2
Incrementally updating multi-pack index files. * tb/incremental-midx-part-2: midx: implement writing incremental MIDX bitmaps pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators pack-bitmap.c: keep track of each layer's type bitmaps ewah: implement `struct ewah_or_iterator` pack-bitmap.c: apply pseudo-merge commits with incremental MIDXs pack-bitmap.c: compute disk-usage with incremental MIDXs pack-bitmap.c: teach `rev-list --test-bitmap` about incremental MIDXs pack-bitmap.c: support bitmap pack-reuse with incremental MIDXs pack-bitmap.c: teach `show_objects_for_type()` about incremental MIDXs pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs pack-bitmap.c: open and store incremental bitmap layers pack-revindex: prepare for incremental MIDX bitmaps Documentation: describe incremental MIDX bitmaps Documentation: remove a "future work" item from the MIDX docs
2025-04-08Merge branch 'tb/refspec-fetch-cleanup'Junio C Hamano2-2/+3
Code clean-up. * tb/refspec-fetch-cleanup: refspec: replace `refspec_item_init()` with fetch/push variants refspec: remove refspec_item_init_or_die() refspec: replace `refspec_init()` with fetch/push variants refspec: treat 'fetch' as a Boolean value
2025-04-08update-ref: add --batch-updates flag for stdin modeKarthik Nayak1-5/+61
When updating multiple references through stdin, Git's update-ref command normally aborts the entire transaction if any single update fails. This atomic behavior prevents partial updates. Introduce a new batch update system, where the updates the performed together similar but individual updates are allowed to fail. Add a new `--batch-updates` flag that allows the transaction to continue even when individual reference updates fail. This flag can only be used in `--stdin` mode and builds upon the batch update support added to the refs subsystem in the previous commits. When enabled, failed updates are reported in the following format: rejected SP (<old-oid> | <old-target>) SP (<new-oid> | <new-target>) SP <rejection-reason> LF Update the documentation to reflect this change and also tests to cover different scenarios where an update could be rejected. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08refs: introduce enum-based transaction error typesKarthik Nayak1-1/+1
Replace preprocessor-defined transaction errors with a strongly-typed enum `ref_transaction_error`. This change: - Improves type safety and function signature clarity. - Makes error handling more explicit and discoverable. - Maintains existing error cases, while adding new error cases for common scenarios. This refactoring paves the way for more comprehensive error handling which we will utilize in the upcoming commits to add batch reference update support. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08builtin/maintenance: introduce "reflog-expire" taskPatrick Steinhardt1-0/+50
By default, git-maintenance(1) uses the "gc" task to ensure that the repository is well-maintained. This can be changed, for example by either explicitly configuring which tasks should be enabled or by using the "incremental" maintenance strategy. If so, git-maintenance(1) does not know to expire reflog entries, which is a subtask that git-gc(1) knows to perform for the user. Consequently, the reflog will grow indefinitely unless the user manually trims it. Introduce a new "reflog-expire" task that plugs this gap: - When running the task directly, then we simply execute `git reflog expire --all`, which is the same as git-gc(1). - When running git-maintenance(1) with the `--auto` flag, then we only run the task in case the "HEAD" reflog has at least N reflog entries that would be discarded. By default, N is set to 100, but this can be configured via "maintenance.reflog-expire.auto". When a negative integer has been provided we always expire entries, zero causes us to never expire entries, and a positive value specifies how many entries need to exist before we consider pruning the entries. Note that the condition for the `--auto` flags is merely a heuristic and optimized for being fast. This is because `git maintenance run --auto` will be executed quite regularly, so scanning through all reflogs would likely be too expensive in many repositories. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08builtin/gc: split out function to expire reflog entriesPatrick Steinhardt1-11/+11
We're about to introduce a new task for git-maintenance(1) that knows to expire reflog entries. The logic will be shared with git-gc(1), which already knows how to do this. Pull out the common logic into a separate function so that we can share the implementation between both builtins. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08builtin/reflog: make functions regarding `reflog_expire_options` publicPatrick Steinhardt1-111/+4
Make functions that are required to manage `reflog_expire_options` available elsewhere by moving them into "reflog.c" and exposing them in the corresponding header. The functions will be used in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08builtin/reflog: stop storing per-reflog expiry dates globallyPatrick Steinhardt1-18/+12
As described in the preceding commit, the per-reflog expiry dates are stored in a global pair of variables. Refactor the code so that they are contained in `struct reflog_expire_options` to make the structure useful in other contexts. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08builtin/reflog: stop storing default reflog expiry dates globallyPatrick Steinhardt1-15/+7
When expiring reflog entries, it is possible to configure expiry dates that depend on the name of the reflog. This requires us to store a couple of different expiry dates: - The default expiry date for reflog entries that aren't otherwise specified. - The per-reflog expiry date. - The currently active set of expiry dates for a given reference. While the last item is stored in `struct reflog_expire_options`, the other items aren't, which makes it hard to reuse the structure in other places. Refactor the code so that the default expiry date is stored as part of the structure. The per-reflog expiry dates will be adapted accordingly in the subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08reflog: rename `cmd_reflog_expire_cb` to `reflog_expire_options`Patrick Steinhardt1-19/+19
We're about to expose `struct cmd_reflog_expire_cb` via "reflog.h" so that we can also use this structure in "builtin/gc.c". Once we make it accessible to a wider scope though it becomes awkwardly named, as it isn't only useful in the context of a callback. Instead, the function is containing all kinds of options relevant to whether or not a reflog entry should be expired. Rename the structure to `reflog_expire_options` to prepare for this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07blame: print unblamable and ignored commits in porcelain modeKarthik Nayak1-0/+15
The 'git-blame(1)' command allows users to ignore specific revisions via the '--ignore-rev <rev>' and '--ignore-revs-file <file>' flags. These flags are often combined with the 'blame.markIgnoredLines' and 'blame.markUnblamableLines' config options. These config options prefix ignored and unblamable lines with a '?' and '*', respectively. However, this option was never extended to the porcelain mode of 'git-blame(1)'. Since the documentation does not indicate this exclusion, it is a bug. Fix this by printing 'ignored' and 'unblamable' respectively for the options when using the porcelain modes. Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: Toon Claes <toon@iotcl.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: use bitmaps to efficiently filter by object typePatrick Steinhardt1-5/+37
While it is now possible to filter objects by type, this mechanism is for now mostly a convenience. Most importantly, we still have to iterate through the whole packfile to find all objects of a specific type. This can be prohibitively expensive depending on the size of the packfiles. It isn't really possible to do better than this when only considering a packfile itself, as the order of objects is not fixed. But when we have a packfile with a corresponding bitmap, either because the packfile itself has one or because the multi-pack index has a bitmap for it, then we can use these bitmaps to improve the runtime. While bitmaps are typically used to compute reachability of objects, they also contain one bitmap per object type that encodes which object has what type. So instead of reading through the whole packfile(s), we can use the bitmaps and iterate through the type-specific bitmap. Typically, only a subset of packfiles will have a bitmap. But this isn't really much of a problem: we can use bitmaps when available, and then use the non-bitmap walk for every packfile that isn't covered by one. Overall, this leads to quite a significant speedup depending on how many objects of a certain type exist. The following benchmarks have been executed in the Chromium repository, which has a 50GB packfile with almost 25 million objects. As expected, there isn't really much of a change in performance without an object filter: Benchmark 1: cat-file with no-filter (revision = HEAD~) Time (mean ± σ): 89.675 s ± 4.527 s [User: 40.807 s, System: 10.782 s] Range (min … max): 83.052 s … 96.084 s 10 runs Benchmark 2: cat-file with no-filter (revision = HEAD) Time (mean ± σ): 88.991 s ± 2.488 s [User: 42.278 s, System: 10.305 s] Range (min … max): 82.843 s … 91.271 s 10 runs Summary cat-file with no-filter (revision = HEAD) ran 1.01 ± 0.06 times faster than cat-file with no-filter (revision = HEAD~) We still have to scan through all objects as we yield all of them, so using the bitmap in this case doesn't really buy us anything. What is noticeable in this benchmark is that we're I/O-bound, not CPU-bound, as can be seen from the user/system runtimes, which combined are way lower than the overall benchmarked runtime. But when we do use a filter we can see a significant improvement: Benchmark 1: cat-file with filter=object:type=commit (revision = HEAD~) Time (mean ± σ): 86.444 s ± 4.081 s [User: 36.830 s, System: 11.312 s] Range (min … max): 80.305 s … 93.104 s 10 runs Benchmark 2: cat-file with filter=object:type=commit (revision = HEAD) Time (mean ± σ): 2.089 s ± 0.015 s [User: 1.872 s, System: 0.207 s] Range (min … max): 2.073 s … 2.119 s 10 runs Summary cat-file with filter=object:type=commit (revision = HEAD) ran 41.38 ± 1.98 times faster than cat-file with filter=object:type=commit (revision = HEAD~) This is because we don't have to scan through all packfiles anymore, but can instead directly look up relevant objects. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: deduplicate logic to iterate over all objectsPatrick Steinhardt1-37/+48
Pull out a common function that allows us to iterate over all objects in a repository. Right now the logic is trivial and would only require two function calls, making this refactoring a bit pointless. But in the next commit we will iterate on this logic to make use of bitmaps, so this is about to become a bit more complex. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07pack-bitmap: allow passing payloads to `show_reachable_fn()`Patrick Steinhardt2-2/+4
The `show_reachable_fn` callback is used by a couple of functions to present reachable objects to the caller. The function does not provide a way for the caller to pass a payload though, which is functionality that we'll require in a subsequent commit. Change the callback type to accept a payload and adapt all callsites accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: support "object:type=" objects filterPatrick Steinhardt1-1/+11
Implement support for the "object:type=" filter in git-cat-file(1), which causes us to omit all objects that don't match the provided object type. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: support "blob:limit=" objects filterPatrick Steinhardt1-1/+14
Implement support for the "blob:limit=" filter in git-cat-file(1), which causes us to omit all blobs that are bigger than a certain size. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: support "blob:none" objects filterPatrick Steinhardt1-1/+14
Implement support for the "blob:none" filter in git-cat-file(1), which causes us to omit all blobs. Note that this new filter requires us to read the object type via `oid_object_info_extended()` in `batch_object_write()`. But as we try to optimize away reading objects from the database the `data->info.typep` pointer may not be set. We thus have to adapt the logic to conditionally set the pointer in cases where the filter is given. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>