| Age | Commit message (Collapse) | Author | Files | Lines |
|
Remove remnants of the recursive merge strategy backend, which was
superseded by the ort merge strategy.
* en/merge-recursive-debug:
builtin/{merge,rebase,revert}: remove GIT_TEST_MERGE_ALGORITHM
tests: remove GIT_TEST_MERGE_ALGORITHM and test_expect_merge_algorithm
merge-recursive.[ch]: thoroughly debug these
merge, sequencer: switch recursive merges over to ort
sequencer: switch non-recursive merges over to ort
merge-ort: enable diff-algorithms other than histogram
builtin/merge-recursive: switch to using merge_ort_generic()
checkout: replace merge_trees() with merge_ort_nonrecursive()
|
|
"git blame --porcelain" mode now talks about unblamable lines and
lines that are blamed to an ignored commit.
* kn/blame-porcelain-unblamable:
blame: print unblamable and ignored commits in porcelain mode
|
|
"git fetch [<remote>]" with only the configured fetch refspec
should be the only thing to update refs/remotes/<remote>/HEAD,
but the code was overly eager to do so in other cases.
* jk/fetch-follow-remote-head-fix:
fetch: make set_head() call easier to read
fetch: don't ask for remote HEAD if followRemoteHEAD is "never"
fetch: only respect followRemoteHEAD with configured refspecs
|
|
It was reported that "t5620-backfill.sh" fails on s390x and sparc64 in a
test that exercises the "--min-batch-size" command line option. The
symptom was that the option didn't seem to have an effect: we didn't
fetch objects with a batch size of 20, but instead fetched all objects
at once.
As it turns out, the root cause is that `--min-batch-size` uses
`OPT_INTEGER()` to parse the command line option. While this macro
expects the caller to pass a pointer to an integer, we instead pass a
pointer to a `size_t`. This coincidentally works on most platforms, but
it breaks apart on the mentioned platforms because they are big endian.
This issue isn't specific to git-backfill(1): there are a couple of
other places where we have the same type confusion going on. This
indicates that the issue really is the interface that the parse-options
subsystem provides -- it is simply too easy to get this wrong as there
isn't any kind of compiler warning, and things just work on the most
common systems.
Address the systemic issue by introducing two new build asserts
`BARF_UNLESS_SIGNED()` and `BARF_UNLESS_UNSIGNED()`. As the names
already hint at, those macros will cause a compiler error when passed a
value that is not signed or unsigned, respectively.
Adapt `OPT_INTEGER()`, `OPT_UNSIGNED()` as well as `OPT_MAGNITUDE()` to
use those asserts. This uncovers a small set of sites where we indeed
have the same bug as in git-backfill(1). Adapt all of them to use the
correct option.
Reported-by: Todd Zullinger <tmz@pobox.com>
Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `OPTION_INTEGER` option type accepts a signed integer. The type of
the underlying integer is a simple `int`, which restricts the range of
values accepted by such options. But there is a catch: because the
caller provides a pointer to the value via the `.value` field, which is
a simple void pointer. This has two consequences:
- There is no check whether the passed value is sufficiently long to
store the entire range of `int`. This can lead to integer wraparound
in the best case and out-of-bounds writes in the worst case.
- Even when a caller knows that they want to store a value larger than
`INT_MAX` they don't have a way to do so.
In practice this doesn't tend to be a huge issue because users typically
don't end up passing huge values to most commands. But the parsing logic
is demonstrably broken, and it is too easy to get the calling convention
wrong.
Improve the situation by introducing a new `precision` field into the
structure. This field gets assigned automatically by `OPT_INTEGER_F()`
and tracks the size of the passed value. Like this it becomes possible
for the caller to pass arbitrarily-sized integers and the underlying
logic knows to handle it correctly by doing range checks. Furthermore,
convert the code to use `strtoimax()` intstead of `strtol()` so that we
can also parse values larger than `LONG_MAX`.
Note that we do not yet assert signedness of the passed variable, which
is another source of bugs. This will be handled in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
With the preceding commit, `OPT_INTEGER()` has learned to support unit
factors. Consequently, the major differencen between `OPT_INTEGER()` and
`OPT_MAGNITUDE()` isn't the support of unit factors anymore, as both of
them do support them now. Instead, the difference is that one handles
signed and the other handles unsigned integers.
Adapt the name of `OPT_MAGNITUDE()` accordingly by renaming it to
`OPT_UNSIGNED()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While we expose macros for most of our different option types understood
by the "parse-options" subsystem, not every combination of fields that
has one as that would otherwise quickly lead to an explosion of macros.
Instead, we just initialize structures manually for those variants of
fields that don't have a macro.
Callsites that open-code these structure initialization don't use
designated initializers though and instead just provide values for each
of the fields that they want to initialize. This has three significant
downsides:
- Callsites need to specify all values up to the last field that they
care about. This often includes fields that should simply be left at
their default zero-initialized state, which adds distraction.
- Any reader not deeply familiar with the layout of the structure
has a hard time figuring out what the respective initializers mean.
- Reordering or introducing new fields in the middle of the structure
is impossible without adapting all callsites.
Convert all sites to instead use designated initializers, which we have
started using in our codebase quite a while ago. This allows us to skip
any default-initialized fields, gives the reader context by specifying
the field names and allows us to reorder or introduce new fields where
we want to.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The man page for sysinfo(2) on Linux states that (from v2.3.48) the
sizes of the memory and swap fields, of the returned structure, are
given as multiples of 'mem_unit' bytes. In earlier versions (prior to
v2.3.23 on i386 in particular), the 'mem_unit' field was not part of
the structure, and all sizes were measured in bytes. The man page does
not discuss the motivation for this change, but it is possible that the
change was intended for the, relatively rare, 32-bit platform with more
than 4GB of memory.
The total_ram() function makes the assumption that the 'totalram' field
of the 'struct sysinfo' is measured in bytes, or alternatively that the
'mem_unit' field is always equal to one. Having writen a program to call
the sysinfo() function and print the structure fields, it seems that, on
Linux x84_64 and i686 anyway, the 'mem_unit' field is indeed set to one
(note that the 32-bit system had only 2GB ram). However, cygwin also has
an sysinfo() implementation, which gives the following values:
$ ./sysinfo
uptime: 21381
loads: 0, 0, 0
total ram: 2074637
free ram: 843237
shared ram: 0
buffer ram: 0
total swap: 327680
free swap: 306932
procs: 15
total high: 0
free high: 0
mem_unit: 4096
total ram: 8497713152
$
[This laptop has 8GB ram, so a little bit seems to be missing. ;) ]
Modify the total_ram() function to allow for the possibility that the
memory size is not specified in bytes (ie 'mem_unit' is greater than
one).
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git cat-file --batch" and friends learned to allow "--filter=" to
omit certain objects, just like the transport layer does.
* ps/cat-file-filter-batch:
builtin/cat-file: use bitmaps to efficiently filter by object type
builtin/cat-file: deduplicate logic to iterate over all objects
pack-bitmap: introduce function to check whether a pack is bitmapped
pack-bitmap: add function to iterate over filtered bitmapped objects
pack-bitmap: allow passing payloads to `show_reachable_fn()`
builtin/cat-file: support "object:type=" objects filter
builtin/cat-file: support "blob:limit=" objects filter
builtin/cat-file: support "blob:none" objects filter
builtin/cat-file: wire up an option to filter objects
builtin/cat-file: introduce function to report object status
builtin/cat-file: rename variable that tracks usage
|
|
Updating multiple references have only been possible in all-or-none
fashion with transactions, but it can be more efficient to batch
multiple updates even when some of them are allowed to fail in a
best-effort manner. A new "best effort batches of updates" mode
has been introduced.
* kn/non-transactional-batch-updates:
update-ref: add --batch-updates flag for stdin mode
refs: support rejection in batch updates during F/D checks
refs: implement batch reference update support
refs: introduce enum-based transaction error types
refs/reftable: extract code from the transaction preparation
refs/files: remove duplicate duplicates check
refs: move duplicate refname update check to generic layer
refs/files: remove redundant check in split_symref_update()
|
|
"git maintenance" learns a new task to expire reflog entries.
* ps/maintenance-reflog-expire:
builtin/maintenance: introduce "reflog-expire" task
builtin/gc: split out function to expire reflog entries
builtin/reflog: make functions regarding `reflog_expire_options` public
builtin/reflog: stop storing per-reflog expiry dates globally
builtin/reflog: stop storing default reflog expiry dates globally
reflog: rename `cmd_reflog_expire_cb` to `reflog_expire_options`
|
|
"git rev-list" learns machine-parsable output format that delimits
each field with NUL.
* jt/rev-list-z:
rev-list: support NUL-delimited --missing option
rev-list: support NUL-delimited --boundary option
rev-list: support delimiting objects with NUL bytes
rev-list: refactor early option parsing
rev-list: inline `show_object_with_name()` in `show_object()`
|
|
Some warnings from "-Wsign-compare" for builtin/rm.c have been
squelched.
* ab/rm-sign-compare:
rm: fix sign comparison warnings
|
|
A ref transaction corner case fix.
* jt/ref-transaction-abort-fix:
builtin/fetch: avoid aborting closed reference transaction
|
|
Code clean-up.
* js/comma-semicolon-confusion:
detect-compiler: detect clang even if it found CUDA
clang: warn when the comma operator is used
compat/regex: explicitly mark intentional use of the comma operator
wildmatch: avoid using of the comma operator
diff-delta: avoid using the comma operator
xdiff: avoid using the comma operator unnecessarily
clar: avoid using the comma operator unnecessarily
kwset: avoid using the comma operator unnecessarily
rebase: avoid using the comma operator unnecessarily
remote-curl: avoid using the comma operator unnecessarily
|
|
"git clone" still gave the message about the default branch name;
this message has been turned into an advice message that can be
turned off.
* jt/clone-guess-remote-head-fix:
advice: allow disabling default branch name advice
builtin/clone: suppress unexpected default branch advice
remote: allow `guess_remote_head()` to suppress advice
|
|
The job to coalesce loose objects into packfiles in "git
maintenance" now has configurable batch size.
* ds/maintenance-loose-objects-batchsize:
maintenance: add loose-objects.batchSize config
maintenance: force progress/no-quiet to children
|
|
"git reflog" learns "drop" subcommand, that discards the entire
reflog data for a ref.
* kn/reflog-drop:
reflog: implement subcommand to drop reflogs
reflog: improve error for when reflog is not found
|
|
The object layer has been updated to take an explicit repository
instance as a parameter in more code paths.
* ps/object-wo-the-repository:
hash: stop depending on `the_repository` in `null_oid()`
hash: fix "-Wsign-compare" warnings
object-file: split out logic regarding hash algorithms
delta-islands: stop depending on `the_repository`
object-file-convert: stop depending on `the_repository`
pack-bitmap-write: stop depending on `the_repository`
pack-revindex: stop depending on `the_repository`
pack-check: stop depending on `the_repository`
environment: move access to "core.bigFileThreshold" into repo settings
pack-write: stop depending on `the_repository` and `the_hash_algo`
object: stop depending on `the_repository`
csum-file: stop depending on `the_repository`
|
|
The "object-store-ll.h" header has been introduced to keep transitive
header dependendcies and compile times at bay. Now that we have created
a new "object-store.c" file though we can easily move the last remaining
additional bit of "object-store.h", the `odb_path_map`, out of the
header.
Do so. As the "object-store.h" header is now equivalent to its low-level
alternative we drop the latter and inline it into the former.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The functions `hash_object_file()`, `write_object_file()` and
`index_fd()` reuse the same set of flags to alter their behaviour. This
not only adds confusion, but given that every function only supports a
subset of the flags it becomes very hard to see which flags can be
passed to what function. Last but not least, this entangles the
implementation of all three function families.
Split up concerns by creating separate flags for each of the function
families.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While we have the "object-store.h" header, most of the functionality for
object stores is actually hosted in "object-file.c". This makes it hard
to find relevant functions and causes us to mix up concerns.
Split out functions relating to the object store subsystem into a new
"object-store.c" file.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `safe_create_leading_directories()` function and its relatives are
located in "object-file.c", which is not a good fit as they provide
generic functionality not related to objects at all. Move them into
"path.c", which already hosts `safe_create_dir()` and its relative
`safe_create_dir_in_gitdir()`.
"path.c" is free of `the_repository`, but the moved functions depend on
`the_repository` to read the "core.sharedRepository" config. Adapt the
function signature to accept a repository as argument to fix the issue
and adjust callers accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `mkdir_in_gitdir()` function is similar to `safe_create_dir()`, but
the former is hosted in "object-file.c" whereas the latter is hosted in
"path.c". The latter code unit makes way more sense though as the logic
has nothing to do with object files in particular.
Move the file into "path.c". While at it, we:
- Rename the function to `safe_create_dir_in_gitdir()` so that the
function names are similar to one another.
- Remove the dependency on `the_repository` by making the callers pass
the repository instead.
Adjust callers accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
- Switch the synopsis to a synopsis block which will automatically
format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.
Unfortunately, there's an inconsistency in the synopsis style, where
the ellipsis is used to indicate that the option can be repeated, but
it can also be used in Git's three-dot notation to indicate a range of
commits. The rendering engine will not be able to distinguish
between these two cases.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This also entails changing the help output for the command to match the new
synopsis.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We ignore any error returned from set_head(), but 638060dcb9 (fetch
set_head: refactor to use remote directly, 2025-01-26) left its call in
a noop "if" conditional as a sort of note-to-self.
When c834d1a7ce (fetch: only respect followRemoteHEAD with configured
refspecs, 2025-03-18) added a "do_set_head" flag, it was rolled into the
same conditional, putting set_head() on the right-hand side of a
short-circuit AND.
That's not wrong, but it really hides the point of the line, which
is (maybe) calling the function.
Instead, let's have a full if() block for the flag, and then our comment
(with some rewording) will be sufficient to clarify the error handling.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Since we already teach the `repo_config()` in f29f1990 (config:
teach repo_config to allow `repo` to be NULL, 2025-03-08) to allow
`repo` to be NULL, no need to check if `repo` is NULL before calling
`repo_config()`.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* ps/object-wo-the-repository:
hash: stop depending on `the_repository` in `null_oid()`
hash: fix "-Wsign-compare" warnings
object-file: split out logic regarding hash algorithms
delta-islands: stop depending on `the_repository`
object-file-convert: stop depending on `the_repository`
pack-bitmap-write: stop depending on `the_repository`
pack-revindex: stop depending on `the_repository`
pack-check: stop depending on `the_repository`
environment: move access to "core.bigFileThreshold" into repo settings
pack-write: stop depending on `the_repository` and `the_hash_algo`
object: stop depending on `the_repository`
csum-file: stop depending on `the_repository`
|
|
This environment variable existed to allow the testsuite to reuse all
the merge-related tests in the testsuite while easily flipping between
the 'recursive' and the 'ort' backends. Now that we have removed
merge-recursive and remapped 'recursive' to mean 'ort', we don't need
this scaffolding anymore. Remove it from these three builtins.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
More precisely, replace calls to merge_recursive() with
merge_ort_recursive().
Also change t7615 to quit calling out recursive; it is not needed
anymore, and we are in fact using ort now.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Switch from merge-recursive to merge-ort. Adjust the following
testcases due to the switch:
* t6430: most of the test differences here were due to improved D/F
conflict handling explained in more detail in ef527787089c (merge
tests: expect improved directory/file conflict handling in ort,
2020-10-26). These changes weren't made to this test back in that
commit simply because I had been looking at `git merge` rather than
`git merge-recursive`. The final test in this testsuite, though, was
expunged because it was looking for specific output, and the calls to
output_commit_title() were discarded from merge_ort_internal() in its
adaptation from merge_recursive_internal(); see 8119214f4e70
(merge-ort: implement merge_incore_recursive(), 2020-12-16).
* t6434: This test is built entirely around rename/delete conflicts,
which had a suboptimal handling under merge-recursive. As explained
in more detail in commits 1f3c9ba707 ("t6425: be more flexible with
rename/delete conflict messages", 2020-08-10) and 727c75b23f ("t6404,
t6423: expect improved rename/delete handling in ort backend",
2020-10-26), rename/delete conflicts should each have two entries in
the index rather than just one. Adjust the expectations for all the
tests in this testcase to see the two entries per rename/delete
conflict.
* t6424: merge-recursive had a special check-if-toplevel-trees-match
check that it ran at the beginning on both the merge-base and the
other side being merged in. In such a case, it exited early and
printed an "Already up to date." message. merge-ort got rid of
this, and instead checks the merge base tree matching the other
side throughout the tree instead of just at the toplevel, allowing
it to avoid recursing into various subtrees. As part of that, it
got rid of the specialty toplevel message. That message hasn't
been missed for years from `git merge`, so I don't think it is
necessary to keep it just for `git merge-recursive`, especially
since the latter is rarely used. (git itself only references it
in the testsuite, whereas it used to power one of the three
rebase backends that existed once upon a time.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Replace the use of merge_trees() from merge-recursive.[ch] with the
merge-ort equivalent.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Incrementally updating multi-pack index files.
* tb/incremental-midx-part-2:
midx: implement writing incremental MIDX bitmaps
pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators
pack-bitmap.c: keep track of each layer's type bitmaps
ewah: implement `struct ewah_or_iterator`
pack-bitmap.c: apply pseudo-merge commits with incremental MIDXs
pack-bitmap.c: compute disk-usage with incremental MIDXs
pack-bitmap.c: teach `rev-list --test-bitmap` about incremental MIDXs
pack-bitmap.c: support bitmap pack-reuse with incremental MIDXs
pack-bitmap.c: teach `show_objects_for_type()` about incremental MIDXs
pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs
pack-bitmap.c: open and store incremental bitmap layers
pack-revindex: prepare for incremental MIDX bitmaps
Documentation: describe incremental MIDX bitmaps
Documentation: remove a "future work" item from the MIDX docs
|
|
Code clean-up.
* tb/refspec-fetch-cleanup:
refspec: replace `refspec_item_init()` with fetch/push variants
refspec: remove refspec_item_init_or_die()
refspec: replace `refspec_init()` with fetch/push variants
refspec: treat 'fetch' as a Boolean value
|
|
When updating multiple references through stdin, Git's update-ref
command normally aborts the entire transaction if any single update
fails. This atomic behavior prevents partial updates. Introduce a new
batch update system, where the updates the performed together similar
but individual updates are allowed to fail.
Add a new `--batch-updates` flag that allows the transaction to continue
even when individual reference updates fail. This flag can only be used
in `--stdin` mode and builds upon the batch update support added to the
refs subsystem in the previous commits. When enabled, failed updates are
reported in the following format:
rejected SP (<old-oid> | <old-target>) SP (<new-oid> | <new-target>) SP <rejection-reason> LF
Update the documentation to reflect this change and also tests to cover
different scenarios where an update could be rejected.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Replace preprocessor-defined transaction errors with a strongly-typed
enum `ref_transaction_error`. This change:
- Improves type safety and function signature clarity.
- Makes error handling more explicit and discoverable.
- Maintains existing error cases, while adding new error cases for
common scenarios.
This refactoring paves the way for more comprehensive error handling
which we will utilize in the upcoming commits to add batch reference
update support.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
By default, git-maintenance(1) uses the "gc" task to ensure that the
repository is well-maintained. This can be changed, for example by
either explicitly configuring which tasks should be enabled or by using
the "incremental" maintenance strategy. If so, git-maintenance(1) does
not know to expire reflog entries, which is a subtask that git-gc(1)
knows to perform for the user. Consequently, the reflog will grow
indefinitely unless the user manually trims it.
Introduce a new "reflog-expire" task that plugs this gap:
- When running the task directly, then we simply execute `git reflog
expire --all`, which is the same as git-gc(1).
- When running git-maintenance(1) with the `--auto` flag, then we only
run the task in case the "HEAD" reflog has at least N reflog entries
that would be discarded. By default, N is set to 100, but this can
be configured via "maintenance.reflog-expire.auto". When a negative
integer has been provided we always expire entries, zero causes us
to never expire entries, and a positive value specifies how many
entries need to exist before we consider pruning the entries.
Note that the condition for the `--auto` flags is merely a heuristic and
optimized for being fast. This is because `git maintenance run --auto`
will be executed quite regularly, so scanning through all reflogs would
likely be too expensive in many repositories.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We're about to introduce a new task for git-maintenance(1) that knows to
expire reflog entries. The logic will be shared with git-gc(1), which
already knows how to do this.
Pull out the common logic into a separate function so that we can share
the implementation between both builtins.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Make functions that are required to manage `reflog_expire_options`
available elsewhere by moving them into "reflog.c" and exposing them in
the corresponding header. The functions will be used in a subsequent
commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As described in the preceding commit, the per-reflog expiry dates are
stored in a global pair of variables. Refactor the code so that they are
contained in `struct reflog_expire_options` to make the structure useful
in other contexts.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When expiring reflog entries, it is possible to configure expiry dates
that depend on the name of the reflog. This requires us to store a
couple of different expiry dates:
- The default expiry date for reflog entries that aren't otherwise
specified.
- The per-reflog expiry date.
- The currently active set of expiry dates for a given reference.
While the last item is stored in `struct reflog_expire_options`, the
other items aren't, which makes it hard to reuse the structure in other
places.
Refactor the code so that the default expiry date is stored as part of
the structure. The per-reflog expiry dates will be adapted accordingly
in the subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We're about to expose `struct cmd_reflog_expire_cb` via "reflog.h" so
that we can also use this structure in "builtin/gc.c". Once we make it
accessible to a wider scope though it becomes awkwardly named, as it
isn't only useful in the context of a callback. Instead, the function is
containing all kinds of options relevant to whether or not a reflog
entry should be expired.
Rename the structure to `reflog_expire_options` to prepare for this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The 'git-blame(1)' command allows users to ignore specific revisions via
the '--ignore-rev <rev>' and '--ignore-revs-file <file>' flags. These
flags are often combined with the 'blame.markIgnoredLines' and
'blame.markUnblamableLines' config options. These config options prefix
ignored and unblamable lines with a '?' and '*', respectively.
However, this option was never extended to the porcelain mode of
'git-blame(1)'. Since the documentation does not indicate this
exclusion, it is a bug.
Fix this by printing 'ignored' and 'unblamable' respectively for the
options when using the porcelain modes.
Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Toon Claes <toon@iotcl.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While it is now possible to filter objects by type, this mechanism is
for now mostly a convenience. Most importantly, we still have to iterate
through the whole packfile to find all objects of a specific type. This
can be prohibitively expensive depending on the size of the packfiles.
It isn't really possible to do better than this when only considering a
packfile itself, as the order of objects is not fixed. But when we have
a packfile with a corresponding bitmap, either because the packfile
itself has one or because the multi-pack index has a bitmap for it, then
we can use these bitmaps to improve the runtime.
While bitmaps are typically used to compute reachability of objects,
they also contain one bitmap per object type that encodes which object
has what type. So instead of reading through the whole packfile(s), we
can use the bitmaps and iterate through the type-specific bitmap.
Typically, only a subset of packfiles will have a bitmap. But this isn't
really much of a problem: we can use bitmaps when available, and then
use the non-bitmap walk for every packfile that isn't covered by one.
Overall, this leads to quite a significant speedup depending on how many
objects of a certain type exist. The following benchmarks have been
executed in the Chromium repository, which has a 50GB packfile with
almost 25 million objects. As expected, there isn't really much of a
change in performance without an object filter:
Benchmark 1: cat-file with no-filter (revision = HEAD~)
Time (mean ± σ): 89.675 s ± 4.527 s [User: 40.807 s, System: 10.782 s]
Range (min … max): 83.052 s … 96.084 s 10 runs
Benchmark 2: cat-file with no-filter (revision = HEAD)
Time (mean ± σ): 88.991 s ± 2.488 s [User: 42.278 s, System: 10.305 s]
Range (min … max): 82.843 s … 91.271 s 10 runs
Summary
cat-file with no-filter (revision = HEAD) ran
1.01 ± 0.06 times faster than cat-file with no-filter (revision = HEAD~)
We still have to scan through all objects as we yield all of them, so
using the bitmap in this case doesn't really buy us anything. What is
noticeable in this benchmark is that we're I/O-bound, not CPU-bound, as
can be seen from the user/system runtimes, which combined are way lower
than the overall benchmarked runtime.
But when we do use a filter we can see a significant improvement:
Benchmark 1: cat-file with filter=object:type=commit (revision = HEAD~)
Time (mean ± σ): 86.444 s ± 4.081 s [User: 36.830 s, System: 11.312 s]
Range (min … max): 80.305 s … 93.104 s 10 runs
Benchmark 2: cat-file with filter=object:type=commit (revision = HEAD)
Time (mean ± σ): 2.089 s ± 0.015 s [User: 1.872 s, System: 0.207 s]
Range (min … max): 2.073 s … 2.119 s 10 runs
Summary
cat-file with filter=object:type=commit (revision = HEAD) ran
41.38 ± 1.98 times faster than cat-file with filter=object:type=commit (revision = HEAD~)
This is because we don't have to scan through all packfiles anymore, but
can instead directly look up relevant objects.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Pull out a common function that allows us to iterate over all objects in
a repository. Right now the logic is trivial and would only require two
function calls, making this refactoring a bit pointless. But in the next
commit we will iterate on this logic to make use of bitmaps, so this is
about to become a bit more complex.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `show_reachable_fn` callback is used by a couple of functions to
present reachable objects to the caller. The function does not provide a
way for the caller to pass a payload though, which is functionality that
we'll require in a subsequent commit.
Change the callback type to accept a payload and adapt all callsites
accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Implement support for the "object:type=" filter in git-cat-file(1),
which causes us to omit all objects that don't match the provided object
type.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Implement support for the "blob:limit=" filter in git-cat-file(1), which
causes us to omit all blobs that are bigger than a certain size.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Implement support for the "blob:none" filter in git-cat-file(1), which
causes us to omit all blobs.
Note that this new filter requires us to read the object type via
`oid_object_info_extended()` in `batch_object_write()`. But as we try to
optimize away reading objects from the database the `data->info.typep`
pointer may not be set. We thus have to adapt the logic to conditionally
set the pointer in cases where the filter is given.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|