| Age | Commit message (Collapse) | Author | Files | Lines |
|
Git very often uses the terms "object", "reference", or "index" in its
documentation.
However, it's hard to find a clear explanation of these terms and how
they relate to each other in the documentation. The closest candidates
currently are:
1. `gitglossary`. This makes a good effort, but it's an alphabetically
ordered dictionary and a dictionary is not a good way to learn
concepts. You have to jump around too much and it's not possible to
present the concepts in the order that they should be explained.
2. `gitcore-tutorial`. This explains how to use the "core" Git commands.
This is a nice document to have, but it's not necessary to learn how
`update-index` works to understand Git's data model, and we should
not be requiring users to learn how to use the "plumbing" commands
if they want to learn what the term "index" or "object" means.
3. `gitrepository-layout`. This is a great resource, but it includes a
lot of information about configuration and internal implementation
details which are not related to the data model. It also does
not explain how commits work.
The result of this is that Git users (even users who have been using
Git for 15+ years) struggle to read the documentation because they don't
know what the core terms mean, and it's not possible to add links
to help them learn more.
Add an explanation of Git's data model. Some choices I've made in
deciding what "core data model" means:
1. Omit pseudorefs like `FETCH_HEAD`, because it's not clear to me
if those are intended to be user facing or if they're more like
internal implementation details.
2. Don't talk about submodules other than by mentioning how they
relate to trees. This is because Git has a lot of special features,
and explaining how they all work exhaustively could quickly go
down a rabbit hole which would make this document less useful for
understanding Git's core behaviour.
3. Don't discuss the structure of a commit message
(first line, trailers etc).
4. Don't mention configuration.
5. Don't mention the `.git` directory, to avoid getting too much into
implementation details
Signed-off-by: Julia Evans <julia@jvns.ca>
[jc: fixed up xml validation error in the documentation part]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a new check in validate_submodule_git_dir() to detect and
prevent case-folding filesystem colisions. When this new check
is triggered, a stricter casefolding aware URI encoding is used
to percent-encode uppercase characters, e.g. Foo becomes %46oo.
By using this check/retry mechanism the uppercase encoding is
only applied when necessary, so case-sensitive filesystems are
not affected.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a submoduleEncoding extension which fixes filesystem collisions by
encoding gitdir paths. At a high level, this implements a mechanism to
encode -> validate -> retry until a working gitdir path is found.
Credit goes to Junio for coming up with this design: encoding is only
applied when necessary, e.g. uppercase characters are encoded only on
case-folding filesystems and only if a real conflict is detected.
To make this work, we rely on the submodule.<name>.gitdir config as the
single source of truth for gitidir paths: the config is always set when
the extension is enabled. Users who care about gitdir paths are expected
to get/set the config and not the underlying encoding implementation.
This commit adds the basic encoding logic which addresses nested gitdirs.
The next commit fixes case-folding, the next commit fixes names longer
than NAME_MAX. The idea is the encoding can be improved over time in a
way which is transparent to users.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Based-on-patch-by: Brandon Williams <bwilliams.eng@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
is_rfc3986_unreserved() was moved to credential-store.c and was made
static by f89854362c (credential-store: move related functions to
credential-store file, 2023-06-06) under a correct assumption, at the
time, that it was the only place using it.
However now we need it to apply URL-encoding to submodule names when
constructing gitdir paths, to avoid conflicts, so bring it back as a
public function exposed via url.h, instead of the old helper path
(strbuf), which has nothing to do with 3986 encoding/decoding anymore.
This function will be used by submodule.c in the next commit.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While testing submodule gitdir path encoding, I noticed submodule--helper
is still using a hardcoded modules gitdir path leading to test failures.
Call the submodule_name_to_gitdir() helper instead, which was invented
exactly for this purpose and is already used by all the other locations
which work on gitdirs.
Also narrow the scope of the submod_gitdir_path variable which is not
used anymore in the updated "else" branch.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The diff algorithm used in 'git-blame(1)' is set to 'myers',
without the possibility to change it aside from the `--minimal` option.
There has been long-standing interest in changing the default diff
algorithm to "histogram", and Git 3.0 was floated as a possible occasion
for taking some steps towards that:
https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
As a preparation for this move, it is worth making sure that the diff
algorithm is configurable where useful.
Make it configurable in the `git-blame(1)` command by introducing the
`--diff-algorithm` option and make honor the `diff.algorithm` config
variable. Keep Myers diff as the default.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience
and histogram diffs, not for the minimal one. This means that when
reseting the diff algorithm to the default one, one needs to separately
clear the bit for the minimal diff. There are places in the code that fail
to do that: merge-ort.c and builtin/merge-file.c.
Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate
clearing of this bit in the places where it hasn't been forgotten.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Our Bencher dashboards [1] have recently alerted us about a bunch of
performance regressions when writing references, specifically with the
reftable backend. There is a 3x regression when writing many refs with
preexisting refs in the reftable format, and a 10x regression when
migrating refs between backends in either of the formats.
Bisecting the issue lands us at 6ec4c0b45b (refs: don't store peeled
object IDs for invalid tags, 2025-10-23). The gist of the commit is that
we may end up storing peeled objects in both reftables and packed-refs
for corrupted tags, where the claimed tagged object type is different
than the actual tagged object type. This will then cause us to create
the `struct object *` with a wrong type, as well, and obviously nothing
good comes out of that.
The fix for this issue was to introduce a new flag to `peel_object()`
that causes us to verify the tagged object's type before writing it into
the refdb -- if the tag is corrupt, we skip writing the peeled value.
To verify whether the peeled value is correct we have to look up the
object type via the ODB and compare the actual type with the claimed
type, and that additional object lookup is costly.
This also explains why we see the regression only when writing refs with
the reftable backend, but we see the regression with both backends when
migrating refs:
- The reftable backend knows to store peeled values in the new table
immediately, so it has to try and peel each ref it's about to write
to the transaction. So the performance regression is visible for all
writes.
- The files backend only stores peeled values when writing the
packed-refs file, so it wouldn't hit the performance regression for
normal writes. But on ref migrations we know to write all new values
into the packed-refs file immediately, and that's why we see the
regression for both backends there.
Taking a step back though reveals an oddity in the new verification
logic: we not only verify the _tagged_ object's type, but we also verify
the type of the tag itself. But this isn't really needed, as we wouldn't
hit the bug in such a case anyway, as we only hit the issue with corrupt
tags claiming an invalid type for the tagged object.
The consequence of this is that we now started to look up the target
object of every single reference we're about to write, regardless of
whether it even is a tag or not. And that is of course quite costly.
Fix the issue by only verifying the type of the tagged objects. This
means that we of course still have a performance hit for actual tags.
But this only happens for writes anyway, and I'd claim it's preferable
to not store corrupted data in the refdb than to be fast here. Rename
the flag accordingly to clarify that we only verify the tagged object's
type.
This fix brings performance back to previous levels:
Benchmark 1: baseline
Time (mean ± σ): 46.0 ms ± 0.4 ms [User: 40.0 ms, System: 5.7 ms]
Range (min … max): 45.0 ms … 47.1 ms 54 runs
Benchmark 2: regression
Time (mean ± σ): 140.2 ms ± 1.3 ms [User: 77.5 ms, System: 60.5 ms]
Range (min … max): 138.0 ms … 142.7 ms 20 runs
Benchmark 3: fix
Time (mean ± σ): 46.2 ms ± 0.4 ms [User: 40.2 ms, System: 5.7 ms]
Range (min … max): 45.0 ms … 47.3 ms 55 runs
Summary
update-ref: baseline
1.00 ± 0.01 times faster than fix
3.05 ± 0.04 times faster than regression
[1]: https://bencher.dev/perf/git/plots
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Bumps `actions/upload-artifact` from 4 to 5.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v4...v5)
Bumps `actions/download-artifact` from 5 to 6.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/v5...v6)
Originally-authored-by: dependabot[bot] <support@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The 'git-maintenance(1)' command provides tooling to run maintenance
tasks over Git repositories. The 'run' subcommand, as the name suggests,
runs the maintenance tasks. When used with the '--auto' flag, it uses
heuristics to determine if the required thresholds are met for running
said maintenance tasks.
There is however a lack of insight into these heuristics. Meaning, the
checks are linked to the execution.
Add a new 'is-needed' subcommand to 'git-maintenance(1)' which allows
users to simply check if it is needed to run maintenance without
performing it.
This subcommand can check if it is needed to run maintenance without
actually running it. Ideally it should be used with the '--auto' flag,
which would allow users to check if the thresholds required are met. The
subcommand also supports the '--task' flag which can be used to check
specific maintenance tasks.
While adding the respective tests in 't/t7900-maintenance.sh', remove a
duplicate of the test: 'worktree-prune task with --auto honors
maintenance.worktree-prune.auto'.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The 'git-maintenance(1)' command supports an '--auto' flag. Usage of the
flag ensures to run maintenance tasks only if certain thresholds are
met. The heuristic is defined on a task level, wherein each task defines
an 'auto_condition', which states if the task should be run.
The 'pack-refs' task is hard-coded to return 1 as:
1. There was never a way to check if the reference backend needs to be
optimized without actually performing the optimization.
2. We can pass in the '--auto' flag to 'git-pack-refs(1)' which would
optimize based on heuristics.
The previous commit added a `refs_optimize_required()` function, which
can be used to check if a reference backend required optimization. Use
this within `pack_refs_condition()`.
This allows us to add a 'git maintenance is-needed' subcommand which can
notify the user if maintenance is needed without actually performing the
optimization. Without this change, the reference backend would always
state that optimization is needed.
Since we import 'revision.h', we need to remove the definition for
'SEEN' which is duplicated in the included header.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
To allow users of the refs namespace to check if the reference backend
requires optimization, add a new field `optimize_required` field to
`struct ref_storage_be`. This field is of type `optimize_required_fn`
which is also introduced in this commit.
Modify the debug, files, packed and reftable backend to implement this
field. A following commit will expose this via 'git pack-refs' and 'git
refs optimize'.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The reftable backend performs auto-compaction as part of its regular
flow, which is required to keep the number of tables part of a stack at
bay. This allows it to stay optimized.
Compaction can also be triggered voluntarily by the user via the 'git
pack-refs' or the 'git refs optimize' command. However, currently there
is no way for the user to check if optimization is required without
actually performing it.
Extract out the heuristics logic from 'reftable_stack_auto_compact()'
into an internal function 'update_segment_if_compaction_required()'.
Then use this to add and expose `reftable_stack_compaction_required()`
which will allow users to check if the reftable backend can be
optimized.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `stack_table_sizes_for_compaction()` function returns individual
sizes of each reftable table. This function is only called by
`reftable_stack_auto_compact()` to decide which tables need to be
compacted, if any.
Modify the function to directly return the segments, which avoids the
extra step of receiving the sizes only to pass it to
`suggest_compaction_segment()`.
A future commit will also add functionality for checking whether
auto-compaction is necessary without performing it. This change allows
code re-usability in that context.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Makefile-based builds can configure Git's internal HTML_PATH by defining
htmldir, which is useful for packagers that put documentation in
different locations. Gentoo, for example, uses version-suffixed
directories like ${prefix}/share/doc/git-2.51 and puts the HTML
documentation in an 'html' subdirectory of the same.
Propagate the same configuration knob to Meson-based builds so that
"git --html-path" on such systems can be configured to output the
correct directory.
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When installing git-contacts with Meson via -Dcontrib=contacts, the default
Perl generation fails to mark it executable. As a result, "git contacts"
reports "'contacts' is not a git command."
Unlike generate-script.sh, we aren't testing the basename here; so, glob
the script name in the case arm to match wherever the input comes from.
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* Replace $(LOADLIBES) because it is deprecated since long and it is
used nowhere else in the git project.
* Use $(gitexecdir) instead of $(libexecdir) because config.mak defines
$(libexecdir) as $(prefix)/libexec, not as $(prefix)/libexec/git-core.
* Similar to other Makefiles, let install target rule create
$(gitexecdir) to make sure the directory exists before copying the
executable and also let it respect $(DESTDIR).
* Shuffle the lines for the default settings to align them with the
other Makefiles in contrib/credential.
* Define .PHONY for all special targets (all, install, clean).
Signed-off-by: Thomas Uhle <thomas.uhle@mailbox.tu-dresden.de>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Update the documentation to clearly describe how the server responds when a
client sends an invalid or malformed `want` line during the HTTP protocol
exchange. The server includes the offending object name in its error message.
Signed-off-by: Queen Ediri Jessa <qjessa662@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Implement a new `--trailer <text>` option for `git rebase`
(support merge backend only now), which appends arbitrary
trailer lines to each rebased commit message.
Reject it if the user passes an option that requires the
apply backend (git am) since it lacks message‑filter/trailer
hook. otherwise we can just use the merge backend.
Automatically set REBASE_FORCE when any trailer is supplied.
And reject invalid input before user edits the interactive file.
Signed-off-by: Li Chen <chenl311@chinatelecom.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Route all trailer insertion through trailer_process() and make
builtin/interpret-trailers just do file I/O before calling into it.
amend_file_with_trailers() now shares the same code path.
This removes the fork/exec and tempfile juggling, cutting overhead and
simplifying error handling. No functional change. It also
centralizes logic to prepare for follow-up rebase --trailer patch.
Signed-off-by: Li Chen <chenl311@chinatelecom.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This function would be used by trailer_process
in following commits.
Signed-off-by: Li Chen <chenl311@chinatelecom.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Extracted trailer processing into a helper that accumulates output in
a strbuf before writing.
Updated interpret_trailers() to reuse the helper, buffer output, and
clean up both input and output buffers after writing.
Signed-off-by: Li Chen <chenl311@chinatelecom.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
It is quite a common use case that one wants to split up one commit into
multiple commits by moving parts of the changes of the original commit
out into a separate commit. This is quite an involved operation though:
1. Identify the commit in question that is to be dropped.
2. Perform an interactive rebase on top of that commit's parent.
3. Modify the instruction sheet to "edit" the commit that is to be
split up.
4. Drop the commit via "git reset HEAD~".
5. Stage changes that should go into the first commit and commit it.
6. Stage changes that should go into the second commit and commit it.
7. Finalize the rebase.
This is quite complex, and overall I would claim that most people who
are not experts in Git would struggle with this flow.
Introduce a new "split" subcommand for git-history(1) to make this way
easier. All the user needs to do is to say `git history split $COMMIT`.
From hereon, Git asks the user which parts of the commit shall be moved
out into a separate commit and, once done, asks the user for the commit
message. Git then creates that split-out commit and applies the original
commit on top of it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The function `write_in_core_index_as_tree()` takes a repository and
writes its index into a tree object. What this function cannot do though
is to take an _arbitrary_ in-memory index.
Introduce a new `struct index_state` parameter so that the caller can
pass a different index than the one belonging to the repository. This
will be used in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
With `run_add_p()` callers have the ability to apply changes from a
specific revision to a repository's index. This infra supports several
different modes, like for example applying changes to the index,
working tree or both.
One feature that is missing though is the ability to apply changes to an
in-memory index different from the repository's index. Add a new
function `run_add_p_index()` to plug this gap.
This new function will be used in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
With the preceding commit we have split out interactive configuration
that is used by both "git add -p" and "git add -i". But we still
initialize that configuration in the "add -p" subsystem by calling
`init_add_i_state()`, even though we only do so to initialize the
interactive configuration as well as a repository pointer.
Stop doing so and instead store and initialize the interactive
configuration in `struct add_p_state` directly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `struct add_p_opt` is reused both by our infra for "git add -p" and
"git add -i". Users of `run_add_i()` for example are expected to pass
`struct add_p_opt`. This is somewhat confusing and raises the question
of which options apply to what part of the stack.
But things are even more confusing than that: while callers are expected
to pass in `struct add_p_opt`, these options ultimately get used to
initialize a `struct add_i_state` that is used by both subsystems. So we
are basically going full circle here.
Refactor the code and split out a new `struct interactive_options` that
hosts common options used by both. These options are then applied to a
`struct interactive_config` that hosts common configuration.
This refactoring doesn't yet fully detangle the two subsystems from one
another, as we still end up calling `init_add_i_state()` in the "git add
-p" subsystem. This will be fixed in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While we have a "add-patch.c" code file, its declarations are part of
"add-interactive.h". This makes it somewhat harder than necessary to
find relevant code and to identify clear boundaries between the two
subsystems.
Split up concerns and move declarations that relate to "add-patch.c"
into a new "add-patch.h" header.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Implement a new "reword" subcommand for git-history(1). This subcommand
is essentially the same as if a user performed an interactive rebase
with a single commit changed to use the "reword" verb.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When rewriting history via git-rebase(1) there are a couple of very
common use cases:
- The ordering of two commits should be reversed.
- A commit should be split up into two commits.
- A commit should be dropped from the history completely.
- Multiple commits should be squashed into one.
While these operations are all doable, it often feels needlessly kludgey
to do so by doing an interactive rebase, using the editor to say what
one wants, and then perform the actions. Furthermore, some operations
like splitting up a commit into two are way more involved than that and
require a whole series of commands.
Add a new "history" command to plug this gap. This command will have
several different subcommands to imperatively rewrite history for common
use cases like the above. These subcommands will be implemented in
subsequent commits.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In `create_commit()` we're using `the_repository` even though we already
have a repository passed to use as an argument. Fix this.
Note that we still cannot get rid of `USE_THE_REPOSITORY_VARIABLE`. This
is because we use `DEFAULT_ABBREV and `get_commit_output_encoding()`,
both of which are stored as global variables that can be modified via
the Git configuration.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We're about to add a new git-history(1) command that will reuse some of
the same infrastructure as git-replay(1). To prepare for this, extract
the logic to pick a commit into a new "replay.c" file so that it can be
shared between both commands.
Rename the function to have a "replay_" prefix to clearly indicate its
subsystem.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "wt-status" subsystem is responsible for printing status information
around the current state of the working tree. This most importantly
includes information around whether the working tree or the index have
any changes.
We're about to introduce a new command where the changes in neither of
them are actually relevant to us. Instead, what we want is to format the
changes between two different trees. While it is a little bit of a
stretch to add this as functionality to _working tree_ status, it
doesn't make any sense to open-code this functionality, either.
Implement a new function `wt_status_collect_changes_trees()` that diffs
two trees and formats the status accordingly. This function is not yet
used, but will be in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Now "git diff --check" and "git apply --whitespace=warn/fix" learned
incomplete line is a whitespace error, enable them for this project
to prevent patches to add new incomplete lines to our sources.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Teach "git diff" to highlight "\ No newline at end of file" message
as a whitespace error when incomplete-line whitespace error class is
in effect. Thanks to the previous refactoring of complete rewrite
code path, we can do this at a single place.
Unlike whitespace errors in the payload where we need to annotate in
line, possibly using colors, the line that has whitespace problems,
we have a dedicated line already that can serve as the error
message, so paint it as a whitespace error message.
Also teach "git diff --check" to notice incomplete lines as
whitespace errors and report when incomplete-line whitespace error
class is in effect.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The final line of a file that lacks the terminating newline at its
end is called an incomplete line. In general they are frowned upon
for many reasons (imagine concatenating two files with "cat A B" and
what happens when A ends in an incomplete line, for example), and
text-oriented tools often mishandle such a line.
Implement checks in "git apply" for incomplete lines, which is off
by default for backward compatibility's sake, so that "git apply
--whitespace={fix,warn,error}" can notice, warn against, and fix
them.
As one of the new test shows, if you modify contents on an
incomplete line in the original and leave the resulting line
incomplete, it is still considered a whitespace error, the reasoning
being that "you'd better fix it while at it if you are making a
change on an incomplete line anyway", which may controversial.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Reserve a few more bits in the diff flags word to be used for future
whitespace rules. Add WS_INCOMPLETE_LINE without implementing the
behaviour (yet).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A patch file represents the incomplete line at the end of the file
with two lines, one that is the usual "context" with " " as the
first letter, "added" with "+" as the first letter, or "removed"
with "-" as the first letter that shows the content of the line,
plus an extra "\ No newline at the end of file" line that comes
immediately after it.
Ever since the apply machinery was written, the "git apply"
machinery parses "\ No newline at the end of file" line
independently, without even knowing what line the incomplete-ness
applies to, simply because it does not even remember what the
previous line was.
This poses a problem if we want to check and warn on an incomplete
line. Revamp the code that parses a fragment, to actually drop the
'\n' at the end of the incoming patch file that terminates a line,
so that check_whitespace() calls made from the code path actually
sees an incomplete as incomplete.
Note that the result of this parsing is not directly used by the
code path that applies the patch. apply_one_fragment() function
already checks if each of the patch text it handles is followed by a
line that begins with a backslash to drop the newline at the end of
the current line it is looking at. In a sense, this patch harmonizes
the behaviour of the parsing side to what is already done in the
application side.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The diff_symbol based output framework uses one DIFF_SYMBOL_* enum
value per the kind of output lines of "git diff", which corresponds
to one output line from the xdiff machinery used internally. Most
notably, DIFF_SYMBOL_PLUS and DIFF_SYMBOL_MINUS that correspond to
"+" and "-" lines are designed to always take a complete line, even
if the output from xdiff machinery may produce "\ No newline at the
end of file" immediately after them.
But this is not true in the rewrite-diff codepath, which completely
bypasses the xdiff machinery. Since the code path feeds the bytes
directly from the payload to the output routines, the output layer
has to deal with an incomplete line with DIFF_SYMBOL_PLUS and
DIFF_SYMBOL_MINUS, which never would see an incomplete line in the
normal code paths. This lack of final newline is compensated by an
ugly hack for a fabricated DIFF_SYMBOL_NO_LF_EOF token to inject an
extra newline to the output to simulate output coming from the xdiff
machinery.
Revamp the way the complete-rewrite code path feeds the lines to the
output layer by treating the last line of the pre/post image when it
is an incomplete line specially.
This lets us remove the DIFF_SYMBOL_NO_LF_EOF hack and use the usual
DIFF_SYMBOL_CONTEXT_INCOMPLETE code path, which will later learn how
to handle whitespace errors.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Everybody else, except for emit_rewrite_lines(), calls the
emit_callback data ecbdata. Make sure we call the same thing by
the same name for consistency.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Create a helper function that reacts to "\ No newline at the end of
file" in preparation for unifying the incomplete line handling in
the code path that handles xdiff output and the code path that
bypasses xdiff and produces complete rewrite patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "\ No newline at the end of the file" can come after any of the
"-" (deleted preimage line), " " (unchanged line), or "+" (added
postimage line). Incrementing only the preimage line number upon
seeing it does not make any sense.
We can keep track of what the previous line was, and increment
lno_in_{pre,post}image variables properly, like this patch does. I
do not think it matters, as these numbers are used only to compare
them with blank_at_eof_in_{pre,post}image to issue the warning every
time we see an added line, but by definition, after we see "\ No
newline at the end of the file" for an added line, we will not see
an added line for the file.
Keeping track of what the last line was (in other words, "is it that
the file used to end in an incomplete line? The file ends in an
incomplete line after the change? Both the file before and after
the change ends in an incomplete line that did not change?") will be
independently useful.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The suppress-blank-empty feature abused the CONTEXT_INCOMPLETE
symbol that was meant to be used only for "\ No newline at the end
of file" code path.
The intent of the feature was to turn a context line we receive from
xdiff machinery (which always uses ' ' for context lines, even an
empty one) and spit it out as a truly empty line.
Perform such a conversion very locally at where a line from xdiff
that begins with ' ' is handled for output; there are many checks
before the control reaches such place that checks the first letter
of the diff output line to see if it is a context line, and having
to check for '\n' and treat it as a special case is error prone.
In order to catch similar hacks in the future, make sure the code
path that is meant for "\ No newline" case checks the first byte is
indeed a backslash.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Apply the simple rule: if you need {} in one arm of the if/else
if/else... cascade, have {} in all of them.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A comment in diff.c claimed that bits up to 12th (counting from 0th)
are whitespace rules, and 13th thru 15th are for new/old/context,
but it turns out it was miscounting. Correct them, and clarify
where the whitespace rule bits come from in the comment. Extend bit
assignment comments to cover bits used for color-moved, which
weren't described.
Also update the way these bit constants are defined to use (1 << N)
notation, instead of octal constants, as it tends to make it easier
to notice a breakage like this.
Sprinkle a few blank lines between logically distinct groups of CPP
macro definitions to make them easier to read.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a configuration variable to control the default behavior of git replay
for updating references. This allows users who prefer the traditional
pipeline output to set it once in their config instead of passing
--ref-action=print with every command.
The config variable uses string values that mirror the behavior modes:
* replay.refAction = update (default): atomic ref updates
* replay.refAction = print: output commands for pipeline
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Elijah Newren <newren@gmail.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The git replay command currently outputs update commands that can be
piped to update-ref to achieve a rebase, e.g.
git replay --onto main topic1..topic2 | git update-ref --stdin
This separation had advantages for three special cases:
* it made testing easy (when state isn't modified from one step to
the next, you don't need to make temporary branches or have undo
commands, or try to track the changes)
* it provided a natural can-it-rebase-cleanly (and what would it
rebase to) capability without automatically updating refs, similar
to a --dry-run
* it provided a natural low-level tool for the suite of hash-object,
mktree, commit-tree, mktag, merge-tree, and update-ref, allowing
users to have another building block for experimentation and making
new tools
However, it should be noted that all three of these are somewhat
special cases; users, whether on the client or server side, would
almost certainly find it more ergonomic to simply have the updating
of refs be the default.
For server-side operations in particular, the pipeline architecture
creates process coordination overhead. Server implementations that need
to perform rebases atomically must maintain additional code to:
1. Spawn and manage a pipeline between git-replay and git-update-ref
2. Coordinate stdout/stderr streams across the pipe boundary
3. Handle partial failure states if the pipeline breaks mid-execution
4. Parse and validate the update-ref command output
Change the default behavior to update refs directly, and atomically (at
least to the extent supported by the refs backend in use). This
eliminates the process coordination overhead for the common case.
For users needing the traditional pipeline workflow, add a new
--ref-action=<mode> option that preserves the original behavior:
git replay --ref-action=print --onto main topic1..topic2 | git update-ref --stdin
The mode can be:
* update (default): Update refs directly using an atomic transaction
* print: Output update-ref commands for pipeline use
Test suite changes:
All existing tests that expected command output now use
--ref-action=print to preserve their original behavior. This keeps
the tests valid while allowing them to verify that the pipeline workflow
still works correctly.
New tests were added to verify:
- Default atomic behavior (no output, refs updated directly)
- Bare repository support (server-side use case)
- Equivalence between traditional pipeline and atomic updates
- Real atomicity using a lock file to verify all-or-nothing guarantee
- Test isolation using test_when_finished to clean up state
- Reflog messages include replay mode and target
A following commit will add a replay.refAction configuration
option for users who prefer the traditional pipeline output as their
default behavior.
Helped-by: Elijah Newren <newren@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In preparation for adding the --ref-action option, convert option
validation to use die_for_incompatible_opt2(). This helper provides
standardized error messages for mutually exclusive options.
The following commit introduces --ref-action which will be incompatible
with certain other options. Using die_for_incompatible_opt2() now means
that commit can cleanly add its validation using the same pattern,
keeping the validation logic consistent and maintainable.
This also aligns git-replay's option handling with how other Git commands
manage option conflicts, using the established die_for_incompatible_opt*()
helper family.
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|