Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/BreakingChanges.adoc | 20
-rw-r--r-- Documentation/Makefile | 3
-rw-r--r-- Documentation/MyFirstContribution.adoc | 5
-rw-r--r-- Documentation/RelNotes/2.51.2.adoc | 45
-rw-r--r-- Documentation/RelNotes/2.52.0.adoc | 155
-rw-r--r-- Documentation/SubmittingPatches | 28
-rw-r--r-- Documentation/config/commitgraph.adoc | 11
-rw-r--r-- Documentation/config/core.adoc | 9
-rw-r--r-- Documentation/config/extensions.adoc | 6
-rw-r--r-- Documentation/config/maintenance.adoc | 49
-rw-r--r-- Documentation/config/replay.adoc | 11
-rw-r--r-- Documentation/config/stash.adoc | 4
-rw-r--r-- Documentation/config/submodule.adoc | 5
-rw-r--r-- Documentation/diff-algorithm-option.adoc | 20
-rw-r--r-- Documentation/diff-options.adoc | 21
-rw-r--r-- Documentation/fsck-msgids.adoc | 6
-rw-r--r-- Documentation/git-add.adoc | 1
-rw-r--r-- Documentation/git-bisect.adoc | 43
-rw-r--r-- Documentation/git-blame.adoc | 2
-rw-r--r-- Documentation/git-checkout.adoc | 4
-rw-r--r-- Documentation/git-commit-graph.adoc | 2
-rw-r--r-- Documentation/git-config.adoc | 18
-rw-r--r-- Documentation/git-fast-import.adoc | 5
-rw-r--r-- Documentation/git-history.adoc | 111
-rw-r--r-- Documentation/git-maintenance.adoc | 13
-rw-r--r-- Documentation/git-patch-id.adoc | 22
-rw-r--r-- Documentation/git-pull.adoc | 93
-rw-r--r-- Documentation/git-rebase.adoc | 9
-rw-r--r-- Documentation/git-replay.adoc | 63
-rw-r--r-- Documentation/git-repo.adoc | 36
-rw-r--r-- Documentation/git-rev-parse.adoc | 25
-rw-r--r-- Documentation/git-shortlog.adoc | 4
-rw-r--r-- Documentation/git-sparse-checkout.adoc | 105
-rw-r--r-- Documentation/git-tag.adoc | 48
-rw-r--r-- Documentation/git-worktree.adoc | 14
-rw-r--r-- Documentation/gitcli.adoc | 2
-rw-r--r-- Documentation/gitdatamodel.adoc | 302
-rw-r--r-- Documentation/gitformat-loose.adoc | 157
-rw-r--r-- Documentation/gitformat-pack.adoc | 19
-rw-r--r-- Documentation/gitignore.adoc | 5
-rw-r--r-- Documentation/gitprotocol-http.adoc | 3
-rw-r--r-- Documentation/glossary-content.adoc | 4
-rw-r--r-- Documentation/howto/meson.build | 4
-rw-r--r-- Documentation/meson.build | 15
-rw-r--r-- Documentation/pull-fetch-param.adoc | 1
-rw-r--r-- Documentation/technical/commit-graph.adoc | 29
-rw-r--r-- Documentation/technical/hash-function-transition.adoc | 46
-rw-r--r-- Documentation/technical/large-object-promisors.adoc | 64
-rw-r--r-- Documentation/technical/meson.build | 5
-rw-r--r-- Documentation/technical/remembering-renames.adoc | 120
-rw-r--r-- Documentation/technical/sparse-checkout.adoc | 704
-rw-r--r-- Documentation/technical/unambiguous-types.adoc | 229
52 files changed, 2026 insertions, 699 deletions
diff --git a/Documentation/BreakingChanges.adoc b/Documentation/BreakingChanges.adoc
index 90b53abcea..f814450d2f 100644
--- a/Documentation/BreakingChanges.adoc
+++ b/Documentation/BreakingChanges.adoc
@@ -295,6 +295,26 @@ The command will be removed.
+
cf. <xmqqa59i45wc.fsf@gitster.g>
+* Support for `core.preferSymlinkRefs=true` has been deprecated and will be
+ removed in Git 3.0. Writing symbolic refs as symbolic links will be phased
+ out in favor of plain files that hold the textual representation of
+ symbolic refs.
++
+Symbolic references were initially always stored as a symbolic link. This was
+changed in 9b143c6e15 (Teach update-ref about a symbolic ref stored in a
+textfile., 2005-09-25), where a new textual symref format was introduced to
+store those symbolic refs in a plain file. In 9f0bb90d16
+(core.prefersymlinkrefs: use symlinks for .git/HEAD, 2006-05-02), the Git
+project switched the default to use the textual symrefs in favor of symbolic
+links.
++
+The migration away from symbolic links happened almost 20 years ago,
+and there is no known reason why one should prefer them nowadays. Furthermore,
+symbolic links are not supported on some platforms.
++
+Note that only the writing side for such symbolic links is deprecated. Reading
+such symbolic links is still supported for now.
+
== Superseded features that will not be deprecated
Some features have gained newer replacements that aim to improve the design in
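To see which representation a repository uses for its symbolic refs, a reader could run something like the following. This is a minimal sketch and not part of the patch: it assumes a POSIX shell and a throwaway repository created with `mktemp`, and the variable names `repo` and `head_kind` are purely illustrative.

```shell
# Sketch: report whether HEAD is a symbolic link or a textual symref.
# "repo" and "head_kind" are illustrative names, not anything Git defines.
repo=$(mktemp -d)

# Explicitly disable the deprecated setting for this one invocation.
git -c core.preferSymlinkRefs=false init -q "$repo"

if [ -L "$repo/.git/HEAD" ]; then
    head_kind=symlink
else
    head_kind=textual
fi
echo "HEAD is stored as: $head_kind"

# A textual symref is a plain file whose content starts with "ref: ".
cat "$repo/.git/HEAD"
```

In a repository where the check reports a symlink, reading still works according to the text above; only the writing side is deprecated.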
diff --git a/Documentation/Makefile b/Documentation/Makefile
index a3fbd29744..47208269a2 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -34,6 +34,7 @@ MAN5_TXT += gitformat-bundle.adoc
MAN5_TXT += gitformat-chunk.adoc
MAN5_TXT += gitformat-commit-graph.adoc
MAN5_TXT += gitformat-index.adoc
+MAN5_TXT += gitformat-loose.adoc
MAN5_TXT += gitformat-pack.adoc
MAN5_TXT += gitformat-signature.adoc
MAN5_TXT += githooks.adoc
@@ -52,6 +53,7 @@ MAN7_TXT += gitcli.adoc
MAN7_TXT += gitcore-tutorial.adoc
MAN7_TXT += gitcredentials.adoc
MAN7_TXT += gitcvs-migration.adoc
+MAN7_TXT += gitdatamodel.adoc
MAN7_TXT += gitdiffcore.adoc
MAN7_TXT += giteveryday.adoc
MAN7_TXT += gitfaq.adoc
@@ -122,6 +124,7 @@ TECH_DOCS += technical/bundle-uri
TECH_DOCS += technical/commit-graph
TECH_DOCS += technical/directory-rename-detection
TECH_DOCS += technical/hash-function-transition
+TECH_DOCS += technical/large-object-promisors
TECH_DOCS += technical/long-running-process-protocol
TECH_DOCS += technical/multi-pack-index
TECH_DOCS += technical/packfile-uri
diff --git a/Documentation/MyFirstContribution.adoc b/Documentation/MyFirstContribution.adoc
index 02ba8ba5f6..f186dfbc89 100644
--- a/Documentation/MyFirstContribution.adoc
+++ b/Documentation/MyFirstContribution.adoc
@@ -1153,6 +1153,11 @@ NOTE: When you are sending a real patch, it will go to git@vger.kernel.org - but
please don't send your patchset from the tutorial to the real mailing list! For
now, you can send it to yourself, to make sure you understand how it will look.
+NOTE: After sending your patches, you can confirm that they reached the mailing
+list by visiting https://lore.kernel.org/git/. Use the search bar to find your
+name or the subject of your patch. If it appears, your email was successfully
+delivered.
+
After you run the command above, you will be presented with an interactive
prompt for each patch that's about to go out. This gives you one last chance to
edit or quit sending something (but again, don't edit code this way). Once you
diff --git a/Documentation/RelNotes/2.51.2.adoc b/Documentation/RelNotes/2.51.2.adoc
new file mode 100644
index 0000000000..f0be60333a
--- /dev/null
+++ b/Documentation/RelNotes/2.51.2.adoc
@@ -0,0 +1,45 @@
+Git 2.51.2 Release Notes
+========================
+
+In addition to fixes for an unfortunate regression introduced in Git
+2.51.1 that caused "git diff --quiet -w" to be not so quiet when there
+are additions, deletions and conflicts, this maintenance release merges
+more fixes/improvements that have landed on the master front, primarily
+to make the CI part of the system a bit more robust.
+
+
+Fixes since Git 2.51.1
+----------------------
+
+ * Recently we attempted to improve "git diff -w --quiet" and friends
+ to handle cases where patch output would be suppressed, but it
+ introduced a bug that emits unnecessary output, which has been
+ corrected.
+
+ * The code to squelch output from "git diff -w --name-status"
+ etc. for paths that "git diff -w -p" would have stayed silent
+ leaked output from dry-run patch generation, which has been
+ corrected.
+
+ * Windows "real-time monitoring" interferes with the execution of
+ tests and negatively affects both correctness and performance,
+ which has been disabled in GitLab CI.
+
+ * An earlier addition to "git diff --no-index A B" to limit the
+ output with pathspec after the two directories misbehaved when
+ these directories were given with a trailing slash, which has been
+ corrected.
+
+ * The "--short" option of "git status" that meant output for humans
+ and "-z" option to show NUL delimited output format did not mix
+ well, and colored some but not all things. The command has been
+ updated to color all elements consistently in such a case.
+
+ * Unicode width table update.
+
+ * Recent OpenSSH creates the Unix domain socket to communicate with
+ ssh-agent under $HOME instead of /tmp, which causes our test to
+ fail due to overly long pathname in our test environment, which has
+ been worked around by using "ssh-agent -T".
+
+Also contains various documentation updates, code cleanups and minor fixups.
diff --git a/Documentation/RelNotes/2.52.0.adoc b/Documentation/RelNotes/2.52.0.adoc
index ef5f91fcc0..6c0e7d05c0 100644
--- a/Documentation/RelNotes/2.52.0.adoc
+++ b/Documentation/RelNotes/2.52.0.adoc
@@ -55,6 +55,30 @@ UI, Workflows & Features
(e.g. blame.ignorerevsfile) can be marked as optional by prefixing
":(optional)" before its value.
+ * Show 'P'ipe command in "git add -p".
+
+ * "git sparse-checkout" subcommand learned a new "clean" action to
+ prune otherwise unused working-tree files that are outside the
+ areas of interest.
+
+ * "git fast-import" is taught to handle signed tags, just like it
+ recently learned to handle signed commits, in different ways.
+
+ * A new configuration variable commitGraph.changedPaths allows
+ turning "--changed-paths" on by default for "git commit-graph".
+
+ * "Symlink symref" has been added to the list of things that will
+ disappear at Git 3.0 boundary.
+
+ * "git maintenance" command learns the "geometric" strategy where it
+ avoids doing maintenance tasks that rebuild everything from
+ scratch.
+
+ * "git repo structure", a new command.
+
+ * The help text and manual page of "git bisect" command have been
+ made consistent with each other.
+
Performance, Internal Implementation, Development Support etc.
--------------------------------------------------------------
@@ -102,7 +126,6 @@ Performance, Internal Implementation, Development Support etc.
* Adjust to the way newer versions of cURL selectively enable tracing
options, so that our tests can continue to work.
- (merge 1b5a6bfff3 jk/curl-global-trace-components later to maint).
* The clear_alloc_state() API function was not fully clearing the
structure for reuse, but since nobody reuses it, replace it with a
@@ -131,6 +154,34 @@ Performance, Internal Implementation, Development Support etc.
and one for xdiff), roll everything into a single libgit.a archive.
This would help later effort to FFI into Rust.
+ * The beginning of SHA1-SHA256 interoperability work.
+
+ * Build procedures for a few credential helpers (in contrib/) have
+ been updated.
+
+ * CI improvements to handle the recent Rust integration better.
+
+ * The code in "git repack" machinery has been cleaned up to prepare
+ for incremental update of midx files.
+
+ * Two slightly different ways to get at "all the packfiles" in API
+ have been cleaned up.
+
+ * The code to walk revision graph to compute merge base has been
+ optimized.
+
+ * AI guidelines have been added to our documentation set.
+
+ * Contributed credential helpers (obviously in contrib/) now have "cd
+ $there && make install" target.
+
+ * The "MyFirstContribution" tutorial tells the reader how to send out
+ their patches; the section gained a hint to verify the message
+ reached the mailing list.
+
+ * The "debug" ref-backend was missing a method implementation, which
+ has been corrected.
+
Fixes since v2.51
-----------------
@@ -140,11 +191,9 @@ including security updates, are included in this release.
* During interactive rebase, using 'drop' on a merge commit led to
an error, which was incorrect.
- (merge 4d491ade8f js/rebase-i-allow-drop-on-a-merge later to maint).
* "git refs migrate" to migrate the reflog entries from a refs
backend to another had a handful of bugs squashed.
- (merge 465eff81de ps/reflog-migrate-fixes later to maint).
* "git remote rename origin upstream" failed to move origin/HEAD to
upstream/HEAD when origin/HEAD is unborn and performed other
@@ -157,11 +206,9 @@ including security updates, are included in this release.
* "git push" had a code path that led to BUG() but it should have
been a die(), as it is a response to a usual but invalid end-user
action to attempt pushing an object that does not exist.
- (merge dfbfc2221b dl/push-missing-object-error later to maint).
* Various bugs about rename handling in "ort" merge strategy have
been fixed.
- (merge f6ecb603ff en/ort-rename-fixes later to maint).
* "git jump" (in contrib/) fails to parse the diff header correctly
when a file has a space in its name, which has been corrected.
@@ -172,7 +219,6 @@ including security updates, are included in this release.
the prefix from the output, and oddballs like "-" (stdin) did not
work correctly because of it. Correct the set-up by undoing what
the set-up sequence did to cwd and prefix.
- (merge e1d3d61a45 jc/diff-no-index-in-subdir later to maint).
* Various options to "git diff" that make comparison ignore certain
aspects of the differences (like "space changes are ignored",
@@ -180,19 +226,19 @@ including security updates, are included in this release.
ignored") did not work well with "--name-only" and friends.
(merge b55e6d36eb ly/diff-name-only-with-diff-from-content later to maint).
+ * The above caused regressions, which have been corrected.
+
* Documentation for "git rebase" has been updated.
(merge 3f7f2b0359 je/doc-rebase later to maint).
* The start_delayed_progress() function in the progress eye-candy API
did not clear its internal state, making an initial delay value
larger than 1 second ineffective, which has been corrected.
- (merge 457534d041 js/progress-delay-fix later to maint).
* The compatObjectFormat extension is used to hide an incomplete
feature that is not yet usable for any purpose other than
developing the feature further. Document it as such to discourage
its use by mere mortals.
- (merge 716d905792 bc/doc-compat-object-format-not-working later to maint).
* "git log -L..." compared trees of multiple parents with the tree of the
merge result in an unnecessarily inefficient way.
@@ -202,7 +248,6 @@ including security updates, are included in this release.
repository, especially a partially cloned one, "git fetch" may
mistakenly think some objects we do have are missing, which has
been corrected.
- (merge 8f32a5a6c0 jk/fetch-check-graph-objects-fix later to maint).
* "git fetch" can clobber a symref that is dangling when the
remote-tracking HEAD is set to auto update, which has been
@@ -214,20 +259,16 @@ including security updates, are included in this release.
* Manual page for "gitk" is updated with the current maintainer's
name.
- (merge bcb20dda83 js/doc-gitk-history later to maint).
* Update the instructions for using GGG in the MyFirstContribution
document to say that a GitHub PR could be made against `git/git`
instead of `gitgitgadget/git`.
- (merge 37001cdbc4 ds/doc-ggg-pr-fork-clarify later to maint).
* Makefile tried to run multiple "cargo build" which would not work
very well; serialize their execution to work around this problem.
- (merge 0eeacde50e da/cargo-serialize later to maint).
* "git repack --path-walk" lost objects in some corner cases, which
has been corrected.
- (merge 93afe9b060 ds/path-walk-repack-fix later to maint).
* "git ls-files <pathspec>..." should not necessarily have to expand
the index fully if a sparsified directory is excluded by the
@@ -238,15 +279,12 @@ including security updates, are included in this release.
* Windows "real-time monitoring" interferes with the execution of
tests and negatively affects both correctness and performance,
which has been disabled in GitLab CI.
- (merge 608cf5b793 ps/gitlab-ci-disable-windows-monitoring later to maint).
* A broken or malicious "git fetch" can say that it has the same
object for many many times, and the upload-pack serving it can
exhaust memory storing them redundantly, which has been corrected.
- (merge 88a2dc68c8 ps/upload-pack-oom-protection later to maint).
* A corner case bug in "git log -L..." has been corrected.
- (merge e3106998ff sg/line-log-boundary-fixes later to maint).
* "git rev-parse --short" and friends failed to disambiguate two
objects with object names that share common prefix longer than 32
@@ -256,7 +294,6 @@ including security updates, are included in this release.
* Some among "git add -p" and friends ignored color.diff and/or
color.ui configuration variables, which is an old regression, which
has been corrected.
- (merge 1092cd6435 jk/add-i-color later to maint).
* "git subtree" (in contrib/) did not work correctly when splitting
squashed subtrees, which has been improved.
@@ -272,7 +309,6 @@ including security updates, are included in this release.
* "git rebase -i" failed to clean-up the commit log message when the
command commits the final one in a chain of "fixup" commands, which
has been corrected.
- (merge 82a0a73e15 pw/rebase-i-cleanup-fix later to maint).
* There are double frees and leaks around setup_revisions() API used
in "git stash show", which has been fixed, and setup_revisions()
@@ -283,7 +319,6 @@ including security updates, are included in this release.
* Deal more gracefully with directory / file conflicts when the files
backend is used for ref storage, by failing only the ones that are
involved in the conflict while allowing others.
- (merge 948b2ab0d8 kn/refs-files-case-insensitive later to maint).
* "git last-modified" operating in non-recursive mode used to trigger
a BUG(), which has been corrected.
@@ -296,16 +331,13 @@ including security updates, are included in this release.
* The "do you still use it?" message given by a command that is
deeply deprecated and allows us to suggest alternatives has been
updated.
- (merge 54a60e5b38 kh/you-still-use-whatchanged-fix later to maint).
* Clang-format update to let our control macros be formatted the way we
had them traditionally, e.g., "for_each_string_list_item()" without
space before the parentheses.
- (merge 3721541d35 jt/clang-format-foreach-wo-space-before-parenthesis later to maint).
* A few places where a size_t value was cast to curl_off_t without
checking have been updated to use the existing helper function.
- (merge ecc5749578 js/curl-off-t-fixes later to maint).
* "git reflog write" did not honor the configured user.name/email
which has been corrected.
@@ -317,7 +349,6 @@ including security updates, are included in this release.
environment, but Ubuntu replaced "sudo" with an implementation
that lacks the feature. Work this around by reinstalling the
original version.
- (merge fddb484255 ps/ci-avoid-broken-sudo-on-ubuntu later to maint).
* The reftable backend learned to sanity check its on-disk data more
carefully.
@@ -344,35 +375,69 @@ including security updates, are included in this release.
output with pathspec after the two directories misbehaved when
these directories were given with a trailing slash, which has been
corrected.
- (merge c0bec06cfe jk/diff-no-index-with-pathspec-fix later to maint).
+
+ * The "--short" option of "git status" that meant output for humans
+ and "-z" option to show NUL delimited output format did not mix
+ well, and colored some but not all things. The command has been
+ updated to color all elements consistently in such a case.
+
+ * Unicode width table update.
+
+ * GPG signing test set-up has been broken for a year, which has been
+ corrected.
+ (merge 516bf45749 jc/t1016-setup-fix later to maint).
+
+ * Recent OpenSSH creates the Unix domain socket to communicate with
+ ssh-agent under $HOME instead of /tmp, which causes our test to
+ fail due to overly long pathname in our test environment, which has
+ been worked around by using "ssh-agent -T".
+
+ * strbuf_split*() to split a string into multiple strbufs is often a
+ wrong API to use. A few uses of it have been removed by
+ simplifying the code.
+ (merge 2ab72a16d9 ob/gpg-interface-cleanup later to maint).
+
+ * "git shortlog" knows "--committer" and "--author" options, which
+ the command line completion (in contrib/) did not handle well,
+ which has been corrected.
+ (merge c568fa8e1c kf/log-shortlog-completion-fix later to maint).
+
+ * "git bisect" command did not react correctly to "git bisect help"
+ and "git bisect unknown", which has been corrected.
+ (merge 2bb3a012f3 rz/bisect-help-unknown later to maint).
+
+ * The 'q'(uit) command in "git add -p" has been improved to quit
+ without doing any meaningless work before leaving, and giving EOF
+ (typically control-D) to the prompt is made to behave the same way.
+
+ * The wildmatch code had a corner case bug that mistakenly makes
+ "foo**/bar" match with "foobar", which has been corrected.
+ (merge 1940a02dc1 jk/match-pathname-fix later to maint).
+
+ * Tests did not set up GNUPGHOME correctly, which is fixed but some
+ flaky tests are exposed in t1016, which needs to be addressed
+ before this topic can move forward.
+ (merge 6cd8369ef3 tz/test-prepare-gnupghome later to maint).
+
+ * The patterns used in the .gitignore files use backslash in the way
+ documented for fnmatch(3); document as such to reduce confusion.
+ (merge 8a6d158a1d jk/doc-backslash-in-exclude later to maint).
+
+ * The version of macos image used in GitHub CI has been updated to
+ macos-14, as the macos-13 that we have been using got deprecated.
+ (merge 73b9cdb7c4 jc/ci-use-macos-14 later to maint).
* Other code cleanup, docfix, build fix, etc.
- (merge 823d537fa7 kh/doc-git-log-markup-fix later to maint).
- (merge cf7efa4f33 rj/t6137-cygwin-fix later to maint).
(merge 529a60a885 ua/t1517-short-help-tests later to maint).
(merge 22d421fed9 ac/deglobal-fmt-merge-log-config later to maint).
- (merge 741f36c7d9 kr/clone-synopsis-fix later to maint).
(merge a60203a015 dk/t7005-editor-updates later to maint).
- (merge 7d4a5fef7d ds/doc-count-objects-fix later to maint).
(merge 16684b6fae ps/reftable-libgit2-cleanup later to maint).
- (merge f38786baa7 ja/asciidoc-doctor-verbatim-fixes later to maint).
- (merge 374579c6d4 kh/doc-interpret-trailers-markup-fix later to maint).
- (merge 44dce6541c kh/doc-config-typofix later to maint).
- (merge 785628b173 js/doc-sending-patch-via-thunderbird later to maint).
(merge e5c27bd3d8 je/doc-add later to maint).
(merge 13296ac909 ps/object-store-midx-dedup-info later to maint).
- (merge 2f4bf83ffc km/alias-doc-markup-fix later to maint).
- (merge b0d97aac19 kh/doc-markup-fixes later to maint).
(merge f9a6705d9a tc/t0450-harden later to maint).
- (merge c25651aefd ds/midx-write-fixes later to maint).
- (merge 069c15d256 rs/object-name-extend-abbrev-len-update later to maint).
- (merge bf5c224537 mm/worktree-doc-typofix later to maint).
- (merge 31397bc4f7 kh/doc-fast-import-markup-fix later to maint).
- (merge ac7096723b jc/doc-includeif-hasconfig-remote-url-fix later to maint).
- (merge fafc9b08b8 ag/doc-sendmail-gmail-example-update later to maint).
(merge a66fc22bf9 rs/get-oid-with-flags-cleanup later to maint).
- (merge e1d062e8ba ps/odb-clean-stale-wrappers later to maint).
- (merge fdd21ba116 mh/doc-credential-url-prefix later to maint).
- (merge 1c573a3451 en/doc-merge-tree-describe-merge-base later to maint).
- (merge 84a6bf7965 ja/doc-markup-attached-paragraph-fix later to maint).
- (merge 399694384b kh/doc-patch-id-markup-fix later to maint).
+ (merge 15b8abde07 js/mingw-includes-cleanup later to maint).
+ (merge 2cebca0582 tb/cat-file-objectmode-update later to maint).
+ (merge 8f487db07a kh/doc-patch-id-1 later to maint).
+ (merge f711f37b05 eb/t1016-hash-transition-fix later to maint).
+ (merge 85333aa1af jk/test-delete-gpgsig-leakfix later to maint).
diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index d620bd93bd..e270ccbe85 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -446,6 +446,34 @@ highlighted above.
Only capitalize the very first letter of the trailer, i.e. favor
"Signed-off-by" over "Signed-Off-By" and "Acked-by:" over "Acked-By".
+[[ai]]
+=== Use of Artificial Intelligence (AI)
+
+The Developer's Certificate of Origin requires contributors to certify
+that they know the origin of their contributions to the project and
+that they have the right to submit it under the project's license.
+It's not yet clear that this can be legally satisfied when submitting
+a significant amount of content that has been generated by AI tools.
+
+Another issue with AI generated content is that AIs still often
+hallucinate or just produce bad code, commit messages, documentation
+or output, even when you point out their mistakes.
+
+To avoid these issues, we will reject anything that looks AI
+generated, that sounds overly formal or bloated, that looks like AI
+slop, that looks good on the surface but makes no sense, or that
+senders don’t understand or cannot explain.
+
+We strongly recommend using AI tools carefully and responsibly.
+
+Contributors would often benefit more from AI by using it to guide and
+help them step by step towards producing a solution by themselves
+rather than by asking for a full solution that they would then mostly
+copy-paste. They can also use AI to help with debugging, or with
+checking for obvious mistakes, things that can be improved, things
+that don’t match our style, guidelines or our feedback, before sending
+it to us.
+
[[git-tools]]
=== Generate your patch using Git tools out of your commits.
diff --git a/Documentation/config/commitgraph.adoc b/Documentation/config/commitgraph.adoc
index 7f8c9d6638..70a56c53d2 100644
--- a/Documentation/config/commitgraph.adoc
+++ b/Documentation/config/commitgraph.adoc
@@ -8,6 +8,17 @@ commitGraph.maxNewFilters::
Specifies the default value for the `--max-new-filters` option of `git
commit-graph write` (c.f., linkgit:git-commit-graph[1]).
+commitGraph.changedPaths::
+ If true, then `git commit-graph write` will compute and write
+ changed-path Bloom filters by default, equivalent to passing
+ `--changed-paths`. If false or unset, changed-paths Bloom filters will
+ be written during `git commit-graph write` only if the filters already
+ exist in the current commit-graph file. This matches the default
+ behavior of `git commit-graph write` without any `--[no-]changed-paths`
+ option. To rewrite a commit-graph file without any filters, use the
+ `--no-changed-paths` option. Command-line option `--[no-]changed-paths`
+ always takes precedence over this configuration. Defaults to unset.
+
commitGraph.readChangedPaths::
Deprecated. Equivalent to commitGraph.changedPathsVersion=-1 if true, and
commitGraph.changedPathsVersion=0 if false. (If commitGraph.changedPathsVersion
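As an illustration of the new `commitGraph.changedPaths` entry, the sketch below turns the setting on in a scratch repository and reads it back. It assumes a POSIX shell; the names `repo` and `changed_paths` are made up for the example, and the effect on `git commit-graph write` is only what the hunk above describes.

```shell
# Sketch: enable changed-path Bloom filters by default in one repository.
# "repo" and "changed_paths" are illustrative names.
repo=$(mktemp -d)
git init -q "$repo"

git -C "$repo" config commitGraph.changedPaths true
changed_paths=$(git -C "$repo" config --get commitGraph.changedPaths)
echo "commitGraph.changedPaths=$changed_paths"

# Per the documentation above, with this setting a supporting Git treats
#   git commit-graph write
# as if --changed-paths had been passed, while an explicit
#   git commit-graph write --no-changed-paths
# on the command line still takes precedence.
```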
diff --git a/Documentation/config/core.adoc b/Documentation/config/core.adoc
index 08739bb9d4..01202da7cd 100644
--- a/Documentation/config/core.adoc
+++ b/Documentation/config/core.adoc
@@ -75,8 +75,8 @@ The built-in file system monitor is currently available only on a
limited set of supported platforms. Currently, this includes Windows
and MacOS.
+
- Otherwise, this variable contains the pathname of the "fsmonitor"
- hook command.
+Otherwise, this variable contains the pathname of the "fsmonitor"
+hook command.
+
This hook command is used to identify all files that may have changed
since the requested date/time. This information is used to speed up
@@ -290,6 +290,9 @@ core.preferSymlinkRefs::
and other symbolic reference files, use symbolic links.
This is sometimes needed to work with old scripts that
expect HEAD to be a symbolic link.
++
+This configuration is deprecated and will be removed in Git 3.0. Symbolic refs
+will always be written as textual symrefs.
core.alternateRefsCommand::
When advertising tips of available history from an alternate, use the shell to
@@ -626,6 +629,8 @@ core.whitespace::
part of the line terminator, i.e. with it, `trailing-space`
does not trigger if the character before such a carriage-return
is not a whitespace (not enabled by default).
+* `incomplete-line` treats the last line of a file that is missing the
+ newline at the end as an error (not enabled by default).
* `tabwidth=<n>` tells how many character positions a tab occupies; this
is relevant for `indent-with-non-tab` and when Git fixes `tab-in-indent`
errors. The default tab width is 8. Allowed values are 1 to 63.
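The new `incomplete-line` rule can be tried out roughly as follows. This is a hedged sketch: it assumes a POSIX shell and a scratch repository, the file name `example.txt` and the variables `repo` and `last_byte` are hypothetical, and whether the rule is recognized depends on the Git version in use.

```shell
# Sketch: opt in to the "incomplete-line" whitespace rule and create a
# file whose last line lacks the terminating newline.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config core.whitespace incomplete-line

# printf without a trailing "\n" leaves the final line incomplete.
printf 'first line\nlast line without newline' >"$repo/example.txt"

# Command substitution strips trailing newlines, so a non-empty result
# means the file does not end with a newline.
last_byte=$(tail -c 1 "$repo/example.txt")
if [ -n "$last_byte" ]; then
    echo "example.txt ends with an incomplete line"
fi
```

A Git that knows the rule would then be expected to report this file in its whitespace checks, for example when `git diff --check` looks at the added content.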
diff --git a/Documentation/config/extensions.adoc b/Documentation/config/extensions.adoc
index 532456644b..e33040fff5 100644
--- a/Documentation/config/extensions.adoc
+++ b/Documentation/config/extensions.adoc
@@ -73,6 +73,12 @@ relativeWorktrees:::
repaired with either the `--relative-paths` option or with the
`worktree.useRelativePaths` config set to `true`.
+submoduleEncoding:::
+ If enabled, submodule gitdir paths are encoded to avoid filesystem
+ conflicts due to nested gitdirs, case insensitivity or other issues.
+ When enabled, the submodule.<name>.gitdir config is always set for
+ all submodules and is the single point of authority for gitdir paths.
+
worktreeConfig:::
If enabled, then worktrees will load config settings from the
`$GIT_DIR/config.worktree` file in addition to the
diff --git a/Documentation/config/maintenance.adoc b/Documentation/config/maintenance.adoc
index 2f71934218..d0c38f03fa 100644
--- a/Documentation/config/maintenance.adoc
+++ b/Documentation/config/maintenance.adoc
@@ -16,19 +16,36 @@ detach.
maintenance.strategy::
This string config option provides a way to specify one of a few
- recommended schedules for background maintenance. This only affects
- which tasks are run during `git maintenance run --schedule=X`
- commands, provided no `--task=<task>` arguments are provided.
- Further, if a `maintenance.<task>.schedule` config value is set,
- then that value is used instead of the one provided by
- `maintenance.strategy`. The possible strategy strings are:
+ recommended strategies for repository maintenance. This affects
+ which tasks are run during `git maintenance run`, provided no
+ `--task=<task>` arguments are provided. This setting impacts manual
+ maintenance, auto-maintenance as well as scheduled maintenance. The
+ tasks that run may be different depending on the maintenance type.
+
-* `none`: This default setting implies no tasks are run at any schedule.
+The maintenance strategy can be further tweaked by setting
+`maintenance.<task>.enabled` and `maintenance.<task>.schedule`. If set, these
+values are used instead of the defaults provided by `maintenance.strategy`.
++
+The possible strategies are:
++
+* `none`: This strategy implies no tasks are run at all. This is the default
+ strategy for scheduled maintenance.
+* `gc`: This strategy runs the `gc` task. This is the default strategy for
+ manual maintenance.
+* `geometric`: This strategy performs geometric repacking of packfiles and
+ keeps auxiliary data structures up-to-date. The strategy expires data in the
+ reflog and removes worktrees that cannot be located anymore. When the
+ geometric repacking strategy would decide to do an all-into-one repack, then
+ the strategy generates a cruft pack for all unreachable objects. Objects that
+ are already part of a cruft pack will be expired.
++
+This repacking strategy is a full replacement for the `gc` strategy and is
+recommended for large repositories.
* `incremental`: This setting optimizes for performing small maintenance
activities that do not delete any data. This does not schedule the `gc`
task, but runs the `prefetch` and `commit-graph` tasks hourly, the
`loose-objects` and `incremental-repack` tasks daily, and the `pack-refs`
- task weekly.
+ task weekly. Manual repository maintenance uses the `gc` task.
maintenance.<task>.enabled::
This boolean config option controls whether the maintenance task
@@ -75,6 +92,22 @@ maintenance.incremental-repack.auto::
number of pack-files not in the multi-pack-index is at least the value
of `maintenance.incremental-repack.auto`. The default value is 10.
+maintenance.geometric-repack.auto::
+ This integer config option controls how often the `geometric-repack`
+ task should be run as part of `git maintenance run --auto`. If zero,
+ then the `geometric-repack` task will not run with the `--auto`
+ option. A negative value will force the task to run every time.
+ Otherwise, a positive value implies the command should run either when
+ there are packfiles that need to be merged together to retain the
+ geometric progression, or when there are at least this many loose
+ objects that would be written into a new packfile. The default value is
+ 100.
+
+maintenance.geometric-repack.splitFactor::
+ This integer config option controls the factor used for the geometric
+ sequence. See the `--geometric=` option in linkgit:git-repack[1] for
+ more details. Defaults to `2`.
+
maintenance.reflog-expire.auto::
This integer config option controls how often the `reflog-expire` task
should be run as part of `git maintenance run --auto`. If zero, then
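
Note on the `maintenance.geometric-repack.splitFactor` hunk above: the following is a minimal sketch of what a "geometric sequence" of packfiles means, assuming the simplified reading that each pack should hold at least `<factor>` times as many objects as the next-smaller one. The function name `is_geometric` and the object counts are made up for illustration; the way Git actually picks which packs to merge is more involved than this check.

```shell
# is_geometric FACTOR COUNT...
# COUNT values are per-pack object counts, ordered smallest first.
# Illustrative only: this is not Git's implementation.
is_geometric() {
    factor=$1
    shift
    prev=
    for count in "$@"; do
        # A pack smaller than factor * (previous pack) breaks the progression.
        if [ -n "$prev" ] && [ "$count" -lt $((factor * prev)) ]; then
            echo "repack needed"
            return 0
        fi
        prev=$count
    done
    echo "progression intact"
}

is_geometric 2 100 200 450
is_geometric 2 100 150 450
```

With a factor of 2, the first call reports an intact progression (200 >= 2*100 and 450 >= 2*200), while the second reports that a repack is needed because 150 is less than 2*100.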
diff --git a/Documentation/config/replay.adoc b/Documentation/config/replay.adoc
new file mode 100644
index 0000000000..7d549d2f0e
--- /dev/null
+++ b/Documentation/config/replay.adoc
@@ -0,0 +1,11 @@
+replay.refAction::
+ Specifies the default mode for handling reference updates in
+ `git replay`. The value can be:
++
+--
+ * `update`: Update refs directly using an atomic transaction (default behavior).
+ * `print`: Output update-ref commands for pipeline use.
+--
++
+This setting can be overridden with the `--ref-action` command-line option.
+When not configured, `git replay` defaults to `update` mode.
diff --git a/Documentation/config/stash.adoc b/Documentation/config/stash.adoc
index 7fc32027f7..a1197ffd7d 100644
--- a/Documentation/config/stash.adoc
+++ b/Documentation/config/stash.adoc
@@ -11,6 +11,10 @@ endif::git-stash[]
behave as if `--index` was supplied. Defaults to false.
ifndef::git-stash[]
See the descriptions in linkgit:git-stash[1].
++
+This also affects invocations of linkgit:git-stash[1] via `--autostash` from
+commands like linkgit:git-merge[1], linkgit:git-rebase[1], and
+linkgit:git-pull[1].
endif::git-stash[]
`stash.showIncludeUntracked`::
diff --git a/Documentation/config/submodule.adoc b/Documentation/config/submodule.adoc
index 0672d99117..ddaadc3dc5 100644
--- a/Documentation/config/submodule.adoc
+++ b/Documentation/config/submodule.adoc
@@ -52,6 +52,11 @@ submodule.<name>.active::
submodule.active config option. See linkgit:gitsubmodules[7] for
details.
+submodule.<name>.gitdir::
+ This option sets the gitdir path for submodule <name>, allowing users to
+ override the default path. Takes effect only when `extensions.submoduleEncoding`
+ is enabled; otherwise it is ignored. See linkgit:git-config[1] for details.
+
submodule.active::
A repeated field which contains a pathspec used to match against a
submodule's path to determine if the submodule is of interest to git
diff --git a/Documentation/diff-algorithm-option.adoc b/Documentation/diff-algorithm-option.adoc
new file mode 100644
index 0000000000..8e3a0b63d7
--- /dev/null
+++ b/Documentation/diff-algorithm-option.adoc
@@ -0,0 +1,20 @@
+`--diff-algorithm=(patience|minimal|histogram|myers)`::
+ Choose a diff algorithm. The variants are as follows:
++
+--
+ `default`;;
+ `myers`;;
+ The basic greedy diff algorithm. Currently, this is the default.
+ `minimal`;;
+ Spend extra time to make sure the smallest possible diff is
+ produced.
+ `patience`;;
+ Use "patience diff" algorithm when generating patches.
+ `histogram`;;
+ This algorithm extends the patience algorithm to "support
+ low-occurrence common elements".
+--
++
+For instance, if you configured the `diff.algorithm` variable to a
+non-default value and want to use the default one, then you
+have to use `--diff-algorithm=default` option.
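
Note on the `--diff-algorithm` text above: a small sketch of the override it describes, assuming `diff.algorithm` has been set to `histogram` (simulated here with `-c` so no real configuration is touched). The file names `a` and `b` are placeholders, and `--no-index` is used only so the example runs outside a repository.

```shell
tmp=$(mktemp -d)
printf 'one\ntwo\n' > "$tmp/a"
printf 'one\nthree\n' > "$tmp/b"

# Pretend diff.algorithm is configured to a non-default value, then ask for
# the default algorithm for this single invocation.
rc=0
out=$(git -c diff.algorithm=histogram diff --no-index \
        --diff-algorithm=default "$tmp/a" "$tmp/b") || rc=$?

# With --no-index, a non-zero exit status signals that the files differ.
printf '%s\n' "$out"
```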
diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc
index ae31520f7f..9cdad6f72a 100644
--- a/Documentation/diff-options.adoc
+++ b/Documentation/diff-options.adoc
@@ -197,26 +197,7 @@ and starts with _<text>_, this algorithm attempts to prevent it from
appearing as a deletion or addition in the output. It uses the "patience
diff" algorithm internally.
-`--diff-algorithm=(patience|minimal|histogram|myers)`::
- Choose a diff algorithm. The variants are as follows:
-+
---
- `default`;;
- `myers`;;
- The basic greedy diff algorithm. Currently, this is the default.
- `minimal`;;
- Spend extra time to make sure the smallest possible diff is
- produced.
- `patience`;;
- Use "patience diff" algorithm when generating patches.
- `histogram`;;
- This algorithm extends the patience algorithm to "support
- low-occurrence common elements".
---
-+
-For instance, if you configured the `diff.algorithm` variable to a
-non-default value and want to use the default one, then you
-have to use `--diff-algorithm=default` option.
+include::diff-algorithm-option.adoc[]
`--stat[=<width>[,<name-width>[,<count>]]]`::
Generate a diffstat. By default, as much space as necessary
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index 81f11ba125..acac9683af 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -10,6 +10,12 @@
`badFilemode`::
(INFO) A tree contains a bad filemode entry.
+`badGpgsig`::
+ (ERROR) A tag contains a bad (truncated) signature (e.g., `gpgsig`) header.
+
+`badHeaderContinuation`::
+ (ERROR) A continuation header (such as for `gpgsig`) is unexpectedly truncated.
+
`badName`::
(ERROR) An author/committer name is empty.
diff --git a/Documentation/git-add.adoc b/Documentation/git-add.adoc
index 3116a2cac5..6192daeb03 100644
--- a/Documentation/git-add.adoc
+++ b/Documentation/git-add.adoc
@@ -349,6 +349,7 @@ patch::
s - split the current hunk into smaller hunks
e - manually edit the current hunk
p - print the current hunk
+ P - print the current hunk using the pager
? - print help
+
After deciding the fate for all hunks, if there is any hunk
diff --git a/Documentation/git-bisect.adoc b/Documentation/git-bisect.adoc
index 58dbb74a15..b0078dda0e 100644
--- a/Documentation/git-bisect.adoc
+++ b/Documentation/git-bisect.adoc
@@ -9,26 +9,22 @@ git-bisect - Use binary search to find the commit that introduced a bug
SYNOPSIS
--------
[verse]
-'git bisect' <subcommand> <options>
+'git bisect' start [--term-(bad|new)=<term-new> --term-(good|old)=<term-old>]
+ [--no-checkout] [--first-parent] [<bad> [<good>...]] [--] [<pathspec>...]
+'git bisect' (bad|new|<term-new>) [<rev>]
+'git bisect' (good|old|<term-old>) [<rev>...]
+'git bisect' terms [--term-(good|old) | --term-(bad|new)]
+'git bisect' skip [(<rev>|<range>)...]
+'git bisect' next
+'git bisect' reset [<commit>]
+'git bisect' (visualize|view)
+'git bisect' replay <logfile>
+'git bisect' log
+'git bisect' run <cmd> [<arg>...]
+'git bisect' help
DESCRIPTION
-----------
-The command takes various subcommands, and different options depending
-on the subcommand:
-
- git bisect start [--term-(bad|new)=<term-new> --term-(good|old)=<term-old>]
- [--no-checkout] [--first-parent] [<bad> [<good>...]] [--] [<pathspec>...]
- git bisect (bad|new|<term-new>) [<rev>]
- git bisect (good|old|<term-old>) [<rev>...]
- git bisect terms [--term-(good|old) | --term-(bad|new)]
- git bisect skip [(<rev>|<range>)...]
- git bisect reset [<commit>]
- git bisect (visualize|view)
- git bisect replay <logfile>
- git bisect log
- git bisect run <cmd> [<arg>...]
- git bisect help
-
This command uses a binary search algorithm to find which commit in
your project's history introduced a bug. You use it by first telling
it a "bad" commit that is known to contain the bug, and a "good"
@@ -295,6 +291,19 @@ $ git bisect skip v2.5 v2.5..v2.6
This tells the bisect process that the commits between `v2.5` and
`v2.6` (inclusive) should be skipped.
+Bisect next
+~~~~~~~~~~~
+
+Normally, after marking a revision as good or bad, Git automatically
+computes and checks out the next revision to test. However, if you need to
+explicitly request the next bisection step, you can use:
+
+------------
+$ git bisect next
+------------
+
+You might use this to resume the bisection process after interrupting it
+by checking out a different revision.
Cutting down bisection by giving more parameters to bisect start
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc
index e438d28625..adcbb6f5dc 100644
--- a/Documentation/git-blame.adoc
+++ b/Documentation/git-blame.adoc
@@ -85,6 +85,8 @@ include::blame-options.adoc[]
Ignore whitespace when comparing the parent's version and
the child's to find where the lines came from.
+include::diff-algorithm-option.adoc[]
+
--abbrev=<n>::
Instead of using the default 7+1 hexadecimal digits as the
abbreviated object name, use <m>+1 digits, where <m> is at
diff --git a/Documentation/git-checkout.adoc b/Documentation/git-checkout.adoc
index 431185ca0b..6f281b298e 100644
--- a/Documentation/git-checkout.adoc
+++ b/Documentation/git-checkout.adoc
@@ -61,7 +61,7 @@ uncommitted changes.
`git checkout -B <branch> [<start-point>]`::
The same as `-b`, except that if the branch already exists it
- resets `_<branch>_` to the start point instead of failing.
+ resets _<branch>_ to the start point instead of failing.
`git checkout --detach [<branch>]`::
`git checkout [--detach] <commit>`::
@@ -155,7 +155,7 @@ of it").
`-B <new-branch>`::
The same as `-b`, except that if the branch already exists it
- resets `_<branch>_` to the start point instead of failing.
+ resets _<branch>_ to the start point instead of failing.
`-t`::
`--track[=(direct|inherit)]`::
diff --git a/Documentation/git-commit-graph.adoc b/Documentation/git-commit-graph.adoc
index e9558173c0..6d19026035 100644
--- a/Documentation/git-commit-graph.adoc
+++ b/Documentation/git-commit-graph.adoc
@@ -71,7 +71,7 @@ take a while on large repositories. It provides significant performance gains
for getting history of a directory or a file with `git log -- <path>`. If
this option is given, future commit-graph writes will automatically assume
that this option was intended. Use `--no-changed-paths` to stop storing this
-data.
+data. `--changed-paths` is implied by config `commitGraph.changedPaths=true`.
+
With the `--max-new-filters=<n>` option, generate at most `n` new Bloom
filters (if `--changed-paths` is specified). If `n` is `-1`, no limit is
diff --git a/Documentation/git-config.adoc b/Documentation/git-config.adoc
index 36d2845152..cc054fa7e1 100644
--- a/Documentation/git-config.adoc
+++ b/Documentation/git-config.adoc
@@ -117,15 +117,15 @@ OPTIONS
--comment <message>::
Append a comment at the end of new or modified lines.
-
- If _<message>_ begins with one or more whitespaces followed
- by "#", it is used as-is. If it begins with "#", a space is
- prepended before it is used. Otherwise, a string " # " (a
- space followed by a hash followed by a space) is prepended
- to it. And the resulting string is placed immediately after
- the value defined for the variable. The _<message>_ must
- not contain linefeed characters (no multi-line comments are
- permitted).
++
+If _<message>_ begins with one or more whitespaces followed
+by "#", it is used as-is. If it begins with "#", a space is
+prepended before it is used. Otherwise, a string " # " (a
+space followed by a hash followed by a space) is prepended
+to it. And the resulting string is placed immediately after
+the value defined for the variable. The _<message>_ must
+not contain linefeed characters (no multi-line comments are
+permitted).
--all::
With `get`, return all values for a multi-valued key.
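
Note on the `--comment <message>` paragraph above: the three prefixing rules can be sketched as a small shell function. The name `format_comment` is hypothetical and this is not Git's code; it only mirrors the rules as the paragraph states them.

```shell
# format_comment MESSAGE
# Prints the text that would be placed after the value, per the rules above.
format_comment() {
    # Split off any leading whitespace so we can inspect what follows it.
    lead=${1%%[![:space:]]*}
    rest=${1#"$lead"}
    if [ -n "$lead" ] && [ "${rest#\#}" != "$rest" ]; then
        # Whitespace followed by "#": used as-is.
        printf '%s\n' "$1"
    elif [ "${1#\#}" != "$1" ]; then
        # Begins with "#": a single space is prepended.
        printf ' %s\n' "$1"
    else
        # Anything else: " # " is prepended.
        printf ' # %s\n' "$1"
    fi
}

format_comment 'set by deploy script'
format_comment '#set by deploy script'
format_comment '  # set by deploy script'
```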
diff --git a/Documentation/git-fast-import.adoc b/Documentation/git-fast-import.adoc
index 85ed7a7270..b74179a6c8 100644
--- a/Documentation/git-fast-import.adoc
+++ b/Documentation/git-fast-import.adoc
@@ -66,6 +66,11 @@ fast-import stream! This option is enabled automatically for
remote-helpers that use the `import` capability, as they are
already trusted to run their own code.
+--signed-tags=(verbatim|warn-verbatim|warn-strip|strip|abort)::
+ Specify how to handle signed tags. Behaves in the same way
+ as the same option in linkgit:git-fast-export[1], except that
+ the default is 'verbatim' (instead of 'abort').
+
--signed-commits=(verbatim|warn-verbatim|warn-strip|strip|abort)::
Specify how to handle signed commits. Behaves in the same way
as the same option in linkgit:git-fast-export[1], except that
diff --git a/Documentation/git-history.adoc b/Documentation/git-history.adoc
new file mode 100644
index 0000000000..3d6b2665f8
--- /dev/null
+++ b/Documentation/git-history.adoc
@@ -0,0 +1,111 @@
+git-history(1)
+==============
+
+NAME
+----
+git-history - EXPERIMENTAL: Rewrite history of the current branch
+
+SYNOPSIS
+--------
+[synopsis]
+git history reword <commit>
+git history split <commit> [--] [<pathspec>...]
+
+DESCRIPTION
+-----------
+
+Rewrite history by rearranging or modifying specific commits in the
+history.
+
+THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.
+
+This command is similar to linkgit:git-rebase[1] and uses the same
+underlying machinery. You should use rebases if you want to reapply a range of
+commits onto a different base, or interactive rebases if you want to edit a
+range of commits.
+
+Note that this command does not (yet) work with histories that contain
+merges. You should use linkgit:git-rebase[1] with the `--rebase-merges`
+flag instead.
+
+COMMANDS
+--------
+
+Several commands are available to rewrite history in different ways:
+
+`reword <commit>`::
+ Rewrite the commit message of the specified commit. All the other
+ details of this commit remain unchanged. This command will spawn an
+ editor with the current message of that commit.
+
+`split <commit> [--] [<pathspec>...]`::
+ Interactively split up <commit> into two commits by choosing
+ hunks introduced by it that will be moved into the new split-out
+ commit. These hunks will then be written into a new commit that
+ becomes the parent of the previous commit. The original commit
+ stays intact, except that its parent will be the newly split-out
+ commit.
++
+The configured editor is launched to ask for the commit message of the new
+commit. Authorship of the commit will be the same as for the
+original commit.
++
+If passed, _<pathspec>_ can be used to limit which changes shall be split out
+of the original commit. Files not matching any of the pathspecs will remain
+part of the original commit. For more details, see the 'pathspec' entry in
+linkgit:gitglossary[7].
++
+It is invalid to select either all or no hunks, as that would lead to
+one of the commits becoming empty.
+
+CONFIGURATION
+-------------
+
+include::includes/cmd-config-section-all.adoc[]
+
+include::config/sequencer.adoc[]
+
+EXAMPLES
+--------
+
+Split a commit
+~~~~~~~~~~~~~~
+
+----------
+$ git log --stat --oneline
+3f81232 (HEAD -> main) original
+ bar | 1 +
+ foo | 1 +
+ 2 files changed, 2 insertions(+)
+
+$ git history split HEAD
+diff --git a/bar b/bar
+new file mode 100644
+index 0000000..5716ca5
+--- /dev/null
++++ b/bar
+@@ -0,0 +1 @@
++bar
+(1/1) Stage addition [y,n,q,a,d,e,p,?]? y
+
+diff --git a/foo b/foo
+new file mode 100644
+index 0000000..257cc56
+--- /dev/null
++++ b/foo
+@@ -0,0 +1 @@
++foo
+(1/1) Stage addition [y,n,q,a,d,e,p,?]? n
+
+$ git log --stat --oneline
+7cebe64 (HEAD -> main) original
+ foo | 1 +
+ 1 file changed, 1 insertion(+)
+d1582f3 split-out commit
+ bar | 1 +
+ 1 file changed, 1 insertion(+)
+----------
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Documentation/git-maintenance.adoc b/Documentation/git-maintenance.adoc
index 540b5cf68b..37939510d4 100644
--- a/Documentation/git-maintenance.adoc
+++ b/Documentation/git-maintenance.adoc
@@ -12,6 +12,7 @@ SYNOPSIS
'git maintenance' run [<options>]
'git maintenance' start [--scheduler=<scheduler>]
'git maintenance' (stop|register|unregister) [<options>]
+'git maintenance' is-needed [<options>]
DESCRIPTION
@@ -84,6 +85,16 @@ The `unregister` subcommand will report an error if the current repository
is not already registered. Use the `--force` option to return success even
when the current repository is not registered.
+is-needed::
+ Check whether maintenance needs to be run without actually running it.
+ Exits with a 0 status code if maintenance needs to be run, 1 otherwise.
+ Ideally used with the '--auto' flag.
++
+If one or more `--task` options are specified, then those tasks are checked
+in that order. Otherwise, the tasks are determined by which
+`maintenance.<task>.enabled` config options are true. By default, only
+`maintenance.gc.enabled` is true.
+
TASKS
-----
@@ -183,6 +194,8 @@ OPTIONS
in the `gc.auto` config setting, or when the number of pack-files
exceeds the `gc.autoPackLimit` config setting. Not compatible with
the `--schedule` option.
+ When combined with the `is-needed` subcommand, check if the required
+ thresholds are met without actually running maintenance.
--schedule::
When combined with the `run` subcommand, run maintenance tasks
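
Note on the `is-needed` subcommand above: one way it might be used is to gate a maintenance run on its exit status. This is a sketch under assumptions: the throwaway repository is created only for the demonstration, and any non-zero status (including the error from an older Git that lacks `is-needed`) is treated as "nothing to do".

```shell
repo=$(mktemp -d)
git init -q "$repo"

# Exit status 0 means maintenance is due; run it only in that case.
if git -C "$repo" maintenance is-needed --auto >/dev/null 2>&1; then
    git -C "$repo" maintenance run --auto
    decision=ran
else
    decision=skipped
fi
echo "maintenance $decision"
```

Checking first keeps the decision separate from the work, which can be useful when a wrapper script wants to log or schedule the run rather than start it immediately.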
diff --git a/Documentation/git-patch-id.adoc b/Documentation/git-patch-id.adoc
index 45da0f27ac..92a1af36a2 100644
--- a/Documentation/git-patch-id.adoc
+++ b/Documentation/git-patch-id.adoc
@@ -7,8 +7,8 @@ git-patch-id - Compute unique ID for a patch
SYNOPSIS
--------
-[verse]
-'git patch-id' [--stable | --unstable | --verbatim]
+[synopsis]
+git patch-id [--stable | --unstable | --verbatim]
DESCRIPTION
-----------
@@ -21,7 +21,7 @@ the same time also reasonably unique, i.e., two patches that have the same
The main usecase for this command is to look for likely duplicate commits.
-When dealing with 'git diff-tree' output, it takes advantage of
+When dealing with `git diff-tree` output, it takes advantage of
the fact that the patch is prefixed with the object name of the
commit, and outputs two 40-byte hexadecimal strings. The first
string is the patch ID, and the second string is the commit ID.
@@ -30,35 +30,35 @@ This can be used to make a mapping from patch ID to commit ID.
OPTIONS
-------
---verbatim::
+`--verbatim`::
Calculate the patch-id of the input as it is given, do not strip
any whitespace.
+
-This is the default if patchid.verbatim is true.
+This is the default if `patchid.verbatim` is `true`.
---stable::
+`--stable`::
Use a "stable" sum of hashes as the patch ID. With this option:
+
--
- Reordering file diffs that make up a patch does not affect the ID.
In particular, two patches produced by comparing the same two trees
- with two different settings for "-O<orderfile>" result in the same
+ with two different settings for `-O<orderfile>` result in the same
patch ID signature, thereby allowing the computed result to be used
as a key to index some meta-information about the change between
the two trees;
- Result is different from the value produced by git 1.9 and older
- or produced when an "unstable" hash (see --unstable below) is
+ or produced when an "unstable" hash (see `--unstable` below) is
configured - even when used on a diff output taken without any use
- of "-O<orderfile>", thereby making existing databases storing such
+ of `-O<orderfile>`, thereby making existing databases storing such
"unstable" or historical patch-ids unusable.
- All whitespace within the patch is ignored and does not affect the id.
--
+
-This is the default if patchid.stable is set to true.
+This is the default if `patchid.stable` is set to `true`.
---unstable::
+`--unstable`::
Use an "unstable" hash as the patch ID. With this option,
the result produced is compatible with the patch-id value produced
by git 1.9 and older and whitespace is ignored. Users with pre-existing
diff --git a/Documentation/git-pull.adoc b/Documentation/git-pull.adoc
index 48e924a10a..cd3bbc90e3 100644
--- a/Documentation/git-pull.adoc
+++ b/Documentation/git-pull.adoc
@@ -15,68 +15,54 @@ SYNOPSIS
DESCRIPTION
-----------
-Incorporates changes from a remote repository into the current branch.
-If the current branch is behind the remote, then by default it will
-fast-forward the current branch to match the remote. If the current
-branch and the remote have diverged, the user needs to specify how to
-reconcile the divergent branches with `--rebase` or `--no-rebase` (or
-the corresponding configuration option in `pull.rebase`).
-
-More precisely, `git pull` runs `git fetch` with the given parameters
-and then depending on configuration options or command line flags,
-will call either `git rebase` or `git merge` to reconcile diverging
-branches.
-
-<repository> should be the name of a remote repository as
-passed to linkgit:git-fetch[1]. <refspec> can name an
-arbitrary remote ref (for example, the name of a tag) or even
-a collection of refs with corresponding remote-tracking branches
-(e.g., refs/heads/{asterisk}:refs/remotes/origin/{asterisk}),
-but usually it is the name of a branch in the remote repository.
-
-Default values for <repository> and <branch> are read from the
-"remote" and "merge" configuration for the current branch
-as set by linkgit:git-branch[1] `--track`.
-
-Assume the following history exists and the current branch is
-"`master`":
+Integrate changes from a remote repository into the current branch.
-------------
- A---B---C master on origin
- /
- D---E---F---G master
- ^
- origin/master in your repository
-------------
+First, `git pull` runs `git fetch` with the same arguments
+(excluding merge options) to fetch remote branch(es).
+Then it decides which remote branch to integrate: if you run `git pull`
+with no arguments this defaults to the <<UPSTREAM-BRANCHES,upstream>>
+for the current branch.
+Then it integrates that branch into the current branch.
-Then "`git pull`" will fetch and replay the changes from the remote
-`master` branch since it diverged from the local `master` (i.e., `E`)
-until its current commit (`C`) on top of `master` and record the
-result in a new commit along with the names of the two parent commits
-and a log message from the user describing the changes.
-
-------------
- A---B---C origin/master
- / \
- D---E---F---G---H master
-------------
+There are 4 main options for integrating the remote branch:
-See linkgit:git-merge[1] for details, including how conflicts
-are presented and handled.
+1. `git pull --ff-only` will only do "fast-forward" updates: it
+ fails if your local branch has diverged from the remote branch.
+ This is the default.
+2. `git pull --rebase` runs `git rebase`.
+3. `git pull --no-rebase` runs `git merge`.
+4. `git pull --squash` runs `git merge --squash`.
-In Git 1.7.0 or later, to cancel a conflicting merge, use
-`git reset --merge`. *Warning*: In older versions of Git, running 'git pull'
-with uncommitted changes is discouraged: while possible, it leaves you
-in a state that may be hard to back out of in the case of a conflict.
+You can also set the configuration options `pull.rebase`, `pull.squash`,
+or `pull.ff` with your preferred behaviour.
-If any of the remote changes overlap with local uncommitted changes,
-the merge will be automatically canceled and the work tree untouched.
-It is generally best to get any local changes in working order before
-pulling or stash them away with linkgit:git-stash[1].
+If there's a merge conflict during the merge or rebase that you don't
+want to handle, you can safely abort it with `git merge --abort` or
+`git rebase --abort`.
OPTIONS
-------
+<repository>::
+ The "remote" repository to pull from. This can be either
+ a URL (see the section <<URLS,GIT URLS>> below) or the name
+ of a remote (see the section <<REMOTES,REMOTES>> below).
++
+Defaults to the configured upstream for the current branch, or `origin`.
+See <<UPSTREAM-BRANCHES,UPSTREAM BRANCHES>> below for more on how to
+configure upstreams.
+
+<refspec>::
+ Which branch or other reference(s) to fetch and integrate into the
+ current branch, for example `main` in `git pull origin main`.
+ Defaults to the configured upstream for the current branch.
++
+This can be a branch, tag, or other collection of reference(s).
+See <<fetch-refspec,<refspec>>> below under "Options related to fetching"
+for the full syntax, and <<DEFAULT-BEHAVIOUR,DEFAULT BEHAVIOUR>> below
+for how `git pull` uses this argument to determine which remote branch
+to integrate.
+
-q::
--quiet::
This is passed to both underlying git-fetch to squelch reporting of
@@ -145,6 +131,7 @@ include::urls-remotes.adoc[]
include::merge-strategies.adoc[]
+[[DEFAULT-BEHAVIOUR]]
DEFAULT BEHAVIOUR
-----------------
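
Note on the `git pull` description above: the following sketch shows the `--ff-only` case on diverged branches, using two throwaway repositories. The helper `g`, the directory names, and the commit messages are all placeholders chosen for illustration.

```shell
tmp=$(mktemp -d)
# Hypothetical helper: run git with a throwaway identity.
g() { git -c user.name=Example -c user.email=example@example.com "$@"; }

g init -q "$tmp/upstream"
g -C "$tmp/upstream" commit -q --allow-empty -m base
g clone -q "$tmp/upstream" "$tmp/work"

# Make the histories diverge: one new commit on each side.
g -C "$tmp/upstream" commit -q --allow-empty -m remote-change
g -C "$tmp/work" commit -q --allow-empty -m local-change

before=$(g -C "$tmp/work" rev-parse HEAD)
if g -C "$tmp/work" pull -q --ff-only 2>/dev/null; then
    result=fast-forwarded
else
    result=refused
fi
after=$(g -C "$tmp/work" rev-parse HEAD)
echo "pull --ff-only: $result"
```

Because the local branch has a commit the remote lacks, a fast-forward is not possible, so the pull is expected to be refused and the local `HEAD` left where it was. Re-running with `--rebase` or `--no-rebase` would pick one of the other integration modes listed above.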
diff --git a/Documentation/git-rebase.adoc b/Documentation/git-rebase.adoc
index 005caf6164..4d2fe4be6e 100644
--- a/Documentation/git-rebase.adoc
+++ b/Documentation/git-rebase.adoc
@@ -487,9 +487,16 @@ See also INCOMPATIBLE OPTIONS below.
Add a `Signed-off-by` trailer to all the rebased commits. Note
that if `--interactive` is given then only commits marked to be
picked, edited or reworded will have the trailer added.
-+
+
See also INCOMPATIBLE OPTIONS below.
+--trailer=<trailer>::
+ Append the given trailer line(s) to every rebased commit
+ message, processed via linkgit:git-interpret-trailers[1].
+ When this option is present *rebase automatically implies*
+ `--force-rebase` so that fast-forwarded commits are also
+ rewritten.
+
-i::
--interactive::
Make a list of the commits which are about to be rebased. Let the
diff --git a/Documentation/git-replay.adoc b/Documentation/git-replay.adoc
index 0b12bf8aa4..dcb26e8a8e 100644
--- a/Documentation/git-replay.adoc
+++ b/Documentation/git-replay.adoc
@@ -9,15 +9,16 @@ git-replay - EXPERIMENTAL: Replay commits on a new base, works with bare repos t
SYNOPSIS
--------
[verse]
-(EXPERIMENTAL!) 'git replay' ([--contained] --onto <newbase> | --advance <branch>) <revision-range>...
+(EXPERIMENTAL!) 'git replay' ([--contained] --onto <newbase> | --advance <branch>) [--ref-action[=<mode>]] <revision-range>...
DESCRIPTION
-----------
Takes ranges of commits and replays them onto a new location. Leaves
-the working tree and the index untouched, and updates no references.
-The output of this command is meant to be used as input to
-`git update-ref --stdin`, which would update the relevant branches
+the working tree and the index untouched. By default, updates the
+relevant references using an atomic transaction (all refs update or
+none). Use `--ref-action=print` to avoid automatic ref updates and
+instead get update commands that can be piped to `git update-ref --stdin`
(see the OUTPUT section below).
THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.
@@ -29,18 +30,29 @@ OPTIONS
Starting point at which to create the new commits. May be any
valid commit, and not just an existing branch name.
+
-When `--onto` is specified, the update-ref command(s) in the output will
-update the branch(es) in the revision range to point at the new
-commits, similar to the way how `git rebase --update-refs` updates
-multiple branches in the affected range.
+When `--onto` is specified, the branch(es) in the revision range will be
+updated to point at the new commits, similar to the way `git rebase --update-refs`
+updates multiple branches in the affected range.
--advance <branch>::
Starting point at which to create the new commits; must be a
branch name.
+
-When `--advance` is specified, the update-ref command(s) in the output
-will update the branch passed as an argument to `--advance` to point at
-the new commits (in other words, this mimics a cherry-pick operation).
+The history is replayed on top of the <branch> and <branch> is updated to
+point at the tip of the resulting history. This is different from `--onto`,
+which uses the target only as a starting point without updating it.
+
+--ref-action[=<mode>]::
+ Control how references are updated. The mode can be:
++
+--
+ * `update` (default): Update refs directly using an atomic transaction.
+ All refs are updated or none are (all-or-nothing behavior).
+ * `print`: Output update-ref commands for pipeline use. This is the
+ traditional behavior where output can be piped to `git update-ref --stdin`.
+--
++
+The default mode can be configured via the `replay.refAction` configuration variable.
<revision-range>::
Range of commits to replay. More than one <revision-range> can
@@ -54,8 +66,11 @@ include::rev-list-options.adoc[]
OUTPUT
------
-When there are no conflicts, the output of this command is usable as
-input to `git update-ref --stdin`. It is of the form:
+By default, or with `--ref-action=update`, this command produces no output on
+success, as refs are updated directly using an atomic transaction.
+
+When using `--ref-action=print`, the output is usable as input to
+`git update-ref --stdin`. It is of the form:
update refs/heads/branch1 ${NEW_branch1_HASH} ${OLD_branch1_HASH}
update refs/heads/branch2 ${NEW_branch2_HASH} ${OLD_branch2_HASH}
@@ -81,6 +96,14 @@ To simply rebase `mybranch` onto `target`:
------------
$ git replay --onto target origin/main..mybranch
+------------
+
+The refs are updated atomically and no output is produced on success.
+
+To see what would be updated without actually updating:
+
+------------
+$ git replay --ref-action=print --onto target origin/main..mybranch
update refs/heads/mybranch ${NEW_mybranch_HASH} ${OLD_mybranch_HASH}
------------
@@ -88,33 +111,29 @@ To cherry-pick the commits from mybranch onto target:
------------
$ git replay --advance target origin/main..mybranch
-update refs/heads/target ${NEW_target_HASH} ${OLD_target_HASH}
------------
Note that the first two examples replay the exact same commits and on
top of the exact same new base, they only differ in that the first
-provides instructions to make mybranch point at the new commits and
-the second provides instructions to make target point at them.
+updates mybranch to point at the new commits and the second updates
+target to point at them.
What if you have a stack of branches, one depending upon another, and
you'd really like to rebase the whole set?
------------
$ git replay --contained --onto origin/main origin/main..tipbranch
-update refs/heads/branch1 ${NEW_branch1_HASH} ${OLD_branch1_HASH}
-update refs/heads/branch2 ${NEW_branch2_HASH} ${OLD_branch2_HASH}
-update refs/heads/tipbranch ${NEW_tipbranch_HASH} ${OLD_tipbranch_HASH}
------------
+All three branches (`branch1`, `branch2`, and `tipbranch`) are updated
+atomically.
+
When calling `git replay`, one does not need to specify a range of
commits to replay using the syntax `A..B`; any range expression will
do:
------------
$ git replay --onto origin/main ^base branch1 branch2 branch3
-update refs/heads/branch1 ${NEW_branch1_HASH} ${OLD_branch1_HASH}
-update refs/heads/branch2 ${NEW_branch2_HASH} ${OLD_branch2_HASH}
-update refs/heads/branch3 ${NEW_branch3_HASH} ${OLD_branch3_HASH}
------------
This will simultaneously rebase `branch1`, `branch2`, and `branch3`,
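
Note on `--ref-action=print` above: the printed lines are meant to be fed to `git update-ref --stdin`. The sketch below does not run `git replay` itself; it hand-writes one `update` line in the documented form and applies it, so the ref name `mybranch` and both commits are stand-ins created just for the example.

```shell
tmp=$(mktemp -d)
git init -q "$tmp"
# Hypothetical helper: run git in the scratch repo with a throwaway identity.
g() { git -C "$tmp" -c user.name=Example -c user.email=example@example.com "$@"; }

g commit -q --allow-empty -m old
old=$(g rev-parse HEAD)
g branch mybranch                  # mybranch points at the "old" commit
g commit -q --allow-empty -m new
new=$(g rev-parse HEAD)

# Stand-in for one line of print-mode output:
#   update <ref> <new-hash> <old-hash>
printf 'update refs/heads/mybranch %s %s\n' "$new" "$old" |
    g update-ref --stdin

g rev-parse mybranch
```

Including the old hash makes the update conditional: `update-ref` rejects the change if the ref no longer points at the expected commit.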
diff --git a/Documentation/git-repo.adoc b/Documentation/git-repo.adoc
index 209afd1b61..70f0a6d2e4 100644
--- a/Documentation/git-repo.adoc
+++ b/Documentation/git-repo.adoc
@@ -8,7 +8,8 @@ git-repo - Retrieve information about the repository
SYNOPSIS
--------
[synopsis]
-git repo info [--format=(keyvalue|nul)] [-z] [<key>...]
+git repo info [--format=(keyvalue|nul)] [-z] [--all | <key>...]
+git repo structure [--format=(table|keyvalue|nul)]
DESCRIPTION
-----------
@@ -18,13 +19,13 @@ THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.
COMMANDS
--------
-`info [--format=(keyvalue|nul)] [-z] [<key>...]`::
+`info [--format=(keyvalue|nul)] [-z] [--all | <key>...]`::
Retrieve metadata-related information about the current repository. Only
the requested data will be returned based on their keys (see "INFO KEYS"
section below).
+
The values are returned in the same order in which their respective keys were
-requested.
+requested. The `--all` flag requests the values for all the available keys.
+
The output format can be chosen through the flag `--format`. Two formats are
supported:
@@ -43,6 +44,35 @@ supported:
+
`-z` is an alias for `--format=nul`.
+`structure [--format=(table|keyvalue|nul)]`::
+ Retrieve statistics about the current repository structure. The
+ following kinds of information are reported:
++
+* Reference counts categorized by type
+* Reachable object counts categorized by type
+
++
+The output format can be chosen through the flag `--format`. Three formats are
+supported:
++
+`table`:::
+ Outputs repository stats in a human-friendly table. This format may
+ change and is not intended for machine parsing. This is the default
+ format.
+
+`keyvalue`:::
+ Each line of output contains a key-value pair for a repository stat.
+ The '=' character is used to delimit between the key and the value.
+ Values containing "unusual" characters are quoted as explained for the
+ configuration variable `core.quotePath` (see linkgit:git-config[1]).
+
+`nul`:::
+ Similar to `keyvalue`, but uses a NUL character to delimit between
+ key-value pairs instead of a newline. Also uses a newline character as
+ the delimiter between the key and value instead of '='. Unlike the
+ `keyvalue` format, values containing "unusual" characters are never
+ quoted.
+
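The two machine-readable formats described above can be consumed with a few lines of code. The following is a minimal sketch, not part of the patch: the function names are made up for illustration, the sample key names are hypothetical (this hunk does not list the keys `git repo structure` emits), and the `core.quotePath`-style unquoting of `keyvalue` values is deliberately left out.

```python
def parse_keyvalue(data: bytes) -> dict[str, str]:
    """Parse --format=keyvalue output: one 'key=value' pair per line."""
    pairs = {}
    for line in data.decode().split("\n"):
        if not line:
            continue
        # Split on the first '=' only; the value may itself contain '='.
        key, _, value = line.partition("=")
        pairs[key] = value  # quoted values are kept as-is (no unquoting here)
    return pairs


def parse_nul(data: bytes) -> dict[str, str]:
    """Parse --format=nul output: 'key LF value' records delimited by NUL."""
    pairs = {}
    for record in data.split(b"\0"):
        if not record:
            continue
        key, _, value = record.partition(b"\n")
        pairs[key.decode()] = value.decode()
    return pairs


# Hypothetical sample data; the key names are assumptions for illustration.
sample_keyvalue = b"references.branches.count=3\nobjects.commits.count=42\n"
sample_nul = b"references.branches.count\n3\0objects.commits.count\n42\0"
assert parse_keyvalue(sample_keyvalue) == parse_nul(sample_nul)
```

Because the `nul` format never quotes values, it is likely the safer choice for scripts that must handle unusual characters.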
INFO KEYS
---------
In order to obtain a set of values from `git repo info`, you should provide
diff --git a/Documentation/git-rev-parse.adoc b/Documentation/git-rev-parse.adoc
index cc32b4b4f0..5398691f3f 100644
--- a/Documentation/git-rev-parse.adoc
+++ b/Documentation/git-rev-parse.adoc
@@ -174,13 +174,13 @@ for another option.
Allow oids to be input from any object format that the current
repository supports.
-
- Specifying "sha1" translates if necessary and returns a sha1 oid.
-
- Specifying "sha256" translates if necessary and returns a sha256 oid.
-
- Specifying "storage" translates if necessary and returns an oid in
- encoded in the storage hash algorithm.
++
+Specifying "sha1" translates if necessary and returns a sha1 oid.
++
+Specifying "sha256" translates if necessary and returns a sha256 oid.
++
+Specifying "storage" translates if necessary and returns an oid
+encoded in the storage hash algorithm.
Options for Objects
~~~~~~~~~~~~~~~~~~~
@@ -324,11 +324,12 @@ The following options are unaffected by `--path-format`:
path of the current directory relative to the top-level
directory.
---show-object-format[=(storage|input|output)]::
- Show the object format (hash algorithm) used for the repository
- for storage inside the `.git` directory, input, or output. For
- input, multiple algorithms may be printed, space-separated.
- If not specified, the default is "storage".
+--show-object-format[=(storage|input|output|compat)]::
+ Show the object format (hash algorithm) used for the repository for storage
+ inside the `.git` directory, input, output, or compatibility. For input,
+ multiple algorithms may be printed, space-separated. If `compat` is
+ requested and no compatibility algorithm is enabled, prints an empty line. If
+ not specified, the default is "storage".
--show-ref-format::
Show the reference storage format used for the repository.
diff --git a/Documentation/git-shortlog.adoc b/Documentation/git-shortlog.adoc
index d8ab38dcc1..aa92800c69 100644
--- a/Documentation/git-shortlog.adoc
+++ b/Documentation/git-shortlog.adoc
@@ -44,8 +44,8 @@ OPTIONS
describe each commit. '<format>' can be any string accepted
by the `--format` option of 'git log', such as '* [%h] %s'.
(See the "PRETTY FORMATS" section of linkgit:git-log[1].)
-
- Each pretty-printed commit will be rewrapped before it is shown.
++
+Each pretty-printed commit will be rewrapped before it is shown.
--date=<format>::
Show dates formatted according to the given date string. (See
diff --git a/Documentation/git-sparse-checkout.adoc b/Documentation/git-sparse-checkout.adoc
index 529a8edd9c..0d1618f161 100644
--- a/Documentation/git-sparse-checkout.adoc
+++ b/Documentation/git-sparse-checkout.adoc
@@ -9,7 +9,7 @@ git-sparse-checkout - Reduce your working tree to a subset of tracked files
SYNOPSIS
--------
[verse]
-'git sparse-checkout' (init | list | set | add | reapply | disable | check-rules) [<options>]
+'git sparse-checkout' (init | list | set | add | reapply | disable | check-rules | clean) [<options>]
DESCRIPTION
@@ -111,6 +111,37 @@ flags, with the same meaning as the flags from the `set` command, in order
to change which sparsity mode you are using without needing to also respecify
all sparsity paths.
+'clean'::
+ Opportunistically remove files outside of the sparse-checkout
+ definition. This command requires cone mode to use recursive
+ directory matches to determine which files should be removed. A
+ file is considered for removal if it is contained within a tracked
+ directory that is outside of the sparse-checkout definition.
++
+Some special cases, such as merge conflicts or modified files outside of
+the sparse-checkout definition, could lead to keeping files that would
+otherwise be removed. Resolve conflicts, stage modifications, and use
+`git sparse-checkout reapply` in conjunction with `git sparse-checkout
+clean` to resolve these cases.
++
+This command can be used to ensure the sparse index works efficiently,
+though it does not require enabling the sparse index feature via the
+`index.sparse=true` configuration.
++
+To prevent accidental deletion of worktree files, the `clean` subcommand
+will not delete any files without the `-f` or `--force` option, unless
+the `clean.requireForce` config option is set to `false`.
++
+The `--dry-run` option will list the directories that would be removed
+without deleting them. Running in this mode can be helpful to predict the
+behavior of the clean command or to determine which kinds of files are left
+in the sparse directories.
++
+The `--verbose` option will list every file within the directories that
+are considered for removal. This option is helpful to determine if those
+files are actually important or perhaps to explain why the directory is
+still present despite the current sparse-checkout.
+
'disable'::
Disable the `core.sparseCheckout` config setting, and restore the
working directory to include all files.
@@ -264,34 +295,50 @@ patterns in non-cone mode has a number of shortcomings:
inconsistent.
* It has edge cases where the "right" behavior is unclear. Two examples:
-
- First, two users are in a subdirectory, and the first runs
- git sparse-checkout set '/toplevel-dir/*.c'
- while the second runs
- git sparse-checkout set relative-dir
- Should those arguments be transliterated into
- current/subdirectory/toplevel-dir/*.c
- and
- current/subdirectory/relative-dir
- before inserting into the sparse-checkout file? The user who typed
- the first command is probably aware that arguments to set/add are
- supposed to be patterns in non-cone mode, and probably would not be
- happy with such a transliteration. However, many gitignore-style
- patterns are just paths, which might be what the user who typed the
- second command was thinking, and they'd be upset if their argument
- wasn't transliterated.
-
- Second, what should bash-completion complete on for set/add commands
- for non-cone users? If it suggests paths, is it exacerbating the
- problem above? Also, if it suggests paths, what if the user has a
- file or directory that begins with either a '!' or '#' or has a '*',
- '\', '?', '[', or ']' in its name? And if it suggests paths, will
- it complete "/pro" to "/proc" (in the root filesystem) rather than to
- "/progress.txt" in the current directory? (Note that users are
- likely to want to start paths with a leading '/' in non-cone mode,
- for the same reason that .gitignore files often have one.)
- Completing on files or directories might give nasty surprises in
- all these cases.
++
+First, two users are in a subdirectory, and the first runs
++
+----
+git sparse-checkout set '/toplevel-dir/*.c'
+----
++
+while the second runs
++
+----
+git sparse-checkout set relative-dir
+----
++
+Should those arguments be transliterated into
++
+----
+current/subdirectory/toplevel-dir/*.c
+----
++
+and
++
+----
+current/subdirectory/relative-dir
+----
++
+before inserting into the sparse-checkout file? The user who typed
+the first command is probably aware that arguments to set/add are
+supposed to be patterns in non-cone mode, and probably would not be
+happy with such a transliteration. However, many gitignore-style
+patterns are just paths, which might be what the user who typed the
+second command was thinking, and they'd be upset if their argument
+wasn't transliterated.
++
+Second, what should bash-completion complete on for set/add commands
+for non-cone users? If it suggests paths, is it exacerbating the
+problem above? Also, if it suggests paths, what if the user has a
+file or directory that begins with either a '!' or '#' or has a '*',
+'\', '?', '[', or ']' in its name? And if it suggests paths, will
+it complete "/pro" to "/proc" (in the root filesystem) rather than to
+"/progress.txt" in the current directory? (Note that users are
+likely to want to start paths with a leading '/' in non-cone mode,
+for the same reason that .gitignore files often have one.)
+Completing on files or directories might give nasty surprises in
+all these cases.
* The excessive flexibility made other extensions essentially
impractical. `--sparse-index` is likely impossible in non-cone
diff --git a/Documentation/git-tag.adoc b/Documentation/git-tag.adoc
index 0f7badc116..cea3202fdb 100644
--- a/Documentation/git-tag.adoc
+++ b/Documentation/git-tag.adoc
@@ -3,7 +3,7 @@ git-tag(1)
NAME
----
-git-tag - Create, list, delete or verify a tag object signed with GPG
+git-tag - Create, list, delete or verify tags
SYNOPSIS
@@ -38,15 +38,17 @@ and `-a`, `-s`, and `-u <key-id>` are absent, `-a` is implied.
Otherwise, a tag reference that points directly at the given object
(i.e., a lightweight tag) is created.
-A GnuPG signed tag object will be created when `-s` or `-u
-<key-id>` is used. When `-u <key-id>` is not used, the
-committer identity for the current user is used to find the
-GnuPG key for signing. The configuration variable `gpg.program`
-is used to specify custom GnuPG binary.
+A cryptographically signed tag object will be created when `-s` or
+`-u <key-id>` is used. The signing backend (GPG, X.509, SSH, etc.) is
+controlled by the `gpg.format` configuration variable, defaulting to
+OpenPGP. When `-u <key-id>` is not used, the committer identity for
+the current user is used to find the key for signing. The
+configuration variable `gpg.program` is used to specify a custom
+signing binary.
Tag objects (created with `-a`, `-s`, or `-u`) are called "annotated"
tags; they contain a creation date, the tagger name and e-mail, a
-tagging message, and an optional GnuPG signature. Whereas a
+tagging message, and an optional cryptographic signature. Whereas a
"lightweight" tag is simply a name for an object (usually a commit
object).
@@ -64,10 +66,12 @@ OPTIONS
`-s`::
`--sign`::
- Make a GPG-signed tag, using the default e-mail address's key.
- The default behavior of tag GPG-signing is controlled by `tag.gpgSign`
- configuration variable if it exists, or disabled otherwise.
- See linkgit:git-config[1].
+ Make a cryptographically signed tag, using the default signing
+ key. The signing backend used depends on the `gpg.format`
+ configuration variable. The default key is determined by the
+ backend. For GPG, it's based on the committer's email address,
+ while for SSH it may be a specific key file or agent
+ identity. See linkgit:git-config[1].
`--no-sign`::
Override `tag.gpgSign` configuration variable that is
@@ -75,7 +79,10 @@ OPTIONS
`-u <key-id>`::
`--local-user=<key-id>`::
- Make a GPG-signed tag, using the given key.
+ Make a cryptographically signed tag using the given key. The
+ format of the <key-id> and the backend used depend on the
+ `gpg.format` configuration variable. See
+ linkgit:git-config[1].
`-f`::
`--force`::
@@ -87,7 +94,7 @@ OPTIONS
`-v`::
`--verify`::
- Verify the GPG signature of the given tag names.
+ Verify the cryptographic signature of the given tags.
`-n<num>`::
_<num>_ specifies how many lines from the annotation, if any,
@@ -235,12 +242,23 @@ it in the repository configuration as follows:
-------------------------------------
[user]
- signingKey = <gpg-key-id>
+ signingKey = <key-id>
-------------------------------------
+The signing backend can be chosen via the `gpg.format` configuration
+variable, which defaults to `openpgp`. See linkgit:git-config[1]
+for a list of other supported formats.
+
+The path to the program used for each signing backend can be specified
+with the `gpg.<format>.program` configuration variable. For the
+`openpgp` backend, `gpg.program` can be used as a synonym for
+`gpg.openpgp.program`. See linkgit:git-config[1] for details.
+
`pager.tag` is only respected when listing tags, i.e., when `-l` is
used or implied. The default is to use a pager.
-See linkgit:git-config[1].
+
+See linkgit:git-config[1] for more details and other configuration
+variables.
DISCUSSION
----------
diff --git a/Documentation/git-worktree.adoc b/Documentation/git-worktree.adoc
index f272f79783..0f82ec5439 100644
--- a/Documentation/git-worktree.adoc
+++ b/Documentation/git-worktree.adoc
@@ -79,6 +79,9 @@ with a matching name, treat as equivalent to:
$ git worktree add --track -b <branch> <path> <remote>/<branch>
------------
+
+For best results it is advised to specify _<path>_ outside of the repository
+and existing worktrees; see <<EXAMPLES,EXAMPLES>>.
++
If the branch exists in multiple remotes and one of them is named by
the `checkout.defaultRemote` configuration variable, we'll use that
one for the purposes of disambiguation, even if the _<branch>_ isn't
@@ -502,6 +505,7 @@ locked "reason\nwhy is locked"
...
------------
+[[EXAMPLES]]
EXAMPLES
--------
You are in the middle of a refactoring session and your boss comes in and
@@ -522,6 +526,16 @@ $ popd
$ git worktree remove ../temp
------------
+Side-by-side branch checkouts for a repository using multiple worktrees:
+
+------------
+mkdir some-repository
+cd some-repository
+git clone --bare gitforge@someforge.example.com:some-org/some-repository some-repository.git
+git --git-dir=some-repository.git worktree add some-branch
+git --git-dir=some-repository.git worktree add another-branch
+------------
+
CONFIGURATION
-------------
diff --git a/Documentation/gitcli.adoc b/Documentation/gitcli.adoc
index ef2a0a399d..6815d6bfb7 100644
--- a/Documentation/gitcli.adoc
+++ b/Documentation/gitcli.adoc
@@ -223,7 +223,7 @@ Options that take a filename allow a prefix `:(optional)`. For example:
----------------------------
git commit -F :(optional)COMMIT_EDITMSG
-# if COMMIT_EDITMSG does not exist, equivalent to
+# if COMMIT_EDITMSG does not exist, the above is equivalent to
git commit
----------------------------
diff --git a/Documentation/gitdatamodel.adoc b/Documentation/gitdatamodel.adoc
new file mode 100644
index 0000000000..7d8d707d2d
--- /dev/null
+++ b/Documentation/gitdatamodel.adoc
@@ -0,0 +1,302 @@
+gitdatamodel(7)
+===============
+
+NAME
+----
+gitdatamodel - Git's core data model
+
+SYNOPSIS
+--------
+gitdatamodel
+
+DESCRIPTION
+-----------
+
+It's not necessary to understand Git's data model to use Git, but it's
+very helpful when reading Git's documentation so that you know what it
+means when the documentation says "object", "reference" or "index".
+
+Git's core operations use 4 kinds of data:
+
+1. <<object,Objects>>: commits, trees, blobs, and tag objects
+2. <<references,References>>: branches, tags,
+ remote-tracking branches, etc
+3. <<index,The index>>, also known as the staging area
+4. <<reflogs,Reflogs>>: logs of changes to references ("ref log")
+
+[[object]]
+OBJECTS
+-------
+
+All of the commits and files in a Git repository are stored as "Git objects".
+Git objects never change after they're created, and every object has an ID,
+like `1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a`.
+
+This means that if you have an object's ID, you can always recover its
+exact contents as long as the object hasn't been deleted.
+
+Every object has:
+
+[[object-id]]
+1. an *ID* (aka "object name"), which is a cryptographic hash of its
+ type and contents.
+ It's fast to look up a Git object using its ID.
+ This is usually represented in hexadecimal, like
+ `1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a`.
+2. a *type*. There are 4 types of objects:
+ <<commit,commits>>, <<tree,trees>>, <<blob,blobs>>,
+ and <<tag-object,tag objects>>.
+3. *contents*. The structure of the contents depends on the type.
+
+Here's how each type of object is structured:
+
+[[commit]]
+commit::
+ A commit contains these required fields
+ (though there are other optional fields):
++
+1. The full directory structure of all the files in that version of the
+ repository and each file's contents, stored as the *<<tree,tree>>* ID
+ of the commit's base directory
+2. Its *parent commit ID(s)*. The first commit in a repository has 0 parents,
+ regular commits have 1 parent, merge commits have 2 or more parents
+3. An *author* and the time the commit was authored
+4. A *committer* and the time the commit was committed
+5. A *commit message*
++
+Here's how an example commit is stored:
++
+----
+tree 1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a
+parent 4ccb6d7b8869a86aae2e84c56523f8705b50c647
+author Maya <maya@example.com> 1759173425 -0400
+committer Maya <maya@example.com> 1759173425 -0400
+
+Add README
+----
++
+Like all other objects, commits can never be changed after they're created.
+For example, "amending" a commit with `git commit --amend` creates a new
+commit with the same parent.
++
+Git does not store the diff for a commit: when you ask Git to show
+the commit with linkgit:git-show[1], it calculates the diff from its
+parent on the fly.
+
+[[tree]]
+tree::
+ A tree is how Git represents a directory.
+ It can contain files or other trees (which are subdirectories).
+ It lists, for each item in the tree:
++
+1. The *filename*, for example `hello.py`
+2. The *file mode*. These are all of the file modes in Git.
+ They're only spiritually related to Unix file modes.
++
+ - `100644`: regular file (with <<object,object type>> `blob`)
+ - `100755`: executable file (with type `blob`)
+ - `120000`: symbolic link (with type `blob`)
+ - `040000`: directory (with type `tree`)
+ - `160000`: gitlink, for use with submodules (with type `commit`)
+
+3. The <<object-id,*object ID*>> with the contents of the file or directory
++
+For example, this is how a tree containing one directory (`src`) and one file
+(`README.md`) is stored:
++
+----
+100644 blob 8728a858d9d21a8c78488c8b4e70e531b659141f README.md
+040000 tree 89b1d2e0495f66d6929f4ff76ff1bb07fc41947d src
+----
+
+[[blob]]
+blob::
+ A blob object contains a file's contents.
++
+When you make a commit, Git stores the full contents of each file that
+you changed as a blob.
+For example, if you have a commit that changes 2 files in a repository
+with 1000 files, that commit will create 2 new blobs, and use the
+previous blob ID for the other 998 files.
+This means that commits can use relatively little disk space even in a
+very large repository.
+
+[[tag-object]]
+tag object::
+ Tag objects contain these required fields
+ (though there are other optional fields):
++
+1. The *ID* of the object it references
+2. The *type* of the object it references
+3. The *tagger* and tag date
+4. A *tag message*, similar to a commit message
+
+Here's how an example tag object is stored:
+
+----
+object 750b4ead9c87ceb3ddb7a390e6c7074521797fb3
+type commit
+tag v1.0.0
+tagger Maya <maya@example.com> 1759927359 -0400
+
+Release version 1.0.0
+----
+
+NOTE: All of the examples in this section were generated with
+`git cat-file -p <object-id>`.
+
+[[references]]
+REFERENCES
+----------
+
+References are a way to give a name to a commit.
+It's easier to remember "the changes I'm working on are on the `turtle`
+branch" than "the changes are in commit bb69721404348e".
+Git often uses "ref" as shorthand for "reference".
+
+References can either refer to:
+
+1. An object ID, usually a <<commit,commit>> ID
+2. Another reference. This is called a "symbolic reference"
+
+References are stored in a hierarchy, and Git handles references
+differently based on where they are in the hierarchy.
+Most references are under `refs/`. Here are the main types:
+
+[[branch]]
+branches: `refs/heads/<name>`::
+ A branch refers to a commit ID.
+ That commit is the latest commit on the branch.
++
+To get the history of commits on a branch, Git will start at the commit
+ID the branch references, and then look at the commit's parent(s),
+the parent's parent, etc.
+
+[[tag]]
+tags: `refs/tags/<name>`::
+ A tag refers to a commit ID, tag object ID, or other object ID.
+ There are two types of tags:
+ 1. "Annotated tags", which reference a <<tag-object,tag object>> ID
+ which contains a tag message
+ 2. "Lightweight tags", which reference a commit, blob, or tree ID
+ directly
++
+Even though branches and tags can both refer to a commit ID, Git
+treats them very differently.
+Branches are expected to change over time: when you make a commit, Git
+will update your <<HEAD,current branch>> to point to the new commit.
+Tags are usually not changed after they're created.
+
+[[HEAD]]
+HEAD: `HEAD`::
+ `HEAD` is where Git stores your current <<branch,branch>>,
+ if there is a current branch. `HEAD` can either be:
++
+1. A symbolic reference to your current branch, for example `ref:
+ refs/heads/main` if your current branch is `main`.
+2. A direct reference to a commit ID. In this case there is no current branch.
+ This is called "detached HEAD state", see the DETACHED HEAD section
+ of linkgit:git-checkout[1] for more.
+
+[[remote-tracking-branch]]
+remote-tracking branches: `refs/remotes/<remote>/<branch>`::
+ A remote-tracking branch refers to a commit ID.
+ It's how Git stores the last-known state of a branch in a remote
+ repository. `git fetch` updates remote-tracking branches. When
+ `git status` says "you're up to date with origin/main", it's looking at
+ this.
++
+`refs/remotes/<remote>/HEAD` is a symbolic reference to the remote's
+default branch. This is the branch that `git clone` checks out by default.
+
+[[other-refs]]
+Other references::
+ Git tools may create references anywhere under `refs/`.
+ For example, linkgit:git-stash[1], linkgit:git-bisect[1],
+ and linkgit:git-notes[1] all create their own references
+ in `refs/stash`, `refs/bisect`, etc.
+ Third-party Git tools may also create their own references.
++
+Git may also create references other than `HEAD` at the base of the
+hierarchy, like `ORIG_HEAD`.
+
+NOTE: Git may delete objects that aren't "reachable" from any reference
+or <<reflogs,reflog>>.
+An object is "reachable" if we can find it by following tags to whatever
+they tag, commits to their parents or trees, and trees to the trees or
+blobs that they contain.
+For example, if you amend a commit with `git commit --amend`,
+there will no longer be a branch that points at the old commit.
+The old commit is recorded in the current branch's <<reflogs,reflog>>,
+so it is still "reachable", but when the reflog entry expires it may
+become unreachable and get deleted.
+
+Reachable objects will never be deleted.
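The reachability rule in the note above can be illustrated with a toy object graph. This is only a sketch under stated assumptions: the object names (`c2`, `t2-old`, and so on) are invented, each object is reduced to the list of IDs it refers to, and nothing here reflects how Git actually implements garbage collection.

```python
def reachable(objects: dict[str, list[str]], roots) -> set[str]:
    """Return every object ID that can be found by walking from the roots."""
    seen = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        # Tags lead to what they tag, commits to parents and trees,
        # trees to the trees and blobs they contain.
        stack.extend(objects[oid])
    return seen


# Toy graph: "c2-old" is the commit that was replaced by `git commit --amend`.
objects = {
    "tag1": ["c2"],              # tag object -> commit
    "c2": ["c1", "t2"],          # commit -> parent, tree
    "c2-old": ["c1", "t2-old"],  # amended-away commit
    "c1": ["t1"],
    "t1": ["b1"],                # tree -> blob
    "t2": ["b1", "b2"],
    "t2-old": ["b1", "b3"],
    "b1": [], "b2": [], "b3": [],
}
refs = {"refs/heads/main": "c2", "refs/tags/v1.0.0": "tag1"}
reflog = ["c2-old"]  # the old commit is still recorded in the reflog

from_refs = reachable(objects, refs.values())
from_refs_and_reflog = reachable(objects, list(refs.values()) + reflog)

print(sorted(set(objects) - from_refs))             # candidates once the reflog entry expires
print(sorted(set(objects) - from_refs_and_reflog))  # nothing is deletable yet
```

Note that blob `b1` stays reachable in both cases because the surviving trees still refer to it; only objects referenced exclusively by the old commit become candidates for deletion.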
+
+[[index]]
+THE INDEX
+---------
+The index, also known as the "staging area", is a list of files and
+the contents of each file, stored as a <<blob,blob>>.
+You can add files to the index or update the contents of a file in the
+index with linkgit:git-add[1]. This is called "staging" the file for commit.
+
+Unlike a <<tree,tree>>, the index is a flat list of files.
+When you commit, Git converts the list of files in the index to a
+directory <<tree,tree>> and uses that tree in the new <<commit,commit>>.
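To make the flat-list-to-tree conversion concrete, here is a rough sketch that nests flat index paths into directories. It is an illustration only: the function name is made up, the entries reuse the paths and IDs from the `git ls-files --stage` example in this file, and real Git additionally writes each directory as a tree object with its own ID, which this sketch skips.

```python
def index_to_tree(entries):
    """Nest flat (mode, object ID, path) index entries into a directory tree."""
    root = {}
    for mode, oid, path in entries:
        *dirs, name = path.split("/")
        node = root
        for directory in dirs:
            # Each path component before the file name becomes a subtree.
            node = node.setdefault(directory, {})
        node[name] = (mode, oid)
    return root


index = [
    ("100644", "8728a858d9d21a8c78488c8b4e70e531b659141f", "README.md"),
    ("100644", "665c637a360874ce43bf74018768a96d2d4d219a", "src/hello.py"),
]
tree = index_to_tree(index)
print(sorted(tree))         # top-level names: one file and one directory
print(sorted(tree["src"]))  # contents of the nested "src" tree
```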
+
+Each index entry has 4 fields:
+
+1. The *file mode*, which must be one of:
+ - `100644`: regular file (with <<object,object type>> `blob`)
+ - `100755`: executable file (with type `blob`)
+ - `120000`: symbolic link (with type `blob`)
+ - `160000`: gitlink, for use with submodules (with type `commit`)
+2. The *<<blob,blob>>* ID of the file,
+ or (rarely) the *<<commit,commit>>* ID of the submodule
+3. The *stage number*, either 0, 1, 2, or 3. This is normally 0, but if
+ there's a merge conflict there can be multiple versions of the same
+ filename in the index.
+4. The *file path*, for example `src/hello.py`
+
+It's extremely uncommon to look at the index directly: normally you'd
+run `git status` to see a list of changes between the index and <<HEAD,HEAD>>.
+But you can use `git ls-files --stage` to see the index.
+Here's the output of `git ls-files --stage` in a repository with 2 files:
+
+----
+100644 8728a858d9d21a8c78488c8b4e70e531b659141f 0 README.md
+100644 665c637a360874ce43bf74018768a96d2d4d219a 0 src/hello.py
+----
+
+[[reflogs]]
+REFLOGS
+-------
+
+Every time a branch, remote-tracking branch, or HEAD is updated, Git
+updates a log called a "reflog" for that <<references,reference>>.
+This means that if you make a mistake and "lose" a commit, you can
+generally recover the commit ID by running `git reflog <reference>`.
+
+A reflog is a list of log entries. Each entry has:
+
+1. The *commit ID*
+2. *Timestamp* when the change was made
+3. *Log message*, for example `pull: Fast-forward`
+
+Reflogs only log changes made in your local repository.
+They are not shared with remotes.
+
+You can view a reflog with `git reflog <reference>`.
+For example, here's the reflog for a `main` branch which has changed twice:
+
+----
+$ git reflog main --date=iso --no-decorate
+750b4ea main@{2025-09-29 15:17:05 -0400}: commit: Add README
+4ccb6d7 main@{2025-09-29 15:16:48 -0400}: commit (initial): Initial commit
+----
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Documentation/gitformat-loose.adoc b/Documentation/gitformat-loose.adoc
new file mode 100644
index 0000000000..4850c91669
--- /dev/null
+++ b/Documentation/gitformat-loose.adoc
@@ -0,0 +1,157 @@
+gitformat-loose(5)
+==================
+
+NAME
+----
+gitformat-loose - Git loose object format
+
+
+SYNOPSIS
+--------
+[verse]
+$GIT_DIR/objects/[0-9a-f][0-9a-f]/*
+$GIT_DIR/objects/loose-object-idx
+$GIT_DIR/objects/loose-map/map-*.map
+
+DESCRIPTION
+-----------
+
+Loose objects are how Git stores individual objects, where every object is
+written as a separate file.
+
+Over the lifetime of a repository, objects are usually written as loose objects
+initially. Eventually, these loose objects will be compacted into packfiles
+via repository maintenance to improve disk space usage and speed up the lookup
+of these objects.
+
+== Loose objects
+
+Each loose object contains a prefix, followed immediately by the data of the
+object. The prefix contains `<type> <size>\0`. `<type>` is one of `blob`,
+`tree`, `commit`, or `tag` and `<size>` is the size of the data (without the
+prefix) as a decimal integer expressed in ASCII.
+
+The entire contents, prefix and data concatenated, is then compressed with zlib
+and the compressed data is stored in the file. The object ID of the object is
+the SHA-1 or SHA-256 (as appropriate) hash of the uncompressed data.
+
+The file for the loose object is stored under the `objects` directory, with the
+first two hex characters of the object ID being the directory and the remaining
+characters being the file name. This is done to shard the data and avoid too
+many files being in one directory, since some file systems perform poorly with
+many items in a directory.
+
+As an example, the empty tree contains the data (when uncompressed) `tree 0\0`
+and, in a SHA-256 repository, would have the object ID
+`6ef19b41225c5369f1c104d45d8d85efa9b057b53b14b4b9b939dd74decc5321` and would be
+stored under
+`$GIT_DIR/objects/6e/f19b41225c5369f1c104d45d8d85efa9b057b53b14b4b9b939dd74decc5321`.
+
+Similarly, a blob containing the contents `abc` would have the uncompressed
+data of `blob 3\0abc`.
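The encoding described above is small enough to reproduce by hand. The following sketch is not Git's implementation; the function name and return shape are invented for illustration. It builds the prefix, hashes the uncompressed bytes, derives the sharded path, and compresses the result with zlib.

```python
import hashlib
import zlib


def loose_object(obj_type: str, data: bytes, algo: str = "sha1"):
    """Return (object ID, path relative to $GIT_DIR, compressed file contents)."""
    # Prefix is "<type> <size>\0" with the size as ASCII decimal.
    prefix = f"{obj_type} {len(data)}".encode("ascii") + b"\0"
    raw = prefix + data
    # The object ID is the hash of the uncompressed prefix plus data.
    oid = hashlib.new(algo, raw).hexdigest()
    # The first two hex characters name the directory, the rest the file.
    path = f"objects/{oid[:2]}/{oid[2:]}"
    return oid, path, zlib.compress(raw)


# The empty tree in a SHA-256 repository, as in the example above.
oid, path, compressed = loose_object("tree", b"", "sha256")
print(oid)
print(path)

# A blob containing "abc": the uncompressed data is b"blob 3\0abc".
blob_oid, blob_path, blob_file = loose_object("blob", b"abc")
print(zlib.decompress(blob_file))
```

The compressed bytes may differ from what Git writes, since zlib output depends on the compression level; the object ID does not, because it is computed before compression.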
+
+== Loose object mapping
+
+When the `compatObjectFormat` option is used, Git needs to store a mapping
+between the repository's main algorithm and the compatibility algorithm. There
+are two formats for this: the legacy mapping and the modern mapping.
+
+=== Legacy mapping
+
+The compatibility mapping is stored in a file called
+`$GIT_DIR/objects/loose-object-idx`. The format of this file looks like this:
+
+ # loose-object-idx
+ (main-name SP compat-name LF)*
+
+`main-name` refers to the hexadecimal object ID of the object in the main
+repository format and `compat-name` refers to the same thing, but for the
+compatibility format.
+
+This format is read if it exists but is not written.
+
+Note that carriage returns are not permitted in this file, regardless of the
+host system or configuration.
+
+=== Modern mapping
+
+The modern mapping consists of a set of files under `$GIT_DIR/objects/loose`
+ending in `.map`. The portion of the filename before the extension is that of
+the hash checksum in hex format.
+
+`git pack-objects` will repack existing entries into one file, removing any
+unnecessary objects, such as obsolete shallow entries or loose objects that
+have been packed.
+
+==== Mapping file format
+
+- A header appears at the beginning and consists of the following:
+ * A 4-byte mapping signature: `LMAP`
+ * 4-byte version number: 1
+ * 4-byte length of the header section.
+ * 4-byte number of objects declared in this map file.
+ * 4-byte number of object formats declared in this map file.
+ * For each object format:
+ ** 4-byte format identifier (e.g., `sha1` for SHA-1)
+ ** 4-byte length in bytes of shortened object names. This is the
+ shortest possible length needed to make names in the shortened
+ object name table unambiguous.
+ ** 8-byte integer, recording where tables relating to this format
+ are stored in this index file, as an offset from the beginning.
+ * 8-byte offset to the trailer from the beginning of this file.
+ * Zero or more additional key/value pairs (4-byte key, 4-byte value), which
+ may optionally declare one or more chunks. No chunks are currently
+ defined. Readers must ignore unrecognized keys.
+- Zero or more NUL bytes. These are used to improve the alignment of the
+ 4-byte quantities below.
+- Tables for the first object format:
+ * A sorted table of shortened object names. These are prefixes of the names
+ of all objects in this file, packed together without offset values to
+ reduce the cache footprint of the binary search for a specific object name.
+ * A sorted table of full object names.
+ * A table of 4-byte metadata values.
+ * Zero or more chunks. A chunk starts with a four-byte chunk identifier and
+ a four-byte parameter (which, if unneeded, is all zeros) and an eight-byte
+ size (not including the identifier, parameter, or size), plus the chunk
+ data.
+- Zero or more NUL bytes.
+- Tables for subsequent object formats:
+ * A sorted table of shortened object names. These are prefixes of the names
+ of all objects in this file, packed together without offset values to
+ reduce the cache footprint of the binary search for a specific object name.
+ * A table of full object names in the order specified by the first object format.
+ * A table of 4-byte values mapping object name order to the order of the
+ first object format. For an object in the table of sorted shortened object
+ names, the value at the corresponding index in this table is the index in
+ the previous table for that same object.
+ * Zero or more NUL bytes.
+- The trailer consists of the following:
+ * Hash checksum of all of the above.
+
+The lower six bits of each value in the metadata table contain a type field indicating the
+reason that this object is stored:
+
+0::
+ Reserved.
+1::
+ This object is stored as a loose object in the repository.
+2::
+ This object is a shallow entry. The mapping refers to a shallow value
+ returned by a remote server.
+3::
+ This object is a submodule entry. The mapping refers to the commit stored
+ representing a submodule.
+
+Other data may be stored in this field in the future. Bits that are not used
+must be zero.
+
+All 4-byte numbers are in network order and must be 4-byte aligned in the file,
+so the NUL padding may be required in some cases.
+
+Note that the hash at the end of the file is in whatever the repository's main
+algorithm is. In the usual case when there are multiple algorithms, the main
+algorithm will be SHA-256 and the compatibility algorithm will be SHA-1.
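As a reading aid for the header layout listed above, the sketch below unpacks the fixed part of a mapping file header. It rests on assumptions that should be checked against the implementation: the function name and dictionary keys are invented, the 8-byte offsets are assumed to be big-endian like the 4-byte numbers, and the optional key/value pairs, tables, and trailer are not parsed.

```python
import struct


def parse_map_header(buf: bytes) -> dict:
    """Parse the fixed header fields of a loose object map file (sketch)."""
    # Signature, version, header length, object count, format count.
    signature, version, header_len, num_objects, num_formats = struct.unpack_from(
        ">4sIIII", buf, 0
    )
    if signature != b"LMAP":
        raise ValueError("not a loose object map file")
    if version != 1:
        raise ValueError(f"unsupported version {version}")

    offset = 20
    formats = []
    for _ in range(num_formats):
        # 4-byte format identifier, 4-byte shortened name length,
        # 8-byte offset of this format's tables (assumed big-endian).
        format_id, short_len, tables_offset = struct.unpack_from(">4sIQ", buf, offset)
        formats.append(
            {
                "id": format_id.decode("ascii"),
                "short_len": short_len,
                "tables_offset": tables_offset,
            }
        )
        offset += 16

    (trailer_offset,) = struct.unpack_from(">Q", buf, offset)
    return {
        "header_len": header_len,
        "num_objects": num_objects,
        "formats": formats,
        "trailer_offset": trailer_offset,
    }


# Synthetic header with a single `sha1` format, built only for illustration.
example = (
    struct.pack(">4sIIII", b"LMAP", 1, 44, 2, 1)
    + struct.pack(">4sIQ", b"sha1", 4, 48)
    + struct.pack(">Q", 200)
)
print(parse_map_header(example))
```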
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Documentation/gitformat-pack.adoc b/Documentation/gitformat-pack.adoc
index d6ae229be5..1b4db4aa61 100644
--- a/Documentation/gitformat-pack.adoc
+++ b/Documentation/gitformat-pack.adoc
@@ -32,6 +32,10 @@ In a repository using the traditional SHA-1, pack checksums, index checksums,
and object IDs (object names) mentioned below are all computed using SHA-1.
Similarly, in SHA-256 repositories, these values are computed using SHA-256.
+CRC32 checksums are always computed over the entire packed object, including
+the header (n-byte type and length); the base object name or offset, if any;
+and the entire compressed object. The CRC32 algorithm used is that of zlib.
+
== pack-*.pack files have the following format:
- A header appears at the beginning and consists of the following:
@@ -80,6 +84,16 @@ Valid object types are:
Type 5 is reserved for future expansion. Type 0 is invalid.
+=== Object encoding
+
+Unlike loose objects, packed objects do not have a prefix containing the type,
+size, and a NUL byte. These are unnecessary because they can be determined from
+the n-byte type and length that prefix the data, so they are omitted from the
+compressed and deltified data.
+
+The computation of the object ID still uses this prefix by reconstructing it
+from the type and length as needed.
+
=== Size encoding
This document uses the following "size encoding" of non-negative
@@ -92,6 +106,11 @@ values are more significant.
This size encoding should not be confused with the "offset encoding",
which is also used in this document.
+When encoding the size of an undeltified object in a pack, the size is that of
+the uncompressed raw object. For deltified objects, it is the size of the
+uncompressed delta. The base object name or offset is not included in the size
+computation.
+
=== Deltified representation
Conceptually there are only four object types: commit, tree, tag and
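
Editorial sketch (not part of the patch): the gitformat-pack.adoc additions above say that the object ID is still computed over the reconstructed "type, size, NUL" prefix, and that CRC32 values use zlib's algorithm over the whole packed entry. A rough illustration, assuming the caller already has the uncompressed object data and the raw bytes of a pack entry; `object_id` and `packed_entry_crc32` are hypothetical names, not Git functions.

```python
import hashlib
import zlib


def object_id(obj_type, raw, algo="sha1"):
    """Hash the reconstructed loose-object prefix followed by the raw data."""
    header = b"%s %d\x00" % (obj_type.encode("ascii"), len(raw))
    return hashlib.new(algo, header + raw).hexdigest()


def packed_entry_crc32(entry_bytes):
    """CRC32 (zlib variant) over a whole packed entry.

    `entry_bytes` is assumed to hold the n-byte type and length header, the
    base object name or offset if any, and the compressed data, in that order.
    """
    return zlib.crc32(entry_bytes) & 0xFFFFFFFF
```

For example, `object_id("blob", b"")` gives the well-known empty-blob name `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`; passing `algo="sha256"` would correspond to a SHA-256 repository.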
diff --git a/Documentation/gitignore.adoc b/Documentation/gitignore.adoc
index 5e0964ef41..9fccab4ae8 100644
--- a/Documentation/gitignore.adoc
+++ b/Documentation/gitignore.adoc
@@ -111,6 +111,11 @@ PATTERN FORMAT
one of the characters in a range. See fnmatch(3) and the
FNM_PATHNAME flag for a more detailed description.
+ - A backslash ("`\`") can be used to escape any character. E.g., "`\*`"
+ matches a literal asterisk (and "`\a`" matches "`a`", even though
+ there is no need for escaping there). As with fnmatch(3), a backslash
+ at the end of a pattern is an invalid pattern that never matches.
+
Two consecutive asterisks ("`**`") in patterns matched against
full pathname may have special meaning:
diff --git a/Documentation/gitprotocol-http.adoc b/Documentation/gitprotocol-http.adoc
index d024010414..e2ef7f0459 100644
--- a/Documentation/gitprotocol-http.adoc
+++ b/Documentation/gitprotocol-http.adoc
@@ -443,7 +443,8 @@ If no "want" objects are received, send an error:
TODO: Define error if no "want" lines are requested.
If any "want" object is not reachable, send an error:
-TODO: Define error if an invalid "want" is requested.
+When a Git server receives an invalid or malformed `want` line, it
+responds with an error message that includes the offending object name.
Create an empty list, `s_common`.
diff --git a/Documentation/glossary-content.adoc b/Documentation/glossary-content.adoc
index e423e4765b..20ba121314 100644
--- a/Documentation/glossary-content.adoc
+++ b/Documentation/glossary-content.adoc
@@ -297,8 +297,8 @@ This commit is referred to as a "merge commit", or sometimes just a
identified by its <<def_object_name,object name>>. The objects usually
live in `$GIT_DIR/objects/`.
-[[def_object_identifier]]object identifier (oid)::
- Synonym for <<def_object_name,object name>>.
+[[def_object_identifier]]object identifier, object ID, oid::
+ Synonyms for <<def_object_name,object name>>.
[[def_object_name]]object name::
The unique identifier of an <<def_object,object>>. The
diff --git a/Documentation/howto/meson.build b/Documentation/howto/meson.build
index ece20244af..16b9056f24 100644
--- a/Documentation/howto/meson.build
+++ b/Documentation/howto/meson.build
@@ -35,7 +35,7 @@ doc_targets += custom_target(
output: 'howto-index.html',
depends: documentation_deps,
install: true,
- install_dir: get_option('datadir') / 'doc/git-doc',
+ install_dir: htmldir,
)
foreach howto : howto_sources
@@ -57,6 +57,6 @@ foreach howto : howto_sources
output: fs.stem(howto_stripped.full_path()) + '.html',
depends: documentation_deps,
install: true,
- install_dir: get_option('datadir') / 'doc/git-doc/howto',
+ install_dir: htmldir / 'howto',
)
endforeach
diff --git a/Documentation/meson.build b/Documentation/meson.build
index 44f94cdb7b..fd2e8cc02d 100644
--- a/Documentation/meson.build
+++ b/Documentation/meson.build
@@ -64,6 +64,7 @@ manpages = {
'git-gui.adoc' : 1,
'git-hash-object.adoc' : 1,
'git-help.adoc' : 1,
+ 'git-history.adoc' : 1,
'git-hook.adoc' : 1,
'git-http-backend.adoc' : 1,
'git-http-fetch.adoc' : 1,
@@ -173,6 +174,7 @@ manpages = {
'gitformat-chunk.adoc' : 5,
'gitformat-commit-graph.adoc' : 5,
'gitformat-index.adoc' : 5,
+ 'gitformat-loose.adoc' : 5,
'gitformat-pack.adoc' : 5,
'gitformat-signature.adoc' : 5,
'githooks.adoc' : 5,
@@ -192,6 +194,7 @@ manpages = {
'gitcore-tutorial.adoc' : 7,
'gitcredentials.adoc' : 7,
'gitcvs-migration.adoc' : 7,
+ 'gitdatamodel.adoc' : 7,
'gitdiffcore.adoc' : 7,
'giteveryday.adoc' : 7,
'gitfaq.adoc' : 7,
@@ -411,7 +414,7 @@ foreach manpage, category : manpages
input: manpage,
output: fs.stem(manpage) + '.html',
install: true,
- install_dir: get_option('datadir') / 'doc/git-doc',
+ install_dir: htmldir,
)
endif
endforeach
@@ -422,7 +425,7 @@ if get_option('docs').contains('html')
output: 'docinfo.html',
copy: true,
install: true,
- install_dir: get_option('datadir') / 'doc/git-doc',
+ install_dir: htmldir,
)
configure_file(
@@ -430,11 +433,11 @@ if get_option('docs').contains('html')
output: 'docbook-xsl.css',
copy: true,
install: true,
- install_dir: get_option('datadir') / 'doc/git-doc',
+ install_dir: htmldir,
)
install_symlink('index.html',
- install_dir: get_option('datadir') / 'doc/git-doc',
+ install_dir: htmldir,
pointing_to: 'git.html',
)
@@ -465,7 +468,7 @@ if get_option('docs').contains('html')
input: 'docbook.xsl',
output: 'user-manual.html',
install: true,
- install_dir: get_option('datadir') / 'doc/git-doc',
+ install_dir: htmldir,
)
articles = [
@@ -491,7 +494,7 @@ if get_option('docs').contains('html')
output: fs.stem(article) + '.html',
depends: documentation_deps,
install: true,
- install_dir: get_option('datadir') / 'doc/git-doc',
+ install_dir: htmldir,
)
endforeach
diff --git a/Documentation/pull-fetch-param.adoc b/Documentation/pull-fetch-param.adoc
index d79d2f6065..bb2cf6a462 100644
--- a/Documentation/pull-fetch-param.adoc
+++ b/Documentation/pull-fetch-param.adoc
@@ -11,6 +11,7 @@ ifndef::git-pull[]
(See linkgit:git-config[1]).
endif::git-pull[]
+[[fetch-refspec]]
<refspec>::
Specifies which refs to fetch and which local refs to update.
When no <refspec>s appear on the command line, the refs to fetch
diff --git a/Documentation/technical/commit-graph.adoc b/Documentation/technical/commit-graph.adoc
index 2c26e95e51..a259d1567b 100644
--- a/Documentation/technical/commit-graph.adoc
+++ b/Documentation/technical/commit-graph.adoc
@@ -39,6 +39,7 @@ A consumer may load the following info for a commit from the graph:
Values 1-4 satisfy the requirements of parse_commit_gently().
There are two definitions of generation number:
+
1. Corrected committer dates (generation number v2)
2. Topological levels (generation number v1)
@@ -158,7 +159,8 @@ number of commits in the full history. By creating a "chain" of commit-graphs,
we enable fast writes of new commit data without rewriting the entire commit
history -- at least, most of the time.
-## File Layout
+File Layout
+~~~~~~~~~~~
A commit-graph chain uses multiple files, and we use a fixed naming convention
to organize these files. Each commit-graph file has a name
@@ -170,11 +172,11 @@ hashes for the files in order from "lowest" to "highest".
For example, if the `commit-graph-chain` file contains the lines
-```
+----
{hash0}
{hash1}
{hash2}
-```
+----
then the commit-graph chain looks like the following diagram:
@@ -213,7 +215,8 @@ specifying the hashes of all files in the lower layers. In the above example,
`graph-{hash1}.graph` contains `{hash0}` while `graph-{hash2}.graph` contains
`{hash0}` and `{hash1}`.
-## Merging commit-graph files
+Merging commit-graph files
+~~~~~~~~~~~~~~~~~~~~~~~~~~
If we only added a new commit-graph file on every write, we would run into a
linear search problem through many commit-graph files. Instead, we use a merge
@@ -225,6 +228,7 @@ is determined by the merge strategy that the files should collapse to
the commits in `graph-{hash1}` should be combined into a new `graph-{hash3}`
file.
+....
+---------------------+
| |
| (new commits) |
@@ -250,6 +254,7 @@ file.
| |
| |
+-----------------------+
+....
During this process, the commits to write are combined, sorted and we write the
contents to a temporary file, all while holding a `commit-graph-chain.lock`
@@ -257,14 +262,15 @@ lock-file. When the file is flushed, we rename it to `graph-{hash3}`
according to the computed `{hash3}`. Finally, we write the new chain data to
`commit-graph-chain.lock`:
-```
+----
{hash3}
{hash0}
-```
+----
We then close the lock-file.
-## Merge Strategy
+Merge Strategy
+~~~~~~~~~~~~~~
When writing a set of commits that do not exist in the commit-graph stack of
height N, we default to creating a new file at level N + 1. We then decide to
@@ -289,7 +295,8 @@ The merge strategy values (2 for the size multiple, 64,000 for the maximum
number of commits) could be extracted into config settings for full
flexibility.
-## Handling Mixed Generation Number Chains
+Handling Mixed Generation Number Chains
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With the introduction of generation number v2 and generation data chunk, the
following scenario is possible:
@@ -318,7 +325,8 @@ have corrected commit dates when written by compatible versions of Git. Thus,
rewriting split commit-graph as a single file (`--split=replace`) creates a
single layer with corrected commit dates.
-## Deleting graph-{hash} files
+Deleting graph-\{hash\} files
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
After a new tip file is written, some `graph-{hash}` files may no longer
be part of a chain. It is important to remove these files from disk, eventually.
@@ -333,7 +341,8 @@ files whose modified times are older than a given expiry window. This window
defaults to zero, but can be changed using command-line arguments or a config
setting.
-## Chains across multiple object directories
+Chains across multiple object directories
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In a repo with alternates, we look for the `commit-graph-chain` file starting
in the local object directory and then in each alternate. The first file that
diff --git a/Documentation/technical/hash-function-transition.adoc b/Documentation/technical/hash-function-transition.adoc
index f047fd80ca..2359d7d106 100644
--- a/Documentation/technical/hash-function-transition.adoc
+++ b/Documentation/technical/hash-function-transition.adoc
@@ -227,9 +227,9 @@ network byte order):
** 4-byte length in bytes of shortened object names. This is the
shortest possible length needed to make names in the shortened
object name table unambiguous.
- ** 4-byte integer, recording where tables relating to this format
+ ** 8-byte integer, recording where tables relating to this format
are stored in this index file, as an offset from the beginning.
- * 4-byte offset to the trailer from the beginning of this file.
+ * 8-byte offset to the trailer from the beginning of this file.
* Zero or more additional key/value pairs (4-byte key, 4-byte
value). Only one key is supported: 'PSRC'. See the "Loose objects
and unreachable objects" section for supported values and how this
@@ -260,12 +260,10 @@ network byte order):
compressed data to be copied directly from pack to pack during
repacking without undetected data corruption.
- * A table of 4-byte offset values. For an object in the table of
- sorted shortened object names, the value at the corresponding
- index in this table indicates where that object can be found in
- the pack file. These are usually 31-bit pack file offsets, but
- large offsets are encoded as an index into the next table with the
- most significant bit set.
+ * A table of 4-byte offset values. The index of this table in pack order
+ indicates where that object can be found in the pack file. These are
+ usually 31-bit pack file offsets, but large offsets are encoded as
+ an index into the next table with the most significant bit set.
* A table of 8-byte offset entries (empty for pack files less than
2 GiB). Pack files are organized with heavily used objects toward
@@ -276,10 +274,14 @@ network byte order):
up to and not including the table of CRC32 values.
- Zero or more NUL bytes.
- The trailer consists of the following:
- * A copy of the 20-byte SHA-256 checksum at the end of the
+ * A copy of the full main hash checksum at the end of the
corresponding packfile.
- * 20-byte SHA-256 checksum of all of the above.
+ * Full main hash checksum of all of the above.
+
+The "full main hash" is a full-length hash of the main (not compatibility)
+algorithm in the repository. Thus, if the main algorithm is SHA-256, this is
+a 32-byte SHA-256 hash; if it is SHA-1, this is a 20-byte SHA-1 hash.
Loose object index
~~~~~~~~~~~~~~~~~~
@@ -427,17 +429,19 @@ ordinary unsigned commit.
Signed Tags
~~~~~~~~~~~
-We add a new field "gpgsig-sha256" to the tag object format to allow
-signing tags without relying on SHA-1. Its signed payload is the
-SHA-256 content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
-SIGNATURE-----" delimited in-body signature removed.
-
-This means tags can be signed
-
-1. using SHA-1 only, as in existing signed tag objects
-2. using both SHA-1 and SHA-256, by using gpgsig-sha256 and an in-body
- signature.
-3. using only SHA-256, by only using the gpgsig-sha256 field.
+We add new fields "gpgsig" and "gpgsig-sha256" to the tag object format to
+allow signing tags in both formats. The in-body signature is used for the
+signature in the current hash algorithm and the header is used for the
+signature in the other algorithm. Thus, a dual-signature tag will contain both
+an in-body signature and a gpgsig-sha256 header for the SHA-1 format of an
+object or both an in-body signature and a gpgsig header for the SHA-256 format
+of an object.
+
+The signed payload of the tag is the content of the tag in the current
+algorithm with both its gpgsig and gpgsig-sha256 fields and
+"-----BEGIN PGP SIGNATURE-----" delimited in-body signature removed.
+
+This means tags can be signed using one or both algorithms.
Mergetag embedding
~~~~~~~~~~~~~~~~~~
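
Editorial sketch (not part of the patch): the pack index hunk in hash-function-transition.adoc above describes 4-byte offsets that are usually 31-bit pack file offsets, with large offsets stored as an index into the following 8-byte table when the most significant bit is set. A minimal decoding sketch, assuming both tables have already been sliced out of the index file as bytes and that the 8-byte entries are also big-endian; `resolve_offset` is a hypothetical helper.

```python
import struct


def resolve_offset(small_table, large_table, index):
    """Return the pack file offset for entry `index` of the 4-byte table."""
    (value,) = struct.unpack_from(">I", small_table, 4 * index)
    if value & 0x80000000:
        # Most significant bit set: the remaining 31 bits index the
        # table of 8-byte offsets.
        large_index = value & 0x7FFFFFFF
        (offset,) = struct.unpack_from(">Q", large_table, 8 * large_index)
        return offset
    # Otherwise the value is itself a 31-bit pack file offset.
    return value
```

Keeping the common case in 4 bytes is what lets the 8-byte table stay empty for pack files smaller than 2 GiB.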
diff --git a/Documentation/technical/large-object-promisors.adoc b/Documentation/technical/large-object-promisors.adoc
index dea8dafa66..2aa815e023 100644
--- a/Documentation/technical/large-object-promisors.adoc
+++ b/Documentation/technical/large-object-promisors.adoc
@@ -34,8 +34,8 @@ a new object representation for large blobs as discussed in:
https://lore.kernel.org/git/xmqqbkdometi.fsf@gitster.g/
-0) Non goals
-------------
+Non goals
+---------
- We will not discuss those client side improvements here, as they
would require changes in different parts of Git than this effort.
@@ -90,8 +90,8 @@ later in this document:
even more to host content with larger blobs or more large blobs
than currently.
-I) Issues with the current situation
-------------------------------------
+I Issues with the current situation
+-----------------------------------
- Some statistics made on GitLab repos have shown that more than 75%
of the disk space is used by blobs that are larger than 1MB and
@@ -138,8 +138,8 @@ I) Issues with the current situation
complaining that these tools require significant effort to set up,
learn and use correctly.
-II) Main features of the "Large Object Promisors" solution
-----------------------------------------------------------
+II Main features of the "Large Object Promisors" solution
+---------------------------------------------------------
The main features below should give a rough overview of how the
solution may work. Details about needed elements can be found in
@@ -166,7 +166,7 @@ format. They should be used along with main remotes that contain the
other objects.
Note 1
-++++++
+^^^^^^
To clarify, a LOP is a normal promisor remote, except that:
@@ -178,7 +178,7 @@ To clarify, a LOP is a normal promisor remote, except that:
itself.
Note 2
-++++++
+^^^^^^
Git already makes it possible for a main remote to also be a promisor
remote storing both regular objects and large blobs for a client that
@@ -186,13 +186,13 @@ clones from it with a filter on blob size. But here we explicitly want
to avoid that.
Rationale
-+++++++++
+^^^^^^^^^
LOPs aim to be good at handling large blobs while main remotes are
already good at handling other objects.
Implementation
-++++++++++++++
+^^^^^^^^^^^^^^
Git already has support for multiple promisor remotes, see
link:partial-clone.html#using-many-promisor-remotes[the partial clone documentation].
@@ -213,19 +213,19 @@ remote helper (see linkgit:gitremote-helpers[7]) which makes the
underlying object storage appear like a remote to Git.
Note
-++++
+^^^^
A LOP can be a promisor remote accessed using a remote helper by
both some clients and the main remote.
Rationale
-+++++++++
+^^^^^^^^^
This looks like the simplest way to create LOPs that can cheaply
handle many large blobs.
Implementation
-++++++++++++++
+^^^^^^^^^^^^^^
Remote helpers are quite easy to write as shell scripts, but it might
be more efficient and maintainable to write them using other languages
@@ -247,7 +247,7 @@ The underlying object storage that a LOP uses could also serve as
storage for large files handled by Git LFS.
Rationale
-+++++++++
+^^^^^^^^^
This would simplify the server side if it wants to both use a LOP and
act as a Git LFS server.
@@ -259,7 +259,7 @@ On the server side, a main remote should have a way to offload to a
LOP all its blobs with a size over a configurable threshold.
Rationale
-+++++++++
+^^^^^^^^^
This makes it easy to set things up and to clean things up. For
example, an admin could use this to manually convert a repo not using
@@ -268,7 +268,7 @@ some users would sometimes push large blobs, a cron job could use this
to regularly make sure the large blobs are moved to the LOP.
Implementation
-++++++++++++++
+^^^^^^^^^^^^^^
Using something based on `git repack --filter=...` to separate the
blobs we want to offload from the other Git objects could be a good
@@ -284,13 +284,13 @@ should have ways to prevent oversize blobs to be fetched, and also
perhaps pushed, into it.
Rationale
-+++++++++
+^^^^^^^^^
A main remote containing many oversize blobs would defeat the purpose
of LOPs.
Implementation
-++++++++++++++
+^^^^^^^^^^^^^^
The way to offload to a LOP discussed in 4) above can be used to
regularly offload oversize blobs. About preventing oversize blobs from
@@ -326,18 +326,18 @@ large blobs directly from the LOP and the server would not need to
fetch those blobs from the LOP to be able to serve the client.
Note
-++++
+^^^^
For fetches instead of clones, a protocol negotiation might not always
happen, see the "What about fetches?" FAQ entry below for details.
Rationale
-+++++++++
+^^^^^^^^^
Security, configurability and efficiency of setting things up.
Implementation
-++++++++++++++
+^^^^^^^^^^^^^^
A "promisor-remote" protocol v2 capability looks like a good way to
implement this. The way the client and server use this capability
@@ -356,7 +356,7 @@ the client should be able to offload some large blobs it has fetched,
but might not need anymore, to the LOP.
Note
-++++
+^^^^
It might depend on the context if it should be OK or not for clients
to offload large blobs they have created, instead of fetched, directly
@@ -367,13 +367,13 @@ This should be discussed and refined when we get closer to
implementing this feature.
Rationale
-+++++++++
+^^^^^^^^^
On the client, the easiest way to deal with unneeded large blobs is to
offload them.
Implementation
-++++++++++++++
+^^^^^^^^^^^^^^
This is very similar to what 4) above is about, except on the client
side instead of the server side. So a good solution to 4) could likely
@@ -385,8 +385,8 @@ when cloning (see 6) above). Also if the large blobs were fetched from
a LOP, it is likely, and can easily be confirmed, that the LOP still
has them, so that they can just be removed from the client.
-III) Benefits of using LOPs
----------------------------
+III Benefits of using LOPs
+--------------------------
Many benefits are related to the issues discussed in "I) Issues with
the current situation" above:
@@ -406,8 +406,8 @@ the current situation" above:
- Reduced storage needs on the client side.
-IV) FAQ
--------
+IV FAQ
+------
What about using multiple LOPs on the server and client side?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -533,7 +533,7 @@ some objects it already knows about but doesn't have because they are
on a promisor remote.
Regular fetch
-+++++++++++++
+^^^^^^^^^^^^^
In a regular fetch, the client will contact the main remote and a
protocol negotiation will happen between them. It's a good thing that
@@ -551,7 +551,7 @@ new fetch will happen in the same way as the previous clone or fetch,
using, or not using, the same LOP(s) as last time.
"Backfill" or "lazy" fetch
-++++++++++++++++++++++++++
+^^^^^^^^^^^^^^^^^^^^^^^^^^
When there is a backfill fetch, the client doesn't necessarily contact
the main remote first. It will try to fetch from its promisor remotes
@@ -576,8 +576,8 @@ from the client when it fetches from them. The client could get the
token when performing a protocol negotiation with the main remote (see
section II.6 above).
-V) Future improvements
-----------------------
+V Future improvements
+---------------------
It is expected that at the beginning using LOPs will be mostly worth
it either in a corporate context where the Git version that clients
diff --git a/Documentation/technical/meson.build b/Documentation/technical/meson.build
index 858af811a7..faff3964a9 100644
--- a/Documentation/technical/meson.build
+++ b/Documentation/technical/meson.build
@@ -13,6 +13,7 @@ articles = [
'commit-graph.adoc',
'directory-rename-detection.adoc',
'hash-function-transition.adoc',
+ 'large-object-promisors.adoc',
'long-running-process-protocol.adoc',
'multi-pack-index.adoc',
'packfile-uri.adoc',
@@ -52,7 +53,7 @@ doc_targets += custom_target(
output: 'api-index.html',
depends: documentation_deps,
install: true,
- install_dir: get_option('datadir') / 'doc/git-doc/technical',
+ install_dir: htmldir / 'technical',
)
foreach article : api_docs + articles
@@ -62,6 +63,6 @@ foreach article : api_docs + articles
output: fs.stem(article) + '.html',
depends: documentation_deps,
install: true,
- install_dir: get_option('datadir') / 'doc/git-doc/technical',
+ install_dir: htmldir / 'technical',
)
endforeach
diff --git a/Documentation/technical/remembering-renames.adoc b/Documentation/technical/remembering-renames.adoc
index 73f41761e2..6155f36c72 100644
--- a/Documentation/technical/remembering-renames.adoc
+++ b/Documentation/technical/remembering-renames.adoc
@@ -10,32 +10,32 @@ history as an optimization, assuming all merges are automatic and clean
Outline:
- 0. Assumptions
+ 1. Assumptions
- 1. How rebasing and cherry-picking work
+ 2. How rebasing and cherry-picking work
- 2. Why the renames on MERGE_SIDE1 in any given pick are *always* a
+ 3. Why the renames on MERGE_SIDE1 in any given pick are *always* a
superset of the renames on MERGE_SIDE1 for the next pick.
- 3. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also
+ 4. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also
a rename on MERGE_SIDE1 for the next pick
- 4. A detailed description of the counter-examples to #3.
+ 5. A detailed description of the counter-examples to #4.
- 5. Why the special cases in #4 are still fully reasonable to use to pair
+ 6. Why the special cases in #5 are still fully reasonable to use to pair
up files for three-way content merging in the merge machinery, and why
they do not affect the correctness of the merge.
- 6. Interaction with skipping of "irrelevant" renames
+ 7. Interaction with skipping of "irrelevant" renames
- 7. Additional items that need to be cached
+ 8. Additional items that need to be cached
- 8. How directory rename detection interacts with the above and why this
+ 9. How directory rename detection interacts with the above and why this
optimization is still safe even if merge.directoryRenames is set to
"true".
-=== 0. Assumptions ===
+== 1. Assumptions ==
There are two assumptions that will hold throughout this document:
@@ -44,8 +44,8 @@ There are two assumptions that will hold throughout this document:
* All merges are fully automatic
-and a third that will hold in sections 2-5 for simplicity, that I'll later
-address in section 8:
+and a third that will hold in sections 3-6 for simplicity, that I'll later
+address in section 9:
* No directory renames occur
@@ -77,9 +77,9 @@ conflicts that the user needs to resolve), the cache of renames is not
stored on disk, and thus is thrown away as soon as the rebase or cherry
pick stops for the user to resolve the operation.
-The third assumption makes sections 2-5 simpler, and allows people to
+The third assumption makes sections 3-6 simpler, and allows people to
understand the basics of why this optimization is safe and effective, and
-then I can go back and address the specifics in section 8. It is probably
+then I can go back and address the specifics in section 9. It is probably
also worth noting that if directory renames do occur, then the default of
merge.directoryRenames being set to "conflict" means that the operation
will stop for users to resolve the conflicts and the cache will be thrown
@@ -88,22 +88,26 @@ reason we need to address directory renames specifically, is that some
users will have set merge.directoryRenames to "true" to allow the merges to
continue to proceed automatically. The optimization is still safe with
this config setting, but we have to discuss a few more cases to show why;
-this discussion is deferred until section 8.
+this discussion is deferred until section 9.
-=== 1. How rebasing and cherry-picking work ===
+== 2. How rebasing and cherry-picking work ==
Consider the following setup (from the git-rebase manpage):
+------------
A---B---C topic
/
D---E---F---G main
+------------
After rebasing or cherry-picking topic onto main, this will appear as:
+------------
A'--B'--C' topic
/
D---E---F---G main
+------------
The way the commits A', B', and C' are created is through a series of
merges, where rebase or cherry-pick sequentially uses each of the three
@@ -111,6 +115,7 @@ A-B-C commits in a special merge operation. Let's label the three commits
in the merge operation as MERGE_BASE, MERGE_SIDE1, and MERGE_SIDE2. For
this picture, the three commits for each of the three merges would be:
+....
To create A':
MERGE_BASE: E
MERGE_SIDE1: G
@@ -125,6 +130,7 @@ To create C':
MERGE_BASE: B
MERGE_SIDE1: B'
MERGE_SIDE2: C
+....
Sometimes, folks are surprised that these three-way merges are done. It
can be useful in understanding these three-way merges to view them in a
@@ -138,8 +144,7 @@ Conceptually the two statements above are the same as a three-way merge of
B, B', and C, at least the parts before you decide to record a commit.
-=== 2. Why the renames on MERGE_SIDE1 in any given pick are always a ===
-=== superset of the renames on MERGE_SIDE1 for the next pick. ===
+== 3. Why the renames on MERGE_SIDE1 in any given pick are always a superset of the renames on MERGE_SIDE1 for the next pick. ==
The merge machinery uses the filenames it is fed from MERGE_BASE,
MERGE_SIDE1, and MERGE_SIDE2. It will only move content to a different
@@ -156,6 +161,7 @@ filename under one of three conditions:
First, let's remember what commits are involved in the first and second
picks of the cherry-pick or rebase sequence:
+....
To create A':
MERGE_BASE: E
MERGE_SIDE1: G
@@ -165,6 +171,7 @@ To create B':
MERGE_BASE: A
MERGE_SIDE1: A'
MERGE_SIDE2: B
+....
So, in particular, we need to show that the renames between E and G are a
superset of those between A and A'.
@@ -181,11 +188,11 @@ are a subset of those between E and G. Equivalently, all renames between E
and G are a superset of those between A and A'.
-=== 3. Why any rename on MERGE_SIDE1 in any given pick is _almost_ ===
-=== always also a rename on MERGE_SIDE1 for the next pick. ===
+== 4. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also a rename on MERGE_SIDE1 for the next pick. ==
Let's again look at the first two picks:
+....
To create A':
MERGE_BASE: E
MERGE_SIDE1: G
@@ -195,17 +202,25 @@ To create B':
MERGE_BASE: A
MERGE_SIDE1: A'
MERGE_SIDE2: B
+....
Now let's look at any given rename from MERGE_SIDE1 of the first pick, i.e.
any given rename from E to G. Let's use the filenames 'oldfile' and
'newfile' for demonstration purposes. That first pick will function as
follows; when the rename is detected, the merge machinery will do a
three-way content merge of the following:
+
+....
E:oldfile
G:newfile
A:oldfile
+....
+
and produce a new result:
+
+....
A':newfile
+....
Note above that I've assumed that E->A did not rename oldfile. If that
side did rename, then we most likely have a rename/rename(1to2) conflict
@@ -254,19 +269,21 @@ were detected as renames, A:oldfile and A':newfile should also be
detectable as renames almost always.
-=== 4. A detailed description of the counter-examples to #3. ===
+== 5. A detailed description of the counter-examples to #4. ==
-We already noted in section 3 that rename/rename(1to1) (i.e. both sides
+We already noted in section 4 that rename/rename(1to1) (i.e. both sides
renaming a file the same way) was one counter-example. The more
interesting bit, though, is why did we need to use the "almost" qualifier
when stating that A:oldfile and A':newfile are "almost" always detectable
as renames?
-Let's repeat an earlier point that section 3 made:
+Let's repeat an earlier point that section 4 made:
+....
A':newfile was created by applying the changes between E:oldfile and
G:newfile to A:oldfile. The changes between E:oldfile and G:newfile were
<50% of the size of E:oldfile.
+....
If those changes that were <50% of the size of E:oldfile are also <50% of
the size of A:oldfile, then A:oldfile and A':newfile will be detectable as
@@ -276,18 +293,21 @@ still somehow merge cleanly), then traditional rename detection would not
detect A:oldfile and A':newfile as renames.
Here's an example where that can happen:
+
* E:oldfile had 20 lines
* G:newfile added 10 new lines at the beginning of the file
* A:oldfile kept the first 3 lines of the file, and deleted all the rest
+
then
+
+....
=> A':newfile would have 13 lines, 3 of which matches those in A:oldfile.
-E:oldfile -> G:newfile would be detected as a rename, but A:oldfile and
-A':newfile would not be.
+ E:oldfile -> G:newfile would be detected as a rename, but A:oldfile and
+ A':newfile would not be.
+....
-=== 5. Why the special cases in #4 are still fully reasonable to use to ===
-=== pair up files for three-way content merging in the merge machinery, ===
-=== and why they do not affect the correctness of the merge. ===
+== 6. Why the special cases in #5 are still fully reasonable to use to pair up files for three-way content merging in the merge machinery, and why they do not affect the correctness of the merge. ==
In the rename/rename(1to1) case, A:newfile and A':newfile are not renames
since they use the *same* filename. However, files with the same filename
@@ -295,14 +315,14 @@ are obviously fine to pair up for three-way content merging (the merge
machinery has never employed break detection). The interesting
counter-example case is thus not the rename/rename(1to1) case, but the case
where A did not rename oldfile. That was the case that we spent most of
-the time discussing in sections 3 and 4. The remainder of this section
+the time discussing in sections 4 and 5. The remainder of this section
will be devoted to that case as well.
So, even if A:oldfile and A':newfile aren't detectable as renames, why is
it still reasonable to pair them up for three-way content merging in the
merge machinery? There are multiple reasons:
- * As noted in sections 3 and 4, the diff between A:oldfile and A':newfile
+ * As noted in sections 4 and 5, the diff between A:oldfile and A':newfile
is *exactly* the same as the diff between E:oldfile and G:newfile. The
latter pair were detected as renames, so it seems unlikely to surprise
users for us to treat A:oldfile and A':newfile as renames.
@@ -394,7 +414,7 @@ cases 1 and 3 seem to provide as good or better behavior with the
optimization than without.
-=== 6. Interaction with skipping of "irrelevant" renames ===
+== 7. Interaction with skipping of "irrelevant" renames ==
Previous optimizations involved skipping rename detection for paths
considered to be "irrelevant". See for example the following commits:
@@ -421,24 +441,27 @@ detection -- though we can limit it to the paths for which we have not
already detected renames.
-=== 7. Additional items that need to be cached ===
+== 8. Additional items that need to be cached ==
It turns out we have to cache more than just renames; we also cache:
+....
A) non-renames (i.e. unpaired deletes)
B) counts of renames within directories
C) sources that were marked as RELEVANT_LOCATION, but which were
downgraded to RELEVANT_NO_MORE
D) the toplevel trees involved in the merge
+....
These are all stored in struct rename_info, and respectively appear in
+
* cached_pairs (alongside actual renames, just with a value of NULL)
* dir_rename_counts
* cached_irrelevant
* merge_trees
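As a rough mental picture of how these four items sit next to each other, here is a minimal sketch. It is a hypothetical Python stand-in, not the layout of struct rename_info: the class name `RenameCache`, the field types, and the `needs_detection` helper are assumptions for illustration, and the per-side bookkeeping is collapsed into a single instance.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class RenameCache:
    """Hypothetical stand-in for the cached parts of struct rename_info."""

    # (A) old path -> new path; None marks an unpaired delete (a non-rename)
    cached_pairs: Dict[str, Optional[str]] = field(default_factory=dict)
    # (B) old directory -> {new directory -> number of renames seen}
    dir_rename_counts: Dict[str, Dict[str, int]] = field(default_factory=dict)
    # (C) sources that were downgraded to RELEVANT_NO_MORE
    cached_irrelevant: Set[str] = field(default_factory=set)
    # (D) toplevel trees (base, side1, side2) of the previous merge
    merge_trees: Optional[Tuple[str, str, str]] = None

    def needs_detection(self, path: str) -> bool:
        """Assumed rule: only paths with nothing cached need rename detection."""
        return path not in self.cached_pairs and path not in self.cached_irrelevant


cache = RenameCache()
cache.cached_pairs["olddir/a.c"] = "newdir/a.c"  # a remembered rename
cache.cached_pairs["gone.c"] = None              # a remembered unpaired delete
print(cache.needs_detection("gone.c"), cache.needs_detection("other.c"))
```

Storing unpaired deletes in the same map as renames, with an empty value, is what lets a later pick skip re-checking them.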
-The reason for (A) comes from the irrelevant renames skipping
-optimization discussed in section 6. The fact that irrelevant renames
+The reason for `(A)` comes from the irrelevant renames skipping
+optimization discussed in section 7. The fact that irrelevant renames
are skipped means we only get a subset of the potential renames
detected and subsequent commits may need to run rename detection on
the upstream side on a subset of the remaining renames (to get the
@@ -447,23 +470,24 @@ deletes are involved in rename detection too, we don't want to
repeatedly check that those paths remain unpaired on the upstream side
with every commit we are transplanting.
-The reason for (B) is that diffcore_rename_extended() is what
+The reason for `(B)` is that diffcore_rename_extended() is what
generates the counts of renames by directory which is needed in
directory rename detection, and if we don't run
diffcore_rename_extended() again then we need to have the output from
it, including dir_rename_counts, from the previous run.
-The reason for (C) is that merge-ort's tree traversal will again think
+The reason for `(C)` is that merge-ort's tree traversal will again think
those paths are relevant (marking them as RELEVANT_LOCATION), but the
fact that they were downgraded to RELEVANT_NO_MORE means that
dir_rename_counts already has the information we need for directory
rename detection. (A path which becomes RELEVANT_CONTENT in a
subsequent commit will be removed from cached_irrelevant.)
-The reason for (D) is that is how we determine whether the remember
+The reason for `(D)` is that this is how we determine whether the remember
renames optimization can be used. In particular, remembering that our
sequence of merges looks like:
+....
Merge 1:
MERGE_BASE: E
MERGE_SIDE1: G
@@ -475,6 +499,7 @@ sequence of merges looks like:
MERGE_SIDE1: A'
MERGE_SIDE2: B
=> Creates B'
+....
It is the fact that the trees A and A' appear both in Merge 1 and in
Merge 2, with A as a parent of A' that allows this optimization. So
@@ -482,12 +507,11 @@ we store the trees to compare with what we are asked to merge next
time.
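The tree comparison can be sketched as follows. This is an illustration only: it assumes Merge 1 takes A as MERGE_SIDE2 and produces A', and that Merge 2 then uses A as MERGE_BASE and A' as MERGE_SIDE1. The names `MergeTrees` and `can_reuse_renames` are hypothetical and do not correspond to functions in merge-ort.

```python
from typing import NamedTuple, Optional


class MergeTrees(NamedTuple):
    base: str
    side1: str
    side2: str
    result: Optional[str] = None  # tree produced by the merge, once known


def can_reuse_renames(prev: Optional[MergeTrees], cur: MergeTrees) -> bool:
    """Cached renames apply only if the previous side2 and result reappear
    as the current base and side1 (A and A' in the text)."""
    if prev is None or prev.result is None:
        return False
    return cur.base == prev.side2 and cur.side1 == prev.result


merge1 = MergeTrees(base="E", side1="G", side2="A", result="A'")
merge2 = MergeTrees(base="A", side1="A'", side2="B")
unrelated = MergeTrees(base="X", side1="A'", side2="B")

print(can_reuse_renames(merge1, merge2))     # trees line up
print(can_reuse_renames(merge1, unrelated))  # base is not the previous side2
```

If the trees do not line up this way, the sketch falls back to treating the cache as unusable, which corresponds to running rename detection from scratch.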
-=== 8. How directory rename detection interacts with the above and ===
-=== why this optimization is still safe even if ===
-=== merge.directoryRenames is set to "true". ===
+== 9. How directory rename detection interacts with the above and why this optimization is still safe even if merge.directoryRenames is set to "true". ==
As noted in the assumptions section:
+....
"""
...if directory renames do occur, then the default of
merge.directoryRenames being set to "conflict" means that the operation
@@ -497,11 +521,13 @@ As noted in the assumptions section:
is that some users will have set merge.directoryRenames to "true" to
allow the merges to continue to proceed automatically.
"""
+....
Let's remember that we need to look at how any given pick affects the next
one. So let's again use the first two picks from the diagram in section
one:
+....
First pick does this three-way merge:
MERGE_BASE: E
MERGE_SIDE1: G
@@ -513,6 +539,7 @@ one:
MERGE_SIDE1: A'
MERGE_SIDE2: B
=> creates B'
+....
Now, directory rename detection exists so that if one side of history
renames a directory, and the other side adds a new file to the old
@@ -545,7 +572,7 @@ while considering all of these cases:
concerned; see the assumptions section). Two interesting sub-notes
about these counts:
- * If we need to perform rename-detection again on the given side (e.g.
+ ** If we need to perform rename-detection again on the given side (e.g.
some paths are relevant for rename detection that weren't before),
then we clear dir_rename_counts and recompute it, making use of
cached_pairs. The reason it is important to do this is optimizations
@@ -556,7 +583,7 @@ while considering all of these cases:
easiest way to "fix up" dir_rename_counts in such cases is to just
recompute it.
- * If we prune rename/rename(1to1) entries from the cache, then we also
+ ** If we prune rename/rename(1to1) entries from the cache, then we also
need to update dir_rename_counts to decrement the counts for the
involved directory and any relevant parent directories (to undo what
update_dir_rename_counts() in diffcore-rename.c incremented when the
@@ -578,6 +605,7 @@ in order:
Case 1: MERGE_SIDE1 renames old dir, MERGE_SIDE2 adds new file to old dir
+....
This case looks like this:
MERGE_BASE: E, Has olddir/
@@ -595,10 +623,13 @@ Case 1: MERGE_SIDE1 renames old dir, MERGE_SIDE2 adds new file to old dir
* MERGE_SIDE1 has cached olddir/newfile -> newdir/newfile
Given the cached rename noted above, the second merge can proceed as
expected without needing to perform rename detection from A -> A'.
+....
Case 2: MERGE_SIDE1 renames old dir, MERGE_SIDE2 renames file into old dir
+....
This case looks like this:
+
MERGE_BASE: E oldfile, olddir/
MERGE_SIDE1: G oldfile, olddir/ -> newdir/
MERGE_SIDE2: A oldfile -> olddir/newfile
@@ -617,9 +648,11 @@ Case 2: MERGE_SIDE1 renames old dir, MERGE_SIDE2 renames file into old dir
Given the cached rename noted above, the second merge can proceed as
expected without needing to perform rename detection from A -> A'.
+....
Case 3: MERGE_SIDE1 adds new file to old dir, MERGE_SIDE2 renames old dir
+....
This case looks like this:
MERGE_BASE: E, Has olddir/
@@ -635,9 +668,11 @@ Case 3: MERGE_SIDE1 adds new file to old dir, MERGE_SIDE2 renames old dir
In this case, with the optimization, note that after the first commit there
were no renames on MERGE_SIDE1, and any renames on MERGE_SIDE2 are tossed.
But the second merge didn't need any renames so this is fine.
+....
Case 4: MERGE_SIDE1 renames file into old dir, MERGE_SIDE2 renames old dir
+....
This case looks like this:
MERGE_BASE: E, Has olddir/
@@ -658,6 +693,7 @@ Case 4: MERGE_SIDE1 renames file into old dir, MERGE_SIDE2 renames old dir
Given the cached rename noted above, the second merge can proceed as
expected without needing to perform rename detection from A -> A'.
+....
Finally, I'll just note here that interactions with the
skip-irrelevant-renames optimization mean we sometimes don't detect
diff --git a/Documentation/technical/sparse-checkout.adoc b/Documentation/technical/sparse-checkout.adoc
index 0f750ef3e3..3fa8e53655 100644
--- a/Documentation/technical/sparse-checkout.adoc
+++ b/Documentation/technical/sparse-checkout.adoc
@@ -14,37 +14,41 @@ Table of contents:
* Reference Emails
-=== Terminology ===
+== Terminology ==
-cone mode: one of two modes for specifying the desired subset of files
+*`cone mode`*::
+ one of two modes for specifying the desired subset of files
in a sparse-checkout. In cone-mode, the user specifies
directories (getting both everything under that directory as
well as everything in leading directories), while in non-cone
mode, the user specifies gitignore-style patterns. Controlled
by the --[no-]cone option to sparse-checkout init|set.
-SKIP_WORKTREE: When tracked files do not match the sparse specification and
+*`SKIP_WORKTREE`*::
+ When tracked files do not match the sparse specification and
are removed from the working tree, the file in the index is marked
with a SKIP_WORKTREE bit. Note that if a tracked file has the
SKIP_WORKTREE bit set but the file is later written by the user to
the working tree anyway, the SKIP_WORKTREE bit will be cleared at
the beginning of any subsequent Git operation.
-
- Most sparse checkout users are unaware of this implementation
- detail, and the term should generally be avoided in user-facing
- descriptions and command flags. Unfortunately, prior to the
- `sparse-checkout` subcommand this low-level detail was exposed,
- and as of time of writing, is still exposed in various places.
-
-sparse-checkout: a subcommand in git used to reduce the files present in
++
+Most sparse checkout users are unaware of this implementation
+detail, and the term should generally be avoided in user-facing
+descriptions and command flags. Unfortunately, prior to the
+`sparse-checkout` subcommand this low-level detail was exposed,
+and as of time of writing, is still exposed in various places.
+
+*`sparse-checkout`*::
+ a subcommand in git used to reduce the files present in
the working tree to a subset of all tracked files. Also, the
name of the file in the $GIT_DIR/info directory used to track
the sparsity patterns corresponding to the user's desired
subset.
-sparse cone: see cone mode
+*`sparse cone`*:: see cone mode
-sparse directory: An entry in the index corresponding to a directory, which
+*`sparse directory`*::
+ An entry in the index corresponding to a directory, which
appears in the index instead of all the files under that directory
that would normally appear. See also sparse-index. Something that
can cause confusion is that the "sparse directory" does NOT match
@@ -52,7 +56,8 @@ sparse directory: An entry in the index corresponding to a directory, which
working tree. May be renamed in the future (e.g. to "skipped
directory").
-sparse index: A special mode for sparse-checkout that also makes the
+*`sparse index`*::
+ A special mode for sparse-checkout that also makes the
index sparse by recording a directory entry in lieu of all the
files underneath that directory (thus making that a "skipped
directory" which unfortunately has also been called a "sparse
@@ -60,7 +65,8 @@ sparse index: A special mode for sparse-checkout that also makes the
directories. Controlled by the --[no-]sparse-index option to
init|set|reapply.
-sparsity patterns: patterns from $GIT_DIR/info/sparse-checkout used to
+*`sparsity patterns`*::
+ patterns from $GIT_DIR/info/sparse-checkout used to
define the set of files of interest. A warning: It is easy to
over-use this term (or the shortened "patterns" term), for two
reasons: (1) users in cone mode specify directories rather than
@@ -70,7 +76,8 @@ sparsity patterns: patterns from $GIT_DIR/info/sparse-checkout used to
transiently differ in the working tree or index from the sparsity
patterns (see "Sparse specification vs. sparsity patterns").
-sparse specification: The set of paths in the user's area of focus. This
+*`sparse specification`*::
+ The set of paths in the user's area of focus. This
is typically just the tracked files that match the sparsity
patterns, but the sparse specification can temporarily differ and
include additional files. (See also "Sparse specification
@@ -87,12 +94,13 @@ sparse specification: The set of paths in the user's area of focus. This
* If working with the index and the working copy, the sparse
specification is the union of the paths from above.
-vivifying: When a command restores a tracked file to the working tree (and
+*`vivifying`*::
+ When a command restores a tracked file to the working tree (and
hopefully also clears the SKIP_WORKTREE bit in the index for that
file), this is referred to as "vivifying" the file.
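To make the cone mode definition above concrete, here is a minimal sketch of the membership rule it describes: everything under a specified directory, plus the files sitting directly in each leading directory. The function name `in_cone` is hypothetical, paths are assumed to be repository-relative with `/` separators, and toplevel files are treated as always included because the toplevel leads to every directory.

```python
import posixpath


def in_cone(path, cone_dirs):
    """Return True if a tracked file path falls inside the cone (sketch)."""
    parent = posixpath.dirname(path)
    if parent == "":
        return True  # toplevel files: the toplevel is a leading directory
    for d in cone_dirs:
        d = d.strip("/")
        # everything under a specified directory, recursively
        if parent == d or parent.startswith(d + "/"):
            return True
        # files directly inside a leading directory of a specified one
        if d.startswith(parent + "/"):
            return True
    return False


cone = ["src/lib"]
for p in ["src/lib/deep/b.c", "src/Makefile", "README", "src/other/c.c"]:
    print(p, in_cone(p, cone))
```

Non-cone mode would replace this directory test with gitignore-style pattern matching, which is not sketched here.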
-=== Purpose of sparse-checkouts ===
+== Purpose of sparse-checkouts ==
sparse-checkouts exist to allow users to work with a subset of their
files.
@@ -120,14 +128,12 @@ those usecases, sparse-checkouts can modify different subcommands in over a
half dozen different ways. Let's start by considering the high level
usecases:
- A) Users are _only_ interested in the sparse portion of the repo
-
- A*) Users are _only_ interested in the sparse portion of the repo
- that they have downloaded so far
-
- B) Users want a sparse working tree, but are working in a larger whole
-
- C) sparse-checkout is a behind-the-scenes implementation detail allowing
+[horizontal]
+A):: Users are _only_ interested in the sparse portion of the repo
+A*):: Users are _only_ interested in the sparse portion of the repo
+ that they have downloaded so far
+B):: Users want a sparse working tree, but are working in a larger whole
+C):: sparse-checkout is a behind-the-scenes implementation detail allowing
Git to work with a specially crafted in-house virtual file system;
users are actually working with a "full" working tree that is
lazily populated, and sparse-checkout helps with the lazy population
@@ -136,7 +142,7 @@ usecases:
It may be worth explaining each of these in a bit more detail:
- (Behavior A) Users are _only_ interested in the sparse portion of the repo
+=== (Behavior A) Users are _only_ interested in the sparse portion of the repo
These folks might know there are other things in the repository, but
don't care. They are uninterested in other parts of the repository, and
@@ -163,8 +169,7 @@ side-effects of various other commands (such as the printed diffstat
after a merge or pull) can lead to worries about local repository size
growing unnecessarily[10].
- (Behavior A*) Users are _only_ interested in the sparse portion of the repo
- that they have downloaded so far (a variant on the first usecase)
+=== (Behavior A*) Users are _only_ interested in the sparse portion of the repo that they have downloaded so far (a variant on the first usecase)
This variant is driven by folks who are using partial clones together with
sparse checkouts and do disconnected development (so far sounding like a
@@ -173,15 +178,14 @@ reason for yet another variant is that downloading even just the blobs
through history within their sparse specification may be too much, so they
only download some. They would still like operations to succeed without
network connectivity, though, so things like `git log -S${SEARCH_TERM} -p`
-or `git grep ${SEARCH_TERM} OLDREV ` would need to be prepared to provide
+or `git grep ${SEARCH_TERM} OLDREV` would need to be prepared to provide
partial results that depend on what happens to have been downloaded.
This variant could be viewed as Behavior A with the sparse specification
for history querying operations modified from "sparsity patterns" to
"sparsity patterns limited to the blobs we have already downloaded".
- (Behavior B) Users want a sparse working tree, but are working in a
- larger whole
+=== (Behavior B) Users want a sparse working tree, but are working in a larger whole
Stolee described this usecase this way[11]:
@@ -229,8 +233,7 @@ those expensive checks when interacting with the working copy, and may
prefer getting "unrelated" results from their history queries over having
slow commands.
- (Behavior C) sparse-checkout is an implementational detail supporting a
- special VFS.
+=== (Behavior C) sparse-checkout is an implementational detail supporting a special VFS.
This usecase goes slightly against the traditional definition of
sparse-checkout in that it actually tries to present a full or dense
@@ -255,13 +258,13 @@ will perceive the checkout as dense, and commands should thus behave as if
all files are present.
-=== Usecases of primary concern ===
+== Usecases of primary concern ==
Most of the rest of this document will focus on Behavior A and Behavior
B. Some notes about the other two cases and why we are not focusing on
them:
- (Behavior A*)
+=== (Behavior A*)
Supporting this usecase is estimated to be difficult and a lot of work.
There are no plans to implement it currently, but it may be a potential
@@ -275,7 +278,7 @@ valid for this usecase, with the only exception being that it redefines the
sparse specification to restrict it to already-downloaded blobs. The hard
part is in making commands capable of respecting that modified definition.
- (Behavior C)
+=== (Behavior C)
This usecase violates some of the early sparse-checkout documented
assumptions (since files marked as SKIP_WORKTREE will be displayed to users
@@ -300,20 +303,20 @@ Behavior C do not assume they are part of the Behavior B camp and propose
patches that break things for the real Behavior B folks.
-=== Oversimplified mental models ===
+== Oversimplified mental models ==
An oversimplification of the differences in the above behaviors is:
- Behavior A: Restrict worktree and history operations to sparse specification
- Behavior B: Restrict worktree operations to sparse specification; have any
- history operations work across all files
- Behavior C: Do not restrict either worktree or history operations to the
- sparse specification...with the exception of branch checkouts or
- switches which avoid writing files that will match the index so
- they can later lazily be populated instead.
+(Behavior A):: Restrict worktree and history operations to sparse specification
+(Behavior B):: Restrict worktree operations to sparse specification; have any
+ history operations work across all files
+(Behavior C):: Do not restrict either worktree or history operations to the
+ sparse specification...with the exception of branch checkouts or
+ switches which avoid writing files that will match the index so
+ they can later lazily be populated instead.
-=== Desired behavior ===
+== Desired behavior ==
As noted previously, despite the simple idea of just working with a subset
of files, there are a range of different behavioral changes that need to be
@@ -326,37 +329,38 @@ understanding these differences can be beneficial.
* Commands behaving the same regardless of high-level use-case
- * commands that only look at files within the sparsity specification
+ ** commands that only look at files within the sparsity specification
- * diff (without --cached or REVISION arguments)
- * grep (without --cached or REVISION arguments)
- * diff-files
+ *** diff (without --cached or REVISION arguments)
+ *** grep (without --cached or REVISION arguments)
+ *** diff-files
- * commands that restore files to the working tree that match sparsity
+ ** commands that restore files to the working tree that match sparsity
patterns, and remove unmodified files that don't match those
patterns:
- * switch
- * checkout (the switch-like half)
- * read-tree
- * reset --hard
+ *** switch
+ *** checkout (the switch-like half)
+ *** read-tree
+ *** reset --hard
- * commands that write conflicted files to the working tree, but otherwise
+ ** commands that write conflicted files to the working tree, but otherwise
will omit writing files to the working tree that do not match the
sparsity patterns:
- * merge
- * rebase
- * cherry-pick
- * revert
+ *** merge
+ *** rebase
+ *** cherry-pick
+ *** revert
- * `am` and `apply --cached` should probably be in this section but
+ *** `am` and `apply --cached` should probably be in this section but
are buggy (see the "Known bugs" section below)
The behavior for these commands somewhat depends upon the merge
strategy being used:
- * `ort` behaves as described above
- * `octopus` and `resolve` will always vivify any file changed in the merge
+
+ *** `ort` behaves as described above
+ *** `octopus` and `resolve` will always vivify any file changed in the merge
relative to the first parent, which is rather suboptimal.
It is also important to note that these commands WILL update the index
@@ -372,21 +376,21 @@ understanding these differences can be beneficial.
specification and the sparsity patterns (much like the commands in the
previous section).
- * commands that always ignore sparsity since commits must be full-tree
+ ** commands that always ignore sparsity since commits must be full-tree
- * archive
- * bundle
- * commit
- * format-patch
- * fast-export
- * fast-import
- * commit-tree
+ *** archive
+ *** bundle
+ *** commit
+ *** format-patch
+ *** fast-export
+ *** fast-import
+ *** commit-tree
- * commands that write any modified file to the working tree (conflicted
+ ** commands that write any modified file to the working tree (conflicted
or not, and whether those paths match sparsity patterns or not):
- * stash
- * apply (without `--index` or `--cached`)
+ *** stash
+ *** apply (without `--index` or `--cached`)
* Commands that may slightly differ for behavior A vs. behavior B:
@@ -394,19 +398,20 @@ understanding these differences can be beneficial.
behaviors, but may differ in verbosity and types of warning and error
messages.
- * commands that make modifications to which files are tracked:
- * add
- * rm
- * mv
- * update-index
+ ** commands that make modifications to which files are tracked:
+
+ *** add
+ *** rm
+ *** mv
+ *** update-index
The fact that files can move between the 'tracked' and 'untracked'
categories means some commands will have to treat untracked files
differently. But if we have to treat untracked files differently,
then additional commands may also need changes:
- * status
- * clean
+ *** status
+ *** clean
In particular, `status` may need to report any untracked files outside
the sparsity specification as an erroneous condition (especially to
@@ -420,9 +425,10 @@ understanding these differences can be beneficial.
may need to ignore the sparse specification by its nature. Also, its
current --[no-]ignore-skip-worktree-entries default is totally bogus.
- * commands for manually tweaking paths in both the index and the working tree
- * `restore`
- * the restore-like half of `checkout`
+ ** commands for manually tweaking paths in both the index and the working tree
+
+ *** `restore`
+ *** the restore-like half of `checkout`
These commands should be similar to add/rm/mv in that they should
only operate on the sparse specification by default, and require a
@@ -433,18 +439,19 @@ understanding these differences can be beneficial.
* Commands that significantly differ for behavior A vs. behavior B:
- * commands that query history
- * diff (with --cached or REVISION arguments)
- * grep (with --cached or REVISION arguments)
- * show (when given commit arguments)
- * blame (only matters when one or more -C flags are passed)
- * and annotate
- * log
- * whatchanged (may not exist anymore)
- * ls-files
- * diff-index
- * diff-tree
- * ls-tree
+ ** commands that query history
+
+ *** diff (with --cached or REVISION arguments)
+ *** grep (with --cached or REVISION arguments)
+ *** show (when given commit arguments)
+ *** blame (only matters when one or more -C flags are passed)
+ **** and annotate
+ *** log
+ *** whatchanged (may not exist anymore)
+ *** ls-files
+ *** diff-index
+ *** diff-tree
+ *** ls-tree
Note: for log and whatchanged, revision walking logic is unaffected
but displaying of patches is affected by scoping the command to the
@@ -458,91 +465,91 @@ understanding these differences can be beneficial.
* Commands I don't know how to classify
- * range-diff
+ ** range-diff
Is this like `log` or `format-patch`?
- * cherry
+ ** cherry
See range-diff
* Commands unaffected by sparse-checkouts
- * shortlog
- * show-branch
- * rev-list
- * bisect
-
- * branch
- * describe
- * fetch
- * gc
- * init
- * maintenance
- * notes
- * pull (merge & rebase have the necessary changes)
- * push
- * submodule
- * tag
-
- * config
- * filter-branch (works in separate checkout without sparse-checkout setup)
- * pack-refs
- * prune
- * remote
- * repack
- * replace
-
- * bugreport
- * count-objects
- * fsck
- * gitweb
- * help
- * instaweb
- * merge-tree (doesn't touch worktree or index, and merges always compute full-tree)
- * rerere
- * verify-commit
- * verify-tag
-
- * commit-graph
- * hash-object
- * index-pack
- * mktag
- * mktree
- * multi-pack-index
- * pack-objects
- * prune-packed
- * symbolic-ref
- * unpack-objects
- * update-ref
- * write-tree (operates on index, possibly optimized to use sparse dir entries)
-
- * for-each-ref
- * get-tar-commit-id
- * ls-remote
- * merge-base (merges are computed full tree, so merge base should be too)
- * name-rev
- * pack-redundant
- * rev-parse
- * show-index
- * show-ref
- * unpack-file
- * var
- * verify-pack
-
- * <Everything under 'Interacting with Others' in 'git help --all'>
- * <Everything under 'Low-level...Syncing' in 'git help --all'>
- * <Everything under 'Low-level...Internal Helpers' in 'git help --all'>
- * <Everything under 'External commands' in 'git help --all'>
+ ** shortlog
+ ** show-branch
+ ** rev-list
+ ** bisect
+
+ ** branch
+ ** describe
+ ** fetch
+ ** gc
+ ** init
+ ** maintenance
+ ** notes
+ ** pull (merge & rebase have the necessary changes)
+ ** push
+ ** submodule
+ ** tag
+
+ ** config
+ ** filter-branch (works in separate checkout without sparse-checkout setup)
+ ** pack-refs
+ ** prune
+ ** remote
+ ** repack
+ ** replace
+
+ ** bugreport
+ ** count-objects
+ ** fsck
+ ** gitweb
+ ** help
+ ** instaweb
+ ** merge-tree (doesn't touch worktree or index, and merges always compute full-tree)
+ ** rerere
+ ** verify-commit
+ ** verify-tag
+
+ ** commit-graph
+ ** hash-object
+ ** index-pack
+ ** mktag
+ ** mktree
+ ** multi-pack-index
+ ** pack-objects
+ ** prune-packed
+ ** symbolic-ref
+ ** unpack-objects
+ ** update-ref
+ ** write-tree (operates on index, possibly optimized to use sparse dir entries)
+
+ ** for-each-ref
+ ** get-tar-commit-id
+ ** ls-remote
+ ** merge-base (merges are computed full tree, so merge base should be too)
+ ** name-rev
+ ** pack-redundant
+ ** rev-parse
+ ** show-index
+ ** show-ref
+ ** unpack-file
+ ** var
+ ** verify-pack
+
+ ** <Everything under 'Interacting with Others' in 'git help --all'>
+ ** <Everything under 'Low-level...Syncing' in 'git help --all'>
+ ** <Everything under 'Low-level...Internal Helpers' in 'git help --all'>
+ ** <Everything under 'External commands' in 'git help --all'>
* Commands that might be affected, but who cares?
- * merge-file
- * merge-index
- * gitk?
+ ** merge-file
+ ** merge-index
+ ** gitk?
-=== Behavior classes ===
+== Behavior classes ==
From the above there are a few classes of behavior:
@@ -573,18 +580,19 @@ From the above there are a few classes of behavior:
Commands in this class generally behave like the "restrict" class,
except that:
- (1) they will ignore the sparse specification and write files with
- conflicts to the working tree (thus temporarily expanding the
- sparse specification to include such files.)
- (2) they are grouped with commands which move to a new commit, since
- they often create a commit and then move to it, even though we
- know there are many exceptions to moving to the new commit. (For
- example, the user may rebase a commit that becomes empty, or have
- a cherry-pick which conflicts, or a user could run `merge
- --no-commit`, and we also view `apply --index` kind of like `am
- --no-commit`.) As such, these commands can make changes to index
- files outside the sparse specification, though they'll mark such
- files with SKIP_WORKTREE.
+
+ (1) they will ignore the sparse specification and write files with
+ conflicts to the working tree (thus temporarily expanding the
+ sparse specification to include such files.)
+ (2) they are grouped with commands which move to a new commit, since
+ they often create a commit and then move to it, even though we
+ know there are many exceptions to moving to the new commit. (For
+ example, the user may rebase a commit that becomes empty, or have
+ a cherry-pick which conflicts, or a user could run `merge
+ --no-commit`, and we also view `apply --index` kind of like `am
+ --no-commit`.) As such, these commands can make changes to index
+ files outside the sparse specification, though they'll mark such
+ files with SKIP_WORKTREE.
* "restrict also specially applied to untracked files"
@@ -609,37 +617,39 @@ From the above there are a few classes of behavior:
specification.
-=== Subcommand-dependent defaults ===
+== Subcommand-dependent defaults ==
Note that we have different defaults depending on the command for the
desired behavior:
* Commands defaulting to "restrict":
- * diff-files
- * diff (without --cached or REVISION arguments)
- * grep (without --cached or REVISION arguments)
- * switch
- * checkout (the switch-like half)
- * reset (<commit>)
-
- * restore
- * checkout (the restore-like half)
- * checkout-index
- * reset (with pathspec)
+
+ ** diff-files
+ ** diff (without --cached or REVISION arguments)
+ ** grep (without --cached or REVISION arguments)
+ ** switch
+ ** checkout (the switch-like half)
+ ** reset (<commit>)
+
+ ** restore
+ ** checkout (the restore-like half)
+ ** checkout-index
+ ** reset (with pathspec)
This behavior makes sense; these interact with the working tree.
* Commands defaulting to "restrict modulo conflicts":
- * merge
- * rebase
- * cherry-pick
- * revert
- * am
- * apply --index (which is kind of like an `am --no-commit`)
+ ** merge
+ ** rebase
+ ** cherry-pick
+ ** revert
+
+ ** am
+ ** apply --index (which is kind of like an `am --no-commit`)
- * read-tree (especially with -m or -u; is kind of like a --no-commit merge)
- * reset (<tree-ish>, due to similarity to read-tree)
+ ** read-tree (especially with -m or -u; is kind of like a --no-commit merge)
+ ** reset (<tree-ish>, due to similarity to read-tree)
These also interact with the working tree, but require slightly
different behavior either so that (a) conflicts can be resolved or (b)
@@ -648,16 +658,17 @@ desired behavior :
(See also the "Known bugs" section below regarding `am` and `apply`)
* Commands defaulting to "no restrict":
- * archive
- * bundle
- * commit
- * format-patch
- * fast-export
- * fast-import
- * commit-tree
- * stash
- * apply (without `--index`)
+ ** archive
+ ** bundle
+ ** commit
+ ** format-patch
+ ** fast-export
+ ** fast-import
+ ** commit-tree
+
+ ** stash
+ ** apply (without `--index`)
These have completely different defaults and perhaps deserve the most
detailed explanation:
@@ -679,53 +690,59 @@ desired behavior :
sparse specification then we'll lose changes from the user.
* Commands defaulting to "restrict also specially applied to untracked files":
- * add
- * rm
- * mv
- * update-index
- * status
- * clean (?)
-
- Our original implementation for the first three of these commands was
- "no restrict", but it had some severe usability issues:
- * `git add <somefile>` if honored and outside the sparse
- specification, can result in the file randomly disappearing later
- when some subsequent command is run (since various commands
- automatically clean up unmodified files outside the sparse
- specification).
- * `git rm '*.jpg'` could very negatively surprise users if it deletes
- files outside the range of the user's interest.
- * `git mv` has similar surprises when moving into or out of the cone,
- so best to restrict by default
-
- So, we switched `add` and `rm` to default to "restrict", which made
- usability problems much less severe and less frequent, but we still got
- complaints because commands like:
- git add <file-outside-sparse-specification>
- git rm <file-outside-sparse-specification>
- would silently do nothing. We should instead print an error in those
- cases to get usability right.
-
- update-index needs to be updated to match, and status and maybe clean
- also need to be updated to specially handle untracked paths.
-
- There may be a difference in here between behavior A and behavior B in
- terms of verboseness of errors or additional warnings.
+
+ ** add
+ ** rm
+ ** mv
+ ** update-index
+ ** status
+ ** clean (?)
+
+....
+ Our original implementation for the first three of these commands was
+ "no restrict", but it had some severe usability issues:
+
+ * `git add <somefile>` if honored and outside the sparse
+ specification, can result in the file randomly disappearing later
+ when some subsequent command is run (since various commands
+ automatically clean up unmodified files outside the sparse
+ specification).
+ * `git rm '*.jpg'` could very negatively surprise users if it deletes
+ files outside the range of the user's interest.
+ * `git mv` has similar surprises when moving into or out of the cone,
+ so best to restrict by default
+
+ So, we switched `add` and `rm` to default to "restrict", which made
+ usability problems much less severe and less frequent, but we still got
+ complaints because commands like:
+
+ git add <file-outside-sparse-specification>
+ git rm <file-outside-sparse-specification>
+
+ would silently do nothing. We should instead print an error in those
+ cases to get usability right.
+
+ update-index needs to be updated to match, and status and maybe clean
+ also need to be updated to specially handle untracked paths.
+
+ There may be a difference in here between behavior A and behavior B in
+ terms of verboseness of errors or additional warnings.
+....
* Commands falling under "restrict or no restrict dependent upon behavior
A vs. behavior B"
- * diff (with --cached or REVISION arguments)
- * grep (with --cached or REVISION arguments)
- * show (when given commit arguments)
- * blame (only matters when one or more -C flags passed)
- * and annotate
- * log
- * and variants: shortlog, gitk, show-branch, whatchanged, rev-list
- * ls-files
- * diff-index
- * diff-tree
- * ls-tree
+ ** diff (with --cached or REVISION arguments)
+ ** grep (with --cached or REVISION arguments)
+ ** show (when given commit arguments)
+ ** blame (only matters when one or more -C flags passed)
+ *** and annotate
+ ** log
+ *** and variants: shortlog, gitk, show-branch, whatchanged, rev-list
+ ** ls-files
+ ** diff-index
+ ** diff-tree
+ ** ls-tree
For now, we default to behavior B for these, which want a default of
"no restrict".
@@ -749,7 +766,7 @@ desired behavior :
implemented.
-=== Sparse specification vs. sparsity patterns ===
+== Sparse specification vs. sparsity patterns ==
In a well-behaved situation, the sparse specification is given directly
by the $GIT_DIR/info/sparse-checkout file. However, it can transiently
@@ -821,45 +838,48 @@ under behavior B index operations are lumped with history and tend to
operate full-tree.
-=== Implementation Questions ===
-
- * Do the options --scope={sparse,all} sound good to others? Are there better
- options?
- * Names in use, or appearing in patches, or previously suggested:
- * --sparse/--dense
- * --ignore-skip-worktree-bits
- * --ignore-skip-worktree-entries
- * --ignore-sparsity
- * --[no-]restrict-to-sparse-paths
- * --full-tree/--sparse-tree
- * --[no-]restrict
- * --scope={sparse,all}
- * --focus/--unfocus
- * --limit/--unlimited
- * Rationale making me lean slightly towards --scope={sparse,all}:
- * We want a name that works for many commands, so we need a name that
+== Implementation Questions ==
+
+ * Do the options --scope={sparse,all} sound good to others? Are there better options?
+
+ ** Names in use, or appearing in patches, or previously suggested:
+
+ *** --sparse/--dense
+ *** --ignore-skip-worktree-bits
+ *** --ignore-skip-worktree-entries
+ *** --ignore-sparsity
+ *** --[no-]restrict-to-sparse-paths
+ *** --full-tree/--sparse-tree
+ *** --[no-]restrict
+ *** --scope={sparse,all}
+ *** --focus/--unfocus
+ *** --limit/--unlimited
+
+ ** Rationale making me lean slightly towards --scope={sparse,all}:
+
+ *** We want a name that works for many commands, so we need a name that
does not conflict
- * We know that we have more than two possible usecases, so it is best
+ *** We know that we have more than two possible usecases, so it is best
to avoid a flag that appears to be binary.
- * --scope={sparse,all} isn't overly long and seems relatively
+ *** --scope={sparse,all} isn't overly long and seems relatively
explanatory
- * `--sparse`, as used in add/rm/mv, is totally backwards for
+ *** `--sparse`, as used in add/rm/mv, is totally backwards for
grep/log/etc. Changing the meaning of `--sparse` for these
commands would fix the backwardness, but possibly break existing
scripts. Using a new name pairing would allow us to treat
`--sparse` in these commands as a deprecated alias.
- * There is a different `--sparse`/`--dense` pair for commands using
+ *** There is a different `--sparse`/`--dense` pair for commands using
revision machinery, so using that naming might cause confusion
- * There is also a `--sparse` in both pack-objects and show-branch, which
+ *** There is also a `--sparse` in both pack-objects and show-branch, which
don't conflict but do suggest that `--sparse` is overloaded
- * The name --ignore-skip-worktree-bits is a double negative, is
+ *** The name --ignore-skip-worktree-bits is a double negative, is
quite a mouthful, refers to an implementation detail that many
users may not be familiar with, and we'd need a negation for it
which would probably be even more ridiculously long. (But we
can make --ignore-skip-worktree-bits a deprecated alias for
--no-restrict.)
- * If a config option is added (sparse.scope?) what should the values and
+ ** If a config option is added (sparse.scope?) what should the values and
description be? "sparse" (behavior A), "worktree-sparse-history-dense"
(behavior B), "dense" (behavior C)? There's a risk of confusion,
because even for Behaviors A and B we want some commands to be
@@ -868,19 +888,20 @@ operate full-tree.
the primary difference we are focusing is just the history-querying
commands (log/diff/grep). Previous config suggestion here: [13]
- * Is `--no-expand` a good alias for ls-files's `--sparse` option?
+ ** Is `--no-expand` a good alias for ls-files's `--sparse` option?
(`--sparse` does not map to either `--scope=sparse` or `--scope=all`,
because in non-cone mode it does nothing and in cone-mode it shows the
sparse directory entries which are technically outside the sparse
specification)
- * Under Behavior A:
- * Does ls-files' `--no-expand` override the default `--scope=all`, or
+ ** Under Behavior A:
+
+ *** Does ls-files' `--no-expand` override the default `--scope=all`, or
does it need an extra flag?
- * Does ls-files' `-t` option imply `--scope=all`?
- * Does update-index's `--[no-]skip-worktree` option imply `--scope=all`?
+ *** Does ls-files' `-t` option imply `--scope=all`?
+ *** Does update-index's `--[no-]skip-worktree` option imply `--scope=all`?
- * sparse-checkout: once behavior A is fully implemented, should we take
+ ** sparse-checkout: once behavior A is fully implemented, should we take
an interim measure to ease people into switching the default? Namely,
if folks are not already in a sparse checkout, then require
`sparse-checkout init/set` to take a
@@ -892,7 +913,7 @@ operate full-tree.
is seamless for them.
-=== Implementation Goals/Plans ===
+== Implementation Goals/Plans ==
* Get buy-in on this document in general.
@@ -910,25 +931,26 @@ operate full-tree.
request that they not trigger this bug." flag
* Flags & Config
- * Make `--sparse` in add/rm/mv a deprecated alias for `--scope=all`
- * Make `--ignore-skip-worktree-bits` in checkout-index/checkout/restore
+
+ ** Make `--sparse` in add/rm/mv a deprecated alias for `--scope=all`
+ ** Make `--ignore-skip-worktree-bits` in checkout-index/checkout/restore
a deprecated aliases for `--scope=all`
- * Create config option (sparse.scope?), tie it to the "Cliff notes"
+ ** Create config option (sparse.scope?), tie it to the "Cliff notes"
overview
- * Add --scope=sparse (and --scope=all) flag to each of the history querying
+ ** Add --scope=sparse (and --scope=all) flag to each of the history querying
commands. IMPORTANT: make sure diff machinery changes don't mess with
format-patch, fast-export, etc.
-=== Known bugs ===
+== Known bugs ==
This list used to be a lot longer (see e.g. [1,2,3,4,5,6,7,8,9]), but we've
been working on it.
-0. Behavior A is not well supported in Git. (Behavior B didn't used to
+1. Behavior A is not well supported in Git. (Behavior B didn't used to
be either, but was the easier of the two to implement.)
-1. am and apply:
+2. am and apply:
apply, without `--index` or `--cached`, relies on files being present
in the working copy, and also writes to them unconditionally. As
@@ -948,7 +970,7 @@ been working on it.
files and then complain that those vivified files would be
overwritten by merge.
-2. reset --hard:
+3. reset --hard:
reset --hard provides confusing error message (works correctly, but
misleads the user into believing it didn't):
@@ -971,13 +993,13 @@ been working on it.
`git reset --hard` DID remove addme from the index and the working tree, contrary
to the error message, but in line with how reset --hard should behave.
-3. read-tree
+4. read-tree
`read-tree` doesn't apply the 'SKIP_WORKTREE' bit to *any* of the
entries it reads into the index, resulting in all your files suddenly
appearing to be "deleted".
-4. Checkout, restore:
+5. Checkout, restore:
These command do not handle path & revision arguments appropriately:
@@ -1030,7 +1052,7 @@ been working on it.
S tracked
H tracked-but-maybe-skipped
-5. checkout and restore --staged, continued:
+6. checkout and restore --staged, continued:
These commands do not correctly scope operations to the sparse
specification, and make it worse by not setting important SKIP_WORKTREE
@@ -1046,56 +1068,82 @@ been working on it.
the sparse specification, but then it will be important to set the
SKIP_WORKTREE bits appropriately.
-6. Performance issues; see:
- https://lore.kernel.org/git/CABPp-BEkJQoKZsQGCYioyga_uoDQ6iBeW+FKr8JhyuuTMK1RDw@mail.gmail.com/
+7. Performance issues; see:
+
+ https://lore.kernel.org/git/CABPp-BEkJQoKZsQGCYioyga_uoDQ6iBeW+FKr8JhyuuTMK1RDw@mail.gmail.com/
-=== Reference Emails ===
+== Reference Emails ==
Emails that detail various bugs we've had in sparse-checkout:
-[1] (Original descriptions of behavior A & behavior B)
- https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
-[2] (Fix stash applications in sparse checkouts; bugs from behavioral differences)
- https://lore.kernel.org/git/ccfedc7140dbf63ba26a15f93bd3885180b26517.1606861519.git.gitgitgadget@gmail.com/
-[3] (Present-despite-skipped entries)
- https://lore.kernel.org/git/11d46a399d26c913787b704d2b7169cafc28d639.1642175983.git.gitgitgadget@gmail.com/
-[4] (Clone --no-checkout interaction)
- https://lore.kernel.org/git/pull.801.v2.git.git.1591324899170.gitgitgadget@gmail.com/ (clone --no-checkout)
-[5] (The need for update_sparsity() and avoiding `read-tree -mu HEAD`)
- https://lore.kernel.org/git/3a1f084641eb47515b5a41ed4409a36128913309.1585270142.git.gitgitgadget@gmail.com/
-[6] (SKIP_WORKTREE is advisory, not mandatory)
- https://lore.kernel.org/git/844306c3e86ef67591cc086decb2b760e7d710a3.1585270142.git.gitgitgadget@gmail.com/
-[7] (`worktree add` should copy sparsity settings from current worktree)
- https://lore.kernel.org/git/c51cb3714e7b1d2f8c9370fe87eca9984ff4859f.1644269584.git.gitgitgadget@gmail.com/
-[8] (Avoid negative surprises in add, rm, and mv)
- https://lore.kernel.org/git/cover.1617914011.git.matheus.bernardino@usp.br/
- https://lore.kernel.org/git/pull.1018.v4.git.1632497954.gitgitgadget@gmail.com/
-[9] (Move from out-of-cone to in-cone)
- https://lore.kernel.org/git/20220630023737.473690-6-shaoxuan.yuan02@gmail.com/
- https://lore.kernel.org/git/20220630023737.473690-4-shaoxuan.yuan02@gmail.com/
-[10] (Unnecessarily downloading objects outside sparse specification)
- https://lore.kernel.org/git/CAOLTT8QfwOi9yx_qZZgyGa8iL8kHWutEED7ok_jxwTcYT_hf9Q@mail.gmail.com/
-
-[11] (Stolee's comments on high-level usecases)
- https://lore.kernel.org/git/1a1e33f6-3514-9afc-0a28-5a6b85bd8014@gmail.com/
+[1] (Original descriptions of behavior A & behavior B):
+
+https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
+
+[2] (Fix stash applications in sparse checkouts; bugs from behavioral differences):
+
+https://lore.kernel.org/git/ccfedc7140dbf63ba26a15f93bd3885180b26517.1606861519.git.gitgitgadget@gmail.com/
+
+[3] (Present-despite-skipped entries):
+
+https://lore.kernel.org/git/11d46a399d26c913787b704d2b7169cafc28d639.1642175983.git.gitgitgadget@gmail.com/
+
+[4] (Clone --no-checkout interaction):
+
+https://lore.kernel.org/git/pull.801.v2.git.git.1591324899170.gitgitgadget@gmail.com/ (clone --no-checkout)
+
+[5] (The need for update_sparsity() and avoiding `read-tree -mu HEAD`):
+
+https://lore.kernel.org/git/3a1f084641eb47515b5a41ed4409a36128913309.1585270142.git.gitgitgadget@gmail.com/
+
+[6] (SKIP_WORKTREE is advisory, not mandatory):
+
+https://lore.kernel.org/git/844306c3e86ef67591cc086decb2b760e7d710a3.1585270142.git.gitgitgadget@gmail.com/
+
+[7] (`worktree add` should copy sparsity settings from current worktree):
+
+https://lore.kernel.org/git/c51cb3714e7b1d2f8c9370fe87eca9984ff4859f.1644269584.git.gitgitgadget@gmail.com/
+
+[8] (Avoid negative surprises in add, rm, and mv):
+
+ * https://lore.kernel.org/git/cover.1617914011.git.matheus.bernardino@usp.br/
+ * https://lore.kernel.org/git/pull.1018.v4.git.1632497954.gitgitgadget@gmail.com/
+
+[9] (Move from out-of-cone to in-cone):
+
+ * https://lore.kernel.org/git/20220630023737.473690-6-shaoxuan.yuan02@gmail.com/
+ * https://lore.kernel.org/git/20220630023737.473690-4-shaoxuan.yuan02@gmail.com/
+
+[10] (Unnecessarily downloading objects outside sparse specification):
+
+https://lore.kernel.org/git/CAOLTT8QfwOi9yx_qZZgyGa8iL8kHWutEED7ok_jxwTcYT_hf9Q@mail.gmail.com/
+
+[11] (Stolee's comments on high-level usecases):
+
+https://lore.kernel.org/git/1a1e33f6-3514-9afc-0a28-5a6b85bd8014@gmail.com/
[12] Others commenting on eventually switching default to behavior A:
+
* https://lore.kernel.org/git/xmqqh719pcoo.fsf@gitster.g/
* https://lore.kernel.org/git/xmqqzgeqw0sy.fsf@gitster.g/
* https://lore.kernel.org/git/a86af661-cf58-a4e5-0214-a67d3a794d7e@github.com/
-[13] Previous config name suggestion and description
- * https://lore.kernel.org/git/CABPp-BE6zW0nJSStcVU=_DoDBnPgLqOR8pkTXK3dW11=T01OhA@mail.gmail.com/
+[13] Previous config name suggestion and description:
+
+ https://lore.kernel.org/git/CABPp-BE6zW0nJSStcVU=_DoDBnPgLqOR8pkTXK3dW11=T01OhA@mail.gmail.com/
[14] Tangential issue: switch to cone mode as default sparse specification mechanism:
- https://lore.kernel.org/git/a1b68fd6126eb341ef3637bb93fedad4309b36d0.1650594746.git.gitgitgadget@gmail.com/
+
+https://lore.kernel.org/git/a1b68fd6126eb341ef3637bb93fedad4309b36d0.1650594746.git.gitgitgadget@gmail.com/
[15] Lengthy email on grep behavior, covering what should be searched:
- * https://lore.kernel.org/git/CABPp-BGVO3QdbfE84uF_3QDF0-y2iHHh6G5FAFzNRfeRitkuHw@mail.gmail.com/
+
+https://lore.kernel.org/git/CABPp-BGVO3QdbfE84uF_3QDF0-y2iHHh6G5FAFzNRfeRitkuHw@mail.gmail.com/
[16] Email explaining sparsity patterns vs. SKIP_WORKTREE and history operations,
search for the parenthetical comment starting "We do not check".
- https://lore.kernel.org/git/CABPp-BFsCPPNOZ92JQRJeGyNd0e-TCW-LcLyr0i_+VSQJP+GCg@mail.gmail.com/
+
+https://lore.kernel.org/git/CABPp-BFsCPPNOZ92JQRJeGyNd0e-TCW-LcLyr0i_+VSQJP+GCg@mail.gmail.com/
[17] https://lore.kernel.org/git/20220207190320.2960362-1-jonathantanmy@google.com/
diff --git a/Documentation/technical/unambiguous-types.adoc b/Documentation/technical/unambiguous-types.adoc
new file mode 100644
index 0000000000..658a5b578e
--- /dev/null
+++ b/Documentation/technical/unambiguous-types.adoc
@@ -0,0 +1,229 @@
+= Unambiguous types
+
+Most of these mappings are obvious, but there are some nuances and gotchas with
+Rust FFI (Foreign Function Interface).
+
+This document defines clear, one-to-one mappings between primitive types in C,
+Rust (and possibly other languages in the future). Its purpose is to eliminate
+ambiguity in type widths, signedness, and binary representation across
+platforms and languages.
+
+For Git, the only header required to use these unambiguous types in C is
+`git-compat-util.h`.
+
+== Boolean types
+[cols="1,1", options="header"]
+|===
+| C Type | Rust Type
+| bool^1^ | bool
+|===
+
+== Integer types
+
+In C, `<stdint.h>` (or an equivalent) must be included.
+
+[cols="1,1", options="header"]
+|===
+| C Type | Rust Type
+| uint8_t | u8
+| uint16_t | u16
+| uint32_t | u32
+| uint64_t | u64
+
+| int8_t | i8
+| int16_t | i16
+| int32_t | i32
+| int64_t | i64
+|===
+
+== Floating-point types
+
+Rust requires IEEE-754 semantics.
+In C, that is typically true, but not guaranteed by the standard.
+
+[cols="1,1", options="header"]
+|===
+| C Type | Rust Type
+| float^2^ | f32
+| double^2^ | f64
+|===
+
+== Size types
+
+These types represent pointer-sized integers and are typically defined in
+`<stddef.h>` or an equivalent header.
+
+Size types should be used any time pointer arithmetic is performed, e.g. when
+indexing an array or describing the number of elements in memory.
+
+[cols="1,1", options="header"]
+|===
+| C Type | Rust Type
+| size_t^3^ | usize
+| ptrdiff_t^4^ | isize
+|===
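
As a rough illustration of the integer and size mappings above, the following sketch pins the width assumptions down at compile time and uses only fixed-width and size types in an FFI-facing signature. The function name `sum_u16` and its C prototype are hypothetical and not part of Git.

```rust
use std::mem::size_of;

// Compile-time checks: the build fails if any width assumption is wrong.
const _: () = assert!(size_of::<u8>() == 1 && size_of::<i8>() == 1);
const _: () = assert!(size_of::<u16>() == 2 && size_of::<i16>() == 2);
const _: () = assert!(size_of::<u32>() == 4 && size_of::<i32>() == 4);
const _: () = assert!(size_of::<u64>() == 8 && size_of::<i64>() == 8);
// usize and isize are pointer-sized, mirroring size_t and ptrdiff_t.
const _: () = assert!(size_of::<usize>() == size_of::<*const u8>());
const _: () = assert!(size_of::<isize>() == size_of::<usize>());

/// Hypothetical FFI-facing function. The matching C prototype would be:
///   uint32_t sum_u16(const uint16_t *values, size_t len);
///
/// # Safety
/// `values` must be NULL or point to `len` readable `u16` elements.
pub unsafe extern "C" fn sum_u16(values: *const u16, len: usize) -> u32 {
    if values.is_null() {
        return 0;
    }
    // SAFETY: the caller guarantees `values` points to `len` elements.
    let slice = unsafe { std::slice::from_raw_parts(values, len) };
    // Widen each element to u32 and add with explicit wrapping semantics.
    slice
        .iter()
        .fold(0u32, |acc, &v| acc.wrapping_add(u32::from(v)))
}
```

Because every type in the signature has a fixed or pointer-sized width on both sides, neither side of the boundary has to guess.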
+
+== Character types
+
+This is where C and Rust don't have a clean one-to-one mapping. A C `char` is
+an 8-bit type whose signedness is implementation-defined (plain `char` is
+distinct from both `signed char` and `unsigned char`), which causes problems
+with e.g. `make DEVELOPER=1`. Rust's `char` type is a 32-bit type that holds a
+Unicode scalar value. Because a C `char` is the same width as `u8`, a `char`
+that describes bytes in memory should be converted to `u8`. If a C `char` is
+not describing bytes, then it should be converted to a more accurate
+unambiguous type.
+
+While you could specify `char` in the C code and `u8` in Rust code, it's not as
+clear what the appropriate type is, but it would work across the FFI boundary.
+However, the bigger problem comes from code generation tools like cbindgen and
+bindgen. When cbindgen sees `u8` in Rust it will generate `uint8_t` on the C
+side, which will cause "differ in signedness" warnings/errors. Similarly, if
+bindgen sees `char` on the C side it will generate `std::ffi::c_char`, which
+has its own problems.
+
+=== Notes
+^1^ This is only true if stdbool.h (or equivalent) is used. +
+^2^ C does not enforce IEEE-754 compatibility, but Rust expects it. If the
+platform/arch for C does not follow IEEE-754 then this equivalence does not
+hold. Also, it's assumed that `float` is 32 bits and `double` is 64, but
+there may be a strange platform/arch where even this isn't true. +
+^3^ C also defines uintptr_t, but this should not be used in Git. +
+^4^ C also defines ssize_t and intptr_t, but these should not be used in Git. +
+
+== Problems with std::ffi::c_* types in Rust
+TL;DR: They're not guaranteed to match C types for all possible C
+compilers/platforms/architectures.
+
+Only a few of Rust's C FFI types are considered safe and semantically clear to
+use: +
+
+* `c_void`
+* `CStr`
+* `CString`
+
+Even then, they should be used sparingly, and only where the semantics match
+exactly.
+
+The std::os::raw::c_* types (which are deprecated) directly inherit the problems of
+core::ffi, which changes over time and seems to make a best guess at the
+correct definition for a given platform/target. This probably isn't a problem
+for all platforms that Rust supports currently, but can anyone say that Rust
+got it right for all C compilers of all platforms/targets?
+
+On top of all of that we're targeting an older version of Rust which doesn't
+have the latest mappings.
+
+To give an example, the definition of c_long has changed between Rust versions:
+footnote:[https://doc.rust-lang.org/1.63.0/src/core/ffi/mod.rs.html#175-189[c_long in 1.63.0]]
+footnote:[https://doc.rust-lang.org/1.89.0/src/core/ffi/primitives.rs.html#135-151[c_long in 1.89.0]]
+
+=== Rust version 1.63.0
+
+[source]
+----
+mod c_long_definition {
+ cfg_if! {
+ if #[cfg(all(target_pointer_width = "64", not(windows)))] {
+ pub type c_long = i64;
+ pub type NonZero_c_long = crate::num::NonZeroI64;
+ pub type c_ulong = u64;
+ pub type NonZero_c_ulong = crate::num::NonZeroU64;
+ } else {
+ // The minimal size of `long` in the C standard is 32 bits
+ pub type c_long = i32;
+ pub type NonZero_c_long = crate::num::NonZeroI32;
+ pub type c_ulong = u32;
+ pub type NonZero_c_ulong = crate::num::NonZeroU32;
+ }
+ }
+}
+----
+
+=== Rust version 1.89.0
+
+[source]
+----
+mod c_long_definition {
+ crate::cfg_select! {
+ any(
+ all(target_pointer_width = "64", not(windows)),
+ // wasm32 Linux ABI uses 64-bit long
+ all(target_arch = "wasm32", target_os = "linux")
+ ) => {
+ pub(super) type c_long = i64;
+ pub(super) type c_ulong = u64;
+ }
+ _ => {
+ // The minimal size of `long` in the C standard is 32 bits
+ pub(super) type c_long = i32;
+ pub(super) type c_ulong = u32;
+ }
+ }
+}
+----
+
+Even for the cases where C types are correctly mapped to Rust types via
+std::ffi::c_* there are still problems. Let's take c_char for example. On some
+platforms it's u8, on others it's i8.
+
+=== Subtraction underflow in debug mode
+
+The following code will panic in debug on platforms that define c_char as u8,
+but won't if it's an i8.
+
+[source]
+----
+let mut x: std::ffi::c_char = 0;
+x -= 1;
+----
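
A minimal sketch of the same difference with the signedness spelled out, so the result no longer depends on the target. The helper names `dec_u8` and `dec_i8` are made up for this example; `checked_sub` reports the underflow as `None` instead of panicking.

```rust
/// Decrement an unsigned byte; `None` signals that the value would underflow.
pub fn dec_u8(x: u8) -> Option<u8> {
    x.checked_sub(1)
}

/// Decrement a signed byte; 0 - 1 is representable, so this yields Some(-1).
pub fn dec_i8(x: i8) -> Option<i8> {
    x.checked_sub(1)
}
```

With an explicit `u8` or `i8`, the caller has to decide which behavior is intended instead of inheriting it from the platform's `c_char`.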
+
+=== Inconsistent shift behavior
+
+`x` will be 0xC0 for platforms that use i8, but will be 0x40 where it's u8.
+
+[source]
+----
+let mut x = 0x80_u8 as std::ffi::c_char;
+x >>= 1;
+----
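
The two outcomes can be reproduced on any platform by naming the signedness explicitly. This is only an illustrative sketch; the function names are hypothetical.

```rust
/// Logical shift: the vacated top bit is filled with zero.
pub fn shr1_unsigned(x: u8) -> u8 {
    x >> 1
}

/// Arithmetic shift: the vacated top bit is filled with the sign bit.
pub fn shr1_signed(x: i8) -> i8 {
    x >> 1
}
```

For the bit pattern 0x80, `shr1_unsigned` gives 0x40, while `shr1_signed` applied to the same bits (`i8::MIN`) gives -64, whose bit pattern is 0xC0.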
+
+=== Equality fails to compile on some platforms
+
+The following will not compile on platforms that define c_char as i8, but will
+if it's u8. You can cast x e.g. `assert_eq!(x as u8, b'a');`, but then you get
+a warning on platforms that use u8 and a clean compilation where i8 is used.
+
+[source]
+----
+let mut x: std::ffi::c_char = 0x61;
+assert_eq!(x, b'a');
+----
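
One possible way to sidestep both the compile error and the cast warning, where a `c_char` cannot be avoided, is to reinterpret the single byte through `to_ne_bytes` and `from_ne_bytes`, which exist on both `i8` and `u8`. The helper names below are hypothetical, and `std::os::raw::c_char` is used on the assumption that it is available on the older Rust versions being targeted.

```rust
use std::os::raw::c_char;

/// Reinterpret a `c_char` as a raw byte, whether the alias is i8 or u8.
pub fn c_char_to_byte(c: c_char) -> u8 {
    u8::from_ne_bytes(c.to_ne_bytes())
}

/// Reinterpret a raw byte as a `c_char`, whether the alias is i8 or u8.
pub fn byte_to_c_char(b: u8) -> c_char {
    c_char::from_ne_bytes(b.to_ne_bytes())
}
```

This only papers over the mismatch at one call site; declaring the data as `u8` in the first place remains the simpler option.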
+
+== Enum types
+Rust enum types should not be used as FFI types. Rust enum types are more like
+C union types than C enums. For something like:
+
+[source]
+----
+#[repr(u8)]
+enum Fruit {
+ Apple,
+ Banana,
+ Cherry,
+}
+----
+
+It's easy enough to make sure the Rust enum matches what C would expect, but
+that is not the case for a more complex type like:
+
+[source]
+----
+enum HashResult {
+ SHA1([u8; 20]),
+ SHA256([u8; 32]),
+}
+----
+
+The Rust compiler has to add a discriminant to the enum to distinguish between
+the variants. The width, location, and values of that discriminant are up to
+the Rust compiler and are not ABI stable.
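
One way to avoid depending on the compiler-chosen discriminant is to pass a plain `#[repr(C)]` struct with an explicit tag across the boundary. The following is only a sketch under that assumption: `HashResultFfi`, the `HASH_ALGO_*` constants, and their values are hypothetical and do not come from Git.

```rust
/// Rust-side enum, as in the example above. Not passed across the FFI boundary.
pub enum HashResult {
    SHA1([u8; 20]),
    SHA256([u8; 32]),
}

// Hypothetical tag values shared with the C side.
pub const HASH_ALGO_SHA1: u8 = 1;
pub const HASH_ALGO_SHA256: u8 = 2;

/// FFI-facing representation: an explicit u8 tag followed by a fixed-size
/// buffer large enough for the longest digest. The matching C struct would be
///   struct hash_result_ffi { uint8_t algo; uint8_t bytes[32]; };
#[repr(C)]
pub struct HashResultFfi {
    pub algo: u8,
    pub bytes: [u8; 32],
}

impl From<&HashResult> for HashResultFfi {
    fn from(h: &HashResult) -> Self {
        // Unused trailing bytes stay zeroed for the shorter digest.
        let mut bytes = [0u8; 32];
        let algo = match h {
            HashResult::SHA1(d) => {
                bytes[..20].copy_from_slice(d);
                HASH_ALGO_SHA1
            }
            HashResult::SHA256(d) => {
                bytes.copy_from_slice(d);
                HASH_ALGO_SHA256
            }
        };
        HashResultFfi { algo, bytes }
    }
}
```

Here the width, position, and values of the tag are all chosen explicitly, so the layout is the same on both sides.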