<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/xdiff, branch next</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=next</id>
<link rel='self' href='https://git.shady.money/git/atom?h=next'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2026-04-14T20:34:01Z</updated>
<entry>
<title>Revert "Merge branch 'hn/git-checkout-m-with-stash' into next"</title>
<updated>2026-04-14T20:34:01Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2026-04-14T20:34:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=af818d63126afcfc708b23eb0e5cc66a99274b54'/>
<id>urn:sha1:af818d63126afcfc708b23eb0e5cc66a99274b54</id>
<content type='text'>
This reverts commit b4e5a964fa85a84a9328647486c250706ad6501d, reversing
changes made to 0e6c98f313ae3e75cedad7125c9b4ab3ba1065c2, as the topic
is still getting rerolled.
</content>
</entry>
<entry>
<title>stash: add --label-ours, --label-theirs, --label-base for apply</title>
<updated>2026-04-12T22:11:39Z</updated>
<author>
<name>Harald Nordgren</name>
<email>haraldnordgren@gmail.com</email>
</author>
<published>2026-04-12T11:51:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=09a43fe49ffb441505182d1a5d922f1c88419740'/>
<id>urn:sha1:09a43fe49ffb441505182d1a5d922f1c88419740</id>
<content type='text'>
Allow callers of "git stash apply" to pass custom labels for conflict
markers instead of the default "Updated upstream" and "Stashed changes".
Document the new options and add a test.

Signed-off-by: Harald Nordgren &lt;haraldnordgren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'yc/histogram-hunk-shift-fix'</title>
<updated>2026-03-24T19:31:34Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2026-03-24T19:31:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=231f8100c41fdbdd49f5bca953fff775a79321db'/>
<id>urn:sha1:231f8100c41fdbdd49f5bca953fff775a79321db</id>
<content type='text'>
The final clean-up phase of the diff output could make the result of
the histogram diff algorithm suboptimal; this has been corrected.

* yc/histogram-hunk-shift-fix:
  xdiff: re-diff shifted change groups when using histogram algorithm
</content>
</entry>
<entry>
<title>xdiff: re-diff shifted change groups when using histogram algorithm</title>
<updated>2026-03-03T16:43:05Z</updated>
<author>
<name>Yee Cheng Chin</name>
<email>ychin.git@gmail.com</email>
</author>
<published>2026-03-02T14:54:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e417277ae99687b576e48cb477a7a50241ea0096'/>
<id>urn:sha1:e417277ae99687b576e48cb477a7a50241ea0096</id>
<content type='text'>
After a diff algorithm has been run, the compaction phase
(xdl_change_compact()) shifts and merges change groups to produce a
cleaner output. However, this shifting could create a new matched group
where both sides now have matching lines. This results in a
wrong-looking diff output which contains redundant lines that are the
same in both files.

Fix this by detecting this situation and re-diffing the texts on each
side to find similar lines, using the fall-back Myers diff. Only do
this for histogram diff as it's the only algorithm where this is
relevant. An example and more details follow.

As an example, consider the two files below:

    file1:
        A

        A
        A
        A

        A
        A
        A

    file2:
        A

        A
        x
        A

        A
        A
        A

When using Myers diff, the algorithm finds that only the "x" has been
changed, and produces a final diff result (these are line diffs, but
using word-diff syntax for ease of presentation):

        A A[-A-]{+x+}A AAA

When using histogram diff, the algorithm first discovers the LCS "A
AAA", which it uses as an anchor, then produces an intermediate diff:

        {+A Ax+}A AAA[- AAA-]

This is a longer diff than Myers', but it's still self-consistent.
However, the compaction phase attempts to shift the first file's diff
group upwards (note that this shift crosses the anchor that histogram
had used), leading to the final result for histogram diff:

        [-A AA-]{+A Ax+}A AAA

This is a technically correct patch but looks clearly redundant to a
human as the first 3 lines should not be in the diff.
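
The upward shift in this example can be reproduced with a small sketch.
It is illustrative Python, not the xdl_change_compact() code; the
helper names slide_up() and without() are hypothetical. It slides the
deleted group up while the line just above it equals the group's last
line, and checks that the diff stays a valid patch before and after.

```python
file1 = ["A", "", "A", "A", "A", "", "A", "A", "A"]
file2 = ["A", "", "A", "x", "A", "", "A", "A", "A"]

def slide_up(lines, start, end):
    """Slide the group lines[start:end] upwards as far as it can go.

    The group may move up by one line whenever the line just above it
    equals the last line of the group.
    """
    while start != 0 and lines[start - 1] == lines[end - 1]:
        start -= 1
        end -= 1
    return start, end

def without(lines, start, end):
    """Return lines with the half-open range start..end deleted."""
    return lines[:start] + lines[end:]

# Histogram's intermediate diff: add file2[0:4] at the top, delete file1[5:9].
added = file2[0:4]
before = (5, 9)
after = slide_up(file1, *before)

# Both forms are technically correct patches from file1 to file2.
assert added + without(file1, *before) == file2
assert added + without(file1, *after) == file2
```

Here slide_up() moves the deleted group from lines 5..9 up to lines
0..4, which is the "A AA" group shown in the final histogram result.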

The fix detects that a shift has caused matching to a new group, and
re-diffs the "A AA" and "A Ax" parts, which results in "A A" being
correctly re-marked as unchanged. This creates the now correct
histogram diff:

        A A[-A-]{+x+}A AAA
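
A minimal sketch of the re-diff step on this example follows. It uses
Python's difflib.SequenceMatcher as a stand-in for the fall-back Myers
diff, so it shows the idea only and is not the xdiff implementation;
the function name rediff() is hypothetical.

```python
from difflib import SequenceMatcher

removed = ["A", "", "A", "A"]   # the shifted "A AA" group from file1
added = ["A", "", "A", "x"]     # the "A Ax" group from file2

def rediff(old_group, new_group):
    """Re-diff two change groups and return difflib's opcodes."""
    matcher = SequenceMatcher(None, old_group, new_group, autojunk=False)
    return matcher.get_opcodes()

opcodes = rediff(removed, added)
for tag, old_start, old_end, new_start, new_end in opcodes:
    print(tag, old_group_slice := removed[old_start:old_end],
          added[new_start:new_end])
```

The first opcode marks the leading three lines ("A A") as equal, and
only the last line remains as a change from "A" to "x".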

This issue is not applicable to the Myers diff algorithm as it already
generates a minimal diff, which means a shift cannot result in a smaller
diff output (the default Myers diff in xdiff is not guaranteed to be
minimal for performance reasons, but it typically does a good enough
job).

It's also not applicable to patience diff, because it uses only unique
lines as anchors for its splits, and falls back to Myers diff within
each split. Shifting requires both ends having the same lines, and
therefore cannot cross the unique line boundaries established by the
patience algorithm. In contrast, histogram diff uses non-unique lines
as anchors, and therefore shifting can cross over them.

This issue is rare in a normal repository. The table below lists
repositories (each run with `git log --no-merges -p --histogram -1000`),
showing how many times a re-diff was done and how many times it resulted
in finding matching lines (therefore addressing this issue) with the
fix. In general, fewer than 1% of diffs exhibit this offending
behavior:

| Repo (1k commits)  | Re-diff | Found matching lines |
|--------------------|---------|----------------------|
| llvm-project       |  45     | 11                   |
| vim                | 110     |  9                   |
| git                |  18     |  2                   |
| WebKit             | 168     |  1                   |
| ripgrep            |  22     |  1                   |
| cpython            |  32     |  0                   |
| vscode             |  13     |  0                   |

Signed-off-by: Yee Cheng Chin &lt;ychin.git@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'pw/diff-anchored-optim'</title>
<updated>2026-02-20T19:36:18Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2026-02-20T19:36:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8ceb69f85d0b68f5558c8c72ca104036523644fa'/>
<id>urn:sha1:8ceb69f85d0b68f5558c8c72ca104036523644fa</id>
<content type='text'>
"git diff --anchored=&lt;text&gt;" has been optimized.

* pw/diff-anchored-optim:
  diff --anchored: avoid checking unmatched lines
</content>
</entry>
<entry>
<title>Merge branch 'pw/xdiff-cleanups'</title>
<updated>2026-02-20T19:36:17Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2026-02-20T19:36:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5465d3683a97a950358152925204f16b98739fad'/>
<id>urn:sha1:5465d3683a97a950358152925204f16b98739fad</id>
<content type='text'>
Small clean-up of the xdiff library to remove unnecessary data
duplication.

* pw/xdiff-cleanups:
  xdiff: remove unused data from xdlclass_t
  xdiff: remove "line_hash" field from xrecord_t
</content>
</entry>
<entry>
<title>diff --anchored: avoid checking unmatched lines</title>
<updated>2026-02-12T17:28:49Z</updated>
<author>
<name>Phillip Wood</name>
<email>phillip.wood@dunelm.org.uk</email>
</author>
<published>2026-02-12T15:53:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=dd2a4c0c7acb588abbf3c3a39ca755ce8aeee3b0'/>
<id>urn:sha1:dd2a4c0c7acb588abbf3c3a39ca755ce8aeee3b0</id>
<content type='text'>
For a line to be an anchor it has to appear in each of the files being
diffed exactly once. With that in mind let's delay checking whether
a line is an anchor until we know there is exactly one instance of
the line in each file. As each line is checked at most once, there
is no need to cache the result of is_anchor() and we can drop that
field from the hashmap entries. When diffing 5000 recent commits in
git.git this gives a modest speedup of ~2%. In the (rather extreme)
example below that consists largely of deletions the speedup is ~16%.

    seq 0 10000000 &gt;old
    printf '%s\n' 300000 100000 200000 &gt;new
    git diff --no-index --anchored=300000 old new
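
The reordering can be sketched as follows. This is illustrative Python
under stated assumptions, not the code in this commit: find_anchors()
is a hypothetical helper, and the prefix test stands in for
is_anchor(). The input is a scaled-down version of the example above.

```python
from collections import Counter

def find_anchors(old_lines, new_lines, prefix):
    """Return (anchors, number_of_prefix_checks)."""
    old_counts = Counter(old_lines)
    new_counts = Counter(new_lines)
    checks = 0
    anchors = []
    for line in old_lines:
        # Cheap test first: exactly one instance of the line in each file.
        if old_counts[line] != 1 or new_counts[line] != 1:
            continue
        # Only now run the prefix test (stand-in for is_anchor()).
        checks += 1
        if line.startswith(prefix):
            anchors.append(line)
    return anchors, checks

old = [str(n) for n in range(0, 1001)]
new = ["300", "100", "200"]
anchors, checks = find_anchors(old, new, "300")
print(anchors, checks)
```

Of the 1001 lines in the old file only three are unique in both files,
so the prefix test runs three times, and each line is tested at most
once, which is why no cached result is needed.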

Signed-off-by: Phillip Wood &lt;phillip.wood@dunelm.org.uk&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>xdiff: remove unused data from xdlclass_t</title>
<updated>2026-01-26T16:38:29Z</updated>
<author>
<name>Phillip Wood</name>
<email>phillip.wood@dunelm.org.uk</email>
</author>
<published>2026-01-26T10:48:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5086213bd2f44fdc793fd8a081fd1c40a3267c44'/>
<id>urn:sha1:5086213bd2f44fdc793fd8a081fd1c40a3267c44</id>
<content type='text'>
Prior to commit 6d507bd41a (xdiff: delete fields ha, line, size
in xdlclass_t in favor of an xrecord_t, 2025-09-26) xdlclass_t
carried a copy of all the fields in xrecord_t. That commit embedded
xrecord_t in xdlclass_t to make it easier to change the types of
the fields in xrecord_t. However, commit 6a26019c81 (xdiff: split
xrecord_t.ha into line_hash and minimal_perfect_hash, 2025-11-18)
added the "minimal_perfect_hash" field to xrecord_t which is not
used by xdlclass_t. To avoid wasting space, stop copying the whole
of xrecord_t and just copy the pointer and length that we need to
intern the line. Together with the previous commit this effectively
reverts 6d507bd41a.
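
As a rough illustration of what interning a line needs, the sketch
below maps each distinct line to a small dense integer using only the
line's hash and its text. It is hypothetical Python, not the xdiff
classifier: intern_lines() and the bucket layout are assumptions made
for this example.

```python
def intern_lines(lines):
    """Map each distinct line to a dense integer id, in first-seen order."""
    buckets = {}   # line hash to list of (text, class_id) entries
    ids = []
    next_id = 0
    for text in lines:
        bucket = buckets.setdefault(hash(text), [])
        for known_text, class_id in bucket:
            # Same hash is not enough; compare the text itself.
            if known_text == text:
                ids.append(class_id)
                break
        else:
            # First time this line is seen: allocate a new class.
            bucket.append((text, next_id))
            ids.append(next_id)
            next_id += 1
    return ids

print(intern_lines(["A", "", "A", "x"]))
```

Each class entry carries only what is needed to recognise a line
again, which is the point of copying just the pointer and length.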

Signed-off-by: Phillip Wood &lt;phillip.wood@dunelm.org.uk&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>xdiff: remove "line_hash" field from xrecord_t</title>
<updated>2026-01-26T16:38:29Z</updated>
<author>
<name>Phillip Wood</name>
<email>phillip.wood@dunelm.org.uk</email>
</author>
<published>2026-01-26T10:48:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c27afcbfd0f440f410758432e2fe11a16fb2b360'/>
<id>urn:sha1:c27afcbfd0f440f410758432e2fe11a16fb2b360</id>
<content type='text'>
Prior to commit 6a26019c81 (xdiff: split xrecord_t.ha into line_hash
and minimal_perfect_hash, 2025-11-18) the "ha" field of xrecord_t
initially held the "line_hash" value and once the line had been
interned that field was updated to hold the "minimal_perfect_hash". The
"line_hash" is only used to intern the line so there is no point in
storing it after all the input lines have been interned.

Removing the "line_hash" field from xrecord_t and storing it in
xdlclass_t where it is actually used makes it clearer that it is a
temporary value and it should not be used once we've calculated the
"minimal_perfect_hash". This also reduces the size of xrecord_t by 25%
on 64-bit platforms and 40% on 32-bit platforms. While the struct is
small we create one instance per input line so any saving is welcome.
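
The percentages can be reproduced with simple arithmetic under one
possible layout: a pointer, a size, a 64-bit line hash and a
pointer-sized index, with no padding. These field widths are an
assumption made for illustration and are not taken from the xdiff
headers.

```python
# Hypothetical field widths in bytes; assumed, not read from xdiff.
LAYOUTS = {
    "64-bit": {"ptr": 8, "size": 8, "line_hash": 8, "minimal_perfect_hash": 8},
    "32-bit": {"ptr": 4, "size": 4, "line_hash": 8, "minimal_perfect_hash": 4},
}

def saving_percent(fields, dropped="line_hash"):
    """Percentage of the struct size saved by dropping one field."""
    before = sum(fields.values())
    return 100 * fields[dropped] // before

for name, fields in LAYOUTS.items():
    print(name, saving_percent(fields))
```

Under these assumptions the struct shrinks from 32 to 24 bytes on
64-bit platforms and from 20 to 12 bytes on 32-bit platforms.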

Signed-off-by: Phillip Wood &lt;phillip.wood@dunelm.org.uk&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'yc/xdiff-patience-optim'</title>
<updated>2025-12-08T22:54:55Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-12-08T22:54:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7fc0b33b5d9ced74819132a094528154f83f4a6a'/>
<id>urn:sha1:7fc0b33b5d9ced74819132a094528154f83f4a6a</id>
<content type='text'>
The way patience diff finds LCS has been optimized.

* yc/xdiff-patience-optim:
  xdiff: optimize patience diff's LCS search
</content>
</entry>
</feed>
