<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/combine-diff.c, branch v2.35.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.35.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.35.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2021-07-13T23:52:50Z</updated>
<entry>
<title>Merge branch 'ab/pickaxe-pcre2'</title>
<updated>2021-07-13T23:52:50Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-07-13T23:52:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4da281e84d87b099ba8d0a255534dc1251be968e'/>
<id>urn:sha1:4da281e84d87b099ba8d0a255534dc1251be968e</id>
<content type='text'>
Rewrite the backend for "diff -G/-S" to use pcre2 engine when
available.

* ab/pickaxe-pcre2: (22 commits)
  xdiff-interface: replace discard_hunk_line() with a flag
  xdiff users: use designated initializers for out_line
  pickaxe -G: don't special-case create/delete
  pickaxe -G: terminate early on matching lines
  xdiff-interface: allow early return from xdiff_emit_line_fn
  xdiff-interface: prepare for allowing early return
  pickaxe -S: slightly optimize contains()
  pickaxe: rename variables in has_changes() for brevity
  pickaxe -S: support content with NULs under --pickaxe-regex
  pickaxe: assert that we must have a needle under -G or -S
  pickaxe: refactor function selection in diffcore-pickaxe()
  perf: add performance test for pickaxe
  pickaxe/style: consolidate declarations and assignments
  diff.h: move pickaxe fields together again
  pickaxe: die when --find-object and --pickaxe-all are combined
  pickaxe: die when -G and --pickaxe-regex are combined
  pickaxe tests: add missing test for --no-pickaxe-regex being an error
  pickaxe tests: test for -G, -S and --find-object incompatibility
  pickaxe tests: add test for "log -S" not being a regex
  pickaxe tests: add test for diffgrep_consume() internals
  ...
</content>
</entry>
<entry>
<title>xdiff-interface: prepare for allowing early return</title>
<updated>2021-05-11T03:47:31Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-04-12T17:15:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a8d5eb6dc0d61625667b0d8155c425d3629baa12'/>
<id>urn:sha1:a8d5eb6dc0d61625667b0d8155c425d3629baa12</id>
<content type='text'>
Change the function prototype of xdiff_emit_line_fn to return an "int"
instead of "void". Change all of those functions to "return 0",
nothing checks those return values yet, and no behavior is being
changed.

In subsequent commits the interface will be changed to allow early
return via this new return value.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hash: provide per-algorithm null OIDs</title>
<updated>2021-04-27T07:31:39Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2021-04-26T01:02:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=14228447c9ce664a4e9c31ba10344ec5e4ea4ba5'/>
<id>urn:sha1:14228447c9ce664a4e9c31ba10344ec5e4ea4ba5</id>
<content type='text'>
Up until recently, object IDs did not have an algorithm member, only a
hash.  Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms.  Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.

Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>use CALLOC_ARRAY</title>
<updated>2021-03-14T00:00:09Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2021-03-13T16:17:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ca56dadb4b65ccaeab809d80db80a312dc00941a'/>
<id>urn:sha1:ca56dadb4b65ccaeab809d80db80a312dc00941a</id>
<content type='text'>
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead.  It shortens the code and infers the
element size automatically.

Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/diff-cc-oidfind-fix'</title>
<updated>2020-10-05T21:01:55Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-10-05T21:01:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7da656f1e083b065400c4ab92405c6332481eb7a'/>
<id>urn:sha1:7da656f1e083b065400c4ab92405c6332481eb7a</id>
<content type='text'>
"log -c --find-object=X" did not work well to find a merge that
involves a change to an object X from only one parent.

* jk/diff-cc-oidfind-fix:
  combine-diff: handle --find-object in multitree code path
</content>
</entry>
<entry>
<title>combine-diff: handle --find-object in multitree code path</title>
<updated>2020-09-30T20:35:24Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-09-30T11:52:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=957876f17d75b5cd0e94b85d3df843bb1e9ae110'/>
<id>urn:sha1:957876f17d75b5cd0e94b85d3df843bb1e9ae110</id>
<content type='text'>
When doing combined diffs, we have two possible code paths:

  - a slower one which independently diffs against each parent, applies
    any filters, and then intersects the resulting paths

  - a faster one which walks all trees simultaneously

When the diff options specify that we must do certain filters, like
pickaxe, then we always use the slow path, since the pickaxe code only
knows how to handle filepairs, not the n-parent entries generated for
combined diffs.

But there are two problems with the slow path:

 1. It's slow. Running:

      git rev-list HEAD | git diff-tree --stdin -r -c

    in git.git takes ~3s on my machine. But adding "--find-object" to
    that increases it to ~6s, even though find-object itself should
    incur only a few extra oid comparisons. On linux.git, it's even
    worse: 35s versus 215s.

 2. It doesn't catch all cases where a particular path is interesting.
    Consider a merge with parent blobs X and Y for a particular path,
    and end result Z. That should be interesting according to "-c",
    because the result doesn't match either parent. And it should be
    interesting even with "--find-object=X", because "X" went away in
    the merge.

    But because we perform each pairwise diff independently, this
    confuses the intersection code. The change from X to Z is still
    interesting according to --find-object. But in the other parent we
    went from Y to Z, so the diff appears empty! That causes the
    intersection code to think that parent didn't change the path, and
    thus it's not interesting for "-c".

This patch fixes both by implementing --find-object for the multitree
code. It's a bit unfortunate that we have to duplicate some logic from
diffcore-pickaxe, but this is the best we can do for now. In an ideal
world, all of the diffcore code would stop thinking about filepairs and
start thinking about n-parent sets, and we could use the multitree walk
with all of it.

Until then, there are some leftover warts:

  - other pickaxe operations, like -S or -G, still suffer from both
    problems. These would be hard to adapt because they rely on having
    a diff_filespec() for each path to look at content. And we'd need to
    define what an n-way "change" means in each case (probably easy for
    "-S", which can compare counts, but not so clear for -G, which is
    about grepping diffs).

  - other options besides --find-object may cause us to use the slow
    pairwise path, in which case we'll go back to producing a different
    (wrong) answer for the X/Y/Z case above.

We may be able to hack around these, but I think the ultimate solution
will be a larger rewrite of the diffcore code. For now, this patch
improves one specific case but leaves the rest.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diff: get rid of redundant 'dense' argument</title>
<updated>2020-09-29T18:54:53Z</updated>
<author>
<name>Sergey Organov</name>
<email>sorganov@gmail.com</email>
</author>
<published>2020-09-29T11:31:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d01141de5ab02cf4a156183ef4dc5ee8bf2638a3'/>
<id>urn:sha1:d01141de5ab02cf4a156183ef4dc5ee8bf2638a3</id>
<content type='text'>
Get rid of 'dense' argument that is redundant for every function that has
'struct rev_info *rev' argument as well, as the value of 'dense' passed is
always taken from 'rev-&gt;dense_combined_merges' field.

The only place where this was not the case is in 'submodule.c' where
'diff_tree_combined_merge()' was called with '1' for 'dense' argument. However,
at that call the 'revs' instance used is local to the function, and we now just
set 'revs-&gt;dense_combined_merges' to 1 in this local instance.

Signed-off-by: Sergey Organov &lt;sorganov@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>oid_array: rename source file from sha1-array</title>
<updated>2020-03-30T17:59:08Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-03-30T14:03:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=fe299ec5ae7b419990bbc3efd4e6bfa3f0773b45'/>
<id>urn:sha1:fe299ec5ae7b419990bbc3efd4e6bfa3f0773b45</id>
<content type='text'>
We renamed the actual data structure in 910650d2f8 (Rename sha1_array to
oid_array, 2017-03-31), but the file is still called sha1-array. Besides
being slightly confusing, it makes it more annoying to grep for leftover
occurrences of "sha1" in various files, because the header is included
in so many places.

Let's complete the transition by renaming the source and header files
(and fixing up a few comment references).

I kept the "-" in the name, as that seems to be our style; cf.
fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10).
We also have oidmap.h and oidset.h without any punctuation, but those
are "struct oidmap" and "struct oidset" in the code. We _could_ make
this "oidarray" to match, but somehow it looks uglier to me because of
the length of "array" (plus it would be a very invasive patch for little
gain).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>combine-diff: replace GIT_SHA1_HEXSZ with the_hash_algo</title>
<updated>2019-08-19T22:04:58Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2019-08-18T20:04:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=976ff7e49d722d1b72effb6d547ec2b00cd69fc0'/>
<id>urn:sha1:976ff7e49d722d1b72effb6d547ec2b00cd69fc0</id>
<content type='text'>
Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'en/combined-all-paths'</title>
<updated>2019-03-07T00:59:54Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-03-07T00:59:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c425d361f5bf46d99cd96d7eac3488ebb2e92b60'/>
<id>urn:sha1:c425d361f5bf46d99cd96d7eac3488ebb2e92b60</id>
<content type='text'>
Output from "diff --cc" did not show the original paths when the
merge involved renames.  A new option adds the paths in the
original trees to the output.

* en/combined-all-paths:
  log,diff-tree: add --combined-all-paths option
</content>
</entry>
</feed>
