<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/xdiff-interface.c, branch v2.32.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.32.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.32.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2021-04-27T07:31:39Z</updated>
<entry>
<title>hash: provide per-algorithm null OIDs</title>
<updated>2021-04-27T07:31:39Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2021-04-26T01:02:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=14228447c9ce664a4e9c31ba10344ec5e4ea4ba5'/>
<id>urn:sha1:14228447c9ce664a4e9c31ba10344ec5e4ea4ba5</id>
<content type='text'>
Up until recently, object IDs did not have an algorithm member, only a
hash.  Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms.  Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.

Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>xdiff: avoid computing non-zero offset from NULL pointer</title>
<updated>2020-01-29T07:13:25Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-01-25T05:39:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3cd309c16f3b9370a90d2f4eb69002473020412a'/>
<id>urn:sha1:3cd309c16f3b9370a90d2f4eb69002473020412a</id>
<content type='text'>
As with the previous commit, clang-11's UBSan complains about computing
offsets from a NULL pointer, causing some tests to fail. In this case,
though, we're actually computing a non-zero offset, which is even more
dubious. From t7810:

  xdiff-interface.c:268:14: runtime error: applying non-zero offset 1 to null pointer
  ...
  not ok 131 - grep -p with userdiff

The problem is our parsing of the funcname config. We count the number
of lines in the string, allocate an array, and then loop over our
allocated entries, parsing each line and moving our cursor to one past
the trailing newline for the next iteration.

But the final line will not generally have a trailing newline (since
it's a config value), and hence we go to one past NULL. In practice this
is OK, since our loop should terminate before we look at the value. But
even computing such an invalid pointer technically violates the
standard.

We can fix it by leaving the pointer at NULL if we're at the end, rather
than one-past. And while we're thinking about it, we can also document
the variant by asserting that our initial line-count matches the
second-pass of parsing.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>avoid computing zero offsets from NULL pointer</title>
<updated>2020-01-29T07:12:48Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-01-29T05:46:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d20bc01a51f7899deb7a04bdd7c2495789185297'/>
<id>urn:sha1:d20bc01a51f7899deb7a04bdd7c2495789185297</id>
<content type='text'>
The Undefined Behavior Sanitizer in clang-11 seems to have learned a new
trick: it complains about computing offsets from a NULL pointer, even if
that offset is 0. This causes numerous test failures. For example, from
t1090:

  unpack-trees.c:1355:41: runtime error: applying zero offset to null pointer
  ...
  not ok 6 - in partial clone, sparse checkout only fetches needed blobs

The code in question looks like this:

  struct cache_entry **cache_end = cache + nr;
  ...
  while (cache != cache_end)

and we sometimes pass in a NULL and 0 for "cache" and "nr". This is
conceptually fine, as "cache_end" would be equal to "cache" in this
case, and we wouldn't enter the loop at all. But computing even a zero
offset violates the C standard. And given the fact that UBSan is
noticing this behavior, this might be a potential problem spot if the
compiler starts making unexpected assumptions based on undefined
behavior.

So let's just avoid it, which is pretty easy. In some cases we can just
switch to iterating with a numeric index (as we do in sequencer.c here).
In other cases (like the cache_end one) the use of an end pointer is
more natural; we can keep that by just explicitly checking for the
NULL/0 case when assigning the end pointer.

Note that there are two ways you can write this latter case, checking
for the pointer:

  cache_end = cache ? cache + nr : cache;

or the size:

  cache_end = nr ? cache + nr : cache;

For the case of a NULL/0 ptr/len combo, they are equivalent. But writing
it the second way (as this patch does) has the property that if somebody
were to incorrectly pass a NULL pointer with a non-zero length, we'd
continue to notice and segfault, rather than silently pretending the
length was zero.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>completion: add more parameter value completion</title>
<updated>2019-02-20T20:31:56Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2019-02-16T11:24:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5a59a2301f6ec9bcf1b101edb9ca33beb465842f'/>
<id>urn:sha1:5a59a2301f6ec9bcf1b101edb9ca33beb465842f</id>
<content type='text'>
This adds value completion for a couple more paramters. To make it
easier to maintain these hard coded lists, add a comment at the original
list/code to remind people to update git-completion.bash too.

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/xdiff-interface'</title>
<updated>2018-11-13T13:37:27Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-11-13T13:37:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=39d23dfa409e51844ddda99599927092c8300f12'/>
<id>urn:sha1:39d23dfa409e51844ddda99599927092c8300f12</id>
<content type='text'>
The interface into "xdiff" library used to discover the offset and
size of a generated patch hunk by first formatting it into the
textual hunk header "@@ -n,m +k,l @@" and then parsing the numbers
out.  A new interface has been introduced to allow callers a more
direct access to them.

* jk/xdiff-interface:
  xdiff-interface: drop parse_hunk_header()
  range-diff: use a hunk callback
  diff: convert --check to use a hunk callback
  combine-diff: use an xdiff hunk callback
  diff: use hunk callback for word-diff
  diff: discard hunk headers for patch-ids earlier
  diff: avoid generating unused hunk header lines
  xdiff-interface: provide a separate consume callback for hunks
  xdiff: provide a separate emit callback for hunks
</content>
</entry>
<entry>
<title>xdiff-interface: drop parse_hunk_header()</title>
<updated>2018-11-05T04:14:35Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2018-11-02T06:40:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5eade0746e1daf659a9559d804068f9f31614625'/>
<id>urn:sha1:5eade0746e1daf659a9559d804068f9f31614625</id>
<content type='text'>
This function was used only for parsing the hunk headers generated by
xdiff. Now that we can use hunk callbacks to get that information
directly, it has outlived its usefulness.

Note to anyone who wants to resurrect it: the "len" parameter was
totally unused, meaning that the function could read past the end of the
"line" array. In practice this never happened, because we only used it
to parse xdiff's generated header lines. But it would be dangerous to
use it for other cases without fixing this defect.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diff: use hunk callback for word-diff</title>
<updated>2018-11-05T04:14:35Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2018-11-02T06:37:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7c61e25fbf1a4a65208be1197940a383f220a1b7'/>
<id>urn:sha1:7c61e25fbf1a4a65208be1197940a383f220a1b7</id>
<content type='text'>
Our word-diff does not look at the -/+ lines generated by xdiff at all
(because they are not real lines to show the user, but just the
tokenized words split into lines). Instead we use the line numbers from
the hunk headers to index our own data structure.

As a result, our xdi_diff_outf() callback throws away all lines except
hunk headers. We can instead use a hunk callback, which has two
benefits:

  1. We don't have to re-parse the generated hunk header line, but can
     use the passed parameters directly.

  2. By setting our line callback to NULL, we can tell xdiff-interface
     that it does not even need to bother generating the other lines,
     saving a small amount of work.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diff: avoid generating unused hunk header lines</title>
<updated>2018-11-05T04:14:35Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2018-11-02T06:36:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3b40a090fd4e441e88897dfa96f50039952ed45b'/>
<id>urn:sha1:3b40a090fd4e441e88897dfa96f50039952ed45b</id>
<content type='text'>
Some callers of xdi_diff_outf() do not look at the generated hunk header
lines at all. By plugging in a no-op hunk callback, this tells xdiff not
to even bother formatting them.

This patch introduces a stock no-op callback and uses it with a few
callers whose line callbacks explicitly ignore hunk headers (because
they look only for +/- lines).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>xdiff-interface: provide a separate consume callback for hunks</title>
<updated>2018-11-02T11:43:02Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2018-11-02T06:35:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9346d6d14dddc7989ba879839d58f6c2426cffbb'/>
<id>urn:sha1:9346d6d14dddc7989ba879839d58f6c2426cffbb</id>
<content type='text'>
The previous commit taught xdiff to optionally provide the hunk header
data to a specialized callback. But most users of xdiff actually use our
more convenient xdi_diff_outf() helper, which ensures that our callbacks
are always fed whole lines.

Let's plumb the special hunk-callback through this interface, too. It
will follow the same rule as xdiff when the hunk callback is NULL (i.e.,
continue to pass a stringified hunk header to the line callback). Since
we add NULL to each caller, there should be no behavior change yet.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>xdiff: provide a separate emit callback for hunks</title>
<updated>2018-11-02T11:43:02Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2018-11-02T06:35:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=611e42a5980a3a9f8bb3b1b49c1abde63c7a191e'/>
<id>urn:sha1:611e42a5980a3a9f8bb3b1b49c1abde63c7a191e</id>
<content type='text'>
The xdiff library always emits hunk header lines to our callbacks as
formatted strings like "@@ -a,b +c,d @@\n". This is convenient if we're
going to output a diff, but less so if we actually need to compute using
those numbers, which requires re-parsing the line.

In preparation for moving away from this, let's teach xdiff a new
callback function which gets the broken-out hunk information. To help
callers that don't want to use this new callback, if it's NULL we'll
continue to format the hunk header into a string.

Note that this function renames the "outf" callback to "out_line", as
well. This isn't strictly necessary, but helps in two ways:

  1. Now that there are two callbacks, it's nice to use more descriptive
     names.

  2. Many callers did not zero the emit_callback_data struct, and needed
     to be modified to set ecb.out_hunk to NULL. By changing the name of
     the existing struct member, that guarantees that any new callers
     from in-flight topics will break the build and be examined
     manually.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
