git/xdiff-interface.c, branch v2.32.2

hash: provide per-algorithm null OIDs

2021-04-27T07:31:39Z

Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field.

Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant.

Signed-off-by: brian m. carlson
Signed-off-by: Junio C Hamano
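The shape of this change can be sketched as follows. This is a minimal, simplified illustration rather than git's actual definitions: the field layout, the ALGO_* constants, current_algo, select_algo() and is_null_oid() are assumptions made for the example. Only the ideas of a null_oid member in struct hash_algo and a wrapper function replacing the old constant come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_RAWSZ 32 /* large enough for a SHA-256 hash */

/* Hypothetical algorithm identifiers for this sketch. */
enum { ALGO_SHA1 = 1, ALGO_SHA256 = 2 };

struct object_id {
	unsigned char hash[MAX_RAWSZ];
	int algo; /* which hash algorithm this ID belongs to */
};

struct hash_algo {
	const char *name;
	size_t rawsz;                     /* hash length in bytes */
	const struct object_id *null_oid; /* all-zeros ID tagged with this algorithm */
};

/* One null OID per algorithm: same zero bytes, different algo field. */
static const struct object_id null_oid_sha1 = { .algo = ALGO_SHA1 };
static const struct object_id null_oid_sha256 = { .algo = ALGO_SHA256 };

static const struct hash_algo hash_algos[] = {
	{ "sha1", 20, &null_oid_sha1 },
	{ "sha256", 32, &null_oid_sha256 },
};

/* Stand-in for the repository's currently selected algorithm. */
static const struct hash_algo *current_algo = &hash_algos[0];

/* Hypothetical helper to switch the selected algorithm by table index. */
static void select_algo(size_t i)
{
	assert(i < sizeof(hash_algos) / sizeof(hash_algos[0]));
	current_algo = &hash_algos[i];
}

/* Wrapper that callers use instead of one shared global constant. */
static const struct object_id *null_oid(void)
{
	return current_algo->null_oid;
}

/* Returns 1 if the hash bytes are all zero (unused tail bytes assumed zero). */
static int is_null_oid(const struct object_id *oid)
{
	static const unsigned char zeros[MAX_RAWSZ];
	return !memcmp(oid->hash, zeros, MAX_RAWSZ);
}
```

With this layout, a caller that previously used the shared constant calls null_oid() and gets an ID whose algo field matches the active algorithm.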

xdiff: avoid computing non-zero offset from NULL pointer

2020-01-29T07:13:25Z

As with the previous commit, clang-11's UBSan complains about computing offsets from a NULL pointer, causing some tests to fail. In this case, though, we're actually computing a non-zero offset, which is even more dubious. From t7810:

  xdiff-interface.c:268:14: runtime error: applying non-zero offset 1 to null pointer
  ...
  not ok 131 - grep -p with userdiff

The problem is our parsing of the funcname config. We count the number of lines in the string, allocate an array, and then loop over our allocated entries, parsing each line and moving our cursor to one past the trailing newline for the next iteration. But the final line will not generally have a trailing newline (since it's a config value), and hence we go to one past NULL.

In practice this is OK, since our loop should terminate before we look at the value. But even computing such an invalid pointer technically violates the standard. We can fix it by leaving the pointer at NULL if we're at the end, rather than one-past.

And while we're thinking about it, we can also document the invariant by asserting that our initial line-count matches the second-pass of parsing.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
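The two-pass parse and the fix described above might look roughly like the sketch below. It is not the code from xdiff-interface.c: split_lines() and xmalloc_or_die() are hypothetical names, and the real function compiles each line as a regular expression instead of copying it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocation helper: abort instead of returning NULL. */
static void *xmalloc_or_die(size_t n)
{
	void *p = malloc(n);
	if (!p)
		abort();
	return p;
}

/*
 * Split a newline-separated config value into freshly allocated lines.
 * Stores the line count in *nr_out; the caller frees the result.
 */
static char **split_lines(const char *value, size_t *nr_out)
{
	size_t i, nr = 1;
	char **lines;

	/* First pass: one line, plus one more per newline. */
	for (i = 0; value[i]; i++)
		if (value[i] == '\n')
			nr++;

	lines = xmalloc_or_die(nr * sizeof(*lines));

	/* Second pass: copy each line and advance the cursor. */
	for (i = 0; i < nr; i++) {
		const char *ep;
		size_t len;

		/* The first pass promised that this line exists. */
		assert(value);

		ep = strchr(value, '\n');
		len = ep ? (size_t)(ep - value) : strlen(value);
		lines[i] = xmalloc_or_die(len + 1);
		memcpy(lines[i], value, len);
		lines[i][len] = '\0';

		/* Leave the cursor at NULL on the last line; never form NULL + 1. */
		value = ep ? ep + 1 : NULL;
	}

	/* Both passes must agree: nothing is left after the counted lines. */
	assert(!value);

	*nr_out = nr;
	return lines;
}
```

The final assertion holds because the last counted line is, by construction, the one with no newline after it.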

avoid computing zero offsets from NULL pointer

2020-01-29T07:12:48Z

The Undefined Behavior Sanitizer in clang-11 seems to have learned a new trick: it complains about computing offsets from a NULL pointer, even if that offset is 0. This causes numerous test failures. For example, from t1090:

  unpack-trees.c:1355:41: runtime error: applying zero offset to null pointer
  ...
  not ok 6 - in partial clone, sparse checkout only fetches needed blobs

The code in question looks like this:

  struct cache_entry **cache_end = cache + nr;
  ...
  while (cache != cache_end)

and we sometimes pass in a NULL and 0 for "cache" and "nr". This is conceptually fine, as "cache_end" would be equal to "cache" in this case, and we wouldn't enter the loop at all. But computing even a zero offset violates the C standard. And given the fact that UBSan is noticing this behavior, this might be a potential problem spot if the compiler starts making unexpected assumptions based on undefined behavior.

So let's just avoid it, which is pretty easy. In some cases we can just switch to iterating with a numeric index (as we do in sequencer.c here). In other cases (like the cache_end one) the use of an end pointer is more natural; we can keep that by just explicitly checking for the NULL/0 case when assigning the end pointer.

Note that there are two ways you can write this latter case, checking for the pointer:

  cache_end = cache ? cache + nr : cache;

or the size:

  cache_end = nr ? cache + nr : cache;

For the case of a NULL/0 ptr/len combo, they are equivalent. But writing it the second way (as this patch does) has the property that if somebody were to incorrectly pass a NULL pointer with a non-zero length, we'd continue to notice and segfault, rather than silently pretending the length was zero.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
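Both remedies mentioned above can be shown on a small stand-alone example. The functions sum_range() and sum_indexed() are hypothetical and exist only to illustrate the two patterns; they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* End-pointer style: items may be NULL only when nr is 0. */
static long sum_range(const int *items, size_t nr)
{
	/* Guard on the length so that NULL + 0 is never computed. */
	const int *end = nr ? items + nr : items;
	long total = 0;

	while (items != end)
		total += *items++;
	return total;
}

/* Index style: no end pointer is formed at all. */
static long sum_indexed(const int *items, size_t nr)
{
	long total = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		total += items[i];
	return total;
}
```

Because the guard tests nr rather than items, a buggy call with a NULL pointer and a non-zero length still reaches the dereference and fails loudly, which matches the reasoning given for choosing the second form.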

completion: add more parameter value completion

2019-02-20T20:31:56Z

This adds value completion for a couple more parameters. To make it easier to maintain these hard-coded lists, add a comment at the original list/code to remind people to update git-completion.bash too.

Signed-off-by: Nguyễn Thái Ngọc Duy
Signed-off-by: Junio C Hamano

Merge branch 'jk/xdiff-interface'

2018-11-13T13:37:27Z

The interface into "xdiff" library used to discover the offset and size of a generated patch hunk by first formatting it into the textual hunk header "@@ -n,m +k,l @@" and then parsing the numbers out. A new interface has been introduced to allow callers a more direct access to them.

* jk/xdiff-interface:
  xdiff-interface: drop parse_hunk_header()
  range-diff: use a hunk callback
  diff: convert --check to use a hunk callback
  combine-diff: use an xdiff hunk callback
  diff: use hunk callback for word-diff
  diff: discard hunk headers for patch-ids earlier
  diff: avoid generating unused hunk header lines
  xdiff-interface: provide a separate consume callback for hunks
  xdiff: provide a separate emit callback for hunks

xdiff-interface: drop parse_hunk_header()

2018-11-05T04:14:35Z

This function was used only for parsing the hunk headers generated by xdiff. Now that we can use hunk callbacks to get that information directly, it has outlived its usefulness.

Note to anyone who wants to resurrect it: the "len" parameter was totally unused, meaning that the function could read past the end of the "line" array. In practice this never happened, because we only used it to parse xdiff's generated header lines. But it would be dangerous to use it for other cases without fixing this defect.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano

diff: use hunk callback for word-diff

2018-11-05T04:14:35Z

Our word-diff does not look at the -/+ lines generated by xdiff at all (because they are not real lines to show the user, but just the tokenized words split into lines). Instead we use the line numbers from the hunk headers to index our own data structure.

As a result, our xdi_diff_outf() callback throws away all lines except hunk headers. We can instead use a hunk callback, which has two benefits:

  1. We don't have to re-parse the generated hunk header line, but can use the passed parameters directly.

  2. By setting our line callback to NULL, we can tell xdiff-interface that it does not even need to bother generating the other lines, saving a small amount of work.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano

diff: avoid generating unused hunk header lines

2018-11-05T04:14:35Z

Some callers of xdi_diff_outf() do not look at the generated hunk header lines at all. Plugging in a no-op hunk callback tells xdiff not to even bother formatting them.

This patch introduces a stock no-op callback and uses it with a few callers whose line callbacks explicitly ignore hunk headers (because they look only for +/- lines).

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
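A rough sketch of the idea, under simplifying assumptions: the callback types, discard_hunk(), count_added() and the feed_sample_hunk() driver below are all hypothetical stand-ins, not the signatures used by xdi_diff_outf(). The driver only formats a textual header when no hunk callback is supplied, so passing the no-op callback means no header string is ever built.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*hunk_fn)(void *data, long old_begin, long old_nr,
			long new_begin, long new_nr);
typedef void (*line_fn)(void *data, const char *line, size_t len);

/* Stock no-op callback for callers that ignore hunk headers. */
static void discard_hunk(void *data, long old_begin, long old_nr,
			 long new_begin, long new_nr)
{
	(void)data;
	(void)old_begin;
	(void)old_nr;
	(void)new_begin;
	(void)new_nr;
}

/* Line callback that only cares about added lines. */
static void count_added(void *data, const char *line, size_t len)
{
	if (len && line[0] == '+')
		++*(unsigned long *)data;
}

/* Hypothetical driver: replays one fixed hunk through the callbacks. */
static void feed_sample_hunk(hunk_fn on_hunk, line_fn on_line, void *data)
{
	static const char *const body[] = {
		" context\n", "-old\n", "+new\n", "+extra\n",
	};
	size_t i;

	assert(on_line);

	if (on_hunk) {
		/* Structured path: no header string is formatted. */
		on_hunk(data, 1, 2, 1, 3);
	} else {
		/* Fallback path: the header arrives as an ordinary line. */
		const char *header = "@@ -1,2 +1,3 @@\n";
		on_line(data, header, strlen(header));
	}

	for (i = 0; i < sizeof(body) / sizeof(body[0]); i++)
		on_line(data, body[i], strlen(body[i]));
}
```

A caller that only counts added lines would pass discard_hunk together with count_added, and its line callback never has to recognize and skip an "@@" line.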

xdiff-interface: provide a separate consume callback for hunks

2018-11-02T11:43:02Z

The previous commit taught xdiff to optionally provide the hunk header data to a specialized callback. But most users of xdiff actually use our more convenient xdi_diff_outf() helper, which ensures that our callbacks are always fed whole lines.

Let's plumb the special hunk-callback through this interface, too. It will follow the same rule as xdiff when the hunk callback is NULL (i.e., continue to pass a stringified hunk header to the line callback). Since we add NULL to each caller, there should be no behavior change yet.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano

xdiff: provide a separate emit callback for hunks

2018-11-02T11:43:02Z

The xdiff library always emits hunk header lines to our callbacks as formatted strings like "@@ -a,b +c,d @@\n". This is convenient if we're going to output a diff, but less so if we actually need to compute using those numbers, which requires re-parsing the line.

In preparation for moving away from this, let's teach xdiff a new callback function which gets the broken-out hunk information. To help callers that don't want to use this new callback, if it's NULL we'll continue to format the hunk header into a string.

Note that this patch renames the "outf" callback to "out_line", as well. This isn't strictly necessary, but helps in two ways:

  1. Now that there are two callbacks, it's nice to use more descriptive names.

  2. Many callers did not zero the emit_callback_data struct, and needed to be modified to set ecb.out_hunk to NULL. By changing the name of the existing struct member, that guarantees that any new callers from in-flight topics will break the build and be examined manually.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
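The fallback rule can be sketched like this. The struct and function names (emit_cb, emit_hunk, record_hunk, copy_line, hunk_range) are hypothetical, and the signatures are simplified; only the member names out_hunk and out_line come from the commit message. The header formatting here always prints both counts, which is a simplification for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the emit callback struct. */
struct emit_cb {
	void *priv;
	/* Receives the broken-out hunk numbers; may be NULL. */
	int (*out_hunk)(void *priv, long old_begin, long old_nr,
			long new_begin, long new_nr);
	/* Receives formatted text lines. */
	int (*out_line)(void *priv, const char *line, size_t len);
};

/* Emit one hunk header, preferring the structured callback. */
static int emit_hunk(struct emit_cb *ecb, long old_begin, long old_nr,
		     long new_begin, long new_nr)
{
	char buf[128];
	int len;

	if (ecb->out_hunk)
		return ecb->out_hunk(ecb->priv, old_begin, old_nr,
				     new_begin, new_nr);

	/* Fallback for callers that only understand text lines. */
	len = snprintf(buf, sizeof(buf), "@@ -%ld,%ld +%ld,%ld @@\n",
		       old_begin, old_nr, new_begin, new_nr);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;
	return ecb->out_line(ecb->priv, buf, (size_t)len);
}

/* Example consumers used to exercise both paths. */
struct hunk_range {
	long old_begin, old_nr, new_begin, new_nr;
};

static int record_hunk(void *priv, long old_begin, long old_nr,
		       long new_begin, long new_nr)
{
	struct hunk_range *r = priv;

	r->old_begin = old_begin;
	r->old_nr = old_nr;
	r->new_begin = new_begin;
	r->new_nr = new_nr;
	return 0;
}

static int copy_line(void *priv, const char *line, size_t len)
{
	char *dst = priv; /* caller guarantees room for len + 1 bytes */

	memcpy(dst, line, len);
	dst[len] = '\0';
	return 0;
}
```

A caller that sets out_hunk gets the four numbers directly, while a caller that leaves it NULL keeps receiving the familiar "@@ ... @@" line through out_line and needs no changes beyond the rename.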