<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/diffcore-pickaxe.c, branch v2.13.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.13.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.13.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2017-03-24T20:07:35Z</updated>
<entry>
<title>Merge branch 'js/regexec-buf'</title>
<updated>2017-03-24T20:07:35Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2017-03-24T20:07:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0efeb5ca12f070ca3a8cec48761af863e9a7f6fe'/>
<id>urn:sha1:0efeb5ca12f070ca3a8cec48761af863e9a7f6fe</id>
<content type='text'>
Fix for potential segv introduced in v2.11.0 and later (also
v2.10.2).

* js/regexec-buf:
  pickaxe: fix segfault with '-S&lt;...&gt; --pickaxe-regex'
</content>
</entry>
<entry>
<title>pickaxe: fix segfault with '-S&lt;...&gt; --pickaxe-regex'</title>
<updated>2017-03-18T19:22:33Z</updated>
<author>
<name>SZEDER Gábor</name>
<email>szeder.dev@gmail.com</email>
</author>
<published>2017-03-18T18:24:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f53c5de29cec68e3294a008052251631eaffcf07'/>
<id>urn:sha1:f53c5de29cec68e3294a008052251631eaffcf07</id>
<content type='text'>
'git {log,diff,...} -S&lt;...&gt; --pickaxe-regex' can segfault as a result
of out-of-bounds memory reads.

diffcore-pickaxe.c:contains() looks for all matches of the given regex
in a buffer in a loop, advancing the buffer pointer to the end of the
last match in each iteration.  When we switched to REG_STARTEND in
b7d36ffca (regex: use regexec_buf(), 2016-09-21), we started passing
the size of that buffer to the regexp engine, too.  Unfortunately,
this buffer size is never updated on subsequent iterations, and as the
buffer pointer advances on each iteration, this "bufptr+bufsize"
points past the end of the buffer.  This results in a segmentation
fault if that memory can't be accessed.  In the case of 'git log' it can
also result in erroneously listed commits, if the memory past the end
of buffer is accessible and happens to contain data matching the
regex.

Reduce the buffer size on each iteration as the buffer pointer is
advanced, thus maintaining the correct end of buffer location.
Furthermore, make sure that the buffer pointer is not dereferenced in
the control flow statements when we already reached the end of the
buffer.
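
As a rough illustration only (a Python model, not the C code in
diffcore-pickaxe.c), the loop can be sketched as below.  The names
count_matches(), arena and shrink_size are made up for this sketch:
"arena" stands in for the buffer plus whatever memory happens to
follow it, and shrink_size=False imitates the stale buffer size.

```python
import re

def count_matches(arena, start, size, pattern, shrink_size):
    """Count matches in arena[start:start+size], advancing past each one."""
    pos = start
    remaining = size
    count = 0
    # Stop at the end of the region, and never step past the arena itself.
    while remaining != 0 and pos != len(arena):
        match = pattern.search(arena, pos, pos + remaining)
        if match is None:
            break
        count += 1
        consumed = match.end() - pos
        if match.start() == match.end():
            consumed += 1  # step over an empty match to guarantee progress
        pos = min(pos + consumed, len(arena))
        if shrink_size:
            # The fix: the end of the region stays put as pos advances.
            remaining = max(remaining - consumed, 0)
    return count

buf = b"foo bar foo"        # the real buffer, 11 bytes
arena = buf + b" foo foo"   # hypothetical bytes that happen to follow it
pattern = re.compile(rb"foo")

fixed = count_matches(arena, 0, len(buf), pattern, shrink_size=True)
stale = count_matches(arena, 0, len(buf), pattern, shrink_size=False)
print(fixed, stale)
```

With the stale size the search window slides past the end of buf and
picks up the two extra "foo"s, which mirrors the erroneously listed
commits described above.  Python clamps the window to the arena, so
the model cannot reproduce the segfault itself.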

The new test is flaky: I've never seen it fail on my Linux box even
without the fix, but this is expected according to db5dfa3 (regex:
-G&lt;pattern&gt; feeds a non NUL-terminated string to regexec() and fails,
2016-09-21).  However, it did fail on Travis CI with the first (and
incomplete) version of the fix, and based on that commit message I
would expect the new test without the fix to fail most of the time on
Windows.

Signed-off-by: SZEDER Gábor &lt;szeder.dev@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'js/regexec-buf'</title>
<updated>2016-09-26T23:09:19Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2016-09-26T23:09:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6a67695268562f67babdb7d5195c8a43cc4015fa'/>
<id>urn:sha1:6a67695268562f67babdb7d5195c8a43cc4015fa</id>
<content type='text'>
Some codepaths in "git diff" used regexec(3) on a buffer that was
mmap(2)ed, which may not have a terminating NUL, leading to a read
beyond the end of the mapped region.  This was fixed by introducing
a regexec_buf() helper that takes a &lt;ptr,len&gt; pair with REG_STARTEND
extension.

* js/regexec-buf:
  regex: use regexec_buf()
  regex: add regexec_buf() that can work on a non NUL-terminated string
  regex: -G&lt;pattern&gt; feeds a non NUL-terminated string to regexec() and fails
</content>
</entry>
<entry>
<title>regex: use regexec_buf()</title>
<updated>2016-09-21T20:56:15Z</updated>
<author>
<name>Johannes Schindelin</name>
<email>johannes.schindelin@gmx.de</email>
</author>
<published>2016-09-21T18:24:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b7d36ffca02c23f545d6e098d78180e6e72dfd8d'/>
<id>urn:sha1:b7d36ffca02c23f545d6e098d78180e6e72dfd8d</id>
<content type='text'>
The new regexec_buf() function operates on buffers with an explicitly
specified length, rather than NUL-terminated strings.

We need to use this function whenever the buffer we want to pass to
regexec(3) may have been mmap(2)ed (and is hence not NUL-terminated).

Note: the original motivation for this patch was to fix a bug where
`git diff -G &lt;regex&gt;` would crash. This patch converts more callers,
though, some of which allocated memory to construct NUL-terminated strings,
or worse, modified buffers to temporarily insert NULs while calling
regexec(3).  By converting them to use regexec_buf(), the code has
become much cleaner.

Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diffcore-pickaxe: support case insensitive match on non-ascii</title>
<updated>2016-07-01T19:44:57Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2016-06-25T05:22:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b51a9c1479645b3e0c7d5156d027a97a4bb87977'/>
<id>urn:sha1:b51a9c1479645b3e0c7d5156d027a97a4bb87977</id>
<content type='text'>
Similar to the "grep -F -i" case, we can't use kws on icase search
outside ascii range, so we quote the string and pass it to regcomp as
a basic regexp and let regex engine deal with case sensitivity.
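
A minimal sketch of the same idea in Python, for illustration only:
re.escape() stands in for the quoting step and re.IGNORECASE for the
case-insensitive compile.  Python's re module is not a POSIX basic
regexp engine, and compile_icase_literal() is a made-up name.

```python
import re

def compile_icase_literal(needle):
    """Compile needle as a literal, case-insensitive pattern."""
    # re.escape() quotes regex metacharacters so they match themselves.
    return re.compile(re.escape(needle), re.IGNORECASE)

pattern = compile_icase_literal("Ärger.")
hit = pattern.search("viel ÄRGER. heute")    # differs only in case
miss = pattern.search("viel ÄRGERN heute")   # "." must match literally
print(hit is not None, miss is None)
```

Because the needle is quoted first, the "." is matched literally
rather than as "any character", while the engine handles case folding
outside the ASCII range.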

The new test is put in t7812 instead of t4209-log-pickaxe because
lib-gettext.sh might cause problems elsewhere, probably.

Noticed-by: Plamen Totev &lt;plamen.totev@abv.bg&gt;
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diffcore-pickaxe: Add regcomp_or_die()</title>
<updated>2016-07-01T19:44:57Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2016-06-25T05:22:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3d5b23a36218b0417a056fa7b5e6d25d595ccaf2'/>
<id>urn:sha1:3d5b23a36218b0417a056fa7b5e6d25d595ccaf2</id>
<content type='text'>
There's another regcomp code block coming in this function that needs
the same error handling. This function can help avoid duplicating
error handling code.
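
The shape of such a helper, sketched in Python rather than C and
under the assumption that "die" means "print a message and exit":
compile_or_die() and its message text are hypothetical and are not
taken from the patch.

```python
import re

def compile_or_die(needle, flags=0):
    """Compile needle, or exit with a message if it is not a valid regex."""
    try:
        return re.compile(needle, flags)
    except re.error as err:
        # SystemExit with a string prints it to stderr and exits non-zero.
        raise SystemExit("invalid regex: %s" % err)

pattern = compile_or_die("fo+")
print(pattern.search("a foo b") is not None)
```

Every caller then gets the same error handling from a single call.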

Helped-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>react to errors in xdi_diff</title>
<updated>2015-09-28T21:57:10Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2015-09-24T23:12:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3efb988098858bf6b974b1e673a190f9d2965d1d'/>
<id>urn:sha1:3efb988098858bf6b974b1e673a190f9d2965d1d</id>
<content type='text'>
When we call into xdiff to perform a diff, we generally lose
the return code completely. Typically by ignoring the return
of our xdi_diff wrapper, but sometimes we even propagate
that return value up and then ignore it later.  This can
lead to us silently producing incorrect diffs (e.g., "git
log" might produce no output at all, not even a diff header,
for a content-level diff).

In practice this does not happen very often, because the
typical reason for xdiff to report failure is that its
malloc() failed (it uses straight malloc, and not our
xmalloc wrapper).  But it could also happen when xdiff
triggers one of our callbacks, which returns an error (e.g.,
outf() in builtin/rerere.c tries to report a write failure
in this way). And the next patch also plans to add more
failure modes.

Let's notice an error return from xdiff and react
appropriately. In most of the diff.c code, we can simply
die(), which matches the surrounding code (e.g., that is
what we do if we fail to load a file for diffing in the
first place). This is not that elegant, but we are probably
better off dying to let the user know there was a problem,
rather than simply generating bogus output.

We could also just die() directly in xdi_diff, but the
callers typically have a bit more context, and can provide a
better message (and if we do later decide to pass errors up,
we're one step closer to doing so).

There is one interesting case, which is in diff_grep(). Here
if we cannot generate the diff, there is nothing to match,
and we silently return "no hits". This is actually what the
existing code does already, but we make it a little more
explicit.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pickaxe: simplify kwset loop in contains()</title>
<updated>2014-03-24T22:13:17Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2014-03-22T17:16:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e4aab50475c1d384e016c6ac6548635f1ddcd3fe'/>
<id>urn:sha1:e4aab50475c1d384e016c6ac6548635f1ddcd3fe</id>
<content type='text'>
Inlining the variable "found" actually makes the code shorter and
easier to read.

Signed-off-by: Rene Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pickaxe: call strlen only when necessary in diffcore_pickaxe_count()</title>
<updated>2014-03-24T22:13:17Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2014-03-22T17:15:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=542b2aa2c9afba71febb248edb3083ff9cacf065'/>
<id>urn:sha1:542b2aa2c9afba71febb248edb3083ff9cacf065</id>
<content type='text'>
We need to determine the search term's length only when fixed-string
matching is used; regular expression compilation takes a NUL-terminated
string directly.  Only call strlen() in the former case.

Signed-off-by: Rene Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pickaxe: move pickaxe() after pickaxe_match()</title>
<updated>2014-03-24T22:13:10Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2014-03-22T17:15:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3753bd1f69d3b17e0369390fb8960ae3b7855b70'/>
<id>urn:sha1:3753bd1f69d3b17e0369390fb8960ae3b7855b70</id>
<content type='text'>
pickaxe() calls pickaxe_match(); moving the definition of the former
after the latter allows us to do without an explicit function
declaration.

Signed-off-by: Rene Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
