<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/grep.h, branch v2.26.1</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.26.1</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.26.1'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2020-01-17T21:52:14Z</updated>
<entry>
<title>grep: replace grep_read_mutex by internal obj read lock</title>
<updated>2020-01-17T21:52:14Z</updated>
<author>
<name>Matheus Tavares</name>
<email>matheus.bernardino@usp.br</email>
</author>
<published>2020-01-16T02:39:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1d1729caebd41b340dd8dd61057f613da4df526c'/>
<id>urn:sha1:1d1729caebd41b340dd8dd61057f613da4df526c</id>
<content type='text'>
git-grep uses 'grep_read_mutex' to protect its calls to object reading
operations. But these have their own internal lock now, which ensures
better performance (allowing parallel access to more regions). So, let's
remove the former and, instead, activate the latter with
enable_obj_read_lock().

Sections that are currently protected by 'grep_read_mutex' but are not
internally protected by the object reading lock should be surrounded by
obj_read_lock() and obj_read_unlock(). These guarantee mutual exclusion
with object reading operations, keeping the current behavior and
avoiding race conditions. Namely, these places are:

  In grep.c:

  - fill_textconv() at fill_textconv_grep().
  - userdiff_get_textconv() at grep_source_1().

  In builtin/grep.c:

  - parse_object_or_die() and the submodule functions at
    grep_submodule().
  - deref_tag() and gitmodules_config_oid() at grep_objects().

If these functions become thread-safe in the future, we might remove
the locking and probably get some speedup.

Note that some of the submodule functions will already be thread-safe
(or close to being thread-safe) with the internal object reading lock.
However, as some of them will require additional modifications to be
removed from the critical section, this will be done in its own patch.
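
As a rough illustration of the pattern described above (an opt-in lock
that callers switch on once and then wrap around code that is not
internally protected), here is a minimal Python sketch. It is not git's
C implementation: the function names mirror the ones mentioned in this
message, but their bodies, the use of a reentrant lock, and the
run_workers() helper are assumptions made only for this example.

```python
import threading

# Hypothetical stand-ins for the C functions named in the message.
# The reentrant lock is an assumption of this sketch.
_obj_read_mutex = threading.RLock()
_obj_read_use_lock = False


def enable_obj_read_lock():
    """Switch the lock on; until then lock/unlock are no-ops."""
    global _obj_read_use_lock
    _obj_read_use_lock = True


def obj_read_lock():
    if _obj_read_use_lock:
        _obj_read_mutex.acquire()


def obj_read_unlock():
    if _obj_read_use_lock:
        _obj_read_mutex.release()


def run_workers(n_threads, n_iters):
    """Hypothetical helper: threads update shared state under the lock."""
    counter = {"value": 0}

    def worker():
        for _ in range(n_iters):
            obj_read_lock()
            try:
                # Critical section: code that is not thread-safe by itself.
                counter["value"] += 1
            finally:
                obj_read_unlock()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    enable_obj_read_lock()  # done once, before any worker starts
    print(run_workers(4, 1000))
```

Single-threaded callers that never call enable_obj_read_lock() pay no
locking cost in this sketch, which is the point of making the lock
opt-in.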

Signed-off-by: Matheus Tavares &lt;matheus.bernardino@usp.br&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'cb/pcre2-chartables-leakfix'</title>
<updated>2019-10-23T05:43:11Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-10-23T05:43:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e0ff2d4c7ec338e30ea5e0340cda7f5fe8a187dc'/>
<id>urn:sha1:e0ff2d4c7ec338e30ea5e0340cda7f5fe8a187dc</id>
<content type='text'>
Leakfix.

* cb/pcre2-chartables-leakfix:
  grep: avoid leak of chartables in PCRE2
  grep: make PCRE2 aware of custom allocator
  grep: make PCRE1 aware of custom allocator
</content>
</entry>
<entry>
<title>grep: avoid leak of chartables in PCRE2</title>
<updated>2019-10-18T01:33:18Z</updated>
<author>
<name>Carlo Marcelo Arenas Belón</name>
<email>carenas@gmail.com</email>
</author>
<published>2019-10-16T12:10:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=10da030ab757c38a01507bf18e2484698000b791'/>
<id>urn:sha1:10da030ab757c38a01507bf18e2484698000b791</id>
<content type='text'>
94da9193a6 ("grep: add support for PCRE v2", 2017-06-01) introduced
a small memory leak visible with valgrind in t7813.

Complete the creation of a PCRE2 specific variable that was missing from
the original change and free the generated table just like it is done
for PCRE1.

Signed-off-by: Carlo Marcelo Arenas Belón &lt;carenas@gmail.com&gt;
Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep: make PCRE2 aware of custom allocator</title>
<updated>2019-10-18T01:33:18Z</updated>
<author>
<name>Carlo Marcelo Arenas Belón</name>
<email>carenas@gmail.com</email>
</author>
<published>2019-10-16T12:10:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=513f2b0bbd47015640723a10210c527839e8946a'/>
<id>urn:sha1:513f2b0bbd47015640723a10210c527839e8946a</id>
<content type='text'>
94da9193a6 (grep: add support for PCRE v2, 2017-06-01) didn't include
a way to override the system allocator, and so it is incompatible with
custom allocators (e.g. nedmalloc). This problem became obvious when we
tried to plug a memory leak by `free()`ing a data structure allocated by
PCRE2, triggering a segfault on Windows (where we use nedmalloc by
default).

PCRE2 requires the use of a general context to override the allocator,
and therefore a lot more code is needed than in PCRE1, including
a couple of wrapper functions.

Extend the grep API with a "destructor" that can be called to clean up
any objects that were created and used globally.

Update `builtin/grep.c` to use that new API, but any other future users
should make sure to have matching `grep_init()`/`grep_destroy()` calls
if they are using the pattern matching functionality.

Move some of the logic that was previously done per thread (in the
workers) into an earlier phase to avoid degrading performance. As the
use of PCRE2 with custom allocators becomes better understood, more of
its functions are expected to be instructed to use the custom allocator
as well, as was done in the original code[1] this work was based on.

[1] https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/

Reported-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Carlo Marcelo Arenas Belón &lt;carenas@gmail.com&gt;
Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'cb/pcre1-cleanup'</title>
<updated>2019-10-11T05:24:47Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-10-11T05:24:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=93424f1f7d790944168d46faf5222a388898d031'/>
<id>urn:sha1:93424f1f7d790944168d46faf5222a388898d031</id>
<content type='text'>
PCRE fixes.

* cb/pcre1-cleanup:
  grep: refactor and simplify PCRE1 support
  grep: make sure NO_LIBPCRE1_JIT disable JIT in PCRE1
</content>
</entry>
<entry>
<title>Merge branch 'ab/pcre-jit-fixes'</title>
<updated>2019-10-11T05:24:47Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-10-11T05:24:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a73f91774cbfb6f31bca328ebf200498fe92d97a'/>
<id>urn:sha1:a73f91774cbfb6f31bca328ebf200498fe92d97a</id>
<content type='text'>
A few simplifications and bugfixes to the PCRE interface.

* ab/pcre-jit-fixes:
  grep: under --debug, show whether PCRE JIT is enabled
  grep: do not enter PCRE2_UTF mode on fixed matching
  grep: stess test PCRE v2 on invalid UTF-8 data
  grep: create a "is_fixed" member in "grep_pat"
  grep: consistently use "p-&gt;fixed" in compile_regexp()
  grep: stop using a custom JIT stack with PCRE v1
  grep: stop "using" a custom JIT stack with PCRE v2
  grep: remove overly paranoid BUG(...) code
  grep: use PCRE v2 for optimized fixed-string search
  grep: remove the kwset optimization
  grep: drop support for \0 in --fixed-strings &lt;pattern&gt;
  grep: make the behavior for NUL-byte in patterns sane
  grep tests: move binary pattern tests into their own file
  grep tests: move "grep binary" alongside the rest
  grep: inline the return value of a function call used only once
  t4210: skip more command-line encoding tests on MinGW
  grep: don't use PCRE2?_UTF8 with "log --encoding=&lt;non-utf8&gt;"
  log tests: test regex backends in "--encode=&lt;enc&gt;" tests
</content>
</entry>
<entry>
<title>grep: skip UTF8 checks explicitly</title>
<updated>2019-09-09T18:50:08Z</updated>
<author>
<name>Carlo Marcelo Arenas Belón</name>
<email>carenas@gmail.com</email>
</author>
<published>2019-08-28T14:54:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ad7c543e3b0f80befd26f4115f8fec4285a018bf'/>
<id>urn:sha1:ad7c543e3b0f80befd26f4115f8fec4285a018bf</id>
<content type='text'>
18547aacf5 ("grep/pcre: support utf-8", 2016-06-25), which was released
with git 2.10, added the PCRE_UTF8 flag to PCRE1 matching, including a
call to has_non_ascii() to try to avoid breakage if there was non-UTF-8
encoded content in the haystack.

Usually PCRE is compiled with JIT support (even if it is not the
default), so the codepath used calls pcre_jit_exec, which skips UTF-8
validation by design (which might result in crashes or hangs). When JIT
support wasn't compiled in, we use pcre_exec instead, with the
possibility that grep might be aborted if invalid UTF-8 is found in the
haystack.

PCRE1 has provided a flag since Mar 5, 2007 that can be used to skip the
checks explicitly, so use that to make both codepaths equivalent (the
flag is ignored by pcre_jit_exec).

This fix is only implemented for PCRE1 because PCRE2 is likely to have
a better solution (without the risks) in the future.

Helped-by: Johannes Schindelin &lt;Johannes.Schindelin@gmx.de&gt;
Helped-by: Eric Sunshine &lt;sunshine@sunshineco.com&gt;
Helped-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Suggested-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Carlo Marcelo Arenas Belón &lt;carenas@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep: refactor and simplify PCRE1 support</title>
<updated>2019-08-26T18:37:02Z</updated>
<author>
<name>Carlo Marcelo Arenas Belón</name>
<email>carenas@gmail.com</email>
</author>
<published>2019-08-25T18:22:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ff61681b46760f6a64353018760bccc14c90f8e9'/>
<id>urn:sha1:ff61681b46760f6a64353018760bccc14c90f8e9</id>
<content type='text'>
The code used both a macro and a variable to track whether JIT
support was desired, and relied on the fact that a non-JIT-enabled
library will ignore a request for JIT compilation
(as defined by the second parameter of the call to pcre_study).

Clean up the multiple levels of macros used and call pcre_study
with the right parameter after JIT support has been confirmed,
unless it was requested to be disabled with NO_LIBPCRE1_JIT.

Signed-off-by: Carlo Marcelo Arenas Belón &lt;carenas@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep: make sure NO_LIBPCRE1_JIT disable JIT in PCRE1</title>
<updated>2019-08-26T18:37:01Z</updated>
<author>
<name>Carlo Marcelo Arenas Belón</name>
<email>carenas@gmail.com</email>
</author>
<published>2019-08-25T18:22:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8991da6a3864d93d860afe4c510b4fdbf0da6363'/>
<id>urn:sha1:8991da6a3864d93d860afe4c510b4fdbf0da6363</id>
<content type='text'>
e87de7cab4 ("grep: un-break building with PCRE &lt; 8.32", 2017-05-25)
added a restriction for JIT support that is no longer needed after
pcre_jit_exec() calls were removed.

Reorganize the definitions in grep.h so that JIT support can be
detected early and NO_LIBPCRE1_JIT can be used reliably to ensure
that JIT doesn't get used.

Signed-off-by: Carlo Marcelo Arenas Belón &lt;carenas@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep: create a "is_fixed" member in "grep_pat"</title>
<updated>2019-07-26T20:56:40Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2019-07-26T15:08:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=09872f6418f6b6fc1b823d3b324907c02e9bc75b'/>
<id>urn:sha1:09872f6418f6b6fc1b823d3b324907c02e9bc75b</id>
<content type='text'>
This change paves the way for later using this value in the regex
compile functions themselves.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
