<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/grep.c, branch v2.48.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/</subtitle>
<id>https://git.shady.money/git/atom?h=v2.48.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.48.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2024-12-06T11:20:02Z</updated>
<entry>
<title>global: mark code units that generate warnings with `-Wsign-compare`</title>
<updated>2024-12-06T11:20:02Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-12-06T10:27:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=41f43b8243f42b9df2e98be8460646d4c0100ad3'/>
<id>urn:sha1:41f43b8243f42b9df2e98be8460646d4c0100ad3</id>
<content type='text'>
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ps/leakfixes-part-9'</title>
<updated>2024-11-12T23:35:31Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-11-12T23:35:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6890c99e38c378c2fff7747b52bd2102b403357d'/>
<id>urn:sha1:6890c99e38c378c2fff7747b52bd2102b403357d</id>
<content type='text'>
More leakfixes.

* ps/leakfixes-part-9: (22 commits)
  list-objects-filter-options: work around reported leak on error
  builtin/merge: release output buffer after performing merge
  dir: fix leak when parsing "status.showUntrackedFiles"
  t/helper: fix leaking buffer in "dump-untracked-cache"
  t/helper: stop re-initialization of `the_repository`
  sparse-index: correctly free EWAH contents
  dir: release untracked cache data
  combine-diff: fix leaking lost lines
  builtin/tag: fix leaking key ID on failure to sign
  transport-helper: fix leaking import/export marks
  builtin/commit: fix leaking cleanup config
  trailer: fix leaking strbufs when formatting trailers
  trailer: fix leaking trailer values
  builtin/commit: fix leaking change data contents
  upload-pack: fix leaking URI protocols
  pretty: clear signature check
  diff-lib: fix leaking diffopts in `do_diff_cache()`
  revision: fix leaking bloom filters
  builtin/grep: fix leak with `--max-count=0`
  grep: fix leak in `grep_splice_or()`
  ...
</content>
</entry>
<entry>
<title>grep: fix leak in `grep_splice_or()`</title>
<updated>2024-11-05T06:37:52Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-11-05T06:16:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a6590ccdd431e2ab7b9c521cac674546725a54d2'/>
<id>urn:sha1:a6590ccdd431e2ab7b9c521cac674546725a54d2</id>
<content type='text'>
In `grep_splice_or()` we search for the next `TRUE` node in our tree of
grep expressions and replace it with the given new expression. But we
don't free the old node, which causes a memory leak. Plug it.

This leak is exposed by t7810, but plugging it alone isn't sufficient to
make the test suite pass.
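
The shape of the fix can be illustrated with a small stand-alone sketch.
The node type and splice_first_true() below are hypothetical stand-ins,
not the real struct grep_expr or grep_splice_or(); the only point is
that the replaced placeholder has to be freed before the pointer to it
is overwritten.

<![CDATA[
```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node kinds; the real grep expression tree has more. */
enum node_kind { NODE_TRUE, NODE_ATOM, NODE_OR };

struct node {
	enum node_kind kind;
	struct node *left, *right; /* only used by NODE_OR */
};

/* Allocate a node; returns NULL if the allocation fails. */
struct node *new_node(enum node_kind kind, struct node *left,
		      struct node *right)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->kind = kind;
	n->left = left;
	n->right = right;
	return n;
}

/*
 * Replace the first NODE_TRUE placeholder reachable from *slot with
 * repl.  Returns 1 if a placeholder was replaced, 0 otherwise.
 */
int splice_first_true(struct node **slot, struct node *repl)
{
	struct node *n;

	assert(slot);
	n = *slot;
	if (!n)
		return 0;
	if (n->kind == NODE_TRUE) {
		free(n); /* without this the old placeholder leaks */
		*slot = repl;
		return 1;
	}
	if (n->kind == NODE_OR)
		return splice_first_true(&n->left, repl) ||
		       splice_first_true(&n->right, repl);
	return 0;
}

/* Release a whole tree, children first. */
void free_tree(struct node *n)
{
	if (!n)
		return;
	free_tree(n->left);
	free_tree(n->right);
	free(n);
}
```
]]>

Dropping the free() call in the NODE_TRUE branch reproduces the kind of
leak described above: the placeholder becomes unreachable as soon as
the slot is overwritten.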

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep: disable lookahead on error</title>
<updated>2024-10-22T16:45:49Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2024-10-20T11:02:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ce025ae4f61e8e32b2ae6589e43e03e60f713f2d'/>
<id>urn:sha1:ce025ae4f61e8e32b2ae6589e43e03e60f713f2d</id>
<content type='text'>
regexec(3) can fail.  E.g. on macOS it fails if it is used with a UTF-8
locale to match a valid regex against a buffer containing invalid UTF-8
characters.

git grep has two ways to search for matches in a file: Either it splits
its contents into lines and matches them separately, or it matches the
whole content and figures out line boundaries later.  The latter is done
by look_ahead() and it's quicker in the common case where most files
don't contain a match.

Fall back to line-by-line matching if look_ahead() encounters a
regexec(3) error by propagating errors out of patmatch() and bailing out
of look_ahead() if there is one.  This way we at least can find matches
in lines that contain only valid characters.  That matches the behavior
of grep(1) on macOS.

pcre2match() dies if pcre2_jit_match() or pcre2_match() fails, but since
we use the flag PCRE2_MATCH_INVALID_UTF it handles invalid UTF-8
characters gracefully.  So implement the fall-back only for regexec(3)
and leave the PCRE2 matching unchanged.
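
The fall-back can be sketched roughly as below.  This is a simplified
illustration, not the code in grep.c: count_matching_lines() is a
hypothetical helper, the pattern is assumed to have been compiled with
REG_NEWLINE, and only regexec(3) and its return codes are real API.

<![CDATA[
```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): count the lines in buf that
 * match re.  Returns -1 if memory allocation fails.
 */
int count_matching_lines(const regex_t *re, const char *buf)
{
	int count = 0;
	int rc;

	assert(re && buf);

	/* Look-ahead: a single regexec(3) call over the whole buffer. */
	rc = regexec(re, buf, 0, NULL, 0);
	if (rc == REG_NOMATCH)
		return 0; /* common case: no line can match */

	/*
	 * rc is either 0 (a match somewhere) or an error code.  In both
	 * cases inspect each line on its own, so that an error caused by
	 * one line does not hide matches in the others.
	 */
	while (*buf) {
		size_t len = strcspn(buf, "\n");
		char *line = malloc(len + 1);

		if (!line)
			return -1;
		memcpy(line, buf, len);
		line[len] = '\0';
		if (regexec(re, line, 0, NULL, 0) == 0)
			count++; /* an error on this line counts as no match */
		free(line);

		buf += len;
		if (*buf == '\n')
			buf++;
	}
	return count;
}
```
]]>

In this sketch an error in the look-ahead simply takes the same path as
a hit, so only the lines that regexec(3) cannot handle are lost.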

Reported-by: David Gstir &lt;david@sigma-star.at&gt;
Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Tested-by: David Gstir &lt;david@sigma-star.at&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
</content>
</entry>
<entry>
<title>grep: fix leaking grep pattern</title>
<updated>2024-09-27T15:25:36Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-09-26T11:46:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6d82437a470fc7797e41f5a7ac4874914db7cdb8'/>
<id>urn:sha1:6d82437a470fc7797e41f5a7ac4874914db7cdb8</id>
<content type='text'>
When creating a pattern via `create_grep_pat()` we allocate the pattern
member of the structure regardless of the token type. But later, when we
try to free the structure, we free the pattern member conditionally on
the token type and thus leak memory.

Plug this leak. The leak is exposed by t7814, but plugging it alone does
not make the whole test suite pass.
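
As a rough illustration of this allocate/free mismatch, consider the
sketch below.  struct pat, pat_create() and pat_free() are hypothetical
names, not the real grep pattern structure or its helpers; the token
types are likewise made up for the example.

<![CDATA[
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token types. */
enum tok { TOK_PATTERN, TOK_AND, TOK_OR };

struct pat {
	enum tok token;
	char *pattern; /* allocated for every token type */
};

/* Create a pattern; the text is copied regardless of the token type. */
struct pat *pat_create(enum tok token, const char *text)
{
	size_t len;
	struct pat *p;

	assert(text);
	len = strlen(text);
	p = malloc(sizeof(*p));
	if (!p)
		return NULL;
	p->pattern = malloc(len + 1);
	if (!p->pattern) {
		free(p);
		return NULL;
	}
	memcpy(p->pattern, text, len + 1);
	p->token = token;
	return p;
}

/* Free a pattern; the copy is released regardless of the token type. */
void pat_free(struct pat *p)
{
	if (!p)
		return;
	/*
	 * A leaky variant would guard this call with a check such as
	 * "if (p->token == TOK_PATTERN)", which does not mirror the
	 * unconditional allocation in pat_create().
	 */
	free(p->pattern);
	free(p);
}
```
]]>

Keeping the release path symmetrical with the allocation path is what
makes the sketch leak-free for every token type.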

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep: prefer UNUSED to MAYBE_UNUSED for pcre allocators</title>
<updated>2024-08-29T20:59:46Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2024-08-29T20:09:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=516a9ec3d5874aad41757e573f7b841bb45cb098'/>
<id>urn:sha1:516a9ec3d5874aad41757e573f7b841bb45cb098</id>
<content type='text'>
We provide custom malloc/free callbacks for the pcre library to use.
Those take an extra "data" parameter, but we don't use it. Back when
these were added in 513f2b0bbd (grep: make PCRE2 aware of custom
allocator, 2019-10-16), we only had MAYBE_UNUSED.

But these days we have UNUSED, which we should prefer, as it will
let the compiler inform us if the code changes to actually use the
parameters.

I also moved the annotations to come after the variable name, which is
how we typically spell it.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep: -W: skip trailing empty lines at EOF, too</title>
<updated>2024-07-30T16:59:04Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2024-07-30T14:18:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8e5dd94e68e37d2d4b67e34db00b9e9790a2325b'/>
<id>urn:sha1:8e5dd94e68e37d2d4b67e34db00b9e9790a2325b</id>
<content type='text'>
4aa2c4753d (grep: -W: don't extend context to trailing empty lines,
2016-05-28) stopped showing empty lines at the end of function context
when using -W.  Do the same for trailing empty lines at the end of
files, for consistency -- it doesn't matter whether a function section
is ended by the next function or the end of the file.
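
The trimming rule can be sketched as follows.  context_end() is a
hypothetical helper operating on an array of already split lines; it is
not how grep.c walks its buffer, and it only treats zero-length lines
as empty.

<![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: given that a function section covers lines
 * [0, end), return the new end after dropping trailing empty lines.
 * It does not matter whether "end" was found at the next function
 * line or at the end of the file.
 */
size_t context_end(const char *const *lines, size_t end)
{
	assert(lines);
	while (end > 0 && lines[end - 1][0] == '\0')
		end--;
	return end;
}
```
]]>

Because the helper only looks backwards from the end of the section,
both ways of ending a function section are handled by the same loop.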

Test it by adding a trailing empty line to the file used by the test
"grep -W" and leave its expected output the same.

Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep: improve errors for unmatched ( and )</title>
<updated>2024-03-25T18:40:53Z</updated>
<author>
<name>Ahelenia Ziemiańska</name>
<email>nabijaczleweli@nabijaczleweli.xyz</email>
</author>
<published>2024-03-23T13:18:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0d527842b7633091158450d4dcda3ceb3547a636'/>
<id>urn:sha1:0d527842b7633091158450d4dcda3ceb3547a636</id>
<content type='text'>
Imagine you want to grep for (. Easy:

  $ git grep '('
  fatal: unmatched parenthesis

Uh-oh. This looks plainly wrong, unless you know specifically that

 (a) git grep has expression groups and '(' ... ')' are used for them.
 (b) you can use -e '(' to explicitly say '(' is what you are looking
     for, not the beginning of a group.

Similarly,

  $ git grep ')'
  fatal: incomplete pattern expression: )

is somehow worse. ")" is a complete regular expression pattern.
Of course, the error wants to say "group" here.
In this case it is also not "incomplete", it is unmatched.

Make them say

  $ ./git grep '('
  fatal: unmatched ( for expression group
  $ ./git grep ')'
  fatal: incomplete pattern expression group: )

which are clearer in indicating that it is not the expression that
is wrong (since no pattern had been parsed at all), but rather that
it has been misconstrued as a grouping operator.

Link: https://bugs.debian.org/1051205
Signed-off-by: Ahelenia Ziemiańska &lt;nabijaczleweli@nabijaczleweli.xyz&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>treewide: remove unnecessary includes in source files</title>
<updated>2023-12-26T20:04:31Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-12-23T17:14:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=eea0e59ffbed6e33d171ace5be13cde9faa41639'/>
<id>urn:sha1:eea0e59ffbed6e33d171ace5be13cde9faa41639</id>
<content type='text'>
Each of these was checked with
   gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).

...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file.  These cases were:
  * builtin/credential-cache.c
  * builtin/pull.c
  * builtin/send-pack.c

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep: mark unused parameters in pcre fallbacks</title>
<updated>2023-08-30T00:56:26Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2023-08-29T23:45:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4548b0145f17c633de5e267b6c7932c72824e9d3'/>
<id>urn:sha1:4548b0145f17c633de5e267b6c7932c72824e9d3</id>
<content type='text'>
When USE_LIBPCRE2 is not defined, we compile several noop fallbacks.
These need to have their parameters annotated to avoid
-Wunused-parameter warnings (and obviously we cannot remove the
parameters, since the functions must match the non-fallback versions).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
