<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/diff-lib.c, branch v2.48.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.48.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.48.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2024-12-06T11:20:02Z</updated>
<entry>
<title>global: mark code units that generate warnings with `-Wsign-compare`</title>
<updated>2024-12-06T11:20:02Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-12-06T10:27:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=41f43b8243f42b9df2e98be8460646d4c0100ad3'/>
<id>urn:sha1:41f43b8243f42b9df2e98be8460646d4c0100ad3</id>
<content type='text'>
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diff-lib: fix leaking diffopts in `do_diff_cache()`</title>
<updated>2024-11-05T06:37:52Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-11-05T06:17:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8dd3cb4b45d06959f3344f02b2d45a4ccf8207d8'/>
<id>urn:sha1:8dd3cb4b45d06959f3344f02b2d45a4ccf8207d8</id>
<content type='text'>
In `do_diff_cache()` we initialize a new `rev_info` and then overwrite
its `diffopt` with a user-provided set of options. This can leak memory
because `repo_init_revisions()` may end up allocating memory for the
`diffopt` itself depending on the configuration. And since that field is
overwritten we won't ever free it.

Plug the memory leak by releasing the diffopts before we overwrite them.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/output-prefix-cleanup'</title>
<updated>2024-10-10T21:22:30Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-10-10T21:22:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3eb4cc451ed97123ff76e183a5be8a7dc164d1f6'/>
<id>urn:sha1:3eb4cc451ed97123ff76e183a5be8a7dc164d1f6</id>
<content type='text'>
Code clean-up.

* jk/output-prefix-cleanup:
  diff: store graph prefix buf in git_graph struct
  diff: return line_prefix directly when possible
  diff: return const char from output_prefix callback
  diff: drop line_prefix_length field
  line-log: use diff_line_prefix() instead of custom helper
</content>
</entry>
<entry>
<title>diff: return const char from output_prefix callback</title>
<updated>2024-10-03T21:22:22Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2024-10-03T21:09:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=436728fe9d75d05fa2439f867ca2039012b86e69'/>
<id>urn:sha1:436728fe9d75d05fa2439f867ca2039012b86e69</id>
<content type='text'>
The diff_options structure has an output_prefix callback for returning a
prefix string, but it does so by returning a pointer to a strbuf.

This makes the interface awkward. There's no reason the callback should
need to use a strbuf, and it creates questions about whether the
ownership of the resulting buffer should be transferred to the caller
(it should not be, but a recent attempt to clean up this code led to a
double-free in some cases).

The one advantage we get is that the strbuf contains a ptr/len pair, so
we could in theory have a prefix with embedded NULs. But we can observe
that none of the existing callbacks would ever produce such a NUL (they
are usually just indentation or graph symbols, and even the
"--line-prefix" option takes a NUL-terminated string).

And anyway, only one caller (the one in log_tree_diff_flush) actually
looks at the strbuf length. In every other case we use a helper function
which discards the length and just returns the NUL-terminated string.

So let's just have the callback return a "const char *" pointer. It's up
to the callbacks themselves if they want to use a strbuf under the hood.
And now the caller in log_tree_diff_flush() can just use the helper
function along with everybody else. That lets us even simplify out the
function pointer check, since the helper returns an empty string
(technically this does mean we'll sometimes issue an empty fputs() call,
but I don't think this code path is hot enough to care about that).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diff-lib: drop unused index argument from get_stat_data()</title>
<updated>2024-08-17T16:44:41Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2024-08-17T07:29:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=72c9793c154edbde16085f0f04db970e082c7f0b'/>
<id>urn:sha1:72c9793c154edbde16085f0f04db970e082c7f0b</id>
<content type='text'>
The "struct index_state" parameter passed to get_stat_data() has been
unused since we stopped passing it to check_removed() in 6a044a2048
(diff-lib: fix check_removed when fsmonitor is on, 2023-09-11). We can
just drop it, which in turns lets us simplify our callers a bit.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ps/leakfixes-more'</title>
<updated>2024-07-08T21:53:10Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-07-08T21:53:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3997614c249b4b475d07c00556446d8b698d1a49'/>
<id>urn:sha1:3997614c249b4b475d07c00556446d8b698d1a49</id>
<content type='text'>
More memory leaks have been plugged.

* ps/leakfixes-more: (29 commits)
  builtin/blame: fix leaking ignore revs files
  builtin/blame: fix leaking prefixed paths
  blame: fix leaking data for blame scoreboards
  line-range: plug leaking find functions
  merge: fix leaking merge bases
  builtin/merge: fix leaking `struct cmdnames` in `get_strategy()`
  sequencer: fix memory leaks in `make_script_with_merges()`
  builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()`
  apply: fix leaking string in `match_fragment()`
  sequencer: fix leaking string buffer in `commit_staged_changes()`
  commit: fix leaking parents when calling `commit_tree_extended()`
  config: fix leaking "core.notesref" variable
  rerere: fix various trivial leaks
  builtin/stash: fix leak in `show_stash()`
  revision: free diff options
  builtin/log: fix leaking commit list in git-cherry(1)
  merge-recursive: fix memory leak when finalizing merge
  builtin/merge-recursive: fix leaking object ID bases
  builtin/difftool: plug memory leaks in `run_dir_diff()`
  object-name: free leaking object contexts
  ...
</content>
</entry>
<entry>
<title>global: introduce `USE_THE_REPOSITORY_VARIABLE` macro</title>
<updated>2024-06-14T17:26:33Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-06-14T06:50:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e7da9385708accf518a80a1e17969020fb361048'/>
<id>urn:sha1:e7da9385708accf518a80a1e17969020fb361048</id>
<content type='text'>
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.

It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.

Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit

For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).

Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hash: require hash algorithm in `oidread()` and `oidclr()`</title>
<updated>2024-06-14T17:26:32Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-06-14T06:49:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9da95bda74cf10e1475384a71fd20914c3b99784'/>
<id>urn:sha1:9da95bda74cf10e1475384a71fd20914c3b99784</id>
<content type='text'>
Both `oidread()` and `oidclr()` use `the_repository` to derive the hash
function that shall be used. Require callers to pass in the hash
algorithm to get rid of this implicit dependency.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>revision: free diff options</title>
<updated>2024-06-11T20:15:06Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-06-11T09:20:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a90a08961190dda2f664e102822fb6a7152e65d5'/>
<id>urn:sha1:a90a08961190dda2f664e102822fb6a7152e65d5</id>
<content type='text'>
There is a todo comment in `release_revisions()` that mentions that we
need to free the diff options, which was added via 54c8a7c379 (revisions
API: add a TODO for diff_free(&amp;revs-&gt;diffopt), 2022-04-14). Releasing
the diff options wasn't quite feasible at that time because some call
sites rely on its contents to remain even after the revisions have been
released.

In fact, there really only are a couple of callsites that misbehave
here:

  - `cmd_shortlog()` releases the revisions, but continues to access its
    file pointer.

  - `do_diff_cache()` creates a shallow copy of `struct diff_options`,
    but does not set the `no_free` member. Consequently, we end up
    releasing resources of the caller-provided diff options.

  - `diff_free()` and friends do not play nice when being called
    multiple times as they don't unset data structures that they have
    just released.

Fix all of those cases and enable the call to `diff_free()`, which plugs
a bunch of memory leaks.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: refactor `resolve_gitlink_ref()` to accept a repository</title>
<updated>2024-05-17T17:33:38Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-05-17T08:18:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e19488a60aab022ca4c8d32d6629c04ce94f2779'/>
<id>urn:sha1:e19488a60aab022ca4c8d32d6629c04ce94f2779</id>
<content type='text'>
In `resolve_gitlink_ref()` we implicitly rely on `the_repository` to
look up the submodule ref store. Now that we can look up submodule ref
stores for arbitrary repositories we can improve this function to
instead accept a repository as parameter for which we want to resolve
the gitlink.

Do so and adjust callers accordingly.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
