<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/builtin/diff.c, branch v2.50.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.50.0</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.50.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2025-03-10T20:16:20Z</updated>
<entry>
<title>hash: stop depending on `the_repository` in `null_oid()`</title>
<updated>2025-03-10T20:16:20Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-10T07:13:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7d70b29c4f0b2fd3c6698956d9fb4026632d9c6e'/>
<id>urn:sha1:7d70b29c4f0b2fd3c6698956d9fb4026632d9c6e</id>
<content type='text'>
The `null_oid()` function returns the object ID that only consists of
zeroes. Naturally, this ID also depends on the hash algorithm used, as
the number of zeroes is different between SHA1 and SHA256. Consequently,
the function returns the hash-algorithm-specific null object ID.

This is currently done by depending on `the_hash_algo`, which implicitly
makes us depend on `the_repository`. Refactor the function to instead
pass in the hash algorithm for which we want to retrieve the null object
ID. Adapt callsites accordingly by passing in `the_repository`, thus
bubbling up the dependency on that global variable by one layer.

There are a couple of trivial exceptions for subsystems that already got
rid of `the_repository`. These subsystems instead use the repository
that is available via the calling context:

  - "builtin/grep.c"
  - "grep.c"
  - "refs/debug.c"

There are also two non-trivial exceptions:

  - "diff-no-index.c": Here we know that we may not have a repository
    initialized at all, so we cannot rely on `the_repository`. Instead,
    we adapt `diff_no_index()` to get a `struct git_hash_algo` as
    parameter. The only caller is located in "builtin/diff.c", where we
    know to call `repo_set_hash_algo()` in case we're running outside of
    a Git repository. Consequently, it is fine to continue passing
    `the_repository-&gt;hash_algo` even in this case.

  - "builtin/ls-files.c": There is an in-flight patch series that drops
    `USE_THE_REPOSITORY_VARIABLE` in this file, which causes a semantic
    conflict because we use `null_oid()` in `show_submodule()`. The
    value is passed to `repo_submodule_init()`, which may use the object
    ID to resolve a tree-ish in the superproject from which we want to
    read the submodule config. As such, the object ID should refer to an
    object in the superproject, and consequently we need to use its hash
    algorithm.

    This means that we could in theory just not bother about this edge
    case at all and just use `the_repository` in "diff-no-index.c". But
    doing so would feel misdesigned.

Remove the `USE_THE_REPOSITORY_VARIABLE` preprocessor define in
"hash.c".

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>global: mark code units that generate warnings with `-Wsign-compare`</title>
<updated>2024-12-06T11:20:02Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-12-06T10:27:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=41f43b8243f42b9df2e98be8460646d4c0100ad3'/>
<id>urn:sha1:41f43b8243f42b9df2e98be8460646d4c0100ad3</id>
<content type='text'>
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>global: drop `UNLEAK()` annotation</title>
<updated>2024-11-20T23:23:46Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-11-20T13:39:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d91a9db33c53744b3d1034926edda7846442bc9e'/>
<id>urn:sha1:d91a9db33c53744b3d1034926edda7846442bc9e</id>
<content type='text'>
There are two users of `UNLEAK()` left in our codebase:

  - In "builtin/clone.c", annotating the `repo` variable. That leak has
    already been fixed though as you can see in the context, where we do
    know to free `repo_to_free`.

  - In "builtin/diff.c", to unleak entries of the `blob[]` array. That
    leak has also been fixed, because the entries we assign to that
    array come from `rev.pending.objects`, and we do eventually release
    `rev`.

This neatly demonstrates one of the issues with `UNLEAK()`: it is quite
easy for the annotation to become stale. A second issue is that its
whole intent is to paper over leaks. And while that has been a necessary
evil in the past, because Git was leaking left and right, it isn't
really much of an issue nowadays where our test suite has no known leaks
anymore.

Remove the last two users of this macro.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jc/pass-repo-to-builtins'</title>
<updated>2024-09-23T17:35:09Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-09-23T17:35:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b8e318ea58a0502ff99f37032ee8ac536df4e730'/>
<id>urn:sha1:b8e318ea58a0502ff99f37032ee8ac536df4e730</id>
<content type='text'>
The convention to calling into built-in command implementation has
been updated to pass the repository, if known, together with the
prefix value.

* jc/pass-repo-to-builtins:
  add: pass in repo variable instead of global the_repository
  builtin: remove USE_THE_REPOSITORY for those without the_repository
  builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.h
  builtin: add a repository parameter for builtin functions
</content>
</entry>
<entry>
<title>Merge branch 'jc/range-diff-lazy-setup'</title>
<updated>2024-09-16T21:22:55Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-09-16T21:22:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=be8ca2848a9e73f6ddc31ebce2ddc3c367d4f0cb'/>
<id>urn:sha1:be8ca2848a9e73f6ddc31ebce2ddc3c367d4f0cb</id>
<content type='text'>
Code clean-up.

* jc/range-diff-lazy-setup:
  remerge-diff: clean up temporary objdir at a central place
  remerge-diff: lazily prepare temporary objdir on demand
</content>
</entry>
<entry>
<title>builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.h</title>
<updated>2024-09-13T21:32:24Z</updated>
<author>
<name>John Cai</name>
<email>johncai86@gmail.com</email>
</author>
<published>2024-09-13T21:16:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=03eae9afb447ad4add2f18a1adb3589f050d596f'/>
<id>urn:sha1:03eae9afb447ad4add2f18a1adb3589f050d596f</id>
<content type='text'>
Instead of including USE_THE_REPOSITORY_VARIABLE by default on every
builtin, remove it from builtin.h and add it to all the builtins that
include builtin.h (by definition, that means all builtins/*.c).

Also, remove the include statement for repository.h since it gets
brought in through builtin.h.

The next step will be to migrate each builtin
from having to use the_repository.

Signed-off-by: John Cai &lt;johncai86@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>builtin: add a repository parameter for builtin functions</title>
<updated>2024-09-13T21:27:08Z</updated>
<author>
<name>John Cai</name>
<email>johncai86@gmail.com</email>
</author>
<published>2024-09-13T21:16:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9b1cb5070ffb581798763eaee85cec10da969e90'/>
<id>urn:sha1:9b1cb5070ffb581798763eaee85cec10da969e90</id>
<content type='text'>
In order to reduce the usage of the global the_repository, add a
parameter to builtin functions that will get passed a repository
variable.

This commit uses UNUSED on most of the builtin functions, as subsequent
commits will modify the actual builtins to pass the repository parameter
down.

Signed-off-by: John Cai &lt;johncai86@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>builtin/diff: free symmetric diff members</title>
<updated>2024-08-14T17:08:02Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-08-14T06:52:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=77d4b3dd73c44b2c617345a6d9686d2f7f5b8a68'/>
<id>urn:sha1:77d4b3dd73c44b2c617345a6d9686d2f7f5b8a68</id>
<content type='text'>
We populate a `struct symdiff` in case the user has requested a
symmetric diff. Part of this is to populate a `skip` bitmap that
indicates which commits shall be ignored in the diff. But while this
bitmap is dynamically allocated, we never free it.

Fix this by introducing and calling a new `symdiff_release()` function
that does this for us.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>remerge-diff: clean up temporary objdir at a central place</title>
<updated>2024-08-09T22:42:40Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-08-09T22:31:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4460e052e074490cfc083703fba285d3c2e36560'/>
<id>urn:sha1:4460e052e074490cfc083703fba285d3c2e36560</id>
<content type='text'>
After running a diff between two things, or a series of diffs while
walking the history, the diff computation is concluded by a call to
diff_result_code() to extract the exit status of the diff machinery.

The function can work on "struct diffopt", but all the callers
historically and currently pass "struct diffopt" that is embedded in
the "struct rev_info" that is used to hold the remerge_diff bit and
the remerge_objdir variable that points at the temporary object
directory in use.

Redefine diff_result_code() to take the whole "struct rev_info" to
give it an access to these members related to remerge-diff, so that
it can get rid of the temporary object directory for any and all
callers that used the feature.  We can lose the equivalent code to
do so from the code paths for individual commands, diff-tree, diff,
and log.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>builtin/diff: explicitly set hash algo when there is no repo</title>
<updated>2024-05-07T05:50:49Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-05-07T04:53:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ab274909d440a43cc03f9f815e9f8d4bba705f1e'/>
<id>urn:sha1:ab274909d440a43cc03f9f815e9f8d4bba705f1e</id>
<content type='text'>
The git-diff(1) command can be used outside repositories to diff two
files with each other. But even if there is no repository we will end up
hashing the files that we are diffing so that we can print the "index"
line:

    ```
    diff --git a/a b/b
    index 7898192..6178079 100644
    --- a/a
    +++ b/b
    @@ -1 +1 @@
    -a
    +b
    ```

We implicitly use SHA1 to calculate the hash here, which is because
`the_repository` gets initialized with SHA1 during the startup routine.
We are about to stop doing this though such that `the_repository` only
ever has a hash function when it was properly initialized via a repo's
configuration.

To give full control to our users, we would ideally add a new switch to
git-diff(1) that allows them to specify the hash function when executed
outside of a repository. But for now, we only convert the code to make
this explicit such that we can stop setting the default hash algorithm
for `the_repository`.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
