<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/builtin/patch-id.c, branch v2.46.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.46.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.46.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2024-09-16T22:12:06Z</updated>
<entry>
<title>Revert "Merge branch 'jc/patch-id' into maint-2.46"</title>
<updated>2024-09-16T22:12:06Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-09-16T22:12:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d6bf6527ebfab4899d52d5662f3c5054195e6415'/>
<id>urn:sha1:d6bf6527ebfab4899d52d5662f3c5054195e6415</id>
<content type='text'>
This reverts commit 41c952ebacf7e3369e7bee721f768114d65e50c4,
reversing changes made to 712d970c0145b95ce655773e7cd1676f09dfd215.
Keeping a known breakage for now is better than introducing new
regression(s).
</content>
</entry>
<entry>
<title>Merge branch 'jc/patch-id' into maint-2.46</title>
<updated>2024-09-12T18:02:16Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-09-12T18:02:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=41c952ebacf7e3369e7bee721f768114d65e50c4'/>
<id>urn:sha1:41c952ebacf7e3369e7bee721f768114d65e50c4</id>
<content type='text'>
The patch parser in "git patch-id" has been tightened to avoid
getting confused by lines that look like a patch header in the log
message.
cf. &lt;Zqh2T_2RLt0SeKF7@tanuki&gt;

* jc/patch-id:
  patch-id: tighten code to detect the patch header
  patch-id: rewrite code that detects the beginning of a patch
  patch-id: make get_one_patchid() more extensible
  patch-id: call flush_current_id() only when needed
  t4204: patch-id supports various input format
</content>
</entry>
<entry>
<title>patch-id: tighten code to detect the patch header</title>
<updated>2024-07-30T01:19:14Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-07-30T01:17:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a6e9429f728d088999b815c25adbd2f2c115e051'/>
<id>urn:sha1:a6e9429f728d088999b815c25adbd2f2c115e051</id>
<content type='text'>
The get_one_patchid() function unconditionally takes a line that
matches the patch header (namely, a line that begins with a full
object name, possibly prefixed by "commit" or "From" plus a space)
as the beginning of a patch.  Even when it is *not* looking for one
(namely, when the previous call found the patch header and returned,
and then we are called again to skip the log message and process the
patch whose header was found by the previous invocation).

As a consequence, a line in the commit log message that begins with
one of these patterns can be mistaken to start another patch, with
current message entirely skipped (because we haven't even reached
the patch at all).

Allow the caller to tell us if it called us already and saw the
patch header (in which case we shouldn't be looking for another one,
until we see the "diff" part of the patch; instead we simply should
be skipping these lines as part of the commit log message), and skip
the header processing logic when that is the case.  In the helper
function, it also needs to flip this "are we looking for a header?"
bit, once it finished skipping the commit log message and started
processing the patches, as the patch header of the _next_ message is
the only clue in the input that the current patch is done.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>patch-id: rewrite code that detects the beginning of a patch</title>
<updated>2024-07-30T01:19:14Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-07-30T01:17:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3f288b6fafbb8c76edf53fba7cea90d5a20a9c56'/>
<id>urn:sha1:3f288b6fafbb8c76edf53fba7cea90d5a20a9c56</id>
<content type='text'>
The get_one_patchid() function reads input lines until it finds a
patch header (the line that begins a patch), whose beginning is one
of:

 (1) an "&lt;object name&gt;", which is what "git diff-tree --stdin" shows;
 (2) "commit &lt;object name&gt;", which is what "git log" shows; or
 (3) "From &lt;object name&gt;",  which is what "git log --format=email" shows.

When it finds such a line, it returns to the caller, reporting the
&lt;object name&gt; it found, and the size of the "patch" it processed.

The caller then calls the function again, which then ignores the
commit log message, and then processes the lines in the patch part
until it hits another "beginning of a patch".

The above logic was fairly easy to see until 2bb73ae8 (patch-id: use
starts_with() and skip_prefix(), 2016-05-28) reorganized the code,
which made another logic that has nothing to do with the "where does
the next patch begin?" logic, which came from 2485eab5
(git-patch-id: do not trip over "no newline" markers, 2011-02-17)
that ignores the "\ No newline at the end", rolled into the same
single if() statement.

Let's split it out.  The "\ No newline at the end" marker is part of
the patch, should not appear before we start reading the patch part,
and does not belong to the detection of patch header.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>patch-id: make get_one_patchid() more extensible</title>
<updated>2024-07-30T01:19:14Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-07-30T01:17:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2438294a13dc3f04684bcfb1af75008b3ca07f86'/>
<id>urn:sha1:2438294a13dc3f04684bcfb1af75008b3ca07f86</id>
<content type='text'>
We pass two independent Boolean flags (i.e. do we want the stable
variant of patch-id?  do we want to hash the stuff verbatim?) into
the function as two separate parameters.  Before adding the third
one and make the interface even wider, let's consolidate them into
a single flag word.

No changes in behaviour.  Just a trivial interface change.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>patch-id: call flush_current_id() only when needed</title>
<updated>2024-07-30T01:19:14Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-07-30T01:17:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c92f3195adf61f248af5bc787def379b2db6f2e9'/>
<id>urn:sha1:c92f3195adf61f248af5bc787def379b2db6f2e9</id>
<content type='text'>
The caller passes a flag that is used to become no-op when calling
flush_current_id().  Instead of calling something that becomes a
no-op, teach the caller not to call it in the first place.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hash: require hash algorithm in `oidread()` and `oidclr()`</title>
<updated>2024-06-14T17:26:32Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-06-14T06:49:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9da95bda74cf10e1475384a71fd20914c3b99784'/>
<id>urn:sha1:9da95bda74cf10e1475384a71fd20914c3b99784</id>
<content type='text'>
Both `oidread()` and `oidclr()` use `the_repository` to derive the hash
function that shall be used. Require callers to pass in the hash
algorithm to get rid of this implicit dependency.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>builtin/patch-id: fix uninitialized hash function</title>
<updated>2024-05-21T16:05:13Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-05-20T23:14:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4a1c95931f5f0dc541925703b7d9ad9409cd4067'/>
<id>urn:sha1:4a1c95931f5f0dc541925703b7d9ad9409cd4067</id>
<content type='text'>
In c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), we have adapted `initialize_repository()` to no longer set
up a default hash function. As this function is also used to set up
`the_repository`, the consequence is that `the_hash_algo` will now by
default be a `NULL` pointer unless the hash algorithm was configured
properly. This is done as a mechanism to detect cases where we may be
using the wrong hash function by accident.

This change now causes git-patch-id(1) to segfault when it's run outside
of a repository. As this command can read diffs from stdin, it does not
necessarily need a repository, but then relies on `the_hash_algo` to
compute the patch ID itself.

It is somewhat dubious that git-patch-id(1) relies on `the_hash_algo` in
the first place. Quoting its manpage:

    A "patch ID" is nothing but a sum of SHA-1 of the file diffs
    associated with a patch, with line numbers ignored. As such, it’s
    "reasonably stable", but at the same time also reasonably unique,
    i.e., two patches that have the same "patch ID" are almost
    guaranteed to be the same thing.

We explicitly document patch IDs to be using SHA-1. Furthermore, patch
IDs are supposed to be stable for most of the part. But even with the
same input, the patch IDs will now be different depending on the repo's
configured object hash.

Work around the issue by setting up SHA-1 when there was no startup
repository for now. This is arguably not the correct fix, but for now we
rather want to focus on getting the segfault fixed.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'gc/config-context'</title>
<updated>2023-07-06T18:54:48Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-07-06T18:54:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b3d1c85d4833aef546f11e4d37516a1ececaefc3'/>
<id>urn:sha1:b3d1c85d4833aef546f11e4d37516a1ececaefc3</id>
<content type='text'>
Reduce reliance on a global state in the config reading API.

* gc/config-context:
  config: pass source to config_parser_event_fn_t
  config: add kvi.path, use it to evaluate includes
  config.c: remove config_reader from configsets
  config: pass kvi to die_bad_number()
  trace2: plumb config kvi
  config.c: pass ctx with CLI config
  config: pass ctx with config files
  config.c: pass ctx in configsets
  config: add ctx arg to config_fn_t
  urlmatch.h: use config_fn_t type
  config: inline git_color_default_config
</content>
</entry>
<entry>
<title>config: add ctx arg to config_fn_t</title>
<updated>2023-06-28T21:06:39Z</updated>
<author>
<name>Glen Choo</name>
<email>chooglen@google.com</email>
</author>
<published>2023-06-28T19:26:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a4e7e317f8f27f861321e6eb08b9c8c0f3ab570c'/>
<id>urn:sha1:a4e7e317f8f27f861321e6eb08b9c8c0f3ab570c</id>
<content type='text'>
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).

In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.

Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:

- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed

Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.

The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:

- trace2/tr2_cfg.c:tr2_cfg_set_fl()

  This is indirectly called by git_config_set() so that the trace2
  machinery can notice the new config values and update its settings
  using the tr2 config parsing function, i.e. tr2_cfg_cb().

- builtin/checkout.c:checkout_main()

  This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
  This might be worth refactoring away in the future, since
  git_xmerge_config() can call git_default_config(), which can do much
  more than just parsing.

Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.

Signed-off-by: Glen Choo &lt;chooglen@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
