<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/run-command.h, branch v2.37.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.37.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.37.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2022-06-13T22:53:41Z</updated>
<entry>
<title>Merge branch 'ab/hooks-regression-fix'</title>
<updated>2022-06-13T22:53:41Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2022-06-13T22:53:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1a7f6be5b17f572fc68ff2a2e0c079d50c671c74'/>
<id>urn:sha1:1a7f6be5b17f572fc68ff2a2e0c079d50c671c74</id>
<content type='text'>
In Git 2.36 we revamped the way hooks are invoked.  One change
that is end-user visible is that the output of a hook is no longer
directly connected to the standard output of "git" that spawns the
hook, which was noticed post release.  This is getting corrected.

* ab/hooks-regression-fix:
  hook API: fix v2.36.0 regression: hooks should be connected to a TTY
  run-command: add an "ungroup" option to run_process_parallel()
</content>
</entry>
<entry>
<title>Merge branch 'ab/env-array'</title>
<updated>2022-06-10T22:04:13Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2022-06-10T22:04:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c21fa3bb549a7769f9d508f0a5f95c654539e1f7'/>
<id>urn:sha1:c21fa3bb549a7769f9d508f0a5f95c654539e1f7</id>
<content type='text'>
Rename .env_array member to .env in the child_process structure.

* ab/env-array:
  run-command API users: use "env" not "env_array" in comments &amp; names
  run-command API: rename "env_array" to "env"
</content>
</entry>
<entry>
<title>run-command: add an "ungroup" option to run_process_parallel()</title>
<updated>2022-06-07T17:01:41Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2022-06-07T08:48:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=fd3aaf53f713d424d3c08cffa7e76e29b31638ba'/>
<id>urn:sha1:fd3aaf53f713d424d3c08cffa7e76e29b31638ba</id>
<content type='text'>
Extend the parallel execution API added in c553c72eed6 (run-command:
add an asynchronous parallel child processor, 2015-12-15) to support a
mode where the stdout and stderr of the processes aren't captured and
output in a deterministic order. Instead we'll leave it to the kernel
and stdio to sort it out.

This gives the API the same functionality as GNU parallel's --ungroup
option. As we'll see in a subsequent commit the main reason to want
this is to support stdout and stderr being connected to the TTY in the
case of jobs=1, demonstrated here with GNU parallel:

	$ parallel --ungroup 'test -t {} &amp;&amp; echo TTY || echo NTTY' ::: 1 2
	TTY
	TTY
	$ parallel 'test -t {} &amp;&amp; echo TTY || echo NTTY' ::: 1 2
	NTTY
	NTTY
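
As a rough illustration of the same effect outside of GNU parallel
(a sketch in Python, not this API's C code; the names "CHILD",
"grouped_child_output" and "run_ungrouped_child" are hypothetical): a
child whose stdout is captured through a pipe never sees a TTY there,
while a child that inherits the parent's stdout sees whatever the
parent has.

```python
import subprocess
import sys

# Child program: report whether its own stdout is a terminal.
CHILD = "import sys; print('TTY' if sys.stdout.isatty() else 'NTTY', end='')"


def grouped_child_output():
    # "Grouped" mode: the parent captures the child's stdout via a pipe,
    # so the child's stdout is never a TTY.
    res = subprocess.run([sys.executable, "-c", CHILD],
                         stdout=subprocess.PIPE, text=True, check=True)
    return res.stdout


def run_ungrouped_child():
    # "Ungrouped" mode: the child inherits the parent's stdout unchanged.
    # It reports "TTY" only if the parent's stdout is itself a terminal.
    return subprocess.run([sys.executable, "-c", CHILD], check=True).returncode
```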

Another reason, as GNU parallel's documentation notes, is a potential
for optimization. As demonstrated in the next commit, our results with "git
hook run" will be similar, but generally speaking this shows that if
you want to run processes in parallel where the exact order isn't
important this can be a lot faster:

	$ hyperfine -r 3 -L o ,--ungroup 'parallel {o} seq ::: 10000000 &gt;/dev/null '
	Benchmark 1: parallel  seq ::: 10000000 &gt;/dev/null
	  Time (mean ± σ):     220.2 ms ±   9.3 ms    [User: 124.9 ms, System: 96.1 ms]
	  Range (min … max):   212.3 ms … 230.5 ms    3 runs

	Benchmark 2: parallel --ungroup seq ::: 10000000 &gt;/dev/null
	  Time (mean ± σ):     154.7 ms ±   0.9 ms    [User: 136.2 ms, System: 25.1 ms]
	  Range (min … max):   153.9 ms … 155.7 ms    3 runs

	Summary
	  'parallel --ungroup seq ::: 10000000 &gt;/dev/null ' ran
	    1.42 ± 0.06 times faster than 'parallel  seq ::: 10000000 &gt;/dev/null '

A large part of the juggling in the API is to make the API safer for
its maintenance and consumers alike.

For the maintenance of the API we e.g. avoid malloc()-ing the
"pp-&gt;pfd", ensuring that SANITIZE=address and other similar tools will
catch any unexpected misuse.

For API consumers we take pains to never pass the non-NULL "out"
buffer to an API user that provided the "ungroup" option. The
resulting code in t/helper/test-run-command.c isn't typical of such a
user, i.e. they'd typically use one mode or the other, and would know
whether they'd provided "ungroup" or not.

We could also avoid the strbuf_init() for "buffered_output" by having
"struct parallel_processes" use a static PARALLEL_PROCESSES_INIT
initializer, but let's leave that cleanup for later.

Using a global "run_processes_parallel_ungroup" variable to enable
this option is rather nasty, but is being done here to keep the
change as small as possible for a subsequent regression fix. This
change is extracted from a larger initial version[1] which ends up
with a better end-state for the API, but in doing so needed to modify
all existing callers of the API. Let's defer that for now, and
narrowly focus on what we need for fixing the regression in the
subsequent commit.

It's safe to do this with a global variable because:

 A) hook.c is the only user of it that sets it to non-zero, and before
    we'll get any other API users we'll refactor away this method of
    passing in the option, i.e. re-roll [1].

 B) Even if hook.c wasn't the only user we don't have callers of this
    API that concurrently invoke this parallel process starting API
    itself in parallel.

As noted above "A" &amp;&amp; "B" are rather nasty, and we don't want to live
with those caveats long-term, but for now they should be an acceptable
compromise.

1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>run-command API users: use "env" not "env_array" in comments &amp; names</title>
<updated>2022-06-02T21:31:27Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2022-06-02T09:09:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b3193252c4278e9039fbb896a35f84abc1fb5aac'/>
<id>urn:sha1:b3193252c4278e9039fbb896a35f84abc1fb5aac</id>
<content type='text'>
Follow-up on a preceding commit which changed all references to the
"env_array" when referring to the "struct child_process" member. These
changes are all unnecessary for the compiler, but help the code's
human readers.

All the comments that referred to "env_array" have now been updated,
as well as function names and variables that had "env_array" in their
name; they now refer to "env".

In addition the "out" name for the submodule.h prototype was
inconsistent with the function definition's use of "env_array" in
submodule.c. Both of them use "env" now.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>run-command API: rename "env_array" to "env"</title>
<updated>2022-06-02T21:31:16Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2022-06-02T09:09:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=29fda24dd11e90583f3ea9ff2f90ee9acacd7792'/>
<id>urn:sha1:29fda24dd11e90583f3ea9ff2f90ee9acacd7792</id>
<content type='text'>
Start following up on the rename mentioned in c7c4bdeccf3 (run-command
API: remove "env" member, always use "env_array", 2021-11-25) of
"env_array" to "env".

The "env_array" name was picked in 19a583dc39e (run-command: add
env_array, an optional argv_array for env, 2014-10-19) because "env"
was taken. Let's not forever keep the oddity of "*_array" for this
"struct strvec", but not for its "args" sibling.

This commit is almost entirely made with a coccinelle rule[1]. The
only manual change here is in run-command.h to rename the struct
member itself and to change "env_array" to "env" in the
CHILD_PROCESS_INIT initializer.

The rest of this is all a result of applying [1]:

 * make contrib/coccinelle/run_command.cocci.patch
 * patch -p1 &lt;contrib/coccinelle/run_command.cocci.patch
 * git add -u

1. cat contrib/coccinelle/run_command.pending.cocci
   @@
   struct child_process E;
   @@
   - E.env_array
   + E.env

   @@
   struct child_process *E;
   @@
   - E-&gt;env_array
   + E-&gt;env

I've avoided changing any comments and derived variable names here,
that will all be done in the next commit.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>run-command.h: remove always unused "clean_on_exit_handler_cbdata"</title>
<updated>2022-04-01T17:16:03Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2022-03-31T01:45:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=551f5022019803cceb0fa3943f696fae32f093f9'/>
<id>urn:sha1:551f5022019803cceb0fa3943f696fae32f093f9</id>
<content type='text'>
Remove a "struct child_process" member added in
ac2fbaa674c (run-command: add clean_on_exit_handler, 2016-10-16), but
which was never used.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>run-command: remove old run_hook_{le,ve}() hook API</title>
<updated>2022-01-07T23:19:35Z</updated>
<author>
<name>Emily Shaffer</name>
<email>emilyshaffer@google.com</email>
</author>
<published>2021-12-22T03:59:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=95ba86a203213fb828de096dc0dba18ce94598f7'/>
<id>urn:sha1:95ba86a203213fb828de096dc0dba18ce94598f7</id>
<content type='text'>
The new hook.h library has replaced all run-command.h hook-related
functionality. So let's delete this dead code.

Signed-off-by: Emily Shaffer &lt;emilyshaffer@google.com&gt;
Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Acked-by: Emily Shaffer &lt;emilyshaffer@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>read-cache: convert post-index-change to use hook.h</title>
<updated>2022-01-07T23:19:35Z</updated>
<author>
<name>Emily Shaffer</name>
<email>emilyshaffer@google.com</email>
</author>
<published>2021-12-22T03:59:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=dbb1c61365baabf4e74c6ef2eee4c4a520056c1d'/>
<id>urn:sha1:dbb1c61365baabf4e74c6ef2eee4c4a520056c1d</id>
<content type='text'>
Move the post-index-change hook away from run-command.h and over to
the new hook.h library.

This removes the last direct user of "run_hook_ve()" outside of
run-command.c ("run_hook_le()" still uses it). So we can make the
function static now. A subsequent commit will remove this code
entirely when "run_hook_le()" itself goes away.

Signed-off-by: Emily Shaffer &lt;emilyshaffer@google.com&gt;
Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Acked-by: Emily Shaffer &lt;emilyshaffer@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>run-command API: remove "env" member, always use "env_array"</title>
<updated>2021-11-26T06:15:08Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-11-25T22:52:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c7c4bdeccf3e737e6e674cd9f0828922e629ab06'/>
<id>urn:sha1:c7c4bdeccf3e737e6e674cd9f0828922e629ab06</id>
<content type='text'>
Remove the "env" member from "struct child_process" in favor of always
using the "env_array". As with the preceding removal of "argv" in
favor of "args" this gets rid of current and future oddities around
memory management at the API boundary (see the amended API docs).

For some of the conversions we can replace patterns like:

    child.env = env-&gt;v;

With:

    strvec_pushv(&amp;child.env_array, env-&gt;v);

But for others we need to guard the strvec_pushv() with a NULL check,
since we're not passing in the "v" member of a "struct strvec",
e.g. in the case of tmp_objdir_env()'s return value.

Ideally we'd rename the "env_array" member to simply "env" as a
follow-up, since it and "args" are now inconsistent: only one of them
has an "_array" suffix, seemingly without any good reason, unless we
look at the history of how they came to be.

But as we've currently got 122 in-tree hits for a "git grep env_array"
let's leave that for now (and possibly forever). Doing that rename
would be too disruptive.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>run-command API: remove "argv" member, always use "args"</title>
<updated>2021-11-26T06:15:07Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-11-25T22:52:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d3b2159712019a06f1f495d3e42bd6aa6e76e848'/>
<id>urn:sha1:d3b2159712019a06f1f495d3e42bd6aa6e76e848</id>
<content type='text'>
Remove the "argv" member from the run-command API. Ever since "args"
was added in c460c0ecdca (run-command: store an optional argv_array,
2014-05-15), being able to provide either "argv" or "args" has led to
some confusion and bugs.

If we hadn't gone in that direction and only had an "argv" our
problems wouldn't have been solved either, as noted in [1] (and in the
documentation amended here) it comes with inherent memory management
issues: The caller would have to hang on to the "argv" until the
run-command API was finished. If the "argv" was an argument to main()
this wasn't an issue, but if it was manually constructed, using the
API might be painful.

We also have a recent report[2] of a user of the API segfaulting,
which is a direct result of it being complex to use. This commit
addresses the root cause of that bug.

This change is larger than I'd like, but there's no easy way to avoid
it that wouldn't involve even more verbose intermediate steps. We use
the "argv" as the source of truth over the "args", so we need to
change all parts of run-command.[ch] itself, as well as the trace2
logging at the same time.

The resulting Windows-specific code in start_command() is a bit nasty,
as we're now assigning to a strvec's "v" member, instead of to our own
"argv". There was a suggestion of some alternate approaches in reply
to an earlier version of this commit[3], but let's leave a larger
and needless refactoring of this code for now.

1. http://lore.kernel.org/git/YT6BnnXeAWn8BycF@coredump.intra.peff.net
2. https://lore.kernel.org/git/20211120194048.12125-1-ematsumiya@suse.de/
3. https://lore.kernel.org/git/patch-5.5-ea1011f7473-20211122T153605Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
