<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/commit-reach.c, branch v2.30.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.30.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.30.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2020-10-02T17:26:31Z</updated>
<entry>
<title>commit-reach: fix in_merge_bases_many bug</title>
<updated>2020-10-02T17:26:31Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2020-10-02T14:58:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8791bf18414a37205127e184c04cad53a43aeff1'/>
<id>urn:sha1:8791bf18414a37205127e184c04cad53a43aeff1</id>
<content type='text'>
Way back in f9b8908b (commit.c: use generation numbers for
in_merge_bases(), 2018-05-01), a heuristic was used to short-circuit
the in_merge_bases() walk. This works just fine as long as the
caller is checking only two commits, but when there are multiple,
there is a possibility that this heuristic is _very wrong_.

Some code moves since then has changed this method to
repo_in_merge_bases_many() inside commit-reach.c. The heuristic
computes the minimum generation number of the "reference" list, then
compares this number to the generation number of the "commit".

In a recent topic, a test was added that used in_merge_bases_many()
to test if a commit was reachable from a number of commits pulled
from a reflog. However, this highlighted the problem: if any of the
reference commits have a smaller generation number than the given
commit, then the walk is skipped _even if there exist some with
higher generation number_.

This heuristic is wrong! It must check the MAXIMUM generation number
of the reference commits, not the MINIMUM.

This highlights a testing gap. t6600-test-reach.sh covers many
methods in commit-reach.c, including in_merge_bases() and
get_merge_bases_many(), but since these methods either restrict to
two input commits or actually look for the full list of merge bases,
they don't check this heuristic!

Add a possible input to "test-tool reach" that tests
in_merge_bases_many() and add tests to t6600-test-reach.sh that
cover this heuristic. This includes cases for the reference commits
having generation above and below the generation of the input commit,
but also having maximum generation below the generation of the input
commit.

The fix itself is to swap min_generation with a max_generation in
repo_in_merge_bases_many().

Reported-by: Srinidhi Kaushik &lt;shrinidhi.kaushik@gmail.com&gt;
Helped-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'cb/is-descendant-of'</title>
<updated>2020-07-07T05:09:16Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-07-07T05:09:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0258ed1e08e7973f1d6829db8d33109851067a91'/>
<id>urn:sha1:0258ed1e08e7973f1d6829db8d33109851067a91</id>
<content type='text'>
Code clean-up.

* cb/is-descendant-of:
  commit-reach: avoid is_descendant_of() shim
</content>
</entry>
<entry>
<title>Merge branch 'ak/commit-graph-to-slab'</title>
<updated>2020-07-07T05:09:14Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-07-07T05:09:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d80bea479daa1d49a6b2332c4731e2647c896c29'/>
<id>urn:sha1:d80bea479daa1d49a6b2332c4731e2647c896c29</id>
<content type='text'>
A few fields in "struct commit" that do not have to always be
present have been moved to commit slabs.

* ak/commit-graph-to-slab:
  commit-graph: minimize commit_graph_data_slab access
  commit: move members graph_pos, generation to a slab
  commit-graph: introduce commit_graph_data_slab
  object: drop parsed_object_pool-&gt;commit_count
</content>
</entry>
<entry>
<title>Merge branch 'rs/commit-reach-leakfix'</title>
<updated>2020-06-29T21:17:27Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-06-29T21:17:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=fa2c57d562921cdd3be5c7af1ee59622439bc8cd'/>
<id>urn:sha1:fa2c57d562921cdd3be5c7af1ee59622439bc8cd</id>
<content type='text'>
Leakfix.

* rs/commit-reach-leakfix:
  commit-reach: plug minor memory leak after using is_descendant_of()
</content>
</entry>
<entry>
<title>commit-reach: avoid is_descendant_of() shim</title>
<updated>2020-06-23T23:36:53Z</updated>
<author>
<name>Carlo Marcelo Arenas Belón</name>
<email>carenas@gmail.com</email>
</author>
<published>2020-06-23T18:42:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c1ea625f72110fda87fd225d7dff002befb82d9f'/>
<id>urn:sha1:c1ea625f72110fda87fd225d7dff002befb82d9f</id>
<content type='text'>
d91d6fbf26 (commit-reach: create repo_is_descendant_of(), 2020-06-17)
adds a repository aware version of is_descendant_of() and a backward
compatibility shim that is barely used.

Update all callers to directly use the new repo_is_descendant_of()
function instead; making the codebase simpler and pushing more
the_repository references higher up the stack.

Helped-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Carlo Marcelo Arenas Belón &lt;carenas@gmail.com&gt;
Reviewed-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-reach: plug minor memory leak after using is_descendant_of()</title>
<updated>2020-06-19T18:06:01Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2020-06-19T13:13:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d546fe2874ce8dc73cb0ac7541640fd202ec27c8'/>
<id>urn:sha1:d546fe2874ce8dc73cb0ac7541640fd202ec27c8</id>
<content type='text'>
ref_newer() builds a commit_list to pass a single potential ancestor to
is_descendant_of().  The latter leaves the list intact.  Release the
allocated memory after the call.

Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Acked-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: minimize commit_graph_data_slab access</title>
<updated>2020-06-17T21:37:52Z</updated>
<author>
<name>Abhishek Kumar</name>
<email>abhishekkumar8222@gmail.com</email>
</author>
<published>2020-06-17T09:14:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c752ad09c4ea479e8d54d08637cc0e5709723208'/>
<id>urn:sha1:c752ad09c4ea479e8d54d08637cc0e5709723208</id>
<content type='text'>
In an earlier patch, multiple struct acccesses to `graph_pos` and
`generation` were auto-converted to multiple method calls.

Since the values are fixed and commit-slab access costly, we would be
better off with storing the values as a local variable and reusing it.

Signed-off-by: Abhishek Kumar &lt;abhishekkumar8222@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit: move members graph_pos, generation to a slab</title>
<updated>2020-06-17T21:37:30Z</updated>
<author>
<name>Abhishek Kumar</name>
<email>abhishekkumar8222@gmail.com</email>
</author>
<published>2020-06-17T09:14:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c49c82aa4c1036ba1629f73223cff53230e695f3'/>
<id>urn:sha1:c49c82aa4c1036ba1629f73223cff53230e695f3</id>
<content type='text'>
We remove members `graph_pos` and `generation` from the struct commit.
The default assignments in init_commit_node() are no longer valid,
which is fine as the slab helpers return appropriate default values and
the assignments are removed.

We will replace existing use of commit-&gt;generation and commit-&gt;graph_pos
by commit_graph_data_slab helpers using
`contrib/coccinelle/commit.cocci'.

Signed-off-by: Abhishek Kumar &lt;abhishekkumar8222@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-reach: use fast logic in repo_in_merge_base</title>
<updated>2020-06-17T20:49:38Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2020-06-17T17:24:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=80b8ada54756e972b01d20b98a388fbbc3a2fe58'/>
<id>urn:sha1:80b8ada54756e972b01d20b98a388fbbc3a2fe58</id>
<content type='text'>
The repo_is_descendant_of() method is aware of the existence of the
commit-graph file. It checks for generation_numbers_enabled() before
deciding on using can_all_from_reach() or repo_in_merge_bases()
depending on the situation. The reason here is that can_all_from_reach()
uses a depth-first search that is limited by the minimum generation
number of the target commits, and that algorithm can be very slow when
generation numbers are not present. The alternative uses
paint_down_to_common() which will walk the entire merge-base boundary,
which is typically slower.

This method is used by commands like "git tag --contains" and "git
branch --contains" for very fast results when a commit-graph file
exists. Unfortunately, it is _not_ used in commands like "git merge-base
--is-ancestor" which is doing an even simpler request.

This issue was raised recently [1] with respect to a change to how
generation numbers are stored, but was also reported much earlier [2]
before commit-reach.c existed to simplify these reachability queries.

[1] https://lore.kernel.org/git/20200607195347.GA8232@szeder.dev/
[2] https://lore.kernel.org/git/87608bawoa.fsf@evledraar.gmail.com/

The root cause is that builtin/merge-base.c has a method
handle_is_ancestor() that calls in_merge_bases(), an older version of
repo_in_merge_bases(). It would be better if we have every caller to
in_merge_bases() use the logic in can_all_from_reach() when possible.

This is where things get a little tricky: repo_is_descendant_of() calls
repo_in_merge_bases() in the non-generation numbers enabled case! If we
simply update repo_in_merge_bases() to call repo_is_descendant_of()
instead of repo_in_merge_bases_many(), then we will get a recursive call
loop. Thankfully, this is caught by the test suite in the default mode
(i.e. GIT_TEST_COMMIT_GRAPH=0).

The trick, then, is to make the non-generation number case for
repo_is_descendant_of() call repo_in_merge_bases_many() directly,
skipping the non-_many version. This allows us to take advantage of this
faster code path, when possible.

The easiest way to measure the performance impact is to test the
following command on the Linux kernel repository:

	git merge-base --is-ancestor &lt;A&gt; &lt;B&gt;

  | A    | B    | Time Before | Time After |
  |------|------|-------------|------------|
  | v3.0 | v5.7 | 0.459s      | 0.028s     |
  | v4.0 | v5.7 | 0.267s      | 0.021s     |
  | v5.0 | v5.7 | 0.074s      | 0.013s     |

Note that each of these samples return success. The old code performed
the same operation when &lt;A&gt; and &lt;B&gt; are swapped. However,
can_all_from_reach() will return immediately if the generation numbers
show that &lt;A&gt; has larger generation number than &lt;B&gt;. Thus, the time for
the swapped case is universally 0.004s in each case.

Reported-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Reported-by: SZEDER Gábor &lt;szeder.dev@gmail.com&gt;
Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-reach: create repo_is_descendant_of()</title>
<updated>2020-06-17T20:49:36Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2020-06-17T17:24:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d91d6fbf26663ab64c453411c2de6cf4ea754246'/>
<id>urn:sha1:d91d6fbf26663ab64c453411c2de6cf4ea754246</id>
<content type='text'>
The next change will make repo_in_merge_bases() depend on the logic in
is_descendant_of(), but we need to make the method independent of
the_repository first.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
