<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/commit-reach.h, branch v2.22.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.22.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.22.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2019-02-05T22:26:09Z</updated>
<entry>
<title>Merge branch 'sb/more-repo-in-api'</title>
<updated>2019-02-05T22:26:09Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-02-05T22:26:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b99a579f8e434a7757f90895945b5711b3f159d5'/>
<id>urn:sha1:b99a579f8e434a7757f90895945b5711b3f159d5</id>
<content type='text'>
The in-core repository instances are passed through more codepaths.

* sb/more-repo-in-api: (23 commits)
  t/helper/test-repository: celebrate independence from the_repository
  path.h: make REPO_GIT_PATH_FUNC repository agnostic
  commit: prepare free_commit_buffer and release_commit_memory for any repo
  commit-graph: convert remaining functions to handle any repo
  submodule: don't add submodule as odb for push
  submodule: use submodule repos for object lookup
  pretty: prepare format_commit_message to handle arbitrary repositories
  commit: prepare logmsg_reencode to handle arbitrary repositories
  commit: prepare repo_unuse_commit_buffer to handle any repo
  commit: prepare get_commit_buffer to handle any repo
  commit-reach: prepare in_merge_bases[_many] to handle any repo
  commit-reach: prepare get_merge_bases to handle any repo
  commit-reach.c: allow get_merge_bases_many_0 to handle any repo
  commit-reach.c: allow remove_redundant to handle any repo
  commit-reach.c: allow merge_bases_many to handle any repo
  commit-reach.c: allow paint_down_to_common to handle any repo
  commit: allow parse_commit* to handle any repo
  object: parse_object to honor its repository argument
  object-store: prepare has_{sha1, object}_file to handle any repo
  object-store: prepare read_object_file to deal with any repo
  ...
</content>
</entry>
<entry>
<title>commit-reach: prepare in_merge_bases[_many] to handle any repo</title>
<updated>2018-11-14T08:22:40Z</updated>
<author>
<name>Stefan Beller</name>
<email>sbeller@google.com</email>
</author>
<published>2018-11-14T00:12:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4d5430f7479d45c3fca088984a08ec093f870b5b'/>
<id>urn:sha1:4d5430f7479d45c3fca088984a08ec093f870b5b</id>
<content type='text'>
Signed-off-by: Stefan Beller &lt;sbeller@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-reach: prepare get_merge_bases to handle any repo</title>
<updated>2018-11-14T08:22:40Z</updated>
<author>
<name>Stefan Beller</name>
<email>sbeller@google.com</email>
</author>
<published>2018-11-14T00:12:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=21a9651ba3fb350f48a53bf885a225bf6b71cac3'/>
<id>urn:sha1:21a9651ba3fb350f48a53bf885a225bf6b71cac3</id>
<content type='text'>
Similarly to previous patches, the get_merge_base functions are used
often in the code base, which makes migrating them hard.

Implement the new functions, prefixed with 'repo_' and hide the old
functions behind a wrapper macro.

Signed-off-by: Stefan Beller &lt;sbeller@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ds/add-missing-tags'</title>
<updated>2018-11-13T13:37:24Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-11-13T13:37:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=291123e69babcbbab720705a1bc1e6f0ba3012a4'/>
<id>urn:sha1:291123e69babcbbab720705a1bc1e6f0ba3012a4</id>
<content type='text'>
The history traversal used to implement the tag-following has been
optimized by introducing a new helper.

* ds/add-missing-tags:
  remote: make add_missing_tags() linear
  test-reach: test get_reachable_subset
  commit-reach: implement get_reachable_subset
</content>
</entry>
<entry>
<title>Merge branch 'rj/header-cleanup'</title>
<updated>2018-11-06T06:50:23Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-11-06T06:50:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6b37389f8569780a0fffcb072047aa3e0e376945'/>
<id>urn:sha1:6b37389f8569780a0fffcb072047aa3e0e376945</id>
<content type='text'>
Code cleanup.

* rj/header-cleanup:
  commit-reach.h: add missing declarations (hdr-check)
  ewok_rlw.h: add missing 'inline' to function definition
  fetch-object.h: add missing declaration (hdr-check)
</content>
</entry>
<entry>
<title>commit-reach: implement get_reachable_subset</title>
<updated>2018-11-02T15:12:06Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2018-11-02T13:14:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=fcb2c0769db54022b5bf3ed134623fbab48cdc20'/>
<id>urn:sha1:fcb2c0769db54022b5bf3ed134623fbab48cdc20</id>
<content type='text'>
The existing reachability algorithms in commit-reach.c focus on
finding merge-bases or determining if all commits in a set X can
reach at least one commit in a set Y. However, for two commit sets
X and Y, we may also care about which commits in Y are reachable
from at least one commit in X.

Implement get_reachable_subset() which answers this question. Given
two arrays of commits, 'from' and 'to', return a commit_list with
every commit from the 'to' array that is reachable from at least
one commit in the 'from' array.

The algorithm is a simple walk starting at the 'from' commits, using
the PARENT2 flag to indicate "this commit has already been added to
the walk queue". By marking the 'to' commits with the PARENT1 flag,
we can determine when we see a commit from the 'to' array. We remove
the PARENT1 flag as we add that commit to the result list to avoid
duplicates.

The order of the resulting list is a reverse of the order that the
commits are discovered in the walk.

There are a couple of shortcuts to avoid walking more than we need:

1. We determine the minimum generation number of commits in the
   'to' array. We do not walk commits with generation number
   below this minimum.

2. We count how many distinct commits are in the 'to' array, and
   decrement this count when we discover a 'to' commit during the
   walk. If this number reaches zero, then we can terminate the
   walk.
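
The walk and its two shortcuts can be sketched as follows. This is a
minimal illustration in Python, not the C code in commit-reach.c: the
dict-based graph model, the parameter names, and the plain FIFO queue
are assumptions made for the example, and Python sets stand in for the
PARENT1 and PARENT2 flags.

```python
from collections import deque
from operator import lt  # lt(a, b) is the function form of "a is less than b"


def get_reachable_subset(parents, generation, from_commits, to_commits):
    """Return the 'to' commits reachable from at least one 'from' commit.

    parents:    dict mapping a commit to the list of its parents (assumed model)
    generation: dict mapping a commit to its generation number (assumed model)
    """
    # Stand-in for the PARENT1 flag: distinct 'to' commits not yet found.
    wanted = set(to_commits)
    if not wanted:
        return []

    # Shortcut 1: do not walk commits below the smallest 'to' generation.
    min_generation = min(generation[c] for c in wanted)

    # Stand-in for the PARENT2 flag: commits already added to the walk queue.
    queued = set()
    queue = deque()
    for commit in from_commits:
        if commit not in queued:
            queued.add(commit)
            queue.append(commit)

    found = []
    # Shortcut 2: stop once every distinct 'to' commit has been discovered.
    while queue and wanted:
        commit = queue.popleft()
        if commit in wanted:
            wanted.discard(commit)  # clear the mark to avoid duplicates
            found.append(commit)
        for parent in parents.get(commit, []):
            if parent in queued:
                continue
            if lt(generation[parent], min_generation):
                continue
            queued.add(parent)
            queue.append(parent)

    # The result is the reverse of the discovery order.
    return found[::-1]
```

With a history where C has parent B, B and D have parent A, walking
from C with 'to' set to B and D yields only B, because A is below the
minimum generation and D is never reached.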

Tests will be added using the 'test-tool reach' helper in a
subsequent commit.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-reach.h: add missing declarations (hdr-check)</title>
<updated>2018-10-29T01:14:21Z</updated>
<author>
<name>Ramsay Jones</name>
<email>ramsay@ramsayjones.plus.com</email>
</author>
<published>2018-10-27T01:53:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1406725b881900074a032e847e2d12a6a059d7e9'/>
<id>urn:sha1:1406725b881900074a032e847e2d12a6a059d7e9</id>
<content type='text'>
Add the necessary #includes and forward declarations to allow the header
file to pass the 'hdr-check' target.

Note that, since this header includes the commit-slab implementation
header file (indirectly via commit-slab.h), some of the commit-slab
inline functions (e.g. contains_cache_at_peek()) will not compile without
the complete type of 'struct commit'. Hence, we replace the forward
declaration of 'struct commit' with an #include of the 'commit.h'
header file.

It is possible, using the 'commit-slab-{decl,impl}.h' files, to avoid
this inclusion of the 'commit.h' header. Commit a9f1f1f9f8 ("commit-slab.h:
code split", 2018-05-19) separated the commit-slab interface from its
implementation, to allow for the definition of a public commit-slab data
structure. This enabled us to avoid including the commit-slab implementation
in a header file, which could result in the replication of the commit-slab
functions in each compilation unit in which it was included.

Indeed, if you compile with optimizations disabled, then run this script:

  $ cat -n dup-static.sh
       1 #!/bin/sh
       2
       3 nm $1 | grep ' t ' | cut -d' ' -f3 | sort | uniq -c |
       4 	sort -rn | grep -v '      1'
  $

  $ ./dup-static.sh git | grep contains
       24 init_contains_cache_with_stride
       24 init_contains_cache
       24 contains_cache_peek
       24 contains_cache_at_peek
       24 contains_cache_at
       24 clear_contains_cache
  $

you will find 24 copies of the commit-slab routines for the contains_cache.
Of course, when you enable optimizations again, these duplicate static
functions (mostly) disappear. Compiling with gcc at -O2 leaves two static
functions, thus:

  $ nm commit-reach.o | grep contains_cache
  0000000000000870 t contains_cache_at_peek.isra.1.constprop.6
  $ nm ref-filter.o | grep contains_cache
  00000000000002b0 t clear_contains_cache.isra.14
  $

However, using a shared 'contains_cache' would result in all six of the
above functions as external public functions in the git binary. At present,
only three of these functions are actually called, so the trade-off
seems to favour letting the compiler inline the commit-slab functions.

Signed-off-by: Ramsay Jones &lt;ramsay@ramsayjones.plus.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>headers: normalize the spelling of some header guards</title>
<updated>2018-10-18T04:39:35Z</updated>
<author>
<name>Ramsay Jones</name>
<email>ramsay@ramsayjones.plus.com</email>
</author>
<published>2018-10-17T22:13:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0009d3501b87bd02f6c6faba93cf991d701fa5a5'/>
<id>urn:sha1:0009d3501b87bd02f6c6faba93cf991d701fa5a5</id>
<content type='text'>
Signed-off-by: Ramsay Jones &lt;ramsay@ramsayjones.plus.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-reach: make can_all_from_reach... linear</title>
<updated>2018-07-20T22:38:56Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2018-07-20T16:33:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4fbcca4effc1c6f8431120f88f5a4bd1c8e38ca3'/>
<id>urn:sha1:4fbcca4effc1c6f8431120f88f5a4bd1c8e38ca3</id>
<content type='text'>
The can_all_from_reach_with_flags() algorithm is currently quadratic in
the worst case, because it calls the reachable() method for every 'from'
without tracking which commits have already been walked or which can
already reach a commit in 'to'.

Rewrite the algorithm to walk each commit a constant number of times.

We also add some optimizations that should work for the main consumer of
this method: fetch negotiation (haves/wants).

The first step includes using a depth-first-search (DFS) from each
'from' commit, sorted by ascending generation number. We do not walk
beyond the minimum generation number or the minimum commit date. This
DFS is likely to be faster than the existing reachable() method because
we expect previous ref values to be along the first-parent history.

If we find a target commit, then we mark everything in the DFS stack as
a RESULT. This expands the set of targets for the other 'from' commits.
We also mark the visited commits using 'assign_flag' to prevent re-
walking the same commits.

We still need to clear our flags at the end, which is why we will have a
total of three visits to each commit.
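
The DFS described above can be sketched as follows. This is a minimal
illustration in Python under stated assumptions, not the C code in
commit-reach.c: the dict-based graph model and parameter names are
invented for the example, the 'reaches' set stands in for the 'to'
flag plus RESULT, the 'visited' set stands in for 'assign_flag', and
the minimum commit date cutoff and the final flag clearing are left
out.

```python
from operator import lt  # lt(a, b) is the function form of "a is less than b"


def can_all_from_reach(parents, generation, from_commits, to_commits):
    """Return True if every 'from' commit can reach some 'to' commit.

    parents:    dict mapping a commit to the list of its parents (assumed model)
    generation: dict mapping a commit to its generation number (assumed model)
    """
    # Never walk below the smallest generation number among the targets.
    min_generation = min((generation[c] for c in to_commits), default=0)

    reaches = set(to_commits)  # targets, plus every commit marked RESULT
    visited = set()            # commits already walked by some DFS

    # Process the 'from' commits by ascending generation number.
    for start in sorted(set(from_commits), key=lambda c: generation[c]):
        visited.add(start)
        stack = [start]
        while stack:
            top = stack[-1]
            if top in reaches:
                # Found a target: pop it and mark the commit below it, so
                # the mark spreads through the whole DFS stack.
                stack.pop()
                if stack:
                    reaches.add(stack[-1])
                continue
            pushed = False
            for parent in parents.get(top, []):
                if parent in reaches:
                    reaches.add(top)
                if parent in visited:
                    continue
                visited.add(parent)
                if lt(generation[parent], min_generation):
                    continue
                stack.append(parent)
                pushed = True
                break
            if not pushed:
                stack.pop()
        if start not in reaches:
            return False
    return True
```

Because a commit enters the stack at most once across all 'from'
commits, adding more 'from' commits does not repeat earlier walks.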

Performance was measured on the Linux repository using
'test-tool reach can_all_from_reach'. The input included rows seeded by
tag values. The "small" case included X-rows as v4.[0-9]* and Y-rows as
v3.[0-9]*. This mimics a (very large) fetch that says "I have all major
v3 releases and want all major v4 releases." The "large" case included
X-rows as "v4.*" and Y-rows as "v3.*". This adds all release-candidate
tags to the set, which does not greatly increase the number of objects
that are considered, but does increase the number of 'from' commits,
demonstrating the quadratic nature of the previous code.

Small Case:

Before: 1.52 s
 After: 0.26 s

Large Case:

Before: 3.50 s
 After: 0.27 s

Note how the time increases between the two cases in the two versions.
The new code increases relative to the number of commits that need to be
walked, but not directly relative to the number of 'from' commits.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>test-reach: test can_all_from_reach_with_flags</title>
<updated>2018-07-20T22:38:56Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2018-07-20T16:33:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1792bc125069e3e5b59f0158e259335a07aa7cf5'/>
<id>urn:sha1:1792bc125069e3e5b59f0158e259335a07aa7cf5</id>
<content type='text'>
The can_all_from_reach_with_flags method is used by ok_to_give_up in
upload-pack.c to see if we have done enough negotiation during a fetch.
This method is intentionally created to preserve state between calls to
assist with stateful negotiation, such as over SSH.

To make this method testable, add a new can_all_from_reach method that
does the initial setup and final tear-down. We will later use this
method in production code. Call the method from 'test-tool reach' for
now.

Since this is a many-to-many reachability query, add a new type of input
to the 'test-tool reach' input format. Lines "Y:&lt;committish&gt;" create a
list of commits to be the reachability targets from the commits in the
'X' list. In the context of fetch negotiation, the 'X' commits are the
'want' commits and the 'Y' commits are the 'have' commits.
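
As an illustration of the input format only, the following Python
sketch splits such input into the 'X' and 'Y' lists. It is not the
parser in 'test-tool reach'; the function name, the handling of other
prefixes, and the commit names in the usage example are hypothetical.

```python
def parse_reach_input(text):
    """Split 'X:...' and 'Y:...' lines into two lists of committish strings."""
    x_commits = []  # reachability sources (the 'want' side in a fetch)
    y_commits = []  # reachability targets (the 'have' side in a fetch)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so the committish is kept whole.
        kind, sep, name = line.partition(":")
        if not sep:
            continue
        if kind == "X":
            x_commits.append(name)
        elif kind == "Y":
            y_commits.append(name)
        # Lines with any other prefix are ignored in this sketch.
    return x_commits, y_commits


# Hypothetical commit names, for illustration only.
example = "X:commit-a\nX:commit-b\nY:commit-c\n"
sources, targets = parse_reach_input(example)
```

Here 'sources' holds the two 'X' entries and 'targets' holds the single
'Y' entry, which is the shape a many-to-many query needs.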

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
