<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/object.h, branch v2.22.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.22.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.22.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2019-05-05T06:20:06Z</updated>
<entry>
<title>*.[ch]: remove extern from function declarations using spatch</title>
<updated>2019-05-05T06:20:06Z</updated>
<author>
<name>Denton Liu</name>
<email>liu.denton@gmail.com</email>
</author>
<published>2019-04-29T08:28:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=554544276a604c144df45efcb060c80aa322088c'/>
<id>urn:sha1:554544276a604c144df45efcb060c80aa322088c</id>
<content type='text'>
There has been a push to remove extern from function declarations.
Remove some instances of "extern" for function declarations which are
caught by Coccinelle. Note that Coccinelle has some difficulty with
processing functions with `__attribute__` or varargs so some `extern`
declarations are left behind to be dealt with in a future patch.

This was the Coccinelle patch used:

	@@
	type T;
	identifier f;
	@@
	- extern
	  T f(...);

and it was run with:

	$ git ls-files \*.{c,h} |
		grep -v ^compat/ |
		xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place

Files under `compat/` are intentionally excluded as some are directly
copied from external sources and we should avoid churning them as much
as possible.

Signed-off-by: Denton Liu &lt;liu.denton@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>revision.c: generation-based topo-order algorithm</title>
<updated>2018-11-02T03:14:22Z</updated>
<author>
<name>Derrick Stolee</name>
<email>stolee@gmail.com</email>
</author>
<published>2018-11-01T13:46:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b45424181e9e8b2284a48c6db7b8db635bbfccc8'/>
<id>urn:sha1:b45424181e9e8b2284a48c6db7b8db635bbfccc8</id>
<content type='text'>
The current --topo-order algorithm requires walking all
reachable commits up front, topo-sorting them, all before
outputting the first value. This patch introduces a new
algorithm which uses stored generation numbers to
incrementally walk in topo-order, outputting commits as
we go. This can dramatically reduce the computation time
to write a fixed number of commits, such as when limiting
with "-n &lt;N&gt;" or filling the first page of a pager.

When running a command like 'git rev-list --topo-order HEAD',
Git performed the following steps:

1. Run limit_list(), which parses all reachable commits,
   adds them to a linked list, and distributes UNINTERESTING
   flags. If all unprocessed commits are UNINTERESTING, then
   it may terminate without walking all reachable commits.
   This does not occur if we do not specify UNINTERESTING
   commits.

2. Run sort_in_topological_order(), which is an implementation
   of Kahn's algorithm. It first iterates through the entire
   set of important commits and computes the in-degree of each
   (plus one, as we use 'zero' as a special value here). Then,
   we walk the commits in priority order, adding them to the
   priority queue if and only if their in-degree is one. As
   we remove commits from this priority queue, we decrement the
   in-degree of their parents.

3. While we are peeling commits for output, get_revision_1()
   uses pop_commit on the full list of commits computed by
   sort_in_topological_order().

In the new algorithm, these three steps correspond to three
different commit walks. We run these walks simultaneously,
and advance each only as far as necessary to satisfy the
requirements of the 'higher order' walk. We know when we can
pause each walk by using generation numbers from the commit-
graph feature.

Recall that the generation number of a commit satisfies:

* If the commit has at least one parent, then the generation
  number is one more than the maximum generation number among
  its parents.

* If the commit has no parent, then the generation number is one.

There are two special generation numbers:

* GENERATION_NUMBER_INFINITY: this value is 0xffffffff and
  indicates that the commit is not stored in the commit-graph and
  the generation number was not previously calculated.

* GENERATION_NUMBER_ZERO: this value (0) is a special indicator
  to say that the commit-graph was generated by a version of Git
  that does not compute generation numbers (such as v2.18.0).

Since we use generation_numbers_enabled() before using the new
algorithm, we do not need to worry about GENERATION_NUMBER_ZERO.
However, the existence of GENERATION_NUMBER_INFINITY implies the
following weaker statement than the usual we expect from
generation numbers:

    If A and B are commits with generation numbers gen(A) and
    gen(B) and gen(A) &lt; gen(B), then A cannot reach B.

Thus, we will walk in each of our stages until the "maximum
unexpanded generation number" is strictly lower than the
generation number of a commit we are about to use.

The walks are as follows:

1. EXPLORE: using the explore_queue priority queue (ordered by
   maximizing the generation number), parse each reachable
   commit until all commits in the queue have generation
   number strictly lower than needed. During this walk, update
   the UNINTERESTING flags as necessary.

2. INDEGREE: using the indegree_queue priority queue (ordered
   by maximizing the generation number), add one to the in-
   degree of each parent for each commit that is walked. Since
   we walk in order of decreasing generation number, we know
   that discovering an in-degree value of 0 means the value for
   that commit was not initialized, so should be initialized to
   two. (Recall that in-degree value "1" is what we use to say a
   commit is ready for output.) As we iterate the parents of a
   commit during this walk, ensure the EXPLORE walk has walked
   beyond their generation numbers.

3. TOPO: using the topo_queue priority queue (ordered based on
   the sort_order given, which could be commit-date, author-
   date, or typical topo-order which treats the queue as a LIFO
   stack), remove a commit from the queue and decrement the
   in-degree of each parent. If a parent has an in-degree of
   one, then we add it to the topo_queue. Before we decrement
   the in-degree, however, ensure the INDEGREE walk has walked
   beyond that generation number.

The implementations of these walks are in the following methods:

* explore_walk_step and explore_to_depth
* indegree_walk_step and compute_indegrees_to_depth
* next_topo_commit and expand_topo_walk

These methods have some patterns that may seem strange at first,
but they are probably carry-overs from their equivalents in
limit_list and sort_in_topological_order.

One thing that is missing from this implementation is a proper
way to stop walking when the entire queue is UNINTERESTING, so
this implementation is not enabled by comparisions, such as in
'git rev-list --topo-order A..B'. This can be updated in the
future.

In my local testing, I used the following Git commands on the
Linux repository in three modes: HEAD~1 with no commit-graph,
HEAD~1 with a commit-graph, and HEAD with a commit-graph. This
allows comparing the benefits we get from parsing commits from
the commit-graph and then again the benefits we get by
restricting the set of commits we walk.

Test: git rev-list --topo-order -100 HEAD
HEAD~1, no commit-graph: 6.80 s
HEAD~1, w/ commit-graph: 0.77 s
  HEAD, w/ commit-graph: 0.02 s

Test: git rev-list --topo-order -100 HEAD -- tools
HEAD~1, no commit-graph: 9.63 s
HEAD~1, w/ commit-graph: 6.06 s
  HEAD, w/ commit-graph: 0.06 s

This speedup is due to a few things. First, the new generation-
number-enabled algorithm walks commits on order of the number of
results output (subject to some branching structure expectations).
Since we limit to 100 results, we are running a query similar to
filling a single page of results. Second, when specifying a path,
we must parse the root tree object for each commit we walk. The
previous benefits from the commit-graph are entirely from reading
the commit-graph instead of parsing commits. Since we need to
parse trees for the same number of commits as before, we slow
down significantly from the non-path-based query.

For the test above, I specifically selected a path that is changed
frequently, including by merge commits. A less-frequently-changed
path (such as 'README') has similar end-to-end time since we need
to walk the same number of commits (before determining we do not
have 100 hits). However, get the benefit that the output is
presented to the user as it is discovered, much the same as a
normal 'git log' command (no '--topo-order'). This is an improved
user experience, even if the command has the same runtime.

Helped-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ds/reachable'</title>
<updated>2018-09-17T20:53:52Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-09-17T20:53:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1b7a91da71d42759dfb83fa3a17be54ad01f0132'/>
<id>urn:sha1:1b7a91da71d42759dfb83fa3a17be54ad01f0132</id>
<content type='text'>
The code for computing history reachability has been shuffled,
obtained a bunch of new tests to cover them, and then being
improved.

* ds/reachable:
  commit-reach: correct accidental #include of C file
  commit-reach: use can_all_from_reach
  commit-reach: make can_all_from_reach... linear
  commit-reach: replace ref_newer logic
  test-reach: test commit_contains
  test-reach: test can_all_from_reach_with_flags
  test-reach: test reduce_heads
  test-reach: test get_merge_bases_many
  test-reach: test is_descendant_of
  test-reach: test in_merge_bases
  test-reach: create new test tool for ref_newer
  commit-reach: move can_all_from_reach_with_flags
  upload-pack: generalize commit date cutoff
  upload-pack: refactor ok_to_give_up()
  upload-pack: make reachable() more generic
  commit-reach: move commit_contains from ref-filter
  commit-reach: move ref_newer from remote.c
  commit.h: remove method declarations
  commit-reach: move walk methods from commit.c
</content>
</entry>
<entry>
<title>Add missing includes and forward declarations</title>
<updated>2018-08-15T18:52:09Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2018-08-15T17:54:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ef3ca95475ce467ae883cc8175ed40e6f7d27800'/>
<id>urn:sha1:ef3ca95475ce467ae883cc8175ed40e6f7d27800</id>
<content type='text'>
I looped over the toplevel header files, creating a temporary two-line C
program for each consisting of
  #include "git-compat-util.h"
  #include $HEADER
This patch is the result of manually fixing errors in compiling those
tiny programs.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sb/object-store-lookup'</title>
<updated>2018-08-02T22:30:42Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-02T22:30:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3a2a1dc17077a27ad1a89db27cb1b4b374f3b0ff'/>
<id>urn:sha1:3a2a1dc17077a27ad1a89db27cb1b4b374f3b0ff</id>
<content type='text'>
lookup_commit_reference() and friends have been updated to find
in-core object for a specific in-core repository instance.

* sb/object-store-lookup: (32 commits)
  commit.c: allow lookup_commit_reference to handle arbitrary repositories
  commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories
  tag.c: allow deref_tag to handle arbitrary repositories
  object.c: allow parse_object to handle arbitrary repositories
  object.c: allow parse_object_buffer to handle arbitrary repositories
  commit.c: allow get_cached_commit_buffer to handle arbitrary repositories
  commit.c: allow set_commit_buffer to handle arbitrary repositories
  commit.c: migrate the commit buffer to the parsed object store
  commit-slabs: remove realloc counter outside of slab struct
  commit.c: allow parse_commit_buffer to handle arbitrary repositories
  tag: allow parse_tag_buffer to handle arbitrary repositories
  tag: allow lookup_tag to handle arbitrary repositories
  commit: allow lookup_commit to handle arbitrary repositories
  tree: allow lookup_tree to handle arbitrary repositories
  blob: allow lookup_blob to handle arbitrary repositories
  object: allow lookup_object to handle arbitrary repositories
  object: allow object_as_type to handle arbitrary repositories
  tag: add repository argument to deref_tag
  tag: add repository argument to parse_tag_buffer
  tag: add repository argument to lookup_tag
  ...
</content>
</entry>
<entry>
<title>Merge branch 'jt/fetch-pack-negotiator'</title>
<updated>2018-08-02T22:30:41Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-02T22:30:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=af8ac73801eebf0f3690c3875751eb9e108ceda8'/>
<id>urn:sha1:af8ac73801eebf0f3690c3875751eb9e108ceda8</id>
<content type='text'>
Code restructuring and a small fix to transport protocol v2 during
fetching.

* jt/fetch-pack-negotiator:
  fetch-pack: introduce negotiator API
  fetch-pack: move common check and marking together
  fetch-pack: make negotiation-related vars local
  fetch-pack: use ref adv. to prune "have" sent
  fetch-pack: directly end negotiation if ACK ready
  fetch-pack: clear marks before re-marking
  fetch-pack: split up everything_local()
</content>
</entry>
<entry>
<title>Merge branch 'jt/partial-clone-fsck-connectivity'</title>
<updated>2018-07-24T21:50:47Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-07-24T21:50:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8fa8a4f1ec088ab937d1c460f3eb1701e51b8ffc'/>
<id>urn:sha1:8fa8a4f1ec088ab937d1c460f3eb1701e51b8ffc</id>
<content type='text'>
Partial clone support of "git clone" has been updated to correctly
validate the objects it receives from the other side.  The server
side has been corrected to send objects that are directly
requested, even if they may match the filtering criteria (e.g. when
doing a "lazy blob" partial clone).

* jt/partial-clone-fsck-connectivity:
  clone: check connectivity even if clone is partial
  upload-pack: send refs' objects despite "filter"
</content>
</entry>
<entry>
<title>commit-reach: move can_all_from_reach_with_flags</title>
<updated>2018-07-20T22:38:55Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2018-07-20T16:33:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ba3ca1edce70d9d2d0eea1ac69377ae786e9413a'/>
<id>urn:sha1:ba3ca1edce70d9d2d0eea1ac69377ae786e9413a</id>
<content type='text'>
There are several commit walks in the codebase. Group them together into
a new commit-reach.c file and corresponding header. After we group these
walks into one place, we can reduce duplicate logic by calling
equivalent methods.

The can_all_from_reach_with_flags method is used in a stateful way by
upload-pack.c. The parameters are very flexible, so we will be able to
use its commit walking logic for many other callers.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-reach: move walk methods from commit.c</title>
<updated>2018-07-20T22:38:54Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2018-07-20T16:33:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5227c385667cadd3b34668329016755181fa98ea'/>
<id>urn:sha1:5227c385667cadd3b34668329016755181fa98ea</id>
<content type='text'>
There are several commit walks in the codebase. Group them together into
a new commit-reach.c file and corresponding header. After we group these
walks into one place, we can reduce duplicate logic by calling
equivalent methods.

The method declarations in commit.h are not touched by this commit and
will be moved in a following commit. Many consumers need to point to
commit-reach.h and that would bloat this commit.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sb/object-store-grafts'</title>
<updated>2018-07-18T19:20:28Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-07-18T19:20:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=00624d608cc69bd62801c93e74d1ea7a7ddd6598'/>
<id>urn:sha1:00624d608cc69bd62801c93e74d1ea7a7ddd6598</id>
<content type='text'>
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.

* sb/object-store-grafts:
  commit: allow lookup_commit_graft to handle arbitrary repositories
  commit: allow prepare_commit_graft to handle arbitrary repositories
  shallow: migrate shallow information into the object parser
  path.c: migrate global git_path_* to take a repository argument
  cache: convert get_graft_file to handle arbitrary repositories
  commit: convert read_graft_file to handle arbitrary repositories
  commit: convert register_commit_graft to handle arbitrary repositories
  commit: convert commit_graft_pos() to handle arbitrary repositories
  shallow: add repository argument to is_repository_shallow
  shallow: add repository argument to check_shallow_file_for_update
  shallow: add repository argument to register_shallow
  shallow: add repository argument to set_alternate_shallow_file
  commit: add repository argument to lookup_commit_graft
  commit: add repository argument to prepare_commit_graft
  commit: add repository argument to read_graft_file
  commit: add repository argument to register_commit_graft
  commit: add repository argument to commit_graft_pos
  object: move grafts to object parser
  object-store: move object access functions to object-store.h
</content>
</entry>
</feed>
