<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/object.h, branch v2.30.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.30.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.30.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2020-10-31T17:46:34Z</updated>
<entry>
<title>object: allow clear_commit_marks_all to handle any repo</title>
<updated>2020-10-31T17:46:34Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2020-10-31T12:46:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=cd8888452cd0e09c9efdc22da729023ae255b54b'/>
<id>urn:sha1:cd8888452cd0e09c9efdc22da729023ae255b54b</id>
<content type='text'>
Allow callers to specify the repository to use.  Rename the function to
repo_clear_commit_marks to document its new scope.  No functional change
intended.

Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>maintenance: add auto condition for commit-graph task</title>
<updated>2020-09-17T18:30:05Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2020-09-17T18:11:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4ddc79b2dac18189e7ff18e99a6fabdd161f866c'/>
<id>urn:sha1:4ddc79b2dac18189e7ff18e99a6fabdd161f866c</id>
<content type='text'>
Instead of writing a new commit-graph in every 'git maintenance run
--auto' process (when maintenance.commit-graph.enalbed is configured to
be true), only write when there are "enough" commits not in a
commit-graph file.

This count is controlled by the maintenance.commit-graph.auto config
option.

To compute the count, use a depth-first search starting at each ref, and
leaving markers using the SEEN flag. If this count reaches the limit,
then terminate early and start the task. Otherwise, this operation will
peel every ref and parse the commit it points to. If these are all in
the commit-graph, then this is typically a very fast operation. Users
with many refs might feel a slow-down, and hence could consider updating
their limit to be very small. A negative value will force the step to
run every time.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'tb/fix-persistent-shallow' into master</title>
<updated>2020-07-09T21:00:44Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-07-09T21:00:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=24ecfdf206ee0e9e01f86d333d90d281fdfd12d0'/>
<id>urn:sha1:24ecfdf206ee0e9e01f86d333d90d281fdfd12d0</id>
<content type='text'>
When "fetch.writeCommitGraph" configuration is set in a shallow
repository and a fetch moves the shallow boundary, we wrote out
broken commit-graph files that do not match the reality, which has
been corrected.

* tb/fix-persistent-shallow:
  commit.c: don't persist substituted parents when unshallowing
</content>
</entry>
<entry>
<title>commit.c: don't persist substituted parents when unshallowing</title>
<updated>2020-07-08T23:13:46Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2020-07-08T21:10:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ce16364e897e70a17bd1864b6007719eeec959f3'/>
<id>urn:sha1:ce16364e897e70a17bd1864b6007719eeec959f3</id>
<content type='text'>
Since 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
2020-04-22), Git knows how to reset stat-validity checks for the
$GIT_DIR/shallow file, allowing it to change between a shallow and
non-shallow state in the same process (e.g., in the case of 'git fetch
--unshallow').

However, when $GIT_DIR/shallow changes, Git does not alter or remove any
grafts (nor substituted parents) in memory.

This comes up in a "git fetch --unshallow" with fetch.writeCommitGraph
set to true. Ordinarily in a shallow repository (and before 37b9dcabfc,
even in this case), commit_graph_compatible() would return false,
indicating that the repository should not be used to write a
commit-graphs (since commit-graph files cannot represent a shallow
history). But since 37b9dcabfc, in an --unshallow operation that check
succeeds.

Thus even though the repository isn't shallow any longer (that is, we
have all of the objects), the in-core representation of those objects
still has munged parents at the shallow boundaries.  When the
commit-graph write proceeds, we use the incorrect parentage, producing
wrong results.

There are two ways for a user to work around this: either (1) set
'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph after
unshallowing.

One way to fix this would be to reset the parsed object pool entirely
(flushing the cache and thus preventing subsequent reads from modifying
their parents) after unshallowing. That would produce a problem when
callers have a now-stale reference to the old pool, and so this patch
implements a different approach. Instead, attach a new bit to the pool,
'substituted_parent', which indicates if the repository *ever* stored a
commit which had its parents modified (i.e., the shallow boundary
prior to unshallowing).

This bit needs to be sticky because all reads subsequent to modifying a
commit's parents are unreliable when unshallowing. Modify the check in
'commit_graph_compatible' to take this bit into account, and correctly
avoid generating commit-graphs in this case, thus solving the bug.

Helped-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Helped-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Reported-by: Jay Conrod &lt;jayconrod@google.com&gt;
Reviewed-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'rs/pack-bits-in-object-better'</title>
<updated>2020-07-07T05:09:17Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-07-07T05:09:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=480e78595ea531482018826d35fed0fb0529afa2'/>
<id>urn:sha1:480e78595ea531482018826d35fed0fb0529afa2</id>
<content type='text'>
By renumbering object flag bits, "struct object" managed to lose
bloated inter-field padding.

* rs/pack-bits-in-object-better:
  revision: reallocate TOPO_WALK object flags
</content>
</entry>
<entry>
<title>Merge branch 'bc/http-push-flagsfix'</title>
<updated>2020-07-07T05:09:17Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-07-07T05:09:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=67d99b82de27e98bfec7d174ef4b6369d5b85d7a'/>
<id>urn:sha1:67d99b82de27e98bfec7d174ef4b6369d5b85d7a</id>
<content type='text'>
The code to push changes over "dumb" HTTP had a bad interaction
with the commit reachability code due to incorrect allocation of
object flag bits, which has been corrected.

* bc/http-push-flagsfix:
  http-push: ensure unforced pushes fail when data would be lost
</content>
</entry>
<entry>
<title>revision: reallocate TOPO_WALK object flags</title>
<updated>2020-06-24T16:09:44Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2020-06-24T13:05:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=23c4319f0d1bd68a15368e97a4612768333ff486'/>
<id>urn:sha1:23c4319f0d1bd68a15368e97a4612768333ff486</id>
<content type='text'>
The bit fields in struct object have an unfortunate layout.  Here's what
pahole reports on x86_64 GNU/Linux:

struct object {
	unsigned int               parsed:1;             /*     0: 0  4 */
	unsigned int               type:3;               /*     0: 1  4 */

	/* XXX 28 bits hole, try to pack */

	/* Force alignment to the next boundary: */
	unsigned int               :0;

	unsigned int               flags:29;             /*     4: 0  4 */

	/* XXX 3 bits hole, try to pack */

	struct object_id           oid;                  /*     8    32 */

	/* size: 40, cachelines: 1, members: 4 */
	/* sum members: 32 */
	/* sum bitfield members: 33 bits, bit holes: 2, sum bit holes: 31 bits */
	/* last cacheline: 40 bytes */
};

Notice the 1+3+29=33 bits in bit fields and 28+3=31 bits in holes.

There are holes inside the flags bit field as well -- while some object
flags are used for more than one purpose, 22, 23 and 24 are still free.
Use  23 and 24 instead of 27 and 28 for TOPO_WALK_EXPLORED and
TOPO_WALK_INDEGREE.  This allows us to reduce FLAG_BITS by one so that
all bitfields combined fit into a single 32-bit slot:

struct object {
	unsigned int               parsed:1;             /*     0: 0  4 */
	unsigned int               type:3;               /*     0: 1  4 */
	unsigned int               flags:28;             /*     0: 4  4 */
	struct object_id           oid;                  /*     4    32 */

	/* size: 36, cachelines: 1, members: 4 */
	/* last cacheline: 36 bytes */
};

With this tight packing the size of struct object is reduced by 10%.
Other architectures probably benefit as well.

Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>http-push: ensure unforced pushes fail when data would be lost</title>
<updated>2020-06-23T22:40:59Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2020-06-23T21:52:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=64472d15e90f14d920eade6bb1d1c4a01fca2280'/>
<id>urn:sha1:64472d15e90f14d920eade6bb1d1c4a01fca2280</id>
<content type='text'>
When we push using the DAV-based protocol, the client is the one that
performs the ref updates and therefore makes the checks to see whether
an unforced push should be allowed.  We make this check by determining
if either (a) we lack the object file for the old value of the ref or
(b) the new value of the ref is not newer than the old value, and in
either case, reject the push.

However, the ref_newer function, which performs this latter check, has
an odd behavior due to the reuse of certain object flags.  Specifically,
it will incorrectly return false in its first invocation and then
correctly return true on a subsequent invocation.  This occurs because
the object flags used by http-push.c are the same as those used by
commit-reach.c, which implements ref_newer, and one piece of code
misinterprets the flags set by the other.

Note that this does not occur in all cases.  For example, if the example
used in the tests is changed to use one repository instead of two and
rewind the head to add a commit, the test passes and we correctly reject
the push.  However, the example provided does trigger this behavior, and
the code has been broken in this way since at least Git 2.0.0.

To solve this problem, let's move the two sets of object flags so that
they don't overlap, since we're clearly using them at the same time.
The new set should not conflict with other usage because other users are
either builtin code (which is not compiled into git http-push) or
upload-pack (which we similarly do not use here).

Reported-by: Michael Ward &lt;mward@smartsoftwareinc.com&gt;
Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>object: drop parsed_object_pool-&gt;commit_count</title>
<updated>2020-06-17T21:37:14Z</updated>
<author>
<name>Abhishek Kumar</name>
<email>abhishekkumar8222@gmail.com</email>
</author>
<published>2020-06-17T09:14:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6da43d937ca96d277556fa92c5a664fb1cbcc8ac'/>
<id>urn:sha1:6da43d937ca96d277556fa92c5a664fb1cbcc8ac</id>
<content type='text'>
14ba97f8 (alloc: allow arbitrary repositories for alloc functions,
2018-05-15) introduced parsed_object_pool-&gt;commit_count to keep count of
commits per repository and was used to assign commit-&gt;index.

However, commit-slab code requires commit-&gt;index values to be unique
and a global count would be correct, rather than a per-repo count.

Let's introduce a static counter variable, `parsed_commits_count` to
keep track of parsed commits so far.

As commit_count has no use anymore, let's also drop it from the struct.

Signed-off-by: Abhishek Kumar &lt;abhishekkumar8222@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>revision: --show-pulls adds helpful merges</title>
<updated>2020-04-10T16:58:55Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2020-04-10T12:19:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8d049e182e2e213a012e2d6839becfe0e2de79db'/>
<id>urn:sha1:8d049e182e2e213a012e2d6839becfe0e2de79db</id>
<content type='text'>
The default file history simplification of "git log -- &lt;path&gt;" or
"git rev-list -- &lt;path&gt;" focuses on providing the smallest set of
commits that first contributed a change. The revision walk greatly
restricts the set of walked commits by visiting only the first
TREESAME parent of a merge commit, when one exists. This means
that portions of the commit-graph are not walked, which can be a
performance benefit, but can also "hide" commits that added changes
but were ignored by a merge resolution.

The --full-history option modifies this by walking all commits and
reporting a merge commit as "interesting" if it has _any_ parent
that is not TREESAME. This tends to be an over-representation of
important commits, especially in an environment where most merge
commits are created by pull request completion.

Suppose we have a commit A and we create a commit B on top that
changes our file. When we merge the pull request, we create a merge
commit M. If no one else changed the file in the first-parent
history between M and A, then M will not be TREESAME to its first
parent, but will be TREESAME to B. Thus, the simplified history
will be "B". However, M will appear in the --full-history mode.

However, suppose that a number of topics T1, T2, ..., Tn were
created based on commits C1, C2, ..., Cn between A and M as
follows:

  A----C1----C2--- ... ---Cn----M------P1---P2--- ... ---Pn
   \     \     \            \  /      /    /            /
    \     \__.. \            \/ ..__T1    /           Tn
     \           \__..       /\     ..__T2           /
      \_____________________B  \____________________/

If the commits T1, T2, ... Tn did not change the file, then all of
P1 through Pn will be TREESAME to their first parent, but not
TREESAME to their second. This means that all of those merge commits
appear in the --full-history view, with edges that immediately
collapse into the lower history without introducing interesting
single-parent commits.

The --simplify-merges option was introduced to remove these extra
merge commits. By noticing that the rewritten parents are reachable
from their first parents, those edges can be simplified away. Finally,
the commits now look like single-parent commits that are TREESAME to
their "only" parent. Thus, they are removed and this issue does not
cause issues anymore. However, this also ends up removing the commit
M from the history view! Even worse, the --simplify-merges option
requires walking the entire history before returning a single result.

Many Git users are using Git alongside a Git service that provides
code storage alongside a code review tool commonly called "Pull
Requests" or "Merge Requests" against a target branch.  When these
requests are accepted and merged, they typically create a merge
commit whose first parent is the previous branch tip and the second
parent is the tip of the topic branch used for the request. This
presents a valuable order to the parents, but also makes that merge
commit slightly special. Users may want to see not only which
commits changed a file, but which pull requests merged those commits
into their branch. In the previous example, this would mean the
users want to see the merge commit "M" in addition to the single-
parent commit "C".

Users are even more likely to want these merge commits when they
use pull requests to merge into a feature branch before merging that
feature branch into their trunk.

In some sense, users are asking for the "first" merge commit to
bring in the change to their branch. As long as the parent order is
consistent, this can be handled with the following rule:

  Include a merge commit if it is not TREESAME to its first
  parent, but is TREESAME to a later parent.

These merges look like the merge commits that would result from
running "git pull &lt;topic&gt;" on a main branch. Thus, the option to
show these commits is called "--show-pulls". This has the added
benefit of showing the commits created by closing a pull request or
merge request on any of the Git hosting and code review platforms.

To test these options, extend the standard test example to include
a merge commit that is not TREESAME to its first parent. It is
surprising that that option was not already in the example, as it
is instructive.

In particular, this extension demonstrates a common issue with file
history simplification. When a user resolves a merge conflict using
"-Xours" or otherwise ignoring one side of the conflict, they create
a TREESAME edge that probably should not be TREESAME. This leads
users to become frustrated and complain that "my change disappeared!"
In my experience, showing them history with --full-history and
--simplify-merges quickly reveals the problematic merge. As mentioned,
this option is expensive to compute. The --show-pulls option
_might_ show the merge commit (usually titled "resolving conflicts")
more quickly. Of course, this depends on the user having the correct
parent order, which is backwards when using "git pull master" from a
topic branch.

There are some special considerations when combining the --show-pulls
option with --simplify-merges. This requires adding a new PULL_MERGE
object flag to store the information from the initial TREESAME
comparisons. This helps avoid dropping those commits in later filters.
This is covered by a test, including how the parents can be simplified.
Since "struct object" has already ruined its 32-bit alignment by using
33 bits across parsed, type, and flags member, let's not make it worse.
PULL_MERGE is used in revision.c with the same value (1u&lt;&lt;15) as
REACHABLE in commit-graph.c. The REACHABLE flag is only used when
writing a commit-graph file, and a revision walk using --show-pulls
does not happen in the same process. Care must be taken in the future
to ensure this remains the case.

Update Documentation/rev-list-options.txt with significant details
around this option. This requires updating the example in the
History Simplification section to demonstrate some of the problems
with TREESAME second parents.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
