<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/object.h, branch v2.40.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.40.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.40.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2022-09-07T19:18:57Z</updated>
<entry>
<title>parse_object(): allow skipping hash check</title>
<updated>2022-09-07T19:18:57Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2022-09-06T23:01:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c868d8e91fee01999eb34c5559250ea059293b33'/>
<id>urn:sha1:c868d8e91fee01999eb34c5559250ea059293b33</id>
<content type='text'>
The parse_object() function checks the object hash of any object it
parses. This is a nice feature, as it means we may catch bit corruption
during normal use, rather than waiting for specific fsck operations.

But it also can be slow. It's particularly noticeable for blobs, where
except for the hash check, we could return without loading the object
contents at all. Now one may wonder what is the point of calling
parse_object() on a blob in the first place then, but usually it's not
intentional: we were fed an oid from somewhere, don't know the type, and
want an object struct. For commits and trees, the parsing is usually
helpful; we're about to look at the contents anyway. But this is less
true for blobs, where we may be collecting them as part of a
reachability traversal, etc, and don't actually care what's in them. And
blobs, of course, tend to be larger.

We don't want to just throw out the hash-checks for blobs, though. We do
depend on them in some circumstances (e.g., rev-list --verify-objects
uses parse_object() to check them). It's only the callers that know
how they're going to use the result. And so we can help them by
providing a special flag to skip the hash check.

We could just apply this to blobs, as they're going to be the main
source of performance improvement. But if a caller doesn't care about
checking the hash, we might as well skip it for other object types, too.
Even though we can't avoid reading the object contents, we can still
skip the actual hash computation.

If this seems like it is making Git a little bit less safe against
corruption, it may be. But it's part of a series of tradeoffs we're
already making. For instance, "rev-list --objects" does not open the
contents of blobs it prints. And when a commit graph is present, we skip
opening most commits entirely. The important thing will be to use this
flag in cases where it's safe to skip the check. For instance, when
serving a pack for a fetch, we know the client will fully index the
objects and do a connectivity check itself. There's little to be gained
from the server side re-hashing a blob itself. And indeed, most of the
time we don't! The revision machinery won't open up a blob reached by
traversal, but only one requested directly with a "want" line. So
applied properly, this new feature shouldn't make anything less safe in
practice.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>revision: allow --ancestry-path to take an argument</title>
<updated>2022-08-19T17:45:08Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2022-08-19T04:28:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=257418c59040c13bfa839e01922e21833cda6a52'/>
<id>urn:sha1:257418c59040c13bfa839e01922e21833cda6a52</id>
<content type='text'>
We have long allowed users to run e.g.
    git log --ancestry-path master..seen
which shows all commits which satisfy all three of these criteria:
  * are an ancestor of seen
  * are not an ancestor of master
  * have master as an ancestor

This commit allows another variant:
    git log --ancestry-path=$TOPIC master..seen
which shows all commits which satisfy all of these criteria:
  * are an ancestor of seen
  * are not an ancestor of master
  * have $TOPIC in their ancestry-path
that last bullet can be defined as commits meeting any of these
criteria:
    * are an ancestor of $TOPIC
    * have $TOPIC as an ancestor
    * are $TOPIC

This also allows multiple --ancestry-path arguments, which can be
used to find commits with any of the given topics in their ancestry
path.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Acked-by: Derrick Stolee &lt;derrickstolee@github.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>reflog: libify delete reflog function and helpers</title>
<updated>2022-03-02T23:24:47Z</updated>
<author>
<name>John Cai</name>
<email>johncai86@gmail.com</email>
</author>
<published>2022-03-02T22:27:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7d3d226e700904d6fbf3d9a1b351ebeb02f2cf04'/>
<id>urn:sha1:7d3d226e700904d6fbf3d9a1b351ebeb02f2cf04</id>
<content type='text'>
Currently stash shells out to reflog in order to delete refs. In an
effort to reduce how much we shell out to a subprocess, libify the
functionality that stash needs into reflog.c.

Add a reflog_delete function that is pretty much the logic in the while
loop in builtin/reflog.c cmd_reflog_delete(). This is a function that
builtin/reflog.c and builtin/stash.c can both call.

Also move functions needed by reflog_delete and export them.

Helped-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: John Cai &lt;johncai86@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>*.[ch] *_INIT macros: use { 0 } for a "zero out" idiom</title>
<updated>2021-09-27T21:47:59Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-09-27T12:54:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9865b6e6a4ca1e895fd473c827cf1822f3bd8249'/>
<id>urn:sha1:9865b6e6a4ca1e895fd473c827cf1822f3bd8249</id>
<content type='text'>
In C it isn't required to specify that all members of a struct are
zero'd out to 0, NULL or '\0', just providing a "{ 0 }" will
accomplish that.

Let's also change code that provided N zero'd fields to just
provide one, and change e.g. "{ NULL }" to "{ 0 }" for
consistency. I.e. even if the first member is a pointer let's use "0"
instead of "NULL". The point of using "0" consistently is to pick one,
and to not have the reader wonder why we're not using the same pattern
everywhere.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>builtin/pack-objects.c: remove duplicate hash lookup</title>
<updated>2021-08-30T06:25:43Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-08-30T02:48:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b0173340c6b5fb330f5ea22504389fc6c5367f14'/>
<id>urn:sha1:b0173340c6b5fb330f5ea22504389fc6c5367f14</id>
<content type='text'>
In the original code from 08cdfb1337 (pack-objects --keep-unreachable,
2007-09-16), we add each object to the packing list with type
`obj-&gt;type`, where `obj` comes from `lookup_unknown_object()`. Unless we
had already looked up and parsed the object, this will be `OBJ_NONE`.
That's fine, since oe_set_type() sets the type_valid bit to '0', and we
determine the real type later on.

So the only thing we need from the object lookup is access to the
`flags` field so that we can mark that we've added the object with
`OBJECT_ADDED` to avoid adding it again (we can just pass `OBJ_NONE`
directly instead of grabbing it from the object).

But add_object_entry() already rejects duplicates! This has been the
behavior since 7a979d99ba (Thin pack - create packfile with missing
delta base., 2006-02-19), but 08cdfb1337 didn't take advantage of it.
Moreover, to do the OBJECT_ADDED check, we have to do a hash lookup in
`obj_hash`.

So we can drop the lookup_unknown_object() call completely, *and* the
OBJECT_ADDED flag, too, since the spot we're touching here is the only
location that checks it.

In the end, we perform the same number of hash lookups, but with the
added bonus that we don't waste memory allocating an OBJ_NONE object (if
we were traversing, we'd need it eventually, but the whole point of this
code path is not to traverse).

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>object.h: add lookup_object_by_type() function</title>
<updated>2021-06-29T03:30:18Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-06-22T16:06:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7463064b28086c0a765e247bc8336f8e32356494'/>
<id>urn:sha1:7463064b28086c0a765e247bc8336f8e32356494</id>
<content type='text'>
In some cases it's useful for efficiency reasons to get the type of an
object before deciding whether to parse it, but we still want an object
struct. E.g., in reachable.c, bitmaps give us the type, but we just want
to mark flags on each object. Likewise, we may loop over every object
and only parse tags in order to peel them; checking the type first lets
us avoid parsing the non-tags.

But our lookup_blob(), etc, functions make getting an object struct
annoying: we have to call the right function for every type. And we
cannot just use the generic lookup_object(), because it only returns an
already-seen object; it won't allocate a new object struct.

Let's provide a function that dispatches to the correct lookup_*
function based on a run-time type. In fact, reachable.c already has such
a helper, so we'll just make that public.

I did change the return type from "void *" to "struct object *". While
the former is a clever way to avoid casting inside the function, it's
less safe and less informative to people reading the function
declaration.

The next commit will add a new caller.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>object.h: expand docstring for lookup_unknown_object()</title>
<updated>2021-06-29T03:30:17Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-06-22T16:05:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=542d6abbb4e838cf1c47612a2b4b33f2a31e6277'/>
<id>urn:sha1:542d6abbb4e838cf1c47612a2b4b33f2a31e6277</id>
<content type='text'>
The lookup_unknown_object() system is not often used and is somewhat
confusing. Let's try to explain it a bit more (which is especially
important as I'm adding a related but slightly different function in the
next commit).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jt/push-negotiation'</title>
<updated>2021-05-16T12:05:22Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-05-16T12:05:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=644f4a20468da89c1325a539c0521336f7835a64'/>
<id>urn:sha1:644f4a20468da89c1325a539c0521336f7835a64</id>
<content type='text'>
"git push" learns to discover common ancestor with the receiving
end over protocol v2.

* jt/push-negotiation:
  send-pack: support push negotiation
  fetch: teach independent negotiation (no packfile)
  fetch-pack: refactor command and capability write
  fetch-pack: refactor add_haves()
  fetch-pack: refactor process_acks()
</content>
</entry>
<entry>
<title>fetch: teach independent negotiation (no packfile)</title>
<updated>2021-05-05T01:41:29Z</updated>
<author>
<name>Jonathan Tan</name>
<email>jonathantanmy@google.com</email>
</author>
<published>2021-05-04T21:16:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9c1e657a8fd26fa3ed8d13fb8c796cef8db8b124'/>
<id>urn:sha1:9c1e657a8fd26fa3ed8d13fb8c796cef8db8b124</id>
<content type='text'>
Currently, the packfile negotiation step within a Git fetch cannot be
done independent of sending the packfile, even though there is at least
one application wherein this is useful. Therefore, make it possible for
this negotiation step to be done independently. A subsequent commit will
use this for one such application - push negotiation.

This feature is for protocol v2 only. (An implementation for protocol v0
would require a separate implementation in the fetch, transport, and
transport helper code.)

In the protocol, the main hindrance towards independent negotiation is
that the server can unilaterally decide to send the packfile. This is
solved by a "wait-for-done" argument: the server will then wait for the
client to say "done". In practice, the client will never say it; instead
it will cease requests once it is satisfied.

In the client, the main change lies in the transport and transport
helper code. fetch_refs_via_pack() performs everything needed - protocol
version and capability checks, and the negotiation itself.

There are 2 code paths that do not go through fetch_refs_via_pack() that
needed to be individually excluded: the bundle transport (excluded
through requiring smart_options, which the bundle transport doesn't
support) and transport helpers that do not support takeover. If or when
we support independent negotiation for protocol v0, we will need to
modify these 2 code paths to support it. But for now, report failure if
independent negotiation is requested in these cases.

Signed-off-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>lookup_unknown_object(): take a repository argument</title>
<updated>2021-04-13T20:18:46Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-04-13T07:16:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=45a187cc34b5016842b91ef14b59e40505afff46'/>
<id>urn:sha1:45a187cc34b5016842b91ef14b59e40505afff46</id>
<content type='text'>
All of the other lookup_foo() functions take a repository argument, but
lookup_unknown_object() was never converted, and it uses the_repository
internally. Let's fix that.

We could leave a wrapper that uses the_repository, but there aren't that
many calls, so we'll just convert them all. I looked briefly at each
site to see if we had a repository struct (besides the_repository) we
could pass, but none of them do (so this conversion to pass
the_repository is a pure noop in each case, though it does take us one
step closer to eventually getting rid of the_repository).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
