<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/builtin/fetch.c, branch v2.32.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.32.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.32.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2021-05-16T12:05:22Z</updated>
<entry>
<title>Merge branch 'jt/push-negotiation'</title>
<updated>2021-05-16T12:05:22Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-05-16T12:05:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=644f4a20468da89c1325a539c0521336f7835a64'/>
<id>urn:sha1:644f4a20468da89c1325a539c0521336f7835a64</id>
<content type='text'>
"git push" learns to discover common ancestor with the receiving
end over protocol v2.

* jt/push-negotiation:
  send-pack: support push negotiation
  fetch: teach independent negotiation (no packfile)
  fetch-pack: refactor command and capability write
  fetch-pack: refactor add_haves()
  fetch-pack: refactor process_acks()
</content>
</entry>
<entry>
<title>fetch: teach independent negotiation (no packfile)</title>
<updated>2021-05-05T01:41:29Z</updated>
<author>
<name>Jonathan Tan</name>
<email>jonathantanmy@google.com</email>
</author>
<published>2021-05-04T21:16:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9c1e657a8fd26fa3ed8d13fb8c796cef8db8b124'/>
<id>urn:sha1:9c1e657a8fd26fa3ed8d13fb8c796cef8db8b124</id>
<content type='text'>
Currently, the packfile negotiation step within a Git fetch cannot be
done independent of sending the packfile, even though there is at least
one application wherein this is useful. Therefore, make it possible for
this negotiation step to be done independently. A subsequent commit will
use this for one such application - push negotiation.

This feature is for protocol v2 only. (An implementation for protocol v0
would require a separate implementation in the fetch, transport, and
transport helper code.)

In the protocol, the main hindrance towards independent negotiation is
that the server can unilaterally decide to send the packfile. This is
solved by a "wait-for-done" argument: the server will then wait for the
client to say "done". In practice, the client will never say it; instead
it will cease requests once it is satisfied.

In the client, the main change lies in the transport and transport
helper code. fetch_refs_via_pack() performs everything needed - protocol
version and capability checks, and the negotiation itself.

There are 2 code paths that do not go through fetch_refs_via_pack() that
needed to be individually excluded: the bundle transport (excluded
through requiring smart_options, which the bundle transport doesn't
support) and transport helpers that do not support takeover. If or when
we support independent negotiation for protocol v0, we will need to
modify these 2 code paths to support it. But for now, report failure if
independent negotiation is requested in these cases.

Signed-off-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fetch: add --prefetch option</title>
<updated>2021-04-16T20:36:55Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2021-04-16T12:49:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2e03115d0c253843953ef9d113c72e0375892df4'/>
<id>urn:sha1:2e03115d0c253843953ef9d113c72e0375892df4</id>
<content type='text'>
The --prefetch option will be used by the 'prefetch' maintenance task
instead of sending refspecs explicitly across the command-line. The
intention is to modify the refspec to place all results in
refs/prefetch/ instead of anywhere else.

Create helper method filter_prefetch_refspec() to modify a given refspec
to fit the rules expected of the prefetch task:

 * Negative refspecs are preserved.
 * Refspecs without a destination are removed.
 * Refspecs whose source starts with "refs/tags/" are removed.
 * Other refspecs are placed within "refs/prefetch/".

Finally, we add the 'force' option to ensure that prefetch refs are
replaced as necessary.
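
The rules above can be sketched in Python as follows. This is an
illustration only, not the C implementation: the function name
`filter_prefetch_refspecs` and the string-based refspec handling are
assumptions, as is the exact rewrite of the destination (here a leading
"refs/" is replaced by "refs/prefetch/").

```python
def filter_prefetch_refspecs(refspecs):
    """Apply the four prefetch rules to a list of refspec strings."""
    result = []
    for spec in refspecs:
        # Negative refspecs (leading "^") are preserved unchanged.
        if spec.startswith("^"):
            result.append(spec)
            continue
        # Drop an existing force marker; it is added back below.
        body = spec[1:] if spec.startswith("+") else spec
        src, sep, dst = body.partition(":")
        # Refspecs without a destination are removed.
        if not sep or not dst:
            continue
        # Refspecs whose source starts with "refs/tags/" are removed.
        if src.startswith("refs/tags/"):
            continue
        # Everything else is placed within "refs/prefetch/" and forced.
        # Assumption: a leading "refs/" in the destination is replaced.
        if dst.startswith("refs/"):
            dst = dst[len("refs/"):]
        result.append("+" + src + ":" + "refs/prefetch/" + dst)
    return result
```

Under these assumptions, "+refs/heads/*:refs/remotes/origin/*" would
become "+refs/heads/*:refs/prefetch/remotes/origin/*".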

There are some interesting cases that are worth testing.

An earlier version of this change dropped the "i--" from the loop that
deletes a refspec item and shifts the remaining entries down. This
allowed some refspecs to not be modified. The subtle part about the
first --prefetch test is that the "refs/tags/*" refspec appears directly
before the "refs/heads/bogus/*" refspec. Without that "i--", this
ordering would remove the "refs/tags/*" refspec and leave the last one
unmodified, placing the result in "refs/heads/*".
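
The effect of the missing "i--" can be reproduced with a small
index-based loop. The sketch below is Python rather than the original C,
and `remove_tag_refspecs` and its `decrement` flag are hypothetical
names used only for illustration.

```python
def remove_tag_refspecs(refspecs, decrement):
    """Delete entries starting with "refs/tags/" using an index-based loop.

    With decrement=False the loop mimics the earlier, buggy version: after
    a deletion the remaining entries shift down, so the entry that moved
    into slot i is never examined.
    """
    items = list(refspecs)
    visited = []
    i = 0
    while i in range(len(items)):
        visited.append(items[i])
        if items[i].startswith("refs/tags/"):
            del items[i]      # remaining entries shift down by one
            if decrement:
                i -= 1        # revisit slot i, which now holds the next entry
        i += 1
    return items, visited
```

Calling it with a list in which "refs/tags/*" comes directly before
"refs/heads/bogus/*" shows that the second entry is only visited when
`decrement` is true.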

It is possible to have an empty refspec. This is typically the case for
remotes other than the origin, where users want to fetch a specific tag
or branch. To correctly test this case, we need to further remove the
upstream remote for the local branch. Thus, we are testing a refspec
that will be deleted, leaving nothing to fetch.

Helped-by: Tom Saeger &lt;tom.saeger@oracle.com&gt;
Helped-by: Ramsay Jones &lt;ramsay@ramsayjones.plus.com&gt;
Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jt/clone-unborn-head'</title>
<updated>2021-02-18T01:21:40Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-02-18T01:21:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=69571dfe219f48614e0e0ae7e28efae0be297764'/>
<id>urn:sha1:69571dfe219f48614e0e0ae7e28efae0be297764</id>
<content type='text'>
"git clone" tries to locally check out the branch pointed at by
HEAD of the remote repository after it is done, but the protocol
did not convey the information necessary to do so when copying an
empty repository.  The protocol v2 learned how to do so.

* jt/clone-unborn-head:
  clone: respect remote unborn HEAD
  connect, transport: encapsulate arg in struct
  ls-refs: report unborn targets of symrefs
</content>
</entry>
<entry>
<title>connect, transport: encapsulate arg in struct</title>
<updated>2021-02-05T21:49:54Z</updated>
<author>
<name>Jonathan Tan</name>
<email>jonathantanmy@google.com</email>
</author>
<published>2021-02-05T20:48:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=39835409d10de2402c4b3e10dba20286989627d4'/>
<id>urn:sha1:39835409d10de2402c4b3e10dba20286989627d4</id>
<content type='text'>
In a future patch we plan to return the name of an unborn current branch
from deep in the callchain to a caller via a new pointer parameter that
points at a variable in the caller when the caller calls
get_remote_refs() and transport_get_remote_refs().

In preparation for that, encapsulate the existing ref_prefixes
parameter into a struct. The aforementioned unborn current branch will
go into this new struct in the future patch.

Signed-off-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fetch: implement support for atomic reference updates</title>
<updated>2021-01-12T20:06:15Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2021-01-12T12:27:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c7b190dabdea16d5b1c29ae50430449a8547818f'/>
<id>urn:sha1:c7b190dabdea16d5b1c29ae50430449a8547818f</id>
<content type='text'>
When executing a fetch, git currently allocates one reference
transaction per reference update and commits it directly. This means that
fetches are non-atomic: even if some of the reference updates fail,
others may still succeed and modify local references.

This is fine in many scenarios, but this strategy has its downsides.

- The view of remote references may be inconsistent and may show a
  bastardized state of the remote repository.

- Batching together updates may improve performance in certain
  scenarios. While the impact probably isn't as pronounced with loose
  references, the upcoming reftable backend may benefit as it needs to
  write fewer files in case the update is batched.

- The reference-transaction hook is currently being executed twice per
  updated reference. While this doesn't matter when there is no such
  hook, we have seen severe performance regressions when doing a
  git-fetch(1) with a reference-transaction hook when the remote
  repository has hundreds of thousands of references.

Similar to `git push --atomic`, this commit thus introduces atomic
fetches. Instead of allocating one reference transaction per updated
reference, it causes us to only allocate a single transaction and commit
it as soon as all updates were received. If locking of any reference
fails, then we abort the complete transaction and don't update any
reference, which gives us an all-or-nothing fetch.
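
The difference between the two strategies can be modelled with a toy
in-memory transaction. The sketch below is Python, not git's ref API;
`RefTransaction`, `fetch_refs`, and the `locked` set that stands in for
failing locks are all assumptions made for illustration.

```python
class RefTransaction:
    """Queue ref updates and apply them all at once, or not at all."""

    def __init__(self, refs, locked):
        self.refs = refs          # dict: ref name to object id
        self.locked = locked      # ref names whose lock cannot be taken
        self.queued = []

    def queue(self, name, new_oid):
        self.queued.append((name, new_oid))

    def commit(self):
        # A single lock failure aborts the whole transaction.
        if any(name in self.locked for name, _ in self.queued):
            self.queued.clear()
            return False
        for name, new_oid in self.queued:
            self.refs[name] = new_oid
        self.queued.clear()
        return True


def fetch_refs(refs, updates, locked, atomic):
    """Apply updates with one transaction per ref, or one for all refs."""
    if atomic:
        tx = RefTransaction(refs, locked)
        for name, oid in updates:
            tx.queue(name, oid)
        return tx.commit()
    ok = True
    for name, oid in updates:
        tx = RefTransaction(refs, locked)
        tx.queue(name, oid)
        ok = tx.commit() and ok
    return ok
```

With one locked reference among the updates, the non-atomic path still
writes the other references, while the atomic path writes none.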

Note that this may not completely fix the first of the above downsides, as
the consistent view also depends on the server-side. If the server
doesn't have a consistent view of its own references during the
reference negotiation phase, then the client would get the same
inconsistent view the server has. This is a separate problem though and,
if it actually exists, can be fixed at a later point.

This commit also changes the way we write FETCH_HEAD in case `--atomic`
is passed. Instead of writing changes as we go, we need to accumulate
all changes first and only commit them at the end when we know that all
reference updates succeeded. Ideally, we'd just do so via a temporary
file so that we don't need to carry all updates in-memory. This isn't
trivially doable though considering the `--append` mode, where we do not
truncate the file but simply append to it. And given that we support
concurrent processes appending to FETCH_HEAD at the same time without
any loss of data, seeding the temporary file with current contents of
FETCH_HEAD initially and then doing a rename wouldn't work either. So
this commit implements the simple strategy of buffering all changes and
appending them to the file on commit.
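
A minimal sketch of that buffering strategy, in Python rather than C.
The class `FetchHeadBuffer` and its method names are hypothetical; only
the idea (collect lines in memory, then append them to the file in one
step once every reference update has succeeded) comes from the
description above.

```python
import os
import tempfile


class FetchHeadBuffer:
    """Collect FETCH_HEAD lines in memory and append them only on commit."""

    def __init__(self, path):
        self.path = path
        self.lines = []

    def append(self, line):
        self.lines.append(line)

    def commit(self):
        # Open in append mode so existing contents are kept, not truncated.
        with open(self.path, "a") as fh:
            fh.write("".join(line + "\n" for line in self.lines))
        self.lines.clear()

    def abort(self):
        # Nothing has touched the file yet, so dropping the buffer is enough.
        self.lines.clear()


def demo():
    """Show that abort leaves the file untouched and commit appends to it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "FETCH_HEAD")
        with open(path, "w") as fh:
            fh.write("existing entry\n")
        failed = FetchHeadBuffer(path)
        failed.append("never written")
        failed.abort()
        ok = FetchHeadBuffer(path)
        ok.append("new entry")
        ok.commit()
        with open(path) as fh:
            return fh.read()
```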

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fetch: allow passing a transaction to `s_update_ref()`</title>
<updated>2021-01-12T20:06:15Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2021-01-12T12:27:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d4c8db8f1b43cb26efdd53d74afce60e6d9d195a'/>
<id>urn:sha1:d4c8db8f1b43cb26efdd53d74afce60e6d9d195a</id>
<content type='text'>
Ref updates are handled entirely by `s_update_ref()`,
which manages the complete lifecycle of the reference transaction.
This is fine right now given that git-fetch(1) does not support atomic
fetches, so each reference gets its own transaction. It is quite
inflexible though, as `s_update_ref()` only knows about a single
reference update at a time, so it doesn't allow us to alter the
strategy.

This commit prepares `s_update_ref()` and its only caller
`update_local_ref()` to allow passing an external transaction. If none
is given, then the existing behaviour is triggered which creates a new
transaction and directly commits it. Otherwise, if the caller provides a
transaction, then we only queue the update but don't commit it. This
optionally allows the caller to manage when a transaction will be
committed.
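
The calling convention described above can be sketched as follows. This
is Python, not the C code; `update_ref`, `Transaction`, and the dict
used as a ref store are hypothetical stand-ins.

```python
class Transaction:
    """Minimal stand-in for a reference transaction."""

    def __init__(self, refs):
        self.refs = refs
        self.queued = []

    def queue(self, name, oid):
        self.queued.append((name, oid))

    def commit(self):
        for name, oid in self.queued:
            self.refs[name] = oid
        self.queued.clear()


def update_ref(refs, name, oid, transaction=None):
    """Queue one update; commit only if no external transaction was given."""
    our_transaction = transaction is None
    if our_transaction:
        transaction = Transaction(refs)
    transaction.queue(name, oid)
    if our_transaction:
        # Existing behaviour: the transaction lives and dies in this call.
        transaction.commit()
    # Otherwise the caller decides when to commit.
```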

Given that `update_local_ref()` is always called with a `NULL`
transaction for now, no change in behaviour is expected from this
change.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fetch: refactor `s_update_ref` to use common exit path</title>
<updated>2021-01-12T20:06:15Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2021-01-12T12:27:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c45889f10453864130ccbfa45815e2f4e384ab5e'/>
<id>urn:sha1:c45889f10453864130ccbfa45815e2f4e384ab5e</id>
<content type='text'>
The cleanup code in `s_update_ref()` is currently duplicated for both
successful and erroneous exit paths. This commit refactors the function
to have a shared exit path for both cases to remove the duplication.

Suggested-by: Christian Couder &lt;christian.couder@gmail.com&gt;
Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fetch: use strbuf to format FETCH_HEAD updates</title>
<updated>2021-01-12T20:06:14Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2021-01-12T12:27:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=929d044575ed4f9657074f098e3a9be410d511ef'/>
<id>urn:sha1:929d044575ed4f9657074f098e3a9be410d511ef</id>
<content type='text'>
This commit refactors `append_fetch_head()` to use a `struct strbuf` for
formatting the update which we're about to append to the FETCH_HEAD
file. While the refactoring doesn't have much of a benefit right now, it
serves as a preparatory step to implement atomic fetches where we need
to buffer all updates to FETCH_HEAD and only flush them out if all
reference updates succeeded.

No change in behaviour is expected from this commit.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fetch: extract writing to FETCH_HEAD</title>
<updated>2021-01-12T20:06:14Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2021-01-12T12:27:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=58a646a3688ea3c15adaf481e440e9ca2d3bce06'/>
<id>urn:sha1:58a646a3688ea3c15adaf481e440e9ca2d3bce06</id>
<content type='text'>
When performing a fetch with the default `--write-fetch-head` option, we
write all updated references to FETCH_HEAD while the updates are
performed. Given that updates are not performed atomically, it means
that we write to FETCH_HEAD even if some or all of the reference
updates fail.

Given that we simply update FETCH_HEAD ad-hoc with each reference, the
logic is completely contained in `store_updated_refs` and thus quite hard
to extend. This can already be seen by the way we skip writing to the
FETCH_HEAD: instead of having a conditional which simply skips writing,
we instead open "/dev/null" and needlessly write all updates there.

We are about to extend git-fetch(1) to accept an `--atomic` flag which
will make the fetch an all-or-nothing operation with regard to the
reference updates. This will also require us to make the updates to
FETCH_HEAD an all-or-nothing operation, but as explained doing so is not
easy with the current layout. This commit thus refactors the way we write
to FETCH_HEAD and pulls out the logic to open, append to, commit and
close the file. While this may seem rather over-the-top at first,
pulling out this logic will make it a lot easier to update the code in a
subsequent commit. It also allows us to easily skip writing completely
in case `--no-write-fetch-head` was passed.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
