<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/builtin/fetch.c, branch v2.34.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.34.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.34.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2021-09-20T22:20:45Z</updated>
<entry>
<title>Merge branch 'js/run-command-close-packs'</title>
<updated>2021-09-20T22:20:45Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-09-20T22:20:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c042ad5ad58d4e43aadc806850e76fef73848d2c'/>
<id>urn:sha1:c042ad5ad58d4e43aadc806850e76fef73848d2c</id>
<content type='text'>
The run-command API has been updated so that the callers can easily
ask the file descriptors open for packfiles to be closed immediately
before spawning commands that may trigger auto-gc.

* js/run-command-close-packs:
  Close object store closer to spawning child processes
  run_auto_maintenance(): implicitly close the object store
  run-command: offer to close the object store before running
  run-command: prettify the `RUN_COMMAND_*` flags
  pull: release packs before fetching
  commit-graph: when closing the graph, also release the slab
</content>
</entry>
<entry>
<title>Merge branch 'ps/fetch-optim'</title>
<updated>2021-09-20T22:20:39Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-09-20T22:20:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=deec8aa2d05aba0a2ba495f9d41ab4e2805c8b9e'/>
<id>urn:sha1:deec8aa2d05aba0a2ba495f9d41ab4e2805c8b9e</id>
<content type='text'>
Optimize code that handles large number of refs in the "git fetch"
code path.

* ps/fetch-optim:
  fetch: avoid second connectivity check if we already have all objects
  fetch: merge fetching and consuming refs
  fetch: refactor fetch refs to be more extendable
  fetch-pack: optimize loading of refs via commit graph
  connected: refactor iterator to return next object ID directly
  fetch: avoid unpacking headers in object existence check
  fetch: speed up lookup of want refs via commit-graph
</content>
</entry>
<entry>
<title>Merge branch 'ab/retire-advice-config'</title>
<updated>2021-09-10T18:46:29Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-09-10T18:46:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=fd0d7036e0da248f354e91fb8be8771fb20adfac'/>
<id>urn:sha1:fd0d7036e0da248f354e91fb8be8771fb20adfac</id>
<content type='text'>
Code clean up to migrate callers from older advice_config[] based
API to newer advice_if_enabled() and advice_enabled() API.

* ab/retire-advice-config:
  advice: move advice.graftFileDeprecated squashing to commit.[ch]
  advice: remove use of global advice_add_embedded_repo
  advice: remove read uses of most global `advice_` variables
  advice: add enum variants for missing advice variables
</content>
</entry>
<entry>
<title>Merge branch 'ps/fetch-omit-formatting-under-quiet'</title>
<updated>2021-09-10T18:46:20Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-09-10T18:46:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=87d4aed743a4f81ea72e8b7ee3b77d208fd37149'/>
<id>urn:sha1:87d4aed743a4f81ea72e8b7ee3b77d208fd37149</id>
<content type='text'>
"git fetch --quiet" optimization to avoid useless computation of
info that will never be displayed.

* ps/fetch-omit-formatting-under-quiet:
  fetch: skip formatting updated refs with `--quiet`
</content>
</entry>
<entry>
<title>run_auto_maintenance(): implicitly close the object store</title>
<updated>2021-09-09T19:56:11Z</updated>
<author>
<name>Johannes Schindelin</name>
<email>johannes.schindelin@gmx.de</email>
</author>
<published>2021-09-09T09:47:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5a22a334cb757753230f1d73da36130513016830'/>
<id>urn:sha1:5a22a334cb757753230f1d73da36130513016830</id>
<content type='text'>
Before spawning the auto maintenance, we need to make sure that we
release all open file handles to all the `.pack` files (and MIDX files
and commit-graph files and...) so that the maintenance process has the
freedom to delete those files.

So far, we did this manually every time before calling
`run_auto_maintenance()`. With the new `close_object_store` flag, we can
do that implicitly in that function, which is more robust because future
callers won't be able to forget to close the object store.

Note: this changes behavior slightly, as we previously _always_ closed
the object store, but now we only close the object store when actually
running the auto maintenance. In practice, this should not matter (if
anything, it might speed up operations where auto maintenance is
disabled).

Suggested-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fetch: avoid second connectivity check if we already have all objects</title>
<updated>2021-09-01T19:43:56Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2021-09-01T13:10:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=caff8b73402d4b5edb2c6c755506c5a90351b69a'/>
<id>urn:sha1:caff8b73402d4b5edb2c6c755506c5a90351b69a</id>
<content type='text'>
When fetching refs, we are doing two connectivity checks:

    - The first one is done such that we can skip fetching refs in the
      case where we already have all objects referenced by the updated
      set of refs.

    - The second one verifies that we have all objects after we have
      fetched objects.

We always execute both connectivity checks, but this is wasteful in case
the first connectivity check already notices that we have all objects
locally available.

Skip the second connectivity check in case we already had all objects
available. This gives us a nice speedup when doing a mirror-fetch in a
repository with about 2.3M refs where the fetching repo already has all
objects:

    Benchmark #1: HEAD~: git-fetch
      Time (mean ± σ):     30.025 s ±  0.081 s    [User: 27.070 s, System: 4.933 s]
      Range (min … max):   29.900 s … 30.111 s    5 runs

    Benchmark #2: HEAD: git-fetch
      Time (mean ± σ):     25.574 s ±  0.177 s    [User: 22.855 s, System: 4.683 s]
      Range (min … max):   25.399 s … 25.765 s    5 runs

    Summary
      'HEAD: git-fetch' ran
        1.17 ± 0.01 times faster than 'HEAD~: git-fetch'

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fetch: merge fetching and consuming refs</title>
<updated>2021-09-01T19:43:56Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2021-09-01T13:10:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1c7d1ab6f4a79e44406f304ec01b0a143dae9abb'/>
<id>urn:sha1:1c7d1ab6f4a79e44406f304ec01b0a143dae9abb</id>
<content type='text'>
The functions `fetch_refs()` and `consume_refs()` must always be called
together such that we first obtain all missing objects and then update
our local refs to match the remote refs. In a subsequent patch, we'll
further require that `fetch_refs()` must always be called before
`consume_refs()` such that it can correctly assert that we have all
objects after the fetch given that we're about to move the connectivity
check.

Make this requirement explicit by merging both functions into a single
`fetch_and_consume_refs()` function.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fetch: refactor fetch refs to be more extendable</title>
<updated>2021-09-01T19:43:56Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2021-09-01T13:09:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=284b2ce8fcb100e7194b9cca6d9b99bca7da39b6'/>
<id>urn:sha1:284b2ce8fcb100e7194b9cca6d9b99bca7da39b6</id>
<content type='text'>
Refactor `fetch_refs()` code to make it more extendable by explicitly
handling error cases. The refactored code should behave the same.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>connected: refactor iterator to return next object ID directly</title>
<updated>2021-09-01T19:43:56Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2021-09-01T13:09:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9fec7b213045135655354e864d15894175428d5a'/>
<id>urn:sha1:9fec7b213045135655354e864d15894175428d5a</id>
<content type='text'>
The object ID iterator used by the connectivity checks returns the next
object ID via an out-parameter and then uses a return code to indicate
whether an item was found. This is a bit roundabout: instead of a
separate error code, we can just return the next object ID directly and
use `NULL` pointers as indicator that the iterator got no items left.
Furthermore, this avoids a copy of the object ID.

Refactor the iterator and all its implementations to return object IDs
directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs:

    Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch
      Time (mean ± σ):     30.110 s ±  0.148 s    [User: 27.161 s, System: 5.075 s]
      Range (min … max):   29.934 s … 30.406 s    10 runs

    Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch
      Time (mean ± σ):     29.899 s ±  0.109 s    [User: 26.916 s, System: 5.104 s]
      Range (min … max):   29.696 s … 29.996 s    10 runs

    Summary
      '328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran
        1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch'

While this 1% speedup could be labelled as statistically insignificant,
the speedup is consistent on my machine. Furthermore, this is an end to
end test, so it is expected that the improvement in the connectivity
check itself is more significant.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fetch: avoid unpacking headers in object existence check</title>
<updated>2021-09-01T19:43:56Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2021-09-01T13:09:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=47c61004c7cfbb8662b13fac813b45e3fd214665'/>
<id>urn:sha1:47c61004c7cfbb8662b13fac813b45e3fd214665</id>
<content type='text'>
When updating local refs after the fetch has transferred all objects, we
do an object existence test as a safety guard to avoid updating a ref to
an object which we don't have. We do so via `oid_object_info()`: if it
returns an error, then we know the object does not exist.

One side effect of `oid_object_info()` is that it parses the object's
type, and to do so it must unpack the object header. This is completely
pointless: we don't care for the type, but only want to assert that the
object exists.

Refactor the code to use `repo_has_object_file()`, which both makes the
code's intent clearer and is also faster because it does not unpack
object headers. In a real-world repo with 2.3M refs, this results in a
small speedup when doing a mirror-fetch:

    Benchmark #1: HEAD~: git-fetch
      Time (mean ± σ):     33.686 s ±  0.176 s    [User: 30.119 s, System: 5.262 s]
      Range (min … max):   33.512 s … 33.944 s    5 runs

    Benchmark #2: HEAD: git-fetch
      Time (mean ± σ):     31.247 s ±  0.195 s    [User: 28.135 s, System: 5.066 s]
      Range (min … max):   30.948 s … 31.472 s    5 runs

    Summary
      'HEAD: git-fetch' ran
        1.08 ± 0.01 times faster than 'HEAD~: git-fetch'

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
