<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/upload-pack.c, branch v2.36.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.36.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.36.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2022-03-01T18:13:45Z</updated>
<entry>
<title>upload-pack: look up "want" lines via commit-graph</title>
<updated>2022-03-01T18:13:45Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2022-03-01T09:33:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4de656263aa195080495fc0a103351b9eaac8160'/>
<id>urn:sha1:4de656263aa195080495fc0a103351b9eaac8160</id>
<content type='text'>
During packfile negotiation the client will send "want" and "want-ref"
lines to the server to tell it which objects it is interested in. The
server side parses each of those and looks them up to see whether it
actually has the requested objects. This lookup is performed by calling
`parse_object()` directly, which thus hits the object database. In the
general case though most of the objects the client requests will be
commits. We can thus try to look up the object via the commit-graph
opportunistically, which is much faster than doing the same via the
object database.

Refactor parsing of both "want" and "want-ref" lines to do so.

The following benchmark is executed in a repository with a huge number
of references. It uses a cached request from git-fetch(1) as input to
git-upload-pack(1) that contains about 876,000 "want" lines:

    Benchmark 1: HEAD~
      Time (mean ± σ):      7.113 s ±  0.028 s    [User: 6.900 s, System: 0.662 s]
      Range (min … max):    7.072 s …  7.168 s    10 runs

    Benchmark 2: HEAD
      Time (mean ± σ):      6.622 s ±  0.061 s    [User: 6.452 s, System: 0.650 s]
      Range (min … max):    6.535 s …  6.727 s    10 runs

    Summary
      'HEAD' ran
        1.07 ± 0.01 times faster than 'HEAD~'

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>upload-pack.c: increase output buffer size</title>
<updated>2021-12-15T19:51:18Z</updated>
<author>
<name>Jacob Vosmaer</name>
<email>jacob@gitlab.com</email>
</author>
<published>2021-12-14T19:46:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=55a9651d26a6b88c68445e7d6c9f511d1207cbd8'/>
<id>urn:sha1:55a9651d26a6b88c68445e7d6c9f511d1207cbd8</id>
<content type='text'>
When serving a fetch, git upload-pack copies data from a git
pack-objects stdout pipe to its stdout. This commit increases the size
of the buffer used for that copying from 8192 to 65515, the maximum
sideband-64k packet size.

Previously, this buffer was allocated on the stack. Because the new
buffer size is nearly 64KB, we switch this to a heap allocation.

On GitLab.com we use GitLab's pack-objects cache which does writes of
65515 bytes. Because of the default 8KB buffer size, propagating these
cache writes requires 8 pipe reads and 8 pipe writes from
git-upload-pack, and 8 pipe reads from Gitaly (our Git RPC service).
If we increase the size of the buffer to the maximum Git packet size,
we need only 1 pipe read and 1 pipe write in git-upload-pack, and 1
pipe read in Gitaly to transfer the same amount of data. In benchmarks
with a pure fetch and 100% cache hit rate workload we are seeing CPU
utilization reductions of over 30%.
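
As a rough check of the arithmetic above, the following Python sketch
counts how many fixed-size reads it takes to drain one 65515-byte
write. The helper name reads_needed is hypothetical and not part of Git
or Gitaly, and the sketch assumes every read fills the buffer
completely, which real pipe reads do not guarantee.

```python
import math

def reads_needed(write_size, buffer_size):
    """Reads of at most buffer_size bytes needed to drain write_size bytes.

    Assumes every read returns a full buffer (an idealisation).
    """
    return math.ceil(write_size / buffer_size)

old = reads_needed(65515, 8192)    # previous 8192-byte stack buffer
new = reads_needed(65515, 65515)   # buffer sized to one sideband-64k packet
print(old, new)                    # prints "8 1"
```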

Signed-off-by: Jacob Vosmaer &lt;jacob@gitlab.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>run-command API users: use strvec_pushl(), not argv construction</title>
<updated>2021-11-26T06:15:07Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-11-25T22:52:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2b7098936c9e91d527aa53b8d4af0b25d7e912b4'/>
<id>urn:sha1:2b7098936c9e91d527aa53b8d4af0b25d7e912b4</id>
<content type='text'>
Change a pattern of hardcoding an "argv" array size, populating it and
assigning to the "argv" member of "struct child_process" to instead
use "strvec_pushl()" to add data to the "args" member.

This implements the same behavior as before in fewer lines of code,
and moves us further towards being able to remove the "argv" member in
a subsequent commit.

Since we've entirely removed the "argv" variable(s) we can be sure
that no potential logic errors of the type discussed in a preceding
commit are being introduced here, i.e. ones where the local "argv" was
being modified after the assignment to "struct child_process"'s
"argv".

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ab/serve-cleanup'</title>
<updated>2021-09-20T22:20:43Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-09-20T22:20:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5331af2352be0d6dd57a5fc4c5dfce278cf911c7'/>
<id>urn:sha1:5331af2352be0d6dd57a5fc4c5dfce278cf911c7</id>
<content type='text'>
Code clean-up around "git serve".

* ab/serve-cleanup:
  upload-pack: document and rename --advertise-refs
  serve.[ch]: remove "serve_options", split up --advertise-refs code
  {upload,receive}-pack tests: add --advertise-refs tests
  serve.c: move version line to advertise_capabilities()
  serve: move transfer.advertiseSID check into session_id_advertise()
  serve.[ch]: don't pass "struct strvec *keys" to commands
  serve: use designated initializers
  transport: use designated initializers
  transport: rename "fetch" in transport_vtable to "fetch_refs"
  serve: mark has_capability() as static
</content>
</entry>
<entry>
<title>Merge branch 'jv/pkt-line-batch'</title>
<updated>2021-09-20T22:20:41Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-09-20T22:20:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c2509c5407a70a187664982c7f484e7daacafc4f'/>
<id>urn:sha1:c2509c5407a70a187664982c7f484e7daacafc4f</id>
<content type='text'>
Reduce number of write(2) system calls while sending the
ref advertisement.

* jv/pkt-line-batch:
  upload-pack: use stdio in send_ref callbacks
  pkt-line: add stdio packet write functions
</content>
</entry>
<entry>
<title>upload-pack: use stdio in send_ref callbacks</title>
<updated>2021-09-01T17:20:39Z</updated>
<author>
<name>Jacob Vosmaer</name>
<email>jacob@gitlab.com</email>
</author>
<published>2021-09-01T12:54:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=70afef5cdf29b5159f18df1b93722055f78740f8'/>
<id>urn:sha1:70afef5cdf29b5159f18df1b93722055f78740f8</id>
<content type='text'>
In both protocol v0 and v2, upload-pack writes one pktline packet per
advertised ref to stdout. That means one or two write(2) syscalls per
ref. This is problematic if these writes become network sends with
high overhead.

This commit changes both send_ref callbacks to use buffered IO using
stdio.

To give an example of the impact: I set up a single-threaded loop that
calls ls-remote (with HTTP and protocol v2) on a local GitLab
instance, on a repository with 11K refs. When I switch from Git
v2.32.0 to this patch, I see a 40% reduction in CPU time for Git, and
65% for Gitaly (GitLab's Git RPC service).

So using buffered IO not only saves syscalls in upload-pack, it also
saves time in things that consume upload-pack's output.
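
The effect can be illustrated with a small Python sketch. It is not the
code of this patch: CountingRaw is a hypothetical sink that counts
write calls, the ref names are made up, and io.BufferedWriter stands in
for C stdio. Only the pkt-line framing (four hex digits giving the
total length, followed by the payload) follows the Git protocol.

```python
import io

def pkt_line(payload):
    """Frame payload as a pkt-line: 4 hex digits of total length, then payload."""
    return b"%04x" % (len(payload) + 4) + payload

class CountingRaw(io.RawIOBase):
    """Hypothetical raw sink that records how often write() is called."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.data += bytes(b)
        return len(b)

# Made-up advertisement of 1000 refs.
refs = [b"%040x refs/heads/branch-%d\n" % (i, i) for i in range(1000)]

# One write per packet, as before the change.
unbuffered = CountingRaw()
for ref in refs:
    unbuffered.write(pkt_line(ref))

# Packets collected in a user-space buffer and flushed in large chunks.
raw = CountingRaw()
buffered = io.BufferedWriter(raw, buffer_size=65536)
for ref in refs:
    buffered.write(pkt_line(ref))
buffered.flush()

print(unbuffered.calls, raw.calls)
```

Both sinks receive identical bytes; only the number of write calls
differs.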

Helped-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Jacob Vosmaer &lt;jacob@gitlab.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>upload-pack.c: treat want-ref relative to namespace</title>
<updated>2021-09-01T14:54:18Z</updated>
<author>
<name>Kim Altintop</name>
<email>kim@eagain.st</email>
</author>
<published>2021-08-13T06:23:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=39551406539e6ea87f89f619f7f0800e887e9b57'/>
<id>urn:sha1:39551406539e6ea87f89f619f7f0800e887e9b57</id>
<content type='text'>
When 'upload-pack' runs within the context of a git namespace, treat any
'want-ref' lines the client sends as relative to that namespace.

Also check if the wanted ref is hidden via 'hideRefs'. If it is hidden,
respond with an error as if the ref didn't exist.

Helped-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Kim Altintop &lt;kim@eagain.st&gt;
Reviewed-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>serve.[ch]: remove "serve_options", split up --advertise-refs code</title>
<updated>2021-08-05T15:59:37Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-08-05T01:25:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f234da80197b3115cd3c280e98d3393a884a9327'/>
<id>urn:sha1:f234da80197b3115cd3c280e98d3393a884a9327</id>
<content type='text'>
The "advertise capabilities" mode of serve.c added in
ed10cb952d3 (serve: introduce git-serve, 2018-03-15) is only used by
http-backend.c to call {upload,receive}-pack with the
--advertise-refs parameter. See 42526b478e3 (Add stateless RPC options
to upload-pack, receive-pack, 2009-10-30).

Let's just make cmd_upload_pack() take the two (v2) or three (v1)
parameters that the v2/v1 servicing functions need directly, and pass
those in via the function signature. The logic of whether daemon mode
is implied by the timeout belongs in the v1 function (only used
there).

Once we split up the "advertise v2 refs" from "serve v2 request" it
becomes clear that v2 never cared about those in combination. The only
time it mattered was for v1 to emit its ref advertisement; in that
case we wanted to emit the smart-http-only "no-done" capability.

Since we only do that in the --advertise-refs codepath, let's just have
it set "no_done" itself in v1's upload_pack() just before send_ref().
At that point --advertise-refs and --stateless-rpc in combination are
redundant (the only user is get_info_refs() in http-backend.c), so we
can just pass in --advertise-refs only.

Since we need to touch all the serve() and advertise_capabilities()
codepaths, let's rename them to less clever and more obvious names.
This has been suggested numerous times, most recently in [1], which
proposes protocol_v2_serve_loop(). Let's go with that.

1. https://lore.kernel.org/git/CAFQ2z_NyGb8rju5CKzmo6KhZXD0Dp21u-BbyCb2aNxLEoSPRJw@mail.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>serve.[ch]: don't pass "struct strvec *keys" to commands</title>
<updated>2021-08-05T15:59:37Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-08-05T01:25:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=28a592e4f4870bdd444675b7240920d0879a9c1b'/>
<id>urn:sha1:28a592e4f4870bdd444675b7240920d0879a9c1b</id>
<content type='text'>
The serve.c API added in ed10cb952d3 (serve: introduce git-serve,
2018-03-15) was passing in the raw capabilities "keys", but nothing
downstream of it ever used them.

Let's remove that code because it's not needed. If we do end up
needing to pass information about the advertisement in the future
it'll make more sense to have serve.c parse the capabilities keys and
pass the result of its parsing, rather than expecting its
API users to parse the same keys again.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jt/push-negotiation'</title>
<updated>2021-05-16T12:05:22Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-05-16T12:05:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=644f4a20468da89c1325a539c0521336f7835a64'/>
<id>urn:sha1:644f4a20468da89c1325a539c0521336f7835a64</id>
<content type='text'>
"git push" learns to discover a common ancestor with the receiving
end over protocol v2.

* jt/push-negotiation:
  send-pack: support push negotiation
  fetch: teach independent negotiation (no packfile)
  fetch-pack: refactor command and capability write
  fetch-pack: refactor add_haves()
  fetch-pack: refactor process_acks()
</content>
</entry>
</feed>
