git/upload-pack.c, branch jch

Merge branch 'ps/upload-pack-buffer-more-writes'

2026-03-24T19:31:34Z

Reduce system overhead "git upload-pack" spends on relaying "git pack-objects" output to the "git fetch" running on the other end of the connection.

* ps/upload-pack-buffer-more-writes:
  builtin/pack-objects: reduce lock contention when writing packfile data
  csum-file: drop `hashfd_throughput()`
  csum-file: introduce `hashfd_ext()`
  sideband: use writev(3p) to send pktlines
  wrapper: introduce writev(3p) wrappers
  compat/posix: introduce writev(3p) wrapper
  upload-pack: reduce lock contention when writing packfile data
  upload-pack: prefer flushing data over sending keepalive
  upload-pack: adapt keepalives based on buffering
  upload-pack: fix debug statement when flushing packfile data

upload-pack: reduce lock contention when writing packfile data

2026-03-13T15:54:14Z

In our production systems we have recently observed write contention in git-upload-pack(1). The system in question was consistently streaming packfiles at a rate of dozens of gigabits per second, but curiously the system was neither bottlenecked on CPU, memory or IOPS.

We eventually discovered that Git was spending 80% of its time in `pipe_write()`, out of which almost all of the time was spent in the `ep_poll_callback` function in the kernel. Quoting the reporter:

    This infrastructure is part of an event notification queue designed to allow for multiple producers to emit events, but that concurrency safety is guarded by 3 layers of locking. The layer we're hitting contention in uses a simple reader/writer lock mode (a.k.a. shared versus exclusive mode), where producers need shared-mode (read mode), and various other actions use exclusive (write) mode.

The system in question generates workloads where we have hundreds of git-upload-pack(1) processes active at the same point in time. These processes end up contending around those locks, and the consequence is that the Git processes stall.

Now git-upload-pack(1) already has the infrastructure in place to buffer some of the data it reads from git-pack-objects(1) before actually sending it out. We only use this infrastructure in very limited ways though, so we generally end up matching one read(3p) call with one write(3p) call. Even worse, when the sideband is enabled we end up matching one read with _two_ writes: one for the pkt-line length, and one for the packfile data.

Extend our use of the buffering infrastructure so that we soak up bytes until the buffer is filled up at least 2/3rds of its capacity. The change is relatively simple to implement as we already know to flush the buffer in `create_pack_file()` after git-pack-objects(1) has finished.

This significantly reduces the number of write(3p) syscalls we need to do. Before this change, cloning the Linux repository resulted in around 400,000 write(3p) syscalls. With the buffering in place we only do around 130,000 syscalls.

Now we could of course go even further and make sure that we always fill up the whole buffer. But this might cause an increase in read(3p) syscalls, and some tests show that this only reduces the number of write(3p) syscalls from 130,000 to 100,000. So overall this doesn't seem worth it.

Note that the issue could also be fixed by adapting the write buffer that we use in the downstream git-pack-objects(1) command, and such a change would have roughly the same result. But the command that generates the packfile data may not always be git-pack-objects(1) as it can be changed via "uploadpack.packObjectsHook", so such a fix would only help in _some_ cases. Regardless of that, we'll also adapt the write buffer size of git-pack-objects(1) in a subsequent commit.

Helped-by: Matt Smiley
Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
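
The buffering policy described in this commit message can be illustrated with a minimal sketch. It is not the code from upload-pack.c: `struct relay_buf`, `relay_add()`, `relay_flush()` and the 8192-byte capacity are hypothetical, and the flush only counts calls instead of issuing write(3p). The idea shown is that incoming chunks are soaked up until the buffer is at least 2/3 full, and only then sent out in one go.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RELAY_BUF_SIZE 8192 /* hypothetical capacity, not Git's actual value */

/* Hypothetical relay buffer; the names do not mirror upload-pack.c. */
struct relay_buf {
	char data[RELAY_BUF_SIZE];
	size_t used;
	size_t flushes; /* stands in for the number of write(3p) calls */
};

/* Send out everything that is currently buffered. */
static void relay_flush(struct relay_buf *b)
{
	if (!b->used)
		return;
	/* A real implementation would write b->data to the client here. */
	b->flushes++;
	b->used = 0;
}

/*
 * Append one chunk as returned by a single read(3p). The chunk must fit
 * into an empty buffer. Flush only once the buffer is at least 2/3 full.
 */
static void relay_add(struct relay_buf *b, const char *chunk, size_t len)
{
	assert(len <= RELAY_BUF_SIZE);
	if (len > RELAY_BUF_SIZE - b->used)
		relay_flush(b); /* make room first */
	memcpy(b->data + b->used, chunk, len);
	b->used += len;
	if (b->used >= RELAY_BUF_SIZE / 3 * 2)
		relay_flush(b);
}
```

With ten 1000-byte chunks this sketch flushes once while reading and once more at the end, where an unbuffered relay would have issued ten writes. The final `relay_flush()` corresponds to the flush that the message says already happens after git-pack-objects(1) has finished.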

upload-pack: prefer flushing data over sending keepalive

2026-03-13T15:54:13Z

When using the sideband in git-upload-pack(1) we know to send out keepalive packets in case generating the pack takes too long. These keepalives take the form of a simple empty pktline.

In the preceding commit we have adapted git-upload-pack(1) to buffer data more aggressively before sending it to the client. This creates an obvious optimization opportunity: when we hit the keepalive timeout while we still hold on to some buffered data, then it makes more sense to flush out the data instead of sending the empty keepalive packet.

This is overall not going to be a significant win. Most keepalives will come before the pack data starts, and once pack-objects starts producing data, it tends to do so pretty consistently. And of course we can't send data before we see the PACK header, because the whole point is to buffer the early bit waiting for packfile URIs. But the optimization is easy enough to realize.

Do so and flush out data instead of sending an empty pktline.

Suggested-by: Jeff King
Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
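
The decision taken when the keepalive timeout fires can be sketched as a small pure function. This is an illustration under assumptions, not the actual change: `enum timeout_action` and `on_keepalive_timeout()` are hypothetical names, and the state they inspect is reduced to the number of buffered bytes and whether the PACK header has been seen.

```c
#include <stddef.h>

/* Hypothetical outcome of hitting the keepalive timeout. */
enum timeout_action {
	SEND_KEEPALIVE,      /* send the empty keepalive pktline */
	FLUSH_BUFFERED_DATA, /* send the real data we are holding instead */
};

/*
 * Prefer flushing buffered pack data over an empty keepalive. Data that
 * arrived before the PACK header must stay buffered, so a keepalive is
 * still sent in that case.
 */
static enum timeout_action on_keepalive_timeout(size_t buffered_bytes,
						int seen_pack_header)
{
	if (buffered_bytes && seen_pack_header)
		return FLUSH_BUFFERED_DATA;
	return SEND_KEEPALIVE;
}
```

Either outcome puts bytes on the wire, so the client sees activity in both cases; the flush simply makes those bytes useful.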

upload-pack: adapt keepalives based on buffering

2026-03-13T15:54:13Z

The function `create_pack_file()` is responsible for sending the packfile data to the client of git-upload-pack(1). As generating the bytes may take significant computing resources we also have a mechanism in place that optionally sends keepalive pktlines in case we haven't sent out any data.

The keepalive logic is purely based on poll(3p): we pass a timeout to that syscall, and if the call times out we send out the keepalive pktline. While reasonable, this logic isn't entirely sufficient: even if the call to poll(3p) ends because we have received data on any of the file descriptors we may not necessarily send data to the client.

The most important edge case here happens in `relay_pack_data()`. When we haven't seen the initial "PACK" signature from git-pack-objects(1) yet we buffer incoming data. So in the worst case, if each of the bytes of that signature arrives shortly before the configured keepalive timeout, then we may not send out any data for a time period that is (almost) four times as long as the configured timeout.

This edge case is rather unlikely to matter in practice. But in a subsequent commit we're going to adapt our buffering mechanism to become more aggressive, which makes it more likely that we don't send any data for an extended amount of time.

Adapt the logic so that instead of using a fixed timeout on every call to poll(3p), we instead figure out how much time has passed since the last-sent data.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
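
The adapted timeout computation can be sketched as follows. This is a minimal illustration assuming millisecond timestamps from a monotonic clock; `keepalive_poll_timeout_ms()` and its parameters are hypothetical and do not reflect the actual code in `create_pack_file()`. Instead of always passing the full keepalive interval to poll(3p), the caller passes only the time that remains since data was last sent.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Compute the timeout for the next poll(3p) call in milliseconds.
 * A negative result means "block indefinitely", which is how poll(3p)
 * treats a negative timeout.
 */
static int keepalive_poll_timeout_ms(uint64_t now_ms, uint64_t last_sent_ms,
				     int keepalive_secs)
{
	uint64_t interval_ms, elapsed_ms, remaining_ms;

	if (keepalive_secs <= 0)
		return -1; /* keepalives disabled */

	interval_ms = (uint64_t)keepalive_secs * 1000;
	/* Guard against a clock reading that is older than the last send. */
	elapsed_ms = now_ms > last_sent_ms ? now_ms - last_sent_ms : 0;
	if (elapsed_ms >= interval_ms)
		return 0; /* already overdue, do not block at all */

	remaining_ms = interval_ms - elapsed_ms;
	return remaining_ms > INT_MAX ? INT_MAX : (int)remaining_ms;
}
```

In the edge case from the message, with a 5 second keepalive and a first signature byte that arrives after 4.9 seconds without anything being sent, the next call would wait only 100 ms rather than another 5 seconds.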

upload-pack: fix debug statement when flushing packfile data

2026-03-13T15:54:13Z

When git-upload-pack(1) writes packfile data to the client we have some logic in place that buffers some partial lines. When that buffer still contains data after git-pack-objects(1) has finished we flush the buffer so that all remaining bytes are sent out.

Curiously, when we do so we also print the string "flushed." to stderr. This statement has been introduced in b1c71b7281 (upload-pack: avoid sending an incomplete pack upon failure, 2006-06-20), so quite a while ago. What's interesting though is that stderr is typically spliced through to the client-side, and consequently the client would see this message. Munging the way we do the caching indeed confirms this:

    $ git clone file:///home/pks/Development/linux/
    Cloning into bare repository 'linux.git'...
    remote: Enumerating objects: 12980346, done.
    remote: Counting objects: 100% (131820/131820), done.
    remote: Compressing objects: 100% (50290/50290), done.
    remote: Total 12980346 (delta 96319), reused 104500 (delta 81217), pack-reused 12848526 (from 1)
    Receiving objects: 100% (12980346/12980346), 3.23 GiB | 57.44 MiB/s, done.
    flushed.
    Resolving deltas: 100% (10676718/10676718), done.

It's quite clear that this string shouldn't ever be visible to the client, so it rather feels like this is a left-over debug statement. The mentioned commit doesn't mention this line, either.

Remove the debug output to prepare for a change in how we do the buffering in the next commit.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano

Merge branch 'ps/refs-for-each'

2026-03-09T21:36:55Z

Code refactoring around refs-for-each-* API functions.

* ps/refs-for-each:
  refs: replace `refs_for_each_fullref_in()`
  refs: replace `refs_for_each_namespaced_ref()`
  refs: replace `refs_for_each_glob_ref()`
  refs: replace `refs_for_each_glob_ref_in()`
  refs: replace `refs_for_each_rawref_in()`
  refs: replace `refs_for_each_rawref()`
  refs: replace `refs_for_each_ref_in()`
  refs: improve verification for-each-ref options
  refs: generalize `refs_for_each_fullref_in_prefixes()`
  refs: generalize `refs_for_each_namespaced_ref()`
  refs: speed up `refs_for_each_glob_ref_in()`
  refs: introduce `refs_for_each_ref_ext`
  refs: rename `each_ref_fn`
  refs: rename `do_for_each_ref_flags`
  refs: move `do_for_each_ref_flags` further up
  refs: move `refs_head_ref_namespaced()`
  refs: remove unused `refs_for_each_include_root_ref()`

refs: replace `refs_for_each_namespaced_ref()`

2026-02-23T21:21:19Z

Replace calls to `refs_for_each_namespaced_ref()` with the newly introduced `refs_for_each_ref_ext()` function.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano

refs: rename `each_ref_fn`

2026-02-23T21:21:18Z

Similar to the preceding commit, rename `each_ref_fn` to better match our current best practices around how we name things.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano

shallow: handling fetch relative-deepen

2026-02-17T19:46:15Z

When a shallowed repository gets deepened beyond the beginning of a merged branch, we may end up with some shallows that are hidden behind the reachable shallow commits. Added test 'fetching deepen beyond merged branch' exposes that behaviour.

An example showing the problem based on added test:

0. Whole initial git repo to be cloned from
   Graph:
   *   033585d (HEAD -> main) Merge branch 'branch'
   |\
   | * 984f8b1 (branch) five
   | * ecb578a four
   |/
   * 0cb5d20 three
   * 2b4e70d two
   * 61ba98b one

1. Initial shallow clone --depth=3 (all good)
   Shallows:
   2b4e70da2a10e1d3231a0ae2df396024735601f1
   ecb578a3cf37198d122ae5df7efed9abaca17144
   Graph:
   *   033585d (HEAD -> main) Merge branch 'branch'
   |\
   | * 984f8b1 five
   | * ecb578a (grafted) four
   * 0cb5d20 three
   * 2b4e70d (grafted) two

2. Deepen shallow clone with fetch --deepen=1 (NOT OK)
   Shallows:
   0cb5d204f4ef96ed241feb0f2088c9f4794ba758
   61ba98be443fd51c542eb66585a1f6d7e15fcdae
   Graph:
   *   033585d (HEAD -> main) Merge branch 'branch'
   |\
   | * 984f8b1 five
   | * ecb578a four
   |/
   * 0cb5d20 (grafted) three
---
Note that second shallow commit 61ba98be443fd51c542eb66585a1f6d7e15fcdae is not reachable.

On the other hand, it seems that equivalent absolute depth driven fetches result in all the correct shallows. That led to this proposal, which unifies absolute and relative deepening in a way that the same get_shallow_commits() call is used in both cases. The difference is only that depth is adapted for relative deepening by measuring equivalent depth of current local shallow commits in the current remote repo. Thus a new function get_shallows_depth() has been added and the function get_reachable_list() became redundant / removed.

Same example showing the corrected second step:

2. Deepen shallow clone with fetch --deepen=1 (all good)
   Shallow:
   61ba98be443fd51c542eb66585a1f6d7e15fcdae
   Graph:
   *   033585d (HEAD -> main) Merge branch 'branch'
   |\
   | * 984f8b1 five
   | * ecb578a four
   |/
   * 0cb5d20 three
   * 2b4e70d two
   * 61ba98b (grafted) one

The get_shallows_depth() function also shares the logic of the get_shallow_commits() function, but it focuses on counting depth of each existing shallow commit. The minimum result is stored as 'data->deepen_relative', which is set not to be zero for relative deepening anyway. That way we can always sum 'data->deepen_relative' and 'depth' values, because 'data->deepen_relative' is always 0 in absolute deepening.

To avoid duplicating logic between get_shallows_depth() and get_shallow_commits(), get_shallow_commits() was modified so that it is used by get_shallows_depth().

Signed-off-by: Samo Pogačnik
Signed-off-by: Junio C Hamano
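
The idea of turning a relative deepen request into an absolute depth can be traced on the example history with a small sketch. It is not Git's implementation: `compute_depths()`, `min_shallow_depth()` and `is_shallow_at()` are hypothetical helpers, the six-commit graph is hard-coded from the example above, and the convention that the tip has depth 1 is an assumption chosen so that `--depth=3` yields the shallows listed in step 1.

```c
#include <stddef.h>

#define NCOMMITS 6
#define MAX_PARENTS 2

/* The example history from the commit message, newest first. */
enum { MERGE, FIVE, FOUR, THREE, TWO, ONE };

/* Parent table; -1 marks an unused parent slot. */
static const int parents[NCOMMITS][MAX_PARENTS] = {
	[MERGE] = { THREE, FIVE },
	[FIVE]  = { FOUR, -1 },
	[FOUR]  = { THREE, -1 },
	[THREE] = { TWO, -1 },
	[TWO]   = { ONE, -1 },
	[ONE]   = { -1, -1 },
};

/* Breadth-first walk from the tip; the tip gets depth 1 (assumption). */
static void compute_depths(int tip, int depth[NCOMMITS])
{
	int queue[NCOMMITS], head = 0, tail = 0;

	for (int i = 0; i < NCOMMITS; i++)
		depth[i] = 0; /* 0 means "not reached yet" */
	depth[tip] = 1;
	queue[tail++] = tip;
	while (head < tail) {
		int c = queue[head++];
		for (int p = 0; p < MAX_PARENTS; p++) {
			int parent = parents[c][p];
			if (parent < 0 || depth[parent])
				continue;
			depth[parent] = depth[c] + 1;
			queue[tail++] = parent;
		}
	}
}

/* Smallest depth among the client's current shallow commits, 0 if none. */
static int min_shallow_depth(const int depth[NCOMMITS],
			     const int *shallows, size_t n)
{
	int min = 0;

	for (size_t i = 0; i < n; i++) {
		int d = depth[shallows[i]];
		if (d && (!min || d < min))
			min = d;
	}
	return min;
}

/* In this model a commit is a shallow boundary if it sits at the limit. */
static int is_shallow_at(const int depth[NCOMMITS], int commit, int limit)
{
	return depth[commit] == limit;
}
```

Starting from the shallows of step 1 (`TWO` and `FOUR`, both at depth 3), the minimum depth is 3. Adding the relative deepen of 1 gives an absolute depth of 4, and the only commit at that depth is `ONE`, which matches the single shallow in the corrected step 2.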

upload-pack: convert to use `reference_get_peeled_oid()`

2025-11-04T15:32:25Z

The `write_v0_ref()` callback is invoked from two callsites:

  - Once via `send_ref()` which is a callback passed to `for_each_namespaced_ref_1()` and `refs_head_ref_namespaced()`.

  - Once manually to announce capabilities.

When sending references to the client we also send the peeled value of tags. As we don't have a `struct reference` available in the second case, we cannot easily peel by calling `reference_get_peeled_oid()`, but we instead have to depend on global state via `peel_iterated_oid()`.

We do have a reference available though in the first case, it's only the second case that keeps us from using `reference_get_peeled_oid()`. But that second case only announces capabilities anyway, so we're not really handling a reference at all here.

Adapt that case to construct a reference manually and pass that to `write_v0_ref()`. Start to use `reference_get_peeled_oid()` now that we always have a `struct reference` available.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano