aboutsummaryrefslogtreecommitdiffstats
path: root/builtin/commit-tree.c (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2021-09-15The sixth batchJunio C Hamano1-0/+30
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15t1400: avoid SIGPIPE race condition on fifoJeff King1-3/+5
t1400.190 sometimes fails or even hangs because of the way it uses fifos. Our goal is to interactively read and write lines from update-ref, so we have two fifos, in and out. We open a descriptor connected to "in" and redirect output to that, so that update-ref does not see EOF as it would if we opened and closed it for each "echo" call. But we don't do the same for the output. This leads to a race where our "read response <out" has not yet opened the fifo, but update-ref tries to write to it and gets SIGPIPE. This can result in the test failing, or worse, hanging as we wait forever for somebody to write to the pipe. This is the same proble we fixed in 4783e7ea83 (t0008: avoid SIGPIPE race condition on fifo, 2013-07-12), and we can fix it the same way, by opening a second long-running descriptor. Before this patch, running: ./t1400-update-ref.sh --run=1,190 --stress failed or hung within a few dozen iterations. After it, I ran it for several hundred without problems. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12gc: remove unused launchctl_get_uid() callÆvar Arnfjörð Bjarmason1-7/+1
When the launchctl_boot_plist() function was added in a16eb6b1ff3 (maintenance: skip bootout/bootstrap when plist is registered, 2021-08-24), an unused call to launchctl_get_uid() was added along with it. That call appears to have been copy/pasted from launchctl_boot_plist(). Since we can remove that, we can also get rid of the "result" variable, whose only purpose was allow for the free() between its assignment and the return. That pattern also appears to have been copy/pasted from launchctl_boot_plist(). As the patch shows the returned value from launchctl_get_uid() wasn't used at all in this function. The launchctl_get_uid() function itself just calls xstrfmt() and getuid(), neither of which have any subtle global side-effects, so this removal is safe. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12test-tool run-command: fix flip-flop init patternÆvar Arnfjörð Bjarmason1-4/+1
In be5d88e1128 (test-tool run-command: learn to run (parts of) the testsuite, 2019-10-04) an init pattern was added that would use TESTSUITE_INIT, but then promptly memset() everything back to 0. We'd then set the "dup" on the two string lists. Our setting of "next" to "-1" thus did nothing, we'd reset it to "0" before using it. Let's set it to "0" instead, and trust the "STRING_LIST_INIT_DUP" to set "strdup_strings" appropriately for us. Note that while we compile this code, there's no in-tree user for the "testsuite" target being modified here anymore, see the discussion at and around <nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet>[1]. 1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12tests: remove leftover untracked filesElijah Newren12-4/+18
Remove untracked files that are unwanted after they are done being used. While the set of cases in this patch is certainly far from comprehensive, it was motivated by some work to see what the fallout would be if we were to make the removal of untracked files as a side effect of other commands into an error. Some cases were a bit more involved than the testcase changes included in this patch, but the ones included here represent the simple cases. While this patch is not that important since we are not changing the behavior of those other commands into an error in the near term, I thought these changes were useful anyway as an explicit documentation of the intent that these untracked files are no longer useful. Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Acked-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12strvec: use size_t to store nr and allocJeff King1-2/+2
We converted argv_array (which later became strvec) to use size_t in 819f0e76b1 (argv-array: use size_t for count and alloc, 2020-07-28) in order to avoid the possibility of integer overflow. But later, commit d70a9eb611 (strvec: rename struct fields, 2020-07-28) accidentally converted these back to ints! Those two commits were part of the same patch series. I'm pretty sure what happened is that they were originally written in the opposite order and then cleaned up and re-ordered during an interactive rebase. And when resolving the inevitable conflict, I mistakenly took the "rename" patch completely, accidentally dropping the type change. We can correct it now; better late than never. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12compression: drop write-only core_compression_* variablesRené Scharfe3-5/+0
Since 8de7eeb54b (compression: unify pack.compression configuration parsing, 2016-11-15) the variables core_compression_level and core_compression_seen are only set, but never read. Remove them. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12packfile: use oidset for bad objectsRené Scharfe3-31/+11
Store the object ID of broken pack entries in an oidset instead of keeping only their hashes in an unsorted array. The resulting code is shorter and easier to read. It also handles the (hopefully) very rare case of having a high number of bad objects better. Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12packfile: convert has_packed_and_bad() to object_idRené Scharfe3-4/+4
The single caller has a full object ID, so pass it on instead of just its hash member. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12packfile: convert mark_bad_packed_object() to object_idRené Scharfe3-8/+8
All callers have full object IDs, so pass them on instead of just their hash member. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12midx: inline nth_midxed_pack_entry()René Scharfe1-20/+9
fill_midx_entry() finds the position of an object ID and passes it to nth_midxed_pack_entry(), which uses the position to look up the object ID for its own purposes. Inline the latter into the former to avoid that lookup. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12oidset: make oidset_size() an inline functionRené Scharfe2-6/+4
oidset_size() just reads a single word from memory and returns it. Avoid the function call overhead for this trivial operation by turning it into an inline function. While we're at it, declare its parameter const to allow it to be used on read-only oidsets. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10am: fix incorrect exit status on am fail to abortElijah Newren2-2/+3
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10t4151: add a few am --abort testsElijah Newren1-0/+39
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10git-am.txt: clarify --abort behaviorElijah Newren1-0/+2
Both Johannes and I assumed (perhaps due to familiarity with rebase) that am --abort would return the user to a clean state. However, since am, unlike rebase, is intended to be used within a dirty working tree, --abort will only clean the files involved in the am operation. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10docs/protocol-v2: point readers transport config discussionJeff King1-1/+7
We recently added tips for server admins to configure various transports to support v2's GIT_PROTOCOL variable. While the protocol-v2 document is pretty technical and not of interest to most admins, it may be a starting point for them to figure out how to turn on v2. Let's put some pointers from there to the other documentation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10docs/git: discuss server-side config for GIT_PROTOCOLJeff King2-0/+23
The v2 protocol requires that the GIT_PROTOCOL environment variable gets passed around, but we don't have any documentation describing how this is supposed to work. In particular, we need to note what server admins might need to configure to make things work. The definition of the GIT_PROTOCOL variable is probably the best place for this, since: - we deal with multiple transports (ssh, http, etc). Transport-specific documentation (like the git-http-backend bits added in the previous commit) are helpful for those transports, but this gives a broader overview. Plus we do not have a specific transport endpoint program for ssh, so this is a reasonable place to mention it. - the server side of the protocol involves multiple programs. For now, upload-pack is the only endpoint which uses GIT_PROTOCOL, but that will likely expand in the future. We're better off with a central discussion of what the server admin might need to do. However, for discoverability, this patch adds a pointer from upload-pack's documentation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10docs/http-backend: mention v2 protocolJeff King1-1/+25
Historically there was a little bit of configuration needed at the webserver level in order to get the client's v2 protocol probes to Git. But when we introduced the v2 protocol, we never documented these. As of the previous commit, this should mostly work out of the box without any explicit configuration. But it's worth documenting this to make it clear how we expect it to work, especially in the face of webservers which don't provide all headers over the CGI interface. Or anybody who runs across this documentation but has an older version of Git (or _used_ to have an older version, and wonders why they still have a SetEnvIf line in their Apache config and whether it's still necessary). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10http-backend: handle HTTP_GIT_PROTOCOL CGI variableJeff King2-2/+4
When a client requests the v2 protocol over HTTP, they set the Git-Protocol header. Webservers will generally make that available to our CGI as HTTP_GIT_PROTOCOL in the environment. However, that's not sufficient for upload-pack, etc, to respect it; they look in GIT_PROTOCOL (without the HTTP_ prefix). Either the webserver or the CGI is responsible for relaying that HTTP header into the GIT_PROTOCOL variable. Traditionally, our tests have configured the webserver to do so, but that's a burden on the server admin. We can make this work out of the box by having the http-backend CGI copy the contents of HTTP_GIT_PROTOCOL to GIT_PROTOCOL. There are no new tests here. By removing the SetEnvIf line from our test Apache config, we're now relying on this behavior of http-backend to trigger the v2 protocol there (and there are numerous tests that fail if this doesn't work). There is one subtlety here: we copy HTTP_GIT_PROTOCOL only if there is no existing GIT_PROTOCOL variable. That leaves the webserver admin free to override the client's decision if they choose. This is unlikely to be useful in practice, but is more flexible. And indeed, it allows the v2-to-v0 fallback test added in the previous commit to continue working. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10t5551: test v2-to-v0 http protocol fallbackJeff King2-0/+14
Since we use the v2 protocol by default, the connection of a v2 client to a v2 server is well covered by the test suite. And with the GIT_TEST_PROTOCOL_VERSION knob, we can easily test a v0 client connecting to a v2-aware server (which will then just speak v0). But we have no regular tests that a v2 client, when encountering a non-v2-aware server, will correctly fall back to using v0. In theory this is a job for the cross-version tests in t/interop, but: - they cover only git:// and file:// clones - they are not part of the usual test suite, so nobody ever runs them anyway Since using v2 over http requires configuring the web server to pass along the Git-Protocol header, we can easily create a situation where the server does not respect the v2 probe, and the conversation falls back to v0. This works just fine. This new test is not about fixing any particular bug, but just making sure that the system works (and continues to work) as expected. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10The fifth batchJunio C Hamano1-0/+61
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09pack-objects: rename .idx files into place after .bitmap filesÆvar Arnfjörð Bjarmason1-1/+2
In preceding commits the race of renaming .idx files in place before .rev files and other auxiliary files was fixed in pack-write.c's finish_tmp_packfile(), builtin/repack.c's "struct exts", and builtin/index-pack.c's final(). As noted in the change to pack-write.c we left in place the issue of writing *.bitmap files after the *.idx, let's fix that issue. See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) for commentary at the time when *.bitmap was implemented about how those files are written out, nothing in that commit contradicts what's being done here. Note that this commit and preceding ones only close any race condition with *.idx files being written before their auxiliary files if we're optimistic about our lack of fsync()-ing in this are not tripping us over. See the thread at [1] for a rabbit hole of various discussions about filesystem races in the face of doing and not doing fsync() (and if doing fsync(), not doing it properly). We may want to fsync the containing directory once after renaming the *.idx file into place, but that is outside the scope of this series. 1. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09pack-write: split up finish_tmp_packfile() functionÆvar Arnfjörð Bjarmason4-13/+39
Split up the finish_tmp_packfile() function and use the split-up version in pack-objects.c in preparation for moving the step of renaming the *.idx file later as part of a function change. Since the only other caller of finish_tmp_packfile() was in bulk-checkin.c, and it won't be needing a change to its *.idx renaming, provide a thin wrapper for the old function as a static function in that file. If other callers end up needing the simpler version it could be moved back to "pack-write.c" and "pack.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09builtin/index-pack.c: move `.idx` files into place lastTaylor Blau1-2/+2
In a similar spirit as preceding patches to `git repack` and `git pack-objects`, fix the identical problem in `git index-pack`. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09index-pack: refactor renaming in final()Ævar Arnfjörð Bjarmason1-25/+23
Refactor the renaming in final() into a helper function, this is similar in spirit to a preceding refactoring of finish_tmp_packfile() in pack-write.c. Before e37d0b8730b (builtin/index-pack.c: write reverse indexes, 2021-01-25) it probably wasn't worth it to have this sort of helper, due to the differing "else if" case for "pack" files v.s. "idx" files. But since we've got "rev" as well now, let's do the renaming via a helper, this is both a net decrease in lines, and improves the readability, since we can easily see at a glance that the logic for writing these three types of files is exactly the same, aside from the obviously differing cases of "*final_name" being NULL, and "make_read_only_if_same" being different. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09builtin/repack.c: move `.idx` files into place lastTaylor Blau1-1/+1
In a similar spirit as the previous patch, fix the identical problem from `git repack` (which invokes `pack-objects` with a temporary location for output, and then moves the files into their final locations itself). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09pack-write.c: rename `.idx` files after `*.rev`Taylor Blau1-1/+1
We treat the presence of an `.idx` file as the indicator of whether or not it's safe to use a packfile. But `finish_tmp_packfile()` (which is responsible for writing the pack and moving the temporary versions of all of its auxiliary files into place) is inconsistent about the write order. Specifically, it moves the `.rev` file into place after the `.idx`, leaving open the possibility to open a pack which looks "ready" (because the `.idx` file exists and is readable) but appears momentarily to not have a `.rev` file. This causes Git to fall back to generating the pack's reverse index in memory. Though racy, this amounts to an unnecessary slow-down at worst, and doesn't affect the correctness of the resulting reverse index. Close this race by moving the .rev file into place before moving the .idx file into place. This still leaves the issue of `.idx` files being renamed into place before the auxiliary `.bitmap` file is renamed when in pack-object.c's write_pack_file() "write_bitmap_index" is true. That race will be addressed in subsequent commits. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09pack-write: refactor renaming in finish_tmp_packfile()Ævar Arnfjörð Bjarmason3-24/+23
Refactor the renaming in finish_tmp_packfile() into a helper function. The callers are now expected to pass a "name_buffer" ending in "pack-OID." instead of the previous "pack-", we then append "pack", "idx" or "rev" to it. By doing the strbuf_setlen() in rename_tmp_packfile() we reuse the buffer and avoid the repeated allocations we'd get if that function had its own temporary "struct strbuf". This approach of reusing the buffer does make the last user in pack-object.c's write_pack_file() slightly awkward, since we needlessly do a strbuf_setlen() before calling strbuf_release() for consistency. In subsequent changes we'll move that bitmap writing code around, so let's not skip the strbuf_setlen() now. The previous strbuf_reset() idiom originated with 5889271114a (finish_tmp_packfile():use strbuf for pathname construction, 2014-03-03), which in turn was a minimal adjustment of pre-strbuf code added in 0e990530ae (finish_tmp_packfile(): a helper function, 2011-10-28). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09bulk-checkin.c: store checksum directlyTaylor Blau1-6/+6
finish_bulk_checkin() stores the checksum from finalize_hashfile() by writing to the `hash` member of `struct object_id`, but that hash has nothing to do with an object id (it's just a convenient location that happens to be sized correctly). Store the hash directly in an unsigned char array. This behaves the same as writing to the `hash` member, but makes the intent clearer (and avoids allocating an extra four bytes for the `algo` member of `struct object_id`). It likewise prevents the possibility of a segfault when reading `algo` (e.g., by calling `oid_to_hex()`) if it is uninitialized. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09t5562: use alarm() to interrupt timed child-waitJeff King1-8/+8
The t5562 script occasionally takes 60 extra seconds to complete due to a race condition in the invoke-with-content-length.pl helper. The way it's supposed to work is this: - we set up a SIGCLD handler - we kick off http-backend and write to it with a set content-length, but _don't_ close the pipe - we sleep for 60 seconds, assuming that SIGCLD from http-backend finishing will interrupt us - after the sleep finishes (whetherby 60 seconds or because it was interrupted by the signal), we check a flag to see if our SIGCLD handler was called. If not, then we complain. This usually completes immediately, because the signal interrupts our sleep. But very occasionally the child process dies _before_ we hit the sleep, so we don't realize it. The test still completes successfully (because our $exited flag is set), but it takes an extra 60 seconds. There's no way to check the flag and sleep atomically. So the best we can do with this approach is to sleep in smaller chunks (say, 1 second) and check the flag incrementally. Then we waste a maximum of 1 second if we lose the race. This was proposed in: https://lore.kernel.org/git/20190218205028.32486-1-max@max630.net/ and it does work. But we can do better. Instead of blocking on sleep and waiting for the child signal to interrupt us, we can block on the child exiting and set an alarm signal to trigger the timeout. This lets us exit the script immediately when the child behaves (with no race possible), and wait a maximum of 60 seconds when it doesn't. Note one small subtlety: perl is very willing to restart the waitpid() call after the alarm is delivered, even if we've thrown an exception via die. "perldoc -f alarm" claims you can get around this with an eval/die combo (and even has some example code), but it doesn't seem to work for me with waitpid(); instead, we continue waiting until the child exits. So instead, we'll instruct the child process to exit in the alarm handler itself. In the original code this was done by calling close($out). That would continue to work, since our child is always http-backend, which should exit when its stdin closes. But we can be even more robust against a hung or confused child by sending a KILL signal, which should terminate it immediately. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09refs/files-backend: remove unused open mode parameterRené Scharfe1-1/+1
We only need to provide a mode if we are willing to let open(2) create the file, which is not the case here, so drop the unnecessary parameter. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09setup: use xopen and xdup in sanitize_stdfdsRené Scharfe1-5/+3
Replace the catch-all error message with specific ones for opening and duplicating by calling the wrappers xopen and xdup. The code becomes easier to follow when error handling is reduced to two letters. Remove the unnecessary mode parameter while at it -- we expect /dev/null to already exist. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09pack-bitmap: drop bitmap_index argument from try_partial_reuse()Jeff King1-3/+2
Starting in commit 0f533c7284 (pack-bitmap: read multi-pack bitmaps, 2021-08-31), we no longer look at the "struct bitmap_index" passed to try_partial_reuse(). This is because we only handle verbatim reuse from a single pack: either the pack whose bitmap we're looking at, or the "preferred" pack of a midx bitmap. And thus the primary item we look at is the "pack" parameter added by that same commit, and not the bitmap_git->pack parameter (which would be NULL for a midx bitmap). It's our caller, reuse_partial_packfile_from_bitmap(), which decides which pack to use and passes it in to us. Drop the unused parameter to prevent confusion. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09pack-bitmap: drop repository argument from prepare_midx_bitmap_git()Jeff King3-5/+3
We never look at the repository argument which is passed. This makes sense, since the multi_pack_index struct already tells us everything we need to access the files in its associated object directory. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09sparse-index: integrate with cherry-pick and rebaseDerrick Stolee3-2/+46
The hard work was already done with 'git merge' and the ORT strategy. Just add extra tests to see that we get the expected results in the non-conflict cases. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09sequencer: ensure full index if not ORT strategyDerrick Stolee1-0/+9
The sequencer is used by 'cherry-pick' and 'rebase' to sequence a list of operations that modify the index. Since we intend to remove the need for 'command_requires_full_index', we need to ensure the sparse index is expanded every time it is written to disk between these steps. That is, unless the merge strategy is 'ort' where the index can remain sparse throughout. There are two main places to be extra careful about a full index: 1. Right before calling merge_trees(), ensure the index is full. This happens within an 'else' where the 'if' block checks if the 'ort' strategy is selected. 2. During read_and_refresh_cache(), the index might be written to disk and converted to sparse in the process. Ensure it expands back to full afterwards by checking if the strategy is _not_ 'ort'. This 'if' statement is the logical negation of the 'if' in item (1). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09t1092: add cherry-pick, rebase testsDerrick Stolee1-6/+9
Add tests to check that cherry-pick and rebase behave the same in the sparse-index case as in the full index cases. These tests are agnostic to GIT_TEST_MERGE_ALGORITHM, so a full CI test suite will check both the 'ort' and 'recursive' strategies on this test. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09merge-ort: expand only for out-of-cone conflictsDerrick Stolee2-5/+38
Merge conflicts happen often enough to want to avoid expanding a sparse index when they happen, as long as those conflicts are within the sparse-checkout cone. If a conflict exists outside of the sparse-checkout cone, then we still need to expand before iterating over the index entries. This is critical to do in advance because of how the original_cache_nr is tracked to allow inserting and replacing cache entries. Iterate over the conflicted files and check if any paths are outside of the sparse-checkout cone. If so, then expand the full index. Add a test that demonstrates that we do not expand the index, even when we hit a conflict within the sparse-checkout cone. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09merge: make sparse-aware with ORTDerrick Stolee4-2/+24
Allow 'git merge' to operate without expanding a sparse index, at least not immediately. The index still will be expanded in a few cases: 1. If the merge strategy is 'recursive', then we enable command_requires_full_index at the start of the merge_recursive() method. We expect sparse-index users to also have the 'ort' strategy enabled. 2. With the 'ort' strategy, if the merge results in a conflicted file, then we expand the index before updating the working tree. The loop that iterates over the worktree replaces index entries and tracks 'origintal_cache_nr' which can become completely wrong if the index expands in the middle of the operation. This safety valve is important before that loop starts. A later change will focus this to only expand if we indeed have a conflict outside of the sparse-checkout cone. 3. Other merge strategies are executed as a 'git merge-X' subcommand, and those strategies are currently protected with the 'command_requires_full_index' guard. Some test updates are required, including a mistaken 'git checkout -b' that did not specify the base branch, causing merges to be fast-forward merges. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09diff: ignore sparse paths in diffstatDerrick Stolee1-0/+8
The diff_populate_filespec() method is used to describe the diff after a merge operation is complete. In order to avoid expanding a sparse index, the reuse_worktree_file() needs to be adapted to ignore files that are outside of the sparse-checkout cone. The file names and OIDs used for this check come from the merged tree in the case of the ORT strategy, not the index, hence the ability to look into these paths without having already expanded the index. The work done by reuse_worktree_file() is only an optimization, and requires the file being on disk for it to be of any value. Thus, it is safe to exit the method early if we do not expect the file on disk. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09Close object store closer to spawning child processesJohannes Schindelin3-16/+8
In many cases where we spawned child processes that _may_ trigger a repack, we explicitly closed the object store first (so that the `repack` process can delete the `.pack` files, which would otherwise not be possible on Windows since files cannot be deleted as long as they as still in use). Wherever possible, we now use the new `close_object_store` bit of the `run_command()` API, to delay closing the object store even further. This makes the code easier to maintain because it is now more obvious that we only release those file handles because of those child processes. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09run_auto_maintenance(): implicitly close the object storeJohannes Schindelin5-5/+1
Before spawning the auto maintenance, we need to make sure that we release all open file handles to all the `.pack` files (and MIDX files and commit-graph files and...) so that the maintenance process has the freedom to delete those files. So far, we did this manually every time before calling `run_auto_maintenance()`. With the new `close_object_store` flag, we can do that implicitly in that function, which is more robust because future callers won't be able to forget to close the object store. Note: this changes behavior slightly, as we previously _always_ closed the object store, but now we only close the object store when actually running the auto maintenance. In practice, this should not matter (if anything, it might speed up operations where auto maintenance is disabled). Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09run-command: offer to close the object store before runningJohannes Schindelin2-0/+14
Especially on Windows, where files cannot be deleted if _any_ process holds an open file handle to them, it is important to close the object store (releasing all handles to all `.pack` files) before running a command that might spawn a garbage collection. This scenario is so common that we frequently see the pattern of closing the object store before running auto maintenance or another Git command. Let's make this much more convenient by teaching the `run_command()` machinery a new flag to release the object store before spawning the process. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09run-command: prettify the `RUN_COMMAND_*` flagsJohannes Schindelin1-7/+7
The values were listed unaligned, and with powers of two spelled out in decimal. The list is easier to parse for human readers if the numbers are aligned and spelled out as powers of two (using the bit-shift operator `<<`). While at it, remove a code comment that was unclear at best, and confusing at worst. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09pack.h: line-wrap the definition of finish_tmp_packfile()Ævar Arnfjörð Bjarmason1-1/+6
Line-wrap the definition of finish_tmp_packfile() to make subsequent changes easier to read. See 0e990530ae (finish_tmp_packfile(): a helper function, 2011-10-28) for the commit that introduced this overly long line. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09entry: show finer-grained counter in "Filtering content" progress lineSZEDER Gábor1-7/+5
The "Filtering content" progress in entry.c:finish_delayed_checkout() is unusual because of how it calculates the progress count and because it shows the progress of a nested loop. It works basically like this: start_delayed_progress(p, nr_of_paths_to_filter) for_each_filter { display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter) for_each_path_handled_by_the_current_filter { checkout_entry() } } stop_progress(p) There are two issues with this approach: - The work done by the last filter (or the only filter if there is only one) is never counted, so if the last filter still has some paths to process, then the counter shown in the "done" progress line will not match the expected total. The partially-RFC series to add a GIT_TEST_CHECK_PROGRESS=1 mode[1] helps spot this issue. Under it the 'missing file in delayed checkout' and 'invalid file in delayed checkout' tests in 't0021-conversion.sh' fail, because both use only one filter. (The test 'delayed checkout in process filter' uses two filters but the first one does all the work, so that test already happens to succeed even with GIT_TEST_CHECK_PROGRESS=1.) - The progress counter is updated only once per filter, not once per processed path, so if a filter has a lot of paths to process, then the counter might stay unchanged for a long while and then make a big jump (though the user still gets a sense of progress, because we call display_throughput() after each processed path to show the amount of processed data). Move the display_progress() call to the inner loop, right next to that checkout_entry() call that does the hard work for each path, and use a dedicated counter variable that is incremented upon processing each path. After this change the 'invalid file in delayed checkout' in 't0021-conversion.sh' would succeed with the GIT_TEST_CHECK_PROGRESS=1 assertion discussed above, but the 'missing file in delayed checkout' test would still fail. It'll fail because its purposefully buggy filter doesn't process any paths, so we won't execute that inner loop at all, see [2] for how to spot that issue without GIT_TEST_CHECK_PROGRESS=1. It's not straightforward to fix it with the current progress.c library (see [3] for an attempt), so let's leave it for now. Let's also initialize the *progress to "NULL" while we're at it. Since 7a132c628e5 (checkout: make delayed checkout respect --quiet and --no-progress, 2021-08-26) we have had progress conditional on "show_progress", usually we use the idiom of a "NULL" initialization of the "*progress", rather than the more verbose ternary added in 7a132c628e5. 1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/ 2. http://lore.kernel.org/git/20210802214827.GE23408@szeder.dev 3. https://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com/ Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09commit-graph: fix bogus counter in "Scanning merged commits" progress lineSZEDER Gábor1-1/+1
The final value of the counter of the "Scanning merged commits" progress line is always one less than its expected total, e.g.: Scanning merged commits: 83% (5/6), done. This happens because while iterating over an array the loop variable is passed to display_progress() as-is, but while C arrays (and thus the loop variable) start at 0 and end at N-1, the progress counter must end at N. Fix this by passing 'i + 1' to display_progress(), like most other callsites do. There's an RFC series to add a GIT_TEST_CHECK_PROGRESS=1 mode[1] which catches this issue in the 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules' tests in 't5510-fetch.sh'. The GIT_TEST_CHECK_PROGRESS=1 mode is not part of this series, but future changes to progress.c may add it or similar assertions to catch this and similar bugs elsewhere. 1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/ Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08The fourth batchJunio C Hamano1-0/+33
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08pull: release packs before fetchingJohannes Schindelin1-0/+2
On Windows, files cannot be removed nor renamed if there are still handles held by a process. To remedy that, we try to release all open handles to any `.pack` file before e.g. repacking (which would want to remove the original `.pack` file(s) after it is done). Since the `read_cache_unmerged()` and/or the `get_oid()` call in `git pull` can cause `.pack` files to be opened, we need to release the open handles before calling `git fetch`: the latter process might want to spawn an auto-gc, which in turn might want to repack the objects. This commit is similar in spirit to 5bdece0d705 (gc/repack: release packs when needed, 2018-12-15). This fixes https://github.com/git-for-windows/git/issues/3336. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08commit-graph: when closing the graph, also release the slabJohannes Schindelin1-0/+1
The slab has information about the commit graph. That means that it is meaningless (and even misleading) when the commit graph was closed. This seems not to matter currently, but we're about to fix a Windows-specific bug where `git pull` does not close the object store before fetching (risking that an implicit auto-gc fails to remove the now-obsolete pack file(s)), and once we have that bug fix in place, it does matter: after that bug fix, we will open the object store, do some stuff with it, then close it, fetch, and then open it again, and do more stuff. If we close the commit graph without releasing the corresponding slab, we're hit by a symptom like this in t5520.19: BUG: commit-reach.c:85: bad generation skip 9223372036854775807 > 3 at 5cd378271655d43a3b4477520014f02213ad1546 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>