diff options
| author | Derrick Stolee <stolee@gmail.com> | 2025-03-24 00:51:51 +0000 |
|---|---|---|
| committer | Junio C Hamano <gitster@pobox.com> | 2025-03-23 23:06:01 -0700 |
| commit | 6540560fd6c91091f6cf1eaedd034bc1827e1506 (patch) | |
| tree | 9e5757f91fc449f73f72f12f68ec02f0426fdc76 | |
| parent | maintenance: force progress/no-quiet to children (diff) | |
| download | git-6540560fd6c91091f6cf1eaedd034bc1827e1506.tar.gz git-6540560fd6c91091f6cf1eaedd034bc1827e1506.zip | |
maintenance: add loose-objects.batchSize config
The 'loose-objects' task of 'git maintenance run' first deletes loose
objects that exit within packfiles and then collects loose objects into
a packfile. This second step uses an implicit limit of fifty thousand
that cannot be modified by users.
Add a new config option that allows this limit to be adjusted or ignored
entirely.
While creating tests for this option, I noticed that actually there was
an off-by-one error due to the strict comparison in the limit check. I
considered making the limit check turn true on equality, but instead I
thought to use INT_MAX as a "no limit" barrier which should mean it's
never possible to hit the limit. Thus, a new decrement to the limit is
provided if the value is positive. (The restriction to positive values
is to avoid underflow if INT_MIN is configured.)
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
| -rw-r--r-- | Documentation/config/maintenance.adoc | 5 | ||||
| -rw-r--r-- | Documentation/git-maintenance.adoc | 18 | ||||
| -rw-r--r-- | builtin/gc.c | 10 | ||||
| -rwxr-xr-x | t/t7900-maintenance.sh | 28 |
4 files changed, 54 insertions, 7 deletions
diff --git a/Documentation/config/maintenance.adoc b/Documentation/config/maintenance.adoc index 72a9d6cf81..42f9545da0 100644 --- a/Documentation/config/maintenance.adoc +++ b/Documentation/config/maintenance.adoc @@ -61,6 +61,11 @@ maintenance.loose-objects.auto:: loose objects is at least the value of `maintenance.loose-objects.auto`. The default value is 100. +maintenance.loose-objects.batchSize:: + This integer config option controls the maximum number of loose objects + written into a packfile during the `loose-objects` task. The default is + fifty thousand. Use value `0` to indicate no limit. + maintenance.incremental-repack.auto:: This integer config option controls how often the `incremental-repack` task should be run as part of `git maintenance run --auto`. If zero, diff --git a/Documentation/git-maintenance.adoc b/Documentation/git-maintenance.adoc index 0450d74aff..c90b370b1f 100644 --- a/Documentation/git-maintenance.adoc +++ b/Documentation/git-maintenance.adoc @@ -126,13 +126,17 @@ loose-objects:: objects that already exist in a pack-file; concurrent Git processes will examine the pack-file for the object data instead of the loose object. Second, it creates a new pack-file (starting with "loose-") - containing a batch of loose objects. The batch size is limited to 50 - thousand objects to prevent the job from taking too long on a - repository with many loose objects. The `gc` task writes unreachable - objects as loose objects to be cleaned up by a later step only if - they are not re-added to a pack-file; for this reason it is not - advisable to enable both the `loose-objects` and `gc` tasks at the - same time. + containing a batch of loose objects. ++ +The batch size defaults to fifty thousand objects to prevent the job from +taking too long on a repository with many loose objects. Use the +`maintenance.loose-objects.batchSize` config option to adjust this size, +including a value of `0` to remove the limit. ++ +The `gc` task writes unreachable objects as loose objects to be cleaned up +by a later step only if they are not re-added to a pack-file; for this +reason it is not advisable to enable both the `loose-objects` and `gc` +tasks at the same time. incremental-repack:: The `incremental-repack` job repacks the object directory diff --git a/builtin/gc.c b/builtin/gc.c index 6672f165bd..817081e1a5 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -1163,6 +1163,7 @@ static int write_loose_object_to_stdin(const struct object_id *oid, fprintf(d->in, "%s\n", oid_to_hex(oid)); + /* If batch_size is INT_MAX, then this will return 0 always. */ return ++(d->count) > d->batch_size; } @@ -1208,6 +1209,15 @@ static int pack_loose(struct maintenance_run_opts *opts) data.count = 0; data.batch_size = 50000; + repo_config_get_int(r, "maintenance.loose-objects.batchSize", + &data.batch_size); + + /* If configured as 0, then remove limit. */ + if (!data.batch_size) + data.batch_size = INT_MAX; + else if (data.batch_size > 0) + data.batch_size--; /* Decrease for equality on limit. */ + for_each_loose_file_in_objdir(r->objects->odb->path, write_loose_object_to_stdin, NULL, diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 1909aed95e..834ddb5ad6 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -306,6 +306,34 @@ test_expect_success 'maintenance.loose-objects.auto' ' test_subcommand git prune-packed --quiet <trace-loC ' +test_expect_success 'maintenance.loose-objects.batchSize' ' + git init loose-batch && + + # This creates three objects per commit. + test_commit_bulk -C loose-batch 34 && + pack=$(ls loose-batch/.git/objects/pack/pack-*.pack) && + index="${pack%pack}idx" && + rm "$index" && + git -C loose-batch unpack-objects <"$pack" && + git -C loose-batch config maintenance.loose-objects.batchSize 50 && + + GIT_PROGRESS_DELAY=0 \ + git -C loose-batch maintenance run --no-quiet --task=loose-objects 2>err && + grep "Enumerating objects: 50, done." err && + + GIT_PROGRESS_DELAY=0 \ + git -C loose-batch maintenance run --no-quiet --task=loose-objects 2>err && + grep "Enumerating objects: 50, done." err && + + GIT_PROGRESS_DELAY=0 \ + git -C loose-batch maintenance run --no-quiet --task=loose-objects 2>err && + grep "Enumerating objects: 2, done." err && + + GIT_PROGRESS_DELAY=0 \ + git -C loose-batch maintenance run --no-quiet --task=loose-objects 2>err && + test_must_be_empty err +' + test_expect_success 'incremental-repack task' ' packDir=.git/objects/pack && for i in $(test_seq 1 5) |
