<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/builtin, branch v2.52.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.52.0</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.52.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2025-11-12T19:45:24Z</updated>
<entry>
<title>Merge branch 'tc/last-modified-active-paths-optimization'</title>
<updated>2025-11-12T19:45:24Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-11-12T19:45:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=99bd5a5c9f74abfe81196d96b8467d0d1d4723c5'/>
<id>urn:sha1:99bd5a5c9f74abfe81196d96b8467d0d1d4723c5</id>
<content type='text'>
"git last-modified" was optimized by narrowing the set of paths to
follow as it dug deeper in the history.

* tc/last-modified-active-paths-optimization:
  last-modified: implement faster algorithm
</content>
</entry>
<entry>
<title>Merge branch 'cc/fast-import-export-i18n-cleanup'</title>
<updated>2025-11-06T23:17:01Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-11-06T23:17:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e569dced68a486b38b14cdd2e3e0b34d21752a18'/>
<id>urn:sha1:e569dced68a486b38b14cdd2e3e0b34d21752a18</id>
<content type='text'>
Messages from fast-import/export are now marked for i18n.

* cc/fast-import-export-i18n-cleanup:
  gpg-interface: mark a string for translation
  fast-import: mark strings for translation
  fast-export: mark strings for translation
  gpg-interface: use left shift to define GPG_VERIFY_*
  gpg-interface: simplify ssh fingerprint parsing
</content>
</entry>
<entry>
<title>Merge branch 'rz/t0450-bisect-doc-update'</title>
<updated>2025-11-05T21:41:51Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-11-05T21:41:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c8a641c5904bc55807d4bbed035ab03dbd6c10ba'/>
<id>urn:sha1:c8a641c5904bc55807d4bbed035ab03dbd6c10ba</id>
<content type='text'>
The help text and manual page of "git bisect" command have been
made consistent with each other.

* rz/t0450-bisect-doc-update:
  bisect: update usage and docs to match each other
</content>
</entry>
<entry>
<title>Merge branch 'jt/repo-structure'</title>
<updated>2025-11-04T15:48:07Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-11-04T15:48:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a9db6c66f53f3d1230e575eec682e494b363bdc0'/>
<id>urn:sha1:a9db6c66f53f3d1230e575eec682e494b363bdc0</id>
<content type='text'>
"git repo structure", a new command.

* jt/repo-structure:
  builtin/repo: add progress meter for structure stats
  builtin/repo: add keyvalue and nul format for structure stats
  builtin/repo: add object counts in structure output
  builtin/repo: introduce structure subcommand
  ref-filter: export ref_kind_from_refname()
  ref-filter: allow NULL filter pattern
  builtin/repo: rename repo_info() to cmd_repo_info()
</content>
</entry>
<entry>
<title>last-modified: implement faster algorithm</title>
<updated>2025-11-03T15:25:41Z</updated>
<author>
<name>Toon Claes</name>
<email>toon@iotcl.com</email>
</author>
<published>2025-10-23T07:50:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2a04e8c293766a4976ceceb4c663dd2963e0339e'/>
<id>urn:sha1:2a04e8c293766a4976ceceb4c663dd2963e0339e</id>
<content type='text'>
The current implementation of git-last-modified(1) works by doing a
revision walk, and inspecting the diff at each level of that walk to
annotate entries remaining in the hashmap of paths. In other words, if
the diff at some level touches a path which has not yet been associated
with a commit, then that commit becomes associated with the path.

While a perfectly reasonable implementation, it can perform poorly in
either one of two scenarios:

  1. There are many entries of interest, in which case there is simply
     a lot of work to do.

  2. Or, there are (even a few) entries which have not been updated in a
     long time, and so we must walk through a lot of history in order to
     find a commit that touches that path.

This patch rewrites the last-modified implementation to address the
second point. The idea behind the algorithm is to propagate a set of
'active' paths (a path is 'active' if it does not yet belong to a
commit) up to parents and do a truncated revision walk.

The walk is truncated because it does not produce a revision for every
change in the original pathspec, but rather only for active paths.

More specifically, consider a priority queue of commits sorted by
generation number. First, enqueue the set of boundary commits with all
paths in the original spec marked as interesting.

Then, while the queue is not empty, do the following:

  1. Pop an element, say, 'c', off of the queue, making sure that 'c'
     isn't reachable by anything in the '--not' set.

  2. For each parent 'p' (with index 'parent_i') of 'c', do the
     following:

     a. Compute the diff between 'c' and 'p'.
     b. Pass any active paths that are TREESAME from 'c' to 'p'.
     c. If 'p' has any active paths, push it onto the queue.

  3. Any path that remains active on 'c' is associated with that commit.

This ends up being equivalent to doing something like 'git log -1 --
$path' for each path simultaneously. But, it allows us to go much faster
than the original implementation by limiting the number of diffs we
compute, since we can avoid parts of history that would have been
considered by the revision walk in the original implementation, but are
known to be uninteresting to us because we have already marked all paths
in that area to be inactive.
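
The queue-driven walk described above can be sketched in a few lines of
Python. This is only a toy model, not the C code in this patch: the
'commits' dictionary, its "gen", "parents" and "tree" keys, and the
function name are all hypothetical, the '--not' set is left out, and
TREESAME is approximated by comparing per-path blob ids directly.

```python
import heapq

def last_modified(commits, tip, paths):
    """Map each path to the commit that last changed it (toy model)."""
    active = {tip: set(paths)}            # commit id to its active paths
    heap = [(-commits[tip]["gen"], tip)]  # highest generation pops first
    result = {}
    while heap:
        _, c = heapq.heappop(heap)
        remaining = active.pop(c)
        for p in commits[c]["parents"]:
            # Paths with identical content in 'c' and 'p' (TREESAME).
            same = {path for path in remaining
                    if commits[c]["tree"].get(path) == commits[p]["tree"].get(path)}
            if not same:
                continue
            # Assumption: a path handed to one parent is not offered
            # to the parents that come after it.
            remaining -= same
            if p not in active:
                active[p] = set()
                heapq.heappush(heap, (-commits[p]["gen"], p))
            active[p] |= same
        # Whatever no parent accepted was last changed by 'c'.
        for path in remaining:
            result[path] = c
    return result

# Hypothetical linear history A, then B, then C (C is the tip).
COMMITS = {
    "A": {"gen": 1, "parents": [], "tree": {"a.txt": 1, "b.txt": 1}},
    "B": {"gen": 2, "parents": ["A"], "tree": {"a.txt": 2, "b.txt": 1}},
    "C": {"gen": 3, "parents": ["B"],
          "tree": {"a.txt": 2, "b.txt": 1, "c.txt": 1}},
}
RESULT = last_modified(COMMITS, "C", ["a.txt", "b.txt", "c.txt"])
```

In this toy history, 'c.txt' stays active on C, 'a.txt' on B, and
'b.txt' travels all the way down to A. A parent is only queued when at
least one path was passed to it, which is what lets the walk skip
regions of history where nothing is active anymore.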

To avoid computing many first-parent diffs, add another trick on top of
this and check if all paths active in 'c' are DEFINITELY NOT in c's
Bloom filter. Since the commit-graph only stores first-parent diffs in
the Bloom filters, we can only apply this trick to first-parent diffs.
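
As a rough illustration of the "definitely not" check, here is a toy
Bloom filter in Python. It does not reproduce the on-disk format or the
hash function used by the commit-graph; 'ToyBloom', its parameters, and
'can_skip_first_parent_diff' are hypothetical names chosen for this
sketch only.

```python
import hashlib

class ToyBloom:
    """Tiny Bloom filter over path names (not Git's real format)."""

    def __init__(self, nbits=64, nhashes=3):
        self.nbits = nbits
        self.nhashes = nhashes
        self.bits = [False] * nbits

    def _positions(self, path):
        # Derive 'nhashes' bit positions from seeded SHA-256 digests.
        for seed in range(self.nhashes):
            digest = hashlib.sha256(f"{seed}:{path}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.nbits

    def add(self, path):
        for pos in self._positions(path):
            self.bits[pos] = True

    def maybe_contains(self, path):
        # False means "definitely not changed"; True means "maybe".
        return all(self.bits[pos] for pos in self._positions(path))

def can_skip_first_parent_diff(bloom, active_paths):
    """True when no active path can differ from the first parent."""
    return not any(bloom.maybe_contains(path) for path in active_paths)

# A filter for a hypothetical commit that changed a single file.
bloom = ToyBloom()
bloom.add("builtin/last-modified.c")
```

A Bloom filter never yields a false negative, so when the check says
the diff can be skipped, every active path is known to be TREESAME to
the first parent and can be passed along without computing the diff. A
"maybe" answer still requires the real diff.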

Comparing the performance of this new algorithm against 'master' shows
roughly a 3x improvement on git.git (about 3.7x without Bloom filters
and about 3.1x with them):

    Benchmark 1: master   no bloom
      Time (mean ± σ):      2.868 s ±  0.023 s    [User: 2.811 s, System: 0.051 s]
      Range (min … max):    2.847 s …  2.926 s    10 runs

    Benchmark 2: master with bloom
      Time (mean ± σ):     949.9 ms ±  15.2 ms    [User: 907.6 ms, System: 39.5 ms]
      Range (min … max):   933.3 ms … 971.2 ms    10 runs

    Benchmark 3: HEAD     no bloom
      Time (mean ± σ):     782.0 ms ±   6.3 ms    [User: 740.7 ms, System: 39.2 ms]
      Range (min … max):   776.4 ms … 798.2 ms    10 runs

    Benchmark 4: HEAD   with bloom
      Time (mean ± σ):     307.1 ms ±   1.7 ms    [User: 276.4 ms, System: 29.9 ms]
      Range (min … max):   303.7 ms … 309.5 ms    10 runs

    Summary
      HEAD   with bloom ran
        2.55 ± 0.02 times faster than HEAD     no bloom
        3.09 ± 0.05 times faster than master with bloom
        9.34 ± 0.09 times faster than master   no bloom

In short, the existing implementation *with* Bloom filters is about as
fast as the new implementation *without* them. So, most repositories
should get a dramatic speed-up by just deploying this (even without
computing Bloom filters), and all repositories should get faster still
when computing Bloom filters.

In a more extreme example, `git last-modified -- COPYING t`, the new
implementation is more than 5 times faster:

    Benchmark 1: master
      Time (mean ± σ):      4.372 s ±  0.057 s    [User: 4.286 s, System: 0.062 s]
      Range (min … max):    4.308 s …  4.509 s    10 runs

    Benchmark 2: HEAD
      Time (mean ± σ):     826.3 ms ±  22.3 ms    [User: 784.1 ms, System: 39.2 ms]
      Range (min … max):   810.6 ms … 881.2 ms    10 runs

    Summary
      HEAD ran
        5.29 ± 0.16 times faster than master

As an added benefit, results are more consistent now. For example, the
implementation in 'master' gives:

    $ git log --max-count=1 --format=%H -- pkt-line.h
    15df15fe07ef66b51302bb77e393f3c5502629de

    $ git last-modified -- pkt-line.h
    15df15fe07ef66b51302bb77e393f3c5502629de	pkt-line.h

    $ git last-modified | grep pkt-line.h
    5b49c1af03e600c286f63d9d9c9fb01403230b9f	pkt-line.h

With the changes in this patch the results of git-last-modified(1)
always match those of `git log --max-count=1`.

Note, though, that the results might be output in a different order
than before. This is not considered to be an issue because the order is
not documented as guaranteed anywhere.

Based-on-patches-by: Derrick Stolee &lt;stolee@gmail.com&gt;
Based-on-patches-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Toon Claes &lt;toon@iotcl.com&gt;
Acked-by: Taylor Blau &lt;me@ttaylorr.com&gt;
[jc: tweaked use of xcalloc() to unbreak coccicheck]
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ps/maintenance-geometric'</title>
<updated>2025-11-03T14:49:55Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-11-03T14:49:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3cf3369e8114c79fe2e54714cbf6dcae8b7fad9a'/>
<id>urn:sha1:3cf3369e8114c79fe2e54714cbf6dcae8b7fad9a</id>
<content type='text'>
"git maintenance" command learns the "geometric" strategy where it
avoids doing maintenance tasks that rebuild everything from
scratch.

* ps/maintenance-geometric:
  t7900: fix a flaky test due to git-repack always regenerating MIDX
  builtin/maintenance: introduce "geometric" strategy
  builtin/maintenance: make "gc" strategy accessible
  builtin/maintenance: extend "maintenance.strategy" to manual maintenance
  builtin/maintenance: run maintenance tasks depending on type
  builtin/maintenance: improve readability of strategies
  builtin/maintenance: don't silently ignore invalid strategy
  builtin/maintenance: make the geometric factor configurable
  builtin/maintenance: introduce "geometric-repack" task
  builtin/gc: make `too_many_loose_objects()` reusable without GC config
  builtin/gc: remove global `repack` variable
</content>
</entry>
<entry>
<title>Merge branch 'rz/bisect-help-unknown'</title>
<updated>2025-10-30T15:00:20Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-10-30T15:00:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=be414e17e53fea5c8b738d048b03a9678e23f371'/>
<id>urn:sha1:be414e17e53fea5c8b738d048b03a9678e23f371</id>
<content type='text'>
"git bisect" command did not react correctly to "git bisect help"
and "git bisect unknown", which has been corrected.

* rz/bisect-help-unknown:
  bisect: fix handling of `help` and invalid subcommands
</content>
</entry>
<entry>
<title>Merge branch 'ps/remove-packfile-store-get-packs'</title>
<updated>2025-10-30T15:00:19Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-10-30T15:00:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=55547380384c77071b2fc8ff4cc570434a4793d1'/>
<id>urn:sha1:55547380384c77071b2fc8ff4cc570434a4793d1</id>
<content type='text'>
Two slightly different ways to get at "all the packfiles" in the API
have been cleaned up.

* ps/remove-packfile-store-get-packs:
  packfile: rename `packfile_store_get_all_packs()`
  packfile: introduce macro to iterate through packs
  packfile: drop `packfile_store_get_packs()`
  builtin/grep: simplify how we preload packs
  builtin/gc: convert to use `packfile_store_get_all_packs()`
  object-name: convert to use `packfile_store_get_all_packs()`
</content>
</entry>
<entry>
<title>Merge branch 'ey/commit-graph-changed-paths-config'</title>
<updated>2025-10-30T15:00:19Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-10-30T15:00:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=923436e23d0da21350363422809e2ae9e18c97d3'/>
<id>urn:sha1:923436e23d0da21350363422809e2ae9e18c97d3</id>
<content type='text'>
A new configuration variable commitGraph.changedPaths allows
"--changed-paths" to be turned on by default for "git commit-graph".

* ey/commit-graph-changed-paths-config:
  commit-graph: add new config for changed-paths &amp; recommend it in scalar
</content>
</entry>
<entry>
<title>fast-import: mark strings for translation</title>
<updated>2025-10-30T14:06:58Z</updated>
<author>
<name>Christian Couder</name>
<email>christian.couder@gmail.com</email>
</author>
<published>2025-10-30T12:33:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c295115ec615781a1febad3157ea0e9e5346eba8'/>
<id>urn:sha1:c295115ec615781a1febad3157ea0e9e5346eba8</id>
<content type='text'>
Some error or warning messages in "builtin/fast-import.c" are marked
for translation, but many are not.

To be more consistent and provide a better experience to people using a
translated version, let's mark all the remaining error or warning
messages for translation.

While at it, let's make the following small changes:

  - replace "GIT" or "git" in a few error messages with just "Git",
  - replace "Expected from command, got %s" with "expected 'from'
    command, got '%s'", which makes it clearer that "from" is a command
    and should not be translated,
  - downcase error and warning messages that start with an uppercase
    letter,
  - fix test cases in "t9300-fast-import.sh" that broke because an
    error or warning message was downcased,
  - split error and warning messages that are too long,
  - adjust the indentation of some arguments of the error functions.

Signed-off-by: Christian Couder &lt;chriscool@tuxfamily.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
