<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/list-objects.c, branch v2.43.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.43.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.43.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2023-11-08T06:04:42Z</updated>
<entry>
<title>Merge branch 'tb/rev-list-unpacked-fix'</title>
<updated>2023-11-08T06:04:42Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-11-08T06:04:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8ed4eb753881cf3a695cbe531c22698ca222ff0f'/>
<id>urn:sha1:8ed4eb753881cf3a695cbe531c22698ca222ff0f</id>
<content type='text'>
"git rev-list --unpacked --objects" failed to exclude packed
non-commit objects, which has been corrected.

* tb/rev-list-unpacked-fix:
  pack-bitmap: drop --unpacked non-commit objects from results
  list-objects: drop --unpacked non-commit objects from results
</content>
</entry>
<entry>
<title>list-objects: drop --unpacked non-commit objects from results</title>
<updated>2023-11-07T02:23:51Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-11-06T22:56:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4263f9279e331fabb34fac7ef92a5a1061ae1d01'/>
<id>urn:sha1:4263f9279e331fabb34fac7ef92a5a1061ae1d01</id>
<content type='text'>
In git-rev-list(1), we describe the `--unpacked` option as:

    Only useful with `--objects`; print the object IDs that are not in
    packs.

This is true of commits, which we discard via get_commit_action(), but
not of the objects they reach. So if we ask for an --objects traversal
with --unpacked, we may get arbitrarily many objects which are indeed
packed.

I am nearly certain this behavior dates back to the introduction of
`--unpacked` via 12d2a18780 ("git rev-list --unpacked" shows only
unpacked commits, 2005-07-03), but I couldn't get that revision of Git
to compile for me. At least as early as v2.0.0 this has been subtly
broken:

    $ git.compile --version
    git version 2.0.0

    $ git.compile rev-list --objects --all --unpacked
    72791fe96c93f9ec5c311b8bc966ab349b3b5bbe
    05713d991c18bbeef7e154f99660005311b5004d v1.0
    153ed8b7719c6f5a68ce7ffc43133e95a6ac0fdb
    8e4020bb5a8d8c873b25de15933e75cc0fc275df one
    9200b628cf9dc883a85a7abc8d6e6730baee589c two
    3e6b46e1b7e3b91acce99f6a823104c28aae0b58 unpacked.t

There, only the first, third, and sixth entries are loose, with the
remaining set of objects belonging to at least one pack.

The implications for this are relatively benign: bare 'git repack'
invocations which invoke pack-objects with --unpacked are impacted, and
at worst we'll store a few extra objects that should have been excluded.

Arguably changing this behavior is a backwards-incompatible change,
since it alters the set of objects emitted from rev-list queries with
`--objects` and `--unpacked`. But I argue that this change is still
sensible, since the existing implementation deviates from
clearly-written documentation.

The fix here is straightforward: avoid showing any non-commit objects
which are contained in packs by discarding them within list-objects.c,
before they are shown to the user. Note that similar treatment for
`list-objects.c::show_commit()` is not needed, since that case is
already handled by `revision.c::get_commit_action()`.

Co-authored-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>rev-list: add commit object support in `--missing` option</title>
<updated>2023-11-01T03:07:18Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2023-10-27T07:59:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9830926c7db626e374b887eeecb96515e2956638'/>
<id>urn:sha1:9830926c7db626e374b887eeecb96515e2956638</id>
<content type='text'>
The `--missing` object option in rev-list currently works only with
missing blobs/trees. For missing commits the revision walker fails with
a fatal error.

Let's extend the functionality of `--missing` option to also support
commit objects. This is done by adding a `missing_objects` field to
`rev_info`. This field is an `oidset` to which we'll add the missing
commits as we encounter them. The revision walker will now continue the
traversal and call `show_commit()` even for missing commits. In rev-list
we can then check if the commit is a missing commit and call the
existing code for parsing `--missing` objects.

A scenario where this option would be used is to find the boundary
objects between different object directories. Consider a repository with
a main object directory (GIT_OBJECT_DIRECTORY) and one or more alternate
object directories (GIT_ALTERNATE_OBJECT_DIRECTORIES). In such a
repository, using the `--missing=print` option while disabling the
alternate object directory allows us to find the boundary objects
between the main and alternate object directory.

Helped-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Helped-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>revision: rename bit to `do_not_die_on_missing_objects`</title>
<updated>2023-11-01T03:07:18Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2023-10-26T10:11:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ca556f47076932a07590c62aad772cfc89491614'/>
<id>urn:sha1:ca556f47076932a07590c62aad772cfc89491614</id>
<content type='text'>
The bit `do_not_die_on_missing_tree` is used in revision.h to ensure the
revision walker does not die when encountering a missing tree. This is
currently exclusively set within `builtin/rev-list.c` to ensure the
`--missing` option works with missing trees.

In the upcoming commits, we will extend `--missing` to also support
missing commits. So let's rename the bit to
`do_not_die_on_missing_objects`, which is object type agnostic and can
be used for both trees/commits.

Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>list-objects: respect max_allowed_tree_depth</title>
<updated>2023-08-31T22:51:08Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2023-08-31T06:22:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=670a1dadc7d98b774cc0f870a4c75f57d1cfa9d4'/>
<id>urn:sha1:670a1dadc7d98b774cc0f870a4c75f57d1cfa9d4</id>
<content type='text'>
The tree traversal in list-objects.c, which is used by "rev-list
--objects", etc, uses recursion and may run out of stack space. Let's
teach it about the new core.maxTreeDepth config option.

We unfortunately can't return an error here, as this code doesn't
produce an error return at all. We'll die() instead, which matches the
behavior when we see an otherwise broken tree.

Note that this will also generally reject such deep trees from entering
the repository from a fetch or push, due to the use of rev-list in the
connectivity check. But it's not foolproof! We stop traversing when we
see an UNINTERESTING object, and the connectivity check marks existing
ref tips as UNINTERESTING. So imagine commit X has a tree
with maximum depth N. If you then create a new commit Y with a tree
entry "Y:subdir" that points to "X^{tree}", then the depth of Y will be
N+1. But a connectivity check running "git rev-list --objects Y --not X"
won't realize that; it will stop traversing at X^{tree}, since that was
already reachable.

So this will stop naive pushes of too-deep trees, but not carefully
crafted malicious ones. Doing it robustly and efficiently would require
caching the maximum depth of each tree (i.e., the longest path to any
leaf entry). That's much more complex and not strictly needed. If each
recursive algorithm limits itself already, then that's sufficient.
Blocking the objects from entering the repo would be a nice
belt-and-suspenders addition, but it's not worth the extra cost.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jc/tree-walk-drop-base-offset'</title>
<updated>2023-08-02T16:37:23Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-08-02T16:37:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=fea92e4cac30b1204ecc8dfd53675bbdd68611e0'/>
<id>urn:sha1:fea92e4cac30b1204ecc8dfd53675bbdd68611e0</id>
<content type='text'>
Code simplification.

* jc/tree-walk-drop-base-offset:
  tree-walk: drop unused base_offset from do_match()
  tree-walk: lose base_offset that is never used in tree_entry_interesting
</content>
</entry>
<entry>
<title>tree-walk: lose base_offset that is never used in tree_entry_interesting</title>
<updated>2023-07-07T22:27:28Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-07-07T22:21:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0ad927e9e0013471cc752781f0c368f862934a44'/>
<id>urn:sha1:0ad927e9e0013471cc752781f0c368f862934a44</id>
<content type='text'>
The tree_entry_interesting() function takes base_offset, allowing
its callers to potentially pass a non-zero number to skip the early
part of the path string.

The feature is never exercised and we do not even know what bugs are
lurking there, as all callers pass 0 to the parameter.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>object-store-ll.h: split this header out of object-store.h</title>
<updated>2023-06-21T20:39:54Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-05-16T06:34:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a034e9106ff1a4cb6fcb6f2ea3a1a47b4d2ba173'/>
<id>urn:sha1:a034e9106ff1a4cb6fcb6f2ea3a1a47b4d2ba173</id>
<content type='text'>
The vast majority of files including object-store.h did not need dir.h
nor khash.h.  Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.

After this patch:
    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>treewide: remove cache.h inclusion due to object.h changes</title>
<updated>2023-04-11T15:52:10Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-04-11T07:41:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d812c3b6a0b41d48ce68f971d9cec50676048863'/>
<id>urn:sha1:d812c3b6a0b41d48ce68f971d9cec50676048863</id>
<content type='text'>
Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Acked-by: Calvin Wan &lt;calvinwan@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h</title>
<updated>2023-04-04T15:25:52Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-04-04T15:25:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e7dca80692dec7f0717d9ad6d978d0ceab90cf6a'/>
<id>urn:sha1:e7dca80692dec7f0717d9ad6d978d0ceab90cf6a</id>
<content type='text'>
* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect &amp; verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
</content>
</entry>
</feed>
