<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/list-objects.c, branch v2.45.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.45.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.45.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2024-03-28T21:13:50Z</updated>
<entry>
<title>Merge branch 'eb/hash-transition'</title>
<updated>2024-03-28T21:13:50Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-03-28T21:13:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1002f28a527d33893f7dab068dbac7011f84af65'/>
<id>urn:sha1:1002f28a527d33893f7dab068dbac7011f84af65</id>
<content type='text'>
Work to support a repository that work with both SHA-1 and SHA-256
hash algorithms has started.

* eb/hash-transition: (30 commits)
  t1016-compatObjectFormat: add tests to verify the conversion between objects
  t1006: test oid compatibility with cat-file
  t1006: rename sha1 to oid
  test-lib: compute the compatibility hash so tests may use it
  builtin/ls-tree: let the oid determine the output algorithm
  object-file: handle compat objects in check_object_signature
  tree-walk: init_tree_desc take an oid to get the hash algorithm
  builtin/cat-file: let the oid determine the output algorithm
  rev-parse: add an --output-object-format parameter
  repository: implement extensions.compatObjectFormat
  object-file: update object_info_extended to reencode objects
  object-file-convert: convert commits that embed signed tags
  object-file-convert: convert commit objects when writing
  object-file-convert: don't leak when converting tag objects
  object-file-convert: convert tag objects when writing
  object-file-convert: add a function to convert trees between algorithms
  object: factor out parse_mode out of fast-import and tree-walk into in object.h
  cache: add a function to read an OID of a specific algorithm
  tag: sign both hashes
  commit: export add_header_signature to support handling signatures on tags
  ...
</content>
</entry>
<entry>
<title>Merge branch 'tb/rev-list-unpacked-fix'</title>
<updated>2023-11-08T06:04:42Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-11-08T06:04:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8ed4eb753881cf3a695cbe531c22698ca222ff0f'/>
<id>urn:sha1:8ed4eb753881cf3a695cbe531c22698ca222ff0f</id>
<content type='text'>
"git rev-list --unpacked --objects" failed to exclude packed
non-commit objects, which has been corrected.

* tb/rev-list-unpacked-fix:
  pack-bitmap: drop --unpacked non-commit objects from results
  list-objects: drop --unpacked non-commit objects from results
</content>
</entry>
<entry>
<title>list-objects: drop --unpacked non-commit objects from results</title>
<updated>2023-11-07T02:23:51Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-11-06T22:56:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4263f9279e331fabb34fac7ef92a5a1061ae1d01'/>
<id>urn:sha1:4263f9279e331fabb34fac7ef92a5a1061ae1d01</id>
<content type='text'>
In git-rev-list(1), we describe the `--unpacked` option as:

    Only useful with `--objects`; print the object IDs that are not in
    packs.

This is true of commits, which we discard via get_commit_action(), but
not of the objects they reach. So if we ask for an --objects traversal
with --unpacked, we may get arbitrarily many objects which are indeed
packed.

I am nearly certain this behavior dates back to the introduction of
`--unpacked` via 12d2a18780 ("git rev-list --unpacked" shows only
unpacked commits, 2005-07-03), but I couldn't get that revision of Git
to compile for me. At least as early as v2.0.0 this has been subtly
broken:

    $ git.compile --version
    git version 2.0.0

    $ git.compile rev-list --objects --all --unpacked
    72791fe96c93f9ec5c311b8bc966ab349b3b5bbe
    05713d991c18bbeef7e154f99660005311b5004d v1.0
    153ed8b7719c6f5a68ce7ffc43133e95a6ac0fdb
    8e4020bb5a8d8c873b25de15933e75cc0fc275df one
    9200b628cf9dc883a85a7abc8d6e6730baee589c two
    3e6b46e1b7e3b91acce99f6a823104c28aae0b58 unpacked.t

There, only the first, third, and sixth entries are loose, with the
remaining set of objects belonging to at least one pack.

The implications for this are relatively benign: bare 'git repack'
invocations which invoke pack-objects with --unpacked are impacted, and
at worst we'll store a few extra objects that should have been excluded.

Arguably changing this behavior is a backwards-incompatible change,
since it alters the set of objects emitted from rev-list queries with
`--objects` and `--unpacked`. But I argue that this change is still
sensible, since the existing implementation deviates from
clearly-written documentation.

The fix here is straightforward: avoid showing any non-commit objects
which are contained in packs by discarding them within list-objects.c,
before they are shown to the user. Note that similar treatment for
`list-objects.c::show_commit()` is not needed, since that case is
already handled by `revision.c::get_commit_action()`.

Co-authored-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>rev-list: add commit object support in `--missing` option</title>
<updated>2023-11-01T03:07:18Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2023-10-27T07:59:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9830926c7db626e374b887eeecb96515e2956638'/>
<id>urn:sha1:9830926c7db626e374b887eeecb96515e2956638</id>
<content type='text'>
The `--missing` object option in rev-list currently works only with
missing blobs/trees. For missing commits the revision walker fails with
a fatal error.

Let's extend the functionality of `--missing` option to also support
commit objects. This is done by adding a `missing_objects` field to
`rev_info`. This field is an `oidset` to which we'll add the missing
commits as we encounter them. The revision walker will now continue the
traversal and call `show_commit()` even for missing commits. In rev-list
we can then check if the commit is a missing commit and call the
existing code for parsing `--missing` objects.

A scenario where this option would be used is to find the boundary
objects between different object directories. Consider a repository with
a main object directory (GIT_OBJECT_DIRECTORY) and one or more alternate
object directories (GIT_ALTERNATE_OBJECT_DIRECTORIES). In such a
repository, using the `--missing=print` option while disabling the
alternate object directory allows us to find the boundary objects
between the main and alternate object directory.

Helped-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Helped-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>revision: rename bit to `do_not_die_on_missing_objects`</title>
<updated>2023-11-01T03:07:18Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2023-10-26T10:11:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ca556f47076932a07590c62aad772cfc89491614'/>
<id>urn:sha1:ca556f47076932a07590c62aad772cfc89491614</id>
<content type='text'>
The bit `do_not_die_on_missing_tree` is used in revision.h to ensure the
revision walker does not die when encountering a missing tree. This is
currently exclusively set within `builtin/rev-list.c` to ensure the
`--missing` option works with missing trees.

In the upcoming commits, we will extend `--missing` to also support
missing commits. So let's rename the bit to
`do_not_die_on_missing_objects`, which is object type agnostic and can
be used for both trees/commits.

Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tree-walk: init_tree_desc take an oid to get the hash algorithm</title>
<updated>2023-10-02T21:57:40Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2023-10-02T02:40:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=efed687edcaa272601e0f4e192db368972daa7ac'/>
<id>urn:sha1:efed687edcaa272601e0f4e192db368972daa7ac</id>
<content type='text'>
To make it possible for git ls-tree to display the tree encoded
in the hash algorithm of the oid specified to git ls-tree, update
init_tree_desc to take as a parameter the oid of the tree object.

Update all callers of init_tree_desc and init_tree_desc_gently
to pass the oid of the tree object.

Use the oid of the tree object to discover the hash algorithm
of the oid and store that hash algorithm in struct tree_desc.

Use the hash algorithm in decode_tree_entry and
update_tree_entry_internal to handle reading a tree object encoded in
a hash algorithm that differs from the repositories hash algorithm.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>list-objects: respect max_allowed_tree_depth</title>
<updated>2023-08-31T22:51:08Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2023-08-31T06:22:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=670a1dadc7d98b774cc0f870a4c75f57d1cfa9d4'/>
<id>urn:sha1:670a1dadc7d98b774cc0f870a4c75f57d1cfa9d4</id>
<content type='text'>
The tree traversal in list-objects.c, which is used by "rev-list
--objects", etc, uses recursion and may run out of stack space. Let's
teach it about the new core.maxTreeDepth config option.

We unfortunately can't return an error here, as this code doesn't
produce an error return at all. We'll die() instead, which matches the
behavior when we see an otherwise broken tree.

Note that this will also generally reject such deep trees from entering
the repository from a fetch or push, due to the use of rev-list in the
connectivity check. But it's not foolproof! We stop traversing when we
see an UNINTERESTING object, and the connectivity check marks existing
ref tips as UNINTERESTING. So imagine commit X has a tree
with maximum depth N. If you then create a new commit Y with a tree
entry "Y:subdir" that points to "X^{tree}", then the depth of Y will be
N+1. But a connectivity check running "git rev-list --objects Y --not X"
won't realize that; it will stop traversing at X^{tree}, since that was
already reachable.

So this will stop naive pushes of too-deep trees, but not carefully
crafted malicious ones. Doing it robustly and efficiently would require
caching the maximum depth of each tree (i.e., the longest path to any
leaf entry). That's much more complex and not strictly needed. If each
recursive algorithm limits itself already, then that's sufficient.
Blocking the objects from entering the repo would be a nice
belt-and-suspenders addition, but it's not worth the extra cost.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jc/tree-walk-drop-base-offset'</title>
<updated>2023-08-02T16:37:23Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-08-02T16:37:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=fea92e4cac30b1204ecc8dfd53675bbdd68611e0'/>
<id>urn:sha1:fea92e4cac30b1204ecc8dfd53675bbdd68611e0</id>
<content type='text'>
Code simplification.

* jc/tree-walk-drop-base-offset:
  tree-walk: drop unused base_offset from do_match()
  tree-walk: lose base_offset that is never used in tree_entry_interesting
</content>
</entry>
<entry>
<title>tree-walk: lose base_offset that is never used in tree_entry_interesting</title>
<updated>2023-07-07T22:27:28Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-07-07T22:21:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0ad927e9e0013471cc752781f0c368f862934a44'/>
<id>urn:sha1:0ad927e9e0013471cc752781f0c368f862934a44</id>
<content type='text'>
The tree_entry_interesting() function takes base_offset, allowing
its callers to potentially pass a non-zero number to skip the early
part of the path string.

The feature is never exercised and we do not even know what bugs are
lurking there, as all callers pass 0 to the parameter.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>object-store-ll.h: split this header out of object-store.h</title>
<updated>2023-06-21T20:39:54Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-05-16T06:34:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a034e9106ff1a4cb6fcb6f2ea3a1a47b4d2ba173'/>
<id>urn:sha1:a034e9106ff1a4cb6fcb6f2ea3a1a47b4d2ba173</id>
<content type='text'>
The vast majority of files including object-store.h did not need dir.h
nor khash.h.  Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.

After this patch:
    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
