<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/builtin/pack-objects.c, branch v2.26.1</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.26.1</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.26.1'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2020-03-05T18:43:03Z</updated>
<entry>
<title>Merge branch 'jk/nth-packed-object-id'</title>
<updated>2020-03-05T18:43:03Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-03-05T18:43:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e8e71848ea866d7dc34eacffc20b9c3826ae29a1'/>
<id>urn:sha1:e8e71848ea866d7dc34eacffc20b9c3826ae29a1</id>
<content type='text'>
Code cleanup to use "struct object_id" more by replacing use of
"char *sha1"

* jk/nth-packed-object-id:
  packfile: drop nth_packed_object_sha1()
  packed_object_info(): use object_id internally for delta base
  packed_object_info(): use object_id for returning delta base
  pack-check: push oid lookup into loop
  pack-check: convert "internal error" die to a BUG()
  pack-bitmap: use object_id when loading on-disk bitmaps
  pack-objects: use object_id struct in pack-reuse code
  pack-objects: convert oe_set_delta_ext() to use object_id
  pack-objects: read delta base oid into object_id struct
  nth_packed_object_oid(): use customary integer return
</content>
</entry>
<entry>
<title>Merge branch 'jk/object-filter-with-bitmap'</title>
<updated>2020-03-02T23:07:18Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-03-02T23:07:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0df82d99dae85dbd4f667e95020a146ea0167975'/>
<id>urn:sha1:0df82d99dae85dbd4f667e95020a146ea0167975</id>
<content type='text'>
The object reachability bitmap machinery and the partial cloning
machinery were not prepared to work well together, because some
object-filtering criteria that partial clones use inherently rely
on object traversal, but the bitmap machinery is an optimization
to bypass that object traversal.  There however are some cases
where they can work together, and they were taught about them.

* jk/object-filter-with-bitmap:
  rev-list --count: comment on the use of count_right++
  pack-objects: support filters with bitmaps
  pack-bitmap: implement BLOB_LIMIT filtering
  pack-bitmap: implement BLOB_NONE filtering
  bitmap: add bitmap_unset() function
  rev-list: use bitmap filters for traversal
  pack-bitmap: basic noop bitmap filter infrastructure
  rev-list: allow commit-only bitmap traversals
  t5310: factor out bitmap traversal comparison
  rev-list: allow bitmaps when counting objects
  rev-list: make --count work with --objects
  rev-list: factor out bitmap-optimized routines
  pack-bitmap: refuse to do a bitmap traversal with pathspecs
  rev-list: fallback to non-bitmap traversal when filtering
  pack-bitmap: fix leak of haves/wants object lists
  pack-bitmap: factor out type iterator initialization
</content>
</entry>
<entry>
<title>pack-objects: use object_id struct in pack-reuse code</title>
<updated>2020-02-24T20:55:53Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-02-24T04:31:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f66d4e025059b734ba8da40ec059bb0fb8991306'/>
<id>urn:sha1:f66d4e025059b734ba8da40ec059bb0fb8991306</id>
<content type='text'>
When the pack-reuse code is dumping an OFS_DELTA entry to a client that
doesn't support it, we re-write it as a REF_DELTA. To do so, we use
nth_packed_object_sha1() to get the oid, but that function is soon going
away in favor of the more type-safe nth_packed_object_id(). Let's switch
now in preparation.

Note that this does incur an extra hash copy (from the pack idx mmap to
the object_id and then to the output, rather than straight from mmap to
the output). But this is not worth worrying about. It's probably not
measurable even when it triggers, and this is fallback code that we
expect to trigger very rarely (since everybody supports OFS_DELTA these
days anyway).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-objects: convert oe_set_delta_ext() to use object_id</title>
<updated>2020-02-24T20:55:52Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-02-24T04:30:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a93c141ddef25dc999fff73c590b42d3af606ff3'/>
<id>urn:sha1:a93c141ddef25dc999fff73c590b42d3af606ff3</id>
<content type='text'>
We already store an object_id internally, and now our sole caller also
has one. Let's stop passing around the internal hash array, which adds a
bit of type safety.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-objects: read delta base oid into object_id struct</title>
<updated>2020-02-24T20:55:52Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-02-24T04:29:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3f83fd5e44c1f038c8a7033cb77399e9ef4f43a9'/>
<id>urn:sha1:3f83fd5e44c1f038c8a7033cb77399e9ef4f43a9</id>
<content type='text'>
When we're considering reusing an on-disk delta, we get the oid of the
base as a pointer to unsigned char bytes of the hash, either into the
packfile itself (for REF_DELTA) or into the pack idx (using the revindex
to convert the offset into an index entry).

Instead, we'd prefer to use a more type-safe object_id as much as
possible. We can get the pack idx using nth_packed_object_id() instead.
For the packfile bytes, we can copy them out using oidread().

This doesn't even incur an extra copy overall, since the next thing we'd
always do with that pointer is pass it to can_reuse_delta(), which needs
an object_id anyway (and called oidread() itself). So this patch also
converts that function to take the object_id directly.

Note that we did previously use NULL as a sentinel value when the object
isn't a delta. We could probably get away with using the null oid for
this, but instead we'll use an explicit boolean flag, which should make
things more obvious for people reading the code later.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>nth_packed_object_oid(): use customary integer return</title>
<updated>2020-02-24T20:55:42Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-02-24T04:27:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0763671b8e0b3ef873df13c741a911b809e6813d'/>
<id>urn:sha1:0763671b8e0b3ef873df13c741a911b809e6813d</id>
<content type='text'>
Our nth_packed_object_sha1() function returns NULL for error. So when we
wrapped it with nth_packed_object_oid(), we kept the same semantics. But
it's a bit funny, because the caller actually passes in an out
parameter, and the pointer we return is just that same struct they
passed to us (or NULL).

It's not too terrible, but it does make the interface a little
non-idiomatic. Let's switch to our usual "0 for success, negative for
error" return value. Most callers either don't check it, or are
trivially converted. The one that requires the biggest change is
actually improved, as we can ditch an extra aliased pointer variable.

Since we are changing the interface in a subtle way that the compiler
wouldn't catch, let's also change the name to catch any topics in
flight. We can drop the 'o' and make it nth_packed_object_id(). That's
slightly shorter, but also less redundant since the 'o' stands for
"object" already.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'mt/use-passed-repo-more-in-funcs'</title>
<updated>2020-02-14T20:54:22Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-02-14T20:54:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=78e67cda42a440f216fc4cbfd8a9e190c5eece5a'/>
<id>urn:sha1:78e67cda42a440f216fc4cbfd8a9e190c5eece5a</id>
<content type='text'>
Some codepaths were given a repository instance as a parameter to
work in the repository, but passed the_repository instance to its
callees, which has been cleaned up (somewhat).

* mt/use-passed-repo-more-in-funcs:
  sha1-file: allow check_object_signature() to handle any repo
  sha1-file: pass git_hash_algo to hash_object_file()
  sha1-file: pass git_hash_algo to write_object_file_prepare()
  streaming: allow open_istream() to handle any repo
  pack-check: use given repo's hash_algo at verify_packfile()
  cache-tree: use given repo's hash_algo at verify_one()
  diff: make diff_populate_filespec() honor its repo argument
</content>
</entry>
<entry>
<title>Merge branch 'jk/packfile-reuse-cleanup'</title>
<updated>2020-02-14T20:54:19Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-02-14T20:54:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a14aebeac330e6d58f9628a02521ea780daf0a5b'/>
<id>urn:sha1:a14aebeac330e6d58f9628a02521ea780daf0a5b</id>
<content type='text'>
The way "git pack-objects" reuses objects stored in existing pack
to generate its result has been improved.

* jk/packfile-reuse-cleanup:
  pack-bitmap: don't rely on bitmap_git-&gt;reuse_objects
  pack-objects: add checks for duplicate objects
  pack-objects: improve partial packfile reuse
  builtin/pack-objects: introduce obj_is_packed()
  pack-objects: introduce pack.allowPackReuse
  csum-file: introduce hashfile_total()
  pack-bitmap: simplify bitmap_has_oid_in_uninteresting()
  pack-bitmap: uninteresting oid can be outside bitmapped packfile
  pack-bitmap: introduce bitmap_walk_contains()
  ewah/bitmap: introduce bitmap_word_alloc()
  packfile: expose get_delta_base()
  builtin/pack-objects: report reused packfile objects
</content>
</entry>
<entry>
<title>pack-objects: support filters with bitmaps</title>
<updated>2020-02-14T18:46:22Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-02-14T18:22:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3ab3185f999f5d0d0079ac8246edb8fca5d9d3fd'/>
<id>urn:sha1:3ab3185f999f5d0d0079ac8246edb8fca5d9d3fd</id>
<content type='text'>
Just as rev-list recently learned to combine filters and bitmaps, let's
do the same for pack-objects. The infrastructure is all there; we just
need to pass along our filter options, and the pack-bitmap code will
decide to use bitmaps or not.

This unsurprisingly makes things faster for partial clones of large
repositories (here we're cloning linux.git):

  Test                               HEAD^               HEAD
  ------------------------------------------------------------------------------
  5310.11: simulated partial clone   38.94(37.28+5.87)   11.06(11.27+4.07) -71.6%

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap: basic noop bitmap filter infrastructure</title>
<updated>2020-02-14T18:46:22Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-02-14T18:22:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6663ae0a0818aba5d4de289b1a37e1961ad6c367'/>
<id>urn:sha1:6663ae0a0818aba5d4de289b1a37e1961ad6c367</id>
<content type='text'>
Currently you can't use object filters with bitmaps, but we plan to
support at least some filters with bitmaps. Let's introduce some
infrastructure that will help us do that:

  - prepare_bitmap_walk() now accepts a list_objects_filter_options
    parameter (which can be NULL for no filtering; all the current
    callers pass this)

  - we'll bail early if the filter is incompatible with bitmaps (just as
    we would if there were no bitmaps at all). Currently all filters are
    incompatible.

  - we'll filter the resulting bitmap; since there are no supported
    filters yet, this is always a noop.

There should be no behavior change yet, but we'll support some actual
filters in a future patch.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
