<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/bundle.h, branch jch</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=jch</id>
<link rel='self' href='https://git.shady.money/git/atom?h=jch'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2025-01-26T02:38:11Z</updated>
<entry>
<title>bundle: avoid closing file descriptor twice</title>
<updated>2025-01-26T02:38:11Z</updated>
<author>
<name>Johannes Schindelin</name>
<email>johannes.schindelin@gmx.de</email>
</author>
<published>2025-01-25T23:57:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9a84794ad8ad1bc8ec6b2c4e1592a1f63765e753'/>
<id>urn:sha1:9a84794ad8ad1bc8ec6b2c4e1592a1f63765e753</id>
<content type='text'>
Already when introduced in c7a8a16239 (Add bundle transport,
2007-09-10), the `bundle` transport had a bug where it would open a file
descriptor to the bundle file and then close it _twice_: First, the file
descriptor (`data-&gt;fd`) is passed to `unbundle()`, which would use it as
the `stdin` of the `index-pack` process, which as a consequence would
close it via `start_command()`. However, `data-&gt;fd` would still hold the
numerical value of the file descriptor, and `close_bundle()` would see
that and happily close it again.

This seems not to have caused too many problems in almost two decades,
but I encountered a situation today where it _does_ cause problems: In
i686 variants of Git for Windows, it seems that file descriptors are
reused quickly after they have been closed.

In the particular scenario I faced, `git fetch &lt;bundle&gt; &lt;ref&gt;` gets the
same file descriptor value when opening the bundle file and importing
its embedded packfile (which implicitly closes the file descriptor) and
then when opening a pack file in `fetch_and_consume_refs()` while
looking up an object's header.

Later on, after the bundle has been imported (and the `close_bundle()`
function erroneously closes the file descriptor that has _already_ been
closed when using it as `stdin` for `git index-pack`), the same file
descriptor value has now been reused via `use_pack()`. Now, when either
the recursive fetch (which defaults to "on", unfortunately) or a
commit-graph update needs to `mmap()` the packfile, it fails due to a
now-invalid file descriptor that _should_ point to the pack file but
doesn't anymore.

To fix that, let's invalidate `data-&gt;fd` after calling `unbundle()`.
That way, `close_bundle()` does not close a file descriptor that may
have been reused for something different. While at it, document that
`unbundle()` closes the file descriptor, and ensure that it also does
that when failing to verify the bundle.

Luckily, this bug does not affect the bundle URI feature, it only
affects the `git fetch &lt;bundle&gt;` code path.

Note that this patch does not _completely_ clarifies who is responsible
to close that file descriptor, as `run_command()` may fail _without_
closing `cmd-&gt;in`. Addressing this issue thoroughly, however, would
require a rather thorough re-design of the `start_command()` and
`finish_command()` functionality to make it a lot less murky who is
responsible for what file descriptors.

At least this here patch is relatively easy to reason about, and
addresses a hard failure (`fatal: mmap: could not determine filesize`)
at the expense of leaking a file descriptor under very rare
circumstances in which `git fetch` would error out anyway.

Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>bundle: support fsck message configuration</title>
<updated>2024-11-28T03:07:58Z</updated>
<author>
<name>Justin Tobler</name>
<email>jltobler@gmail.com</email>
</author>
<published>2024-11-27T23:33:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=187574ce869f1244de83fc6a0a5b6d614fe979f2'/>
<id>urn:sha1:187574ce869f1244de83fc6a0a5b6d614fe979f2</id>
<content type='text'>
If the `VERIFY_BUNDLE_FLAG` is set during `unbundle()`, the
git-index-pack(1) spawned is configured with the `--fsck-options` flag
to perform fsck verification. With this flag enabled, there is not a way
to configure fsck message severity though.

Extend the `unbundle_opts` type to store fsck message severity
configuration and update `unbundle()` to conditionally append it to the
`--fsck-objects` flag if provided. This enables `unbundle()` call sites
to support optionally setting the severity for specific fsck messages.

Signed-off-by: Justin Tobler &lt;jltobler@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>bundle: add bundle verification options type</title>
<updated>2024-11-28T03:07:57Z</updated>
<author>
<name>Justin Tobler</name>
<email>jltobler@gmail.com</email>
</author>
<published>2024-11-27T23:33:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=87c01003cdff8c99ebdf053441e4527d85952284'/>
<id>urn:sha1:87c01003cdff8c99ebdf053441e4527d85952284</id>
<content type='text'>
When `unbundle()` is invoked, fsck verification may be configured by
passing the `VERIFY_BUNDLE_FSCK` flag. This mechanism allows fsck checks
on the bundle to be enabled or disabled entirely. To facilitate more
fine-grained fsck configuration, additional context must be provided to
`unbundle()`.

Introduce the `unbundle_opts` type, which wraps the existing
`verify_bundle_flags`, to facilitate future extension of `unbundle()`
configuration. Also update `unbundle()` and its call sites to accept
this new options type instead of the flags directly. The end behavior is
functionally the same, but allows for the set of configurable options to
be extended. This is leveraged in a subsequent commit to enable fsck
message severity configuration.

Signed-off-by: Justin Tobler &lt;jltobler@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>unbundle: extend object verification for fetches</title>
<updated>2024-06-20T17:30:08Z</updated>
<author>
<name>Xing Xin</name>
<email>xingxin.xx@bytedance.com</email>
</author>
<published>2024-06-19T04:07:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=63d903ff52594eb52289abb89db1a4bca7b0f946'/>
<id>urn:sha1:63d903ff52594eb52289abb89db1a4bca7b0f946</id>
<content type='text'>
The existing fetch.fsckObjects and transfer.fsckObjects configurations
were not fully applied to bundle-involved fetches, including direct
bundle fetches and bundle-uri enabled fetches. Furthermore, there was no
object verification support for unbundle.

This commit extends object verification support in `bundle.c:unbundle`
by adding the `VERIFY_BUNDLE_FSCK` option to `verify_bundle_flags`. When
this option is enabled, we append the `--fsck-objects` flag to
`git-index-pack`.

The `VERIFY_BUNDLE_FSCK` option is now used by bundle-involved fetches,
where we use `fetch-pack.c:fetch_pack_fsck_objects` to determine whether
to enable this option for `bundle.c:unbundle`, specifically in:

- `transport.c:fetch_refs_from_bundle` for direct bundle fetches.
- `bundle-uri.c:unbundle_from_file` for bundle-uri enabled fetches.

This addition ensures a consistent logic for object verification during
fetches. Tests have been added to confirm functionality in the scenarios
mentioned above.

Reviewed-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Xing Xin &lt;xingxin.xx@bytedance.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>treewide: remove unnecessary cache.h inclusion from a few headers</title>
<updated>2023-03-21T17:56:50Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-03-21T06:25:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a6dc3d364cdf89075582cd521f33d599e6b53cf2'/>
<id>urn:sha1:a6dc3d364cdf89075582cd521f33d599e6b53cf2</id>
<content type='text'>
Ever since a64215b6cd ("object.h: stop depending on cache.h; make
cache.h depend on object.h", 2023-02-24), we have a few headers that
could have replaced their include of cache.h with an include of
object.h.  Make that change now.

Some C files had to start including cache.h after this change (or some
smaller header it had brought in), because the C files were depending
on things from cache.h but were only formerly implicitly getting
cache.h through one of these headers being modified in this patch.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ds/bundle-uri-3'</title>
<updated>2022-10-31T01:04:44Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2022-10-31T01:04:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d32dd8add53120cef9b30be4240010c2ab6bfc6f'/>
<id>urn:sha1:d32dd8add53120cef9b30be4240010c2ab6bfc6f</id>
<content type='text'>
Define the logical elements of a "bundle list", data structure to
store them in-core, format to transfer them, and code to parse
them.

* ds/bundle-uri-3:
  bundle-uri: suppress stderr from remote-https
  bundle-uri: quiet failed unbundlings
  bundle: add flags to verify_bundle()
  bundle-uri: fetch a list of bundles
  bundle: properly clear all revision flags
  bundle-uri: limit recursion depth for bundle lists
  bundle-uri: parse bundle list in config format
  bundle-uri: unit test "key=value" parsing
  bundle-uri: create "key=value" line parsing
  bundle-uri: create base key-value pair parsing
  bundle-uri: create bundle_list struct and helpers
  bundle-uri: use plain string in find_temp_filename()
</content>
</entry>
<entry>
<title>bundle-uri: quiet failed unbundlings</title>
<updated>2022-10-12T16:13:25Z</updated>
<author>
<name>Derrick Stolee</name>
<email>derrickstolee@github.com</email>
</author>
<published>2022-10-12T12:52:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=70334fc3ebf1c6199014d82bbbf0595b64a8fa90'/>
<id>urn:sha1:70334fc3ebf1c6199014d82bbbf0595b64a8fa90</id>
<content type='text'>
When downloading a list of bundles in "all" mode, Git has no
understanding of the dependencies between the bundles. Git attempts to
unbundle the bundles in some order, but some may not pass the
verify_bundle() step because of missing prerequisites. This is passed as
error messages to the user, even when they eventually succeed in later
attempts after their dependent bundles are unbundled.

Add a new VERIFY_BUNDLE_QUIET flag to verify_bundle() that avoids the
error messages from the missing prerequisite commits. The method still
returns the number of missing prerequisit commits, allowing callers to
unbundle() to notice that the bundle failed to apply.

Use this flag in bundle-uri.c and test that the messages go away for
'git clone --bundle-uri' commands.

Signed-off-by: Derrick Stolee &lt;derrickstolee@github.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>bundle: add flags to verify_bundle()</title>
<updated>2022-10-12T16:13:25Z</updated>
<author>
<name>Derrick Stolee</name>
<email>derrickstolee@github.com</email>
</author>
<published>2022-10-12T12:52:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=89bd7fedf947484da08e2722d663fdac23a431be'/>
<id>urn:sha1:89bd7fedf947484da08e2722d663fdac23a431be</id>
<content type='text'>
The verify_bundle() method has a 'verbose' option, but we will want to
extend this method to have more granular control over its output. First,
replace this 'verbose' option with a new 'flags' option with a single
possible value: VERIFY_BUNDLE_VERBOSE.

Signed-off-by: Derrick Stolee &lt;derrickstolee@github.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>list-objects-filter: add and use initializers</title>
<updated>2022-09-12T15:38:59Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2022-09-11T05:03:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2a01bdedf87d7cbfc4411ff5059cfe406e1637db'/>
<id>urn:sha1:2a01bdedf87d7cbfc4411ff5059cfe406e1637db</id>
<content type='text'>
In 7e2619d8ff (list_objects_filter_options: plug leak of filter_spec
strings, 2022-09-08), we noted that the filter_spec string_list was
inconsistent in how it handled memory ownership of strings stored in the
list. The fix there was a bit of a band-aid to set the "strdup_strings"
variable right before adding anything.

That works OK, and it lets the users of the API continue to
zero-initialize the struct. But it makes the code a bit hard to follow
and accident-prone, as any other spots appending the filter_spec need to
think about whether to set the strdup_strings value, too (there's one
such spot in partial_clone_get_default_filter_spec(), which is probably
a possible memory leak).

So let's do that full cleanup now. We'll introduce a
LIST_OBJECTS_FILTER_INIT macro and matching function, and use them as
appropriate (though it is for the "_options" struct, this matches the
corresponding list_objects_filter_release() function).

This is harder than it seems! Many other structs, like
git_transport_data, embed the filter struct. So they need to initialize
it themselves even if the rest of the enclosing struct is OK with
zero-initialization. I found all of the relevant spots by grepping
manually for declarations of list_objects_filter_options. And then doing
so recursively for structs which embed it, and ones which embed those,
and so on.

I'm pretty sure I got everything, but there's no change that would alert
the compiler if any topics in flight added new declarations. To catch
this case, we now double-check in the parsing function that things were
initialized as expected and BUG() if appropriate.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>bundle.h: make "fd" version of read_bundle_header() public</title>
<updated>2022-05-16T22:02:10Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2022-05-16T20:11:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=89c6e450fe4a919ecb6fa698005a935531c732cf'/>
<id>urn:sha1:89c6e450fe4a919ecb6fa698005a935531c732cf</id>
<content type='text'>
Change the parse_bundle_header() function to be non-static, and rename
it to parse_bundle_header_fd(). The parse_bundle_header() function is
already public, and it's a thin wrapper around this function. This
will be used by code that wants to pass a fd to the bundle API.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Derrick Stolee &lt;derrickstolee@github.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
