<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/bundle-uri.c, branch v2.47.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.47.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.47.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2024-10-10T18:47:24Z</updated>
<entry>
<title>bundle-uri: plug leak in unbundle_from_file()</title>
<updated>2024-10-10T18:47:24Z</updated>
<author>
<name>Toon Claes</name>
<email>toon@iotcl.com</email>
</author>
<published>2024-10-10T09:12:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6dab49b9fbbb63a5f58a3cc6e2295f01b1f628f0'/>
<id>urn:sha1:6dab49b9fbbb63a5f58a3cc6e2295f01b1f628f0</id>
<content type='text'>
The function `unbundle_from_file()` has two memory leaks:

  - We do not release the `struct bundle_header header` when hitting
    errors because we return early without any cleanup.

  - We do not release the `struct strbuf bundle_ref` at all.

Plug these leaks by creating a common exit path where both of these
variables are released.

While at it, refactor the code such that the variable assignments do not
happen inside the conditional statement itself according to our coding
style.

Signed-off-by: Toon Claes &lt;toon@iotcl.com&gt;
Acked-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ps/environ-wo-the-repository'</title>
<updated>2024-09-23T17:35:05Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-09-23T17:35:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3eb66799593f3676d85ca66f9e3192a7db603805'/>
<id>urn:sha1:3eb66799593f3676d85ca66f9e3192a7db603805</id>
<content type='text'>
Code clean-up.

* ps/environ-wo-the-repository: (21 commits)
  environment: stop storing "core.notesRef" globally
  environment: stop storing "core.warnAmbiguousRefs" globally
  environment: stop storing "core.preferSymlinkRefs" globally
  environment: stop storing "core.logAllRefUpdates" globally
  refs: stop modifying global `log_all_ref_updates` variable
  branch: stop modifying `log_all_ref_updates` variable
  repo-settings: track defaults close to `struct repo_settings`
  repo-settings: split out declarations into a standalone header
  environment: guard state depending on a repository
  environment: reorder header to split out `the_repository`-free section
  environment: move `set_git_dir()` and related into setup layer
  environment: make `get_git_namespace()` self-contained
  environment: move object database functions into object layer
  config: make dependency on repo in `read_early_config()` explicit
  config: document `read_early_config()` and `read_very_early_config()`
  environment: make `get_git_work_tree()` accept a repository
  environment: make `get_graft_file()` accept a repository
  environment: make `get_index_file()` accept a repository
  environment: make `get_object_directory()` accept a repository
  environment: make `get_git_common_dir()` accept a repository
  ...
</content>
</entry>
<entry>
<title>environment: move object database functions into object layer</title>
<updated>2024-09-12T17:15:40Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-09-12T11:29:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=26b4df907bca4829df44dc7eee23ccf7720898f7'/>
<id>urn:sha1:26b4df907bca4829df44dc7eee23ccf7720898f7</id>
<content type='text'>
The `odb_mkstemp()` and `odb_pack_keep()` functions are quite clearly
tied to the object store, but regardless of that they are located in
"environment.c". Move them over, which also helps to get rid of
dependencies on `the_repository` in the environment subsystem.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fetch: add top-level trace2 regions</title>
<updated>2024-08-22T22:02:31Z</updated>
<author>
<name>Josh Steadmon</name>
<email>steadmon@google.com</email>
</author>
<published>2024-08-22T21:57:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a45ab549879d0f8e2524f820796daf2f6a1f7d00'/>
<id>urn:sha1:a45ab549879d0f8e2524f820796daf2f6a1f7d00</id>
<content type='text'>
At $DAYJOB we experienced some slow fetch operations and needed some
additional data to help diagnose the issue.

Add top-level trace2 regions for the various modes of operation of
`git-fetch`. None of these regions are in recursive code, so any
enclosed trace messages should only see their nesting level increase by
one.

Signed-off-by: Josh Steadmon &lt;steadmon@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'xx/bundie-uri-fixes'</title>
<updated>2024-07-08T21:53:11Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-07-08T21:53:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=125e38947087906be17fb4e237fc09764df97c65'/>
<id>urn:sha1:125e38947087906be17fb4e237fc09764df97c65</id>
<content type='text'>
When bundleURI interface fetches multiple bundles, Git failed to
take full advantage of all bundles and ended up slurping duplicated
objects.

* xx/bundie-uri-fixes:
  unbundle: extend object verification for fetches
  fetch-pack: expose fsckObjects configuration logic
  bundle-uri: verify oid before writing refs
</content>
</entry>
<entry>
<title>unbundle: extend object verification for fetches</title>
<updated>2024-06-20T17:30:08Z</updated>
<author>
<name>Xing Xin</name>
<email>xingxin.xx@bytedance.com</email>
</author>
<published>2024-06-19T04:07:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=63d903ff52594eb52289abb89db1a4bca7b0f946'/>
<id>urn:sha1:63d903ff52594eb52289abb89db1a4bca7b0f946</id>
<content type='text'>
The existing fetch.fsckObjects and transfer.fsckObjects configurations
were not fully applied to bundle-involved fetches, including direct
bundle fetches and bundle-uri enabled fetches. Furthermore, there was no
object verification support for unbundle.

This commit extends object verification support in `bundle.c:unbundle`
by adding the `VERIFY_BUNDLE_FSCK` option to `verify_bundle_flags`. When
this option is enabled, we append the `--fsck-objects` flag to
`git-index-pack`.

The `VERIFY_BUNDLE_FSCK` option is now used by bundle-involved fetches,
where we use `fetch-pack.c:fetch_pack_fsck_objects` to determine whether
to enable this option for `bundle.c:unbundle`, specifically in:

- `transport.c:fetch_refs_from_bundle` for direct bundle fetches.
- `bundle-uri.c:unbundle_from_file` for bundle-uri enabled fetches.

This addition ensures a consistent logic for object verification during
fetches. Tests have been added to confirm functionality in the scenarios
mentioned above.

Reviewed-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Xing Xin &lt;xingxin.xx@bytedance.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>bundle-uri: verify oid before writing refs</title>
<updated>2024-06-20T17:30:07Z</updated>
<author>
<name>Xing Xin</name>
<email>xingxin.xx@bytedance.com</email>
</author>
<published>2024-06-19T04:07:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3079026fc17275d734757558c16fe8ae236fee04'/>
<id>urn:sha1:3079026fc17275d734757558c16fe8ae236fee04</id>
<content type='text'>
When using the bundle-uri mechanism with a bundle list containing
multiple interrelated bundles, we encountered a bug where tips from
downloaded bundles were not discovered, thus resulting in rather slow
clones. This was particularly problematic when employing the
"creationTokens" heuristic.

To reproduce this issue, consider a repository with a single branch
"main" pointing to commit "A". Firstly, create a base bundle with:

  git bundle create base.bundle main

Then, add a new commit "B" on top of "A", and create an incremental
bundle for "main":

  git bundle create incr.bundle A..main

Now, generate a bundle list with the following content:

  [bundle]
      version = 1
      mode = all
      heuristic = creationToken

  [bundle "base"]
      uri = base.bundle
      creationToken = 1

  [bundle "incr"]
      uri = incr.bundle
      creationToken = 2

A fresh clone with the bundle list above should result in a reference
"refs/bundles/main" pointing to "B" in the new repository. However, git
would still download everything from the server, as if it had fetched
nothing locally.

So why the "refs/bundles/main" is not discovered? After some digging I
found that:

1. Bundles in bundle list are downloaded to local files via
   `bundle-uri.c:download_bundle_list` or via
   `bundle-uri.c:fetch_bundles_by_token` for the "creationToken"
   heuristic.
2. Each bundle is unbundled via `bundle-uri.c:unbundle_from_file`, which
   is called by `bundle-uri.c:unbundle_all_bundles` or called within
   `bundle-uri.c:fetch_bundles_by_token` for the "creationToken"
   heuristic.
3. To get all prerequisites of the bundle, the bundle header is read
   inside `bundle-uri.c:unbundle_from_file` to by calling
   `bundle.c:read_bundle_header`.
4. Then it calls `bundle.c:unbundle`, which calls
   `bundle.c:verify_bundle` to ensure the repository contains all the
   prerequisites.
5. `bundle.c:verify_bundle` calls `parse_object`, which eventually
   invokes `packfile.c:prepare_packed_git` or
   `packfile.c:reprepare_packed_git`, filling
   `raw_object_store-&gt;packed_git` and setting `packed_git_initialized`.
6. If `bundle.c:unbundle` succeeds, it writes refs via
   `refs.c:refs_update_ref` with `REF_SKIP_OID_VERIFICATION` set. Here
   bundle refs which can target arbitrary objects are written to the
   repository.
7. Finally, in `fetch-pack.c:do_fetch_pack_v2`, the functions
   `fetch-pack.c:mark_complete_and_common_ref` and
   `fetch-pack.c:mark_tips` are called with `OBJECT_INFO_QUICK` set to
   find local tips for negotiation. The `OBJECT_INFO_QUICK` flag
   prevents `packfile.c:reprepare_packed_git` from being called,
   resulting in failures to parse OIDs that reside only in the latest
   bundle.

In the example above, when unbunding "incr.bundle", "base.pack" is added
to `packed_git` due to prerequisites verification. However, "B" cannot
be found for negotiation because it exists in "incr.pack", which is not
included in `packed_git`.

Fix the bug by removing `REF_SKIP_OID_VERIFICATION` flag when writing
bundle refs. When `refs.c:refs_update_ref` is called to write the
corresponding bundle refs, it triggers `refs.c:ref_transaction_commit`.
This, in turn, invokes `refs.c:ref_transaction_prepare`, which calls
`transaction_prepare` of the refs storage backend. For files backend, it
is `files-backend.c:files_transaction_prepare`, and for reftable
backend, it is `reftable-backend.c:reftable_be_transaction_prepare`.
Both functions eventually call `object.c:parse_object`, which can invoke
`packfile.c:reprepare_packed_git` to refresh `packed_git`. This ensures
that bundle refs point to valid objects and that all tips from bundle
refs are correctly parsed during subsequent negotiations.

A set of negotiation-related tests for cloning with bundle-uri has been
included to demonstrate that downloaded bundles are utilized to
accelerate fetching.

Additionally, another test has been added to show that bundles with
incorrect headers, where refs point to non-existent objects, do not
result in any bundle refs being created in the repository.

Reviewed-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Reviewed-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Xing Xin &lt;xingxin.xx@bytedance.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>global: introduce `USE_THE_REPOSITORY_VARIABLE` macro</title>
<updated>2024-06-14T17:26:33Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-06-14T06:50:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e7da9385708accf518a80a1e17969020fb361048'/>
<id>urn:sha1:e7da9385708accf518a80a1e17969020fb361048</id>
<content type='text'>
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.

It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.

Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit

For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).

Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>cocci: apply rules to rewrite callers of "refs" interfaces</title>
<updated>2024-05-07T17:06:59Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-05-07T07:11:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2e5c4758b75f7cbae612c89b177aa045fa4f9c68'/>
<id>urn:sha1:2e5c4758b75f7cbae612c89b177aa045fa4f9c68</id>
<content type='text'>
Apply the rules that rewrite callers of "refs" interfaces to explicitly
pass `struct ref_store`. The resulting patch has been applied with the
`--whitespace=fix` option.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>treewide: remove unnecessary includes in source files</title>
<updated>2023-12-26T20:04:31Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-12-23T17:14:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=eea0e59ffbed6e33d171ace5be13cde9faa41639'/>
<id>urn:sha1:eea0e59ffbed6e33d171ace5be13cde9faa41639</id>
<content type='text'>
Each of these were checked with
   gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).

...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file.  These cases were:
  * builtin/credential-cache.c
  * builtin/pull.c
  * builtin/send-pack.c

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
