<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/refs/debug.c, branch v2.50.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.50.0</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.50.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2025-04-15T20:50:15Z</updated>
<entry>
<title>Merge branch 'ps/object-wo-the-repository'</title>
<updated>2025-04-15T20:50:15Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-04-15T20:50:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ee847e0034dbfde11f901fbfb74d210c1edad496'/>
<id>urn:sha1:ee847e0034dbfde11f901fbfb74d210c1edad496</id>
<content type='text'>
The object layer has been updated to take an explicit repository
instance as a parameter in more code paths.

* ps/object-wo-the-repository:
  hash: stop depending on `the_repository` in `null_oid()`
  hash: fix "-Wsign-compare" warnings
  object-file: split out logic regarding hash algorithms
  delta-islands: stop depending on `the_repository`
  object-file-convert: stop depending on `the_repository`
  pack-bitmap-write: stop depending on `the_repository`
  pack-revindex: stop depending on `the_repository`
  pack-check: stop depending on `the_repository`
  environment: move access to "core.bigFileThreshold" into repo settings
  pack-write: stop depending on `the_repository` and `the_hash_algo`
  object: stop depending on `the_repository`
  csum-file: stop depending on `the_repository`
</content>
</entry>
<entry>
<title>refs/iterator: provide infrastructure to re-seek iterators</title>
<updated>2025-03-12T18:31:19Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-12T15:56:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=82c39c6055b5340f0e50acbe01a97e51d3907fec'/>
<id>urn:sha1:82c39c6055b5340f0e50acbe01a97e51d3907fec</id>
<content type='text'>
Reftable iterators need to be scrapped after they have either been
exhausted or aren't useful to the caller anymore, and it is explicitly
not possible to reuse them for iterations. But enabling for reuse of
iterators may allow us to tune them by reusing internal state of an
iterator. The reftable iterators for example can already be reused
internally, but we're not able to expose this to any users outside of
the reftable backend.

Introduce a new `.seek` function in the ref iterator vtable that allows
callers to seek an iterator multiple times. It is expected to be
functionally the same as calling `refs_ref_iterator_begin()` with a
different (or the same) prefix.

Note that it is not possible to adjust parameters other than the seeked
prefix for now, so exclude patterns, trimmed prefixes and flags will
remain unchanged. We do not have a usecase for changing these parameters
right now, but if we ever find one we can adapt accordingly.

Implement the callback for trivial cases. The other iterators will be
implemented in subsequent commits.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs/iterator: separate lifecycle from iteration</title>
<updated>2025-03-12T18:31:18Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-12T15:56:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=cec2b6f55a805c010d2acc81abf4cbc41b712130'/>
<id>urn:sha1:cec2b6f55a805c010d2acc81abf4cbc41b712130</id>
<content type='text'>
The ref and reflog iterators have their lifecycle attached to iteration:
once the iterator reaches its end, it is automatically released and the
caller doesn't have to care about that anymore. When the iterator should
be released before it has been exhausted, callers must explicitly abort
the iterator via `ref_iterator_abort()`.

This lifecycle is somewhat unusual in the Git codebase and creates two
problems:

  - Callsites need to be very careful about when exactly they call
    `ref_iterator_abort()`, as calling the function is only valid when
    the iterator itself still is. This leads to somewhat awkward calling
    patterns in some situations.

  - It is impossible to reuse iterators and re-seek them to a different
    prefix. This feature isn't supported by any iterator implementation
    except for the reftable iterators anyway, but if it was implemented
    it would allow us to optimize cases where we need to search for
    specific references repeatedly by reusing internal state.

Detangle the lifecycle from iteration so that we don't deallocate the
iterator anymore once it is exhausted. Instead, callers are now expected
to always call a newly introduce `ref_iterator_free()` function that
deallocates the iterator and its internal state.

Note that the `dir_iterator` is somewhat special because it does not
implement the `ref_iterator` interface, but is only used to implement
other iterators. Consequently, we have to provide `dir_iterator_free()`
instead of `dir_iterator_release()` as the allocated structure itself is
managed by the `dir_iterator` interfaces, as well, and not freed by
`ref_iterator_free()` like in all the other cases.

While at it, drop the return value of `ref_iterator_abort()`, which
wasn't really required by any of the iterator implementations anyway.
Furthermore, stop calling `base_ref_iterator_free()` in any of the
backends, but instead call it in `ref_iterator_free()`.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hash: stop depending on `the_repository` in `null_oid()`</title>
<updated>2025-03-10T20:16:20Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-10T07:13:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7d70b29c4f0b2fd3c6698956d9fb4026632d9c6e'/>
<id>urn:sha1:7d70b29c4f0b2fd3c6698956d9fb4026632d9c6e</id>
<content type='text'>
The `null_oid()` function returns the object ID that only consists of
zeroes. Naturally, this ID also depends on the hash algorithm used, as
the number of zeroes is different between SHA1 and SHA256. Consequently,
the function returns the hash-algorithm-specific null object ID.

This is currently done by depending on `the_hash_algo`, which implicitly
makes us depend on `the_repository`. Refactor the function to instead
pass in the hash algorithm for which we want to retrieve the null object
ID. Adapt callsites accordingly by passing in `the_repository`, thus
bubbling up the dependency on that global variable by one layer.

There are a couple of trivial exceptions for subsystems that already got
rid of `the_repository`. These subsystems instead use the repository
that is available via the calling context:

  - "builtin/grep.c"
  - "grep.c"
  - "refs/debug.c"

There are also two non-trivial exceptions:

  - "diff-no-index.c": Here we know that we may not have a repository
    initialized at all, so we cannot rely on `the_repository`. Instead,
    we adapt `diff_no_index()` to get a `struct git_hash_algo` as
    parameter. The only caller is located in "builtin/diff.c", where we
    know to call `repo_set_hash_algo()` in case we're running outside of
    a Git repository. Consequently, it is fine to continue passing
    `the_repository-&gt;hash_algo` even in this case.

  - "builtin/ls-files.c": There is an in-flight patch series that drops
    `USE_THE_REPOSITORY_VARIABLE` in this file, which causes a semantic
    conflict because we use `null_oid()` in `show_submodule()`. The
    value is passed to `repo_submodule_init()`, which may use the object
    ID to resolve a tree-ish in the superproject from which we want to
    read the submodule config. As such, the object ID should refer to an
    object in the superproject, and consequently we need to use its hash
    algorithm.

    This means that we could in theory just not bother about this edge
    case at all and just use `the_repository` in "diff-no-index.c". But
    doing so would feel misdesigned.

Remove the `USE_THE_REPOSITORY_VARIABLE` preprocessor define in
"hash.c".

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>global: trivial conversions to fix `-Wsign-compare` warnings</title>
<updated>2024-12-06T11:20:04Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-12-06T10:27:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=80c9e70ebe871f0826bc101142c66ff783405100'/>
<id>urn:sha1:80c9e70ebe871f0826bc101142c66ff783405100</id>
<content type='text'>
We have a bunch of loops which iterate up to an unsigned boundary using
a signed index, which generates warnigs because we compare a signed and
unsigned value in the loop condition. Address these sites for trivial
cases and enable `-Wsign-compare` warnings for these code units.

This patch only adapts those code units where we can drop the
`DISABLE_SIGN_COMPARE_WARNINGS` macro in the same step.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>global: mark code units that generate warnings with `-Wsign-compare`</title>
<updated>2024-12-06T11:20:02Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-12-06T10:27:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=41f43b8243f42b9df2e98be8460646d4c0100ad3'/>
<id>urn:sha1:41f43b8243f42b9df2e98be8460646d4c0100ad3</id>
<content type='text'>
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sj/ref-contents-check'</title>
<updated>2024-12-04T01:14:42Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-12-04T01:14:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=57e81b59f35198afedae18e8363dbffdc96c481d'/>
<id>urn:sha1:57e81b59f35198afedae18e8363dbffdc96c481d</id>
<content type='text'>
"git fsck" learned to issue warnings on "curiously formatted" ref
contents that have always been taken valid but something Git
wouldn't have written itself (e.g., missing terminating end-of-line
after the full object name).

* sj/ref-contents-check:
  ref: add symlink ref content check for files backend
  ref: check whether the target of the symref is a ref
  ref: add basic symref content check for files backend
  ref: add more strict checks for regular refs
  ref: port git-fsck(1) regular refs check for files backend
  ref: support multiple worktrees check for refs
  ref: initialize ref name outside of check functions
  ref: check the full refname instead of basename
  ref: initialize "fsck_ref_report" with zero
</content>
</entry>
<entry>
<title>ref: support multiple worktrees check for refs</title>
<updated>2024-11-20T23:21:32Z</updated>
<author>
<name>shejialuo</name>
<email>shejialuo@gmail.com</email>
</author>
<published>2024-11-20T11:51:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7c78d819e6a14434ae3fa8a2127bb3969376cac1'/>
<id>urn:sha1:7c78d819e6a14434ae3fa8a2127bb3969376cac1</id>
<content type='text'>
We have already set up the infrastructure to check the consistency for
refs, but we do not support multiple worktrees. However, "git-fsck(1)"
will check the refs of worktrees. As we decide to get feature parity
with "git-fsck(1)", we need to set up support for multiple worktrees.

Because each worktree has its own specific refs, instead of just showing
the users "refs/worktree/foo", we need to display the full name such as
"worktrees/&lt;id&gt;/refs/worktree/foo". So we should know the id of the
worktree to get the full name. Add a new parameter "struct worktree *"
for "refs-internal.h::fsck_fn". Then change the related functions to
follow this new interface.

The "packed-refs" only exists in the main worktree, so we should only
check "packed-refs" in the main worktree. Use "is_main_worktree" method
to skip checking "packed-refs" in "packed_fsck" function.

Then, enhance the "files-backend.c::files_fsck_refs_dir" function to add
"worktree/&lt;id&gt;/" prefix when we are not in the main worktree.

Last, add a new test to check the refname when there are multiple
worktrees to exercise the code.

Mentored-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Mentored-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: shejialuo &lt;shejialuo@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: introduce "initial" transaction flag</title>
<updated>2024-11-20T22:59:15Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-11-20T07:51:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1c299d03e5551847533019aa32863d2cbe589c7f'/>
<id>urn:sha1:1c299d03e5551847533019aa32863d2cbe589c7f</id>
<content type='text'>
There are two different ways to commit a transaction:

  - `ref_transaction_commit()` can be used to commit a regular
    transaction and is what almost every caller wants.

  - `initial_ref_transaction_commit()` can be used when it is known that
    the ref store that the transaction is committed for is empty and
    when there are no concurrent processes. This is used when cloning a
    new repository.

Implementing this via two separate functions has a couple of downsides.
First, every reference backend needs to implement a separate callback
even in the case where they don't special-case the initial transaction.
Second, backends are basically forced to reimplement the whole logic for
how to commit the transaction like the "files" backend does, even though
backends may wish to only tweak certain behaviour of a "normal" commit.
Third, it is awkward that callers must never prepare the transaction as
this is somewhat different than how a transaction typically works.

Refactor the code such that we instead mark initial transactions via a
separate flag when starting the transaction. This addresses all of the
mentioned painpoints, where the most important part is that it will
allow backends to have way more leeway in how exactly they want to
handle the initial transaction.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: set up ref consistency check infrastructure</title>
<updated>2024-08-08T16:36:53Z</updated>
<author>
<name>shejialuo</name>
<email>shejialuo@gmail.com</email>
</author>
<published>2024-08-08T11:27:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ab6f79d8df7c7799ed38229eb56811fda36853ae'/>
<id>urn:sha1:ab6f79d8df7c7799ed38229eb56811fda36853ae</id>
<content type='text'>
The "struct ref_store" is the base class which contains the "be" pointer
which provides backend-specific functions whose interfaces are defined
in the "ref_storage_be". We could reuse this polymorphism to define only
one interface. For every backend, we need to provide its own function
pointer.

The interfaces defined in the `ref_storage_be` are carefully structured
in semantic. It's organized as the five parts:

1. The name and the initialization interfaces.
2. The ref transaction interfaces.
3. The ref internal interfaces (pack, rename and copy).
4. The ref filesystem interfaces.
5. The reflog related interfaces.

To keep consistent with the git-fsck(1), add a new interface named
"fsck_refs_fn" to the end of "ref_storage_be". This semantic cannot be
grouped into any above five categories. Explicitly add blank line to
make it different from others.

Last, implement placeholder functions for each ref backends.

Mentored-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Mentored-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: shejialuo &lt;shejialuo@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
