git/refs.h, branch v2.32.2

Merge branch 'tb/ls-refs-optim'

2021-02-06T00:40:45Z

The ls-refs protocol operation has been optimized to narrow the sub-hierarchy of refs/ it walks to produce response. * tb/ls-refs-optim: ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets ls-refs.c: initialize 'prefixes' before using it refs: expose 'for_each_fullref_in_prefixes'

refs: expose 'for_each_fullref_in_prefixes'

2021-01-23T02:57:27Z

This function was used in the ref-filter.c code to find the longest common prefix of among a set of refspecs, and then to iterate all of the references that descend from that prefix. A future patch will want to use that same code from ls-refs.c, so prepare by exposing and moving it to refs.c. Since there is nothing specific to the ref-filter code here (other than that it was previously the only caller of this function), this really belongs in the more generic refs.h header. The code moved in this patch is identical before and after, with the one exception of renaming some arguments to be consistent with other functions exposed in refs.h. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano

refs: switch peel_ref() to peel_iterated_oid()

2021-01-21T23:51:31Z

The peel_ref() interface is confusing and error-prone: - it's typically used by ref iteration callbacks that have both a refname and oid. But since they pass only the refname, we may load the ref value from the filesystem again. This is inefficient, but also means we are open to a race if somebody simultaneously updates the ref. E.g., this: int some_ref_cb(const char *refname, const struct object_id *oid, ...) { if (!peel_ref(refname, &peeled)) printf("%s peels to %s", oid_to_hex(oid), oid_to_hex(&peeled); } could print nonsense. It is correct to say "refname peels to..." (you may see the "before" value or the "after" value, either of which is consistent), but mentioning both oids may be mixing before/after values. Worse, whether this is possible depends on whether the optimization to read from the current iterator value kicks in. So it is actually not possible with: for_each_ref(some_ref_cb); but it _is_ possible with: head_ref(some_ref_cb); which does not use the iterator mechanism (though in practice, HEAD should never peel to anything, so this may not be triggerable). - it must take a fully-qualified refname for the read_ref_full() code path to work. Yet we routinely pass it partial refnames from callbacks to for_each_tag_ref(), etc. This happens to work when iterating because there we do not call read_ref_full() at all, and only use the passed refname to check if it is the same as the iterator. But the requirements for the function parameters are quite unclear. Instead of taking a refname, let's instead take an oid. That fixes both problems. It's a little funny for a "ref" function not to involve refs at all. The key thing is that it's optimizing under the hood based on having access to the ref iterator. So let's change the name to make it clear why you'd want this function versus just peel_object(). There are two other directions I considered but rejected: - we could pass the peel information into the each_ref_fn callback. However, we don't know if the caller actually wants it or not. For packed-refs, providing it is essentially free. But for loose refs, we actually have to peel the object, which would be wasteful in most cases. We could likewise pass in a flag to the callback indicating whether the peeled information is known, but that complicates those callbacks, as they then have to decide whether to manually peel themselves. Plus it requires changing the interface of every callback, whether they care about peeling or not, and there are many of them. - we could make a function to return the peeled value of the current iterated ref (computing it if necessary), and BUG() otherwise. I.e.: int peel_current_iterated_ref(struct object_id *out); Each of the current callers is an each_ref_fn callback, so they'd mostly be happy. But: - we use those callbacks with functions like head_ref(), which do not use the iteration code. So we'd need to handle the fallback case there, anyway. - it's possible that a caller would want to call into generic code that sometimes is used during iteration and sometimes not. This encapsulates the logic to do the fast thing when possible, and fallback when necessary. The implementation is mostly obvious, but I want to call out a few things in the patch: - the test-tool coverage for peel_ref() is now meaningless, as it all collapses to a single peel_object() call (arguably they were pretty uninteresting before; the tricky part of that function is the fast-path we see during iteration, but these calls didn't trigger that). I've just dropped it entirely, though note that some other tests relied on the tags we created; I've moved that creation to the tests where it matters. - we no longer need to take a ref_store parameter, since we'd never look up a ref now. We do still rely on a global "current iterator" variable which _could_ be kept per-ref-store. But in practice this is only useful if there are multiple recursive iterations, at which point the more appropriate solution is probably a stack of iterators. No caller used the actual ref-store parameter anyway (they all call the wrapper that passes the_repository). - the original only kicked in the optimization when the "refname" pointer matched (i.e., not string comparison). We do likewise with the "oid" parameter here, but fall back to doing an actual oideq() call. This in theory lets us kick in the optimization more often, though in practice no current caller cares. It should never be wrong, though (peeling is a property of an object, so two refs pointing to the same object would peel identically). - the original took care not to touch the peeled out-parameter unless we found something to put in it. But no caller cares about this, and anyway, it is enforced by peel_object() itself (and even in the optimized iterator case, that's where we eventually end up). We can shorten the code and avoid an extra copy by just passing the out-parameter through the stack. Signed-off-by: Jeff King Reviewed-by: Taylor Blau Signed-off-by: Junio C Hamano

get_default_branch_name(): prepare for showing some advice

2020-12-13T23:53:50Z

We are about to introduce a message giving users running `git init` some advice about `init.defaultBranch`. This will necessarily be done in `repo_default_branch_name()`. Not all code paths want to show that advice, though. In particular, the `git clone` codepath _specifically_ asks for `init_db()` to be quiet, via the `INIT_DB_QUIET` flag. In preparation for showing users above-mentioned advice, let's change the function signature of `get_default_branch_name()` to accept the parameter `quiet`. Signed-off-by: Johannes Schindelin Signed-off-by: Junio C Hamano

Merge branch 'jt/interpret-branch-name-fallback'

2020-09-09T20:53:09Z

"git status" has trouble showing where it came from by interpreting reflog entries that recordcertain events, e.g. "checkout @{u}", and gives a hard/fatal error. Even though it inherently is impossible to give a correct answer because the reflog entries lose some information (e.g. "@{u}" does not record what branch the user was on hence which branch 'the upstream' needs to be computed, and even if the record were available, the relationship between branches may have changed), at least hide the error to allow "status" show its output. * jt/interpret-branch-name-fallback: wt-status: tolerate dangling marks refs: move dwim_ref() to header file sha1-name: replace unsigned int with option struct

wt-status: tolerate dangling marks

2020-09-02T21:39:25Z

When a user checks out the upstream branch of HEAD, the upstream branch not being a local branch, and then runs "git status", like this: git clone $URL client cd client git checkout @{u} git status no status is printed, but instead an error message: fatal: HEAD does not point to a branch (This error message when running "git branch" persists even after checking out other things - it only stops after checking out a branch.) This is because "git status" reads the reflog when determining the "HEAD detached" message, and thus attempts to DWIM "@{u}", but that doesn't work because HEAD no longer points to a branch. Therefore, when calculating the status of a worktree, tolerate dangling marks. This is done by adding an additional parameter to dwim_ref() and repo_dwim_ref(). Signed-off-by: Jonathan Tan Signed-off-by: Junio C Hamano

refs: move dwim_ref() to header file

2020-09-02T21:39:17Z

This makes it clear that dwim_ref() is just repo_dwim_ref() without the first parameter. Signed-off-by: Jonathan Tan Signed-off-by: Junio C Hamano

refs: make refs_ref_exists public

2020-08-21T18:20:10Z

This will be necessary to replace file existence checks for pseudorefs. Signed-off-by: Han-Wen Nienhuys Signed-off-by: Junio C Hamano

argv-array: rename to strvec

2020-07-28T22:02:17Z

The name "argv-array" isn't very good, because it describes what the data type can be used for (program argument arrays), not what it actually is (a dynamically-growing string array that maintains a NULL-terminator invariant). This leads to people being hesitant to use it for other cases where it would actually be a good fit. The existing name is also clunky to use. It's overly long, and the name often leads to saying things like "argv.argv" (i.e., the field names overlap with variable names, since they're describing the use, not the type). Let's give it a more neutral name. I settled on "strvec" because "vector" is the name for a dynamic array type in many programming languages. "strarray" would work, too, but it's longer and a bit more awkward to say (and don't we all say these things in our mind as we type them?). A more extreme direction would be a generic data structure which stores a NULL-terminated of _any_ type. That would be easy to do with void pointers, but we'd lose some type safety for the existing cases. Plus it raises questions about memory allocation and ownership. So I limited myself here to changing names only, and not semantics. If we do find a use for that more generic data type, we could perhaps implement it at a lower level and then provide type-safe wrappers around it for strings. But that can come later. This patch does the minimum to convert the struct and function names in the header and implementation, leaving a few things for follow-on patches: - files retain their original names for now - struct field names are retained for now - there's a preprocessor compat layer that lets most users remain the same for now. The exception is headers which made a manual forward declaration of the struct. I've converted them (and their dependent function declarations) here. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano

Merge branch 'js/default-branch-name'

2020-07-07T05:09:17Z

The name of the primary branch in existing repositories, and the default name used for the first branch in newly created repositories, is made configurable, so that we can eventually wean ourselves off of the hardcoded 'master'. * js/default-branch-name: contrib: subtree: adjust test to change in fmt-merge-msg testsvn: respect `init.defaultBranch` remote: use the configured default branch name when appropriate clone: use configured default branch name when appropriate init: allow setting the default for the initial branch name via the config init: allow specifying the initial branch name for the new repository docs: add missing diamond brackets submodule: fall back to remote's HEAD for missing remote..branch send-pack/transport-helper: avoid mentioning a particular branch fmt-merge-msg: stop treating `master` specially