<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/list-objects-filter-options.h, branch v2.32.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.32.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.32.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2021-04-19T21:09:11Z</updated>
<entry>
<title>list-objects: implement object type filter</title>
<updated>2021-04-19T21:09:11Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2021-04-19T11:46:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b0c42a53c9d36ea69f4d2650001f05e98eb347cb'/>
<id>urn:sha1:b0c42a53c9d36ea69f4d2650001f05e98eb347cb</id>
<content type='text'>
While it already is possible to filter objects by some criteria in
git-rev-list(1), it is not yet possible to filter out only a specific
type of objects. This makes some filters less useful. The `blob:limit`
filter for example filters blobs such that only those which are smaller
than the given limit are returned. But it is unfit to ask only for these
smallish blobs, given that git-rev-list(1) will continue to print tags,
commits and trees.

Now that we have the infrastructure in place to also filter tags and
commits, we can improve this situation by implementing a new filter
which selects objects based on their type. Above query can thus
trivially be implemented with the following command:

    $ git rev-list --objects --filter=object:type=blob \
        --filter=blob:limit=200

Furthermore, this filter allows to optimize for certain other cases: if
for example only tags or commits have been selected, there is no need to
walk down trees.

The new filter is not yet supported in bitmaps. This is going to be
implemented in a subsequent commit.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>list_objects_filter_options: introduce 'list_object_filter_config_name'</title>
<updated>2020-08-04T01:03:24Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2020-07-31T20:26:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b9ea214795b89e0455bbe92fab5d30d1f3a4cc6d'/>
<id>urn:sha1:b9ea214795b89e0455bbe92fab5d30d1f3a4cc6d</id>
<content type='text'>
In a subsequent commit, we will add configuration options that are
specific to each kind of object filter, in which case it is handy to
have a function that translates between 'enum
list_objects_filter_choice' and an appropriate configuration-friendly
string.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Acked-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Use OPT_CALLBACK and OPT_CALLBACK_F</title>
<updated>2020-04-28T17:47:10Z</updated>
<author>
<name>Denton Liu</name>
<email>liu.denton@gmail.com</email>
</author>
<published>2020-04-28T08:36:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=203c85339fb51bb8b83aae8f0adde44d6e55e018'/>
<id>urn:sha1:203c85339fb51bb8b83aae8f0adde44d6e55e018</id>
<content type='text'>
In the codebase, there are many options which use OPTION_CALLBACK in a
plain ol' struct definition. However, we have the OPT_CALLBACK and
OPT_CALLBACK_F macros which are meant to abstract these plain struct
definitions away. These macros are useful as they semantically signal to
developers that these are just normal callback option with nothing fancy
happening.

Replace plain struct definitions of OPTION_CALLBACK with OPT_CALLBACK or
OPT_CALLBACK_F where applicable. The heavy lifting was done using the
following (disgusting) shell script:

	#!/bin/sh

	do_replacement () {
		tr '\n' '\r' |
			sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\s*0,\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK(\1,\2,\3,\4,\5,\6)/g' |
			sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK_F(\1,\2,\3,\4,\5,\6,\7)/g' |
			tr '\r' '\n'
	}

	for f in $(git ls-files \*.c)
	do
		do_replacement &lt;"$f" &gt;"$f.tmp"
		mv "$f.tmp" "$f"
	done

The result was manually inspected and then reformatted to match the
style of the surrounding code. Finally, using
`git grep OPTION_CALLBACK \*.c`, leftover results which were not handled
by the script were manually transformed.

Signed-off-by: Denton Liu &lt;liu.denton@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/partial-clone-sparse-blob'</title>
<updated>2019-10-07T02:32:54Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-10-07T02:32:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ad8f0368b45bf1ab0f1339033d0a62cee94b1ae2'/>
<id>urn:sha1:ad8f0368b45bf1ab0f1339033d0a62cee94b1ae2</id>
<content type='text'>
The name of the blob object that stores the filter specification
for sparse cloning/fetching was interpreted in a wrong place in the
code, causing Git to abort.

* jk/partial-clone-sparse-blob:
  list-objects-filter: use empty string instead of NULL for sparse "base"
  list-objects-filter: give a more specific error sparse parsing error
  list-objects-filter: delay parsing of sparse oid
  t5616: test cloning/fetching with sparse:oid=&lt;oid&gt; filter
</content>
</entry>
<entry>
<title>Merge branch 'md/list-objects-filter-combo'</title>
<updated>2019-09-18T18:50:09Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-09-18T18:50:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=627b82683447e299fc2e20140318c276efbf7de2'/>
<id>urn:sha1:627b82683447e299fc2e20140318c276efbf7de2</id>
<content type='text'>
The list-objects-filter API (used to create a sparse/lazy clone)
learned to take a combined filter specification.

* md/list-objects-filter-combo:
  list-objects-filter-options: make parser void
  list-objects-filter-options: clean up use of ALLOC_GROW
  list-objects-filter-options: allow mult. --filter
  strbuf: give URL-encoding API a char predicate fn
  list-objects-filter-options: make filter_spec a string_list
  list-objects-filter-options: move error check up
  list-objects-filter: implement composite filters
  list-objects-filter-options: always supply *errbuf
  list-objects-filter: put omits set in filter struct
  list-objects-filter: encapsulate filter components
</content>
</entry>
<entry>
<title>list-objects-filter: delay parsing of sparse oid</title>
<updated>2019-09-16T19:47:37Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-09-15T16:12:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=4c96a775945d0299e39b982ab9cb32c5132e877d'/>
<id>urn:sha1:4c96a775945d0299e39b982ab9cb32c5132e877d</id>
<content type='text'>
The list-objects-filter code has two steps to its initialization:

  1. parse_list_objects_filter() makes sure the spec is a filter we know
     about and is syntactically correct. This step is done by "rev-list"
     or "upload-pack" that is going to apply a filter, but also by "git
     clone" or "git fetch" before they send the spec across the wire.

  2. list_objects_filter__init() runs the type-specific initialization
     (using function pointers established in step 1). This happens at
     the start of traverse_commit_list_filtered(), when we're about to
     actually use the filter.

It's a good idea to parse as much as we can in step 1, in order to catch
problems early (e.g., a blob size limit that isn't a number). But one
thing we _shouldn't_ do is resolve any oids at that step (e.g., for
sparse-file contents specified by oid). In the case of a fetch, the oid
has to be resolved on the remote side.

The current code does resolve the oid during the parse phase, but
ignores any error (which we must do, because we might just be sending
the spec across the wire). This leads to two bugs:

  - if we're not in a repository (e.g., because it's git-clone parsing
    the spec), then we trigger a BUG() trying to resolve the name

  - if we did hit the error case, we still have to notice that later and
    bail. The code path in rev-list handles this, but the one in
    upload-pack does not, leading to a segfault.

We can fix both by moving the oid resolution into the sparse-oid init
function. At that point we know we have a repository (because we're
about to traverse), and handling the error there fixes the segfault.

As a bonus, we can drop the NULL sparse_oid_value check in rev-list,
since this is now handled in the sparse-oid-filter init function.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Acked-by: Jeff Hostetler &lt;jeffhost@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>list-objects-filter-options: make parser void</title>
<updated>2019-06-28T15:41:53Z</updated>
<author>
<name>Matthew DeVore</name>
<email>matvore@google.com</email>
</author>
<published>2019-06-27T22:54:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=90d21f9ebf6906f0ebb4fb1b20ec9536072e2916'/>
<id>urn:sha1:90d21f9ebf6906f0ebb4fb1b20ec9536072e2916</id>
<content type='text'>
This function always returns 0, so make it return void instead.

Signed-off-by: Matthew DeVore &lt;matvore@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>list-objects-filter-options: allow mult. --filter</title>
<updated>2019-06-28T15:41:53Z</updated>
<author>
<name>Matthew DeVore</name>
<email>matvore@google.com</email>
</author>
<published>2019-06-27T22:54:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=489fc9ee718b7c8594f17b55f090ac5292c655e1'/>
<id>urn:sha1:489fc9ee718b7c8594f17b55f090ac5292c655e1</id>
<content type='text'>
Allow combining of multiple filters by simply repeating the --filter
flag. Before this patch, the user had to combine them in a single flag
somewhat awkwardly (e.g. --filter=combine:FOO+BAR), including
URL-encoding the individual filters.

To make this work, in the --filter flag parsing callback, rather than
error out when we detect that the filter_options struct is already
populated, we modify it in-place to contain the added sub-filter. The
existing sub-filter becomes the lhs of the combined filter, and the
next sub-filter becomes the rhs. We also have to URL-encode the LHS and
RHS sub-filters.

We can simplify the operation if the LHS is already a combine: filter.
In that case, we just append the URL-encoded RHS sub-filter to the LHS
spec to get the new spec.

Helped-by: Emily Shaffer &lt;emilyshaffer@google.com&gt;
Helped-by: Jeff Hostetler &lt;git@jeffhostetler.com&gt;
Helped-by: Jeff King &lt;peff@peff.net&gt;
Helped-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Matthew DeVore &lt;matvore@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>list-objects-filter-options: make filter_spec a string_list</title>
<updated>2019-06-28T15:41:53Z</updated>
<author>
<name>Matthew DeVore</name>
<email>matvore@google.com</email>
</author>
<published>2019-06-27T22:54:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=cf9ceb5a12cad9c9153d227a0f497d1b522ce085'/>
<id>urn:sha1:cf9ceb5a12cad9c9153d227a0f497d1b522ce085</id>
<content type='text'>
Make the filter_spec string a string_list rather than a raw C string.
The list of strings must be concatted together to make a complete
filter_spec. A future patch will use this capability to build "combine:"
filter specs gradually.

A strbuf would seem to be a more natural choice for this object, but it
unfortunately requires initialization besides just zero'ing out the
memory.  This results in all container structs, and all containers of
those structs, etc., to also require initialization. Initializing them
all would be more cumbersome that simply using a string_list, which
behaves properly when its contents are zero'd.

For the purposes of code simplification, change behavior in how filter
specs are conveyed over the protocol: do not normalize the tree:&lt;depth&gt;
filter specs since there should be no server in existence that supports
tree:# but not tree:#k etc.

Helped-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Matthew DeVore &lt;matvore@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>list-objects-filter: implement composite filters</title>
<updated>2019-06-28T15:41:53Z</updated>
<author>
<name>Matthew DeVore</name>
<email>matvore@google.com</email>
</author>
<published>2019-06-27T22:54:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e987df5fe62b8b29be4cdcdeb3704681ada2b29e'/>
<id>urn:sha1:e987df5fe62b8b29be4cdcdeb3704681ada2b29e</id>
<content type='text'>
Allow combining filters such that only objects accepted by all filters
are shown. The motivation for this is to allow getting directory
listings without also fetching blobs. This can be done by combining
blob:none with tree:&lt;depth&gt;. There are massive repositories that have
larger-than-expected trees - even if you include only a single commit.

A combined filter supports any number of subfilters, and is written in
the following form:

	combine:&lt;filter 1&gt;+&lt;filter 2&gt;+&lt;filter 3&gt;

Certain non-alphanumeric characters in each filter must be
URL-encoded.

For now, combined filters must be specified in this form. In a
subsequent commit, rev-list will support multiple --filter arguments
which will have the same effect as specifying one filter argument
starting with "combine:". The documentation will be updated in that
commit, as the URL-encoding scheme is in general not meant to be used
directly by the user, and it is better to describe the URL-encoding
feature in terms of the repeated flag.

Helped-by: Emily Shaffer &lt;emilyshaffer@google.com&gt;
Helped-by: Jeff Hostetler &lt;git@jeffhostetler.com&gt;
Helped-by: Johannes Schindelin &lt;Johannes.Schindelin@gmx.de&gt;
Helped-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Helped-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Matthew DeVore &lt;matvore@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
