<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/tree-walk.h, branch v2.40.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.40.3</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.40.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2022-08-10T21:26:25Z</updated>
<entry>
<title>tree-walk: add a mechanism for getting non-canonicalized modes</title>
<updated>2022-08-10T21:26:25Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2022-08-10T21:01:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ec18b10bf20574fc6d60c966412a11c81f9c17e0'/>
<id>urn:sha1:ec18b10bf20574fc6d60c966412a11c81f9c17e0</id>
<content type='text'>
When using init_tree_desc() and tree_entry() to iterate over a tree, we
always canonicalize the modes coming out of the tree. This is a good
thing to prevent bugs or oddities in normal code paths, but it's
counter-productive for tools like fsck that want to see the exact
contents.

We can address this by adding an option to avoid the extra
canonicalization. A few notes on the implementation:

  - I've attached the new option to the tree_desc struct itself. The
    actual code change is in decode_tree_entry(), which is in turn
    called by the public update_tree_entry(), tree_entry(), and
    init_tree_desc() functions, plus their "gently" counterparts.

    By letting it ride along in the struct, we can avoid changing the
    signature of those functions, which are called many times. Plus it's
    conceptually simpler: you really want a particular iteration of a
    tree to be "raw" or not, rather than individual calls.

  - We still have to set the new option somewhere. The struct is
    initialized by init_tree_desc(). I added the new flags field only to
    the "gently" version. That avoids disturbing the much more numerous
    non-gentle callers, and it makes sense that anybody being careful
    about looking at raw modes would also be careful about bogus trees
    (i.e., the caller will be something like fsck in the first place).
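
As a rough illustration of the difference (a Python model, not the C
code touched by this patch; canon_mode() and walk() below are
hypothetical stand-ins, and the canonical forms are assumed to be
100644/100755 for files, 120000 for symlinks, 040000 for directories
and 160000 for gitlinks):

```python
import stat

S_IFGITLINK = 0o160000  # assumed gitlink mode; Python's stat module has no such constant


def canon_mode(mode):
    """Collapse a tree-entry mode to its assumed canonical form."""
    if stat.S_ISREG(mode):
        # Only the owner-executable bit survives for regular files.
        executable = (mode | stat.S_IXUSR) == mode
        return stat.S_IFREG | (0o755 if executable else 0o644)
    if stat.S_ISLNK(mode):
        return stat.S_IFLNK
    if stat.S_ISDIR(mode):
        return stat.S_IFDIR
    return S_IFGITLINK


def walk(entries, raw_modes=False):
    """Yield (mode, name) pairs; raw_modes applies to the whole iteration."""
    for mode, name in entries:
        yield (mode if raw_modes else canon_mode(mode)), name


entries = [(0o100664, "group-writable"), (0o100755, "script"), (0o040000, "dir")]

print("canonicalized:")
for mode, name in walk(entries):
    print(f"  {mode:06o} {name}")

print("raw:")
for mode, name in walk(entries, raw_modes=True):
    print(f"  {mode:06o} {name}")
```

The flag is chosen once per iteration, mirroring the idea of letting
the option ride along in the struct rather than passing it to every
call. Only the raw walk lets a checker see that the first entry was
stored as 100664.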

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tree-walk.c: break circular dependency with unpack-trees</title>
<updated>2020-02-04T18:32:15Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-02-01T11:39:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5290d4513496d89f84570985a0e02e97dff477ff'/>
<id>urn:sha1:5290d4513496d89f84570985a0e02e97dff477ff</id>
<content type='text'>
The unpack-trees API depends on the tree-walk API. But we've recently
introduced a dependency in tree-walk.c on MAX_UNPACK_TREES, even though
tree-walk.c doesn't otherwise care about unpack-trees at all.

Let's break that dependency by reversing the constants: we'll introduce
a new MAX_TRAVERSE_TREES which belongs to the tree-walk API. And then we
can define MAX_UNPACK_TREES in terms of that (since unpack-trees cannot
possibly work with more trees than it can traverse at once via
tree-walk).

The value for both will remain at 8. This is somewhat arbitrary and
probably more than is necessary, per ca885a4fe6 (read-tree() and
unpack_trees(): use consistent limit, 2008-03-13), but there's not
really any pressing need to reduce it.

Suggested-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Acked-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tree-walk: move doc to tree-walk.h</title>
<updated>2019-11-18T06:21:29Z</updated>
<author>
<name>Heba Waly</name>
<email>heba.waly@gmail.com</email>
</author>
<published>2019-11-17T21:04:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=bbcfa3002a6534613d8b74ecac0e19876dbfc2d3'/>
<id>urn:sha1:bbcfa3002a6534613d8b74ecac0e19876dbfc2d3</id>
<content type='text'>
Move the documentation from Documentation/technical/api-tree-walking.txt
to tree-walk.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.

Documentation/technical/api-tree-walking.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.

Signed-off-by: Heba Waly &lt;heba.waly@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/tree-walk-overflow'</title>
<updated>2019-08-22T19:34:10Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-08-22T19:34:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1b01cdbf2e65331879a4668880a191dfac953761'/>
<id>urn:sha1:1b01cdbf2e65331879a4668880a191dfac953761</id>
<content type='text'>
Codepaths to walk tree objects have been audited for integer
overflows and hardened.

* jk/tree-walk-overflow:
  tree-walk: harden make_traverse_path() length computations
  tree-walk: add a strbuf wrapper for make_traverse_path()
  tree-walk: accept a raw length for traverse_path_len()
  tree-walk: use size_t consistently
  tree-walk: drop oid from traverse_info
  setup_traverse_info(): stop copying oid
</content>
</entry>
<entry>
<title>tree-walk: harden make_traverse_path() length computations</title>
<updated>2019-08-01T20:06:52Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-07-31T04:38:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5aa02f98685d78666293149087d3f69b97528cfb'/>
<id>urn:sha1:5aa02f98685d78666293149087d3f69b97528cfb</id>
<content type='text'>
The make_traverse_path() function isn't very careful about checking its
output buffer boundaries. In fact, it doesn't even _know_ the size of
the buffer it's writing to, and just assumes that the caller used
traverse_path_len() correctly. And even then we assume that our
traverse_info.pathlen components are all correct, and just blindly write
into the buffer.

Let's improve this situation a bit:

  - have the caller pass in their allocated buffer length, which we'll
    check against our own computations

  - check for integer underflow as we do our backwards-insertion of
    pathnames into the buffer

  - check that we do not run out of items in our list to traverse before
    we've filled the expected number of bytes

None of these should be triggerable in practice (especially since our
switch to size_t everywhere in a previous commit), but it doesn't hurt
to check our assumptions.
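
The three checks can be sketched as follows. This is a Python model
under stated assumptions, not the C function itself: the signature, the
leaf-first list of names, and the use of ValueError are all made up for
illustration.

```python
def make_traverse_path(buflen, expected_len, names_leaf_first):
    """Build "a/b/c" by filling a buffer backwards from its end.

    buflen           -- size of the buffer the caller claims to have allocated
    expected_len     -- path length the caller computed ahead of time
    names_leaf_first -- path components as bytes, innermost entry first
    """
    # Check 1: the buffer must hold the path plus a terminating byte.
    if expected_len >= buflen:
        raise ValueError("buffer too small for computed path length")

    buf = bytearray(buflen)
    pos = expected_len  # always the end of the name we are about to add
    remaining = iter(names_leaf_first)
    name = next(remaining, None)

    while True:
        # Check 3: components must not run out before the bytes are filled.
        if name is None:
            raise ValueError("ran out of path components")
        # Check 2: stepping back by len(name) must not go below zero.
        if len(name) > pos:
            raise ValueError("computed length does not match components")
        pos -= len(name)
        buf[pos:pos + len(name)] = name
        if pos == 0:
            break
        pos -= 1
        buf[pos] = ord("/")
        name = next(remaining, None)

    return bytes(buf[:expected_len])


print(make_traverse_path(16, 9, [b"c.txt", b"b", b"a"]))
```

Each bogus input trips a different check: a buffer of exactly 9 bytes
fails the first, an expected length of 3 fails the second, and an
expected length of 12 fails the third.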

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tree-walk: add a strbuf wrapper for make_traverse_path()</title>
<updated>2019-08-01T20:06:52Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-07-31T04:38:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c43ab062598d0299ea6e0d115a6018189a7793bf'/>
<id>urn:sha1:c43ab062598d0299ea6e0d115a6018189a7793bf</id>
<content type='text'>
All but one of the callers of make_traverse_path() allocate a new heap
buffer to store the path. Let's give them an easy way to write to a
strbuf, which saves them from computing the length themselves (which is
especially tricky when they want to add to the path). It will also make
it easier for us to change the make_traverse_path() interface in a
future patch to improve its bounds-checking.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tree-walk: accept a raw length for traverse_path_len()</title>
<updated>2019-08-01T20:06:52Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-07-31T04:38:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b3b3cbcbf246b1051ad453bc02e24a89573e2911'/>
<id>urn:sha1:b3b3cbcbf246b1051ad453bc02e24a89573e2911</id>
<content type='text'>
We take a "struct name_entry", but only care about the length of the
path name. Let's just take that length directly, making it easier to use
the function from callers that sometimes do not have a name_entry at
all.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tree-walk: use size_t consistently</title>
<updated>2019-08-01T20:06:40Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-07-31T04:38:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=37806080d7be1ab5b2fa918f6a528652596ea2c1'/>
<id>urn:sha1:37806080d7be1ab5b2fa918f6a528652596ea2c1</id>
<content type='text'>
We store and manipulate the cumulative traverse_info.pathlen as an
"int", which can overflow when we are fed ridiculously long pathnames
(e.g., ones at the edge of 2GB or 4GB, even if the individual tree entry
names are smaller than that). The results can be confusing, though
after some prodding I was not able to use this integer overflow to cause
an under-allocated buffer.

Let's consistently use size_t to generate and store these, and make
sure our addition doesn't overflow.
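
A small Python model of the problem and the fix. It assumes a 64-bit
size_t and a 32-bit int that wraps on overflow; checked_add() and
as_int32() are hypothetical helpers for illustration, not functions
from this patch.

```python
SIZE_MAX = 2**64 - 1  # assumption: size_t is 64 bits wide


def checked_add(a, b):
    """Add two sizes, refusing any result that would not fit in size_t."""
    if b > SIZE_MAX - a:
        raise OverflowError(f"size_t overflow: {a} + {b}")
    return a + b


def as_int32(value):
    """Reinterpret a value as a wrapped signed 32-bit int."""
    return (value + 2**31) % 2**32 - 2**31


parent_pathlen = 2**31 - 10  # cumulative path length just under 2GB
namelen = 100                # the individual entry name is small

# Stored in an int, the cumulative length wraps to a negative number.
print(as_int32(parent_pathlen + namelen))

# Stored in a size_t with a checked addition, it stays correct.
print(checked_add(parent_pathlen, namelen))
```

The point of the checked helper is that an impossible length becomes a
loud failure instead of a silently wrong value.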

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tree-walk: drop oid from traverse_info</title>
<updated>2019-07-31T20:34:25Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-07-31T04:38:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9055384710dd8963b125f4f87c24d8f67d9fa24f'/>
<id>urn:sha1:9055384710dd8963b125f4f87c24d8f67d9fa24f</id>
<content type='text'>
As the previous commit shows, the presence of an oid in each level of
the traverse_info is confusing and ultimately not necessary. Let's drop
it to make it clear that it will not always be set (as well as convince
us that it's unused, and let the compiler catch any merges with other
branches that do add new uses).

Since the oid is part of name_entry, we'll actually stop embedding a
name_entry entirely, and instead just separately hold the pathname, its
length, and the mode.

This makes the resulting code slightly more verbose as we have to pass
those elements around individually. But it also makes it more clear what
each code path is going to use (and in most of the paths, we really only
care about the pathname itself).

A few of these conversions are noisier than they need to be, as they
also take the opportunity to rename "len" to "namelen" for clarity
(especially where we also have "pathlen" or "ce_len" alongside).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tree-walk.c: remove the_repo from get_tree_entry_follow_symlinks()</title>
<updated>2019-06-27T19:45:17Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2019-06-27T09:28:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0dd1f0c3a60161db7908472eff43e948711ce9bd'/>
<id>urn:sha1:0dd1f0c3a60161db7908472eff43e948711ce9bd</id>
<content type='text'>
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
