<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/http-walker.c, branch v2.16.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.16.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.16.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2017-08-23T22:12:07Z</updated>
<entry>
<title>pack: move find_sha1_pack()</title>
<updated>2017-08-23T22:12:07Z</updated>
<author>
<name>Jonathan Tan</name>
<email>jonathantanmy@google.com</email>
</author>
<published>2017-08-18T22:20:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d6fe0036fd5e0cf7f51aa84381ebd321e898350a'/>
<id>urn:sha1:d6fe0036fd5e0cf7f51aa84381ebd321e898350a</id>
<content type='text'>
Signed-off-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/http-walker-buffer-underflow-fix'</title>
<updated>2017-03-17T20:50:26Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2017-03-17T20:50:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2af882be0181ee29aa3a4fb3181b94e9bca53c51'/>
<id>urn:sha1:2af882be0181ee29aa3a4fb3181b94e9bca53c51</id>
<content type='text'>
"Dumb http" transport used to misparse a nonsense http-alternates
response, which has been fixed.

* jk/http-walker-buffer-underflow-fix:
  http-walker: fix buffer underflow processing remote alternates
</content>
</entry>
<entry>
<title>http-walker: fix buffer underflow processing remote alternates</title>
<updated>2017-03-13T17:20:29Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2017-03-13T14:04:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d61434ae813cc86a1a87d05cc61e36e87b0e20a9'/>
<id>urn:sha1:d61434ae813cc86a1a87d05cc61e36e87b0e20a9</id>
<content type='text'>
If we parse a remote alternates (or http-alternates), we
expect relative lines like:

  ../../foo.git/objects

which we convert into "$URL/../foo.git/" (and then use that
as a base for fetching more objects).

But if the remote feeds us nonsense like just:

  ../

we will try to blindly strip the last 7 characters, assuming
they contain the string "objects". Since we don't _have_ 7
characters at all, this results in feeding a small negative
value to strbuf_add(), which converts it to a size_t,
resulting in a big positive value. This should consistently
fail (since we can't generally allocate the max size_t minus
7 bytes), so there shouldn't be any security implications.

Let's fix this by using strbuf_strip_suffix() to drop the
characters we want. If they're not present, we'll ignore the
alternate (in theory we could use it as-is, but the rest of
the http-walker code unconditionally tacks "objects/" back
on, so it is not prepared to handle such a case).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>http: release strbuf on disabled alternates</title>
<updated>2017-03-06T18:52:57Z</updated>
<author>
<name>Eric Wong</name>
<email>e@80x24.org</email>
</author>
<published>2017-03-04T01:50:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5cae73d5d23ec109322948478a957e77d2733f94'/>
<id>urn:sha1:5cae73d5d23ec109322948478a957e77d2733f94</id>
<content type='text'>
This likely has no real-world impact on memory usage,
but it is cleaner for future readers.

Fixes: abcbdc03895f ("http: respect protocol.*.allow=user for http-alternates")
Signed-off-by: Eric Wong &lt;e@80x24.org&gt;
Reviewed-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>http: inform about alternates-as-redirects behavior</title>
<updated>2017-03-06T18:52:15Z</updated>
<author>
<name>Eric Wong</name>
<email>e@80x24.org</email>
</author>
<published>2017-03-04T08:36:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=de46138826cbc8229e25691e613e09ae1f683235'/>
<id>urn:sha1:de46138826cbc8229e25691e613e09ae1f683235</id>
<content type='text'>
It is disconcerting for users to not notice the behavior
change in handling alternates from commit cb4d2d35c4622ec2
("http: treat http-alternates like redirects").

Give the user a hint about the config option so they can
see the URL and decide whether or not they want to enable
http.followRedirects in their config.

Signed-off-by: Eric Wong &lt;e@80x24.org&gt;
Reviewed-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>http: respect protocol.*.allow=user for http-alternates</title>
<updated>2016-12-15T17:29:13Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2016-12-14T22:39:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=abcbdc03895ff3f00280e54af11fee92d6877044'/>
<id>urn:sha1:abcbdc03895ff3f00280e54af11fee92d6877044</id>
<content type='text'>
The http-walker may fetch the http-alternates (or
alternates) file from a remote in order to find more
objects. This should count as a "not from the user" use of
the protocol. But because we implement the redirection
ourselves and feed the new URL to curl, it will use the
CURLOPT_PROTOCOLS rules, not the more restrictive
CURLOPT_REDIR_PROTOCOLS.

The ideal solution would be for each curl request we make to
know whether or not it is directly from the user or part of an
alternates redirect, and then set CURLOPT_PROTOCOLS as
appropriate. However, that would require plumbing that
information through all of the various layers of the http
code.

Instead, let's check the protocol at the source: when we are
parsing the remote http-alternates file. The only downside
is that if there's any mismatch between what protocol we
think it is versus what curl thinks it is, it could violate
the policy.

To address this, we'll make the parsing err on the picky
side, and only allow protocols that it can parse
definitively. So for example, you can't elude the "http"
policy by asking for "HTTP://", even though curl might
handle it; we would reject it as unknown. The only unsafe
case would be if you have a URL that starts with "http://"
but curl interprets as another protocol. That seems like an
unlikely failure mode (and we are still protected by our
base CURLOPT_PROTOCOLS setting, so the worst you could do is
trigger one of https, ftp, or ftps).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Brandon Williams &lt;bmwill@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>http-walker: complain about non-404 loose object errors</title>
<updated>2016-12-06T20:43:34Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2016-12-06T18:25:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3680f16f9d6832dc78c6b80f1e8a546385d946f9'/>
<id>urn:sha1:3680f16f9d6832dc78c6b80f1e8a546385d946f9</id>
<content type='text'>
Since commit 17966c0a6 (http: avoid disconnecting on 404s
for loose objects, 2016-07-11), we turn off curl's
FAILONERROR option and instead manually deal with failing
HTTP codes.

However, the logic to do so only recognizes HTTP 404 as a
failure. This is probably the most common result, but if we
were to get another code, the curl result remains CURLE_OK,
and we treat it as success. We still end up detecting the
failure when we try to zlib-inflate the object (which will
fail), but instead of reporting the HTTP error, we just
claim that the object is corrupt.

Instead, let's catch anything in the 300's or above as an
error (300's are redirects which are not an error at the
HTTP level, but are an indication that we've explicitly
disabled redirects, so we should treat them as such; we
certainly don't have the resulting object content).

Note that we also fill in req-&gt;errorstr, which we didn't do
before. Without FAILONERROR, curl will not have filled this
in, and it will remain a blank string. This never mattered
for the 404 case, because in the logic below we hit the
"missing_target()" branch and print nothing. But for other
errors, we'd want to say _something_, if only to fill in the
blank slot in the error message.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ew/http-walker' into jk/http-walker-limit-redirect</title>
<updated>2016-12-06T20:43:23Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2016-12-06T20:40:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=43ec089eea1d8eee125718b0b7a83720b036ae3e'/>
<id>urn:sha1:43ec089eea1d8eee125718b0b7a83720b036ae3e</id>
<content type='text'>
* ew/http-walker:
  list: avoid incompatibility with *BSD sys/queue.h
  http-walker: reduce O(n) ops with doubly-linked list
  http: avoid disconnecting on 404s for loose objects
  http-walker: remove unused parameter from fetch_object
</content>
</entry>
<entry>
<title>http: treat http-alternates like redirects</title>
<updated>2016-12-06T20:32:48Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2016-12-06T18:24:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=cb4d2d35c4622ec2513c1c352d30ff8f9f9cdb9e'/>
<id>urn:sha1:cb4d2d35c4622ec2513c1c352d30ff8f9f9cdb9e</id>
<content type='text'>
The previous commit made HTTP redirects more obvious and
tightened up the default behavior. However, there's another
way for a server to ask a git client to fetch arbitrary
content: by having an http-alternates file (or a regular
alternates file, which is used as a backup).

Similar to the HTTP redirect case, a malicious server can
claim to have refs pointing at object X, return a 404 when
the client asks for X, but point to some other URL via
http-alternates, which the client will transparently fetch.
The end result is that it looks from the user's perspective
like the objects came from the malicious server, as the
other URL is not mentioned at all.

Worse, because we feed the new URL to curl ourselves, the
usual protocol restrictions do not kick in (neither curl's
default of disallowing file://, nor the protocol
whitelisting in f4113cac0 (http: limit redirection to
protocol-whitelist, 2015-09-22)).

Let's apply the same rules here as we do for HTTP redirects.
Namely:

  - unless http.followRedirects is set to "always", we will
    not follow remote redirects from http-alternates (or
    alternates) at all

  - set CURLOPT_PROTOCOLS alongside CURLOPT_REDIR_PROTOCOLS to
    restrict ourselves to a known-safe set and respect any
    user-provided whitelist.

  - mention alternate object stores on stderr so that the
    user is aware another source of objects may be involved

The first item may prove to be too restrictive. The most
common use of alternates is to point to another path on the
same server. While it's possible for a single-server
redirect to be an attack, it takes a fairly obscure setup
(victim and evil repository on the same host, host speaks
dumb http, and evil repository has access to edit its own
http-alternates file).

So we could make the checks more specific, and only cover
cross-server redirects. But that means parsing the URLs
ourselves, rather than letting curl handle them. This patch
goes for the simpler approach. Given that they are only used
with dumb http, http-alternates are probably pretty rare.
And there's an escape hatch: the user can allow redirects on
a specific server by setting http.&lt;url&gt;.followRedirects to
"always".

Reported-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>http-walker: reduce O(n) ops with doubly-linked list</title>
<updated>2016-07-12T22:17:42Z</updated>
<author>
<name>Eric Wong</name>
<email>e@80x24.org</email>
</author>
<published>2016-07-11T20:51:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=94e99012fc7a02c5504214294279fa49b4cc8ce3'/>
<id>urn:sha1:94e99012fc7a02c5504214294279fa49b4cc8ce3</id>
<content type='text'>
Using a Linux-kernel-derived doubly-linked list
implementation from the Userspace RCU library allows us to
enqueue and delete items from the object request queue in
constant time.

This change reduces enqueue times in the prefetch() function
where the object request queue could grow to several thousand
objects.

I left out the list_for_each_entry* family macros from list.h
which relied on the __typeof__ operator as we support platforms
without it.  Thus, list_entry (aka "container_of") needs to be
called explicitly inside macro-wrapped for loops.

The downside is this costs us an additional pointer per object
request, but this is offset by reduced overhead on queue
operations leading to improved performance and shorter queue
depths.

Signed-off-by: Eric Wong &lt;e@80x24.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
