<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/http-backend.c, branch v2.4.9</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.4.9</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.4.9'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2015-05-26T03:43:18Z</updated>
<entry>
<title>http-backend: spool ref negotiation requests to buffer</title>
<updated>2015-05-26T03:43:18Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2015-05-20T07:37:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6bc0cb5176a5e42ca4a74e3558e8f0790ed09bb1'/>
<id>urn:sha1:6bc0cb5176a5e42ca4a74e3558e8f0790ed09bb1</id>
<content type='text'>
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request.  In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.

In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:

  1. Apache proxies the request to the CGI, http-backend.

  2. http-backend gzip-inflates the data and sends
     the result to upload-pack.

  3. upload-pack acts on the data and generates output over
     the pipe back to Apache. Apache isn't reading because
     it's busy writing (step 1).

This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).

We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.

The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:

  1. We limit the in-memory buffer to prevent an obvious
     denial-of-service attack. This is a new hard limit on
     requests, but it's unlikely to come into play. The
     default value is 10MB, which covers even the ridiculous
     100,000-ref negotation in the included test (that
     actually caps out just over 5MB). But it's configurable
     on the off chance that you don't mind spending some
     extra memory to make even ridiculous requests work.

  2. We must take care only to buffer when we have to. For
     pushes, the incoming packfile may be of arbitrary
     size, and we should connect the input directly to
     receive-pack. There's no deadlock problem here, though,
     because we do not produce any output until the whole
     packfile has been read.

     For upload-pack's initial ref advertisement, we
     similarly do not need to buffer. Even though we may
     generate a lot of output, there is no request body at
     all (i.e., it is a GET, not a POST).

[1] http://article.gmane.org/gmane.comp.version-control.git/269020

Test-adapted-from: Dennis Kaarsemaker &lt;dennis@kaarsemaker.net&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>http-backend: fix die recursion with custom handler</title>
<updated>2015-05-15T18:13:47Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2015-05-15T06:29:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7253a02348c1c44c84f8a13309d6d30b443bf23c'/>
<id>urn:sha1:7253a02348c1c44c84f8a13309d6d30b443bf23c</id>
<content type='text'>
When we die() in http-backend, we call a custom handler that
writes an HTTP 500 response to stdout, then reports the
error to stderr. Our routines for writing out the HTTP
response may themselves die, leading to us entering die()
again.

When it was originally written, that was OK; our custom
handler keeps a variable to notice this and does not
recurse. However, since cd163d4 (usage.c: detect recursion
in die routines and bail out immediately, 2012-11-14), the
main die() implementation detects recursion before we even
get to our custom handler, and bails without printing
anything useful.

We can handle this case by doing two things:

  1. Installing a custom die_is_recursing handler that
     allows us to enter up to one level of recursion. Only
     the first call to our custom handler will try to write
     out the error response. So if we die again, that is OK.
     If we end up dying more than that, it is a sign that we
     are in an infinite recursion.

  2. Reporting the error to stderr before trying to write
     out the HTTP response. In the current code, if we do
     die() trying to write out the response, we'll exit
     immediately from this second die(), and never get a
     chance to output the original error (which is almost
     certainly the more interesting one; the second die is
     just going to be along the lines of "I tried to write
     to stdout but it was closed").

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'rs/run-command-env-array'</title>
<updated>2014-10-24T21:57:54Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2014-10-24T21:57:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=217610d7d61864f24efc0ea837dc911be44fd9c6'/>
<id>urn:sha1:217610d7d61864f24efc0ea837dc911be44fd9c6</id>
<content type='text'>
Add managed "env" array to child_process to clarify the lifetime
rules.

* rs/run-command-env-array:
  use env_array member of struct child_process
  run-command: add env_array, an optional argv_array for env
</content>
</entry>
<entry>
<title>use env_array member of struct child_process</title>
<updated>2014-10-19T22:26:34Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2014-10-19T11:14:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a915459097b72da9cc058172a54029352b684b0f'/>
<id>urn:sha1:a915459097b72da9cc058172a54029352b684b0f</id>
<content type='text'>
Convert users of struct child_process to using the managed env_array for
specifying environment variables instead of supplying an array on the
stack or bringing their own argv_array.  This shortens and simplifies
the code and ensures automatically that the allocated memory is freed
after use.

Signed-off-by: Rene Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs.c: change resolve_ref_unsafe reading argument to be a flags field</title>
<updated>2014-10-15T17:47:24Z</updated>
<author>
<name>Ronnie Sahlberg</name>
<email>sahlberg@google.com</email>
</author>
<published>2014-07-15T19:59:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7695d118e5a3c9c6fcb4cb15eb766a1c57422aed'/>
<id>urn:sha1:7695d118e5a3c9c6fcb4cb15eb766a1c57422aed</id>
<content type='text'>
resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref
resolves successfully for writing but not for reading).  Change this to be
a flags field instead, and pass the new constant RESOLVE_REF_READING when
we want this behaviour.

While at it, swap two of the arguments in the function to put output
arguments at the end.  As a nice side effect, this ensures that we can
catch callers that were unaware of the new API so they can be audited.

Give the wrapper functions resolve_refdup and read_ref_full the same
treatment for consistency.

Signed-off-by: Ronnie Sahlberg &lt;sahlberg@google.com&gt;
Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'rs/child-process-init'</title>
<updated>2014-09-11T17:33:27Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2014-09-11T17:33:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=825fd93767510895a98ad7f12e3c1af3e40e367b'/>
<id>urn:sha1:825fd93767510895a98ad7f12e3c1af3e40e367b</id>
<content type='text'>
Code clean-up.

* rs/child-process-init:
  run-command: inline prepare_run_command_v_opt()
  run-command: call run_command_v_opt_cd_env() instead of duplicating it
  run-command: introduce child_process_init()
  run-command: introduce CHILD_PROCESS_INIT
</content>
</entry>
<entry>
<title>run-command: introduce CHILD_PROCESS_INIT</title>
<updated>2014-08-20T16:53:37Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2014-08-19T19:09:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d3180279322c7450a47decf8833de47f444ca93f'/>
<id>urn:sha1:d3180279322c7450a47decf8833de47f444ca93f</id>
<content type='text'>
Most struct child_process variables are cleared using memset first after
declaration.  Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead.  That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).

Helped-by: Johannes Sixt &lt;j6t@kdbg.org&gt;
Signed-off-by: Rene Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>http-backend.c: replace `git_config()` with `git_config_get_bool()` family</title>
<updated>2014-08-07T20:33:26Z</updated>
<author>
<name>Tanay Abhra</name>
<email>tanayabh@gmail.com</email>
</author>
<published>2014-08-07T16:21:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6881f0ccb4365d1beb1673f740952064d204bd50'/>
<id>urn:sha1:6881f0ccb4365d1beb1673f740952064d204bd50</id>
<content type='text'>
Use `git_config_get_bool()` family instead of `git_config()` to take advantage of
the config-set API which provides a cleaner control flow.

Signed-off-by: Tanay Abhra &lt;tanayabh@gmail.com&gt;
Reviewed-by: Matthieu Moy &lt;Matthieu.Moy@imag.fr&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'maint'</title>
<updated>2014-07-21T19:35:39Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2014-07-21T19:35:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9ab08822556c49a7856dadd0e9a42f9ec2aaf850'/>
<id>urn:sha1:9ab08822556c49a7856dadd0e9a42f9ec2aaf850</id>
<content type='text'>
* maint:
  use xmemdupz() to allocate copies of strings given by start and length
  use xcalloc() to allocate zero-initialized memory
</content>
</entry>
<entry>
<title>use xmemdupz() to allocate copies of strings given by start and length</title>
<updated>2014-07-21T17:37:02Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2014-07-19T15:35:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5c0b13f85ab3a5326508b854768eb70c8829cda4'/>
<id>urn:sha1:5c0b13f85ab3a5326508b854768eb70c8829cda4</id>
<content type='text'>
Use xmemdupz() to allocate the memory, copy the data and make sure to
NUL-terminate the result, all in one step.  The resulting code is
shorter, doesn't contain the constants 1 and '\0', and avoids
duplicating function parameters.

For blame, the last copied byte (o-&gt;file.ptr[o-&gt;file.size]) is always
set to NUL by fake_working_tree_commit() or read_sha1_file(), so no
information is lost by the conversion to using xmemdupz().

Signed-off-by: Rene Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
