<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/http-backend.c, branch v2.7.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.7.3</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.7.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2015-11-20T13:02:05Z</updated>
<entry>
<title>Convert struct object to object_id</title>
<updated>2015-11-20T13:02:05Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2015-11-10T02:22:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f2fd0760f62e79609fef7bfd7ecebb002e8e4ced'/>
<id>urn:sha1:f2fd0760f62e79609fef7bfd7ecebb002e8e4ced</id>
<content type='text'>
struct object is one of the major data structures dealing with object
IDs.  Convert it to use struct object_id instead of an unsigned char
array.  Convert get_object_hash to refer to the new member as well.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
</content>
</entry>
<entry>
<title>prefer git_pathdup to git_path in some possibly-dangerous cases</title>
<updated>2015-08-10T22:37:12Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2015-08-10T09:35:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=fcd12db6af118b70b5c15cf5fdd6800eeecc370a'/>
<id>urn:sha1:fcd12db6af118b70b5c15cf5fdd6800eeecc370a</id>
<content type='text'>
Because git_path uses a static buffer that is shared with
calls to git_path, mkpath, etc, it can be dangerous to
assign the result to a variable or pass it to a non-trivial
function. The value may change unexpectedly due to other
calls.

None of the cases changed here has a known bug, but they're
worth converting away from git_path because:

  1. It's easy to use git_pathdup in these cases.

  2. They use constructs (like assignment) that make it
     hard to tell whether they're safe or not.

The extra malloc overhead should be trivial, as an
allocation should be an order of magnitude cheaper than a
system call (which we are clearly about to make, since we
are constructing a filename). The real cost is that we must
remember to free the result.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>convert "enum date_mode" into a struct</title>
<updated>2015-06-29T18:39:07Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2015-06-25T16:55:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a5481a6c9438cbd9c246cfa59ff49c31a0926fb6'/>
<id>urn:sha1:a5481a6c9438cbd9c246cfa59ff49c31a0926fb6</id>
<content type='text'>
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.

Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor.  However, the tricky case is where we use the
enum labels as constants, like:

  show_date(t, tz, DATE_NORMAL);

Ideally we could say:

  show_date(t, tz, &amp;{ DATE_NORMAL });

but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:

  1. Manually add a "struct date_mode d = { DATE_NORMAL }"
     definition to each caller, and pass "&amp;d". This makes
     the callers uglier, because they sometimes do not even
     have their own scope (e.g., they are inside a switch
     statement).

  2. Provide a pre-made global "date_normal" struct that can
     be passed by address. We'd also need "date_rfc2822",
     "date_iso8601", and so forth. But at least the ugliness
     is defined in one place.

  3. Provide a wrapper that generates the correct struct on
     the fly. The big downside is that we end up pointing to
     a single global, which makes our wrapper non-reentrant.
     But show_date is already not reentrant, so it does not
     matter.

This patch implements 3, along with a minor macro to keep
the size of the callers sane.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'bc/object-id'</title>
<updated>2015-06-05T19:17:37Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2015-06-05T19:17:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5455ee0573a22bb793a7083d593ae1ace909cd4c'/>
<id>urn:sha1:5455ee0573a22bb793a7083d593ae1ace909cd4c</id>
<content type='text'>
for_each_ref() callback functions were taught to name the objects
not with "unsigned char sha1[20]" but with "struct object_id".

* bc/object-id: (56 commits)
  struct ref_lock: convert old_sha1 member to object_id
  warn_if_dangling_symref(): convert local variable "junk" to object_id
  each_ref_fn_adapter(): remove adapter
  rev_list_insert_ref(): remove unneeded arguments
  rev_list_insert_ref_oid(): new function, taking an object_oid
  mark_complete(): remove unneeded arguments
  mark_complete_oid(): new function, taking an object_oid
  clear_marks(): rewrite to take an object_id argument
  mark_complete(): rewrite to take an object_id argument
  send_ref(): convert local variable "peeled" to object_id
  upload-pack: rewrite functions to take object_id arguments
  find_symref(): convert local variable "unused" to object_id
  find_symref(): rewrite to take an object_id argument
  write_one_ref(): rewrite to take an object_id argument
  write_refs_to_temp_dir(): convert local variable sha1 to object_id
  submodule: rewrite to take an object_id argument
  shallow: rewrite functions to take object_id arguments
  handle_one_ref(): rewrite to take an object_id argument
  add_info_ref(): rewrite to take an object_id argument
  handle_one_reflog(): rewrite to take an object_id argument
  ...
</content>
</entry>
<entry>
<title>http-backend: spool ref negotiation requests to buffer</title>
<updated>2015-05-26T03:43:18Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2015-05-20T07:37:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6bc0cb5176a5e42ca4a74e3558e8f0790ed09bb1'/>
<id>urn:sha1:6bc0cb5176a5e42ca4a74e3558e8f0790ed09bb1</id>
<content type='text'>
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request.  In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.

In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:

  1. Apache proxies the request to the CGI, http-backend.

  2. http-backend gzip-inflates the data and sends
     the result to upload-pack.

  3. upload-pack acts on the data and generates output over
     the pipe back to Apache. Apache isn't reading because
     it's busy writing (step 1).

This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).

We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.

The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:

  1. We limit the in-memory buffer to prevent an obvious
     denial-of-service attack. This is a new hard limit on
     requests, but it's unlikely to come into play. The
     default value is 10MB, which covers even the ridiculous
     100,000-ref negotation in the included test (that
     actually caps out just over 5MB). But it's configurable
     on the off chance that you don't mind spending some
     extra memory to make even ridiculous requests work.

  2. We must take care only to buffer when we have to. For
     pushes, the incoming packfile may be of arbitrary
     size, and we should connect the input directly to
     receive-pack. There's no deadlock problem here, though,
     because we do not produce any output until the whole
     packfile has been read.

     For upload-pack's initial ref advertisement, we
     similarly do not need to buffer. Even though we may
     generate a lot of output, there is no request body at
     all (i.e., it is a GET, not a POST).

[1] http://article.gmane.org/gmane.comp.version-control.git/269020

Test-adapted-from: Dennis Kaarsemaker &lt;dennis@kaarsemaker.net&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>show_head_ref(): convert local variable "unused" to object_id</title>
<updated>2015-05-25T19:19:33Z</updated>
<author>
<name>Michael Haggerty</name>
<email>mhagger@alum.mit.edu</email>
</author>
<published>2015-05-25T18:38:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5f9cf5abd2f918d374d11a0f468bc2387a3d32c2'/>
<id>urn:sha1:5f9cf5abd2f918d374d11a0f468bc2387a3d32c2</id>
<content type='text'>
Signed-off-by: Michael Haggerty &lt;mhagger@alum.mit.edu&gt;
Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>http-backend: rewrite to take an object_id argument</title>
<updated>2015-05-25T19:19:33Z</updated>
<author>
<name>Michael Haggerty</name>
<email>mhagger@alum.mit.edu</email>
</author>
<published>2015-05-25T18:38:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f72f54210728a3550fa314e99470dcbbef001297'/>
<id>urn:sha1:f72f54210728a3550fa314e99470dcbbef001297</id>
<content type='text'>
Signed-off-by: Michael Haggerty &lt;mhagger@alum.mit.edu&gt;
Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>each_ref_fn: change to take an object_id parameter</title>
<updated>2015-05-25T19:19:27Z</updated>
<author>
<name>Michael Haggerty</name>
<email>mhagger@alum.mit.edu</email>
</author>
<published>2015-05-25T18:38:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2b2a5be394bc67bed86bc009195c664dca740bd6'/>
<id>urn:sha1:2b2a5be394bc67bed86bc009195c664dca740bd6</id>
<content type='text'>
Change typedef each_ref_fn to take a "const struct object_id *oid"
parameter instead of "const unsigned char *sha1".

To aid this transition, implement an adapter that can be used to wrap
old-style functions matching the old typedef, which is now called
"each_ref_sha1_fn"), and make such functions callable via the new
interface. This requires the old function and its cb_data to be
wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be
used as the cb_data for an adapter function, each_ref_fn_adapter().

This is an enormous diff, but most of it consists of simple,
mechanical changes to the sites that call any of the "for_each_ref"
family of functions. Subsequent to this change, the call sites can be
rewritten one by one to use the new interface.

Signed-off-by: Michael Haggerty &lt;mhagger@alum.mit.edu&gt;
Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>http-backend: fix die recursion with custom handler</title>
<updated>2015-05-15T18:13:47Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2015-05-15T06:29:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7253a02348c1c44c84f8a13309d6d30b443bf23c'/>
<id>urn:sha1:7253a02348c1c44c84f8a13309d6d30b443bf23c</id>
<content type='text'>
When we die() in http-backend, we call a custom handler that
writes an HTTP 500 response to stdout, then reports the
error to stderr. Our routines for writing out the HTTP
response may themselves die, leading to us entering die()
again.

When it was originally written, that was OK; our custom
handler keeps a variable to notice this and does not
recurse. However, since cd163d4 (usage.c: detect recursion
in die routines and bail out immediately, 2012-11-14), the
main die() implementation detects recursion before we even
get to our custom handler, and bails without printing
anything useful.

We can handle this case by doing two things:

  1. Installing a custom die_is_recursing handler that
     allows us to enter up to one level of recursion. Only
     the first call to our custom handler will try to write
     out the error response. So if we die again, that is OK.
     If we end up dying more than that, it is a sign that we
     are in an infinite recursion.

  2. Reporting the error to stderr before trying to write
     out the HTTP response. In the current code, if we do
     die() trying to write out the response, we'll exit
     immediately from this second die(), and never get a
     chance to output the original error (which is almost
     certainly the more interesting one; the second die is
     just going to be along the lines of "I tried to write
     to stdout but it was closed").

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'rs/run-command-env-array'</title>
<updated>2014-10-24T21:57:54Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2014-10-24T21:57:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=217610d7d61864f24efc0ea837dc911be44fd9c6'/>
<id>urn:sha1:217610d7d61864f24efc0ea837dc911be44fd9c6</id>
<content type='text'>
Add managed "env" array to child_process to clarify the lifetime
rules.

* rs/run-command-env-array:
  use env_array member of struct child_process
  run-command: add env_array, an optional argv_array for env
</content>
</entry>
</feed>
