<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/shallow.c, branch v2.16.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.16.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.16.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2017-09-29T02:23:43Z</updated>
<entry>
<title>Merge branch 'ma/leakplugs'</title>
<updated>2017-09-29T02:23:43Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2017-09-29T02:23:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=69c54c72845ebc686d0f4bdd8d44b06f799b0a80'/>
<id>urn:sha1:69c54c72845ebc686d0f4bdd8d44b06f799b0a80</id>
<content type='text'>
Memory leaks in various codepaths have been plugged.

* ma/leakplugs:
  pack-bitmap[-write]: use `object_array_clear()`, don't leak
  object_array: add and use `object_array_pop()`
  object_array: use `object_array_clear()`, not `free()`
  leak_pending: use `object_array_clear()`, not `free()`
  commit: fix memory leak in `reduce_heads()`
  builtin/commit: fix memory leak in `prepare_index()`
</content>
</entry>
<entry>
<title>Merge branch 'jk/write-in-full-fix'</title>
<updated>2017-09-25T06:24:06Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2017-09-25T06:24:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c50424a6f07f17ff9b06927606df650cd73a09a3'/>
<id>urn:sha1:c50424a6f07f17ff9b06927606df650cd73a09a3</id>
<content type='text'>
Many codepaths did not diagnose write failures correctly when disks
go full, due to their misuse of write_in_full() helper function,
which have been corrected.

* jk/write-in-full-fix:
  read_pack_header: handle signed/unsigned comparison in read result
  config: flip return value of store_write_*()
  notes-merge: use ssize_t for write_in_full() return value
  pkt-line: check write_in_full() errors against "&lt; 0"
  convert less-trivial versions of "write_in_full() != len"
  avoid "write_in_full(fd, buf, len) != len" pattern
  get-tar-commit-id: check write_in_full() return against 0
  config: avoid "write_in_full(fd, buf, len) &lt; len" pattern
</content>
</entry>
<entry>
<title>object_array: add and use `object_array_pop()`</title>
<updated>2017-09-24T01:06:04Z</updated>
<author>
<name>Martin Ågren</name>
<email>martin.agren@gmail.com</email>
</author>
<published>2017-09-22T23:34:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=719920393737b3934a168f35ab45e09104edeed8'/>
<id>urn:sha1:719920393737b3934a168f35ab45e09104edeed8</id>
<content type='text'>
In a couple of places, we pop objects off an object array `foo` by
decreasing `foo.nr`. We access `foo.nr` in many places, but most if not
all other times we do so read-only, e.g., as we iterate over the array.
But when we change `foo.nr` behind the array's back, it feels a bit
nasty and looks like it might leak memory.

Leaks happen if the popped element has an allocated `name` or `path`.
At the moment, that is not the case. Still, 1) the object array might
gain more fields that want to be freed, 2) a code path where we pop
might start using names or paths, 3) one of these code paths might be
copied to somewhere where we do, and 4) using a dedicated function for
popping is conceptually cleaner.

Introduce and use `object_array_pop()` instead. Release memory in the
new function. Document that popping an object leaves the associated
elements in limbo.

The converted places were identified by grepping for "\.nr\&gt;" and
looking for "--".

Make the new function return NULL on an empty array. This is consistent
with `pop_commit()` and allows the following:

	while ((o = object_array_pop(&amp;foo)) != NULL) {
		// do something
	}

But as noted above, we don't need to go out of our way to avoid reading
`foo.nr`. This is probably more readable:

	while (foo.nr) {
		... o = object_array_pop(&amp;foo);
		// do something
	}

The name of `object_array_pop()` does not quite align with
`add_object_array()`. That is unfortunate. On the other hand, it matches
`object_array_clear()`. Arguably it's `add_...` that is the odd one out,
since it reads like it's used to "add" an "object array". For that
reason, side with `object_array_clear()`.

Signed-off-by: Martin Ågren &lt;martin.agren@gmail.com&gt;
Reviewed-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>avoid "write_in_full(fd, buf, len) != len" pattern</title>
<updated>2017-09-14T06:17:59Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2017-09-13T17:16:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=06f46f237afa823c0a2775e60ed8fbd80e7c751f'/>
<id>urn:sha1:06f46f237afa823c0a2775e60ed8fbd80e7c751f</id>
<content type='text'>
The return value of write_in_full() is either "-1", or the
requested number of bytes[1]. If we make a partial write
before seeing an error, we still return -1, not a partial
value. This goes back to f6aa66cb95 (write_in_full: really
write in full or return error on disk full., 2007-01-11).

So checking anything except "was the return value negative"
is pointless. And there are a couple of reasons not to do
so:

  1. It can do a funny signed/unsigned comparison. If your
     "len" is signed (e.g., a size_t) then the compiler will
     promote the "-1" to its unsigned variant.

     This works out for "!= len" (unless you really were
     trying to write the maximum size_t bytes), but is a
     bug if you check "&lt; len" (an example of which was fixed
     recently in config.c).

     We should avoid promoting the mental model that you
     need to check the length at all, so that new sites are
     not tempted to copy us.

  2. Checking for a negative value is shorter to type,
     especially when the length is an expression.

  3. Linus says so. In d34cf19b89 (Clean up write_in_full()
     users, 2007-01-11), right after the write_in_full()
     semantics were changed, he wrote:

       I really wish every "write_in_full()" user would just
       check against "&lt;0" now, but this fixes the nasty and
       stupid ones.

     Appeals to authority aside, this makes it clear that
     writing it this way does not have an intentional
     benefit. It's a historical curiosity that we never
     bothered to clean up (and which was undoubtedly
     cargo-culted into new sites).

So let's convert these obviously-correct cases (this
includes write_str_in_full(), which is just a wrapper for
write_in_full()).

[1] A careful reader may notice there is one way that
    write_in_full() can return a different value. If we ask
    write() to write N bytes and get a return value that is
    _larger_ than N, we could return a larger total. But
    besides the fact that this would imply a totally broken
    version of write(), it would already invoke undefined
    behavior. Our internal remaining counter is an unsigned
    size_t, which means that subtracting too many byte will
    wrap it around to a very large number. So we'll instantly
    begin reading off the end of the buffer, trying to write
    gigabytes (or petabytes) of data.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Reviewed-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tempfile: auto-allocate tempfiles on heap</title>
<updated>2017-09-06T08:19:54Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2017-09-05T12:15:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=076aa2cbda5782426c45cd65017b81d77876297a'/>
<id>urn:sha1:076aa2cbda5782426c45cd65017b81d77876297a</id>
<content type='text'>
The previous commit taught the tempfile code to give up
ownership over tempfiles that have been renamed or deleted.
That makes it possible to use a stack variable like this:

  struct tempfile t;

  create_tempfile(&amp;t, ...);
  ...
  if (!err)
          rename_tempfile(&amp;t, ...);
  else
          delete_tempfile(&amp;t);

But doing it this way has a high potential for creating
memory errors. The tempfile we pass to create_tempfile()
ends up on a global linked list, and it's not safe for it to
go out of scope until we've called one of those two
deactivation functions.

Imagine that we add an early return from the function that
forgets to call delete_tempfile(). With a static or heap
tempfile variable, the worst case is that the tempfile hangs
around until the program exits (and some functions like
setup_shallow_temporary rely on this intentionally, creating
a tempfile and then leaving it for later cleanup).

But with a stack variable as above, this is a serious memory
error: the variable goes out of scope and may be filled with
garbage by the time the tempfile code looks at it.  Let's
see if we can make it harder to get this wrong.

Since many callers need to allocate arbitrary numbers of
tempfiles, we can't rely on static storage as a general
solution. So we need to turn to the heap. We could just ask
all callers to pass us a heap variable, but that puts the
burden on them to call free() at the right time.

Instead, let's have the tempfile code handle the heap
allocation _and_ the deallocation (when the tempfile is
deactivated and removed from the list).

This changes the return value of all of the creation
functions. For the cleanup functions (delete and rename),
we'll add one extra bit of safety: instead of taking a
tempfile pointer, we'll take a pointer-to-pointer and set it
to NULL after freeing the object. This makes it safe to
double-call functions like delete_tempfile(), as the second
call treats the NULL input as a noop. Several callsites
follow this pattern.

The resulting patch does have a fair bit of noise, as each
caller needs to be converted to handle:

  1. Storing a pointer instead of the struct itself.

  2. Passing the pointer instead of taking the struct
     address.

  3. Handling a "struct tempfile *" return instead of a file
     descriptor.

We could play games to make this less noisy. For example, by
defining the tempfile like this:

  struct tempfile {
	struct heap_allocated_part_of_tempfile {
                int fd;
                ...etc
        } *actual_data;
  }

Callers would continue to have a "struct tempfile", and it
would be "active" only when the inner pointer was non-NULL.
But that just makes things more awkward in the long run.
There aren't that many callers, so we can simply bite
the bullet and adjust all of them. And the compiler makes it
easy for us to find them all.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tempfile: do not delete tempfile on failed close</title>
<updated>2017-09-06T08:19:53Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2017-09-05T12:14:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=49bd0fc2220eef17d8f5fd3ee76e391d03df8a6d'/>
<id>urn:sha1:49bd0fc2220eef17d8f5fd3ee76e391d03df8a6d</id>
<content type='text'>
When close_tempfile() fails, we delete the tempfile and
reset the fields of the tempfile struct. This makes it
easier for callers to return without cleaning up, but it
also makes this common pattern:

  if (close_tempfile(tempfile))
	return error_errno("error closing %s", tempfile-&gt;filename.buf);

wrong, because the "filename" field has been reset after the
failed close. And it's not easy to fix, as in many cases we
don't have another copy of the filename (e.g., if it was
created via one of the mks_tempfile functions, and we just
have the original template string).

Let's drop the feature that a failed close automatically
deletes the file. This puts the burden on the caller to do
the deletion themselves, but this isn't that big a deal.
Callers which do:

  if (write(...) || close_tempfile(...)) {
	delete_tempfile(...);
	return -1;
  }

already had to call delete when the write() failed, and so
aren't affected. Likewise, any caller which just calls die()
in the error path is OK; we'll delete the tempfile during
the atexit handler.

Because this patch changes the semantics of close_tempfile()
without changing its signature, all callers need to be
manually checked and converted to the new scheme. This patch
covers all in-tree callers, but there may be others for
not-yet-merged topics. To catch these, we rename the
function to close_tempfile_gently(), which will attract
compile-time attention to new callers. (Technically the
original could be considered "gentle" already in that it
didn't die() on errors, but this one is even more so).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>always check return value of close_tempfile</title>
<updated>2017-09-06T08:19:53Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2017-09-05T12:14:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=45c6b1ed24724f7f3041a60a4313df7d9c4b9909'/>
<id>urn:sha1:45c6b1ed24724f7f3041a60a4313df7d9c4b9909</id>
<content type='text'>
If close_tempfile() encounters an error, then it deletes the
tempfile and resets the "struct tempfile". But many code
paths ignore the return value and continue to use the
tempfile. Instead, we should generally treat this the same
as a write() error.

Note that in the postimage of some of these cases our error
message will be bogus after a failed close because we look
at tempfile-&gt;filename (either directly or via get_tempfile_path).
But after the failed close resets the tempfile object, this
is guaranteed to be the empty string. That will be addressed
in a future patch (because there are many more cases of the
same problem than just these instances).

Note also in the hunk in gpg-interface.c that it's fine to
call delete_tempfile() in the error path, even if
close_tempfile() failed and already deleted the file. The
tempfile code is smart enough to know the second deletion is
a noop.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>setup_temporary_shallow: move tempfile struct into function</title>
<updated>2017-09-06T08:19:52Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2017-09-05T12:14:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c0e963b77c6892d40d4ca81a71098aaa6c4eaed3'/>
<id>urn:sha1:c0e963b77c6892d40d4ca81a71098aaa6c4eaed3</id>
<content type='text'>
The setup_temporary_shallow() function creates a temporary
file, but we never access the tempfile struct outside of the
function. This is OK, since it means we'll just clean up the
tempfile on exit.  But we can simplify the code a bit by
moving the global tempfile struct to the only function in
which it's used.

Note that it must remain "static" due to tempfile.c's
requirement that tempfile storage never goes away until
program exit.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>setup_temporary_shallow: avoid using inactive tempfile</title>
<updated>2017-09-06T08:19:52Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2017-09-05T12:14:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0899013993cd5eb327bc1a40d7c629db37e5ea83'/>
<id>urn:sha1:0899013993cd5eb327bc1a40d7c629db37e5ea83</id>
<content type='text'>
When there are no shallow entries to write, we skip creating
the tempfile entirely and try to return the empty string.

But we do so by calling get_tempfile_path() on the inactive
tempfile object. This will trigger an assertion that kills
the program. The bug was introduced by 6e122b449b
(setup_temporary_shallow(): use tempfile module,
2015-08-10). But nobody seems to have noticed since then
because we do not end up calling this function at all when
there are no shallow items. In other words, this code path
is completely unexercised.

Since the tempfile object is a static global, it _is_
possible that we call the function twice, writing out
shallow info the first time and then "reusing" our tempfile
object the second time. But:

  1. It seems unlikely that this was the intent, as hitting
     this code path would imply somebody clearing the
     shallow_info list between calls.

     And if somebody _did_ call the function multiple times
     without clearing the shallow_info list, we'd hit a
     different BUG for trying to reuse an already-active
     tempfile.

  2. I verified by code inspection that the function is only
     called once per program. And also replacing this code
     with a BUG() and running the test suite demonstrates
     that it is not triggered there.

So we could probably just replace this with an assertion and
confirm that it's never called. However, the original intent
does seem to be that you _could_ call it when the
shallow_info is empty. And that's easy enough to do; since
the return value doesn't need to point to a writable buffer,
we can just return a string literal.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit: convert lookup_commit_graft to struct object_id</title>
<updated>2017-07-13T19:02:40Z</updated>
<author>
<name>Stefan Beller</name>
<email>sbeller@google.com</email>
</author>
<published>2017-07-13T00:44:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8b65a34c4a12cdaae7f43f65f760da0f48ad67e3'/>
<id>urn:sha1:8b65a34c4a12cdaae7f43f65f760da0f48ad67e3</id>
<content type='text'>
With this patch, commit.h doesn't contain the string 'sha1' any more.

Signed-off-by: Stefan Beller &lt;sbeller@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
