git/pack-write.c, branch v2.16.2

avoid looking at errno for short read_in_full() returns

2017-09-27T06:45:24Z

When a caller tries to read a particular set of bytes via read_in_full(), there are three possible outcomes: 1. An error, in which case -1 is returned and errno is set. 2. A short read, in which fewer bytes are returned and errno is unspecified (we never saw a read error, so we may have some random value from whatever syscall failed last). 3. The full read completed successfully. Many callers handle cases 1 and 2 together by just checking the result against the requested size. If their combined error path looks at errno (e.g., by calling die_errno), they may report a nonsense value. Let's fix these sites by having them distinguish between the two error cases. That avoids the random errno confusion, and lets us give more detailed error messages. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano

pack: convert struct pack_idx_entry to struct object_id

2017-05-08T06:12:57Z

Convert struct pack_idx_entry to use struct object_id by changing the definition and applying the following semantic patch, plus the standard object_id transforms: @@ struct pack_idx_entry E1; @@ - E1.sha1 + E1.oid.hash @@ struct pack_idx_entry *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson Signed-off-by: Junio C Hamano

odb_mkstemp: write filename into strbuf

2017-03-28T22:28:04Z

The odb_mkstemp() function expects the caller to provide a fixed buffer to write the resulting tempfile name into. But it creates the template using snprintf without checking the return value. This means we could silently truncate the filename. In practice, it's unlikely that the truncation would end in the template-pattern that mkstemp needs to open the file. So we'd probably end up failing either way, unless the path was specially crafted. The simplest fix would be to notice the truncation and die. However, we can observe that most callers immediately xstrdup() the result anyway. So instead, let's switch to using a strbuf, which is easier for them (and isn't a big deal for the other 2 callers, who can just strbuf_release when they're done with it). Note that many of the callers used static buffers, but this was purely to avoid putting a large buffer on the stack. We never passed the static buffers out of the function, so there's no complicated memory handling we need to change. Signed-off-by: Jeff King

do not check odb_mkstemp return value for errors

2017-03-28T22:28:04Z

The odb_mkstemp function does not return an error; it dies on failure instead. But many of its callers compare the resulting descriptor against -1 and die themselves. Mostly this is just pointless, but it does raise a question when looking at the callers: if they show the results of the "template" buffer after a failure, what's in it? The answer is: it doesn't matter, because it cannot happen. So let's make that clear by removing the bogus error checks. In bitmap_writer_finish(), we can drop the error-handling code entirely. In the other two cases, it's shared with the open() in another code path; we can just move the error-check next to that open() call. And while we're at it, let's flesh out the function's docstring a bit to make the error behavior clear. Signed-off-by: Jeff King

encode_in_pack_object_header: respect output buffer length

2017-03-24T19:34:07Z

The encode_in_pack_object_header() writes a variable-length header to an output buffer, but it doesn't actually know long the buffer is. At first glance, this looks like it might be possible to overflow. In practice, this is probably impossible. The smallest buffer we use is 10 bytes, which would hold the header for an object up to 2^67 bytes. Obviously we're not likely to see such an object, but we might worry that an object could lie about its size (causing us to overflow before we realize it does not actually have that many bytes). But the argument is passed as a uintmax_t. Even on systems that have __int128 available, uintmax_t is typically restricted to 64-bit by the ABI. So it's unlikely that a system exists where this could be exploited. Still, it's easy enough to use a normal out/len pair and make sure we don't write too far. That protects the hypothetical 128-bit system, makes it harder for callers to accidentally specify a too-small buffer, and makes the resulting code easier to audit. Note that the one caller in fast-import tried to catch such a case, but did so _after_ the call (at which point we'd have already overflowed!). This check can now go away. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano

use QSORT

2016-09-29T22:42:18Z

Apply the semantic patch contrib/coccinelle/qsort.cocci to the code base, replacing calls of qsort(3) with QSORT. The resulting code is shorter and supports empty arrays with NULL pointers. Signed-off-by: Rene Scharfe Signed-off-by: Junio C Hamano

sha1_file: drop free_pack_by_name

2016-07-29T18:05:06Z

The point of this function is to drop an entry from the "packed_git" cache that points to a file we might be overwriting, because our contents may not be the same (and hence the only caller was pack-objects as it moved a temporary packfile into place). In older versions of git, this could happen because the names of packfiles were derived from the set of objects they contained, not the actual bits on disk. But since 1190a1a (pack-objects: name pack files after trailer hash, 2013-12-05), the name reflects the actual bits on disk, and any two packfiles with the same name can be used interchangeably. Dropping this function not only saves a few lines of code, it makes the lifetime of "struct packed_git" much easier to reason about: namely, we now do not ever free these structs. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano

pack-write: simplify index_pack_lockfile using skip_prefix() and xstrfmt()

2014-09-02T17:37:24Z

Get rid of magic string length constants by using skip_prefix() instead of memcmp() and use xstrfmt() for building a string instead of a PATH_MAX-sized buffer, snprintf() and xstrdup(). Signed-off-by: Rene Scharfe Signed-off-by: Junio C Hamano

finish_tmp_packfile():use strbuf for pathname construction

2014-03-03T20:15:10Z

The old version fixes a maximum length on the buffer, which could be a problem if one is not certain of the length of get_object_directory(). Using strbuf can avoid the protential bug. Helped-by: Michael Haggerty Helped-by: Eric Sunshine Signed-off-by: Sun He Signed-off-by: Junio C Hamano

Merge branch 'jk/pack-bitmap'

2014-02-27T22:01:48Z

Borrow the bitmap index into packfiles from JGit to speed up enumeration of objects involved in a commit range without having to fully traverse the history. * jk/pack-bitmap: (26 commits) ewah: unconditionally ntohll ewah data ewah: support platforms that require aligned reads read-cache: use get_be32 instead of hand-rolled ntoh_l block-sha1: factor out get_be and put_be wrappers do not discard revindex when re-preparing packfiles pack-bitmap: implement optional name_hash cache t/perf: add tests for pack bitmaps t: add basic bitmap functionality tests count-objects: recognize .bitmap in garbage-checking repack: consider bitmaps when performing repacks repack: handle optional files created by pack-objects repack: turn exts array into array-of-struct repack: stop using magic number for ARRAY_SIZE(exts) pack-objects: implement bitmap writing rev-list: add bitmap mode to speed up object lists pack-objects: use bitmaps when packing objects pack-objects: split add_object_entry pack-bitmap: add support for bitmap indexes documentation: add documentation for the bitmap format ewah: compressed bitmap implementation ...