git/usage.c, branch v2.16.2

add UNLEAK annotation for reducing leak false positives

2017-09-08T06:43:17Z

It's a common pattern in git commands to allocate some memory that should last for the lifetime of the program and then not bother to free it, relying on the OS to throw it away. This keeps the code simple, and it's fast (we don't waste time traversing structures or calling free at the end of the program). But it also triggers warnings from memory-leak checkers like valgrind or LSAN. They know that the memory was still allocated at program exit, but they don't know _when_ the leaked memory stopped being useful. If it was early in the program, then it's probably a real and important leak. But if it was used right up until program exit, it's not an interesting leak and we'd like to suppress it so that we can see the real leaks. This patch introduces an UNLEAK() macro that lets us do so. To understand its design, let's first look at some of the alternatives. Unfortunately the suppression systems offered by leak-checking tools don't quite do what we want. A leak-checker basically knows two things: 1. Which blocks were allocated via malloc, and the callstack during the allocation. 2. Which blocks were left un-freed at the end of the program (and which are unreachable, but more on that later). Their suppressions work by mentioning the function or callstack of a particular allocation, and marking it as OK to leak. So imagine you have code like this: int cmd_foo(...) { /* this allocates some memory */ char *p = some_function(); printf("%s", p); return 0; } You can say "ignore allocations from some_function(), they're not leaks". But that's not right. That function may be called elsewhere, too, and we would potentially want to know about those leaks. So you can say "ignore the callstack when main calls some_function". That works, but your annotations are brittle. In this case it's only two functions, but you can imagine that the actual allocation is much deeper. If any of the intermediate code changes, you have to update the suppression. What we _really_ want to say is that "the value assigned to p at the end of the function is not a real leak". But leak-checkers can't understand that; they don't know about "p" in the first place. However, we can do something a little bit tricky if we make some assumptions about how leak-checkers work. They generally don't just report all un-freed blocks. That would report even globals which are still accessible when the leak-check is run. Instead they take some set of memory (like BSS) as a root and mark it as "reachable". Then they scan the reachable blocks for anything that looks like a pointer to a malloc'd block, and consider that block reachable. And then they scan those blocks, and so on, transitively marking anything reachable from a global as "not leaked" (or at least leaked in a different category). So we can mark the value of "p" as reachable by putting it into a variable with program lifetime. One way to do that is to just mark "p" as static. But that actually affects the run-time behavior if the function is called twice (you aren't likely to call main() twice, but some of our cmd_*() functions are called from other commands). Instead, we can trick the leak-checker by putting the value into _any_ reachable bytes. This patch keeps a global linked-list of bytes copied from "unleaked" variables. That list is reachable even at program exit, which confers recursive reachability on whatever values we unleak. In other words, you can do: int cmd_foo(...) { char *p = some_function(); printf("%s", p); UNLEAK(p); return 0; } to annotate "p" and suppress the leak report. But wait, couldn't we just say "free(p)"? In this toy example, yes. But UNLEAK()'s byte-copying strategy has several advantages over actually freeing the memory: 1. It's recursive across structures. In many cases our "p" is not just a pointer, but a complex struct whose fields may have been allocated by a sub-function. And in some cases (e.g., dir_struct) we don't even have a function which knows how to free all of the struct members. By marking the struct itself as reachable, that confers reachability on any pointers it contains (including those found in embedded structs, or reachable by walking heap blocks recursively. 2. It works on cases where we're not sure if the value is allocated or not. For example: char *p = argc > 1 ? argv[1] : some_function(); It's safe to use UNLEAK(p) here, because it's not freeing any memory. In the case that we're pointing to argv here, the reachability checker will just ignore our bytes. 3. Likewise, it works even if the variable has _already_ been freed. We're just copying the pointer bytes. If the block has been freed, the leak-checker will skip over those bytes as uninteresting. 4. Because it's not actually freeing memory, you can UNLEAK() before we are finished accessing the variable. This is helpful in cases like this: char *p = some_function(); return another_function(p); Writing this with free() requires: int ret; char *p = some_function(); ret = another_function(p); free(p); return ret; But with unleak we can just write: char *p = some_function(); UNLEAK(p); return another_function(p); This patch adds the UNLEAK() macro and enables it automatically when Git is compiled with SANITIZE=leak. In normal builds it's a noop, so we pay no runtime cost. It also adds some UNLEAK() annotations to show off how the feature works. On top of other recent leak fixes, these are enough to get t0000 and t0001 to pass when compiled with LSAN. Note the case in commit.c which actually converts a strbuf_release() into an UNLEAK. This code was already non-leaky, but the free didn't do anything useful, since we're exiting. Converting it to an annotation means that non-leak-checking builds pay no runtime cost. The cost is minimal enough that it's probably not worth going on a crusade to convert these kinds of frees to UNLEAKS. I did it here for consistency with the "sb" leak (though it would have been equally correct to go the other way, and turn them both into strbuf_release() calls). Signed-off-by: Jeff King Signed-off-by: Junio C Hamano

die(): stop hiding errors due to overzealous recursion guard

2017-06-21T21:09:13Z

Change the recursion limit for the default die routine from a *very* low 1 to 1024. This ensures that infinite recursions are broken, but doesn't lose the meaningful error messages under threaded execution where threads concurrently start to die. The intent of the existing code, as explained in commit cd163d4b4e ("usage.c: detect recursion in die routines and bail out immediately", 2012-11-14), is to break infinite recursion in cases where the die routine itself calls die(), and would thus infinitely recurse. However, doing that very aggressively by immediately printing out "recursion detected in die handler" if we've already called die() once means that threaded invocations of git can end up only printing out the "recursion detected" error, while hiding the meaningful error. An example of this is running a threaded grep which dies on execution against pretty much any repo, git.git will do: git grep -P --threads=8 '(*LIMIT_MATCH=1)-?-?-?---$' With the current version of git this will print some combination of multiple PCRE failures that caused the abort and multiple "recursion detected", some invocations will print out multiple "recursion detected" errors with no PCRE error at all! Before this change, running the above grep command 1000 times against git.git[1] and taking the top 20 results will on my system yield the following distribution of actual errors ("E") and recursion errors ("R"): 322 E R 306 E 116 E R R 65 R R 54 R E 49 E E 44 R 15 E R R R 9 R R R 7 R E R 5 R R E 3 E R R R R 2 E E R 1 R R R R 1 R R R E 1 R E R R The exact results are obviously random and system-dependent, but this shows the race condition in this code. Some small part of the time we're about to print out the actual error ("E") but another thread's recursion error beats us to it, and sometimes we print out nothing but the recursion error. With this change we get, now with "W" to mean the new warning being emitted indicating that we've called die() many times: 502 E 160 E W E 120 E E 53 E W 35 E W E E 34 W E E 29 W E E E 16 E E W 16 E E E 11 W E E E E 7 E E W E 4 W E 3 W W E E 2 E W E E E 1 W W E 1 W E W E 1 E W W E E E 1 E W W E E 1 E W W E 1 E W E E W Which still sucks a bit, due to a still present race-condition in this code we're sometimes going to print out several errors still, or several warnings, or two duplicate errors without the warning. But we will never have a case where we completely hide the actual error as we do now. Now, git-grep could make use of the pluggable error facility added in commit c19a490e37 ("usage: allow pluggable die-recursion checks", 2013-04-16). There's other threaded code that calls set_die_routine() or set_die_is_recursing_routine(). But this is about fixing the general die() behavior with threading when we don't have such a custom routine yet. Right now the common case is not an infinite recursion in the handler, but us losing error messages by default because we're overly paranoid about our recursion check. So let's just set the recursion limit to a number higher than the number of threads we're ever likely to spawn. Now we won't lose errors, and if we have a recursing die handler we'll still die within microseconds. There are race conditions in this code itself, in particular the "dying" variable is not thread mutexed, so we e.g. won't be dying at exactly 1024, or for that matter even be able to accurately test "dying == 2", see the cases where we print out more than one "W" above. But that doesn't really matter, for the recursion guard we just need to die "soon", not at exactly 1024 calls, and for printing the correct error and only one warning most of the time in the face of threaded death this is good enough and a net improvement on the current code. 1. for i in {1..1000}; do git grep -P --threads=8 '(*LIMIT_MATCH=1)-?-?-?---$' 2>&1|perl -pe 's/^fatal: r.*/R/; s/^fatal: p.*/E/; s/^warning.*/W/' | tr '\n' ' '; echo; done | sort | uniq -c | sort -nr | head -n 20 Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano

Merge branch 'bw/forking-and-threading' into maint

2017-06-13T20:27:00Z

The "run-command" API implementation has been made more robust against dead-locking in a threaded environment. * bw/forking-and-threading: usage.c: drop set_error_handle() run-command: restrict PATH search to executable files run-command: expose is_executable function run-command: block signals between fork and execve run-command: add note about forking and threading run-command: handle dup2 and close errors in child run-command: eliminate calls to error handling functions in child run-command: don't die in child when duping /dev/null run-command: prepare child environment before forking string-list: add string_list_remove function run-command: use the async-signal-safe execv instead of execvp run-command: prepare command before forking t0061: run_command executes scripts without a #! line t5550: use write_script to generate post-update hook

usage: add NORETURN to BUG() function definitions

2017-05-22T02:00:53Z

Commit d8193743e0 ("usage.c: add BUG() function", 12-05-2017) added the BUG() functions and macros as a replacement for calls to die("BUG: .."). The use of NORETURN on the declarations (in git-compat-util.h) and the lack of NORETURN on the function definitions, however, leads sparse to complain thus: SP usage.c usage.c:220:6: error: symbol 'BUG_fl' redeclared with different type (originally declared at git-compat-util.h:1074) - different modifiers In order to suppress the sparse error, add the NORETURN to the function definitions. Signed-off-by: Ramsay Jones Signed-off-by: Junio C Hamano

usage.c: drop set_error_handle()

2017-05-15T04:00:25Z

The set_error_handle() function was introduced by 3b331e926 (vreportf: report to arbitrary filehandles, 2015-08-11) so that run-command could send post-fork, pre-exec errors to the parent's original stderr. That use went away in 79319b194 (run-command: eliminate calls to error handling functions in child, 2017-04-19), which pushes all of the error reporting to the parent. This leaves no callers of set_error_handle(). As we're not likely to add any new ones, let's drop it. Signed-off-by: Jeff King Acked-by: Brandon Williams Reviewed-by: Ramsay Jones Signed-off-by: Junio C Hamano

usage.c: add BUG() function

2017-05-15T02:29:51Z

There's a convention in Git's code base to write assertions as: if (...some_bad_thing...) die("BUG: the terrible thing happened"); with the idea that users should never see a "BUG:" message (but if they, it at least gives a clue what happened). We use die() here because it's convenient, but there are a few draw-backs: 1. Without parsing the messages, it's hard for callers to distinguish BUG assertions from regular errors. For instance, it would be nice if the test suite could check that we don't hit any assertions, but test_must_fail will pass BUG deaths as OK. 2. It would be useful to add more debugging features to BUG assertions, like file/line numbers or dumping core. 3. The die() handler can be replaced, and might not actually exit the whole program (e.g., it may just pthread_exit()). This is convenient for normal errors, but for an assertion failure (which is supposed to never happen), we're probably better off taking down the whole process as quickly and cleanly as possible. We could address these by checking in die() whether the error message starts with "BUG", and behaving appropriately. But there's little advantage at that point to sharing the die() code, and only downsides (e.g., we can't change the BUG() interface independently). Moreover, converting all of the existing BUG calls reveals that the test suite does indeed trigger a few of them. Instead, this patch introduces a new BUG() function, which prints an error before dying via SIGABRT. This gives us test suite checking and core dumps. The function is actually a macro (when supported) so that we can show the file/line number. We can convert die("BUG") invocations to BUG() in further patches, dealing with any test fallouts individually. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano

Merge branch 'jk/vreport-sanitize'

2017-01-31T21:14:56Z

An error message with an ASCII control character like '\r' in it can alter the message to hide its early part, which is problematic when a remote side gives such an error message that the local side will relay with a "remote: " prefix. * jk/vreport-sanitize: vreport: sanitize ASCII control chars Revert "vreportf: avoid intermediate buffer"

vreport: sanitize ASCII control chars

2017-01-11T21:54:08Z

Our error() and die() calls may report messages with arbitrary data (e.g., filenames or even data from a remote server). Let's make it harder to cause confusion with mischievous filenames. E.g., try: git rev-parse "$(printf "\rfatal: this argument is too sneaky")" -- or git rev-parse "$(printf "\x1b[5mblinky\x1b[0m")" -- Let's block all ASCII control characters, with the exception of TAB and LF. We use both in our own messages (and we are necessarily sanitizing the complete output of snprintf here, as we do not have access to the individual varargs). And TAB and LF are unlikely to cause confusion (you could put "\nfatal: sneaky\n" in your filename, but it would at least not _cover up_ the message leading to it, unlike "\r"). We'll replace the characters with a "?", which is similar to how "ls" behaves. It might be nice to do something less lossy, like converting them to "\x" hex codes. But replacing with a single character makes it easy to do in-place and without worrying about length limitations. This feature should kick in rarely enough that the "?" marks are almost never seen. We'll leave high-bit characters as-is, as they are likely to be UTF-8 (though there may be some Unicode mischief you could cause, which may require further patches). Signed-off-by: Jeff King Signed-off-by: Junio C Hamano

Revert "vreportf: avoid intermediate buffer"

2017-01-11T21:52:00Z

This reverts commit f4c3edc0b156362a92bf9de4f0ec794e90a757fc. The purpose of that commit was to let us write errors of arbitrary length to stderr by skipping the intermediate buffer and sending our varargs straight to fprintf. That works, but it comes with a downside: we do not get access to the varargs before they are sent to stderr. On balance, it's not a good tradeoff. Error messages larger than our 4K buffer are quite uncommon, and we've lost the ability to make any modifications to the output (e.g., to remove non-printable characters). The only way to have both would be one of: 1. Write into a dynamic buffer. But this is a bad idea for a low-level function that may be called when malloc() has failed. 2. Do our own printf-format varargs parsing. This is too complex to be worth the trouble. Let's just revert that change and go back to a fixed buffer. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano

Merge branch 'cc/apply-am'

2016-09-19T20:47:18Z

"git am" has been taught to make an internal call to "git apply"'s innards without spawning the latter as a separate process. * cc/apply-am: (41 commits) builtin/am: use apply API in run_apply() apply: learn to use a different index file apply: pass apply state to build_fake_ancestor() apply: refactor `git apply` option parsing apply: change error_routine when silent usage: add get_error_routine() and get_warn_routine() usage: add set_warn_routine() apply: don't print on stdout in verbosity_silent mode apply: make it possible to silently apply apply: use error_errno() where possible apply: make some parsing functions static again apply: move libified code from builtin/apply.c to apply.{c,h} apply: rename and move opt constants to apply.h builtin/apply: rename option parsing functions builtin/apply: make create_one_file() return -1 on error builtin/apply: make try_create_file() return -1 on error builtin/apply: make write_out_results() return -1 on error builtin/apply: make write_out_one_result() return -1 on error builtin/apply: make create_file() return -1 on error builtin/apply: make add_index_file() return -1 on error ...