author: Jeff King <peff@peff.net> 2026-02-15 04:07:44 -0500
committer: Junio C Hamano <gitster@pobox.com> 2026-02-17 09:45:29 -0800
commit: fe732a8b9f1837189eb3533a262548a652c11e61
tree: 6e87f57371877e6a5fd8ceb16f8563db200b321a
parent: 2ec30e71f44233b9afceda0f2992029674187674
ref-filter: avoid strrchr() in rstrip_ref_components()
To strip path components from our refname string, we repeatedly call
strrchr() to find the trailing slash, shortening the string each time by
assigning NUL over it. This has two downsides:

  1. Calling strrchr() in a loop is quadratic, since each call has to
     call strlen() under the hood to find the end of the string (even
     though we know exactly where it is from the last loop iteration).

  2. We need a temporary buffer, since we're munging the string with
     NUL as we shorten it (which we must do, because strrchr() has no
     other way of knowing what we consider the end of the string).

Using memrchr() would let us fix both of these, but it isn't portable.
So instead, let's just open-code the string traversal from back to
front as we loop.

I doubt that the quadratic nature is a serious concern. You can see it
in practice with something like:

  git init
  git commit --allow-empty -m foo
  echo "$(git rev-parse HEAD) refs/heads$(perl -e 'print "/a" x 500_000')" >.git/packed-refs
  time git for-each-ref --format='%(refname:rstrip=-1)'

That takes ~5.5s to run on my machine before this patch, and ~11ms
after. But I don't think there's a reasonable way for somebody to
infect you with such a garbage ref, as the wire protocol is limited to
64k pkt-lines. The difference is measurable for me for a 32k-component
ref (about 19ms vs 7ms), so perhaps you could create some chaos by
pushing a lot of them. But we also run into filesystem limits (if the
loose backend is in use), and in practice it seems like there are
probably simpler and more effective ways to waste CPU.

Likewise the extra allocation probably isn't really measurable. In
fact, since our goal is to return an allocated string, we end up having
to make the same allocation anyway (though it is sized to the result,
rather than the input). My main goal was simplicity in avoiding the
need to handle cleaning it up in the early return path.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
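
The back-to-front traversal described in the message might look roughly like the sketch below. This is not the code from the patch: the name strip_trailing_components() and its count parameter are made up for illustration, it uses plain malloc() rather than git's allocation wrappers, and it only handles a non-negative count (the negative rstrip=-N case is left out). It assumes the same behavior as the strrchr() loop, namely that running out of slashes yields an empty string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): return a newly allocated
 * copy of "name" with its last "count" slash-separated components
 * removed. Returns "" if there are not enough components, or NULL if
 * allocation fails.
 */
static char *strip_trailing_components(const char *name, size_t count)
{
	size_t end;
	char *out;

	assert(name != NULL);
	end = strlen(name); /* the only full scan of the input */

	while (count-- > 0) {
		/* Walk backwards from the current end to the previous '/'. */
		while (end > 0 && name[end - 1] != '/')
			end--;
		if (end == 0) {
			/* No slash left: early return, nothing to clean up. */
			out = malloc(1);
			if (out)
				out[0] = '\0';
			return out;
		}
		end--; /* drop the slash itself */
	}

	/* A single allocation, sized to the result rather than the input. */
	out = malloc(end + 1);
	if (!out)
		return NULL;
	memcpy(out, name, end);
	out[end] = '\0';
	return out;
}
```

Because the loop remembers where the previous iteration stopped, each byte is visited at most once after the initial strlen(), and the input is never modified, so no temporary buffer has to be freed on the early return path.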
Diffstat (limited to 'contrib/persistent-https')
0 files changed, 0 insertions, 0 deletions