<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/connect.c, branch v2.30.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.30.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.30.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2021-02-06T00:31:27Z</updated>
<entry>
<title>Merge branch 'jk/forbid-lf-in-git-url' into maint</title>
<updated>2021-02-06T00:31:27Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-02-06T00:31:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=635ff6759069dfea7c6e448afa8c245405f53a18'/>
<id>urn:sha1:635ff6759069dfea7c6e448afa8c245405f53a18</id>
<content type='text'>
Newline characters in the host and path part of git:// URL are
now forbidden.

* jk/forbid-lf-in-git-url:
  fsck: reject .gitmodules git:// urls with newlines
  git_connect_git(): forbid newlines in host and path
</content>
</entry>
<entry>
<title>git_connect_git(): forbid newlines in host and path</title>
<updated>2021-01-07T22:25:44Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-01-07T09:43:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=a02ea577174ab8ed18f847cf1693f213e0b9c473'/>
<id>urn:sha1:a02ea577174ab8ed18f847cf1693f213e0b9c473</id>
<content type='text'>
When we connect to a git:// server, we send an initial request that
looks something like:

  002dgit-upload-pack repo.git\0host=example.com

If the repo path contains a newline, then it's included literally, and
we get:

  002egit-upload-pack repo
  .git\0host=example.com

This works fine if you really do have a newline in your repository name;
the server side uses the pktline framing to parse the string, not
newlines. However, there are many _other_ protocols in the wild that do
parse on newlines, such as HTTP. So a carefully constructed git:// URL
can actually turn into a valid HTTP request. For example:

  git://localhost:1234/%0d%0a%0d%0aGET%20/%20HTTP/1.1 %0d%0aHost:localhost%0d%0a%0d%0a

becomes:

  0050git-upload-pack /
  GET / HTTP/1.1
  Host:localhost

  host=localhost:1234

on the wire. Again, this isn't a problem for a real Git server, but it
does mean that feeding a malicious URL to Git (e.g., through a
submodule) can cause it to make unexpected cross-protocol requests.
Since repository names with newlines are presumably quite rare (and
indeed, we already disallow them in git-over-http), let's just disallow
them over this protocol.

Hostnames could likewise inject a newline, but this is unlikely a
problem in practice; we'd try resolving the hostname with a newline in
it, which wouldn't work. Still, it doesn't hurt to err on the side of
caution there, since we would not expect them to work in the first
place.

The ssh and local code paths are unaffected by this patch. In both cases
we're trying to run upload-pack via a shell, and will quote the newline
so that it makes it intact. An attacker can point an ssh url at an
arbitrary port, of course, but unless there's an actual ssh server
there, we'd never get as far as sending our shell command anyway.  We
_could_ similarly restrict newlines in those protocols out of caution,
but there seems little benefit to doing so.

The new test here is run alongside the git-daemon tests, which cover the
same protocol, but it shouldn't actually contact the daemon at all.  In
theory we could make the test more robust by setting up an actual
repository with a newline in it (so that our clone would succeed if our
new check didn't kick in). But a repo directory with newline in it is
likely not portable across all filesystems. Likewise, we could check
git-daemon's log that it was not contacted at all, but we do not
currently record the log (and anyway, it would make the test racy with
the daemon's log write). We'll just check the client-side stderr to make
sure we hit the expected code path.

Reported-by: Harold Kim &lt;h.kim@flatt.tech&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/leakfix'</title>
<updated>2020-08-27T21:04:49Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-08-27T21:04:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0d9a8e33f9fd07efa10072576df01a9cae5d89e6'/>
<id>urn:sha1:0d9a8e33f9fd07efa10072576df01a9cae5d89e6</id>
<content type='text'>
Code clean-up.

* jk/leakfix:
  submodule--helper: fix leak of core.worktree value
  config: fix leak in git_config_get_expiry_in_days()
  config: drop git_config_get_string_const()
  config: fix leaks from git_config_get_string_const()
  checkout: fix leak of non-existent branch names
  submodule--helper: use strbuf_release() to free strbufs
  clear_pattern_list(): clear embedded hashmaps
</content>
</entry>
<entry>
<title>config: fix leaks from git_config_get_string_const()</title>
<updated>2020-08-14T17:52:04Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-08-14T16:17:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f1de981e8b6dedccf915095792c9afbe3c989591'/>
<id>urn:sha1:f1de981e8b6dedccf915095792c9afbe3c989591</id>
<content type='text'>
There are two functions to get a single config string:

  - git_config_get_string()

  - git_config_get_string_const()

One might naively think that the first one allocates a new string and
the second one just points us to the internal configset storage. But
in fact they both allocate a new copy; the second one exists only to
avoid having to cast when using it with a const global which we never
intend to free.

The documentation for the function explains that clearly, but it seems
I'm not alone in being surprised by this. Of 17 calls to the function,
13 of them leak the resulting value.

We could obviously fix these by adding the appropriate free(). But it
would be simpler still if we actually had a non-allocating way to get
the string. There's git_config_get_value() but that doesn't quite do
what we want. If the config key is present but is a boolean with no
value (e.g., "[foo]bar" in the file), then we'll get NULL (whereas the
string versions will print an error and die).

So let's introduce a new variant, git_config_get_string_tmp(), that
behaves as these callers expect. We need a new name because we have new
semantics but the same function signature (so even if we converted the
four remaining callers, topics in flight might be surprised). The "tmp"
is because this value should only be held onto for a short time. In
practice it's rare for us to clear and refresh the configset,
invalidating the pointer, but hopefully the "tmp" makes callers think
about the lifetime. In each of the converted cases here the value only
needs to last within the local function or its immediate caller.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>strvec: rename struct fields</title>
<updated>2020-07-31T02:18:06Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-07-29T00:37:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d70a9eb611a9d242c1d26847d223b8677609305b'/>
<id>urn:sha1:d70a9eb611a9d242c1d26847d223b8677609305b</id>
<content type='text'>
The "argc" and "argv" names made sense when the struct was argv_array,
but now they're just confusing. Let's rename them to "nr" (which we use
for counts elsewhere) and "v" (which is rather terse, but reads well
when combined with typical variable names like "args.v").

Note that we have to update all of the callers immediately. Playing
tricks with the preprocessor is hard here, because we wouldn't want to
rewrite unrelated tokens.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>strvec: fix indentation in renamed calls</title>
<updated>2020-07-28T22:02:18Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-07-28T20:26:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f6d8942b1fc6c968980c8ae03054d7b2114b4415'/>
<id>urn:sha1:f6d8942b1fc6c968980c8ae03054d7b2114b4415</id>
<content type='text'>
Code which split an argv_array call across multiple lines, like:

  argv_array_pushl(&amp;args, "one argument",
                   "another argument", "and more",
		   NULL);

was recently mechanically renamed to use strvec, which results in
mis-matched indentation like:

  strvec_pushl(&amp;args, "one argument",
                   "another argument", "and more",
		   NULL);

Let's fix these up to align the arguments with the opening paren. I did
this manually by sifting through the results of:

  git jump grep 'strvec_.*,$'

and liberally applying my editor's auto-format. Most of the changes are
of the form shown above, though I also normalized a few that had
originally used a single-tab indentation (rather than our usual style of
aligning with the open paren). I also rewrapped a couple of obvious
cases (e.g., where previously too-long lines became short enough to fit
on one), but I wasn't aggressive about it. In cases broken to three or
more lines, the grouping of arguments is sometimes meaningful, and it
wasn't worth my time or reviewer time to ponder each case individually.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>strvec: convert more callers away from argv_array name</title>
<updated>2020-07-28T22:02:18Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-07-28T20:24:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ef8d7ac42a6a62d678166fe25ea743315809d2bb'/>
<id>urn:sha1:ef8d7ac42a6a62d678166fe25ea743315809d2bb</id>
<content type='text'>
We eventually want to drop the argv_array name and just use strvec
consistently. There's no particular reason we have to do it all at once,
or care about interactions between converted and unconverted bits.
Because of our preprocessor compat layer, the names are interchangeable
to the compiler (so even a definition and declaration using different
names is OK).

This patch converts remaining files from the first half of the alphabet,
to keep the diff to a manageable size.

The conversion was done purely mechanically with:

  git ls-files '*.c' '*.h' |
  xargs perl -i -pe '
    s/ARGV_ARRAY/STRVEC/g;
    s/argv_array/strvec/g;
  '

and then selectively staging files with "git add '[abcdefghjkl]*'".
We'll deal with any indentation/style fallouts separately.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'bc/sha-256-part-2'</title>
<updated>2020-07-07T05:09:13Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-07-07T05:09:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=12210859da0c16c644dab658d9e1ef671ec28788'/>
<id>urn:sha1:12210859da0c16c644dab658d9e1ef671ec28788</id>
<content type='text'>
SHA-256 migration work continues.

* bc/sha-256-part-2: (44 commits)
  remote-testgit: adapt for object-format
  bundle: detect hash algorithm when reading refs
  t5300: pass --object-format to git index-pack
  t5704: send object-format capability with SHA-256
  t5703: use object-format serve option
  t5702: offer an object-format capability in the test
  t/helper: initialize the repository for test-sha1-array
  remote-curl: avoid truncating refs with ls-remote
  t1050: pass algorithm to index-pack when outside repo
  builtin/index-pack: add option to specify hash algorithm
  remote-curl: detect algorithm for dumb HTTP by size
  builtin/ls-remote: initialize repository based on fetch
  t5500: make hash independent
  serve: advertise object-format capability for protocol v2
  connect: parse v2 refs with correct hash algorithm
  connect: pass full packet reader when parsing v2 refs
  Documentation/technical: document object-format for protocol v2
  t1302: expect repo format version 1 for SHA-256
  builtin/show-index: provide options to determine hash algo
  t5302: modernize test formatting
  ...
</content>
</entry>
<entry>
<title>serve: advertise object-format capability for protocol v2</title>
<updated>2020-05-27T17:07:07Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2020-05-25T19:59:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9de0dd361c9ea2ca6eca14a7dd43fe11d170a253'/>
<id>urn:sha1:9de0dd361c9ea2ca6eca14a7dd43fe11d170a253</id>
<content type='text'>
In order to communicate the protocol supported by the server side, add
support for advertising the object-format capability.  We check that the
client side sends us an identical algorithm if it sends us its own
object-format capability, and assume it speaks SHA-1 if not.

In the test, when we're using an algorithm other than SHA-1, we need to
specify the algorithm in use so we don't get a failure with an "unknown
format" message.  Add a test that we handle a mismatched algorithm.
Remove the test_oid_init call since it's no longer necessary.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>connect: parse v2 refs with correct hash algorithm</title>
<updated>2020-05-27T17:07:07Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2020-05-25T19:59:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ab67235bc4900d8203a1a6b6f33cf8afa845e43e'/>
<id>urn:sha1:ab67235bc4900d8203a1a6b6f33cf8afa845e43e</id>
<content type='text'>
When using protocol v2, we need to know what hash algorithm is used by
the remote end.  See if the server has sent us an object-format
capability, and if so, use it to determine the hash algorithm in use and
set that value in the packet reader.  Parse the refs using this
algorithm.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
