<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/utf8.h, branch jch</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=jch</id>
<link rel='self' href='https://git.shady.money/git/atom?h=jch'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2024-10-02T22:52:48Z</updated>
<entry>
<title>utf8.h: squelch unused-parameter warnings with NO_ICONV</title>
<updated>2024-10-02T22:52:48Z</updated>
<author>
<name>Mike Hommey</name>
<email>mh@glandium.org</email>
</author>
<published>2024-10-02T20:01:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e03b2a2105e7969ebc804c519b0a356e20a8a8d7'/>
<id>urn:sha1:e03b2a2105e7969ebc804c519b0a356e20a8a8d7</id>
<content type='text'>
Since DEVELOPER=YesPlease build enables -Wunused-parameter warnings
these days, the fallback definition for reencode_string_len() that
did not touch any of its parameters but one needs to be annotated
properly.

Signed-off-by: Mike Hommey &lt;mh@glandium.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>doc: switch links to https</title>
<updated>2023-11-26T01:07:05Z</updated>
<author>
<name>Josh Soref</name>
<email>jsoref@gmail.com</email>
</author>
<published>2023-11-24T03:35:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d05b08cd52cfda627f1d865bdfe6040a2c9521b5'/>
<id>urn:sha1:d05b08cd52cfda627f1d865bdfe6040a2c9521b5</id>
<content type='text'>
These sites offer https versions of their content.
Using the https versions provides some protection for users.

Signed-off-by: Josh Soref &lt;jsoref@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Sync with Git 2.31.6</title>
<updated>2022-12-13T12:09:40Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2022-12-13T12:09:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8a755eddf5bf256613bc584f32cd44401a25897c'/>
<id>urn:sha1:8a755eddf5bf256613bc584f32cd44401a25897c</id>
<content type='text'>
</content>
</entry>
<entry>
<title>utf8: fix truncated string lengths in `utf8_strnwidth()`</title>
<updated>2022-12-09T05:26:21Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2022-12-01T14:46:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=522cc87fdc25449222a5894a428eebf4b8d5eaa9'/>
<id>urn:sha1:522cc87fdc25449222a5894a428eebf4b8d5eaa9</id>
<content type='text'>
The `utf8_strnwidth()` function accepts an optional string length as
input parameter. This parameter can either be set to `-1`, in which case
we call `strlen()` on the input. Or it can be set to a positive integer
that indicates a precomputed length, which callers typically compute by
calling `strlen()` at some point themselves.

The input parameter is an `int` though, whereas `strlen()` returns a
`size_t`. This can lead to implementation-defined behaviour though when
the `size_t` cannot be represented by the `int`. In the general case
though this leads to wrap-around and thus to negative string sizes,
which is sure enough to not lead to well-defined behaviour.

Fix this by accepting a `size_t` instead of an `int` as string length.
While this takes away the ability of callers to simply pass in `-1` as
string length, it really is trivial enough to convert them to instead
pass in `strlen()` instead.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>t0060: test ntfs/hfs-obscured dotfiles</title>
<updated>2021-05-04T02:52:02Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-05-03T20:43:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=801ed010bf13465bf67608beabbaa1ec2550204f'/>
<id>urn:sha1:801ed010bf13465bf67608beabbaa1ec2550204f</id>
<content type='text'>
We have tests that cover various filesystem-specific spellings of
".gitmodules", because we need to reliably identify that path for some
security checks. These are from dc2d9ba318 (is_{hfs,ntfs}_dotgitmodules:
add tests, 2018-05-12), with the actual code coming from e7cb0b4455
(is_ntfs_dotgit: match other .git files, 2018-05-11) and 0fc333ba20
(is_hfs_dotgit: match other .git files, 2018-05-02).

Those latter two commits also added similar matching functions for
.gitattributes and .gitignore. These ended up not being used in the
final series, and are currently dead code. But in preparation for them
being used in some fsck checks, let's make sure they actually work by
throwing a few basic tests at them. Likewise, let's cover .mailmap
(which does need matching code added).

I didn't bother with the whole battery of tests that we cover for
.gitmodules. These functions are all based on the same generic matcher,
so it's sufficient to test most of the corner cases just once.

Note that the ntfs magic prefix names in the tests come from the
algorithm described in e7cb0b4455 (and are different for each file).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>*.[ch]: remove extern from function declarations using spatch</title>
<updated>2019-05-05T06:20:06Z</updated>
<author>
<name>Denton Liu</name>
<email>liu.denton@gmail.com</email>
</author>
<published>2019-04-29T08:28:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=554544276a604c144df45efcb060c80aa322088c'/>
<id>urn:sha1:554544276a604c144df45efcb060c80aa322088c</id>
<content type='text'>
There has been a push to remove extern from function declarations.
Remove some instances of "extern" for function declarations which are
caught by Coccinelle. Note that Coccinelle has some difficulty with
processing functions with `__attribute__` or varargs so some `extern`
declarations are left behind to be dealt with in a future patch.

This was the Coccinelle patch used:

	@@
	type T;
	identifier f;
	@@
	- extern
	  T f(...);

and it was run with:

	$ git ls-files \*.{c,h} |
		grep -v ^compat/ |
		xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place

Files under `compat/` are intentionally excluded as some are directly
copied from external sources and we should avoid churning them as much
as possible.

Signed-off-by: Denton Liu &lt;liu.denton@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Support working-tree-encoding "UTF-16LE-BOM"</title>
<updated>2019-01-31T18:27:52Z</updated>
<author>
<name>Torsten Bögershausen</name>
<email>tboegi@web.de</email>
</author>
<published>2019-01-30T15:01:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=aab2a1ae48ff65781a5379a01a4abb4f75e5641d'/>
<id>urn:sha1:aab2a1ae48ff65781a5379a01a4abb4f75e5641d</id>
<content type='text'>
Users who want UTF-16 files in the working tree set the .gitattributes
like this:
test.txt working-tree-encoding=UTF-16

The unicode standard itself defines 3 allowed ways how to encode UTF-16.
The following 3 versions convert all back to 'g' 'i' 't' in UTF-8:

a) UTF-16, without BOM, big endian:
$ printf "\000g\000i\000t" | iconv -f UTF-16 -t UTF-8 | od -c
0000000    g   i   t

b) UTF-16, with BOM, little endian:
$ printf "\377\376g\000i\000t\000" | iconv -f UTF-16 -t UTF-8 | od -c
0000000    g   i   t

c) UTF-16, with BOM, big endian:
$ printf "\376\377\000g\000i\000t" | iconv -f UTF-16 -t UTF-8 | od -c
0000000    g   i   t

Git uses libiconv to convert from UTF-8 in the index into ITF-16 in the
working tree.
After a checkout, the resulting file has a BOM and is encoded in "UTF-16",
in the version (c) above.
This is what iconv generates, more details follow below.

iconv (and libiconv) can generate UTF-16, UTF-16LE or UTF-16BE:

d) UTF-16
$ printf 'git' | iconv -f UTF-8 -t UTF-16 | od -c
0000000  376 377  \0   g  \0   i  \0   t

e) UTF-16LE
$ printf 'git' | iconv -f UTF-8 -t UTF-16LE | od -c
0000000    g  \0   i  \0   t  \0

f)  UTF-16BE
$ printf 'git' | iconv -f UTF-8 -t UTF-16BE | od -c
0000000   \0   g  \0   i  \0   t

There is no way to generate version (b) from above in a Git working tree,
but that is what some applications need.
(All fully unicode aware applications should be able to read all 3 variants,
but in practise we are not there yet).

When producing UTF-16 as an output, iconv generates the big endian version
with a BOM. (big endian is probably chosen for historical reasons).

iconv can produce UTF-16 files with little endianess by using "UTF-16LE"
as encoding, and that file does not have a BOM.

Not all users (especially under Windows) are happy with this.
Some tools are not fully unicode aware and can only handle version (b).

Today there is no way to produce version (b) with iconv (or libiconv).
Looking into the history of iconv, it seems as if version (c) will
be used in all future iconv versions (for compatibility reasons).

Solve this dilemma and introduce a Git-specific "UTF-16LE-BOM".
libiconv can not handle the encoding, so Git pick it up, handles the BOM
and uses libiconv to convert the rest of the stream.
(UTF-16BE-BOM is added for consistency)

Rported-by: Adrián Gimeno Balaguer &lt;adrigibal@gmail.com&gt;
Signed-off-by: Torsten Bögershausen &lt;tboegi@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'en/incl-forward-decl'</title>
<updated>2018-08-20T19:41:32Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-20T19:41:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5ade03446430f1190fdf1eca8b8fde5f63367110'/>
<id>urn:sha1:5ade03446430f1190fdf1eca8b8fde5f63367110</id>
<content type='text'>
Code hygiene improvement for the header files.

* en/incl-forward-decl:
  Remove forward declaration of an enum
  compat/precompose_utf8.h: use more common include guard style
  urlmatch.h: fix include guard
  Move definition of enum branch_track from cache.h to branch.h
  alloc: make allocate_alloc_state and clear_alloc_state more consistent
  Add missing includes and forward declarations
</content>
</entry>
<entry>
<title>Add missing includes and forward declarations</title>
<updated>2018-08-15T18:52:09Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2018-08-15T17:54:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ef3ca95475ce467ae883cc8175ed40e6f7d27800'/>
<id>urn:sha1:ef3ca95475ce467ae883cc8175ed40e6f7d27800</id>
<content type='text'>
I looped over the toplevel header files, creating a temporary two-line C
program for each consisting of
  #include "git-compat-util.h"
  #include $HEADER
This patch is the result of manually fixing errors in compiling those
tiny programs.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>reencode_string: use size_t for string lengths</title>
<updated>2018-07-24T17:19:29Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2018-07-24T10:50:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c7d017d7e1cca37ca20f73c11fa9f1b319a2c3a5'/>
<id>urn:sha1:c7d017d7e1cca37ca20f73c11fa9f1b319a2c3a5</id>
<content type='text'>
The iconv interface takes a size_t, which is the appropriate
type for an in-memory buffer. But our reencode_string_*
functions use integers, meaning we may get confusing results
when the sizes exceed INT_MAX. Let's use size_t
consistently.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
