<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/compat/precompose_utf8.c, branch v2.26.1</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.26.1</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.26.1'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2019-01-31T18:27:52Z</updated>
<entry>
<title>Support working-tree-encoding "UTF-16LE-BOM"</title>
<updated>2019-01-31T18:27:52Z</updated>
<author>
<name>Torsten Bögershausen</name>
<email>tboegi@web.de</email>
</author>
<published>2019-01-30T15:01:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=aab2a1ae48ff65781a5379a01a4abb4f75e5641d'/>
<id>urn:sha1:aab2a1ae48ff65781a5379a01a4abb4f75e5641d</id>
<content type='text'>
Users who want UTF-16 files in the working tree set the .gitattributes
like this:
test.txt working-tree-encoding=UTF-16

The Unicode standard itself defines 3 allowed ways to encode UTF-16.
The following 3 versions all convert back to 'g' 'i' 't' in UTF-8:

a) UTF-16, without BOM, big endian:
$ printf "\000g\000i\000t" | iconv -f UTF-16 -t UTF-8 | od -c
0000000    g   i   t

b) UTF-16, with BOM, little endian:
$ printf "\377\376g\000i\000t\000" | iconv -f UTF-16 -t UTF-8 | od -c
0000000    g   i   t

c) UTF-16, with BOM, big endian:
$ printf "\376\377\000g\000i\000t" | iconv -f UTF-16 -t UTF-8 | od -c
0000000    g   i   t
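
The three variants (a), (b) and (c) can also be checked outside of iconv.
What follows is a minimal Python sketch, not anything Git itself runs.
decode_utf16 is a hypothetical helper, and its big endian default for
BOM-less input is an assumption chosen to mirror the iconv run in (a):

```python
import codecs

def decode_utf16(data):
    """Decode UTF-16 bytes, honouring a BOM and defaulting to big endian."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[2:].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[2:].decode("utf-16-be")
    # No BOM: assume big endian, as the iconv run in (a) does.
    return data.decode("utf-16-be")

# The same byte sequences as the printf invocations above.
variants = {
    "a": b"\x00g\x00i\x00t",          # no BOM, big endian
    "b": b"\xff\xfeg\x00i\x00t\x00",  # BOM, little endian
    "c": b"\xfe\xff\x00g\x00i\x00t",  # BOM, big endian
}

for name, raw in variants.items():
    print(name, decode_utf16(raw))
```

All three inputs decode to the same three characters; only the byte layout
on disk differs.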

Git uses libiconv to convert from UTF-8 in the index into UTF-16 in the
working tree.
After a checkout, the resulting file has a BOM and is encoded in "UTF-16",
in the version (c) above.
This is what iconv generates; more details follow below.

iconv (and libiconv) can generate UTF-16, UTF-16LE or UTF-16BE:

d) UTF-16
$ printf 'git' | iconv -f UTF-8 -t UTF-16 | od -c
0000000  376 377  \0   g  \0   i  \0   t

e) UTF-16LE
$ printf 'git' | iconv -f UTF-8 -t UTF-16LE | od -c
0000000    g  \0   i  \0   t  \0

f)  UTF-16BE
$ printf 'git' | iconv -f UTF-8 -t UTF-16BE | od -c
0000000   \0   g  \0   i  \0   t

There is no way to generate version (b) from above in a Git working tree,
but that is what some applications need.
(All fully Unicode-aware applications should be able to read all 3 variants,
but in practice we are not there yet).

When producing UTF-16 as an output, iconv generates the big endian version
with a BOM. (big endian is probably chosen for historical reasons).

iconv can produce UTF-16 files with little endianness by using "UTF-16LE"
as the encoding, and that file does not have a BOM.

Not all users (especially under Windows) are happy with this.
Some tools are not fully unicode aware and can only handle version (b).

Today there is no way to produce version (b) with iconv (or libiconv).
Looking into the history of iconv, it seems as if version (c) will
be used in all future iconv versions (for compatibility reasons).

Solve this dilemma and introduce a Git-specific "UTF-16LE-BOM".
libiconv cannot handle the encoding, so Git picks it up, handles the BOM
and uses libiconv to convert the rest of the stream.
(UTF-16BE-BOM is added for consistency)
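
The idea of handling the BOM separately from the converter can be sketched
in a few lines of Python. This is only an illustration under assumptions:
the function names are hypothetical, and Python's "utf-16-le" and
"utf-16-be" codecs stand in for the BOM-less libiconv conversions:

```python
import codecs

def encode_utf16le_bom(text):
    # Hypothetical helper: write the BOM ourselves, then let the
    # BOM-less little endian codec convert the rest of the stream.
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")

def encode_utf16be_bom(text):
    # Same approach for the big endian counterpart.
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")

print(encode_utf16le_bom("git"))
print(encode_utf16be_bom("git"))
```

The first result has the byte layout of version (b); the second has the
layout of version (c).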

Reported-by: Adrián Gimeno Balaguer &lt;adrigibal@gmail.com&gt;
Signed-off-by: Torsten Bögershausen &lt;tboegi@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>config: don't include config.h by default</title>
<updated>2017-06-15T19:56:22Z</updated>
<author>
<name>Brandon Williams</name>
<email>bmwill@google.com</email>
</author>
<published>2017-06-14T18:07:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b2141fc1d20e659810245ec6ca1c143c60e033ec'/>
<id>urn:sha1:b2141fc1d20e659810245ec6ca1c143c60e033ec</id>
<content type='text'>
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams &lt;bmwill@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>typofix: assorted typofixes in comments, documentation and messages</title>
<updated>2016-05-06T20:16:37Z</updated>
<author>
<name>Li Peng</name>
<email>lip@dtdream.com</email>
</author>
<published>2016-05-06T12:36:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=832c0e5e63a0f61c3788847d4a7abb82d9e86ef4'/>
<id>urn:sha1:832c0e5e63a0f61c3788847d4a7abb82d9e86ef4</id>
<content type='text'>
Many instances of duplicate words (e.g. "the the path") and
a few typos are fixed, originally in multiple patches.

    wildmatch: fix duplicate words of "the"
    t: fix duplicate words of "output"
    transport-helper: fix duplicate words of "read"
    Git.pm: fix duplicate words of "return"
    path: fix duplicate words of "look"
    pack-protocol.txt: fix duplicate words of "the"
    precompose-utf8: fix typo of "sequences"
    split-index: fix typo
    worktree.c: fix typo
    remote-ext: fix typo
    utf8: fix duplicate words of "the"
    git-cvsserver: fix duplicate words

Signed-off-by: Li Peng &lt;lip@dtdream.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>config: rename git_config_set_or_die to git_config_set</title>
<updated>2016-02-22T18:23:55Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2016-02-22T11:23:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3d1806487af395fb33d1de92633e96571b296305'/>
<id>urn:sha1:3d1806487af395fb33d1de92633e96571b296305</id>
<content type='text'>
Rename git_config_set_or_die functions to git_config_set, leading
to the new default behavior of dying whenever a configuration
error occurs.

By now all callers that shall die on error have been transitioned
to the _or_die variants, thus making this patch a simple rename
of the functions.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>compat: die when unable to set core.precomposeunicode</title>
<updated>2016-02-22T18:23:54Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2016-02-22T11:23:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2f29c1bf34ec12c24072bb54a2c009bd1f17a2ee'/>
<id>urn:sha1:2f29c1bf34ec12c24072bb54a2c009bd1f17a2ee</id>
<content type='text'>
When calling `git_config_set` to set 'core.precomposeunicode' we
ignore the return value of the function, which may indicate that
we were unable to write the value back to disk. As the function
is only called by init-db we can and should die when an error
occurs.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>probe_utf8_pathname_composition: use internal strbuf</title>
<updated>2015-10-05T18:06:49Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2015-10-05T03:45:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=fdf729661a777d8bd598f40055d92b2df5601332'/>
<id>urn:sha1:fdf729661a777d8bd598f40055d92b2df5601332</id>
<content type='text'>
When we are initializing a .git directory, we may call
probe_utf8_pathname_composition to detect utf8 mangling. We
pass in a path buffer for it to use, and it blindly
strcpy()s into it, not knowing whether the buffer is large
enough to hold the result or not.

In practice this isn't a big deal, because the buffer we
pass in already contains "$GIT_DIR/config", and we append
only a few extra bytes to it. But we can easily do the right
thing just by calling git_path_buf ourselves. Technically
this results in a different pathname (before we appended our
utf8 characters to the "config" path, and now they get their
own files in $GIT_DIR), but that should not matter for our
purposes.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>precompose_utf8: drop unused variable</title>
<updated>2015-10-05T18:05:51Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2015-10-05T03:43:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e2b021eb5b5ce2f4be3c10f1f8063981b1a53053'/>
<id>urn:sha1:e2b021eb5b5ce2f4be3c10f1f8063981b1a53053</id>
<content type='text'>
The result of iconv is assigned to a variable, but we never
use it (instead, we check errno and whether the function
consumed all bytes). Let's drop the assignment, as it
triggers gcc's -Wunused-but-set-variable.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Set core.precomposeunicode to true on e.g. HFS+</title>
<updated>2013-08-27T14:41:32Z</updated>
<author>
<name>Torsten Bögershausen</name>
<email>tboegi@web.de</email>
</author>
<published>2013-08-27T13:50:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=92b0c8bed0d3f6ed5442e3ffa178413772faa31b'/>
<id>urn:sha1:92b0c8bed0d3f6ed5442e3ffa178413772faa31b</id>
<content type='text'>
When core.precomposeunicode was introduced in 76759c7d,
it was set to false on a unicode decomposing file system like HFS+
to be compatible with older versions of Git.

Mac OS users need to find out that this configuration exists
and change it manually from false to true.

A smoother workflow can be achieved,
so set core.precomposeunicode to true on a decomposing file system.
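
For readers unfamiliar with decomposition, here is a rough Python sketch of
the difference between a precomposed and a decomposed name. It assumes that
standard NFC/NFD normalization is a close enough stand-in for what a
decomposing file system does (HFS+ uses its own variant), and the sample
name is chosen only for illustration:

```python
import unicodedata

# Precomposed form: the o with diaeresis is one code point (U+00F6).
precomposed = "B\u00f6gershausen"

# Decomposed form: a plain "o" followed by a combining diaeresis (U+0308).
decomposed = unicodedata.normalize("NFD", precomposed)

print(len(precomposed), len(decomposed))
print(precomposed == decomposed)

# Recomposing maps the decomposed name back to the precomposed one.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

The two strings look identical when displayed but compare as different
until one of them is normalized.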

Signed-off-by: Torsten Bögershausen &lt;tboegi@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>utf8.c: add reencode_string_len() that can handle NULs in string</title>
<updated>2013-04-18T23:28:28Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2013-04-18T23:08:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b782bbab94e3618aea352907caa77321b487b918'/>
<id>urn:sha1:b782bbab94e3618aea352907caa77321b487b918</id>
<content type='text'>
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>precompose-utf8: fix spelling of "want" in error message</title>
<updated>2013-04-12T19:24:04Z</updated>
<author>
<name>Stefano Lattarini</name>
<email>stefano.lattarini@gmail.com</email>
</author>
<published>2013-04-11T22:36:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0f7b4c2e77a353f027b9c9869b588145fb036520'/>
<id>urn:sha1:0f7b4c2e77a353f027b9c9869b588145fb036520</id>
<content type='text'>
Noticed using Lucas De Marchi's codespell tool.

Signed-off-by: Stefano Lattarini &lt;stefano.lattarini@gmail.com&gt;
Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Acked-by: Matthieu Moy &lt;Matthieu.Moy@imag.fr&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
