<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/utf8.h, branch v2.18.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.18.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.18.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2018-05-29T08:10:05Z</updated>
<entry>
<title>Sync with Git 2.17.1</title>
<updated>2018-05-29T08:10:05Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-05-29T08:09:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7913f53b5628997165e075008d6142da1c04271a'/>
<id>urn:sha1:7913f53b5628997165e075008d6142da1c04271a</id>
<content type='text'>
* maint: (25 commits)
  Git 2.17.1
  Git 2.16.4
  Git 2.15.2
  Git 2.14.4
  Git 2.13.7
  fsck: complain when .gitmodules is a symlink
  index-pack: check .gitmodules files with --strict
  unpack-objects: call fsck_finish() after fscking objects
  fsck: call fsck_finish() after fscking objects
  fsck: check .gitmodules content
  fsck: handle promisor objects in .gitmodules check
  fsck: detect gitmodules files
  fsck: actually fsck blob data
  fsck: simplify ".git" check
  index-pack: make fsck error message more specific
  verify_path: disallow symlinks in .gitmodules
  update-index: stat updated files earlier
  verify_dotfile: mention case-insensitivity in comment
  verify_path: drop clever fallthrough
  skip_prefix: add case-insensitive variant
  ...
</content>
</entry>
<entry>
<title>is_hfs_dotgit: match other .git files</title>
<updated>2018-05-22T03:50:11Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2018-05-02T19:23:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0fc333ba20b43a8afee5023e92cb3384ff4e59a6'/>
<id>urn:sha1:0fc333ba20b43a8afee5023e92cb3384ff4e59a6</id>
<content type='text'>
Both verify_path() and fsck match ".git", ".GIT", and other
variants specific to HFS+. Let's allow matching other
special files like ".gitmodules", which we'll later use to
enforce extra restrictions via verify_path() and fsck.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
</content>
</entry>
<entry>
<title>utf8: add function to detect a missing UTF-16/32 BOM</title>
<updated>2018-04-16T02:40:56Z</updated>
<author>
<name>Lars Schneider</name>
<email>larsxschneider@gmail.com</email>
</author>
<published>2018-04-15T18:16:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=c6e48652f69f6955bbbb423100e0df2a49467db8'/>
<id>urn:sha1:c6e48652f69f6955bbbb423100e0df2a49467db8</id>
<content type='text'>
If the endianness is not defined in the encoding name, then let's
be strict and require a BOM to avoid any encoding confusion. The
is_missing_required_utf_bom() function returns true if a required BOM
is missing.
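
A minimal sketch of such a check, for illustration only. The helper
names, the signature and the exact-match comparison of the encoding
name are assumptions, not necessarily what this commit adds.

<![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* U+FEFF encoded as UTF-16 and UTF-32, in both byte orders. */
static const unsigned char utf16_be_bom[] = { 0xFE, 0xFF };
static const unsigned char utf16_le_bom[] = { 0xFF, 0xFE };
static const unsigned char utf32_be_bom[] = { 0x00, 0x00, 0xFE, 0xFF };
static const unsigned char utf32_le_bom[] = { 0xFF, 0xFE, 0x00, 0x00 };

/* Return 1 if the len bytes at data start with the given BOM. */
int starts_with_bom(const char *data, size_t len,
		    const unsigned char *bom, size_t bom_len)
{
	return data && len >= bom_len && !memcmp(data, bom, bom_len);
}

/*
 * Hypothetical helper: return 1 if enc names UTF-16 or UTF-32 without
 * an explicit byte order and data does not start with a matching BOM.
 */
int sketch_missing_required_bom(const char *enc, const char *data, size_t len)
{
	if (!strcmp(enc, "UTF-16"))
		return !(starts_with_bom(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
			 starts_with_bom(data, len, utf16_le_bom, sizeof(utf16_le_bom)));
	if (!strcmp(enc, "UTF-32"))
		return !(starts_with_bom(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
			 starts_with_bom(data, len, utf32_le_bom, sizeof(utf32_le_bom)));
	/* byte order is part of the name, or this is not UTF-16/32 */
	return 0;
}
```
]]>

With this sketch, "UTF-16" data without a leading BOM is reported as
missing one, while "UTF-16LE" data is never reported.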

The Unicode standard says to assume big-endian if there is no BOM
for UTF-16/32 [1][2]. However, the W3C/WHATWG encoding standard used
in HTML5 recommends assuming little-endian to "deal with deployed
content" [3]. Strictly requiring a BOM seems to be the safest option
for content in Git.

This function is used in a subsequent commit.

[1] http://unicode.org/faq/utf_bom.html#gen6
[2] http://www.unicode.org/versions/Unicode10.0.0/ch03.pdf
     Section 3.10, D98, page 132
[3] https://encoding.spec.whatwg.org/#utf-16le

Signed-off-by: Lars Schneider &lt;larsxschneider@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>utf8: add function to detect prohibited UTF-16/32 BOM</title>
<updated>2018-04-16T02:40:56Z</updated>
<author>
<name>Lars Schneider</name>
<email>larsxschneider@gmail.com</email>
</author>
<published>2018-04-15T18:16:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=10ecb82e4f1f507d5f122e00fd4829b30953f853'/>
<id>urn:sha1:10ecb82e4f1f507d5f122e00fd4829b30953f853</id>
<content type='text'>
Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE
or UTF-32LE, a BOM must not be used [1]. The function returns true if
a data stream declared this way nevertheless starts with a BOM.
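
A minimal sketch of such a check, for illustration only. The helper
names, the signature and the exact-match comparison of the encoding
name are assumptions, not necessarily what this commit adds.

<![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the len bytes at data begin with the bom_len bytes at bom. */
int has_leading_bytes(const char *data, size_t len,
		      const char *bom, size_t bom_len)
{
	return data && len >= bom_len && !memcmp(data, bom, bom_len);
}

/*
 * Hypothetical helper: return 1 if enc fixes the byte order
 * (UTF-16BE, UTF-16LE, UTF-32BE or UTF-32LE) and data still starts
 * with a BOM of that width, in either byte order.
 */
int sketch_prohibited_bom(const char *enc, const char *data, size_t len)
{
	if (!strcmp(enc, "UTF-16BE") || !strcmp(enc, "UTF-16LE"))
		return has_leading_bytes(data, len, "\xFE\xFF", 2) ||
		       has_leading_bytes(data, len, "\xFF\xFE", 2);
	if (!strcmp(enc, "UTF-32BE") || !strcmp(enc, "UTF-32LE"))
		return has_leading_bytes(data, len, "\0\0\xFE\xFF", 4) ||
		       has_leading_bytes(data, len, "\xFF\xFE\0\0", 4);
	/* encodings without a fixed byte order may carry a BOM */
	return 0;
}
```
]]>

In this sketch a plain "UTF-16" stream is never flagged, because a BOM
is only prohibited when the name already states the byte order.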

This function is used in a subsequent commit.

[1] http://unicode.org/faq/utf_bom.html#bom10

Signed-off-by: Lars Schneider &lt;larsxschneider@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>typofix: assorted typofixes in comments, documentation and messages</title>
<updated>2016-05-06T20:16:37Z</updated>
<author>
<name>Li Peng</name>
<email>lip@dtdream.com</email>
</author>
<published>2016-05-06T12:36:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=832c0e5e63a0f61c3788847d4a7abb82d9e86ef4'/>
<id>urn:sha1:832c0e5e63a0f61c3788847d4a7abb82d9e86ef4</id>
<content type='text'>
Many instances of duplicate words (e.g. "the the path") and
a few typos are fixed, originally in multiple patches.

    wildmatch: fix duplicate words of "the"
    t: fix duplicate words of "output"
    transport-helper: fix duplicate words of "read"
    Git.pm: fix duplicate words of "return"
    path: fix duplicate words of "look"
    pack-protocol.txt: fix duplicate words of "the"
    precompose-utf8: fix typo of "sequences"
    split-index: fix typo
    worktree.c: fix typo
    remote-ext: fix typo
    utf8: fix duplicate words of "the"
    git-cvsserver: fix duplicate words

Signed-off-by: Li Peng &lt;lip@dtdream.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>utf8: add function to align a string into given strbuf</title>
<updated>2015-09-17T17:02:48Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2015-09-10T15:48:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=110dcda50d5ddaf3557666eea3b012a6ccc74dce'/>
<id>urn:sha1:110dcda50d5ddaf3557666eea3b012a6ccc74dce</id>
<content type='text'>
Add strbuf_utf8_align() which will align a given string into a strbuf
as per given align_type and width. If the string length is greater
than or equal to the width then no alignment is performed.

Helped-by: Eric Sunshine &lt;sunshine@sunshineco.com&gt;
Mentored-by: Christian Couder &lt;christian.couder@gmail.com&gt;
Mentored-by: Matthieu Moy &lt;matthieu.moy@grenoble-inp.fr&gt;
Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'es/utf8-stupid-compiler-workaround'</title>
<updated>2015-06-24T19:21:46Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2015-06-24T19:21:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5d24b109a64827708cbe98b865aba5d51a2f7c3b'/>
<id>urn:sha1:5d24b109a64827708cbe98b865aba5d51a2f7c3b</id>
<content type='text'>
A compilation workaround.

* es/utf8-stupid-compiler-workaround:
  utf8: NO_ICONV: silence uninitialized variable warning
</content>
</entry>
<entry>
<title>utf8: NO_ICONV: silence uninitialized variable warning</title>
<updated>2015-06-05T22:36:35Z</updated>
<author>
<name>Eric Sunshine</name>
<email>sunshine@sunshineco.com</email>
</author>
<published>2015-06-05T06:42:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e654eb29ab9da97f6acabc261f88aa1f41f78a8f'/>
<id>urn:sha1:e654eb29ab9da97f6acabc261f88aa1f41f78a8f</id>
<content type='text'>
The last argument of reencode_string_len() is an 'int *' which is
assigned the length of the converted string. When NO_ICONV is defined,
however, reencode_string_len() is stubbed out by the macro:

    #define reencode_string_len(a,b,c,d,e) NULL

which never assigns a value to the final argument. When called like
this:

    int n;
    char *s = reencode_string_len(..., &amp;n);
    if (s)
        do_something(s, n);

some compilers complain that 'n' is used uninitialized within the
conditional.
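
One way to avoid such a warning, sketched here under assumptions, is to
stub the function out with an inline function instead of a macro, so
that the final argument is always assigned. The name reencode_stub and
the parameter names and types are hypothetical.

<![CDATA[
```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the NO_ICONV stub. Unlike a macro that
 * expands to NULL, a function can store a value through the final
 * argument, so the caller's variable is never left uninitialized.
 */
static inline char *reencode_stub(const char *in, int insz,
				  const char *out_encoding,
				  const char *in_encoding, int *outsz)
{
	/* the inputs are unused because no conversion is available */
	(void)in;
	(void)insz;
	(void)out_encoding;
	(void)in_encoding;
	if (outsz)
		*outsz = 0;
	return NULL;
}
```
]]>

A caller that passes the address of 'n' then always sees it set to 0,
even though the returned pointer is NULL.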

Signed-off-by: Eric Sunshine &lt;sunshine@sunshineco.com&gt;
Reviewed-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>utf8-bom: introduce skip_utf8_bom() helper</title>
<updated>2015-04-16T18:35:06Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2015-04-16T17:45:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=dde843e7378f65004415bd108038659de9ce2abd'/>
<id>urn:sha1:dde843e7378f65004415bd108038659de9ce2abd</id>
<content type='text'>
With the recent change to ignore the UTF8 BOM at the beginning of
.gitignore files, we now have two codepaths that skip a BOM
(the other one is for reading the configuration files).

Introduce utf8_bom[] constant string and skip_utf8_bom() helper
and teach .gitignore code how to use it.
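
A minimal sketch of what such a helper could look like. The names
sketch_utf8_bom and sketch_skip_utf8_bom and the signature are
assumptions for illustration, not necessarily the ones introduced here.

<![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* The UTF-8 encoding of U+FEFF is the byte sequence EF BB BF. */
const char sketch_utf8_bom[] = "\xEF\xBB\xBF";

/*
 * Hypothetical helper: if the len bytes at *text begin with a UTF-8
 * BOM, advance *text past it and return 1; otherwise leave *text
 * alone and return 0.
 */
int sketch_skip_utf8_bom(const char **text, size_t len)
{
	size_t bom_len = strlen(sketch_utf8_bom);

	if (len < bom_len || memcmp(*text, sketch_utf8_bom, bom_len))
		return 0;
	*text += bom_len;
	return 1;
}
```
]]>

A caller reading a file into a buffer would call this once on the start
of the buffer and then parse from the possibly advanced pointer.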

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>utf8: add is_hfs_dotgit() helper</title>
<updated>2014-12-17T19:04:39Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2014-12-15T22:56:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6162a1d323d24fd8cbbb1a6145a91fb849b2568f'/>
<id>urn:sha1:6162a1d323d24fd8cbbb1a6145a91fb849b2568f</id>
<content type='text'>
We do not allow paths with a ".git" component to be added to
the index, as that would mean repository contents could
overwrite our repository files. However, asking "is this
path the same as .git" is not as simple as strcmp() on some
filesystems.

HFS+'s case-folding does more than just fold uppercase into
lowercase (which we already handle with strcasecmp). It may
also skip past certain "ignored" Unicode code points, so
that (for example) ".gi\u200ct" is mapped to ".git".

The full list of folds can be found in the tables at:

  https://www.opensource.apple.com/source/xnu/xnu-1504.15.3/bsd/hfs/hfscommon/Unicode/UCStringCompareData.h

Implementing a full "is this path the same as that path"
comparison would require us to import the whole set of
tables.  However, what we want to do is much simpler: we
only care about checking ".git". We know that 'G' is the
only thing that folds to 'g', and so on, so we really only
need to deal with the set of ignored code points, which is
much smaller.
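
The approach can be sketched as follows: skip ignored code points, then
compare what remains to ".git" case-insensitively. The function names
are hypothetical, and the set of ignored code points used here
(U+200C..U+200F, U+202A..U+202E, U+206A..U+206F and U+FEFF, matched
directly as UTF-8 byte sequences) is an assumption for illustration;
the tables linked above are the authoritative list.

<![CDATA[
```c
#include <stddef.h>
#include <ctype.h>

/*
 * Return the length in bytes of an ignored code point at s, or 0 if
 * s does not start with one. s must be NUL-terminated.
 */
size_t sketch_ignored_len(const unsigned char *s)
{
	/* U+200C..U+200F and U+202A..U+202E are E2 80 xx in UTF-8 */
	if (s[0] == 0xE2 && s[1] == 0x80 &&
	    ((s[2] >= 0x8C && s[2] <= 0x8F) || (s[2] >= 0xAA && s[2] <= 0xAE)))
		return 3;
	/* U+206A..U+206F are E2 81 AA..AF */
	if (s[0] == 0xE2 && s[1] == 0x81 && s[2] >= 0xAA && s[2] <= 0xAF)
		return 3;
	/* U+FEFF is EF BB BF */
	if (s[0] == 0xEF && s[1] == 0xBB && s[2] == 0xBF)
		return 3;
	return 0;
}

/* Hypothetical check: does the first path component fold to ".git"? */
int sketch_is_hfs_dotgit(const char *path)
{
	const unsigned char *s = (const unsigned char *)path;
	const char *want = ".git";

	for (;;) {
		size_t skip = sketch_ignored_len(s);

		if (skip) {
			s += skip;
			continue;
		}
		if (!*want)
			/* all of ".git" matched; the component must end here */
			return *s == '\0' || *s == '/';
		if (tolower(*s) != *want)
			return 0;
		s++;
		want++;
	}
}
```
]]>

Under these assumptions ".git", ".GIT" and ".gi\u200ct" all match,
while ".gitmodules" does not.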

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
