<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/hash.h, branch v2.45.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.45.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.45.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2023-10-02T21:57:39Z</updated>
<entry>
<title>cache: add a function to read an OID of a specific algorithm</title>
<updated>2023-10-02T21:57:39Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2023-10-02T02:40:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=095261a18d91cf47e8e8ef3b55f5e88e52ffc85f'/>
<id>urn:sha1:095261a18d91cf47e8e8ef3b55f5e88e52ffc85f</id>
<content type='text'>
Currently, we always read a object ID of the current algorithm with
oidread.  However, once we start converting objects, we'll need to
consider what happens when we want to read an object ID of a specific
algorithm, such as the compatibility algorithm.  To make this easier,
let's define oidread_algop, which specifies which algorithm we should
use for our object ID, and define oidread in terms of it.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hash-ll.h: split out of hash.h to remove dependency on repository.h</title>
<updated>2023-04-24T19:47:32Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-04-22T20:17:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d1cbe1e6d8a9cab2b4ffe8a17d34db214dce1e49'/>
<id>urn:sha1:d1cbe1e6d8a9cab2b4ffe8a17d34db214dce1e49</id>
<content type='text'>
hash.h depends upon and includes repository.h, due to the definition and
use of the_hash_algo (defined as the_repository-&gt;hash_algo).  However,
most headers trying to include hash.h are only interested in the layout
of the structs like object_id.  Move the parts of hash.h that do not
depend upon repository.h into a new file hash-ll.h (the "low level"
parts of hash.h), and adjust other files to use this new header where
the convenience inline functions aren't needed.

This allows hash.h and object.h to be fairly small, minimal headers.  It
also exposes a lot of hidden dependencies on both path.h (which was
brought in by repository.h) and repository.h (which was previously
implicitly brought in by object.h), so also adjust other files to be
more explicit about what they depend upon.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hash.h: move some oid-related declarations from cache.h</title>
<updated>2023-02-24T01:25:28Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-02-24T00:09:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=41227cb138c91fbd369ac6ee4877f253b39260cc'/>
<id>urn:sha1:41227cb138c91fbd369ac6ee4877f253b39260cc</id>
<content type='text'>
These defines and enum are all oid-related and as such seem to make
more sense being included in hash.h.  Further, moving them there
allows us to remove some includes of cache.h in other files.

The change to line-log.h might look unrelated, but line-log.h includes
diffcore.h, which previously included cache.h, which included the
kitchen sink.  Since this patch makes diffcore.h no longer include
cache.h, the compiler complains about the 'struct string_list *'
function parameter.  Add a forward declaration for struct string_list to
address this.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>treewide: remove unnecessary git-compat-util.h includes in headers</title>
<updated>2023-02-24T01:25:28Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-02-24T00:09:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f332121e75d3aa2b0ce7efd120ac3ede19e9a733'/>
<id>urn:sha1:f332121e75d3aa2b0ce7efd120ac3ede19e9a733</id>
<content type='text'>
For sanity, we should probably do one of the following:

(a) make C and header files both depend upon everything they need
(b) consistently exclude git-compat-util.h from headers and require it
    be the first include in C files

Currently, we have some of the headers following (a) and others
following (b), which makes things messy.  In the past I was pushed
towards (b), as per [1] and [2].  Further, during this series I
discovered that this mixture empirically will mean that we end up with C
files that do not directly include git-compat-util.h, and do include
headers that don't include git-compat-util.h, with the result that we
likely have headers included before an indirect inclusion of
git-compat-util.h.  Since git-compat-util.h has tricky platform-specific
stuff that is meant to be included before everything else, this state of
affairs is risky and may lead to things breaking in subtle ways (and
only on some platforms) as per [1] and [2].

Since including git-compat-util.h in existing header files makes it
harder for us to catch C files that are missing that include, let's
switch to (b) to make the enforcement of this rule easier.  Remove the
inclusion of git-compat-util.h from header files other than the ones
that have been approved as alternate first includes.

[1] https://lore.kernel.org/git/20180811173406.GA9119@sigill.intra.peff.net/
[2] https://lore.kernel.org/git/20180811174301.GA9287@sigill.intra.peff.net/

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Makefile + hash.h: remove PPC_SHA1 implementation</title>
<updated>2022-08-31T21:37:31Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2022-08-31T09:18:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9dc523aa0e20271cfe1474bef9fafbe62b7ff603'/>
<id>urn:sha1:9dc523aa0e20271cfe1474bef9fafbe62b7ff603</id>
<content type='text'>
Remove the PPC_SHA1 implementation added in a6ef3518f9a ([PATCH] PPC
assembly implementation of SHA1, 2005-04-22). When this was added
Apple consumer hardware used the PPC architecture, and the
implementation was intended to improve SHA-1 speed there.

Since it was added we've moved to using sha1collisiondetection by
default, and anyone wanting hard-rolled non-DC SHA-1 implementation
can use OpenSSL's via the OPENSSL_SHA1 knob.

The PPC_SHA1 originally originally targeted 32 bit PPC, and later the
64 bit PPC 970 (a.k.a. Apple PowerPC G5). See 926172c5e48 (block-sha1:
improve code on large-register-set machines, 2009-08-10) for a
reference about the performance on G5 (a comment in block-sha1/sha1.c
being removed here).

I can't get it to do anything but segfault on both the BE and LE POWER
machines in the GCC compile farm[1]. Anyone who's concerned about
performance on PPC these days is likely to be using the IBM POWER
processors.

There have been proposals to entirely remove non-sha1collisiondetection
implementations from the tree[2]. I think per [3] that would be a bit
overzealous. I.e. there are various set-ups git's speed is going to be
more important than the relatively implausible SHA-1 collision attack,
or where such attacks are entirely mitigated by other means (e.g. by
incoming objects being checked with DC_SHA1).

But that really doesn't apply to PPC_SHA1 in particular, which seems
to have outlived its usefulness.

As this gets rid of the only in-tree *.S assembly file we can remove
the small bits of logic from the Makefile needed to build objects
from *.S (as opposed to *.c)

The code being removed here was also throwing warnings with the
"-pedantic" flag, it could have been fixed as 544d93bc3b4 (block-sha1:
remove use of obsolete x86 assembly, 2022-03-10) did for block-sha1/*,
but as noted above let's remove it instead.

1. https://cfarm.tetaneutral.net/machines/list/
   Tested on gcc{110,112,135,203}, a mixture of POWER [789] ppc64 and
   ppc64le. All segfault in anything needing object
   hashing (e.g. t/t1007-hash-object.sh) when compiled with
   PPC_SHA1=Y.
2. https://lore.kernel.org/git/20200223223758.120941-1-mh@glandium.org/
3. https://lore.kernel.org/git/20200224044732.GK1018190@coredump.intra.peff.net/

Acked-by: brian m. carlson" &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>sha256: add support for Nettle</title>
<updated>2022-07-10T21:43:34Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2022-07-10T13:29:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=e55573583685eda80dd33d85ab2cea59b86332c5'/>
<id>urn:sha1:e55573583685eda80dd33d85ab2cea59b86332c5</id>
<content type='text'>
For SHA-256, we currently have support for OpenSSL and libgcrypt because
these two libraries contain optimized implementations that can take
advantage of native processor instructions.  However, OpenSSL is not
suitable for linking against for Linux distros due to licensing
incompatibilities with the GPLv2, and libgcrypt has been less favored by
cryptographers due to some security-related implementation issues,
which, while not affecting our use of hash algorithms, has affected its
reputation.

Let's add another option that's compatible with the GPLv2, which is
Nettle.  This is an option which is generally better than libgcrypt
because on many distros GnuTLS (which uses Nettle) is used for HTTPS and
therefore as a practical matter it will be available on most systems.
As a result, prefer it over libgcrypt and our built-in implementation.

Nettle also has recently gained support for Intel's SHA-NI instructions,
which compare very favorably to other implementations, as well as
assembly implementations for when SHA-NI is not available.

A git gc on git.git sees a 12% performance improvement with Nettle over
our block SHA-256 implementation due to general assembly improvements.
With SHA-NI, the performance of raw SHA-256 on a 2 GiB file goes from
7.296 seconds with block SHA-256 to 1.523 seconds with Nettle.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hash.h: provide constants for the hash IDs</title>
<updated>2021-09-07T17:47:43Z</updated>
<author>
<name>Han-Wen Nienhuys</name>
<email>hanwen@google.com</email>
</author>
<published>2021-08-30T14:57:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=27f3796ac1f9b711fae506f63186034ed178690c'/>
<id>urn:sha1:27f3796ac1f9b711fae506f63186034ed178690c</id>
<content type='text'>
This will simplify referencing them from code that is not deeply integrated with
Git, in particular, the reftable library.

Signed-off-by: Han-Wen Nienhuys &lt;hanwen@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>oidtree: avoid unaligned access to crit-bit tree</title>
<updated>2021-08-15T20:13:50Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2021-08-14T20:00:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8bcda98da5db9003145674d8a7cbe7b81cab0d24'/>
<id>urn:sha1:8bcda98da5db9003145674d8a7cbe7b81cab0d24</id>
<content type='text'>
The flexible array member "k" of struct cb_node is used to store the key
of the crit-bit tree node.  It offers no alignment guarantees -- in fact
the current struct layout puts it one byte after a 4-byte aligned
address, i.e. guaranteed to be misaligned.

oidtree uses a struct object_id as cb_node key.  Since cf0983213c (hash:
add an algo member to struct object_id, 2021-04-26) it requires 4-byte
alignment.  The mismatch is reported by UndefinedBehaviorSanitizer at
runtime like this:

hash.h:277:2: runtime error: member access within misaligned address 0x00015000802d for type 'struct object_id', which requires 4 byte alignment
0x00015000802d: note: pointer points here
 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00
             ^
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior hash.h:277:2 in

We can fix that by:

1. eliminating the alignment requirement of struct object_id,
2. providing the alignment in struct cb_node, or
3. avoiding the issue by only using memcpy to access "k".

Currently we only store one of two values in "algo" in struct object_id.
We could use a uint8_t for that instead and widen it only once we add
support for our twohundredth algorithm or so.  That would not only avoid
alignment issues, but also reduce the memory requirements for each
instance of struct object_id by ca. 9%.

Supporting keys with alignment requirements might be useful to spread
the use of crit-bit trees.  It can be achieved by using a wider type for
"k" (e.g. uintmax_t), using different types for the members "byte" and
"otherbits" (e.g. uint16_t or uint32_t for each), or by avoiding the use
of flexible arrays like khash.h does.

This patch implements the third option, though, because it has the least
potential for causing side-effects and we're close to the next release.
If one of the other options is implemented later as well to get their
additional benefits we can get rid of the extra copies introduced here.

Reported-by: Andrzej Hunt &lt;andrzej@ahunt.org&gt;
Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>oidcpy_with_padding: constify `src' arg</title>
<updated>2021-07-08T04:28:02Z</updated>
<author>
<name>Eric Wong</name>
<email>e@80x24.org</email>
</author>
<published>2021-07-07T23:10:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=90e07f0a342df836a33b92e179eb105243dba88d'/>
<id>urn:sha1:90e07f0a342df836a33b92e179eb105243dba88d</id>
<content type='text'>
As with `oidcpy', the source struct will not be modified and
this will allow an upcoming const-correct caller to use it.

Signed-off-by: Eric Wong &lt;e@80x24.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>parallel-checkout: send the new object_id algo field to the workers</title>
<updated>2021-05-17T20:38:54Z</updated>
<author>
<name>Matheus Tavares</name>
<email>matheus.bernardino@usp.br</email>
</author>
<published>2021-05-17T19:49:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3d20ed27b8bdac5c82f4f78af802f9afa499f651'/>
<id>urn:sha1:3d20ed27b8bdac5c82f4f78af802f9afa499f651</id>
<content type='text'>
An object_id storing a SHA-1 name has some unused bytes at the end of
the hash array. Since these bytes are not used, they are usually not
initialized to any value either. However, at
parallel_checkout.c:send_one_item() the object_id of a cache entry is
copied into a buffer which is later sent to a checkout worker through a
pipe write(). This makes Valgrind complain about passing uninitialized
bytes to a syscall. The worker won't use these uninitialized bytes
either, but the warning could confuse someone trying to debug this code;
So instead of using oidcpy(), send_one_item() uses hashcpy() to only
copy the used/initialized bytes of the object_id, and leave the
remaining part with zeros.

However, since cf0983213c ("hash: add an algo member to struct
object_id", 2021-04-26), using hashcpy() is no longer sufficient here as
it won't copy the new algo field from the object_id. Let's add and use a
new function which meets both our requirements of copying all the
important object_id data while still avoiding the uninitialized bytes,
by padding the end of the hash array in the destination object_id. With
this change, we also no longer need the destination buffer from
send_one_item() to be initialized with zeros, so let's switch from
xcalloc() to xmalloc() to make this clear.

Signed-off-by: Matheus Tavares &lt;matheus.bernardino@usp.br&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
