<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/hex.c, branch v2.45.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.45.4</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.45.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2023-09-29T22:14:56Z</updated>
<entry>
<title>hex-ll: separate out non-hash-algo functions</title>
<updated>2023-09-29T22:14:56Z</updated>
<author>
<name>Calvin Wan</name>
<email>calvinwan@google.com</email>
</author>
<published>2023-09-29T21:20:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d88e8106e892109f4f4b13312dd026d0a7fcfeec'/>
<id>urn:sha1:d88e8106e892109f4f4b13312dd026d0a7fcfeec</id>
<content type='text'>
In order to further reduce all-in-one headers, separate out functions in
hex.h that do not operate on object hashes into its own file, hex-ll.h,
and update the include directives in the .c files that need only such
functions accordingly.

Signed-off-by: Calvin Wan &lt;calvinwan@google.com&gt;
Signed-off-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hex: retire get_sha1_hex()</title>
<updated>2023-07-24T23:11:23Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-07-24T23:11:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=08e5fb1296238c9c4468ae2cfbd7a49045159c60'/>
<id>urn:sha1:08e5fb1296238c9c4468ae2cfbd7a49045159c60</id>
<content type='text'>
The naming convention around get_sha1_hex() and its friends is
awkward these days, after "struct object_id" was introduced.

There are three public functions around this area:

 * get_sha1_hex()       - use the implied the_hash_algo, fill uchar *
 * get_oid_hex()        - use the implied the_hash_algo, fill oid *
 * get_oid_hex_algop()  - use the passed algop, fill oid *

Between the latter two, the "_algop" suffix signals whether the
the_hash_algo is used as the implied algorithm or the caller should
pass an algorithm explicitly.  That is very much understandable and
is a good convention.

Between the former two, however, the "SHA1" vs "OID" in the names
differentiate in what type of variable the result is stored.

We could argue that it makes sense to use "SHA1" to mean "flat byte
buffer" to honor the historical practice in the days before "struct
object_id" was invented, but the natural fourth friend of the above
group would take an algop and fill a flat byte buffer, and it would
be strange to name it get_sha1_hex_algop().  Do we use the passed in
algo, or are we limited to SHA-1 ;-)?

In fact, such a function exists, albeit as a private helper function
used by the implementation of these functions, and is named a lot
more sensibly: get_hash_hex_algop().

Correct the misnomer of get_sha1_hex() and use "hash", instead of
"sha1", as "flat byte buffer that stores binary (as opposed to
hexadecimal) representation of the hash".

The four (2x2) friends now become:

 * get_hash_hex()       - use the implied the_hash_algo, fill uchar *
 * get_oid_hex()        - use the implied the_hash_algo, fill oid *
 * get_hash_hex_algop() - use the passed algop, fill uchar *
 * get_oid_hex_algop()  - use the passed algop, fill oid *

As there are only two remaining calls to get_sha1_hex() in the
codebase right now, the blast radious of this change is fairly
small.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hash-ll.h: split out of hash.h to remove dependency on repository.h</title>
<updated>2023-04-24T19:47:32Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-04-22T20:17:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d1cbe1e6d8a9cab2b4ffe8a17d34db214dce1e49'/>
<id>urn:sha1:d1cbe1e6d8a9cab2b4ffe8a17d34db214dce1e49</id>
<content type='text'>
hash.h depends upon and includes repository.h, due to the definition and
use of the_hash_algo (defined as the_repository-&gt;hash_algo).  However,
most headers trying to include hash.h are only interested in the layout
of the structs like object_id.  Move the parts of hash.h that do not
depend upon repository.h into a new file hash-ll.h (the "low level"
parts of hash.h), and adjust other files to use this new header where
the convenience inline functions aren't needed.

This allows hash.h and object.h to be fairly small, minimal headers.  It
also exposes a lot of hidden dependencies on both path.h (which was
brought in by repository.h) and repository.h (which was previously
implicitly brought in by object.h), so also adjust other files to be
more explicit about what they depend upon.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hex.h: move some hex-related declarations from cache.h</title>
<updated>2023-02-24T01:25:28Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-02-24T00:09:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b73ecb48114926d063d7ab96943bafcc0ae913b6'/>
<id>urn:sha1:b73ecb48114926d063d7ab96943bafcc0ae913b6</id>
<content type='text'>
hex.c contains code for hex-related functions, but for some reason these
functions were declared in the catch-all cache.h.  Move the function
declarations into a hex.h header instead.

This also allows us to remove includes of cache.h from a few C files.
For now, we make cache.h include hex.h, so that it is easier to review
the direct changes being made by this patch.  In the next patch, we will
remove that, and add the necessary direct '#include "hex.h"' in the
hundreds of C files that need it.

Note that reviewing the header changes in this commit might be
simplified via
    git log --no-walk -p --color-moved $COMMIT -- '*.h'`
In particular, it highlights the simple movement of code in .h files
rather nicely.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hex: print objects using the hash algorithm member</title>
<updated>2021-04-27T07:31:39Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2021-04-26T01:03:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3dd71461e25b4cc7ea2a2d8deef1c0486bb32580'/>
<id>urn:sha1:3dd71461e25b4cc7ea2a2d8deef1c0486bb32580</id>
<content type='text'>
Now that all code paths correctly set the hash algorithm member of
struct object_id, write an object's hex representation using the hash
algorithm member embedded in it.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hex: default to the_hash_algo on zero algorithm value</title>
<updated>2021-04-27T07:31:39Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2021-04-26T01:03:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b8505ecbf2b1e4ef27b9597fd113cb1679792b29'/>
<id>urn:sha1:b8505ecbf2b1e4ef27b9597fd113cb1679792b29</id>
<content type='text'>
There are numerous places in the codebase where we assume we can
initialize data by zeroing all its bytes.  However, when we do that with
a struct object_id, it leaves the structure with a zero value for the
algorithm, which is invalid.

We could forbid this pattern and require that all struct object_id
instances be initialized using oidclr, but this seems burdensome and
it's unnatural to most C programmers.  Instead, if the algorithm is
zero, assume we wanted to use the default hash algorithm instead.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hash: set, copy, and use algo field in struct object_id</title>
<updated>2021-04-27T07:31:38Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2021-04-26T01:02:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5a6dce70d7bb12ee2bc7926254c5b6741b91ac5f'/>
<id>urn:sha1:5a6dce70d7bb12ee2bc7926254c5b6741b91ac5f</id>
<content type='text'>
Now that struct object_id has an algorithm field, we should populate it.
This will allow us to handle object IDs in any supported algorithm and
distinguish between them.  Ensure that the field is written whenever we
write an object ID by storing it explicitly every time we write an
object.  Set values for the empty blob and tree values as well.

In addition, use the algorithm field to compare object IDs.  Note that
because we zero-initialize struct object_id in many places throughout
the codebase, we default to the default algorithm in cases where the
algorithm field is zero rather than explicitly initialize all of those
locations.

This leads to a branch on every comparison, but the alternative is to
compare the entire buffer each time and padding the buffer for SHA-1.
That alternative ranges up to 3.9% worse than this approach on the perf
t0001, t1450, and t1451.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hex: add functions to parse hex object IDs in any algorithm</title>
<updated>2020-02-24T17:33:21Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2020-02-22T20:17:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=61e2a70ff26f83f761fa36e0cc16996878d8dd59'/>
<id>urn:sha1:61e2a70ff26f83f761fa36e0cc16996878d8dd59</id>
<content type='text'>
There are some places where we need to parse a hex object ID in any
algorithm without knowing beforehand which algorithm is in use. An
example is when parsing fast-import marks.

Add a get_oid_hex_any to parse an object ID and return the algorithm it
belongs to, and additionally add parse_oid_hex_any which is the
equivalent change for parse_oid_hex. If the object is not parseable, we
return GIT_HASH_UNKNOWN.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hex: introduce parsing variants taking hash algorithms</title>
<updated>2020-02-24T17:33:21Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2020-02-22T20:17:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=dadacf10dc9e11046e2c8c49347174e71cef3fa3'/>
<id>urn:sha1:dadacf10dc9e11046e2c8c49347174e71cef3fa3</id>
<content type='text'>
Introduce variants of get_oid_hex and parse_oid_hex that parse an
arbitrary hash algorithm, implementing internal functions to avoid
duplication.  These functions can be used in the transport code to parse
refs properly.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hex: drop sha1_to_hex()</title>
<updated>2019-11-13T01:09:10Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-11-11T09:04:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b19f3fe9dd1a0c00fb8b30df9896912727f2eea2'/>
<id>urn:sha1:b19f3fe9dd1a0c00fb8b30df9896912727f2eea2</id>
<content type='text'>
There's only a single caller left of sha1_to_hex(), since everybody
that has an object name in "unsigned char[]" now uses hash_to_hex()
instead.

This case is in the sha1dc wrapper, where we print a hex sha1 when
we find a collision. This one will always be sha1, regardless of the
current hash algorithm, so we can't use hash_to_hex() here. In
practice we'd probably not be running sha1 at all if it isn't the
current algorithm, but it's possible we might still occasionally
need to compute a sha1 in a post-sha256 world.

Since sha1_to_hex() is just a wrapper for hash_to_hex_algop(), let's
call that ourselves. There's value in getting rid of the sha1-specific
wrapper to de-clutter the global namespace, and to make sure nobody uses
it (and as with sha1_to_hex_r() in the previous patch, we'll drop the
coccinelle transformations, too).

The sha1_to_hex() function is mentioned in a comment; we can easily
swap that out for oid_to_hex() to give a better example.  Also
update the comment that was left stale when we added "struct
object_id *" as a way to name an object and added functions to
convert it to hex.

The function is also mentioned in some test vectors in t4100, but
that's not runnable code, so there's no point in trying to clean it
up.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
