<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/split-index.h, branch jch</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=jch</id>
<link rel='self' href='https://git.shady.money/git/atom?h=jch'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2024-06-14T17:26:33Z</updated>
<entry>
<title>hash-ll: merge with "hash.h"</title>
<updated>2024-06-14T17:26:33Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2024-06-14T06:50:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8a676bdc5c3a9377322834326605b3804c7c54bf'/>
<id>urn:sha1:8a676bdc5c3a9377322834326605b3804c7c54bf</id>
<content type='text'>
The "hash-ll.h" header was introduced via d1cbe1e6d8 (hash-ll.h: split
out of hash.h to remove dependency on repository.h, 2023-04-22) to make
explicit the split between hash-related functions that rely on the
global `the_repository`, and those that don't. This split is no longer
necessary now that we we have removed the reliance on `the_repository`.

Merge "hash-ll.h" back into "hash.h". This causes some code units to not
include "repository.h" anymore, which requires us to add some forward
declarations.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hash-ll.h: split out of hash.h to remove dependency on repository.h</title>
<updated>2023-04-24T19:47:32Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-04-22T20:17:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d1cbe1e6d8a9cab2b4ffe8a17d34db214dce1e49'/>
<id>urn:sha1:d1cbe1e6d8a9cab2b4ffe8a17d34db214dce1e49</id>
<content type='text'>
hash.h depends upon and includes repository.h, due to the definition and
use of the_hash_algo (defined as the_repository-&gt;hash_algo).  However,
most headers trying to include hash.h are only interested in the layout
of the structs like object_id.  Move the parts of hash.h that do not
depend upon repository.h into a new file hash-ll.h (the "low level"
parts of hash.h), and adjust other files to use this new header where
the convenience inline functions aren't needed.

This allows hash.h and object.h to be fairly small, minimal headers.  It
also exposes a lot of hidden dependencies on both path.h (which was
brought in by repository.h) and repository.h (which was previously
implicitly brought in by object.h), so also adjust other files to be
more explicit about what they depend upon.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>treewide: reduce includes of cache.h in other headers</title>
<updated>2023-04-11T15:52:11Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-04-11T07:42:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b7b189cd5ae99f336c1185f8f8c27a118314ced1'/>
<id>urn:sha1:b7b189cd5ae99f336c1185f8f8c27a118314ced1</id>
<content type='text'>
We had a handful of headers including cache.h that didn't need to
anymore.  Drop those includes and replace them with includes of
smaller files, or forward declarations.  However, note that two .c
files now need to directly include cache.h, though they should have
been including it all along given they are directly using structs
defined in it.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Acked-by: Calvin Wan &lt;calvinwan@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>split-index: convert struct split_index to object_id</title>
<updated>2018-05-02T04:59:50Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2018-05-02T00:25:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2182abd94bf69a8daba18e71efae3e9f02a6c465'/>
<id>urn:sha1:2182abd94bf69a8daba18e71efae3e9f02a6c465</id>
<content type='text'>
Convert the base_sha1 member of struct split_index to use struct
object_id and rename it base_oid.  Include cache.h to make the structure
visible.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>split-index: rename 'new' variables</title>
<updated>2018-02-22T18:08:05Z</updated>
<author>
<name>Brandon Williams</name>
<email>bmwill@google.com</email>
</author>
<published>2018-02-14T18:59:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=75b7b971ae07882c5ec6f76e21a0dbd76d70c187'/>
<id>urn:sha1:75b7b971ae07882c5ec6f76e21a0dbd76d70c187</id>
<content type='text'>
Rename C++ keyword in order to bring the codebase closer to being able
to be compiled with a C++ compiler.

Signed-off-by: Brandon Williams &lt;bmwill@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Revert "split-index: add and use unshare_split_index()"</title>
<updated>2017-06-24T19:02:39Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2017-06-24T19:02:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=64719b115d61582fa501690ee6caff4c478b4b1a'/>
<id>urn:sha1:64719b115d61582fa501690ee6caff4c478b4b1a</id>
<content type='text'>
This reverts commit f9d7abec2ad2f9eb3d8873169cc28c34273df082;
see public-inbox.org/git/CAP8UFD0bOfzY-_hBDKddOcJdPUpP2KEVaX_SrCgvAMYAHtseiQ@mail.gmail.com
</content>
</entry>
<entry>
<title>split-index: add and use unshare_split_index()</title>
<updated>2017-05-08T01:50:20Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2017-05-05T14:57:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f9d7abec2ad2f9eb3d8873169cc28c34273df082'/>
<id>urn:sha1:f9d7abec2ad2f9eb3d8873169cc28c34273df082</id>
<content type='text'>
When split-index is being used, we have two cache_entry arrays in
index_state-&gt;cache[] and index_state-&gt;split_index-&gt;base-&gt;cache[].

index_state-&gt;cache[] may share the same entries with base-&gt;cache[] so
we can quickly determine what entries are shared. This makes memory
management tricky, we can't free base-&gt;cache[] until we know
index_state-&gt;cache[] does not point to any of those entries.

unshare_split_index() is added for this purpose, to find shared
entries and either duplicate them in index_state-&gt;cache[], or discard
them. Either way it should be safe to free base-&gt;cache[] after
unshare_split_index().

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>split-index: add {add,remove}_split_index() functions</title>
<updated>2017-03-01T21:24:21Z</updated>
<author>
<name>Christian Couder</name>
<email>christian.couder@gmail.com</email>
</author>
<published>2017-02-27T18:00:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=cef4fc7ebe869e910d0fd5643cd60328ed76356a'/>
<id>urn:sha1:cef4fc7ebe869e910d0fd5643cd60328ed76356a</id>
<content type='text'>
Also use the functions in cmd_update_index() in
builtin/update-index.c.

These functions will be used in a following commit to tweak
our use of the split-index feature depending on the setting
of a configuration variable.

Signed-off-by: Christian Couder &lt;chriscool@tuxfamily.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>split-index: the reading part</title>
<updated>2014-06-13T18:49:40Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2014-06-13T12:19:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=76b07b37a3fff15ce4852bd070a610b651c42e77'/>
<id>urn:sha1:76b07b37a3fff15ce4852bd070a610b651c42e77</id>
<content type='text'>
CE_REMOVE'd entries are removed here because only parts of the code
base (unpack_trees in fact) test this bit when they look for the
presence of an entry. Leaving them may confuse the code ignores this
bit and expects to see a real entry.

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>split-index: the writing part</title>
<updated>2014-06-13T18:49:40Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2014-06-13T12:19:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=96a1d8d34c16be8fc5acd7ceb4d34573d8e2a76b'/>
<id>urn:sha1:96a1d8d34c16be8fc5acd7ceb4d34573d8e2a76b</id>
<content type='text'>
prepare_to_write_split_index() does the major work, classifying
deleted, updated and added entries. write_link_extension() then just
writes it down.

An observation is, deleting an entry, then adding it back is recorded
as "entry X is deleted, entry X is added", not "entry X is replaced".
This is simpler, with small overhead: a replaced entry is stored
without its path, a new entry is store with its path.

A note about unpack_trees() and the deduplication code inside
prepare_to_write_split_index(). Usually tracking updated/removed
entries via read-cache API is enough. unpack_trees() manipulates the
index in a different way: it throws the entire source index out,
builds up a new one, copying/duplicating entries (using dup_entry)
from the source index over if necessary, then returns the new index.

A naive solution would be marking the entire source index "deleted"
and add their duplicates as new. That could bring $GIT_DIR/index back
to the original size. So we try harder and memcmp() between the
original and the duplicate to see if it needs updating.

We could avoid memcmp() too, by avoiding duplicating the original
entry in dup_entry(). The performance gain this way is within noise
level and it complicates unpack-trees.c. So memcmp() is the preferred
way to deal with deduplication.

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
