git/split-index.h, branch jch

hash-ll: merge with "hash.h"

2024-06-14T17:26:33Z

The "hash-ll.h" header was introduced via d1cbe1e6d8 (hash-ll.h: split out of hash.h to remove dependency on repository.h, 2023-04-22) to make explicit the split between hash-related functions that rely on the global `the_repository`, and those that don't. This split is no longer necessary now that we have removed the reliance on `the_repository`. Merge "hash-ll.h" back into "hash.h". This causes some code units to not include "repository.h" anymore, which requires us to add some forward declarations. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano
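As a rough illustration of why a forward declaration can stand in for a dropped include, the sketch below uses hypothetical names (`struct demo_repo`, `demo_hash_size()`), not the declarations actually added by this commit. A unit that only passes pointers to a struct needs the type's name, not its full definition.

```c
#include <assert.h>
#include <stddef.h>

/*
 * "Header" part: only pointers to the struct are used here, so a
 * forward declaration is enough and the defining header need not
 * be included. All names are hypothetical.
 */
struct demo_repo;
size_t demo_hash_size(const struct demo_repo *repo);

/* "Implementation" part: the complete type is required only here. */
struct demo_repo {
	size_t hash_rawsz; /* raw hash size in bytes */
};

size_t demo_hash_size(const struct demo_repo *repo)
{
	assert(repo != NULL);
	return repo->hash_rawsz;
}
```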

hash-ll.h: split out of hash.h to remove dependency on repository.h

2023-04-24T19:47:32Z

hash.h depends upon and includes repository.h, due to the definition and use of the_hash_algo (defined as the_repository->hash_algo). However, most headers trying to include hash.h are only interested in the layout of the structs like object_id. Move the parts of hash.h that do not depend upon repository.h into a new file hash-ll.h (the "low level" parts of hash.h), and adjust other files to use this new header where the convenience inline functions aren't needed. This allows hash.h and object.h to be fairly small, minimal headers. It also exposes a lot of hidden dependencies on both path.h (which was brought in by repository.h) and repository.h (which was previously implicitly brought in by object.h), so also adjust other files to be more explicit about what they depend upon. Signed-off-by: Elijah Newren Signed-off-by: Junio C Hamano

treewide: reduce includes of cache.h in other headers

2023-04-11T15:52:11Z

We had a handful of headers including cache.h that didn't need to anymore. Drop those includes and replace them with includes of smaller files, or forward declarations. However, note that two .c files now need to directly include cache.h, though they should have been including it all along given they are directly using structs defined in it. Signed-off-by: Elijah Newren Acked-by: Calvin Wan Signed-off-by: Junio C Hamano

split-index: convert struct split_index to object_id

2018-05-02T04:59:50Z

Convert the base_sha1 member of struct split_index to use struct object_id and rename it base_oid. Include cache.h to make the structure visible. Signed-off-by: brian m. carlson Signed-off-by: Junio C Hamano

split-index: rename 'new' variables

2018-02-22T18:08:05Z

Rename variables named after the C++ keyword 'new' in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams Signed-off-by: Junio C Hamano

Revert "split-index: add and use unshare_split_index()"

2017-06-24T19:02:39Z

This reverts commit f9d7abec2ad2f9eb3d8873169cc28c34273df082; see public-inbox.org/git/CAP8UFD0bOfzY-_hBDKddOcJdPUpP2KEVaX_SrCgvAMYAHtseiQ@mail.gmail.com

split-index: add and use unshare_split_index()

2017-05-08T01:50:20Z

When split-index is being used, we have two cache_entry arrays in index_state->cache[] and index_state->split_index->base->cache[]. index_state->cache[] may share the same entries with base->cache[] so we can quickly determine what entries are shared. This makes memory management tricky: we can't free base->cache[] until we know index_state->cache[] does not point to any of those entries. unshare_split_index() is added for this purpose, to find shared entries and either duplicate them in index_state->cache[], or discard them. Either way it should be safe to free base->cache[] after unshare_split_index(). Signed-off-by: Nguyễn Thái Ngọc Duy Signed-off-by: Junio C Hamano
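The sharing problem described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `struct demo_entry` and `demo_unshare()` are hypothetical, the `index` field is assumed to hold a 1-based position in the base array (0 meaning "not shared"), and only the "duplicate" strategy is shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a cache entry. */
struct demo_entry {
	char name[32];
	unsigned int index; /* assumed: 1-based slot in base[], 0 if private */
};

/*
 * Replace every pointer in cache[] that aliases an entry of base[]
 * with a private copy, so that base[] can be freed afterwards.
 * Returns the number of entries copied, or -1 if allocation fails.
 */
static int demo_unshare(struct demo_entry **cache, size_t cache_nr,
			struct demo_entry **base, size_t base_nr)
{
	int copied = 0;

	assert(cache != NULL && base != NULL);
	for (size_t i = 0; i < cache_nr; i++) {
		struct demo_entry *ce = cache[i];
		struct demo_entry *dup;

		if (!ce->index || ce->index > base_nr ||
		    base[ce->index - 1] != ce)
			continue; /* not shared with the base */

		dup = malloc(sizeof(*dup));
		if (!dup)
			return -1;
		memcpy(dup, ce, sizeof(*dup));
		dup->index = 0; /* the copy no longer refers to the base */
		cache[i] = dup;
		copied++;
	}
	return copied;
}
```

After such a pass no pointer in cache[] refers to memory owned by the base, so the base entries can be released without leaving dangling pointers.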

split-index: add {add,remove}_split_index() functions

2017-03-01T21:24:21Z

Also use the functions in cmd_update_index() in builtin/update-index.c. These functions will be used in a following commit to tweak our use of the split-index feature depending on the setting of a configuration variable. Signed-off-by: Christian Couder Signed-off-by: Junio C Hamano

split-index: the reading part

2014-06-13T18:49:40Z

CE_REMOVE'd entries are removed here because only parts of the code base (unpack_trees in fact) test this bit when they look for the presence of an entry. Leaving them may confuse code that ignores this bit and expects to see a real entry. Signed-off-by: Nguyễn Thái Ngọc Duy Signed-off-by: Junio C Hamano
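One way to picture the removal is an in-place compaction of the entry array. The sketch below is only an assumption about the general shape of such a pass: `DEMO_REMOVE` is a hypothetical stand-in for the CE_REMOVE bit, and `struct demo_ce` and `demo_drop_removed()` are not git's own names.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_REMOVE (1u << 0) /* hypothetical stand-in for CE_REMOVE */

/* Hypothetical, simplified entry carrying only what the sketch needs. */
struct demo_ce {
	unsigned int flags;
	const char *name;
};

/*
 * Compact cache[] in place, keeping only entries that are not marked
 * for removal, and return the new entry count. Relative order of the
 * surviving entries is preserved.
 */
static size_t demo_drop_removed(struct demo_ce **cache, size_t nr)
{
	size_t kept = 0;

	assert(cache != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++) {
		if (cache[i]->flags & DEMO_REMOVE)
			continue;
		cache[kept++] = cache[i];
	}
	return kept;
}
```

Code that never tests the removal bit then only ever sees real entries.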

split-index: the writing part

2014-06-13T18:49:40Z

prepare_to_write_split_index() does the major work, classifying deleted, updated and added entries. write_link_extension() then just writes it down. One observation: deleting an entry and then adding it back is recorded as "entry X is deleted, entry X is added", not "entry X is replaced". This is simpler, with small overhead: a replaced entry is stored without its path, a new entry is stored with its path. A note about unpack_trees() and the deduplication code inside prepare_to_write_split_index(). Usually tracking updated/removed entries via read-cache API is enough. unpack_trees() manipulates the index in a different way: it throws the entire source index out, builds up a new one, copying/duplicating entries (using dup_entry) from the source index over if necessary, then returns the new index. A naive solution would be marking the entire source index "deleted" and adding the duplicates as new. That could bring $GIT_DIR/index back to the original size. So we try harder and memcmp() between the original and the duplicate to see if it needs updating. We could avoid memcmp() too, by avoiding duplicating the original entry in dup_entry(). The performance gain this way is within noise level and it complicates unpack-trees.c. So memcmp() is the preferred way to deal with deduplication. Signed-off-by: Nguyễn Thái Ngọc Duy Signed-off-by: Junio C Hamano
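The memcmp()-based deduplication can be sketched roughly as below. The struct layout and all names (`struct demo_entry_data`, `demo_classify()`) are assumptions made for illustration; the real code compares a specific region of the cache entry rather than a whole toy struct.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical, simplified entry payload. Entries are assumed to be
 * zero-initialized and duplicated with memcpy(), so a bytewise
 * comparison of the whole struct is meaningful.
 */
struct demo_entry_data {
	unsigned int mode;
	unsigned int size;
	unsigned char oid[20];
};

enum demo_state { DEMO_UNCHANGED, DEMO_UPDATED };

/*
 * Decide whether a duplicated entry differs from its original. An
 * unchanged duplicate can keep referring to the base entry instead of
 * being written out again as "deleted" plus "added".
 */
static enum demo_state demo_classify(const struct demo_entry_data *orig,
				     const struct demo_entry_data *dup)
{
	assert(orig != NULL && dup != NULL);
	return memcmp(orig, dup, sizeof(*orig)) ? DEMO_UPDATED
						: DEMO_UNCHANGED;
}
```

The trade-off is the one the message names: one comparison per duplicated entry, in exchange for leaving the duplication logic in unpack-trees.c untouched.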