<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/hashmap.h, branch jch</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=jch</id>
<link rel='self' href='https://git.shady.money/git/atom?h=jch'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2023-06-21T20:39:54Z</updated>
<entry>
<title>hash-ll, hashmap: move oidhash() to hash-ll</title>
<updated>2023-06-21T20:39:54Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-05-16T06:34:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b9a7ac2c6897efbf78fd546b21986498577e3585'/>
<id>urn:sha1:b9a7ac2c6897efbf78fd546b21986498577e3585</id>
<content type='text'>
oidhash() was used by both hashmap and khash, which makes sense.
However, the location of this function in hashmap.[ch] meant that
khash.h had to depend upon hashmap.h, making people unfamiliar with
khash think that it was built upon hashmap.  (Or at least, I personally
was confused for a while about this in the past.)

Move this function to hash-ll, so that khash.h can stop depending upon
hashmap.h.

This has another benefit as well: it allows us to remove hashmap.h's
dependency on hash-ll.h.  While some callers of hashmap.h were making
use of oidhash, most were not, so this change provides another way to
reduce the number of includes.

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'en/header-split-cache-h-part-2'</title>
<updated>2023-05-09T23:45:46Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-05-09T23:45:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ccd12a3d6cc62f51b746654ae56e26d92f89ba92'/>
<id>urn:sha1:ccd12a3d6cc62f51b746654ae56e26d92f89ba92</id>
<content type='text'>
More header clean-up.

* en/header-split-cache-h-part-2: (22 commits)
  reftable: ensure git-compat-util.h is the first (indirect) include
  diff.h: reduce unnecessary includes
  object-store.h: reduce unnecessary includes
  commit.h: reduce unnecessary includes
  fsmonitor: reduce includes of cache.h
  cache.h: remove unnecessary headers
  treewide: remove cache.h inclusion due to previous changes
  cache,tree: move basic name compare functions from read-cache to tree
  cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c
  hash-ll.h: split out of hash.h to remove dependency on repository.h
  tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h
  dir.h: move DTYPE defines from cache.h
  versioncmp.h: move declarations for versioncmp.c functions from cache.h
  ws.h: move declarations for ws.c functions from cache.h
  match-trees.h: move declarations for match-trees.c functions from cache.h
  pkt-line.h: move declarations for pkt-line.c functions from cache.h
  base85.h: move declarations for base85.c functions from cache.h
  copy.h: move declarations for copy.c functions from cache.h
  server-info.h: move declarations for server-info.c functions from cache.h
  packfile.h: move pack_window and pack_entry from cache.h
  ...
</content>
</entry>
<entry>
<title>hash-ll.h: split out of hash.h to remove dependency on repository.h</title>
<updated>2023-04-24T19:47:32Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-04-22T20:17:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=d1cbe1e6d8a9cab2b4ffe8a17d34db214dce1e49'/>
<id>urn:sha1:d1cbe1e6d8a9cab2b4ffe8a17d34db214dce1e49</id>
<content type='text'>
hash.h depends upon and includes repository.h, due to the definition and
use of the_hash_algo (defined as the_repository-&gt;hash_algo).  However,
most headers trying to include hash.h are only interested in the layout
of the structs like object_id.  Move the parts of hash.h that do not
depend upon repository.h into a new file hash-ll.h (the "low level"
parts of hash.h), and adjust other files to use this new header where
the convenience inline functions aren't needed.

This allows hash.h and object.h to be fairly small, minimal headers.  It
also exposes a lot of hidden dependencies on both path.h (which was
brought in by repository.h) and repository.h (which was previously
implicitly brought in by object.h), so also adjust other files to be
more explicit about what they depend upon.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap.h: fix minor typo</title>
<updated>2023-03-30T17:18:39Z</updated>
<author>
<name>Siddharth Singh</name>
<email>siddhartth@google.com</email>
</author>
<published>2023-03-30T15:28:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ec063d259108c6c9b96dbaddbd1ae76748d309ec'/>
<id>urn:sha1:ec063d259108c6c9b96dbaddbd1ae76748d309ec</id>
<content type='text'>
The word "no" should be "not".

Signed-off-by: Siddharth Singh &lt;siddhartth@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap: provide deallocation function names</title>
<updated>2020-11-02T20:15:50Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2020-11-02T18:55:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6da1a258142ac2422c8c57c54b92eaed3c86226e'/>
<id>urn:sha1:6da1a258142ac2422c8c57c54b92eaed3c86226e</id>
<content type='text'>
hashmap_free(), hashmap_free_entries(), and hashmap_free_() have existed
for a while, but aren't necessarily the clearest names, especially with
hashmap_partial_clear() being added to the mix and lazy-initialization
now being supported.  Peff suggested we adopt the following names[1]:

  - hashmap_clear() - remove all entries and de-allocate any
    hashmap-specific data, but be ready for reuse

  - hashmap_clear_and_free() - ditto, but free the entries themselves

  - hashmap_partial_clear() - remove all entries but don't deallocate
    table

  - hashmap_partial_clear_and_free() - ditto, but free the entries

This patch provides the new names and converts all existing callers over
to the new naming scheme.

[1] https://lore.kernel.org/git/20201030125059.GA3277724@coredump.intra.peff.net/

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap: introduce a new hashmap_partial_clear()</title>
<updated>2020-11-02T20:15:50Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2020-11-02T18:55:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=33f20d82177871225e17d9dd44169a52a36c9f1d'/>
<id>urn:sha1:33f20d82177871225e17d9dd44169a52a36c9f1d</id>
<content type='text'>
merge-ort is a heavy user of strmaps, which are built on hashmap.[ch].
clear_or_reinit_internal_opts() in merge-ort was taking about 12% of
overall runtime in my testcase involving rebasing 35 patches of
linux.git across a big rename.  clear_or_reinit_internal_opts() was
calling hashmap_free() followed by hashmap_init(), meaning that not only
was it freeing all the memory associated with each of the strmaps just
to immediately allocate a new array again, it was allocating a new array
that was likely smaller than needed (thus resulting in a later need to
rehash things).  The ending size of the map table on the previous commit
was likely almost perfectly sized for the next commit we wanted to pick,
and not dropping and reallocating the table immediately is a win.

Add some new API to hashmap to clear a hashmap of entries without
freeing map-&gt;table (and instead only zeroing it out like alloc_table()
would do, along with zeroing the count of items in the table and the
shrink_at field).

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap: allow re-use after hashmap_free()</title>
<updated>2020-11-02T20:15:50Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2020-11-02T18:55:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b7879b0ba6ee1306a42227f7fd7f4e5f50409184'/>
<id>urn:sha1:b7879b0ba6ee1306a42227f7fd7f4e5f50409184</id>
<content type='text'>
Previously, once map-&gt;table had been freed, any calls to hashmap_put(),
hashmap_get(), or hashmap_remove() would cause a NULL pointer
dereference (since hashmap_free_() also zeros the memory; without that
zeroing, calling these functions would cause a use-after-free problem).

Modify these functions to check for a NULL table and automatically
allocate as needed.

Also add a HASHMAP_INIT(fn, data) macro for initializing hashmaps on the
stack without calling hashmap_init().

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap: adjust spacing to fix argument alignment</title>
<updated>2020-11-02T20:15:50Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2020-11-02T18:55:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=97a39a4a930ebec9162f90ebd0412aed47d413d0'/>
<id>urn:sha1:97a39a4a930ebec9162f90ebd0412aed47d413d0</id>
<content type='text'>
No actual code changes; just whitespace adjustments.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap: add usage documentation explaining hashmap_free[_entries]()</title>
<updated>2020-10-13T20:06:37Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2020-10-13T00:40:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=6474b869393b2d40b6e1b3ab5633ce2bad6abe48'/>
<id>urn:sha1:6474b869393b2d40b6e1b3ab5633ce2bad6abe48</id>
<content type='text'>
The existence of hashmap_free() and hashmap_free_entries() confused me,
and the docs weren't clear enough.  We are dealing with a map table,
entries in that table, and possibly also things each of those entries
points to.  I had to consult other source code examples and the
implementation.  Add a brief note to clarify the differences.  This will
become even more important once we introduce a new
hashmap_partial_clear() function which will add the question of whether
the table itself has been freed.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap_for_each_entry(): workaround MSVC's runtime check failure #3</title>
<updated>2020-09-30T20:26:54Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-09-30T15:26:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0ad621f61e4b2adc3a7bad088c25ebf2c8b9f43d'/>
<id>urn:sha1:0ad621f61e4b2adc3a7bad088c25ebf2c8b9f43d</id>
<content type='text'>
The OFFSETOF_VAR(var, member) macro is implemented in terms of
offsetof(typeof(*var), member) with compilers that know typeof(),
but its fallback implementation compares &amp;(var-&gt;member) and (var) and
counts the distance in bytes, i.e.

    ((uintptr_t)&amp;(var)-&gt;member - (uintptr_t)(var))

MSVC's runtime check, when fed an uninitialized 'var', flags this as
a use of an uninitialized variable (and that is legit---the uninitialized
contents of 'var' are subtracted) in a debug build.

After auditing all 6 uses of OFFSETOF_VAR(), 1 of them does feed a
potentially uninitialized 'var' to the macro in the beginning of the
for() loop:

    #define hashmap_for_each_entry(map, iter, var, member) \
            for (var = hashmap_iter_first_entry_offset(map, iter, \
                                                    OFFSETOF_VAR(var, member)); \
                    var; \
                    var = hashmap_iter_next_entry_offset(iter, \
                                                    OFFSETOF_VAR(var, member)))

We can work around this by making sure that var has _some_ value
when OFFSETOF_VAR() is called.  Strictly speaking, it invites
undefined behaviour to use NULL here if we end up with pointer
comparison, but MSVC runtime seems to be happy with it, and most
other systems have typeof() and don't even need pointer comparison
fallback code.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
