<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/hashmap.h, branch v2.16.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.16.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.16.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2017-12-05T21:37:43Z</updated>
<entry>
<title>hashmap: adjust documentation to reflect reality</title>
<updated>2017-12-05T21:37:43Z</updated>
<author>
<name>Johannes Schindelin</name>
<email>johannes.schindelin@gmx.de</email>
</author>
<published>2017-11-29T23:51:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=826c778f7c3bd6c8356a8ecfd1cb0e4326fcf1d8'/>
<id>urn:sha1:826c778f7c3bd6c8356a8ecfd1cb0e4326fcf1d8</id>
<content type='text'>
The hashmap API is just complicated enough that even at least one
long-time Git contributor has to look up how to use it every time he
finds a new use case. When that happens, it is really useful if the
provided example code is correct...

While at it, "fix a memory leak", avoid statements before variable
declarations, fix a const -&gt; no-const cast, several %l specifiers (which
want to be %ld), avoid using an undefined constant, call scanf()
correctly, use FLEX_ALLOC_STR() where appropriate, and adjust the style
here and there.

Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Reviewed-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap: add API to disable item counting when threaded</title>
<updated>2017-09-07T00:42:02Z</updated>
<author>
<name>Jeff Hostetler</name>
<email>jeffhost@microsoft.com</email>
</author>
<published>2017-09-06T15:43:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=8b604d19515c4be18403047045faa363d4de217b'/>
<id>urn:sha1:8b604d19515c4be18403047045faa363d4de217b</id>
<content type='text'>
This is to address concerns raised by ThreadSanitizer on the mailing list
about threaded unprotected R/W access to map.size with my previous "disallow
rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4).

See:
https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/

Add API to hashmap to disable item counting and thus automatic rehashing.
Also include API to later re-enable them.

When item counting is disabled, the map.size field is invalid.  So to
prevent accidents, the field has been renamed and an accessor function
hashmap_get_size() has been added.  All direct references to this
field have been updated, and the field is now named
map.private_size to communicate this.

Here is the relevant output from ThreadSanitizer showing the problem:

WARNING: ThreadSanitizer: data race (pid=10554)
  Read of size 4 at 0x00000082d488 by thread T2 (mutexes: write M16):
    #0 hashmap_add hashmap.c:209
    #1 hash_dir_entry_with_parent_and_prefix name-hash.c:302
    #2 handle_range_dir name-hash.c:347
    #3 handle_range_1 name-hash.c:415
    #4 lazy_dir_thread_proc name-hash.c:471
    #5 &lt;null&gt; &lt;null&gt;

  Previous write of size 4 at 0x00000082d488 by thread T1 (mutexes: write M31):
    #0 hashmap_add hashmap.c:209
    #1 hash_dir_entry_with_parent_and_prefix name-hash.c:302
    #2 handle_range_dir name-hash.c:347
    #3 handle_range_1 name-hash.c:415
    #4 handle_range_dir name-hash.c:380
    #5 handle_range_1 name-hash.c:415
    #6 lazy_dir_thread_proc name-hash.c:471
    #7 &lt;null&gt; &lt;null&gt;

Martin gives instructions for running TSan on test t3008 in this post:
https://public-inbox.org/git/CAN0heSoJDL9pWELD6ciLTmWf-a=oyxe4EXXOmCKvsG5MSuzxsA@mail.gmail.com/

Signed-off-by: Jeff Hostetler &lt;jeffhost@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap: migrate documentation from Documentation/technical into header</title>
<updated>2017-06-30T20:11:59Z</updated>
<author>
<name>Stefan Beller</name>
<email>sbeller@google.com</email>
</author>
<published>2017-06-30T19:14:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=1ecbf31d0298a1ed952623108e23234d5cf37086'/>
<id>urn:sha1:1ecbf31d0298a1ed952623108e23234d5cf37086</id>
<content type='text'>
While at it, clarify the use of `key`, `keydata`, `entry_or_key` and
document the new data pointer for the compare function.

Rework the example.

Signed-off-by: Stefan Beller &lt;sbeller@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap.h: compare function has access to a data field</title>
<updated>2017-06-30T19:49:28Z</updated>
<author>
<name>Stefan Beller</name>
<email>sbeller@google.com</email>
</author>
<published>2017-06-30T19:14:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7663cdc86c860d5b5293a1dd4b0fb6c4e006d08e'/>
<id>urn:sha1:7663cdc86c860d5b5293a1dd4b0fb6c4e006d08e</id>
<content type='text'>
When using the hashmap, a common need is to have access to caller-provided
data in the compare function. A couple of times we abuse the keydata field
to pass in the data needed. This happens for example in patch-ids.c.

This patch changes the function signature of the compare function
to have one more void pointer available. The pointer given for each
invocation of the compare function must be defined in the init function
of the hashmap and is just passed through.

Documentation of this new feature is deferred to a later patch.
This is a rather mechanical conversion, just adding the new pass-through
parameter.  However, while at it, improve the naming of the fields of all
compare functions used by hashmaps by ensuring unused parameters are
prefixed with 'unused_' and naming the parameters what they are (instead
of 'unused' make it 'unused_keydata').

Signed-off-by: Stefan Beller &lt;sbeller@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap: add disallow_rehash setting</title>
<updated>2017-03-22T20:41:41Z</updated>
<author>
<name>Jeff Hostetler</name>
<email>jeffhost@microsoft.com</email>
</author>
<published>2017-03-22T17:14:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0607e10009ee4e37cb49b4cec8d28a9dda1656a4'/>
<id>urn:sha1:0607e10009ee4e37cb49b4cec8d28a9dda1656a4</id>
<content type='text'>
Teach hashmap to allow rehashes to be suppressed.
This is useful when hashmaps are accessed by multiple
threads.  It still requires the caller to properly
manage their locking.  This just prevents unexpected
rehashing during inserts and deletes.

Signed-off-by: Jeff Hostetler &lt;jeffhost@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap: allow memihash computation to be continued</title>
<updated>2017-03-22T20:41:41Z</updated>
<author>
<name>Jeff Hostetler</name>
<email>jeffhost@microsoft.com</email>
</author>
<published>2017-03-22T17:14:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f75619bd6d21760e1da416d4e27bce6468beffcd'/>
<id>urn:sha1:f75619bd6d21760e1da416d4e27bce6468beffcd</id>
<content type='text'>
Add variant of memihash() to allow the hash computation to
be continued.  There are times when we compute the hash on
a full path and then the hash on just the path to the parent
directory.  This can be expensive on large repositories.

With this, we can hash the parent directory first. And then
continue the computation to include the "/filename".

Signed-off-by: Jeff Hostetler &lt;jeffhost@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
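
The continuation idea above can be sketched in a few lines of Python.
This is an illustration only, not the C code from the commit: it assumes
a byte-wise, case-insensitive FNV-1 style hash, and the names ihash() and
ihash_cont() are hypothetical stand-ins for memihash() and its
continuing variant.

```python
# Assumed 32-bit FNV-1 constants; the real values live in hashmap.c.
FNV32_BASE = 0x811C9DC5
FNV32_PRIME = 0x01000193


def ihash_cont(seed, data):
    """Continue a case-insensitive hash from seed over data (bytes)."""
    h = seed
    for c in data.upper():  # bytes.upper() folds ASCII a-z only
        # Multiply, truncate to 32 bits, then mix in the next byte.
        h = ((h * FNV32_PRIME) % 2**32) ^ c
    return h


def ihash(data):
    """Hash data from scratch, starting at the base value."""
    return ihash_cont(FNV32_BASE, data)


# Hash the parent directory once, then continue with "/filename".
parent_hash = ihash(b"src/lib")
full_hash = ihash_cont(parent_hash, b"/File.c")
assert full_hash == ihash(b"src/lib/file.c")
```

Because the hash is computed one byte at a time, seeding the second call
with the parent directory's hash gives the same value as hashing the
full path, without walking the directory prefix a second time.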
</content>
</entry>
<entry>
<title>hashmap: add string interning API</title>
<updated>2014-07-07T20:56:38Z</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2014-07-02T22:22:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=7b64d42d22206d9995a8f0cb3b515e623cac4702'/>
<id>urn:sha1:7b64d42d22206d9995a8f0cb3b515e623cac4702</id>
<content type='text'>
Interning short strings with high probability of duplicates can reduce the
memory footprint and speed up comparisons.

Add strintern() and memintern() APIs that use a hashmap to manage the pool
of unique, interned strings.

Note: strintern(getenv()) could be used to sanitize git's use of getenv(),
in case we ever encounter a platform where a call to getenv() invalidates
previous getenv() results (which is allowed by POSIX).

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap: add simplified hashmap_get_from_hash() API</title>
<updated>2014-07-07T20:56:35Z</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2014-07-02T22:22:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ab73a9d119240b0b908ccb9edd19b8e536ce29b9'/>
<id>urn:sha1:ab73a9d119240b0b908ccb9edd19b8e536ce29b9</id>
<content type='text'>
Hashmap entries are typically looked up by just a key. The hashmap_get()
API expects an initialized entry structure instead, to support compound
keys. This flexibility is currently only needed by find_dir_entry() in
name-hash.c (and compat/win32/fscache.c in the msysgit fork). All other
(currently five) call sites of hashmap_get() have to set up a nearly empty
entry structure, resulting in duplicate code like this:

  struct hashmap_entry keyentry;
  hashmap_entry_init(&amp;keyentry, hash(key));
  return hashmap_get(map, &amp;keyentry, key);

Add a hashmap_get_from_hash() API that allows hashmap lookups by just
specifying the key and its hash code, i.e.:

  return hashmap_get_from_hash(map, hash(key), key);

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap: factor out getting a hash code from a SHA1</title>
<updated>2014-07-07T20:56:24Z</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2014-07-02T22:20:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=039dc71a7cb824300e242f8abc0fcb19dac93641'/>
<id>urn:sha1:039dc71a7cb824300e242f8abc0fcb19dac93641</id>
<content type='text'>
Copying the first bytes of a SHA1 is duplicated in six places;
however, the implication (the actual value depends on the
endianness of the platform) is documented only once.

Add a properly documented API for this.

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hashmap.h: use 'unsigned int' for hash-codes everywhere</title>
<updated>2014-02-24T23:26:30Z</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2013-12-18T13:41:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=b6aad994737458177ddf68939719f90e7909f656'/>
<id>urn:sha1:b6aad994737458177ddf68939719f90e7909f656</id>
<content type='text'>
Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
