<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/packfile.c, branch v2.22.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=v2.22.2</id>
<link rel='self' href='https://git.shady.money/git/atom?h=v2.22.2'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2019-07-29T19:38:22Z</updated>
<entry>
<title>Merge branch 'ds/close-object-store' into maint</title>
<updated>2019-07-29T19:38:22Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-07-29T19:38:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=dea6737bb76ba231474668a804d3f7178b766c47'/>
<id>urn:sha1:dea6737bb76ba231474668a804d3f7178b766c47</id>
<content type='text'>
The commit-graph file is now part of the "files that the runtime
may keep open file descriptors on, all of which would need to be
closed when done with the object store", and the file descriptor to
an existing commit-graph file now is closed before "gc" finalizes a
new instance to replace it.

* ds/close-object-store:
  packfile: rename close_all_packs to close_object_store
  packfile: close commit-graph in close_all_packs
  commit-graph: use raw_object_store when closing
  commit-graph: extract write_commit_graph_file()
  commit-graph: extract copy_oids_to_commits()
  commit-graph: extract count_distinct_commits()
  commit-graph: extract fill_oids_from_all_packs()
  commit-graph: extract fill_oids_from_commit_hex()
  commit-graph: extract fill_oids_from_packs()
  commit-graph: create write_commit_graph_context
  commit-graph: remove Future Work section
  commit-graph: collapse parameters into flags
  commit-graph: return with errors during write
  commit-graph: fix the_repository reference
</content>
</entry>
<entry>
<title>Merge branch 'rs/copy-array' into maint</title>
<updated>2019-07-29T19:38:15Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-07-29T19:38:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=90334a8497f3637876806deb159c4d9c39a7899b'/>
<id>urn:sha1:90334a8497f3637876806deb159c4d9c39a7899b</id>
<content type='text'>
Code clean-up.

* rs/copy-array:
  use COPY_ARRAY for copying arrays
  coccinelle: use COPY_ARRAY for copying arrays
</content>
</entry>
<entry>
<title>Merge branch 'mh/import-transport-fd-fix' into maint</title>
<updated>2019-07-25T21:27:07Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-07-25T21:27:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=dae29547c9f0a4895f02d3ba21dc93a7b666e830'/>
<id>urn:sha1:dae29547c9f0a4895f02d3ba21dc93a7b666e830</id>
<content type='text'>
The ownership rule for the file descriptor to fast-import remote
backend was mixed up, leading to unrelated file descriptor getting
closed, which has been fixed.

* mh/import-transport-fd-fix:
  Use xmmap_gently instead of xmmap in use_pack
  dup() the input fd for fast-import used for remote helpers
</content>
</entry>
<entry>
<title>use COPY_ARRAY for copying arrays</title>
<updated>2019-06-18T01:15:04Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2019-06-15T18:36:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=921d49be86bd44ca290c8db6cc6f419dac3ed442'/>
<id>urn:sha1:921d49be86bd44ca290c8db6cc6f419dac3ed442</id>
<content type='text'>
Convert calls of memcpy(3) to use COPY_ARRAY, which shortens and
simplifies the code a bit.

Patch generated by Coccinelle and contrib/coccinelle/array.cocci.

Signed-off-by: Rene Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>packfile: rename close_all_packs to close_object_store</title>
<updated>2019-06-12T18:33:54Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2019-05-17T18:41:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=2d511cfc0bfe1d2b98ba8b272ddd9ba83e84e5f8'/>
<id>urn:sha1:2d511cfc0bfe1d2b98ba8b272ddd9ba83e84e5f8</id>
<content type='text'>
The close_all_packs() method is now responsible for more than just pack-files.
It also closes the commit-graph and the multi-pack-index. Rename the function
to be more descriptive of its larger role. The name also fits because the
input parameter is a raw_object_store.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>packfile: close commit-graph in close_all_packs</title>
<updated>2019-06-12T18:33:54Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2019-05-17T18:41:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=5472c32c3724e1404c3d80368edc28910969c29a'/>
<id>urn:sha1:5472c32c3724e1404c3d80368edc28910969c29a</id>
<content type='text'>
The close_all_packs() method is used to close all read handles to
pack-files and the multi-pack-index before running 'git gc --auto'.
This is particularly important on the Windows platform, where read
handles block any writes to those files. Replacing one of these
files with a rename() will fail in this situation.

The commit-graph also performs a rename, so is susceptable to this
problem. We are careful to close the commit-graph before writing,
but that doesn't work when a 'git fetch' (or similar) process runs
'git gc --auto' which may write a commit-graph.

Here, close the commit-graph as part of close_all_packs().

Reported-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ds/midx-too-many-packs'</title>
<updated>2019-05-19T07:45:30Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-05-19T07:45:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=454b419729142cd65466cabdb253d06c3699d098'/>
<id>urn:sha1:454b419729142cd65466cabdb253d06c3699d098</id>
<content type='text'>
The code to generate the multi-pack idx file was not prepared to
see too many packfiles and ran out of open file descriptor, which
has been corrected.

* ds/midx-too-many-packs:
  midx: add packs to packed_git linked list
  midx: pass a repository pointer
</content>
</entry>
<entry>
<title>Use xmmap_gently instead of xmmap in use_pack</title>
<updated>2019-05-16T09:02:30Z</updated>
<author>
<name>Mike Hommey</name>
<email>mh@glandium.org</email>
</author>
<published>2019-05-16T00:37:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=3203566a7109e2b83519b379581005cee178c3fd'/>
<id>urn:sha1:3203566a7109e2b83519b379581005cee178c3fd</id>
<content type='text'>
use_pack has its own error message on mmap error, but it can't be
reached when using xmmap, which dies with its own error.

Signed-off-by: Mike Hommey &lt;mh@glandium.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'nd/sha1-name-c-wo-the-repository'</title>
<updated>2019-05-08T15:37:25Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-05-08T15:37:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=0b179f3175d1a152b1d22ce8352efda34b258ce2'/>
<id>urn:sha1:0b179f3175d1a152b1d22ce8352efda34b258ce2</id>
<content type='text'>
Further code clean-up to allow the lowest level of name-to-object
mapping layer to work with a passed-in repository other than the
default one.

* nd/sha1-name-c-wo-the-repository: (34 commits)
  sha1-name.c: remove the_repo from get_oid_mb()
  sha1-name.c: remove the_repo from other get_oid_*
  sha1-name.c: remove the_repo from maybe_die_on_misspelt_object_name
  submodule-config.c: use repo_get_oid for reading .gitmodules
  sha1-name.c: add repo_get_oid()
  sha1-name.c: remove the_repo from get_oid_with_context_1()
  sha1-name.c: remove the_repo from resolve_relative_path()
  sha1-name.c: remove the_repo from diagnose_invalid_index_path()
  sha1-name.c: remove the_repo from handle_one_ref()
  sha1-name.c: remove the_repo from get_oid_1()
  sha1-name.c: remove the_repo from get_oid_basic()
  sha1-name.c: remove the_repo from get_describe_name()
  sha1-name.c: remove the_repo from get_oid_oneline()
  sha1-name.c: add repo_interpret_branch_name()
  sha1-name.c: remove the_repo from interpret_branch_mark()
  sha1-name.c: remove the_repo from interpret_nth_prior_checkout()
  sha1-name.c: remove the_repo from get_short_oid()
  sha1-name.c: add repo_for_each_abbrev()
  sha1-name.c: store and use repo in struct disambiguate_state
  sha1-name.c: add repo_find_unique_abbrev_r()
  ...
</content>
</entry>
<entry>
<title>midx: add packs to packed_git linked list</title>
<updated>2019-05-07T04:48:42Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2019-04-29T16:18:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=af96fe3392fb078cb5447bcb94f2ed8d79d0a4a8'/>
<id>urn:sha1:af96fe3392fb078cb5447bcb94f2ed8d79d0a4a8</id>
<content type='text'>
The multi-pack-index allows searching for objects across multiple
packs using one object list. The original design gains many of
these performance benefits by keeping the packs in the
multi-pack-index out of the packed_git list.

Unfortunately, this has one major drawback. If the multi-pack-index
covers thousands of packs, and a command loads many of those packs,
then we can hit the limit for open file descriptors. The
close_one_pack() method is used to limit this resource, but it
only looks at the packed_git list, and uses an LRU cache to prevent
thrashing.

Instead of complicating this close_one_pack() logic to include
direct references to the multi-pack-index, simply add the packs
opened by the multi-pack-index to the packed_git list. This
immediately solves the file-descriptor limit problem, but requires
some extra steps to avoid performance issues or other problems:

1. Create a multi_pack_index bit in the packed_git struct that is
   one if and only if the pack was loaded from a multi-pack-index.

2. Skip packs with the multi_pack_index bit when doing object
   lookups and abbreviations. These algorithms already check the
   multi-pack-index before the packed_git struct. This has a very
   small performance hit, as we need to walk more packed_git
   structs. This is acceptable, since these operations run binary
   search on the other packs, so this walk-and-ignore logic is
   very fast by comparison.

3. When closing a multi-pack-index file, do not close its packs,
   as those packs will be closed using close_all_packs(). In some
   cases, such as 'git repack', we run 'close_midx()' without also
   closing the packs, so we need to un-set the multi_pack_index bit
   in those packs. This is necessary, and caught by running
   t6501-freshen-objects.sh with GIT_TEST_MULTI_PACK_INDEX=1.

To manually test this change, I inserted trace2 logging into
close_pack_fd() and set pack_max_fds to 10, then ran 'git rev-list
--all --objects' on a copy of the Git repo with 300+ pack-files and
a multi-pack-index. The logs verified the packs are closed as
we read them beyond the file descriptor limit.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
