aboutsummaryrefslogtreecommitdiffstats
path: root/reftable/readwrite_test.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2024-08-13t: move reftable/readwrite_test.c to the unit testing frameworkChandra Pratap1-979/+0
reftable/readwrite_test.c exercises the functions defined in reftable/reader.{c,h} and reftable/writer.{c,h}. Migrate reftable/readwrite_test.c to the unit testing framework. Migration involves refactoring the tests to use the unit testing framework instead of reftable's test framework and renaming the tests to align with unit-tests' naming conventions. Since some tests in reftable/readwrite_test.c use the functions set_test_hash(), noop_flush() and strbuf_add_void() defined in reftable/test_framework.{c,h} but these files are not #included in the ported unit test, copy these functions in the new test file. While at it, ensure structs are 0-initialized with '= { 0 }' instead of '= { NULL }'. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-17Merge branch 'ps/no-writable-strings'Junio C Hamano1-16/+16
Building with "-Werror -Wwrite-strings" is now supported. * ps/no-writable-strings: (27 commits) config.mak.dev: enable `-Wwrite-strings` warning builtin/merge: always store allocated strings in `pull_twohead` builtin/rebase: always store allocated string in `options.strategy` builtin/rebase: do not assign default backend to non-constant field imap-send: fix leaking memory in `imap_server_conf` imap-send: drop global `imap_server_conf` variable mailmap: always store allocated strings in mailmap blob revision: always store allocated strings in output encoding remote-curl: avoid assigning string constant to non-const variable send-pack: always allocate receive status parse-options: cast long name for OPTION_ALIAS http: do not assign string constant to non-const field compat/win32: fix const-correctness with string constants pretty: add casts for decoration option pointers object-file: make `buf` parameter of `index_mem()` a constant object-file: mark cached object buffers as const ident: add casts for fallback name and GECOS entry: refactor how we remove items for delayed checkouts line-log: always allocate the output prefix line-log: stop assigning string constant to file parent buffer ...
2024-06-07reftable: cast away constness when assigning constants to recordsPatrick Steinhardt1-16/+16
The reftable records are used in multiple ways throughout the reftable library. In many of those cases they merely act as input to a function without getting modified by it at all. Most importantly, this happens when writing records and when querying for records. We rely on this in our tests and thus assign string constants to those fields, which is about to generate warnings as those fields are of type `char *`. While we could go through the process and instead allocate those strings in all of our tests, this feels quite unnecessary. Instead, add casts to `char *` for all of those strings. As this is part of our tests, this also nicely serves as a demonstration that nothing writes or frees those string constants, which would otherwise lead to segfaults. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13reftable/reader: adapt interface to allow reuse of iteratorsPatrick Steinhardt1-11/+24
Refactor the interfaces exposed by `struct reftable_reader` and `struct table_iterator` such that they support iterator reuse. This is done by separating initialization of the iterator and seeking on it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05reftable/record: convert old and new object IDs to arraysPatrick Steinhardt1-37/+25
In 7af607c58d (reftable/record: store "val1" hashes as static arrays, 2024-01-03) and b31e3cc620 (reftable/record: store "val2" hashes as static arrays, 2024-01-03) we have converted ref records to store their object IDs in a static array. Convert log records to do the same so that their old and new object IDs are arrays, too. This change results in two allocations less per log record that we're iterating over. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 8,068,495 allocs, 8,068,373 frees, 401,011,862 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 6,068,489 allocs, 6,068,367 frees, 361,011,822 bytes allocated Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12Merge branch 'ps/reftable-styles'Junio C Hamano1-3/+5
Code clean-up in various reftable code paths. * ps/reftable-styles: reftable/record: improve semantics when initializing records reftable/merged: refactor initialization of iterators reftable/merged: refactor seeking of records reftable/stack: use `size_t` to track stack length reftable/stack: use `size_t` to track stack slices during compaction reftable/stack: index segments with `size_t` reftable/stack: fix parameter validation when compacting range reftable: introduce macros to allocate arrays reftable: introduce macros to grow arrays
2024-02-06reftable: introduce macros to allocate arraysPatrick Steinhardt1-3/+5
Similar to the preceding commit, let's carry over macros to allocate arrays with `REFTABLE_ALLOC_ARRAY()` and `REFTABLE_CALLOC_ARRAY()`. This requires us to change the signature of `reftable_calloc()`, which only takes a single argument right now and thus puts the burden on the caller to calculate the final array's size. This is a net improvement though as it means that we can now provide proper overflow checks when multiplying the array size with the member size. Convert callsites of `reftable_calloc()` to the new signature and start using the new macros where possible. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-01reftable/writer: fix writing multi-level indicesPatrick Steinhardt1-0/+56
When finishing a section we will potentially write an index that makes it more efficient to look up relevant blocks. The index records written will encode, for each block of the indexed section, what the offset of that block is as well as the last key of that block. Thus, the reader would iterate through the index records to find the first key larger or equal to the wanted key and then use the encoded offset to look up the desired block. When there are a lot of blocks to index though we may end up writing multiple index blocks, too. To not require a linear search across all index blocks we instead end up writing a multi-level index. Instead of referring to the block we are after, an index record may point to another index block. The reader will then access the highest-level index and follow down the chain of index blocks until it hits the sought-after block. It has been observed though that it is impossible to seek ref records of the last ref block when using a multi-level index. While the multi-level index exists and looks fine for most of the part, the highest-level index was missing an index record pointing to the last block of the next index. Thus, every additional level made more refs become unseekable at the end of the ref section. The root cause is that we are not flushing the last block of the current level once done writing the level. Consequently, it wasn't recorded in the blocks that need to be indexed by the next-higher level and thus we forgot about it. Fix this bug by flushing blocks after we have written all index records. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-23reftable: honor core.fsyncJohn Cai1-12/+12
While the reffiles backend honors configured fsync settings, the reftable backend does not. Address this by fsyncing reftable files using the write-or-die api's fsync_component() in two places: when we add additional entries into the table, and when we close the reftable writer. This commits adds a flush function pointer as a new member of reftable_writer because we are not sure that the first argument to the *write function pointer always contains a file descriptor. In the case of strbuf_add_void, the first argument is a buffer. This way, we can pass in a corresponding flush function that knows how to flush depending on which writer is being used. This patch does not contain tests as they will need to wait for another patch to start to exercise the reftable backend. At that point, the tests will be added to observe that fsyncs are happening when the reftable is in use. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-16Merge branch 'ps/reftable-fixes-and-optims'Junio C Hamano1-16/+86
More fixes and optimizations to the reftable backend. * ps/reftable-fixes-and-optims: reftable/merged: transfer ownership of records when iterating reftable/merged: really reuse buffers to compute record keys reftable/record: store "val2" hashes as static arrays reftable/record: store "val1" hashes as static arrays reftable/record: constify some parts of the interface reftable/writer: fix index corruption when writing multiple indices reftable/stack: do not auto-compact twice in `reftable_stack_add()` reftable/stack: do not overwrite errors when compacting
2024-01-08Merge branch 'en/header-cleanup'Junio C Hamano1-1/+0
Remove unused header "#include". * en/header-cleanup: treewide: remove unnecessary includes in source files treewide: add direct includes currently only pulled in transitively trace2/tr2_tls.h: remove unnecessary include submodule-config.h: remove unnecessary include pkt-line.h: remove unnecessary include line-log.h: remove unnecessary include http.h: remove unnecessary include fsmonitor--daemon.h: remove unnecessary includes blame.h: remove unnecessary includes archive.h: remove unnecessary include treewide: remove unnecessary includes in source files treewide: remove unnecessary includes from header files
2024-01-03reftable/record: store "val2" hashes as static arraysPatrick Steinhardt1-8/+4
Similar to the preceding commit, convert ref records of type "val2" to store their object IDs in static arrays instead of allocating them for every single record. We're using the same benchmark as in the preceding commit, with `git show-ref --quiet` in a repository with ~350k refs. This time around though the effects aren't this huge. Before: HEAP SUMMARY: in use at exit: 21,163 bytes in 193 blocks total heap usage: 1,419,040 allocs, 1,418,847 frees, 62,153,868 bytes allocated After: HEAP SUMMARY: in use at exit: 21,163 bytes in 193 blocks total heap usage: 1,410,148 allocs, 1,409,955 frees, 61,976,068 bytes allocated This is because "val2"-type records are typically only stored for peeled tags, and the number of annotated tags in the benchmark repository is rather low. Still, it can be seen that this change leads to a reduction of allocations overall, even if only a small one. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03reftable/record: store "val1" hashes as static arraysPatrick Steinhardt1-10/+4
When reading ref records of type "val1", we store its object ID in an allocated array. This results in an additional allocation for every single ref record we read, which is rather inefficient especially when iterating over refs. Refactor the code to instead use an embedded array of `GIT_MAX_RAWSZ` bytes. While this means that `struct ref_record` is bigger now, we typically do not store all refs in an array anyway and instead only handle a limited number of records at the same point in time. Using `git show-ref --quiet` in a repository with ~350k refs this leads to a significant drop in allocations. Before: HEAP SUMMARY: in use at exit: 21,098 bytes in 192 blocks total heap usage: 2,116,683 allocs, 2,116,491 frees, 76,098,060 bytes allocated After: HEAP SUMMARY: in use at exit: 21,098 bytes in 192 blocks total heap usage: 1,419,031 allocs, 1,418,839 frees, 62,145,036 bytes allocated Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03reftable/writer: fix index corruption when writing multiple indicesPatrick Steinhardt1-0/+80
Each reftable may contain multiple types of blocks for refs, objects and reflog records, where each of these may have an index that makes it more efficient to find the records. It was observed that the index for log records can become corrupted under certain circumstances, where the first entry of the index points into the object index instead of to the log records. As it turns out, this corruption can occur whenever we write a log index as well as at least one additional index. Writing records and their index is basically a two-step process: 1. We write all blocks for the corresponding record. Each block that gets written is added to a list of blocks to index. 2. Once all blocks were written we finish the section. If at least two blocks have been added to the list of blocks to index then we will now write the index for those blocks and flush it, as well. When we have a very large number of blocks then we may decide to write a multi-level index, which is why we also keep track of the list of the index blocks in the same way as we previously kept track of the blocks to index. Now when we have finished writing all index blocks we clear the index and flush the last block to disk. This is done in the wrong order though because flushing the block to disk will re-add it to the list of blocks to be indexed. The result is that the next section we are about to write will have an entry in the list of blocks to index that points to the last block of the preceding section's index, which will corrupt the log index. Fix this corruption by clearing the index after having written the last block. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26treewide: remove unnecessary includes in source filesElijah Newren1-1/+0
Each of these were checked with gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE} to ensure that removing the direct inclusion of the header actually resulted in that header no longer being included at all (i.e. that no other header pulled it in transitively). ...except for a few cases where we verified that although the header was brought in transitively, nothing from it was directly used in that source file. These cases were: * builtin/credential-cache.c * builtin/pull.c * builtin/send-pack.c Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11reftable/stack: fix use of unseeded randomnessPatrick Steinhardt1-3/+3
When writing a new reftable stack, Git will first create the stack with a random suffix so that concurrent updates will not try to write to the same file. This random suffix is computed via a call to rand(3P). But we never seed the function via srand(3P), which means that the suffix is in fact always the same. Fix this bug by using `git_rand()` instead, which does not need to be initialized. While this function is likely going to be slower depending on the platform, this slowness should not matter in practice as we only use it when writing a new reftable stack. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-23reftable: rename writer_stats to reftable_writer_statsHan-Wen Nienhuys1-5/+5
This function is part of the reftable API, so it should use the reftable_ prefix Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-23reftable: add test for length of disambiguating prefixHan-Wen Nienhuys1-0/+38
The ID => ref map is trimming object IDs to a disambiguating prefix. Check that we are computing their length correctly. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-23reftable: ensure that obj_id_len is >= 2 on writingHan-Wen Nienhuys1-0/+37
When writing the same hash many times, we might decide to use a length-1 object ID prefix for the ObjectID => ref table, which is out of spec. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-23reftable: add a test that verifies that writing empty keys failsHan-Wen Nienhuys1-0/+24
Empty keys can only be written as ref records with empty names. The log record has a logical timestamp in the key, so the key is never empty. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16Merge branch 'hn/reftable-coverity-fixes'Junio C Hamano1-1/+66
Problems identified by Coverity in the reftable code have been corrected. * hn/reftable-coverity-fixes: reftable: add print functions to the record types reftable: make reftable_record a tagged union reftable: remove outdated file reftable.c reftable: implement record equality generically reftable: make reftable-record.h function signatures const correct reftable: handle null refnames in reftable_ref_record_equal reftable: drop stray printf in readwrite_test reftable: order unittests by complexity reftable: all xxx_free() functions accept NULL arguments reftable: fix resource warning reftable: ignore remove() return value in stack_test.c reftable: check reftable_stack_auto_compact() return value reftable: fix resource leak blocksource.c reftable: fix resource leak in block.c error path reftable: fix OOB stack write in print functions
2022-01-20reftable: drop stray printf in readwrite_testHan-Wen Nienhuys1-1/+0
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-20reftable: fix resource leak in block.c error pathHan-Wen Nienhuys1-0/+66
Add test coverage for corrupt zlib data. Fix memory leaks demonstrated by unittest. This problem was discovered by a Coverity scan. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-23reftable: signal overflowHan-Wen Nienhuys1-0/+35
reflog entries have unbounded size. In theory, each log ('g') block in reftable can have an arbitrary size, so the format allows for arbitrarily sized reflog messages. However, in the implementation, we are not scaling the log blocks up with the message, and writing a large message fails. This triggers a failure for reftable in t7006-pager.sh. Until this is fixed more structurally, report an error from within the reftable library for easier debugging. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: reftable file level testsHan-Wen Nienhuys1-0/+652
With support for reading and writing files in place, we can construct files (in memory) and attempt to read them back. Because some sections of the format are optional (eg. indices, log entries), we have to exercise this code using multiple sizes of input data Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>