aboutsummaryrefslogtreecommitdiffstats
path: root/reftable/reader.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-04-07reftable/reader: rename data structure to "table"Patrick Steinhardt1-67/+0
The `struct reftable_reader` subsystem encapsulates a table that has been read from the disk. As such, the current name of that structure is somewhat hard to understand as it only talks about the fact that we read something from disk, without really giving an indicator _what_ that is. Furthermore, this naming schema doesn't really fit well into how the other structures are named: `reftable_merged_table`, `reftable_stack`, `reftable_block` and `reftable_record` are all named after what they encapsulate. Rename the subsystem to `reftable_table`, which directly gives a hint that the data structure is about handling the individual tables part of the stack. While this change results in a lot of churn, it prepares for us exposing the APIs to third-party callers now that the reftable library is a standalone library that can be linked against by other projects. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07reftable: fix formatting of the license headerPatrick Steinhardt1-6/+6
The license headers used across the reftable library doesn't follow our typical coding style for multi-line comments. Fix it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable/blocksource: adjust `read_block()` to return `ssize_t`Patrick Steinhardt1-3/+3
The `block_source_read_block()` function and its implementations return an integer as a result that reflects either the number of bytes read, or an error. As such its return type, a signed integer, isn't wrong, but it doesn't give the reader a good hint what it actually returns. Refactor the function to return an `ssize_t` instead, which is typical for functions similar to read(3p) and should thus give readers a better signal what they can expect as a result. Adjust callers to better handle the returned value to avoid warnings with -Wsign-compare. One of these callers is `reader_get_block()`, whose return value is only ever used by its callers to figure out whether or not the read was successful. So instead of bubbling up the `ssize_t` there, too, we adapt it to only indicate success or errors. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19reftable/system: stop depending on "hash.h"Patrick Steinhardt1-2/+2
We include "hash.h" in "reftable/system.h" such that we can use hash format IDs as well as the raw size of SHA1 and SHA256. As we are in the process of converting the reftable library to become standalone we of course cannot rely on those constants anymore. Introduce a new `enum reftable_hash` to replace internal uses of the hash format IDs and new constants that replace internal uses of the hash size. Adapt the reftable backend to set up the correct hash function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10Merge branch 'ps/reftable-alloc-failures'Junio C Hamano1-3/+3
The reftable library is now prepared to expect that the memory allocation function given to it may fail to allocate and to deal with such an error. * ps/reftable-alloc-failures: (26 commits) reftable/basics: fix segfault when growing `names` array fails reftable/basics: ban standard allocator functions reftable: introduce `REFTABLE_FREE_AND_NULL()` reftable: fix calls to free(3P) reftable: handle trivial allocation failures reftable/tree: handle allocation failures reftable/pq: handle allocation failures when adding entries reftable/block: handle allocation failures reftable/blocksource: handle allocation failures reftable/iter: handle allocation failures when creating indexed table iter reftable/stack: handle allocation failures in auto compaction reftable/stack: handle allocation failures in `stack_compact_range()` reftable/stack: handle allocation failures in `reftable_new_stack()` reftable/stack: handle allocation failures on reload reftable/reader: handle allocation failures in `reader_init_iter()` reftable/reader: handle allocation failures for unindexed reader reftable/merged: handle allocation failures in `merged_table_init_iter()` reftable/writer: handle allocation failures in `reftable_new_writer()` reftable/writer: handle allocation failures in `writer_index_hash()` reftable/record: handle allocation failures when decoding records ...
2024-10-04Merge branch 'ak/typofix-2.46-maint'Junio C Hamano1-1/+1
Typofixes. * ak/typofix-2.46-maint: perl: fix a typo mergetool: fix a typo reftable: fix a typo trace2: fix typos
2024-10-03reftable: fix a typoAndrew Kreimer1-1/+1
Fix a typo in comments. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02reftable/reader: handle allocation failures in `reader_init_iter()`Patrick Steinhardt1-3/+3
Handle allocation failures in `reader_init_iter()`. This requires us to also adapt `reftable_reader_init_*_iterator()` to bubble up the new error codes. Adapt callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23reftable/reader: introduce refcountingPatrick Steinhardt1-0/+2
It was recently reported that concurrent reads and writes may cause the reftable backend to segfault. The root cause of this is that we do not properly keep track of reftable readers across reloads. Suppose that you have a reftable iterator and then decide to reload the stack while iterating through the iterator. When the stack has been rewritten since we have created the iterator, then we would end up discarding a subset of readers that may still be in use by the iterator. The consequence is that we now try to reference deallocated memory, which of course segfaults. One way to trigger this is in t5616, where some background maintenance jobs have been leaking from one test into another. This leads to stack traces like the following one: + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin AddressSanitizer:DEADLYSIGNAL ================================================================= ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp 0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0) ==657994==The signal is caused by a READ memory access. #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29 #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170 #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194 #3 0x55f23e54e72e in block_iter_next reftable/block.c:398 #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240 #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355 #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339 #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69 #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123 #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172 #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175 #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464 #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13 #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452 #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623 #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659 #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133 #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432 #18 0x55f23dba7764 in run_builtin git.c:484 #19 0x55f23dba7764 in handle_builtin git.c:741 #20 0x55f23dbab61e in run_argv git.c:805 #21 0x55f23dbab61e in cmd_main git.c:1000 #22 0x55f23dba4781 in main common-main.c:64 #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360 #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27) While it is somewhat awkward that the maintenance processes survive tests in the first place, it is totally expected that reftables should work alright with concurrent writers. Seemingly they don't. The only underlying resource that we need to care about in this context is the reftable reader, which is responsible for reading a single table from disk. These readers get discarded immediately (unless reused) when calling `reftable_stack_reload()`, which is wrong. We can only close them once we know that there are no iterators using them anymore. Prepare for a fix by converting the reftable readers to be refcounted. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23reftable/reader: inline `reader_close()`Patrick Steinhardt1-1/+0
Same as with the preceding commit, we also provide a `reader_close()` function that allows the caller to close a reader without freeing it. This is unnecessary now that all users will have an allocated version of the reader. Inline it into `reftable_reader_free()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23reftable/reader: inline `init_reader()`Patrick Steinhardt1-2/+0
Most users use an allocated version of the `reftable_reader`, except for some tests. We are about to convert the reader to become refcounted though, and providing the ability to keep a reader on the stack makes this conversion harder than necessary. Update the tests to use `reftable_reader_new()` instead to prepare for this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22reftable/merged: stop using generic tables in the merged tablePatrick Steinhardt1-0/+4
The merged table provides access to a reftable stack by merging the contents of those tables into a virtual table. These subtables are being tracked via `struct reftable_table`, which is a generic interface for accessing either a single reftable or a merged reftable. So in theory, it would be possible for the merged table to merge together other merged tables. This is somewhat nonsensical though: we only ever set up a merged table over normal reftables, and there is no reason to do otherwise. This generic interface thus makes the code way harder to follow and reason about than really necessary. The abstraction layer may also have an impact on performance, even though the extra set of vtable function calls probably doesn't really matter. Refactor the merged tables to use a `struct reftable_reader` for each of the subtables instead, which gives us direct access to the underlying tables. Adjust names accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: read reftable filesHan-Wen Nienhuys1-0/+64
This supports reading a single reftable file. The commit introduces an abstract iterator type, which captures the usecases both of reading individual refs, and iterating over a segment of the ref namespace. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>