Age  Commit message  Author  Lines
3 daysdoc: fix missing '=' in texi option descriptionsHEADmasterPádraig Brady-2/+2
* doc/coreutils.texi (cut invocation, fold invocation): Fix missing '=' before option parameters.
4 daysdd: always diagnose partial writes on write failurePádraig Brady-0/+44
* src/dd.c (dd_copy): Increment the partial write count upon failure. * tests/dd/partial-write.sh: Add a new test. * tests/local.mk: Reference the new test. * NEWS: Mention the bug fix. Fixes https://bugs.gnu.org/80583
4 daysdoc: clarify a recent NEWS itemPádraig Brady-1/+1
* NEWS: It was ambiguous as to whether we quoted a range of observed throughputs. Clarify this was the old and new throughput on a single test system.
5 daysdoc: NEWS: adjust 'wc -l' aarch64 benchmark after recent commitCollin Funk-2/+2
After commit e0190a9d1 (wc: improve aarch64 Neon optimization for 'wc -l', 2026-03-09), on an Ampere eMAG machine:

    $ yes | head -n 10000000000 > input
    $ (time ./src/wc -l input)
    10000000000 input

    real    0m3.447s
    user    0m1.533s
    sys     0m1.913s
    $ (export GLIBC_TUNABLES='glibc.cpu.hwcaps=-ASIMD,-AVX2,-AVX512F'; \
       time ./src/wc -l input)
    10000000000 input

    real    0m15.758s
    user    0m14.039s
    sys     0m1.720s

* NEWS: Mention the improved benchmark.
5 daystests: rm: check for hints when running 'rm -foo'Collin Funk-0/+54
* tests/rm/dash-hint.sh: New file. * tests/local.mk (all_tests): Add the new test.
6 daysmaint: adjust to placate coverityPádraig Brady-1/+2
* src/system.h (c32issep): Adjust to more standard layout.
6 daysyes: use a zero-copy implementation via (vm)splicePádraig Brady-9/+180
A good reference for the concepts used here is:
https://mazzo.li/posts/fast-pipes.html
We don't consider huge pages or busy loops here,
but use vmsplice(), and splice() to get significant speedups:

    i7-5600U-laptop $ taskset 1 yes | taskset 2 pv > /dev/null
    ... [4.98GiB/s]
    i7-5600U-laptop $ taskset 1 src/yes | taskset 2 pv > /dev/null
    ... [34.1GiB/s]

    IBM,9043-MRX $ taskset 1 yes | taskset 2 pv > /dev/null
    ... [11.6GiB/s]
    IBM,9043-MRX $ taskset 1 src/yes | taskset 2 pv > /dev/null
    ... [175GiB/s]

Also throughput to file (on BTRFS) was seen to increase significantly,
with a Fedora 43 laptop improving from 690MiB/s to 1.1GiB/s.

* bootstrap.conf: Ensure sys/uio.h is present.
This was an existing transitive dependency.
* m4/jm-macros.m4: Define HAVE_SPLICE appropriately.
We assume vmsplice() is available if splice() is, as they were
introduced at the same time to Linux and glibc.
* src/yes.c (repeat_pattern): A new function to efficiently
duplicate a pattern in a buffer with memcpy calls that double in size.
This also makes the setup for the existing write() path more efficient.
(pipe_splice_size): A new function to increase the kernel pipe buffer
if possible, and use an appropriately sized buffer based on that (25%).
(splice_write): A new function to call vmsplice() when outputting
to a pipe, and also splice() if outputting to a non-pipe.
(main): Adjust to always calling write on the minimal buffer first,
then trying vmsplice(), then falling back to write from bigger buffer.
* tests/misc/yes.sh: Verify the non-pipe output case,
and the vmsplice() fallback to write() case.
* NEWS: Mention the improvement.
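The doubling memcpy approach described for repeat_pattern can be sketched as follows. This is only an illustration under stated assumptions, not the code from src/yes.c: the name fill_repeated, its signature, and its return value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not the function from src/yes.c.
   BUF already holds one copy of the pattern in its first PATLEN bytes.
   Fill BUF with as many whole copies of the pattern as fit in BUFSIZE
   bytes, and return the number of bytes that are now valid.  */
size_t
fill_repeated (char *buf, size_t bufsize, size_t patlen)
{
  assert (0 < patlen && patlen <= bufsize);

  /* Largest multiple of PATLEN that fits in the buffer.  */
  size_t limit = bufsize - bufsize % patlen;
  size_t filled = patlen;

  while (filled < limit)
    {
      /* Copy everything written so far, capped at the space left.
         FILLED and LIMIT are multiples of PATLEN, so each copy starts
         and ends on a pattern boundary, and the source and destination
         ranges never overlap.  */
      size_t n = filled < limit - filled ? filled : limit - filled;
      memcpy (buf + filled, buf, n);
      filled += n;
    }

  return filled;
}
```

Each pass doubles the amount of valid data, so the number of memcpy calls grows with the logarithm of the buffer size rather than with the number of pattern copies.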
6 daysall: use more consistent blank character determinationPádraig Brady-10/+35
* src/system.h (c32issep): A new function that is essentially iswblank() on GLIBC platforms, and iswspace() with exceptions elsewhere. * src/expand.c: Use it instead of c32isblank(). * src/fold.c: Likewise. * src/join.c: Likewise. * src/numfmt.c: Likewise. * src/unexpand.c: Likewise. * src/uniq.c: Likewise. * NEWS: Mention the improvement.
6 dayscksum: fix tagged output on 32 bit platformsPádraig Brady-4/+4
Fix an unreleased issue due to the recent change to using idx_t in commit v9.10-91-g02983e493 * src/cksum.c (output_file): Cast the idx_t before passing to printf.
6 dayswc: improve aarch64 Neon optimization for 'wc -l'Collin Funk-65/+67
    $ yes abcdefghijklmnopqrstuvwxyz | head -n 200000000 > input
    $ time ./src/wc-prev -l input
    200000000 input

    real    0m1.240s
    user    0m0.456s
    sys     0m0.784s
    $ time ./src/wc -l input
    200000000 input

    real    0m0.936s
    user    0m0.141s
    sys     0m0.795s

* configure.ac: Use unsigned char for the buffer to avoid potential
compiler warnings. Check for the functions being used in src/wc_neon.c
after this patch.
* src/wc_neon.c (wc_lines_neon): Use vreinterpretq_s8_u8 to convert 0xff
into -1 instead of using bitwise AND instructions to convert it into 1.
Perform the pairwise addition and lane extraction once every 8192 bytes
instead of once every 64 bytes.
Thanks to Lasse Collin for spotting this and reviewing a draft of this
patch.
7 daystests: expand: fix false failure on various systemsPádraig Brady-3/+5
* tests/expand/mb.sh: Use $LOCALE_FR_UTF8 rather than hardcoding "en_US.UTF-8". * tests/unexpand/mb.sh: Likewise. Reported by Bruno Haible.
7 daysbuild: update to latest gnulibPádraig Brady-1/+1
* src/ls.c: Adjust for renamed acl permissions member.
8 daysmaint: remove duplicate names from THANKSCollin Funk-0/+5
* .mailmap: Prefer the most recently used email address from each commit author.
8 daysmaint: prefer memset_explicit to explicit_bzeroCollin Funk-3/+3
The explicit_bzero function is a common extension, but memset_explicit was standardized in C23. It will likely become more portable in the future, and Gnulib provides an implementation if needed. * bootstrap.conf (gnulib_modules): Add memset_explicit. Remove explicit_bzero. * gl/lib/randint.c (randint_free): Use memset_explicit instead of explicit_bzero. * gl/lib/randread.c (randread_free_body): Likewise.
9 daysexpand,unexpand: support multi-byte inputLukáš Zaoral-30/+401
* src/expand.c: Use mbbuf to support multi-byte input. * src/unexpand.c: Likewise. * tests/expand/mb.sh: New multi-byte test. * tests/unexpand/mb.sh: Likewise. * tests/local.mk: Reference new tests. * NEWS: Mention the improvement.
9 daysmaint: shred: fix typo in commentWeixie Cui-1/+1
* src/shred.c: Fix "then" -> "than" in comment.
10 daysmaint: dd: fix typo in commentWeixie Cui-1/+1
* src/dd.c: Fix "that that" -> "that the" in comment.
10 daysbuild: update gnulib submodule to latestCollin Funk-0/+0
10 daysbuild: update gnulib submodule to latestCollin Funk-0/+0
11 daysmaint: touch: reduce variable scopeCollin Funk-2/+2
* src/touch.c (main): Declare variables where they are used instead of at the start of the function.
11 daysmaint: chown,chgrp: reduce variable scopeCollin Funk-27/+20
* src/chown-core.c (describe_change, restricted_chown) (change_file_owner, chown_files): Declare variables where they are used instead of at the start of the function. * src/chown.c (main): Likewise.
11 daysinstall: allow the combination of --compare and --preserve-timestampsCollin Funk-24/+48
* NEWS: Mention the improvement. * src/install.c (enum copy_status): New type to let the caller know if the copy was performed or skipped. (copy_file): Return the new type instead of bool. Reduce variable scope. (install_file_in_file): Only strip the file if the copy was performed. Update the timestamps if the copy was skipped. (main): Don't error when --compare and --preserve-timestamps are combined. * tests/install/install-C.sh: Add some test cases.
12 dayscksum: use more defensive escaping for --checkPádraig Brady-15/+19
cksum --check is often the first interaction users have with possibly untrusted downloads, so we should try to be as defensive as possible when processing it.

Specifically we currently only escape \n characters in file names presented in checksum files being parsed with cksum --check. This gives some possibility of dumping arbitrary data to the terminal when checking downloads from an untrusted source. This change gives these advantages:

  1. Avoids dumping arbitrary data to vulnerable terminals
  2. Avoids visual deception with ansi codes hiding checksum failures
  3. More secure if users copy and paste file names from --check output
  4. Simplifies programmatic parsing

Note this changes programmatic parsing, but given the original format was so awkward to parse, I expect that's extremely rare. I was not able to find an example in the wild at least. To parse the new format from shell, you can do something like:

  cksum -c checksums | while IFS= read -r line; do
    case $line in
      *': FAILED')
        filename=$(eval "printf '%s' ${line%: FAILED}")
        cp -v "$filename" /quarantine ;;
    esac
  done

This change also slightly reduces the size of the sum(1) utility.
This change also applies to md5sum, sha*sum, and b2sum.

* src/cksum.c (digest_check): Call quotef() instead of cksum(1) specific quoting.
* tests/cksum/md5sum-bsd.sh: Adjust accordingly.
* doc/coreutils.texi (cksum general options): Describe the shell quoting used for problematic file names.
* NEWS: Mention the change in behavior.
Reported by: Aaron Rainbolt
12 daysmaint: tests: refactor uses of bad_unicode()Pádraig Brady-9/+3
* init.cfg: Use 0xFF rather than 0xC3 everywhere. * tests/fold/fold-characters.sh: Reuse bad_unicode(). * tests/tac/tac-locale.sh: Likewise.
12 daysfold: fix output truncation with 0xFF bytes in inputPádraig Brady-5/+9
On signed char platforms, 0xFF was converted to -1 which matches MBBUF_EOF, causing fold to stop processing. * NEWS: Mention the bug fix. * gl/lib/mbbuf.h: Avoid sign extension on signed char platforms. * tests/fold/fold-characters.sh: Adjust test case. Reported at https://src.fedoraproject.org/rpms/coreutils/pull-request/20
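The sign extension problem described above can be illustrated with a small sketch. It assumes an end marker of -1, as the commit message states for MBBUF_EOF; the names SKETCH_EOF, next_byte_buggy and next_byte_fixed are hypothetical and are not taken from gl/lib/mbbuf.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an end-of-input marker with value -1.  */
enum { SKETCH_EOF = -1 };

/* Return byte I of S, or SKETCH_EOF once I reaches LEN.
   Where plain char is signed, the byte 0xFF converts to the int -1,
   which the caller cannot tell apart from SKETCH_EOF.  */
int
next_byte_buggy (const char *s, size_t i, size_t len)
{
  assert (i <= len);
  return i < len ? s[i] : SKETCH_EOF;
}

/* Same, but convert through unsigned char first, so every data byte
   maps to 0..255 and can never collide with SKETCH_EOF.  */
int
next_byte_fixed (const char *s, size_t i, size_t len)
{
  assert (i <= len);
  return i < len ? (unsigned char) s[i] : SKETCH_EOF;
}
```

Whether next_byte_buggy actually misbehaves depends on the platform: it returns -1 for a 0xFF byte only where plain char is signed, which matches the "signed char platforms" wording of the commit message.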
12 daystests: date: add timezone conversion testSylvestre Ledru-0/+4
* tests/date/date.pl: Add the test case. Add test case for https://github.com/uutils/coreutils/issues/10800 to verify `date -u -d '10:30 UTC-05'` converts to 15:30 UTC.
12 daystests: date: add edge cases for modifiersSylvestre Ledru-0/+17
* tests/date/date.pl: Add the test case. Add test cases for https://github.com/uutils/coreutils/issues/10957
12 daystests: cut: add test case for newline delimiter with -s flagSylvestre Ledru-0/+3
* tests/cut/cut.pl: Add a new test case. https://github.com/coreutils/coreutils/pull/211
13 daystests: mktemp: ensure mktemp does not depend on getrandom and ASLRoech3-0/+34
* tests/mktemp/mktemp-misc.sh: Add new test. * tests/local.mk: Reference new test. https://github.com/coreutils/coreutils/pull/206
13 daysmaint: tests: decouple debug output determinationPádraig Brady-14/+3
* tests/misc/warning-errors.sh: Simply check there is output to stderr before checking that output induces an error.
13 daystests: avoid false test failure when using address sanitizerCollin Funk-0/+3
* tests/misc/warning-errors.sh: Skip commands which have been built with sanitizers, since standard error will not be closed and checked for errors. Reported by Bruno Haible.
13 daystests: avoid failure on systems without an optimized 'cksum' or 'wc -l'Collin Funk-0/+10
* tests/misc/warning-errors.sh: Expect 'wc' and 'cksum' to exit successfully if there is not an optimized 'wc -l' implementation or CRC32 implementation. Reported by Bruno Haible.
14 daystests: shuf: ensure memory exhaustion is handled gracefullyoech3-0/+5
* tests/shuf/shuf.sh: Ensure we exit 1 upon failure to allocate memory. https://github.com/uutils/coreutils/issues/11170 https://github.com/coreutils/coreutils/pull/209
2026-03-02test: cp: add test for non-UTF8 directory namesSylvestre Ledru-0/+48
Missing test identified here: https://github.com/uutils/coreutils/pull/11148 * tests/cp/non-utf8-name.sh: Add a new test to cover this case. https://github.com/coreutils/coreutils/pull/207
2026-02-28du: fflush after outputting a linePaul Eggert-1/+1
* src/du.c (print_size): Resurrect the fflush call, since there can be significant delay between output lines.
2026-02-28tests: wc,du: add additional --files0-from test casesCollin Funk-4/+29
* tests/wc/wc-files0-from.pl ($limits): New variable. (@Tests): Prefer the error strings from getlimits over writing them by hand. Add test cases for --files0-from listing missing files and duplicate files. * tests/du/files0-from.pl ($limits): New variable. (@Tests): Prefer the error strings from getlimits over writing them by hand. Add test cases for --files0-from listing missing files. Add tests for --files0-from listing duplicate files with and without the -l option also in use.
2026-02-28build: update gnulib submodule to latestCollin Funk-1/+0
* po/POTFILES.in: Remove recently added lib/cygpath.c dependency after gnulib commit 2a893de047 (filesystem-remote: New module., 2026-02-28).
2026-02-28tests: ls: treat invalid UTF-8 paths starting with a dot as hiddenSylvestre Ledru-0/+61
* tests/ls/non-utf8-hidden.sh: Add the test case. https://github.com/uutils/coreutils/pull/11135 https://github.com/coreutils/coreutils/pull/202
2026-02-28tests: ln: verify that -f and -i override each otherSylvestre Ledru-0/+21
Identified here: <https://github.com/uutils/coreutils/pull/11129> * tests/ln/misc.sh: Add the check. https://github.com/coreutils/coreutils/pull/199
2026-02-28test: ln: verify backup suffix path traversal preventionSylvestre Ledru-0/+66
Missing test detected thanks to: https://github.com/uutils/coreutils/pull/11149 * tests/ln/backup-suffix-traversal.sh: Add a test. https://github.com/coreutils/coreutils/pull/208
2026-02-28maint: fix typo in previous testPádraig Brady-1/+1
* tests/shuf/shuf.sh: Use non-varying $ret rather than $?.
2026-02-28tests: shuf: ensure we handle unsupported getrandom syscall gracefullyoech3-0/+6
* tests/shuf/shuf.sh: Check we fail normally or succeed where the getrandom syscall is not available. https://github.com/coreutils/coreutils/pull/205
2026-02-28build: update gnulib to latestPádraig Brady-0/+4
* NEWS: Mention the more encompassing remoteness check for df. * po/POTFILES.in: Add new lib/cygpath.c dependency.
2026-02-27du: avoid locking and flushing standard outputCollin Funk-3/+4
This results in a noticeable increase in performance:

    $ yes /dev/null | head -n 10000000 | tr '\n' '\0' \
        | time --format=%E ./src/du-prev -l --files0-from=- > /dev/null
    0:20.40
    $ yes /dev/null | head -n 10000000 | tr '\n' '\0' \
        | time --format=%E ./src/du -l --files0-from=- > /dev/null
    0:16.57

* src/du.c (print_size): Prefer putchar and fputs which may be unlocked unlike printf. Prefer ferror to fflush.
2026-02-27stat: handle %%%N tooPaul Eggert-5/+6
* src/stat.c (main): Fix incorrect counting of '%'s before 'N'. * tests/stat/stat-fmt.sh: Test for the bug.
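Why the count of '%' characters before 'N' matters can be shown with a small scanner. This is a simplified sketch, not the logic in src/stat.c: the function name format_has_N is hypothetical, and it assumes directives are a single character after '%', ignoring any flags or field widths.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch: return true if FORMAT contains an N directive.
   "%%" is a literal percent sign, so "%%N" has no N directive,
   while "%%%N" is a literal '%' followed by one.  */
bool
format_has_N (const char *format)
{
  assert (format);

  for (const char *p = format; *p; p++)
    {
      if (*p != '%')
        continue;

      p++;                      /* Look at the character after '%'.  */
      if (*p == 'N')
        return true;
      if (*p == '\0')
        break;                  /* Trailing lone '%'.  */
      /* Otherwise this was "%%" or some other directive: the loop
         increment steps past it, so its second character is never
         mistaken for the start of a new directive.  */
    }

  return false;
}
```

Consuming '%' characters in pairs from left to right gives the right answer for any run length: an odd number of '%' before 'N' yields a directive, an even number does not.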
2026-02-27id: avoid unnecessary buffer flushingPaul Eggert-5/+4
* src/groups.c (main): * src/id.c (main, print_stuff): Don’t flush stdout before testing for write error. Do the test only when in a loop, as a one-shot will test for write error soon anyway.
2026-02-27cksum: prefer signed intPaul Eggert-129/+129
* src/cksum.c (min_digest_line_length, digest_hex_bytes) (digest_length, md5_sum_stream, sha1_sum_stream) (sha224_sum_stream, sha256_sum_stream, sha384_sum_stream) (sha512_sum_stream, sha2_sum_stream, sha3_sum_stream) (blake2b_sum_stream, sm3_sum_stream, problematic_chars) (filename_unescape, valid_digits, bsd_split_3) (algorithm_from_tag, split_3, digest_file, output_file) (b64_equal, hex_equal, digest_check, main): * src/cksum_avx2.c (cksum_avx2): * src/cksum_avx512.c (cksum_avx512): * src/cksum_crc.c (cksum_fp_t, cksum_slice8, crc_sum_stream) (crc32b_sum_stream, output_crc): * src/cksum_pclmul.c (cksum_pclmul): * src/cksum_vmull.c (cksum_vmull): * src/sum.c (bsd_sum_stream, sysv_sum_stream, output_bsd, output_sysv): Prefer signed to unsigned int where either will do. This allows better checking with -fsanitize=undefined. It should also help simplify future patches, so that they needn’t worry whether comparisons like ‘i < len - 2’ will misbehave.
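The comparison hazard mentioned above can be reproduced in a few lines. This is a generic illustration, not code from src/cksum.c; both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* With unsigned operands, LEN - 2 wraps around to a huge value when
   LEN is less than 2, so the comparison is true for short inputs.  */
bool
before_tail_unsigned (unsigned int i, unsigned int len)
{
  return i < len - 2;
}

/* With signed operands, LEN - 2 is simply negative for short inputs,
   so the comparison is false as intended.  */
bool
before_tail_signed (int i, int len)
{
  assert (0 <= i && 0 <= len);
  return i < len - 2;
}
```

For i = 0 and len = 1 the unsigned version reports true and the signed version reports false. Signed arithmetic also lets -fsanitize=undefined flag a genuine overflow, whereas unsigned wraparound is well defined and so passes silently.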
2026-02-26stat: don't check QUOTING_STYLE when --printf %%N is usedCollin Funk-1/+44
* NEWS: Mention the fix. * src/stat.c (main): Only check QUOTING_STYLE if there is a %N that is not preceded by a percentage sign. * tests/stat/stat-fmt.sh: Add some test cases.
2026-02-26id: promptly diagnose write errorsCollin Funk-2/+6
* NEWS: Mention the improvement. * src/id.c (print_stuff): Call fflush for each listed user to check for write errors. * tests/misc/io-errors.sh: Add an invocation of 'id'.
2026-02-26groups: promptly diagnose write errorsCollin Funk-0/+7
* NEWS: Mention the improvement. * src/groups.c (main): Call fflush for each listed user to check for write errors. * tests/misc/io-errors.sh: Add an invocation of 'groups'.