path: root/src
Age  Commit message  Author  Lines
4 daysdd: always diagnose partial writes on write failurePádraig Brady-0/+2
* src/dd.c (dd_copy): Increment the partial write count upon failure. * tests/dd/partial-write.sh: Add a new test. * tests/local.mk: Reference the new test. * NEWS: Mention the bug fix. Fixes https://bugs.gnu.org/80583
6 daysmaint: adjust to placate coverityPádraig Brady-1/+2
* src/system.h (c32issep): Adjust to more standard layout.
6 daysyes: use a zero-copy implementation via (vm)splicePádraig Brady-9/+155
A good reference for the concepts used here is: https://mazzo.li/posts/fast-pipes.html We don't consider huge pages or busy loops here, but use vmsplice() and splice() to get significant speedups: i7-5600U-laptop $ taskset 1 yes | taskset 2 pv > /dev/null ... [4.98GiB/s] i7-5600U-laptop $ taskset 1 src/yes | taskset 2 pv > /dev/null ... [34.1GiB/s] IBM,9043-MRX $ taskset 1 yes | taskset 2 pv > /dev/null ... [11.6GiB/s] IBM,9043-MRX $ taskset 1 src/yes | taskset 2 pv > /dev/null ... [175GiB/s] Throughput to a file (on BTRFS) was also seen to increase significantly, with a Fedora 43 laptop improving from 690MiB/s to 1.1GiB/s. * bootstrap.conf: Ensure sys/uio.h is present. This was an existing transitive dependency. * m4/jm-macros.m4: Define HAVE_SPLICE appropriately. We assume vmsplice() is available if splice() is, as they were introduced at the same time to Linux and glibc. * src/yes.c (repeat_pattern): A new function to efficiently duplicate a pattern in a buffer with memcpy calls that double in size. This also makes the setup for the existing write() path more efficient. (pipe_splice_size): A new function to increase the kernel pipe buffer if possible, and use an appropriately sized buffer based on that (25%). (splice_write): A new function to call vmsplice() when outputting to a pipe, and also splice() if outputting to a non-pipe. (main): Adjust to always calling write on the minimal buffer first, then trying vmsplice(), then falling back to write from a bigger buffer. * tests/misc/yes.sh: Verify the non-pipe output case, and the vmsplice() fallback to write() case. * NEWS: Mention the improvement.
6 daysall: use more consistent blank character determinationPádraig Brady-10/+31
* src/system.h (c32issep): A new function that is essentially iswblank() on GLIBC platforms, and iswspace() with exceptions elsewhere. * src/expand.c: Use it instead of c32isblank(). * src/fold.c: Likewise. * src/join.c: Likewise. * src/numfmt.c: Likewise. * src/unexpand.c: Likewise. * src/uniq.c: Likewise. * NEWS: Mention the improvement.
6 dayscksum: fix tagged output on 32 bit platformsPádraig Brady-4/+4
Fix an unreleased issue due to the recent change to using idx_t in commit v9.10-91-g02983e493 * src/cksum.c (output_file): Cast the idx_t before passing to printf.
7 dayswc: improve aarch64 Neon optimization for 'wc -l'Collin Funk-56/+56
$ yes abcdefghijklmnopqrstuvwxyz | head -n 200000000 > input $ time ./src/wc-prev -l input 200000000 input real 0m1.240s user 0m0.456s sys 0m0.784s $ time ./src/wc -l input 200000000 input real 0m0.936s user 0m0.141s sys 0m0.795s * configure.ac: Use unsigned char for the buffer to avoid potential compiler warnings. Check for the functions being used in src/wc_neon.c after this patch. * src/wc_neon.c (wc_lines_neon): Use vreinterpretq_s8_u8 to convert 0xff into -1 instead of using bitwise AND instructions to convert it into 1. Perform the pairwise addition and lane extraction once every 8192 bytes instead of once every 64 bytes. Thanks to Lasse Collin for spotting this and reviewing a draft of this patch.
7 daysbuild: update to latest gnulibPádraig Brady-1/+1
* src/ls.c: Adjust for renamed acl permissions member.
9 daysexpand,unexpand: support multi-byte inputLukáš Zaoral-30/+63
* src/expand.c: Use mbbuf to support multi-byte input. * src/unexpand.c: Likewise. * tests/expand/mb.sh: New multi-byte test. * tests/unexpand/mb.sh: Likewise. * tests/local.mk: Reference new tests. * NEWS: Mention the improvement.
10 daysmaint: shred: fix typo in commentWeixie Cui-1/+1
* src/shred.c: Fix "then" -> "than" in comment.
10 daysmaint: dd: fix typo in commentWeixie Cui-1/+1
* src/dd.c: Fix "that that" -> "that the" in comment.
11 daysmaint: touch: reduce variable scopeCollin Funk-2/+2
* src/touch.c (main): Declare variables where they are used instead of at the start of the function.
11 daysmaint: chown,chgrp: reduce variable scopeCollin Funk-27/+20
* src/chown-core.c (describe_change, restricted_chown) (change_file_owner, chown_files): Declare variables where they are used instead of at the start of the function. * src/chown.c (main): Likewise.
12 daysinstall: allow the combination of --compare and --preserve-timestampsCollin Funk-22/+19
* NEWS: Mention the improvement. * src/install.c (enum copy_status): New type to let the caller know if the copy was performed or skipped. (copy_file): Return the new type instead of bool. Reduce variable scope. (install_file_in_file): Only strip the file if the copy was performed. Update the timestamps if the copy was skipped. (main): Don't error when --compare and --preserve-timestamps are combined. * tests/install/install-C.sh: Add some test cases.
12 dayscksum: use more defensive escaping for --checkPádraig Brady-12/+6
cksum --check is often the first interaction users have with possibly untrusted downloads, so we should try to be as defensive as possible when processing it. Specifically we currently only escape \n characters in file names presented in checksum files being parsed with cksum --check. This gives some possibility of dumping arbitrary data to the terminal when checking downloads from an untrusted source. This change gives these advantages: 1. Avoids dumping arbitrary data to vulnerable terminals 2. Avoids visual deception with ANSI codes hiding checksum failures 3. More secure if users copy and paste file names from --check output 4. Simplifies programmatic parsing Note this changes programmatic parsing, but given the original format was so awkward to parse, I expect that's extremely rare. I was not able to find an example in the wild at least. To parse the new format from shell, you can do something like:
  cksum -c checksums | while IFS= read -r line; do
    case $line in
      *': FAILED')
        filename=$(eval "printf '%s' ${line%: FAILED}")
        cp -v "$filename" /quarantine ;;
    esac
  done
This change also slightly reduces the size of the sum(1) utility. This change also applies to md5sum, sha*sum, and b2sum. * src/cksum.c (digest_check): Call quotef() instead of cksum(1) specific quoting. * tests/cksum/md5sum-bsd.sh: Adjust accordingly. * doc/coreutils.texi (cksum general options): Describe the shell quoting used for problematic file names. * NEWS: Mention the change in behavior. Reported by: Aaron Rainbolt
2026-02-28du: fflush after outputting a linePaul Eggert-1/+1
* src/du.c (print_size): Resurrect the fflush call, since there can be significant delay between output lines.
2026-02-27du: avoid locking and flushing standard outputCollin Funk-3/+4
This results in a noticeable increase in performance: $ yes /dev/null | head -n 10000000 | tr '\n' '\0' \ | time --format=%E ./src/du-prev -l --files0-from=- > /dev/null 0:20.40 $ yes /dev/null | head -n 10000000 | tr '\n' '\0' \ | time --format=%E ./src/du -l --files0-from=- > /dev/null 0:16.57 * src/du.c (print_size): Prefer putchar and fputs which may be unlocked unlike printf. Prefer ferror to fflush.
2026-02-27stat: handle %%%N tooPaul Eggert-2/+3
* src/stat.c (main): Fix incorrect counting of '%'s before 'N'. * tests/stat/stat-fmt.sh: Test for the bug.
2026-02-27id: avoid unnecessary buffer flushingPaul Eggert-4/+3
* src/groups.c (main): * src/id.c (main, print_stuff): Don’t flush stdout before testing for write error. Do the test only when in a loop, as a one-shot will test for write error soon anyway.
2026-02-27cksum: prefer signed intPaul Eggert-129/+129
* src/cksum.c (min_digest_line_length, digest_hex_bytes) (digest_length, md5_sum_stream, sha1_sum_stream) (sha224_sum_stream, sha256_sum_stream, sha384_sum_stream) (sha512_sum_stream, sha2_sum_stream, sha3_sum_stream) (blake2b_sum_stream, sm3_sum_stream, problematic_chars) (filename_unescape, valid_digits, bsd_split_3) (algorithm_from_tag, split_3, digest_file, output_file) (b64_equal, hex_equal, digest_check, main): * src/cksum_avx2.c (cksum_avx2): * src/cksum_avx512.c (cksum_avx512): * src/cksum_crc.c (cksum_fp_t, cksum_slice8, crc_sum_stream) (crc32b_sum_stream, output_crc): * src/cksum_pclmul.c (cksum_pclmul): * src/cksum_vmull.c (cksum_vmull): * src/sum.c (bsd_sum_stream, sysv_sum_stream, output_bsd, output_sysv): Prefer signed to unsigned int where either will do. This allows better checking with -fsanitize=undefined. It should also help simplify future patches, so that they needn’t worry whether comparisons like ‘i < len - 2’ will misbehave.
2026-02-26stat: don't check QUOTING_STYLE when --printf %%N is usedCollin Funk-1/+10
* NEWS: Mention the fix. * src/stat.c (main): Only check QUOTING_STYLE if there is a %N that is not preceded by a percentage sign. * tests/stat/stat-fmt.sh: Add some test cases.
2026-02-26id: promptly diagnose write errorsCollin Funk-0/+3
* NEWS: Mention the improvement. * src/id.c (print_stuff): Call fflush for each listed user to check for write errors. * tests/misc/io-errors.sh: Add an invocation of 'id'.
2026-02-26groups: promptly diagnose write errorsCollin Funk-0/+3
* NEWS: Mention the improvement. * src/groups.c (main): Call fflush for each listed user to check for write errors. * tests/misc/io-errors.sh: Add an invocation of 'groups'.
2026-02-21shuf: avoid locking standard output when using --input-rangeCollin Funk-2/+6
Here is the throughput before this patch: # write_permuted_numbers $ ./src/shuf-prev -i 0-100000000 | pv -r > /dev/null [ 153MiB/s] # write_random_numbers $ timeout 10 ./src/shuf-prev -i 0-100000 -r | pv -r > /dev/null [78.6MiB/s] Here is the throughput after this patch: # write_permuted_numbers $ timeout 10 ./src/shuf -i 0-100000000 | pv -r > /dev/null [ 308MiB/s] # write_random_numbers $ timeout 10 ./src/shuf -i 0-100000 -r | pv -r > /dev/null [ 196MiB/s] * NEWS: Mention the performance improvement. * src/shuf.c (write_permuted_numbers, write_random_numbers): Prefer fputs and fputc which may be unlocked over printf which locks standard output.
2026-02-19maint: printf: prefer static initializationCollin Funk-2/+0
* src/printf.c (main): Remove unnecessary initialization.
2026-02-19maint: fmt: prefer static initializationCollin Funk-7/+2
* src/fmt.c (prefix, max_width): Initialize variables. (main): Remove unnecessary initializations.
2026-02-19maint: sort: prefer static initializationCollin Funk-1/+0
* src/sort.c (main): Remove unnecessary initialization.
2026-02-19maint: df: prefer static initializationCollin Funk-13/+2
* src/df.c (human_output_opts, grand_fsu): Initialize variables. (main): Remove unnecessary initializations.
2026-02-19maint: split: prefer static initializationCollin Funk-5/+2
* src/split.c (outbase, infile): Initialize variables. (main): Remove unnecessary initializations.
2026-02-19maint: tail: prefer static initializationCollin Funk-7/+2
* src/tail.c (count_lines, line_end): Initialize variables. (main): Remove unnecessary initializations.
2026-02-19maint: csplit: prefer static initializationCollin Funk-8/+2
* src/csplit.c (prefix, remove_files): Initialize variables. (main): Remove unnecessary initializations.
2026-02-19maint: comm: prefer static initializationCollin Funk-12/+3
* src/comm.c (only_file_1, only_file_2, both): Initialize variables. (main): Remove unnecessary initializations.
2026-02-19maint: wc: prefer static initializationCollin Funk-4/+0
* src/wc.c (main): Remove unnecessary initializations.
2026-02-19maint: tac: prefer static initializationCollin Funk-7/+3
* src/tac.c (separator, separator_ends_record, sentinel_length): Initialize variables. (main): Remove unnecessary initializations.
2026-02-19maint: join: prefer static initializationCollin Funk-6/+1
* src/join.c (print_pairables): Initialize variable. (main): Remove unnecessary initializations.
2026-02-19maint: cut: prefer static initializationCollin Funk-6/+0
* src/cut.c (main): Remove unnecessary initializations.
2026-02-19maint: seq: prefer static initializationCollin Funk-4/+1
* src/seq.c (separator): Initialize variable. (main): Remove unnecessary initializations.
2026-02-19maint: paste: prefer static initializationCollin Funk-3/+0
* src/paste.c (main): Remove unnecessary initializations.
2026-02-19maint: install: prefer static initializationCollin Funk-4/+0
* src/install.c (main): Remove unnecessary initializations.
2026-02-19maint: chmod: prefer static initializationCollin Funk-2/+0
* src/chmod.c (main): Remove unnecessary initializations.
2026-02-19maint: head: prefer static initializationCollin Funk-7/+1
* src/head.c (line_end): Initialize variable. (main): Remove unnecessary initializations.
2026-02-19maint: ln: prefer static initializationCollin Funk-3/+0
* src/ln.c (main): Remove unnecessary initializations.
2026-02-19maint: nl: prefer static initializationCollin Funk-2/+0
* src/nl.c (main): Remove unnecessary initialization.
2026-02-19maint: touch: prefer static initializationCollin Funk-3/+0
* src/touch.c (main): Remove unnecessary initializations.
2026-02-19maint: fold: prefer static initializationCollin Funk-2/+0
* src/fold.c (main): Remove unnecessary initializations.
2026-02-19maint: tee: prefer static initializationCollin Funk-3/+0
* src/tee.c (main): Remove unnecessary initializations.
2026-02-19maint: rmdir: prefer static initializationCollin Funk-2/+0
* src/rmdir.c (main): Remove unnecessary initialization.
2026-02-19maint: tty: prefer static initializationCollin Funk-2/+0
* src/tty.c (main): Remove unnecessary initialization.
2026-02-18wc: add aarch64 Neon optimization for wc -lCollin Funk-0/+144
Here is an example of the performance improvement: $ yes abcdefghijklmnopqrstuvwxyz | head -n 100000000 > input $ time ./src/wc-prev -l < input 100000000 real 0m0.793s user 0m0.630s sys 0m0.162s $ time ./src/wc -l < input 100000000 real 0m0.230s user 0m0.065s sys 0m0.164s * NEWS: Mention the performance improvement. * gnulib: Update to the latest commit. * configure.ac: Check for the necessary intrinsics and functions. * src/local.mk (noinst_LIBRARIES) [USE_NEON_WC_LINECOUNT]: Add src/libwc_neon.a. (src_libwc_neon_a_SOURCES, wc_neon_ldadd, src_libwc_neon_a_CFLAGS) [USE_NEON_WC_LINECOUNT]: New variables. (src_wc_LDADD) [USE_NEON_WC_LINECOUNT]: Add $(wc_neon_ldadd). * src/wc.c [USE_NEON_WC_LINECOUNT]: Include sys/auxv.h and asm/hwcap.h. (neon_supported) [USE_NEON_WC_LINECOUNT]: New function. (wc_lines) [USE_NEON_WC_LINECOUNT]: Use neon_supported and wc_lines_neon. * src/wc.h (wc_lines_neon): Add declaration. * src/wc_neon.c: New file. * doc/coreutils.texi (Hardware Acceleration): Document the "-ASIMD" hwcap and the variable used in ./configure to override detection of Neon instructions. * tests/wc/wc-cpu.sh: Also add "-ASIMD" to disable the use of Neon instructions.
2026-02-17maint: expr: reduce variable scopeCollin Funk-77/+46
* src/expr.c (mbs_logical_cspn, main, trace, docolon, eval7, eval6) (eval5, eval4, eval3, eval2, eval1, eval): Declare variables where they are used instead of at the start of the function.
2026-02-16pwd: fix heap buffer overflow in file_name_prependChris Down-1/+1
file_name_prepend works by right-aligning path data in a growing buffer. When the buffer is too small, it then allocates a new buffer via xpalloc() and copies existing data to the end of the new buffer. Unfortunately, the memcpy destination is computed as buf + p->n_alloc - n_free, but xpalloc has already updated p->n_alloc to the new (larger) allocation size while n_free still reflects the old state. This places the data at too high an offset, writing past the end of the buffer. Update to properly calculate the destination offset. Fixes: v9.5-171-g61ab25c35 ("pwd: prefer xpalloc to xnrealloc")