<feed xmlns='http://www.w3.org/2005/Atom'>
<title>coreutils/NEWS, branch master</title>
<subtitle>Mirror of https://https.git.savannah.gnu.org/git/coreutils.git/
</subtitle>
<id>https://git.shady.money/coreutils/atom?h=master</id>
<link rel='self' href='https://git.shady.money/coreutils/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/coreutils/'/>
<updated>2026-04-19T18:50:56Z</updated>
<entry>
<title>doc: NEWS item for who systemd fix</title>
<updated>2026-04-19T18:50:56Z</updated>
<author>
<name>Paul Eggert</name>
<email>eggert@cs.ucla.edu</email>
</author>
<published>2026-04-19T18:50:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/coreutils/commit/?id=271d3ff5f9d3bbb259f6c58d695ce693c92cadba'/>
<id>urn:sha1:271d3ff5f9d3bbb259f6c58d695ce693c92cadba</id>
<content type='text'>
</content>
</entry>
<entry>
<title>df: improve detection of duplicate entries</title>
<updated>2026-04-15T11:56:16Z</updated>
<author>
<name>Lukáš Zaoral</name>
<email>lzaoral@redhat.com</email>
</author>
<published>2026-04-14T12:09:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/coreutils/commit/?id=f6fda635bdc8ea7b6665fa25f7ccc78484a47679'/>
<id>urn:sha1:f6fda635bdc8ea7b6665fa25f7ccc78484a47679</id>
<content type='text'>
Do not compare only with the latest entry for given device id but also
all previously saved entries with the same id.
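
As a rough illustration of the idea (a Python sketch, not the code
in src/df.c): every entry saved for a device id is consulted, rather
than only the most recent one.  The function name and the same_mount
predicate are hypothetical, and keeping the first entry of a group is
an assumption made for brevity.

```python
from collections import defaultdict

def filter_duplicates(mounts, same_mount):
    """Keep the first of each group of duplicate mounts.

    `mounts` is an iterable of (dev_id, entry) pairs and `same_mount`
    is a caller-supplied (hypothetical) predicate on two entries.
    """
    by_dev = defaultdict(list)   # maps dev_id to every entry kept so far
    kept = []
    for dev_id, entry in mounts:
        # Check all saved entries for this device, not just the latest.
        if any(same_mount(old, entry) for old in by_dev[dev_id]):
            continue
        by_dev[dev_id].append(entry)
        kept.append((dev_id, entry))
    return kept

# The third entry duplicates the first, not the latest ("/b"), so a
# comparison against only the latest entry would miss it.
mounts = [(8, "/a"), (8, "/b"), (8, "/a")]
assert filter_duplicates(mounts, lambda x, y: x == y) == [(8, "/a"), (8, "/b")]
```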

* src/df.c (struct devlist): Add next_same_dev struct member.
(filter_mount_list): Iterate over next_same_dev to find duplicates.
* tests/df/skip-duplicates.sh: Add test cases.
* NEWS: Mention the improvement.
https://redhat.atlassian.net/browse/RHEL-5649
</content>
</entry>
<entry>
<title>cat: use splice if operating on pipes or if copy_file_range fails</title>
<updated>2026-04-07T04:57:07Z</updated>
<author>
<name>Collin Funk</name>
<email>collin.funk1@gmail.com</email>
</author>
<published>2026-03-29T23:13:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/coreutils/commit/?id=457f88513a128ce91160c4a60f821cc1612204be'/>
<id>urn:sha1:457f88513a128ce91160c4a60f821cc1612204be</id>
<content type='text'>
On an AMD Ryzen 7 3700X system:

    $ timeout 10 taskset 1 ./src/cat-prev /dev/zero \
        | taskset 2 pv -r &gt; /dev/null
    [1.67GiB/s]
    $ timeout 10 taskset 1 ./src/cat /dev/zero \
        | taskset 2 pv -r &gt; /dev/null
    [9.03GiB/s]

On a Power10 system:

    $ taskset 1 ./src/yes | timeout 10 taskset 2 ./src/cat-prev \
        | taskset 3 pv -r &gt; /dev/null
    [12.9GiB/s]
    $ taskset 1 ./src/yes | timeout 10 taskset 2 ./src/cat \
            | taskset 3 pv -r &gt; /dev/null
    [81.8GiB/s]

* NEWS: Mention the improvement.
* src/cat.c: Include isapipe.h, splice.h, and unistd--.h.
(splice_cat): New function.
(main): Use it.
* src/local.mk (noinst_HEADERS): Add src/splice.h.
* src/splice.h: New file, based on definitions from src/yes.c.
* src/yes.c: Include splice.h.
(pipe_splice_size): Use increase_pipe_size from src/splice.h.
(SPLICE_PIPE_SIZE): Remove definition, moved to src/splice.h.
* tests/cat/splice.sh: New file, based on some tests in
tests/misc/yes.sh.
* tests/local.mk (all_tests): Add the new test.
</content>
</entry>
<entry>
<title>doc: document cut(1) multi-byte and interface consolidation</title>
<updated>2026-04-06T14:52:58Z</updated>
<author>
<name>Pádraig Brady</name>
<email>P@draigBrady.com</email>
</author>
<published>2026-03-12T17:49:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/coreutils/commit/?id=a325c99781b5d9fa6cd849f583dde7b1a57fad5d'/>
<id>urn:sha1:a325c99781b5d9fa6cd849f583dde7b1a57fad5d</id>
<content type='text'>
This patch set updates cut(1) to be multi-byte aware.
It also reduces interface divergence across implementations.

Multi-byte awareness was added to the existing -c, -n, and -d options.
Also considered for compatibility are the -w, -F, and -O options,
as these are present on at least two other common implementations.

= Interface / New functionality =

    macOS,  i18n, uutils, Toybox, Busybox, GNU
-c    x      x       x      x        x      x
-n    x      x                              x
-w    x              x                      x
-F                          x        x      x
-O                          x        x      x

-c is needed anyway as specified by all, including POSIX.
-n is also needed, as specified by i18n/macOS/POSIX.
-w is somewhat less important, but seeing as it's
on two other common platforms (and its functionality is
provided on two more), providing it is worthwhile for compat.
-F and -O are really just aliases to other options
so trivial to add, and probably worthwhile for compatibility.

Interface / functionality notes:

There is a slight divergence between -n implementations.
There was already a difference between FreeBSD and i18n, and
we've aligned with the more sensible FreeBSD implementation.
Note the i18n -n implementation is otherwise buggy in any case,
so I doubt this will be a practical compatibility concern.
Actually -n is specified by POSIX, and it matches FreeBSD.
Specifically our -n will not output a character unless the
byte range encompasses _the end_ of the multi-byte character.
I.e. the end of the -b range is a limit that is not passed, and thus ensures
we don't output overlapping characters for separate cut
invocations that do not have overlapping byte ranges.
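
The -n rule described above can be sketched as follows.  This is a
Python illustration of the stated rule only, not the cut.c code; the
function name cut_b_n is hypothetical, and a single low-high byte
range with UTF-8 input is assumed.

```python
def cut_b_n(line, low, high, encoding="utf-8"):
    """Return the characters of `line` whose final byte lies in the
    1-based inclusive byte range [low, high]."""
    out = []
    end = 0  # 1-based offset of the last byte of the current character
    for ch in line:
        end += len(ch.encode(encoding))
        if end in range(low, high + 1):
            out.append(ch)
    return "".join(out)

text = "\u00e9a"  # e-acute is 2 bytes in UTF-8, 'a' is 1 byte
first = cut_b_n(text, 1, 1)   # range stops before the end of e-acute
rest = cut_b_n(text, 2, 3)    # range covers the end of e-acute, and 'a'
assert first == ""
assert rest == text
```

The two non-overlapping byte ranges together output each character
exactly once, which is the property the rule is meant to ensure.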

-d &lt;regex&gt; from toybox is not implemented.
That's edge case functionality IMHO and not well suited to cut(1).
This functionality is supported by awk, and regex functionality
is best restricted to awk I think.

cut is a significant part of the i18n patch, so it will be good
to avoid that downstream divergence.  Unfortunately there were
no tests with the cut i18n implementation.
Note the i18n cut implementation used fread() and so was
not responsive to new data &lt; BUFSIZ, whereas this implementation
uses read() and thus is responsive to data as it becomes available.

= Performance =

General performance notes:

We prefer byte searching (with -d) as that can be much faster
than character by character processing, and it's supported
on single byte and UTF-8 charsets.  We also use byte searching
with -w on uni-byte locales.
This was seen to give up to 100x perf increase over the i18n patch.
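
A small sketch of why byte searching is safe on UTF-8 (Python is used
only for illustration; cut itself is C).  It relies on the fact that
the encoding of one UTF-8 character never occurs inside or across the
encodings of other characters, so a raw byte search for the encoded
delimiter finds true field boundaries only.

```python
# U+00E9 and U+00E1 are 2-byte characters; U+FF0C (fullwidth comma)
# is the 3-byte delimiter used for mb.in in the tests below.
line = "\u00e9\uff0c\u00e1\uff0c\u00e9".encode("utf-8")
delim = "\uff0c".encode("utf-8")

# Plain byte search: no per character decoding is needed to find fields.
fields = line.split(delim)
assert [f.decode("utf-8") for f in fields] == ["\u00e9", "\u00e1", "\u00e9"]
assert fields[1] == b"\xc3\xa1"
```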

Where we do use per character processing, we avoid conversion to
wide char when processing ASCII data (mcel provides this optimization).
This was seen to give a 14x performance increase over the i18n patch.

We prefer memchr() and strstr() as these are tuned for specific
platforms on glibc, even if memchr2() or memmem()
are algorithmically better.

We maintain the important memory behavior
of only buffering when necessary.

Performance testing:

There are _lots_ of combinations and optimization opportunities.
I performance tested this patch set with the following setup:

$ yes | head -n10M &gt; sl.in
$ yes $(yes eeeaae | head -n10K | paste -s -d,) | head -n10K &gt; ll.in
$ yes $(yes eeeaae | head -n9 | paste -s -d,) | head -n1M &gt; as.in
$ yes $(yes éééááé | head -n9 | paste -s -d，) | head -n1M \
  &gt; mb.in

$ for type in sl ll as mb; do
    cat $type.in &gt;/dev/null;
    for imp in '' src/; do  # '' maps to the system i18n ver on Fedora
      echo ============ "${imp:-i18n}" $type ==============;
      for d in -d, -dc -d， -dç -w -b -c; do
        fields='-f1 -f10 -f100'
        test "$d" = "-b" &amp;&amp; { fields='-b1 -b10 -b100'; d=''; }
        test "$d" = "-c" &amp;&amp; { fields='-c1 -c10 -c100'; d=''; }
        for f in $fields; do
          for loc in C C.UTF-8; do
            # Skip -b for UTF-8 as it is no different
            test "$loc" = C.UTF-8 &amp;&amp; echo "$f" | grep -q -- -b \
             &amp;&amp; continue
            # Skip multi-byte delimiter for C as it is not allowed
            test "$loc" = C &amp;&amp; test $(echo -n "$d" | wc -c) -ge 4 \
             &amp;&amp; continue
            LC_ALL=$loc ${imp}cut $f $d /dev/null 2&gt;/dev/null &amp;&amp;
            hyperfine -m2 -M4 \
             "LC_ALL=$loc ${imp}cut $f $d $type.in &gt;/dev/null" ||
            printf 'Benchmark 1: %s\n  unsupported\n\n' \
             "LC_ALL=$loc ${imp}cut $f $d $type.in &gt;/dev/null"
          done;
        done;
      done;
    done;
  done

After a little post-processing of the results, we get:

-- cut-i18n

| command         |       sl |       ll |       as |       mb |
| --------------- | -------- | -------- | -------- | -------- |
| C -f1 -d,       |  66.3 ms |  1.605 s | 145.9 ms | 366.4 ms |
| UTF8 -f1 -d,    |  65.8 ms |  1.593 s | 145.8 ms | 370.0 ms |
| C -f10 -d,      | 301.4 ms |  1.590 s | 161.8 ms | 126.7 ms |
| UTF8 -f10 -d,   | 303.5 ms |  1.599 s | 161.8 ms | 124.6 ms |
| C -f100 -d,     | 300.6 ms |  1.596 s | 162.1 ms | 126.7 ms |
| UTF8 -f100 -d,  | 301.3 ms |  1.595 s | 162.0 ms | 124.9 ms |
| C -f1 -dc       |  66.6 ms |  1.845 s | 179.1 ms | 365.7 ms |
| UTF8 -f1 -dc    |  73.8 ms |  1.878 s | 179.1 ms | 363.1 ms |
| C -f10 -dc      | 300.7 ms | 349.8 ms |  76.0 ms | 125.3 ms |
| UTF8 -f10 -dc   | 300.4 ms | 347.2 ms |  75.7 ms | 124.8 ms |
| C -f100 -dc     | 300.1 ms | 348.1 ms |  76.5 ms | 125.5 ms |
| UTF8 -f100 -dc  | 300.8 ms | 348.7 ms |  76.4 ms | 125.8 ms |
| UTF8 -f1 -d，   | 563.5 ms | 21.775 s |  1.963 s |  1.665 s |
| UTF8 -f10 -d，  | 833.6 ms | 20.504 s |  2.022 s |  1.612 s |
| UTF8 -f100 -d， | 825.2 ms | 20.448 s |  2.009 s |  1.616 s |
| UTF8 -f1 -dç    | 563.7 ms | 21.827 s |  1.964 s |  2.319 s |
| UTF8 -f10 -dç   | 825.3 ms | 21.713 s |  2.011 s |  2.248 s |
| UTF8 -f100 -dç  | 831.6 ms | 20.505 s |  2.019 s |  2.276 s |
| C -f1 -w        |        - |        - |        - |        - |
| UTF8 -f1 -w     |        - |        - |        - |        - |
| C -f10 -w       |        - |        - |        - |        - |
| UTF8 -f10 -w    |        - |        - |        - |        - |
| C -f100 -w      |        - |        - |        - |        - |
| UTF8 -f100 -w   |        - |        - |        - |        - |
| C -b1           |  60.8 ms |  1.596 s | 154.8 ms | 313.7 ms |
| C -b10          |  51.6 ms |  1.594 s | 154.3 ms | 310.8 ms |
| C -b100         |  51.4 ms |  1.594 s | 153.0 ms | 312.2 ms |
| C -c1           |  60.7 ms |  1.597 s | 153.8 ms | 313.0 ms |
| UTF8 -c1        | 526.5 ms | 14.662 s |  1.362 s |  1.573 s |
| C -c10          |  51.8 ms |  1.591 s | 153.3 ms | 311.4 ms |
| UTF8 -c10       | 436.9 ms | 14.450 s |  1.336 s |  1.563 s |
| C -c100         |  51.0 ms |  1.593 s | 152.7 ms | 313.2 ms |
| UTF8 -c100      | 426.7 ms | 14.429 s |  1.344 s |  1.551 s |

-- src/cut

| command         |       sl |       ll |       as |       mb |
| --------------- | -------- | -------- | -------- | -------- |
| C -f1 -d,       |   4.6 ms | 108.2 ms |  45.4 ms |  24.2 ms |
| UTF8 -f1 -d,    |   4.8 ms | 108.4 ms |  45.4 ms |  24.5 ms |
| C -f10 -d,      |   4.5 ms | 109.3 ms | 123.7 ms |  24.3 ms |
| UTF8 -f10 -d,   |   4.9 ms | 114.1 ms | 124.1 ms |  24.5 ms |
| C -f100 -d,     |   4.7 ms | 119.2 ms | 124.1 ms |  24.5 ms |
| UTF8 -f100 -d,  |   4.8 ms | 120.0 ms | 125.1 ms |  24.5 ms |
| C -f1 -dc       |   4.4 ms | 120.5 ms |  11.9 ms |  24.1 ms |
| UTF8 -f1 -dc    |   4.9 ms | 120.5 ms |  12.1 ms |  24.6 ms |
| C -f10 -dc      |   4.7 ms | 125.3 ms |  11.8 ms |  24.1 ms |
| UTF8 -f10 -dc   |   4.8 ms | 126.7 ms |  12.0 ms |  24.4 ms |
| C -f100 -dc     |   4.6 ms | 127.0 ms |  11.9 ms |  24.3 ms |
| UTF8 -f100 -dc  |   4.7 ms | 126.4 ms |  12.0 ms |  24.4 ms |
| UTF8 -f1 -d，   |   6.0 ms | 169.4 ms |  15.6 ms |  67.4 ms |
| UTF8 -f10 -d，  |   6.1 ms | 173.9 ms |  15.6 ms | 237.2 ms |
| UTF8 -f100 -d， |   6.1 ms | 174.0 ms |  15.6 ms | 237.8 ms |
| UTF8 -f1 -dç    |   6.3 ms | 170.8 ms |  15.7 ms |  32.2 ms |
| UTF8 -f10 -dç   |   6.0 ms | 172.9 ms |  15.9 ms |  32.1 ms |
| UTF8 -f100 -dç  |   6.7 ms | 173.1 ms |  15.5 ms |  32.3 ms |
| C -f1 -w        | 159.6 ms | 170.1 ms |  69.1 ms |  98.9 ms |
| UTF8 -f1 -w     | 128.1 ms |  2.525 s | 246.5 ms |  1.086 s |
| C -f10 -w       | 183.3 ms | 199.2 ms |  74.6 ms | 105.0 ms |
| UTF8 -f10 -w    | 130.3 ms |  2.659 s | 276.5 ms |  1.099 s |
| C -f100 -w      | 183.8 ms | 202.5 ms |  74.1 ms | 103.6 ms |
| UTF8 -f100 -w   | 130.1 ms |  2.663 s | 276.6 ms |  1.097 s |
| C -b1           |  65.0 ms | 110.2 ms |  22.4 ms |  35.6 ms |
| C -b10          |  48.7 ms | 109.6 ms |  24.2 ms |  36.7 ms |
| C -b100         |  48.7 ms | 110.6 ms |  19.0 ms |  36.6 ms |
| C -c1           |  65.8 ms | 109.5 ms |  22.4 ms |  35.6 ms |
| UTF8 -c1        |  63.2 ms |  1.130 s | 116.9 ms | 610.2 ms |
| C -c10          |  48.7 ms | 109.8 ms |  24.3 ms |  36.8 ms |
| UTF8 -c10       |  39.7 ms |  1.133 s | 118.7 ms | 610.0 ms |
| C -c100         |  48.3 ms | 110.7 ms |  18.9 ms |  36.7 ms |
| UTF8 -c100      |  39.4 ms |  1.141 s | 115.0 ms | 598.8 ms |

In summary, compared to the i18n patch we're now as fast in all cases,
and much faster in most cases.

We can see the -f byte searching performing well,
being 120x faster in the no matching delimiter case,
to at least 3x faster in the matching delimiter case.

When we resort to per character processing we also compare well,
being 14x faster in the ASCII processing case
(due to mcel short-circuiting the wide char conversion).
Note the mb.in processing results above also show a 2x win
in per character processing cases, but the i18n patch would have
also picked that win up as it's achieved separately to this patch set:
https://lists.gnu.org/r/coreutils/2026-03/msg00117.html
</content>
</entry>
<entry>
<title>cut,fold,expand,unexpand: ensure we process all available characters</title>
<updated>2026-04-06T14:52:56Z</updated>
<author>
<name>Pádraig Brady</name>
<email>P@draigBrady.com</email>
</author>
<published>2026-04-02T19:19:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/coreutils/commit/?id=57c87043f6cdb6aeb043b78d607a43a9ae615430'/>
<id>urn:sha1:57c87043f6cdb6aeb043b78d607a43a9ae615430</id>
<content type='text'>
* gl/lib/mbbuf.h: Adjust mbbuf_fill() to process full characters
in the slop at the end of a read().  Previously valid characters
in the last MCEL_LEN_MAX bytes were ignored until the next read().
* src/cut.c (cut_fields_bytesearch): Adjust to the new naming.
* NEWS: Mention the fold(1) responsiveness fix, which was
improved with the change from fread() to read(),
and completed with this patch.
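
To illustrate the intended behavior (a Python sketch in which the
standard incremental decoder stands in for mbbuf_fill(); it is not
the gnulib code): complete characters at the end of a read() are
handed over immediately, and only a truly incomplete trailing
sequence is held back for the next read().

```python
import codecs

decoder = codecs.getincrementaldecoder("utf-8")()

# The first read ends with a complete 2-byte character (U+00E9)
# followed by the first byte of another one.
chunk1 = "ab\u00e9".encode("utf-8") + b"\xc3"
chunk2 = b"\xa9"  # the second read supplies the missing byte

first = decoder.decode(chunk1)
second = decoder.decode(chunk2)
assert first == "ab\u00e9"   # nothing complete is left waiting
assert second == "\u00e9"    # only the split character was deferred
```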
</content>
</entry>
<entry>
<title>build: update to latest gnulib</title>
<updated>2026-04-05T12:11:21Z</updated>
<author>
<name>Pádraig Brady</name>
<email>P@draigBrady.com</email>
</author>
<published>2026-04-05T12:04:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/coreutils/commit/?id=1204b29babbd3288686b118fb9cc103dd8d5f2b3'/>
<id>urn:sha1:1204b29babbd3288686b118fb9cc103dd8d5f2b3</id>
<content type='text'>
Pick up mbrto{c32,wc} optimizations on UTF-8 on GLIBC.
Note configure.ac defines the required GNULIB_WCHAR_SINGLE_LOCALE.
This speeds up wc -m by 2.6x, when processing non ASCII chars,
and will similarly speed up per character processing
in the impending cut multi-byte implementation.
* NEWS: Mention the wc -m speed improvement.
</content>
</entry>
<entry>
<title>tac: promptly diagnose write errors</title>
<updated>2026-03-21T19:10:59Z</updated>
<author>
<name>Collin Funk</name>
<email>collin.funk1@gmail.com</email>
</author>
<published>2026-03-21T08:07:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/coreutils/commit/?id=30a5cbec0efc2941513c5f51652fb851b7b87bba'/>
<id>urn:sha1:30a5cbec0efc2941513c5f51652fb851b7b87bba</id>
<content type='text'>
This patch also fixes a bug where 'tac' would print a vague error on
some inputs:

    $ seq 10000 | ./src/tac-prev &gt; /dev/full
    tac-prev: write error
    $ seq 10000 | ./src/tac &gt; /dev/full
    tac: write error: No space left on device

In this case ferror (stdout) is true, but errno has been set back to
zero by a successful fclose (stdout) call.

* src/tac.c (output): Call write_error() if fwrite fails.
* tests/misc/io-errors.sh: Check that 'tac' prints a detailed write
error.
* NEWS: Mention the improvement.
</content>
</entry>
<entry>
<title>timeout: don't exit immediately if the parent is the init process</title>
<updated>2026-03-14T03:37:10Z</updated>
<author>
<name>Collin Funk</name>
<email>collin.funk1@gmail.com</email>
</author>
<published>2026-03-14T03:37:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/coreutils/commit/?id=e644eea122462aa7fa98cbe9b8f93088074588a0'/>
<id>urn:sha1:e644eea122462aa7fa98cbe9b8f93088074588a0</id>
<content type='text'>
* src/timeout.c (main): Save the process ID before creating a child
process. Check if the result of getppid is different than the saved
process ID instead of checking if it is 1.
* tests/timeout/init-parent.sh: New file.
* tests/local.mk (all_tests): Add the new test.
* NEWS: Mention the bug fix. Also mention that this change allows
'timeout' to work when reparented by a subreaper process instead of
init.
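
The check described above can be sketched as follows (Python for
illustration only; src/timeout.c is C, and the helper name
parent_changed is hypothetical).  The process ID is saved before the
fork, and the child compares getppid() against it instead of
against 1.

```python
import os

def parent_changed(saved_parent_pid):
    """True if this process has been reparented since the pid was saved."""
    return os.getppid() != saved_parent_pid

saved = os.getpid()  # saved before creating the child
pid = os.fork()
if pid == 0:
    # Child: comparing against the saved pid stays correct when the
    # parent is pid 1, where a "getppid() == 1" test would misfire, and
    # it also notices reparenting to a subreaper whose pid is not 1.
    os._exit(1 if parent_changed(saved) else 0)

_, status = os.waitpid(pid, 0)
assert os.WEXITSTATUS(status) == 0  # parent was alive, so no change seen
```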
</content>
</entry>
<entry>
<title>dd: always diagnose partial writes on write failure</title>
<updated>2026-03-12T21:19:33Z</updated>
<author>
<name>Pádraig Brady</name>
<email>P@draigBrady.com</email>
</author>
<published>2026-03-11T15:39:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/coreutils/commit/?id=711fa8f9a5f2a476b6c8790ef83691638a4be96e'/>
<id>urn:sha1:711fa8f9a5f2a476b6c8790ef83691638a4be96e</id>
<content type='text'>
* src/dd.c (dd_copy): Increment the partial write count upon failure.
* tests/dd/partial-write.sh: Add a new test.
* tests/local.mk: Reference the new test.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/80583
</content>
</entry>
<entry>
<title>doc: clarify a recent NEWS item</title>
<updated>2026-03-12T20:05:58Z</updated>
<author>
<name>Pádraig Brady</name>
<email>P@draigBrady.com</email>
</author>
<published>2026-03-11T15:57:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/coreutils/commit/?id=6382995b0a5d3e6faf23e7dd8deab5f51f9cff70'/>
<id>urn:sha1:6382995b0a5d3e6faf23e7dd8deab5f51f9cff70</id>
<content type='text'>
* NEWS: It was ambiguous as to whether we quoted a range of
observed throughputs.  Clarify this was the old and new
throughput on a single test system.
</content>
</entry>
</feed>
