| author | Paul Eggert <eggert@cs.ucla.edu> | 2023-11-16 11:34:55 -0800 |
|---|---|---|
| committer | Paul Eggert <eggert@cs.ucla.edu> | 2023-11-16 11:37:25 -0800 |
| commit | 74b9d6a6e872553dcebfc41f8cb962d5d85d77e5 (patch) | |
| tree | 05984d870a9ccc2233182acc5dee8ff139a29251 /tests | |
| parent | tests: omit inapplicable test code | |
uniq: fix bug with -w in multibyte locales
-w counted bytes not characters, which is wrong in multibyte locales.
This bug exists even in Fedora, which is why the recently-added
test cases from Fedora didn’t catch it.
* src/uniq.c (find_field): New arg PLEN. All callers changed.
Compute length of field correctly in multi-byte locales.
(different): Don’t worry about check_chars; find_field now does that.
* tests/uniq/uniq.pl: Test for this bug.
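
The byte-versus-character distinction described above can be illustrated with a minimal sketch. It is not the code from src/uniq.c: the helper names `prefix_bytes_utf8`, `differ_bytes`, and `differ_chars` are hypothetical, and the sketch hard-codes UTF-8 instead of consulting the current locale as uniq does. It shows why `-w1` is troublesome for the test input: "à" (bytes C3 A0) and "á" (bytes C3 A1) share their first byte, so a comparison limited to one byte treats the two lines as duplicates, while a comparison limited to one character does not.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from uniq.c: return the number of
   bytes occupied by the first NCHARS characters of S, which is LEN
   bytes long.  Assumes valid UTF-8, where continuation bytes have
   the bit pattern 10xxxxxx.  */
static size_t
prefix_bytes_utf8 (char const *s, size_t len, size_t nchars)
{
  size_t i = 0;
  while (i < len && nchars > 0)
    {
      i++;                      /* Skip the lead byte.  */
      while (i < len && ((unsigned char) s[i] & 0xC0) == 0x80)
        i++;                    /* Skip continuation bytes.  */
      nchars--;
    }
  return i;
}

/* Return 1 if A and B differ within their first N bytes, else 0.
   This models the old, byte-counting behavior.  */
static int
differ_bytes (char const *a, char const *b, size_t n)
{
  size_t alen = strlen (a), blen = strlen (b);
  if (alen > n)
    alen = n;
  if (blen > n)
    blen = n;
  return alen != blen || memcmp (a, b, alen) != 0;
}

/* Return 1 if A and B differ within their first N characters,
   else 0.  This models the character-counting behavior.  */
static int
differ_chars (char const *a, char const *b, size_t n)
{
  size_t alen = prefix_bytes_utf8 (a, strlen (a), n);
  size_t blen = prefix_bytes_utf8 (b, strlen (b), n);
  return alen != blen || memcmp (a, b, alen) != 0;
}
```

With `a = "\xC3\xA0"` (à) and `b = "\xC3\xA1"` (á), `differ_bytes (a, b, 1)` returns 0, so the second line would be dropped as a duplicate, while `differ_chars (a, b, 1)` returns 1, so both lines are kept. That matches the new test's expectation that the output equals the input.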
Diffstat (limited to 'tests')
| -rwxr-xr-x | tests/uniq/uniq.pl | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/tests/uniq/uniq.pl b/tests/uniq/uniq.pl
index f8aff5fa5..b9653a5bf 100755
--- a/tests/uniq/uniq.pl
+++ b/tests/uniq/uniq.pl
@@ -292,6 +292,15 @@ if ($mb_locale ne 'C')
         push @new, ["$test_name-mb", @new_t, {ENV => "LC_ALL=$mb_locale"}];
       }
     push @Tests, @new;
+
+    # Test that -w counts characters, not bytes.
+    my $trouble_with_w1 = "à\ná\n";
+    my @Locale_Tests =
+      (
+       ['w1-mb', '-w1', {IN => $trouble_with_w1}, {OUT => $trouble_with_w1},
+        {ENV => "LC_ALL=$mb_locale"}]
+      );
+    push @Tests, @Locale_Tests;
   }
 
 # Remember that triple_test creates from each test with exactly one "IN"
