coreutils
cksum/hashsum: Support for non-UTF-8 input in checksum files
Second part of splitting #6603.
This PR adds support for non-UTF-8 checksum file content for the cksum and hashsum validation tools.
In practice, given a CHECKSUM file with the following content, non-UTF-8 characters can appear in two places:
- in the filename
- in a comment

Everything else is either the algorithm name, which should be ASCII, or the hexadecimal/base64 digest, which should be ASCII as well.
$ cat CHECKSUM
SHA1 (filename-might-have-non-utf-8) = 058ab38dd3603703b3a7063cf95dc51a4286b6fe
# comment-might-have-non-utf-8
The most important step is to stop using BufRead::lines() to iterate over the lines of the CHECKSUM file; otherwise, it panics on non-UTF-8 characters.
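As an illustration of that step, here is a minimal sketch of reading a checksum file line by line as raw bytes, using the standard library's BufRead::split() in place of BufRead::lines(). The helper name read_raw_lines is hypothetical and is not taken from this PR; the actual change may be structured differently.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper (not from the PR): collect the lines of a checksum
/// file as raw bytes, so that non-UTF-8 filenames and comments survive.
fn read_raw_lines<R: BufRead>(reader: R) -> io::Result<Vec<Vec<u8>>> {
    let mut lines = Vec::new();
    // `split` yields `io::Result<Vec<u8>>` and never validates UTF-8,
    // unlike `lines`, which yields an error on invalid UTF-8.
    for line in reader.split(b'\n') {
        let mut line = line?;
        // Drop a trailing carriage return so CRLF files behave the same.
        if line.last() == Some(&b'\r') {
            line.pop();
        }
        lines.push(line);
    }
    Ok(lines)
}
```

Each returned Vec<u8> can then be parsed as bytes: the algorithm name and digest are expected to be ASCII, while the filename and comment parts are kept as opaque bytes.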
This change makes heavy use of String/OsString/Vec<u8> conversions, which are not treated equally on all platforms.
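To make the platform difference concrete, here is a minimal sketch of converting raw filename bytes into an OsString. The helper name bytes_to_os_string is hypothetical, and the lossy fallback on non-Unix targets is an assumption made for this example, not necessarily what this PR does.

```rust
use std::ffi::OsString;

/// Hypothetical helper (not from the PR): on Unix, an OsString can hold
/// arbitrary bytes, so the conversion is lossless.
#[cfg(unix)]
fn bytes_to_os_string(bytes: Vec<u8>) -> OsString {
    use std::os::unix::ffi::OsStringExt;
    OsString::from_vec(bytes)
}

/// Assumption for this sketch: on other platforms there is no stable
/// byte-level constructor, so fall back to a lossy UTF-8 decode, which
/// replaces invalid sequences with U+FFFD.
#[cfg(not(unix))]
fn bytes_to_os_string(bytes: Vec<u8>) -> OsString {
    OsString::from(String::from_utf8_lossy(&bytes).into_owned())
}
```

With this sketch, valid UTF-8 names convert the same way everywhere, but a name containing an invalid byte such as 0xFF keeps its exact bytes only on Unix.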