
cksum/hashsum: Support for non-UTF-8 input in checksum files

Open RenjiSann opened this issue 4 months ago • 6 comments

Second part of splitting #6603.

This PR adds support for non-UTF-8 checksum file content for cksum and hashsum validation tools.

In practice, given a CHECKSUM file like the one below, non-UTF-8 bytes can appear in two places:

  • in the filename
  • in a comment

Everything else is either the algorithm name, which should be ASCII, or the hexadecimal/base64 digest, which should be ASCII as well.

$ cat CHECKSUM
SHA1 (filename-might-have-non-utf-8) = 058ab38dd3603703b3a7063cf95dc51a4286b6fe
# comment-might-have-non-utf-8
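
To illustrate where the raw bytes end up, here is a minimal sketch of splitting such a tagged line without decoding the filename. `parse_tagged_line` is a hypothetical helper written for this description, not the parser used in the code base, and it assumes the `ALGO (filename) = digest` layout shown above.

```rust
/// Split a tagged checksum line into (algorithm, filename bytes, digest).
/// Hypothetical helper for illustration only.
fn parse_tagged_line(line: &[u8]) -> Option<(&str, &[u8], &str)> {
    // The algorithm name ends at the first " (".
    let open = line.windows(2).position(|w| w == b" (")?;
    // The filename ends at the last ") = ", since the name itself may contain ")".
    let close = line.windows(4).rposition(|w| w == b") = ")?;
    if close < open + 2 {
        return None;
    }
    // Algorithm and digest are expected to be ASCII, so decoding them is safe to require.
    let algo = std::str::from_utf8(&line[..open]).ok()?;
    let digest = std::str::from_utf8(&line[close + 4..]).ok()?;
    if !algo.is_ascii() || !digest.is_ascii() {
        return None;
    }
    // The filename is kept as raw bytes and never decoded.
    let filename = &line[open + 2..close];
    Some((algo, filename, digest))
}
```

Only the algorithm and digest go through `str::from_utf8`; the filename stays a byte slice until it has to be turned into a path.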

The most important step is to stop using BufRead::lines() to iterate over the lines of the CHECKSUM file: it yields an error for any line that is not valid UTF-8, which turns into a panic if the result is unwrapped.
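
As a rough sketch of the alternative, the lines can be read as raw bytes with `BufRead::split`, which never attempts UTF-8 decoding. `read_raw_lines` is a hypothetical name chosen for this example, and the actual change may structure the loop differently.

```rust
use std::io::{self, BufRead};

/// Read every line of `reader` as raw bytes, without requiring valid UTF-8.
/// Hypothetical helper for illustration only.
fn read_raw_lines<R: BufRead>(reader: R) -> io::Result<Vec<Vec<u8>>> {
    let mut lines = Vec::new();
    // `split` yields each chunk up to (and excluding) the delimiter byte.
    for line in reader.split(b'\n') {
        // Only genuine I/O errors surface here; invalid UTF-8 is not an error.
        lines.push(line?);
    }
    Ok(lines)
}
```

Each entry is a `Vec<u8>`, so later stages decide for themselves which parts need to be valid text.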

This change makes heavy use of String/OsString/Vec<u8> conversions, which are not handled the same way on all platforms.
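
A minimal sketch of why the platforms differ, assuming a hypothetical `bytes_to_os_string` helper (not a function from this change): on Unix an `OsString` can be built from arbitrary bytes, while elsewhere this sketch has to fall back to a lossy conversion.

```rust
use std::ffi::OsString;

/// Convert raw filename bytes into an `OsString`.
/// Hypothetical helper for illustration only.
#[cfg(unix)]
fn bytes_to_os_string(bytes: &[u8]) -> OsString {
    use std::os::unix::ffi::OsStringExt;
    // On Unix an OsString is an arbitrary byte sequence, so this is lossless.
    OsString::from_vec(bytes.to_vec())
}

#[cfg(not(unix))]
fn bytes_to_os_string(bytes: &[u8]) -> OsString {
    // Assumption of this sketch: without a safe constructor from arbitrary
    // bytes, invalid sequences are replaced with U+FFFD.
    OsString::from(String::from_utf8_lossy(bytes).into_owned())
}
```

The lossy branch means a filename with invalid bytes may not round-trip on non-Unix targets, which is the kind of asymmetry that needs per-platform care.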

RenjiSann, Oct 18 '24 13:10