coreutils
coreutils copied to clipboard
cksum: Add untagged option
Closes #3811
GNU testsuite comparison:
GNU test failed: tests/misc/tee. tests/misc/tee is passing on 'main'. Maybe you have to rebase?
GNU testsuite comparison:
GNU test failed: tests/tail-2/inotify-dir-recreate. tests/tail-2/inotify-dir-recreate is passing on 'main'. Maybe you have to rebase?
Cool, but I'm not sure it's correct. I get the following results with this PR and GNU:
❯ cargo run -- cksum --untagged Cargo.lock 1822961466 74712 Cargo.lock ❯ cargo run -- cksum Cargo.lock 1559500086 74712 Cargo.lock ❯ cksum --untagged Cargo.lock 1559500086 74712 Cargo.lock ❯ cksum Cargo.lock 1559500086 74712 Cargo.lock
And another issue:
❯ cargo run -- cksum --untagged README.md thread 'main' panicked at 'attempt to subtract with overflow', src/uu/cksum/src/cksum.rs:116:9 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
As far as I can tell,
--untagged
seems to be a noop if no algorithm is given and we don't support different algorithms yet. Here's GNU when we specify the md5 algorithm:❯ cksum -a md5 README.md MD5 (README.md) = 08c3964af54e525755b7364b3221a9c8 ❯ cksum -a md5 --untagged README.md 08c3964af54e525755b7364b3221a9c8 README.md
Ok it seems that I need to take #3812 together. Let me try to finish #3812 then turn back to this PR?
Sure! I'll mark this as a draft in the meantime
Ok thanks!
is it ready for review? thanks
is it ready for review? thanks
waiting for #4356
GNU testsuite comparison:
Congrats! The gnu test tests/misc/tee is no longer failing!
GNU test failed: tests/misc/cksum. tests/misc/cksum is passing on 'main'. Maybe you have to rebase?
done
GNU testsuite comparison:
Congrats! The gnu test tests/rm/rm2 is no longer failing!
@tertsdiepraam Is there anything I should do for this PR?
Hi @tertsdiepraam I want to check what reversed style checksum
means?
The current error I made was caused by I misunderstanding the meaning of reversed style checksum
or it is just some algorithms that we don't need to reverse them
That phrasing is indeed weird. My guess (but I'm not sure) is that they meant only reversing the output, so whether it's filename followed by checksum or checksum followed by filename. This page might help explain: https://www.gnu.org/software/coreutils/manual/html_node/cksum-invocation.html
AH
❯ cksum -a sha1 a.out
SHA1 (a.out) = 70f26e95e2da7dfb6a223b9adcad311c23781c82
❯ cksum -a sha1 a.out --untagged
70f26e95e2da7dfb6a223b9adcad311c23781c82 a.out
❯ cksum -a crc a.out --untagged
447325735 11848 a.out
❯ cksum -a sysv a.out --untagged
38902 24 a.out
❯ cksum -a bsd a.out --untagged
34274 12 a.out
❯ cksum -a bsd a.out
34274 12 a.out
❯ cksum -a sysv a.out
38902 24 a.out
❯ cksum -a crc a.out
447325735 11848 a.out
that's what it does :
if the hash function is not a "legacy" or the default one (bsd,sysv, or crc as in the docs you mentionned) it does print it in kind of math function notation
hashfunct(file) = <HASH>
But if you put untagged it shows it as without the argument just like other utilities would.
<HASH> file
❯ cksum -a sha1 a.out
SHA1 (a.out) = 70f26e95e2da7dfb6a223b9adcad311c23781c82
❯ cksum -a sha1 a.out --untagged
70f26e95e2da7dfb6a223b9adcad311c23781c82 a.out
@tertsdiepraam I don't know if there is a rust way to reuse the code in the hashum, as this is some kind of aliasing : cksum is just doing some maths-like printing it seems.
This is how I believe we could manage the arguments:
By default, untagged is unset. Any of the legacy algorithm (bsd,sysv and crc) override the default.
Then all we have to do is format as we do in the hashsum.rs code i f untagged is set, and otherwise format it as:
[hashfunction]␣(path argument value)␣=␣[HASH]
Could you replace the image with output?
I don't know if there is a rust way to reuse the code in the sha1sum,... utils, as this is some kind of aliasing : cksum is just doing some maths-like printing it seems.
Indeed. I was working on this, but it stranded a bit. It's on this branch: https://github.com/uutils/coreutils/compare/main...tertsdiepraam:coreutils:cksum-rewrite
There's a weird situation where we have the additional hashsum util, which is useless, because it's basically equivalent to GNU cksum. Ideally, the individual utils share code with cksum, instead of hashsum.
@tertsdiepraam I removed the image as asked :)
I wanted to do a joke about gnu being bloated, but I will refrain from doing so ;)
Anyway, because hashsum is our doing, can we not just add a --math-style
argument
to hashsum and turn cksum into an alias just like we do for other tools like md5sum?
This way we would not have to modify the hashsum "statu quo'" for now and introduce a breaking change
I'm not concerned about breaking changes yet to be honest. We're at version 0.0.17, which should signal that there will be breaking changes. In my opinion we can just get rid of hashsum and implement the individual utils via cksum
(though we need feature parity of cksum
first).
In this case it's not GNU being bloated, it's us :)
Closing this PR as it has been superseded by #4860. Thanks anyway for your PR.