coreutils icon indicating copy to clipboard operation
coreutils copied to clipboard

cksum: Add untagged option

Open howjmay opened this issue 1 year ago • 7 comments

Closes #3811

howjmay avatar Feb 12 '23 13:02 howjmay

GNU testsuite comparison:

GNU test failed: tests/misc/tee. tests/misc/tee is passing on 'main'. Maybe you have to rebase?

github-actions[bot] avatar Feb 12 '23 14:02 github-actions[bot]

GNU testsuite comparison:

GNU test failed: tests/tail-2/inotify-dir-recreate. tests/tail-2/inotify-dir-recreate is passing on 'main'. Maybe you have to rebase?

github-actions[bot] avatar Feb 12 '23 14:02 github-actions[bot]

Cool, but I'm not sure it's correct. I get the following results with this PR and GNU:

❯ cargo run -- cksum --untagged Cargo.lock
1822961466 74712 Cargo.lock
❯ cargo run -- cksum Cargo.lock
1559500086 74712 Cargo.lock
❯ cksum --untagged Cargo.lock
1559500086 74712 Cargo.lock
❯ cksum Cargo.lock
1559500086 74712 Cargo.lock

And another issue:

❯ cargo run -- cksum --untagged README.md
thread 'main' panicked at 'attempt to subtract with overflow', src/uu/cksum/src/cksum.rs:116:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

As far as I can tell, --untagged seems to be a noop if no algorithm is given and we don't support different algorithms yet. Here's GNU when we specify the md5 algorithm:

❯ cksum -a md5 README.md
MD5 (README.md) = 08c3964af54e525755b7364b3221a9c8
❯ cksum -a md5 --untagged README.md
08c3964af54e525755b7364b3221a9c8  README.md

Ok it seems that I need to take #3812 together. Let me try to finish #3812 then turn back to this PR?

howjmay avatar Feb 12 '23 17:02 howjmay

Sure! I'll mark this as a draft in the meantime

tertsdiepraam avatar Feb 12 '23 17:02 tertsdiepraam

Ok thanks!

howjmay avatar Feb 12 '23 17:02 howjmay

is it ready for review? thanks

sylvestre avatar Feb 18 '23 09:02 sylvestre

is it ready for review? thanks

waiting for #4356

howjmay avatar Feb 18 '23 13:02 howjmay

GNU testsuite comparison:

Congrats! The gnu test tests/misc/tee is no longer failing!
GNU test failed: tests/misc/cksum. tests/misc/cksum is passing on 'main'. Maybe you have to rebase?

github-actions[bot] avatar Mar 08 '23 18:03 github-actions[bot]

done

howjmay avatar Mar 08 '23 18:03 howjmay

GNU testsuite comparison:

Congrats! The gnu test tests/rm/rm2 is no longer failing!

github-actions[bot] avatar Mar 08 '23 19:03 github-actions[bot]

@tertsdiepraam Is there anything I should do for this PR?

howjmay avatar Mar 12 '23 11:03 howjmay

Hi @tertsdiepraam I want to check what reversed style checksum means? The current error I made was caused by I misunderstanding the meaning of reversed style checksum or it is just some algorithms that we don't need to reverse them

howjmay avatar Mar 18 '23 02:03 howjmay

That phrasing is indeed weird. My guess (but I'm not sure) is that they meant only reversing the output, so whether it's filename followed by checksum or checksum followed by filename. This page might help explain: https://www.gnu.org/software/coreutils/manual/html_node/cksum-invocation.html

tertsdiepraam avatar Mar 18 '23 09:03 tertsdiepraam

AH

❯ cksum -a  sha1 a.out
SHA1 (a.out) = 70f26e95e2da7dfb6a223b9adcad311c23781c82
❯ cksum -a  sha1 a.out --untagged
70f26e95e2da7dfb6a223b9adcad311c23781c82  a.out

❯ cksum -a crc a.out --untagged
447325735 11848 a.out
❯ cksum -a sysv a.out --untagged
38902 24 a.out
❯ cksum -a bsd a.out --untagged
34274    12 a.out

❯ cksum -a bsd a.out
34274    12 a.out 
❯ cksum -a sysv a.out
38902 24 a.out
❯ cksum -a crc a.out
447325735 11848 a.out

that's what it does :

if the hash function is not a "legacy" or the default one (bsd,sysv, or crc as in the docs you mentionned) it does print it in kind of math function notation

hashfunct(file) = <HASH>

But if you put untagged it shows it as without the argument just like other utilities would.

<HASH> file

❯ cksum -a  sha1 a.out
SHA1 (a.out) = 70f26e95e2da7dfb6a223b9adcad311c23781c82
❯ cksum -a  sha1 a.out --untagged
70f26e95e2da7dfb6a223b9adcad311c23781c82  a.out

@tertsdiepraam I don't know if there is a rust way to reuse the code in the hashum, as this is some kind of aliasing : cksum is just doing some maths-like printing it seems.

This is how I believe we could manage the arguments:

By default, untagged is unset. Any of the legacy algorithm (bsd,sysv and crc) override the default.

Then all we have to do is format as we do in the hashsum.rs code i f untagged is set, and otherwise format it as:

[hashfunction]␣(path argument value)␣=␣[HASH]

anastygnome avatar Mar 19 '23 12:03 anastygnome

Could you replace the image with output?

I don't know if there is a rust way to reuse the code in the sha1sum,... utils, as this is some kind of aliasing : cksum is just doing some maths-like printing it seems.

Indeed. I was working on this, but it stranded a bit. It's on this branch: https://github.com/uutils/coreutils/compare/main...tertsdiepraam:coreutils:cksum-rewrite

There's a weird situation where we have the additional hashsum util, which is useless, because it's basically equivalent to GNU cksum. Ideally, the individual utils share code with cksum, instead of hashsum.

tertsdiepraam avatar Mar 19 '23 12:03 tertsdiepraam

@tertsdiepraam I removed the image as asked :)

I wanted to do a joke about gnu being bloated, but I will refrain from doing so ;)

Anyway, because hashsum is our doing, can we not just add a --math-style argument
to hashsum and turn cksum into an alias just like we do for other tools like md5sum?

This way we would not have to modify the hashsum "statu quo'" for now and introduce a breaking change

anastygnome avatar Mar 19 '23 13:03 anastygnome

I'm not concerned about breaking changes yet to be honest. We're at version 0.0.17, which should signal that there will be breaking changes. In my opinion we can just get rid of hashsum and implement the individual utils via cksum (though we need feature parity of cksum first).

In this case it's not GNU being bloated, it's us :)

tertsdiepraam avatar Mar 19 '23 15:03 tertsdiepraam

Closing this PR as it has been superseded by #4860. Thanks anyway for your PR.

cakebaker avatar May 14 '23 14:05 cakebaker