Multi-call `hashsum` binary is no longer necessary

Open tertsdiepraam opened this issue 2 years ago • 1 comments

After https://github.com/uutils/coreutils/pull/4356, the combined hashsum binary will no longer be necessary. Still, all the hashing utilities will be based on hashsum, so my suggestion is to refactor the hashsum crate so that it defines multiple binaries, each with their own main function. We can then remove the multi-call logic and simplify the code, because the Digest type for each binary will be statically defined.
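
To illustrate what "statically defined" could mean here, the following is a minimal sketch and not code from the repository. The `Digest` trait, the two toy algorithms (`ByteSum`, `Xor8`) and the `run` helper are all hypothetical names invented for this example; the real trait in uutils may look different. The point is only that a generic entry point lets each binary pick its algorithm at compile time, with no runtime dispatch on the program name.

```rust
// Hypothetical trait for illustration; the real Digest trait in uutils may differ.
trait Digest {
    const NAME: &'static str;
    fn new() -> Self;
    fn update(&mut self, data: &[u8]);
    fn finalize(self) -> String;
}

// Toy algorithm (not a real checksum utility): wrapping sum of all bytes.
struct ByteSum(u32);

impl Digest for ByteSum {
    const NAME: &'static str = "bytesum";
    fn new() -> Self {
        ByteSum(0)
    }
    fn update(&mut self, data: &[u8]) {
        for &b in data {
            self.0 = self.0.wrapping_add(u32::from(b));
        }
    }
    fn finalize(self) -> String {
        format!("{:08x}", self.0)
    }
}

// Toy algorithm (not a real checksum utility): XOR of all bytes.
struct Xor8(u8);

impl Digest for Xor8 {
    const NAME: &'static str = "xor8";
    fn new() -> Self {
        Xor8(0)
    }
    fn update(&mut self, data: &[u8]) {
        for &b in data {
            self.0 ^= b;
        }
    }
    fn finalize(self) -> String {
        format!("{:02x}", self.0)
    }
}

// Shared logic, generic over the algorithm. Each binary would monomorphize
// this for exactly one Digest type, so no multi-call lookup is needed.
fn run<D: Digest>(input: &[u8]) -> String {
    let mut digest = D::new();
    digest.update(input);
    format!("{}  {}", digest.finalize(), D::NAME)
}

// In the proposed layout, each binary's main would contain a single call
// such as `run::<ByteSum>(...)`; both are shown here for demonstration.
fn main() {
    println!("{}", run::<ByteSum>(b"abc"));
    println!("{}", run::<Xor8>(b"abc"));
}
```

With this shape, adding a new utility amounts to one `impl Digest` plus a small `main`, at the cost of compiling the shared logic once per binary.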

tertsdiepraam avatar Feb 22 '23 10:02 tertsdiepraam

I have changed my mind a bit. hashsum is still unnecessary and we need to remove it. cksum will first need to get feature parity and then we can start changing things up.

So here's my plan:

  1. Update cksum so that it becomes a better version of the current hashsum.
  2. I want to extract the uucore::sum module to a crate multisum. The idea of this crate is to provide a common interface to many summing algorithms, which might be helpful for other projects too.
  3. Create a module for the common functionality based on the updated cksum in a subdirectory of uu.
  4. In that same subdirectory, we define all utils.

So something like this:

uu/
-- cksum/
   -- Cargo.toml
   -- src/
      -- common.rs
      -- b2sum.rs
      -- cksum.rs
      -- sum.rs
      -- etc.

I'm not sure that this exact structure will work, but I think it's gotta be something close to this.
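
As a rough illustration of the "common interface to many summing algorithms" from step 2, here is a minimal sketch. It is an assumption about what such an interface might look like, not the actual `uucore::sum` API or a published `multisum` crate: the `Checksum` trait, the `by_name` and `checksum` functions, and the choice of Adler-32 and a BSD-style 16-bit rotating checksum as examples are all made up for this example.

```rust
// Hypothetical common interface; not the actual uucore::sum API.
trait Checksum {
    fn update(&mut self, data: &[u8]);
    fn result(&self) -> String;
}

// Adler-32: two running sums modulo 65521, combined as (b << 16) | a.
struct Adler32 {
    a: u32,
    b: u32,
}

impl Checksum for Adler32 {
    fn update(&mut self, data: &[u8]) {
        for &byte in data {
            self.a = (self.a + u32::from(byte)) % 65521;
            self.b = (self.b + self.a) % 65521;
        }
    }
    fn result(&self) -> String {
        format!("{:08x}", (self.b << 16) | self.a)
    }
}

// BSD-style checksum: rotate the 16-bit state right by one, then add the byte.
struct BsdSum(u16);

impl Checksum for BsdSum {
    fn update(&mut self, data: &[u8]) {
        for &byte in data {
            self.0 = self.0.rotate_right(1).wrapping_add(u16::from(byte));
        }
    }
    fn result(&self) -> String {
        format!("{:05}", self.0)
    }
}

// Runtime selection by name, roughly what an `--algorithm`-style option needs.
fn by_name(name: &str) -> Option<Box<dyn Checksum>> {
    let sum: Box<dyn Checksum> = match name {
        "adler32" => Box::new(Adler32 { a: 1, b: 0 }),
        "bsd" => Box::new(BsdSum(0)),
        _ => return None,
    };
    Some(sum)
}

// Convenience wrapper: checksum a whole buffer with the named algorithm.
fn checksum(name: &str, data: &[u8]) -> Option<String> {
    let mut sum = by_name(name)?;
    sum.update(data);
    Some(sum.result())
}

fn main() {
    for name in ["adler32", "bsd"] {
        if let Some(value) = checksum(name, b"Wikipedia") {
            println!("{name}: {value}");
        }
    }
}
```

A trait object is used here because the algorithm is chosen at runtime; per-utility binaries such as b2sum could use the same trait with a fixed type instead.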

tertsdiepraam avatar Jul 30 '23 12:07 tertsdiepraam

@RenjiSann Is there any progress on deleting hashsum? GNU coreutils upstream is worried about interface divergence in #8984.

oech3 avatar Oct 23 '25 16:10 oech3

To be honest, I don't know the "historical" reason behind us having so many interfaces for the same thing and why we have separate hashsum and cksum binaries. Maybe @sylvestre can bring some expertise about that.

Anyways, the concern is very much real, especially since we're now the default on Ubuntu.

Personally, I'm all for ditching duplicate interfaces, but I'm not sure I have enough context to make a sound decision.

RenjiSann avatar Oct 23 '25 17:10 RenjiSann

Maybe you can start by actively marking it as deprecated to at least discourage people from using it? I don't think there was any technical reason for keeping it. I remember trying to base all the *sum utilities on cksum once, but I failed because it turned out to be more work than I thought. Maybe that's better now.

tertsdiepraam avatar Oct 23 '25 20:10 tertsdiepraam

I don't remember, sorry, but I agree we should mitigate the difference here before it becomes a mess!

sylvestre avatar Oct 23 '25 21:10 sylvestre

Or libhashsum? If the size of the individual binaries is still important and individual md5sum, b2sum,... are not acceptable.

oech3 avatar Oct 25 '25 17:10 oech3