duc Add an option to graph/json/list with base 1000

In the source code, base 1024 is used. In some situations it would be useful to use base 1000 instead. Could a command-line flag be added for this? I'm not very familiar with C. Is it easy to do?

Feb 07 '23 11:02 odoucet

Olivier Doucet writes:

> In the source code, base 1024 is used. In some situations it would be useful to use base 1000 instead. Could a command-line flag be added for this? I'm not very familiar with C. Is it easy to do?

It's possible, but not very likely, since all disk space calculations in Unix/Linux are based on 1024-byte blocks. In what situation do you think this would be useful?

John

Feb 07 '23 14:02 l8gravely

As you know, hard drive manufacturers sell their products with capacities calculated in base 1000. Being able to compare disk usage in the same base can, in very specific cases, be useful.

Also, since all calculations are currently done in base 1024, should we show "82.1Gi" rather than "82.1G"? (cf. IEC 60027-2 A.2 and ISO/IEC 80000-13:2008) https://en.wikipedia.org/wiki/Binary_prefix#Adoption_by_IEC,_NIST_and_ISO

Feb 08 '23 11:02 odoucet

Olivier Doucet writes:

> As you know, hard drive manufacturers sell their products with capacities calculated in base 1000. Being able to compare disk usage in the same base can, in very specific cases, be useful.

Well, as far as I know, Linux filesystems still report disk usage in units based on 1024.

> Also, since all calculations are currently done in base 1024, should we show "82.1Gi" rather than "82.1G"? (cf. IEC 60027-2 A.2 and ISO/IEC 80000-13:2008) https://en.wikipedia.org/wiki/Binary_prefix#Adoption_by_IEC,_NIST_and_ISO

Feel free to work up a patch and submit it, but I don't personally see a major need at this time.

As a hint, you'd probably want to replace all uses of '1024' in the code with a variable, then add a switch that changes it to '1000' so the numbers are reported that way. The same switch would also select between the 'Gi' and 'G' suffixes, I guess.
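A minimal sketch of that hint, not taken from the duc source: the names `size_base` and `format_size` are hypothetical, and the real code may be structured quite differently. It only shows one variable driving both the divisor and the choice of 'Gi' versus 'G' style suffixes.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical global, set once from the command line: 1024 (default) or 1000. */
static unsigned size_base = 1024;

/* Format `bytes` into `buf`, e.g. "1.5Ki" for base 1024 or "1.5k" for base 1000. */
static void format_size(uint64_t bytes, char *buf, size_t len)
{
    /* SI uses a lowercase 'k'; the binary prefixes are Ki, Mi, Gi, ... */
    const char *units = (size_base == 1024) ? "KMGTPE" : "kMGTPE";
    const char *suffix = (size_base == 1024) ? "i" : "";
    double v = (double)bytes;
    int i = -1; /* -1 means plain bytes, no prefix */

    while (v >= size_base && i < 5) {
        v /= size_base;
        i++;
    }
    if (i < 0)
        snprintf(buf, len, "%llu", (unsigned long long)bytes);
    else
        snprintf(buf, len, "%.1f%c%s", v, units[i], suffix);
}
```

A global keeps the patch small, but it would have to be set before any size is printed; passing the base as a parameter avoids that ordering problem at the cost of touching more call sites.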

But I still don't see a need.

John

Feb 08 '23 14:02 l8gravely

I think this one goes well with https://github.com/zevv/duc/issues/307

Basically, they are both about the representation of numbers. The humanize() function needs some other enum or flag mask describing the required representation format, which can then be passed from the cmdline parsing into the various pieces of code that print sizes and amounts.

I started on #307 the other day, but kind of got stuck on getting the flags from the cmdline in a nice way, as there are too many options.

Basically, what we want to choose are these three things:

1. fully expand the number to bytes, or "abbreviate"
2. when abbreviating, use base 1000 or base 1024
3. when printing the numbers, insert thousand separators or not

That gives a total of 8 different ways of formatting (or 5, if we would not support thousand separators when abbreviating).
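The three choices above map naturally onto a flag mask. The following is only a sketch under assumptions: `enum size_fmt`, `group_digits` and `humanize_size` are made-up names, not duc's actual humanize() signature, and thousand separators are applied only to full byte counts (the reduced variant mentioned above).

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flag mask; the names are illustrative only. */
enum size_fmt {
    SIZE_FMT_BYTES     = 1 << 0, /* print the full byte count, no abbreviation */
    SIZE_FMT_BASE_1000 = 1 << 1, /* abbreviate with powers of 1000, not 1024 */
    SIZE_FMT_THOUSANDS = 1 << 2, /* group the digits of a full byte count */
};

/* Write `n` in decimal with ',' between groups of three digits. */
static void group_digits(uint64_t n, char *buf, size_t len)
{
    char tmp[32];
    int digits = snprintf(tmp, sizeof tmp, "%llu", (unsigned long long)n);
    size_t out = 0;

    /* Keep room for a separator, a digit and the terminating NUL. */
    for (int i = 0; i < digits && out + 2 < len; i++) {
        if (i > 0 && (digits - i) % 3 == 0)
            buf[out++] = ',';
        buf[out++] = tmp[i];
    }
    buf[out] = '\0';
}

/* Format `bytes` according to `flags` (a mask of enum size_fmt values). */
static void humanize_size(uint64_t bytes, unsigned flags, char *buf, size_t len)
{
    if (flags & SIZE_FMT_BYTES) {
        if (flags & SIZE_FMT_THOUSANDS)
            group_digits(bytes, buf, len);
        else
            snprintf(buf, len, "%llu", (unsigned long long)bytes);
        return;
    }

    int si = (flags & SIZE_FMT_BASE_1000) != 0;
    double base = si ? 1000.0 : 1024.0;
    const char *units = si ? "kMGTPE" : "KMGTPE";
    double v = (double)bytes;
    int i = -1; /* -1 means the value is below one unit */

    while (v >= base && i < 5) {
        v /= base;
        i++;
    }
    if (i < 0)
        snprintf(buf, len, "%llu", (unsigned long long)bytes);
    else
        snprintf(buf, len, "%.1f%c%s", v, units[i], si ? "" : "i");
}
```

A single `unsigned flags` argument means each subcommand only has to carry one extra value from its option parser to the printing code, whichever command-line syntax ends up setting the bits.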

Now we only have the '-b / --bytes' flag, which covers choice #1. To support #2 and #3, we would have to make up two extra flags and add these to all the duc subcommand option parsers.

That kind of sucks.

So maybe we can come up with a different way and obsolete the -b option: we add a new flag --format (longopt only, probably) which takes an argument describing the format. That of course raises the question of how to describe the format in a friendly way :/

So maybe we need to add multiple options to all the subcommands after all. We could use the same arguments as ls, and pass --si to switch to 1000 instead of 1024.
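For illustration, an ls-style long-only --si switch next to -b / --bytes might be parsed like this. This sketch uses getopt_long() from <getopt.h> (a GNU/BSD extension) purely as a stand-in; it is not duc's own option parser, and `struct size_opts` and `parse_size_opts` are hypothetical names.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the two size-related choices. */
struct size_opts {
    bool bytes; /* -b / --bytes: print exact byte counts */
    bool si;    /* --si: abbreviate with powers of 1000 */
};

/* A value above 255 cannot collide with any short option character. */
enum { OPT_SI = 256 };

/* Returns 0 on success, -1 on an unknown option. */
static int parse_size_opts(int argc, char **argv, struct size_opts *o)
{
    static const struct option longopts[] = {
        { "bytes", no_argument, NULL, 'b' },
        { "si",    no_argument, NULL, OPT_SI }, /* long option only */
        { NULL, 0, NULL, 0 },
    };
    int c;

    o->bytes = false;
    o->si = false;
    while ((c = getopt_long(argc, argv, "b", longopts, NULL)) != -1) {
        switch (c) {
        case 'b':
            o->bytes = true;
            break;
        case OPT_SI:
            o->si = true;
            break;
        default:
            return -1;
        }
    }
    return 0;
}
```

The resulting struct (or an equivalent flag mask) would then be handed to whatever code prints sizes, so the subcommands share one parsing helper instead of each defining the flags separately.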

I dunno, sorry :(

Feb 08 '23 14:02 zevv