hub-docs
Display size of the generated dataset, downloaded dataset files, total amount of disk used in GB when MB >= 1000 in dataset cards
Is your feature request related to a problem? Please describe. Currently, the following data fields on the hub are only displayed in MB.
- Size of the generated dataset:
- Size of downloaded dataset files:
- Total amount of disk used:
Figures like 1895.01 MB and 1611.50 MB become unwieldy as they grow, making it harder to reason about the space they require, compared to 1.89501 GB or 1.6115 GB.
Describe the solution you'd like Convert numbers to GB in dataset cards when MB >= 1000.
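To make the request concrete, here is a minimal sketch of the conversion I have in mind. It is not taken from the existing codebase: `format_size` and `threshold_mb` are hypothetical names, and it assumes the decimal convention (1 GB = 1000 MB) implied by the figures above.

```python
def format_size(size_mb: float, threshold_mb: float = 1000.0) -> str:
    """Hypothetical helper: render a size given in MB, switching to GB at the threshold."""
    if size_mb >= threshold_mb:
        # Assumption: decimal units, so 1 GB = 1000 MB.
        # The "g" format keeps up to 6 significant digits and drops trailing zeros.
        return f"{size_mb / 1000:g} GB"
    # Below the threshold, keep the current two-decimal MB display.
    return f"{size_mb:.2f} MB"


print(format_size(1895.01))  # 1.89501 GB
print(format_size(1611.50))  # 1.6115 GB
print(format_size(999.99))   # 999.99 MB
```

The threshold is a parameter so that the exact cut-off can be settled separately from the formatting itself.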
Describe alternatives you've considered I considered advocating for truncating sizes to 3 decimal places, as precision to 5 or 6 decimal places in GB (such as a number like 1.543210 GB) may be an unnecessary degree of precision to provide for users.
Ultimately though, I reasoned that more precision is often better.
Additional context I'm happy to contribute to this. I didn't see exactly where this was handled in the current codebase, so any pointers appreciated. Also, let me know if I should be opening this up in the datasets repo instead...
@osanseviero do you think this is one for the https://github.com/huggingface/hub-docs or the forum?
For now, I would just push to have all issues related to the Hub in hub-docs instead of closing existing ones.