
[feature request] Data sizes

Open ygrange opened this issue 6 years ago • 6 comments

Ah, finally now that the paper is accepted (for which I congratulate you) I feel like I can ethically open this issue.

I'm not sure how appropriate this proposal is so I'd rather start an issue before hacking it in and doing a pull request that may be refused anyhow (e.g. because of scope creep).

I would like to add units of information (which seems to be the base quantity behind units such as byte, bit, GiB, Mbit, etc.).

It would be neat if I could use unyt to answer questions like "how much time does it take to transfer 5TiB of data using a 10Gbit network link?", without having to struggle with factors of 8 and weird prefixes.
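For reference, the arithmetic behind that example question can be sketched in plain Python, without unyt. This is only an illustration: the function name is made up here, and it assumes an ideal link with no protocol overhead, "TiB" meaning 2^40 bytes, and "Gbit" meaning 10^9 bits.

```python
# Plain-Python sketch of the transfer-time question; unyt is not involved.
BITS_PER_BYTE = 8
TEBI = 2**40   # IEC binary prefix "Ti"
GIGA = 10**9   # SI decimal prefix "G"


def transfer_time_seconds(size_bytes, rate_bits_per_second):
    """Time to move size_bytes over an ideal link (hypothetical helper)."""
    return size_bytes * BITS_PER_BYTE / rate_bits_per_second


# 5 TiB over a 10 Gbit/s link
seconds = transfer_time_seconds(5 * TEBI, 10 * GIGA)
print(f"{seconds:.1f} s (~{seconds / 60:.1f} min)")  # 4398.0 s (~73.3 min)
```

The factor of 8 and the mix of a binary prefix (Ti) with a decimal one (G) are exactly the bookkeeping that unit support would take over.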

As far as I can see before doing any work, that would require adding a few prefixes (Ki to Yi), two base units (bit, byte) and two dimensions (information, bandwidth).
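To make the list of additions concrete, here is a rough sketch of the binary prefixes and the two units as plain Python tables. The names `BINARY_PREFIXES`, `BITS_PER_UNIT` and `to_bits` are invented for this illustration and say nothing about how unyt would actually store them.

```python
# IEC binary prefixes: Ki = 2**10, Mi = 2**20, ..., Yi = 2**80
BINARY_PREFIXES = {
    symbol: 2 ** (10 * (i + 1))
    for i, symbol in enumerate(["Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"])
}
BINARY_PREFIXES[""] = 1  # no prefix

# The two proposed units, expressed in bits
BITS_PER_UNIT = {"bit": 1, "byte": 8}


def to_bits(value, prefix, unit):
    """Convert value given in prefix+unit to bits (hypothetical helper)."""
    return value * BINARY_PREFIXES[prefix] * BITS_PER_UNIT[unit]


print(to_bits(1, "Ki", "byte"))  # 8192
```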

What is your opinion on this? Would it be worth the effort?

ygrange Aug 13 '18 21:08

I think this would be fine to add. We have prior art of this in that angle is a base dimension even though it's technically not an SI dimension. For the existing unit systems it would probably be best to just choose a sensible default for the dimensions you're adding and use that default for all the unit systems.

ngoldbaum Aug 13 '18 21:08

Ok cool. I have already been playing in a fork. Will probably take some time due to holidays :)

ygrange Aug 14 '18 12:08

It also occurs to me that you shouldn't be able to create a Mig (i.e. a mebigram), so you'd need to add handling to the code that sets up SI prefixes to do special things for units with dimensions of data.

ngoldbaum Aug 15 '18 15:08

I'll have a look at it in the coming period. So far I have taken the crude approach (if the user wants a Mig, they get a Mig). I am a bit torn between two options in this case: throwing a warning (I am not sure if you have any warning system implemented in the first place anyhow) or raising an exception. Do you have an opinion on this?

ygrange Aug 27 '18 07:08

It should throw the same error as the error you get if you ask for any other unit name that doesn't exist:

In [2]: unyt.Unit('foobar')
---------------------------------------------------------------------------
UnitParseError                            Traceback (most recent call last)
<ipython-input-2-ac9713246324> in <module>()
----> 1 unyt.Unit('foobar')

~/Documents/unyt/unyt/unit_object.py in __new__(cls, unit_expr, base_value, base_offset, dimensions, registry, latex_repr)
    308         else:
    309             # lookup the unit symbols
--> 310             unit_data = _get_unit_data_from_expr(unit_expr, registry.lut)
    311             base_value = unit_data[0]
    312             dimensions = unit_data[1]

~/Documents/unyt/unyt/unit_object.py in _get_unit_data_from_expr(unit_expr, unit_symbol_lut)
    869
    870     if isinstance(unit_expr, Symbol):
--> 871         return _lookup_unit_symbol(str(unit_expr), unit_symbol_lut)
    872
    873     if isinstance(unit_expr, Pow):

~/Documents/unyt/unyt/unit_registry.py in _lookup_unit_symbol(symbol_str, unit_symbol_lut)
    316     # no dice
    317     raise UnitParseError("Could not find unit symbol '%s' in the provided "
--> 318                          "symbols." % symbol_str)
    319
    320

UnitParseError: Could not find unit symbol 'foobar' in the provided symbols.

ngoldbaum Aug 27 '18 14:08

(maybe not the same error text, just raising UnitParseError is sufficient to keep things sane :) )
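A toy sketch of that behaviour, for illustration only: binary prefixes are accepted on units with dimensions of information and rejected everywhere else with the same kind of error as an unknown symbol. Everything here is an assumption made for the example. The lookup tables and the `lookup` function are hypothetical, and the `UnitParseError` class is a local stand-in for unyt's exception of the same name, not unyt's own code.

```python
class UnitParseError(Exception):
    """Local stand-in for unyt's exception of the same name."""


# Hypothetical toy tables: unit symbol -> dimension name
UNITS = {"g": "mass", "m": "length", "bit": "information", "byte": "information"}
SI_PREFIXES = {"k": 1e3, "M": 1e6, "G": 1e9}
BINARY_PREFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def lookup(symbol):
    """Return (multiplier, dimension) for symbol, or raise UnitParseError."""
    if symbol in UNITS:
        return 1, UNITS[symbol]
    # Binary prefixes are accepted only on units of information.
    for prefix, factor in BINARY_PREFIXES.items():
        unit = symbol[len(prefix):]
        if symbol.startswith(prefix) and UNITS.get(unit) == "information":
            return factor, "information"
    # SI prefixes are accepted on any known unit.
    for prefix, factor in SI_PREFIXES.items():
        unit = symbol[len(prefix):]
        if symbol.startswith(prefix) and unit in UNITS:
            return factor, UNITS[unit]
    raise UnitParseError(
        "Could not find unit symbol '%s' in the provided symbols." % symbol
    )


print(lookup("Mibyte"))  # (1048576, 'information')
try:
    lookup("Mig")
except UnitParseError as err:
    print(err)  # Could not find unit symbol 'Mig' in the provided symbols.
```

In this sketch "Mig" falls through both prefix loops and is treated like any other unknown symbol, so no separate warning mechanism is needed.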

ngoldbaum Aug 27 '18 16:08