pint-xarray expects decorator

WIP attempt to add an @expects decorator to the API, which emulates pint.UnitRegistry().wraps.

[x] Closes #141
[x] Tests added
[x] Passes pre-commit run --all-files
[x] User visible changes (including notable bug fixes) are documented in whats-new.rst
[x] New functions/methods are listed in api.rst

Oct 20 '21 23:10 TomNicholas

If I understand wraps correctly, it makes sure that the input data is in the specified units, strips them and attaches the units specified as return value to the results. Should we copy that design for expects?

Edit: It would also make sense to have it stay the way it currently is, though

I actually did not realise that. I just kind of started implementing it the way I felt it should work. That is not that different though - perhaps we could have both behaviours via an extra kwarg pass_magnitude or something?

Oct 23 '21 17:10 TomNicholas

I actually did not realise that. I just kind of started implementing it the way I felt it should work. That is not that different though - perhaps we could have both behaviours via an extra kwarg pass_magnitude or something?

the disadvantage of a new kwarg is that it could shadow a variable (if we choose the name carefully enough that might not be as much of an issue, though). What do you think about adding a new decorator that does both? It could share all internals and additionally dequantifies before and quantifies after the call to the user function.

Oct 23 '21 18:10 keewis

the disadvantage of a new kwarg is that it could shadow a variable (if we choose the name carefully enough that might not be as much of an issue, though)

Any xarray function that only accepts kwargs but also has its own kwargs has this problem - we could copy xarray by allowing users to pass a dict of kwarg_units optionally in the same way that .sel does?

What do you think about adding a new decorator that does both?

Or we could just do that. It just needs a good name. I still think @wraps is a poor name, how about @expects_magnitude? (Because the wrapped function "expects" to be given a magnitude)

Oct 23 '21 18:10 TomNicholas

What do you think about adding a new decorator that does both? It could share all internals

@keewis I don't know if I'm being dense but I actually can't work out how to make two differently-behaving decorators that share most of their internals :sweat_smile: I even made a SO question about it, but no-one has answered it yet. I could just copy-paste the internals code for now...

Oct 28 '21 19:10 TomNicholas

if you look at the structure of a decorator without arguments, it's usually something like:

def decorator(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # preprocess args / kwargs
        result = func(*args, **kwargs)
        # postprocess the result
        return result
    return wrapper

if you put the pre/postprocessing into separate functions, the new decorator can call those and do some additional preprocessing to drop the units (and add them back in the postprocessing step)

Oct 28 '21 19:10 keewis

if you put the pre/postprocessing into separate functions, the new decorator can call those

Yes, but I think you still need to duplicate all the triple wrapping code, like this

def decorator1(func):
    flag = True
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        preprocess(args, kwargs, flag)
        result = func(*args, **kwargs)
        postprocess(result, flag)
        return result
    return wrapper


def decorator2(func):
    flag = False
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        preprocess(args, kwargs, flag)
        result = func(*args, **kwargs)
        postprocess(result, flag)
        return result
    return wrapper

because if you instead try to move the def wrapper out separately, it no longer has access to func...

Oct 28 '21 19:10 TomNicholas

wouldn't it be possible to do this instead:

def preprocess1(*args, **kwargs):
    ...
    return args, kwargs

def preprocess2(*args, **kwargs):
    ...
    return args, kwargs

def decorator1(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        args, kwargs = preprocess1(*args, **kwargs)
        result = func(*args, **kwargs)
        result = postprocess(result)
        return result
    return wrapper

def decorator2(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        args, kwargs = preprocess1(*args, **kwargs)
        args, kwargs = preprocess2(*args, **kwargs)
        result = func(*args, **kwargs)
        result = postprocess(result)
        return result
    return wrapper

that's probably not as elegant as using a flag, but maybe it's simpler?

Oct 28 '21 19:10 keewis

wouldn't it be possible to do this instead:

Yeah, but that doesn't really solve my question - it still duplicates def wrapper (and for a decorator with arguments there is another layer of wrapper that would get duplicated too).

It might be possible to solve this by making a callable class that binds all the arguments as attributes, but it's probably not worth the effort.

Oct 28 '21 20:10 TomNicholas

fair point, I didn't consider the outer layer yet. I'll try to come up with something that also deduplicates that (might take some time, though), but in the meantime it might be fine to only share pre/postprocessing.

Oct 28 '21 20:10 keewis

If I understand wraps correctly, it makes sure that the input data is in the specified units, strips them and attaches the units specified as return value to the results. Should we copy that design for expects?

Thinking about this again I think you might be right. To be useful, an @expects decorator needs to deliver a magnitude to the internal function, because what is the use case of supplying a quantified pint array with a specific unit?

In other words: when would I ever have a function that (a) specifically wants pint Quantities instead of bare numpy(/xarray) objects but also (b) will only accept those quantities with a specific unit?

Nov 02 '21 18:11 TomNicholas

I agree, it looks like just verifying the units' dimensions is not that useful (and this can also be achieved in the function using either is_compatible_with or just using the data and having the errors bubble up).

Nov 27 '21 13:11 keewis

@keewis I think this is basically ready, though I don't know why the doctests CI fails

Nov 30 '21 23:11 TomNicholas

@keewis doctest is failing, but seems to be due to LaTeX formatting in dequantify, not because of this PR:

________ [doctest] pint_xarray.accessors.PintDatasetAccessor.dequantify ________
1117         array([2, 3, 4])
1118         Dimensions without coordinates: y
1119         Attributes:
1120             units:    s
1121 
1122         Use the registry's default format
1123 
1124         >>> pint_xarray.unit_registry.default_format = "~L"
1125         >>> d = q.pint.dequantify()
1126         >>> d.a
Differences (unified diff with -expected +actual):
    @@ -3,3 +3,3 @@
     Dimensions without coordinates: x
     Attributes:
    -    units:    \frac{\mathrm{m}}{\mathrm{s}}
    +    units:    meter / second

/home/runner/work/pint-xarray/pint-xarray/pint_xarray/accessors.py:1126: DocTestFailure

Jan 14 '22 21:01 TomNicholas

yep, that's unrelated to this PR, and should be fixed by hgrecco/pint#1403

Jan 14 '22 21:01 keewis

@keewis I think this might be ready to merge now finally??

Jan 20 '22 17:01 TomNicholas

good news is that the next release of pint shouldn't be too far off (see hgrecco/pint#1450), so that might help with the blog post.

Jan 31 '22 23:01 keewis

I've implemented my suggestion to use Signature.bind_partial / Signature.bind to normalize the parameters, so I think this leaves us with just a few open issues:

we ignore / disallow units in *args and **kwargs for now because checking a variable number of arguments seems way too complicated (except if all should have the same units, but that exception might not be worth considering).
I couldn't find a nice error message for the case where some parameters of the decorated function don't have units
specifying None as the unit will cause any units on objects passed to that parameter to be ignored. Do we want that, or should we raise an error for quantities passed to those parameters? Edit: what if we introduce checking for None and ignore for ... / Ellipsis?
The documentation is pretty minimal at the moment. Not sure whether a more elaborate user guide (I think we can use the apply_ufunc guide as a reference for that) should be added in this PR, or whether we should merge and try how it does in practice, and only then write that guide?
the units of the return value will currently always use the application registry. We should try to use the current registry instead, most likely by normalizing the input and output units before the conversion (by converting all units to either None or Unit objects)

Sep 21 '22 12:09 keewis

pint-xarray pint-xarray copied to clipboard

expects decorator

pint-xarray
pint-xarray copied to clipboard