xskillscore icon indicating copy to clipboard operation
xskillscore copied to clipboard

Generic checking of function inputs

Open dougiesquire opened this issue 3 years ago • 13 comments

Some xskillscore methods require that inputs have specific properties. For example, xr_brier_score and discrimination (and other methods yet to be implemented) require that

  • observations is boolean; and that
  • 0<=forecasts<=1.

Questions:

  1. do we want to check input properties, or is it sufficient just to be clear in the documentation?
  2. if yes, where should the functions that perform these checks live?
  3. I think the ability to operate lazily is more important than checking inputs, so checks that require data in memory (e.g. 0<=forecasts<=1) should not be run on dask-backended objects. Does this sound reasonable?

dougiesquire avatar Aug 19 '20 04:08 dougiesquire

Can observations also be 0 or 1 instead of boolean? I think this also works.

aaronspring avatar Aug 19 '20 07:08 aaronspring

  1. If 0 or 1 also works type checking doesn't bring us much. Does it? Clear documentation will do
  2. Utils or new file checks
  3. Yes

aaronspring avatar Aug 19 '20 07:08 aaronspring

Good point, 0 or 1 also works, so checking would only be run with np-backended arrays. Maybe not worth it then.

dougiesquire avatar Aug 19 '20 07:08 dougiesquire

i tried this here: https://github.com/csiro-dcfp/doppyo/blob/98acad9fe3cf658d5b000401ba898ef343abb768/doppyo/skill.py#L256 but didnt account for if not dask.is_dask_collection()

aaronspring avatar Aug 19 '20 07:08 aaronspring

I think this is a good thing to do, but I'm wondering how to do this cleanly. What comes to mind would be us creating a custom decorator that can ingest an arbitrary number of tuples like (a, ['str']) which send through the user-inputted variable and the list of appropriate variable types.

Then the decorator could loop through these and check these. Then we don't have to make a separate function for each type or have isinstance everywhere.

bradyrx avatar Aug 27 '20 17:08 bradyrx

Potentially helpful:

https://stackoverflow.com/questions/15299878/how-to-use-python-decorators-to-check-function-arguments

https://www.pythonforthelab.com/blog/how-to-use-decorators-to-validate-input/

bradyrx avatar Aug 27 '20 17:08 bradyrx

Good point, 0 or 1 also works, so checking would only be run with np-backended arrays. Maybe not worth it then.

@bradyrx How would the checks look like?

Maybe you write them first with all the isinstance calls. Then later we can refactor

aaronspring avatar Aug 27 '20 17:08 aaronspring

I think you'd just loop through the tuples and do some tuple unpacking such that a tuple would be (x, ['float', 'int']) and the check would be:

for t in tuple_list:
    variable, types = t
    if not isinstance(variable, types):
        raise ValueError(f"{variable} not one of {types}.")

Or some variant on that.

The decorator would then be

@check_types((a, 'str'), (b, ['float', 'int']))
def func(a, b):

bradyrx avatar Aug 27 '20 18:08 bradyrx

properscoring has checks https://github.com/TheClimateCorporation/properscoring/blob/1ca13dcbc1abf53d07474b74fbe3567fd4045668/properscoring/_brier.py#L41

and in xs we do a dask test https://github.com/raybellwaves/xskillscore/blob/de4b53843fbab3d841388d4fce866b81a6748476/xskillscore/tests/test_probabilistic.py#L250

So these checks trigger computation when via xr applyufunc at least @dougiesquire

aaronspring avatar Sep 04 '20 22:09 aaronspring

FWIW dask-ml has typing and uses ArrayLike = TypeVar("ArrayLike", Array, np.ndarray). Array is from dask.array import Array

https://github.com/dask/dask-ml/blob/master/dask_ml/_typing.py

But not sure if it's for type hinting or checks.

raybellwaves avatar Sep 05 '20 01:09 raybellwaves

So these checks trigger computation when via xr applyufunc at least @dougiesquire

That's not good - isn't this something we'd want to avoid?

dougiesquire avatar Sep 12 '20 03:09 dougiesquire

Sorry. I meant to say these tests don't trigger computation in these tests.

aaronspring avatar Sep 12 '20 06:09 aaronspring

But not sure if it's for type hinting or checks.

~~Exactly what typing ensures #317 ~~

EDIT: wrong at runtime no check

aaronspring avatar Dec 07 '21 17:12 aaronspring