xarray icon indicating copy to clipboard operation
xarray copied to clipboard

Add typing to plot methods

Open headtr1ck opened this issue 1 year ago • 31 comments

  • [x] Closes #6949
  • [x] Tests added (typing)
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

headtr1ck avatar Sep 18 '22 17:09 headtr1ck

Puh, this stacking of decorators is quite a brainf*ck...

headtr1ck avatar Sep 18 '22 20:09 headtr1ck

I could add some return type annotations, but I doubt that mypy can work with the signature hacks and will always use *args, **kwargs.

Any thoughts?

headtr1ck avatar Sep 20 '22 06:09 headtr1ck

The 2D plotfunctions have the correct annotions at runtime (plotfunc.__annotations__) but mypy thinks they are *args, **kwargs. Same with the accessor methods.

Does anyone have an idea how to tell mypy that the type is actually the function signature of the "inside" newplotmethod function? Do we need to define a dummy method with the same signature or something?

headtr1ck avatar Sep 20 '22 21:09 headtr1ck

Congrats for doing all these!! This must be (almost?) the last of untyped modules?

max-sixty avatar Sep 20 '22 22:09 max-sixty

Finally it is a much larger PR than initially expected. Turns out that the col and row arguments have to be overloaded since they return a FacetGrid instead of a primitive.

I am now stuck in trying to resolve the broken test. Does anyone know how to use @overload decorators on a function that is wrapped with functools.wraps(...)? This does not seem to work and the signature is messed up?

headtr1ck avatar Sep 22 '22 22:09 headtr1ck

Wow, to fix the overloads I had to copy-paste the complete function signature for each overload and again for the accessor methods. This leads to a massive overhead which I assume the initial author wanted to prevent by using the decorators in the first place...

Are there any objections to leaving it like this? We could also deviate from the approach of adding typing inline and use stub files (.pyi like in typed_ops).

headtr1ck avatar Sep 23 '22 21:09 headtr1ck

Ok found another interesting problem: the e.g. functools.wraps(imshow) in the PlotAccessor works because in the moment it is called, there is no class method named "imshow", so it will use the module method, which is exactly what we want. However, as soon as one adds overloads, there indeed is a "imshow" class method (just the overload wrapper) and it tries to wrap this instead, which goes wrong (so the overload shadows the module method that we want)...

Does anyone know how we can force it to use the module imshow instead of the overload wrapper? The only thing I can think of is to move the Accessor to its own module, which might actually not be a bad idea?

headtr1ck avatar Sep 24 '22 10:09 headtr1ck

Maybe there's something to learn from how pandas does it: https://github.com/pandas-dev/pandas/tree/main/pandas/plotting ?

I'm a little skeptical if all the arguments in scatter are necessary and maybe they can be hidden in kwargs? Should make the overloads a little shorter.

I've been working on moving all plots to the DataArray side starting with scatter in #6778. It should also remove the list[PathCollection] overloads.

Illviljan avatar Sep 24 '22 16:09 Illviljan

Maybe there's something to learn from how pandas does it: https://github.com/pandas-dev/pandas/tree/main/pandas/plotting ?

Just had a quick look and I think their typing of the accessor is wrong, haha. They claim to return a PlotAccessor instance when calling e.g. df.plot.line(). But I did not test it. Edit: turns out that it works, but I have no idea how or why, haha Edit2: only pyright can resolve the type of these function, mypy says Any...

I'm a little skeptical if all the arguments in scatter are necessary and maybe they can be hidden in kwargs? Should make the overloads a little shorter.

I've been working on moving all plots to the DataArray side starting with scatter in #6778. It should also remove the list[PathCollection] overloads.

I saw that, that's why I left the DataArray scatter untouched for now.

headtr1ck avatar Sep 24 '22 17:09 headtr1ck

For now I see two options:

  1. Move the accessor to its own module. For the user nothing should change since nobody should use the accessor directly anyway.
  2. Create a plot.pyi typing file (should hopefully also resolve the wrapping issue since the overload only lives there)

In principle also a combination of the two is possible :)

headtr1ck avatar Sep 24 '22 17:09 headtr1ck

I'd opt for (1) & a comment on why it needs a separate module.

mathause avatar Sep 25 '22 11:09 mathause

I have tried to solve it with ParamSpec and writing a custom wraps decorator, but it did not work, presumably due to https://github.com/python/mypy/issues/13540

headtr1ck avatar Sep 25 '22 14:09 headtr1ck

Finally I am happy with the changes, the signatures are correct at runtime and static type checking time. I was finally:

  1. renaming the xarray.plot.plot module to xarray.plot.dataarray_plot since it was shadowing the plot method (Is added to breaking changes in whats-new, hopefully not many people are importing from this module directly)
  2. creating a new xarray.plot.accessor module (see https://github.com/pydata/xarray/pull/7052#issuecomment-1257029973 option-1)

@Illviljan I'm sorry if you have to change your plotting PRs if this gets merged first. If you prefer, we can merge your PRs first and I will fix the conflicts here :)

headtr1ck avatar Sep 25 '22 15:09 headtr1ck

I've merged the easy ones now. I'm not as confident on #6778 and could use another review on it.

Illviljan avatar Sep 25 '22 16:09 Illviljan

One point aside — when this is done, we may be able to turn on mypy's strict mode, at least for the tests path, so then tests always need annotations, and our annotations get tested.

(for context, a year ago or so we found some of them were wrong, but weren't being tested, because -> None wasn't in the tests)

max-sixty avatar Sep 25 '22 20:09 max-sixty

One point aside — when this is done, we may be able to turn on mypy's strict mode, at least for the tests path, so then tests always need annotations, and our annotations get tested.

(for context, a year ago or so we found some of them were wrong, but weren't being tested, because -> None wasn't in the tests)

I think there are still many untyped areas, especially internal stuff. But we can give it a try and see where it collapses :)

headtr1ck avatar Sep 25 '22 20:09 headtr1ck

wait.... did mypy 0.981 change the behavior of overloads with None and Hashable? I now get errors for

from typing import Hashable

@overload
def x(a: Hashable) -> Hashable: ...

@overload
def x(a: None = None) -> None: ...

def x(a: Hashable | None = None) -> Hashable | None:
    return a

but this works with 0.971. I cannot find anything on the changelog... Interesting that does not break other things in xarray since None is always treated differently! (In theory mypy is right, since None is actually hashable...)

Edit: see https://github.com/python/mypy/issues/13805, adding some type: ignore for this is the "solution"

headtr1ck avatar Oct 03 '22 22:10 headtr1ck

So finally I have decided to properly deprecate positional arguments since I think many users are still using them (even many tests where using them...)

headtr1ck avatar Oct 04 '22 10:10 headtr1ck

Excellent! Should we hit the green button?

max-sixty avatar Oct 04 '22 18:10 max-sixty

Excellent! Should we hit the green button?

Don't know if @Illviljan wants to merge his PR first or prefers to solve the merge conflicts himself :P

headtr1ck avatar Oct 04 '22 19:10 headtr1ck

I've merged the PR. :) Have fun with the conflicts!

Illviljan avatar Oct 07 '22 16:10 Illviljan

That was quite a merge, haha.... @Illviljan I had to undo some of your work, it just did not work well with static typing. Hopefully you are not too sad about it. :)

Is there a way to see the generated docs to check if my "hacks" worked?

headtr1ck avatar Oct 12 '22 21:10 headtr1ck

Next time I should split such a PR into smaller ones, haha. Anyone ready for a final review?

headtr1ck avatar Oct 13 '22 14:10 headtr1ck

This crashes now. Maybe you'll find the bug faster than me.

import xarray as xr

ds = xr.tutorial.scatter_example_dataset(seed=42)
fg = ds.plot.scatter("A", "B", z="z", hue="y", row="x", col="w")

Traceback (most recent call last):

  File "C:\Users\J.W\anaconda3\envs\xarray-tests\lib\site-packages\spyder_kernels\py3compat.py", line 356, in compat_exec
    exec(code, globals, locals)

  File "g:\program\dropbox\python\xarray_line_plot.py", line 122, in <module>
    fg = ds.plot.scatter("A", "B", z="z", hue="y", row="x", col="w")

  File "C:\Users\J.W\Documents\GitHub\xarray\xarray\plot\accessor.py", line 975, in scatter
    return dataset_plot.scatter(self._ds, *args, **kwargs)

  File "C:\Users\J.W\Documents\GitHub\xarray\xarray\plot\dataset_plot.py", line 234, in newplotfunc
    return _easy_facetgrid(kind="dataset", **allargs, **kwargs)

  File "C:\Users\J.W\Documents\GitHub\xarray\xarray\plot\facetgrid.py", line 770, in _easy_facetgrid
    return g.map_dataset(plotfunc, x, y, **kwargs)

  File "C:\Users\J.W\Documents\GitHub\xarray\xarray\plot\facetgrid.py", line 370, in map_dataset
    maybe_mappable = func(

  File "C:\Users\J.W\Documents\GitHub\xarray\xarray\plot\dataset_plot.py", line 262, in newplotfunc
    primitive = plotfunc(

  File "C:\Users\J.W\Documents\GitHub\xarray\xarray\plot\dataset_plot.py", line 541, in scatter
    primitive = ax.scatter(

  File "C:\Users\J.W\anaconda3\envs\xarray-tests\lib\site-packages\matplotlib\__init__.py", line 1423, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)

  File "C:\Users\J.W\anaconda3\envs\xarray-tests\lib\site-packages\matplotlib\axes\_axes.py", line 4626, in scatter
    collection._internal_update(kwargs)

  File "C:\Users\J.W\anaconda3\envs\xarray-tests\lib\site-packages\matplotlib\artist.py", line 1186, in _internal_update
    return self._update_props(

  File "C:\Users\J.W\anaconda3\envs\xarray-tests\lib\site-packages\matplotlib\artist.py", line 1160, in _update_props
    raise AttributeError(

AttributeError: PathCollection.set() got an unexpected keyword argument 'z'

Illviljan avatar Oct 13 '22 15:10 Illviljan

Ok, good question. I have to look into it. Anyway, you should use kwargs for x and y as well :) Also, we should definitely test this haha

headtr1ck avatar Oct 13 '22 15:10 headtr1ck

Anyway, you should use kwargs for x and y as well :)

Hah, unfortunately my random test scripts I copy/pasted from can't keep up with branches. :D

Illviljan avatar Oct 13 '22 15:10 Illviljan

edit

Illviljan avatar Oct 13 '22 16:10 Illviljan

For me this code works? Besides from too few distance between the subplots such that the z-labels are on top of the neighboring plot. And the y and z axis seem to be reversed, unless it is intentional that y points to the top?

headtr1ck avatar Oct 13 '22 16:10 headtr1ck

Yeah, I had some strange issues there. One was that the branch wasn't up to date and another one that I can't replicate anymore... Sorry for the noise,

Agreed on the labels, I haven't figured how to avoid that overlap yet. y should point to the top as it does in 2D plots, it makes it easy to orient yourself as you add dimensions:

ds = xr.tutorial.scatter_example_dataset(seed=42)
fg = ds.plot.scatter(x="A", y="B", hue="y", row="x", col="w")
fg = ds.plot.scatter(x="A", y="B", z="z", hue="y", row="x", col="w")

image image

Not how it's done in plots2d (yet) though.

Illviljan avatar Oct 13 '22 17:10 Illviljan

I get much more overlap than you! Why is that? Windows again? test

I don't necessarily agree about y-axis pointing upwards, I think that is confusing. But I let people who actually use this be the judge :)

headtr1ck avatar Oct 13 '22 18:10 headtr1ck