xarray icon indicating copy to clipboard operation
xarray copied to clipboard

Align typing of dimension inputs

Open headtr1ck opened this issue 1 year ago • 5 comments

What is your issue?

Currently the input type for "one or more dims" is changing from function to function. There are some open PRs that move to str | Iterable[Hashable] which allows the use of tuples as dimensions.

Some changes are still required:

  • [ ] Accept None in all functions that accept dims as default, this would simplify typing alot (see https://github.com/pydata/xarray/pull/7048#discussion_r973813607)
  • [ ] Check if we can always include ellipsis "..." in dim arguments (see https://github.com/pydata/xarray/pull/7048#pullrequestreview-1111498309)
  • [ ] Iterable[Hashable] includes sets, which do not preserve the ordering (see https://github.com/pydata/xarray/pull/6971#discussion_r981166670). This means we need to distinguish between the cases where the order matters (constructor, transpose etc.) and where it does not (drop_dims, reductions etc.). Probably this needs to be typed as a str | Sequence[Hashable] (a numpy.ndarray is not a Sequence, but who uses this for dimensions anyway?).

headtr1ck avatar Sep 27 '22 20:09 headtr1ck

Iterables (implement __iter__) not only aren't ordered, but also neither have a length (implement __len__ nor allow for membership tests (implement __contains__). So in these cases one needs to use Sequences, too; in addition to cases where order matters.

See https://docs.python.org/3.9/library/collections.abc.html#collections-abstract-base-classes

derhintze avatar Oct 06 '22 09:10 derhintze

Yes, sequence sounds right if the order matters. Usually we do not use len(input) or x in input though. Maybe reversible container or something else exists?

A bit unfortunate that np.ndarray is not a sequence though, but I think using them for dimensions of not a very common use case.

headtr1ck avatar Oct 09 '22 11:10 headtr1ck

Usually we do not use len(input) or x in input though. Maybe reversible container or something else exists?

We can build a list if we need a len? Or IIRC there is a Sized type — a Set but not an Iterable.

max-sixty avatar Oct 09 '22 18:10 max-sixty

We can build a list if we need a len? Or IIRC there is a Sized type — a Set but not an Iterable.

We already do that.

We have the following solutions:

  1. We allow any Iterable and it is up to the user to ensure correct ordering (mypy does not complain when a set is supplied)
  2. We require a Sequence which ensures ordering (user cannot supply a np.ndarray e.g.)
  3. We make the type more complicated to support ArrayLike as well.

headtr1ck avatar Oct 13 '22 09:10 headtr1ck

Ah excellent, thanks @headtr1ck

max-sixty avatar Oct 13 '22 18:10 max-sixty