xarray
xarray copied to clipboard
Align typing of dimension inputs
What is your issue?
Currently the input type for "one or more dims" is changing from function to function.
There are some open PRs that move to str | Iterable[Hashable]
which allows the use of tuples as dimensions.
Some changes are still required:
- [ ] Accept None in all functions that accept dims as default, this would simplify typing alot (see https://github.com/pydata/xarray/pull/7048#discussion_r973813607)
- [ ] Check if we can always include ellipsis "..." in dim arguments (see https://github.com/pydata/xarray/pull/7048#pullrequestreview-1111498309)
- [ ]
Iterable[Hashable]
includes sets, which do not preserve the ordering (see https://github.com/pydata/xarray/pull/6971#discussion_r981166670). This means we need to distinguish between the cases where the order matters (constructor, transpose etc.) and where it does not (drop_dims, reductions etc.). Probably this needs to be typed as astr | Sequence[Hashable]
(a numpy.ndarray is not a Sequence, but who uses this for dimensions anyway?).
Iterables (implement __iter__
) not only aren't ordered, but also neither have a length (implement __len__
nor allow for membership tests (implement __contains__
). So in these cases one needs to use Sequences, too; in addition to cases where order matters.
See https://docs.python.org/3.9/library/collections.abc.html#collections-abstract-base-classes
Yes, sequence sounds right if the order matters.
Usually we do not use len(input)
or x in input
though. Maybe reversible container or something else exists?
A bit unfortunate that np.ndarray
is not a sequence though, but I think using them for dimensions of not a very common use case.
Usually we do not use
len(input)
orx in input
though. Maybe reversible container or something else exists?
We can build a list if we need a len? Or IIRC there is a Sized
type — a Set
but not an Iterable
.
We can build a list if we need a len? Or IIRC there is a
Sized
type — aSet
but not anIterable
.
We already do that.
We have the following solutions:
- We allow any Iterable and it is up to the user to ensure correct ordering (mypy does not complain when a set is supplied)
- We require a Sequence which ensures ordering (user cannot supply a np.ndarray e.g.)
- We make the type more complicated to support ArrayLike as well.
Ah excellent, thanks @headtr1ck