Opt out of auto creating index variables
Tries fixing #8704 by cherry-picking from #8124 as @benbovy suggested in https://github.com/pydata/xarray/issues/8704#issuecomment-1926868422
- [x] Closes #8704
- [ ] Tests added
- [ ] User visible changes (including notable bug fixes) are documented in
whats-new.rst - [ ] New functions/methods are listed in
api.rst
This currently doesn't work - when I run test_coordinates.py I get some failures:
FAILED xarray/tests/test_coordinates.py::TestCoordinates::test_init_default_index - AssertionError: {'x': <class 'xarray.core.variable.Variable'>}
FAILED xarray/tests/test_coordinates.py::TestCoordinates::test_getitem - AssertionError: {'x': <class 'xarray.core.variable.Variable'>}
FAILED xarray/tests/test_coordinates.py::TestCoordinates::test_assign - AssertionError: {'x': <class 'xarray.core.variable.Variable'>, 'y': <class 'xarray.core.variable.IndexVariable'>}
FAILED xarray/tests/test_coordinates.py::TestCoordinates::test_align - ValueError: cannot reindex or align along dimension 'x' because of conflicting dimension sizes: {2, 3}
The first couple of errors seem to be because the 1D variable doesn't have a corresponding index created by default.
I noticed that the local variable default_indexes is being defined within Coordinates.__init__ as an empty dict, which is never changed before being used to update indexes. This seems like an indication of some missing logic to populate default_indexes?
https://github.com/pydata/xarray/pull/8711/files#diff-c73f63e529ab4065cca1aa23d93c91a3b060f314e2e498b708b90fef87141a88R286
Sorry the two commits picked from #8124 where done before #8107 and messed up things a bit. I also picked and added the last merge commit from #8124, which should hopefully make things better (coordinate tests are all running fine locally at least).
About the auto_convert option added to as_variable() here: this is a temporary workaround before we make a deeper and better refactoring of non-index vs. index coordinate variables. That is not super nice since as_variable() is public API, but maybe that's fine if it is clearly stated in the docstrings that auto_convert is for internal use only (and we likely be removed in the future)?
Thanks @benbovy !
That is not super nice since as_variable() is public API, but maybe that's fine if it is clearly stated in the docstrings that auto_convert is for internal use only (and we likely be removed in the future)?
I mean I don't think that's a huge deal, but if you definitely wanted to hide it from users couldn't you make a private _as_variable(..., auto_convert) and use that for internal calls and have the public as_variable call the private one?
Why is as_variable public API?
why is
as_variablepublic API?
More context in #1303
I mean I don't think that's a huge deal
Me neither (I wanted to see what others think).
@TomNicholas have you tried running your notebook mentioned in https://github.com/pydata/xarray/issues/8704#issuecomment-1927264039 with this PR? Does it fix the errors raised?
@TomNicholas have you tried running your notebook mentioned in https://github.com/pydata/xarray/issues/8704#issuecomment-1927264039 with this PR? Does it fix the errors raised?
Yes I did, and I think it does! I still can't do a full xr.concat on Dataset objects because of a different bug (see https://github.com/pydata/xarray/pull/8714), but with this PR gets to the same point as v2023.08.0 does at least. 😀
@benbovy I would really like to get this merged as I'm using it in a new package (see https://github.com/TomNicholas/VirtualiZarr/pull/42), but I'm not really sure what the steps to make this PR ready for merging are. Does this need any new tests?
Can we catch these warnings if they are expected?
xarray/tests/test_variable.py::TestVariable::test_as_variable
/home/runner/work/xarray/xarray/xarray/tests/test_variable.py:1219: FutureWarning: variable 'x' with name matching its dimension will not be automatically converted into an `IndexVariable` object in the future.
actual = as_variable(data, name="x")
xarray/tests/test_variable.py::TestVariable::test_as_variable
/home/runner/work/xarray/xarray/xarray/tests/test_variable.py:1237: FutureWarning: variable 'time' with name matching its dimension will not be automatically converted into an `IndexVariable` object in the future.
assert as_variable(dt, "time").dtype.kind == "M"
xarray/tests/test_variable.py::TestVariable::test_as_variable
/home/runner/work/xarray/xarray/xarray/tests/test_variable.py:1239: FutureWarning: variable 'time' with name matching its dimension will not be automatically converted into an `IndexVariable` object in the future.
assert as_variable(td, "time").dtype.kind == "m"
This should be done now!
Can we catch these warnings if they are expected?
Done