xarray icon indicating copy to clipboard operation
xarray copied to clipboard

`xarray/datatree_` missing in 2024.2.0 sdist

Open mgorny opened this issue 1 year ago • 7 comments

What happened?

Apparently xarray-2024.2.0 requires xarray.datatree_ module but this module isn't included in sdist tarball.

What did you expect to happen?

No response

Minimal Complete Verifiable Example

$ tar -tf /tmp/dist/xarray-2024.2.0.tar.gz  | grep datatree_
(empty)

MVCE confirmation

  • [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [ ] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

n/a

mgorny avatar Feb 19 '24 03:02 mgorny

can you explain why you think we require it? Is it that the tests fail for the released version (meaning we need a importorskip for the relevant tests)? Or does anything else fail?

For context, xarray/datatree_ is a temporary directory that contains the code of the datatree repository while we move forward with the migration. This means that open_datatree intentionally does not work for now (but we also don't expose it as public API / document it).

keewis avatar Feb 19 '24 13:02 keewis

Yes, tests failing to import was the issue I've encountered. I've grepped for it and noticed it being imported in installed modules, so I've presumed it was supposed to be included. I'm sorry for being presumptious.

mgorny avatar Feb 19 '24 14:02 mgorny

no worries, we apparently forgot to skip the tests that need it, so this is a good hint.

keewis avatar Feb 19 '24 15:02 keewis

Besides the supposedly ignorable open_datatree in https://github.com/pydata/xarray/blob/e47eb92a76be8e66b2ea3437a4c77e340fb62135/xarray/backends/common.py#L134-L141

there is also still https://github.com/pydata/xarray/blob/e47eb92a76be8e66b2ea3437a4c77e340fb62135/xarray/backends/h5netcdf_.py#L37-L42 , https://github.com/pydata/xarray/blob/e47eb92a76be8e66b2ea3437a4c77e340fb62135/xarray/backends/netCDF4_.py#L48 , https://github.com/pydata/xarray/blob/e47eb92a76be8e66b2ea3437a4c77e340fb62135/xarray/backends/api.py#L72

and so on in the sdist without shipping xarray/datatree_.

bnavigator avatar Feb 27 '24 11:02 bnavigator

same as above: does this cause anything to fail? If it does, we can try to fix these.

(For reference, one of the examples above is a helper function for open_datatree which should never get called, while the remaining instances are within TYPE_CHECKING guards, so at most I'd expect typing to be affected)

keewis avatar Feb 27 '24 12:02 keewis

Same as above: Not at regular production runtime, but when running (typechecking) tests.

bnavigator avatar Feb 27 '24 13:02 bnavigator

Note that many developers like to have type checking enabled when they write code. If that code involves xarray, I supposed those developers might encounter failures.

bnavigator avatar Feb 27 '24 13:02 bnavigator

In xarray 24.3.0 too

>>> import xarray
>>> xarray.__version__
'2024.3.0'
>>> import xarray.core.datatree
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/shadchin/arc/arcadia/contrib/python/xarray/xarray/core/datatree.py", line 34, in <module>
    from xarray.datatree_.datatree.common import TreeAttrAccessMixin
ModuleNotFoundError: No module named 'xarray.datatree_'

shadchin avatar Apr 13 '24 06:04 shadchin

$ tar -tf xarray-2024.3.0.tar.gz  | grep datatree_
(empty)
$ unzip -l xarray-2024.3.0-py3-none-any.whl | grep datatree_
(empty)

shadchin avatar Apr 13 '24 06:04 shadchin

Is there a case where someone is writing a library that consumes xarray, type-checks their library, and their type check then fails?

If so, I think we should prioritize fixing this ASAP — we're somewhat violating the contract of having py.typed in our library.

Otherwise, it seems like a more theoretical problem (unless there are other cases which I'm missing), which we should consider solving but less of an immediate rush...

max-sixty avatar Apr 14 '24 18:04 max-sixty

right. I suppose we could remove the typing import until DataTree actually exists, and release a version immediately after merging the PR.

keewis avatar Apr 14 '24 21:04 keewis

Sounds like we should just stop stripping datatree out of the tarball. If people import the private core._datatree package it's at their own risk. That should be enough to make the type hints / packaging work again.

TomNicholas avatar Apr 16 '24 15:04 TomNicholas

Can you guys tell us if #8953 will fix your issues?

TomNicholas avatar Apr 17 '24 16:04 TomNicholas

If people import the private core._datatree package it's at their own risk.

xarray.datatree_ and xarray.core.datatree are both internal API, and anyone importing these two directly is on their own (at least until we finished the transition).

Additionally, unless we get a minimal example that allows us to reproduce the typing failures I wouldn't do anything besides including xarray.datatree_.datatree in the release artifacts. Instead, I think it makes more sense to focus on finishing the transition, which would resolve this issue anyways.

keewis avatar Apr 19 '24 18:04 keewis

@bnavigator @mgorny we will assume that #8953 has fixed this for you (once released) until you tell us otherwise!

TomNicholas avatar Apr 23 '24 18:04 TomNicholas