arviz icon indicating copy to clipboard operation
arviz copied to clipboard

Make netCDF4 optional.

Open twiecki opened this issue 3 years ago • 8 comments

Closes #2028.

twiecki avatar May 13 '22 16:05 twiecki

Codecov Report

Merging #2029 (27720b8) into main (c78deef) will increase coverage by 0.00%. The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #2029   +/-   ##
=======================================
  Coverage   90.64%   90.64%           
=======================================
  Files         117      117           
  Lines       12553    12554    +1     
=======================================
+ Hits        11379    11380    +1     
  Misses       1174     1174           
Impacted Files Coverage Δ
arviz/data/inference_data.py 83.04% <100.00%> (+0.02%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update c78deef...27720b8. Read the comment docs.

codecov[bot] avatar May 13 '22 16:05 codecov[bot]

This will break a lot of code and environments, we need to be careful about how to do this, I don't think removing the dependency straight away is a good idea.

The whole of ArviZ assumes netcdf files can be stored and read. For example, the doc failures are due to that, using arviz.load_arviz_data reads netcdf files (either packaged with the library or after downloading from figshare) and most documentation examples are based on this. Changing that will also mean that after doing pip install arviz people won't be able to save inferencedata as netcdf as they are now

OriolAbril avatar May 13 '22 19:05 OriolAbril

OK, maybe it should stay a dependency then?

twiecki avatar May 14 '22 16:05 twiecki

I think it should be possible to install ArviZ without installing libraries that are not required throughout the library (netcdf, but also matplotlib for example like I said before), but I am also worried about making pip install arviz default to an installation that can't save to file nor plot (in the extreme) unless some of the possible optional dependencies are installed manually as well as backward compatibility issues.

Let's see what the rest of the team thinks. One thing we discussed was splitting the library in data/stats/plots but releasing in sync for simplicity and keeping arviz as the bundle of the three (plus maybe netcdf and/or matplotlib). But it might be better to rip the band aid and have arviz make a minimal install and recommend installing arviz[things]

OriolAbril avatar May 14 '22 20:05 OriolAbril

I don't recall if it is possible to do like, pip install arviz[minimal] -- would that help with this? I'm pretty happy to allow bespoke [special_install_requires] if it allows usage.

ColCarroll avatar May 15 '22 23:05 ColCarroll

Catching up. Agree with Oriol, We definitely can't do it this way this "hardline" way its going to break a lot of things. I think we have do it the way Colin recommends, which is netcdf by default, but it can be turned off with a flag.

We should do this to enable support for pyscript, and I think jupyterlite as well since it runs on pyodide. Those would both be awesome tools to allow users to test out arviz or learn statistical computing without needing to install a local env.

Also note for the pyscript usecase we'll also need to compile both wheels everytime we release as well so its fetchable by the pyscript users. Outside of the scope of this PR but something to note

canyon289 avatar May 16 '22 00:05 canyon289

I don't recall if it is possible to do like, pip install arviz[minimal] -- would that help with this?

That would be amazing but I don't think it is possible. afaik the [name] can only be used to add dependencies to the set of requirements. Does someone know how to use this [] syntax or some kind of alternative to allow this?

I'm pretty happy to allow bespoke [special_install_requires] if it allows usage.

Just to make sure we are all on the same page, we already provide arviz[all] to install all requirements and optional dependencies at once. See https://python.arviz.org/en/latest/getting_started/Installation.html

Also note for the pyscript usecase we'll also need to compile both wheels everytime we release as well so its fetchable by the pyscript users. Outside of the scope of this PR but something to note

Whenever we release, the Azure job already builds a wheel and uploads it to pypi, see https://pypi.org/project/arviz/#files. Only one wheel is generated though and you mentioned "both wheels" @canyon289 can you create an issue with what is missing so we remember to update the release Azure job?

OriolAbril avatar May 18 '22 13:05 OriolAbril

Whenever we release, the Azure job already builds a wheel and uploads it to pypi, see https://pypi.org/project/arviz/#files. Only one wheel is generated though and you mentioned "both wheels" @canyon289 can you create an issue with what is missing so we remember to update the release Azure job?

Sure but we only need to do this if we go through with the proposal above which its uncertain if we will

canyon289 avatar May 19 '22 03:05 canyon289

What about using h5netcdf? That might make it a lot less heavy since the netCDF4 C binaries won't be necessary.

varchasgopalaswamy avatar Sep 26 '22 16:09 varchasgopalaswamy

Closing in favor of https://github.com/arviz-devs/arviz/pull/2122.

twiecki avatar Sep 27 '22 09:09 twiecki