numcodecs

Decide, document (and rethink?) optional dependency handling

Open dstansby opened this issue 1 year ago • 4 comments

Currently, optional dependencies are handled by not defining a codec class when its dependency is not present (e.g., see https://github.com/zarr-developers/numcodecs/pull/637). This leads to a poor user experience when you want to use an optional codec but haven't installed its dependency: if you try to import the codec, you get an import error with no indication that the fix is to install a specific package (see https://github.com/zarr-developers/numcodecs/issues/526).

I propose that we switch to a model where the codec class is always defined, and instantiating it when its optional dependency is missing raises a helpful error message.
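A minimal sketch of what that model could look like, not taken from numcodecs itself. The names `LazyCodec`, `OptionalDependencyError`, `required_module`, and `install_hint` are hypothetical, and the "missing" package name is made up so the failure path can be shown:

```python
import importlib


class OptionalDependencyError(ImportError):
    """Hypothetical error type raised when an optional dependency is missing."""


class LazyCodec:
    """Hypothetical base class: always importable, checks its dependency on instantiation."""

    required_module = None  # name of the optional package backing this codec
    install_hint = None     # text telling the user how to get it

    def __init__(self):
        try:
            # The import is deferred until someone actually creates the codec.
            self._backend = importlib.import_module(self.required_module)
        except ImportError as err:
            raise OptionalDependencyError(
                f"{type(self).__name__} requires the optional package "
                f"'{self.required_module}'. Install it with: {self.install_hint}"
            ) from err


class MissingBackendCodec(LazyCodec):
    # Made-up package name, assumed not to be installed.
    required_module = "definitely_not_installed_backend"
    install_hint = "pip install definitely-not-installed-backend"


class JsonBackedCodec(LazyCodec):
    # Uses a standard-library module so the success path can be demonstrated.
    required_module = "json"
    install_hint = "(part of the standard library)"
```

With this shape, importing `MissingBackendCodec` always succeeds, and only `MissingBackendCodec()` fails, with a message that names the package to install.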

At some point I will create a PR with a concrete implementation of this change to help see what it would mean, but please share opinions on whether there are other or better ways to improve the user experience around optional dependency handling.

dstansby avatar Nov 21 '24 22:11 dstansby

Pinging @jakirkham to see if you have any thoughts, because I know you've been tidying this up recently.

dstansby avatar Nov 29 '24 14:11 dstansby

We have allowed a proliferation of increasingly experimental or custom codecs in Numcodecs. However, this has come at substantial maintenance cost, as they don't always get updated in a timely fashion or are incompatible with Python or NumPy releases. This requires a lot of finessing in requirements, warnings/errors, and CI.

We have tried to manage this by making them optional and guarding them in various ways, though doing this correctly is not always straightforward for other contributors.

I think the answer has to be moving these out into separate repos/libraries. In other words, Numcodecs extensions that users can choose to install or not.

jakirkham avatar Dec 03 '24 01:12 jakirkham

> I propose that we switch to a model where instead instantiating a codec class where an optional dependency is missing raises a helpful error message.

So I tried this out over at https://github.com/zarr-developers/numcodecs/pull/666, and it didn't work because (at least) ZFPY has default arguments in the class method signatures that require values from the zfpy package.
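To illustrate the general problem (this is not the actual ZFPY code): Python evaluates default argument values when the `def` statement runs, so a default that reads from a missing package fails at class definition time, before any instantiation-time check can run. In the sketch below, `missing_backend_pkg`, `DEFAULT_MODE`, `EagerCodec`, and `DeferredCodec` are all hypothetical names:

```python
try:
    import missing_backend_pkg as backend  # hypothetical optional package
except ImportError:
    backend = None


def define_codec_eagerly():
    """Define a class whose default argument needs the backend immediately."""

    class EagerCodec:
        # `backend.DEFAULT_MODE` is evaluated when this `def` executes,
        # so with backend = None the class body itself raises AttributeError.
        def __init__(self, mode=backend.DEFAULT_MODE):
            self.mode = mode

    return EagerCodec


class DeferredCodec:
    """Same idea, but the backend value is looked up only on instantiation."""

    def __init__(self, mode=None):
        if backend is None:
            raise ImportError(
                "DeferredCodec requires the optional package 'missing_backend_pkg'"
            )
        # None acts as a sentinel meaning "use the backend's default".
        self.mode = backend.DEFAULT_MODE if mode is None else mode
```

A sentinel default like `None` avoids the definition-time failure, at the cost of changing the visible signature; whether that trade-off is acceptable for the real codecs is a separate question.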

With the current way codecs are namespaced, i.e. numcodecs.{CODEC}, I don't think there's a nicer way to warn users or raise an error if a dependency is missing. So I think the choice is:

  1. Change namespacing so codecs are in numcodecs.{codec-sub-module}.{CODEC}, forcing users to import the submodule to use it (and removing the imports from numcodecs/__init__.py)
  2. Keep the status quo where codecs just silently don't exist if a dependency isn't installed

I don't have particularly strong opinions either way. If I were writing from scratch I'd definitely go for 1), but given the pain of changing namespacing, perhaps we should just stick with 2)?

dstansby avatar Dec 03 '24 11:12 dstansby

Given that we have a dynamic codec registry, why should users be doing from numcodecs import GZip, instead of something like GZip = numcodecs.get_codec('GZip')?
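A rough sketch of how a registry lookup could also carry the helpful error, assuming a toy registry rather than the real numcodecs one. The names `register_codec`, `get_codec_class`, `_optional_hints`, `ToyZlib`, and the `fancycodec` id are hypothetical:

```python
import zlib

_registry = {}  # codec id -> codec class

# Hypothetical table of codecs that are known but need an extra package.
_optional_hints = {"fancycodec": "pip install fancycodec-backend"}


def register_codec(cls):
    """Class decorator that records a codec under its `codec_id`."""
    _registry[cls.codec_id] = cls
    return cls


def get_codec_class(codec_id):
    """Look up a codec class, explaining how to install it if it is optional."""
    try:
        return _registry[codec_id]
    except KeyError:
        hint = _optional_hints.get(codec_id)
        if hint is not None:
            raise ImportError(
                f"codec {codec_id!r} needs an optional dependency; try: {hint}"
            ) from None
        raise KeyError(f"unknown codec id {codec_id!r}") from None


@register_codec
class ToyZlib:
    """Toy codec backed by the standard-library zlib module."""

    codec_id = "zlib"

    def __init__(self, level=1):
        self.level = level

    def encode(self, buf):
        return zlib.compress(buf, self.level)

    def decode(self, buf):
        return zlib.decompress(buf)
```

Because the lookup goes through a function instead of a module attribute, the registry has a natural place to distinguish "unknown codec" from "known codec whose dependency is missing".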

d-v-b avatar Feb 12 '25 09:02 d-v-b