cf-python
cf-python copied to clipboard
LAMA to Dask: managing default(-argument) `Data` instance
(Opening as a general issue since __init__ isn't listed under our table of methods in #295, so I am not sure of the migration status of Data initialisation.) Whilst I was trying to create diverse cases of instances of Data (in the context of https://github.com/NCAS-CMS/cf-python/pull/376#discussion_r847319179), I noticed cf.Data() initialised as-is, i.e. with no arguments, which is seemingly documented as being valid (initial argument array=None with array: optional stated), runs into an error:
>>> cf.Data()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/sadie/cf-python/cf/data/data.py", line 964, in __repr__
return super().__repr__().replace("<", "<CF ", 1)
File "/home/sadie/anaconda3/envs/cf-env/lib/python3.8/site-packages/cfdm/data/data.py", line 214, in __repr__
shape = self.shape
File "/home/sadie/cf-python/cf/data/data.py", line 158, in wrapper
return method(*args, **kwargs)
File "/home/sadie/cf-python/cf/data/data.py", line 6288, in shape
dx = self._get_dask()
File "/home/sadie/cf-python/cf/data/data.py", line 1387, in _get_dask
return self._custom["dask"]
KeyError: 'dask'
The fix is probably a quite simple adaptation, though I don't have time to look into it now; once I have finished reviewing my backlog of Dask migration PRs I will sort this, if you haven't already tackled it @davidhassell.
Hi Sadie, I think I'd rather make d = cf.Data(array=None, source=None) raise an exception. Let's add this to the list of items to talk about.
I think I'd rather make d = cf.Data(array=None, source=None) raise an exception. Let's add this to the list of items to talk about.
Sure, that's actually that's fine by me without the need for any debate! So I guess this issue becomes that of raising an appropriate exception rather than the error it presently runs into, as above. I'll change the title of this Issue to reflect that.
@davidhassell this Issue, which boils down now to:
make d = cf.Data(array=None, source=None) raise an exception
should be added to the list of tasks to do towards #182, since we agreed on applying that behaviour (see comments above!) and forgot about it, I guess! It's pretty trivial to implement, and concerns Data which we have largely moved on from, so probably best get it done sooner rather than later.
Assuming you agree with the above, do you want to tackle this, or shall I?
(FYI the underlying problem raised in the initial comment here is still true on our current lama-to-dask branch, assuming you call some operation on the cf.Data() e.g. run cf.Data().max().)
Thanks - All yours!
Fixed by #491
Hmm, this change causes a unit test to fail (test_docstring). Reopening for further thought ...
The new PR #551 changes tack, in that it allows a no-arg initialisation (Data()) but makes sure that the "dask" key in the _custom dictionary is inintialised to None, thereby allowing basic inspection of the instance.
This makes the Data class consistent with all other classes, i.e. all classes allow no-arg initialisations, and also allows the test_docstring.py to work.
This new approach is fine by me, assuming:
This makes the Data class consistent with all other classes, i.e. all classes allow no-arg initialisations, and also allows the test_docstring.py to work.
is the true motivation to change tack and not just that it is a convenient change in order to bypass the breaking of the docstring rewriting on the initial one! Which is definitely the case, I am sure... :smile:
(Also it does seem theoretically nice to allow the no-argument initialisation.)
Thanks for putting up the associated PR. I'll review it in a moment.
We have settled, for now on letting all classes have a no-arg init, so closing.