Improve error message when trying to load a `zarr` dataset
When trying to load a zspy file by specifying the store, I get a FileNotFoundError, even when following the example at https://hyperspy.org/rosettasciio/user_guide/supported_formats/zspy.html
Making the zspy:
import numpy as np
import zarr
import hyperspy.api as hs
s = hs.signals.Signal2D(np.random.random((50, 50, 50, 50)))
store = zarr.NestedDirectoryStore("test_save.zarr")
s.save(store)
store.close()
Loading by specifying the store:
import zarr
import hyperspy.api as hs
store = zarr.NestedDirectoryStore("test_save.zarr")
s1 = hs.load(store)
This gives the error:
FileNotFoundError: File: test_save.zarr not found!
@magnunor does the file get saved, and can you see it in your directory? I use this quite a bit currently and don't have issues.
Testing this a bit further, it seems to be related to the .zspy ending.
import zarr
from rsciio import zspy
import hyperspy.api as hs
If the folder ends with .zarr it does not work:
store = zarr.NestedDirectoryStore("test_save.zarr")
s1 = hs.load(store, reader=zspy)
Giving FileNotFoundError.
If the folder ends with .zspy it does work:
store = zarr.NestedDirectoryStore("test_save.zspy")
s1 = hs.load(store, reader=zspy)
So I guess this isn't really a bug, but rather a confusing error message.
@magnunor, for future reference, the full traceback is (I am putting it here because it is useful for understanding what the issue is):
File c:\users\m0041user\untitled14.py:19
s1 = hs.load(store)
File ~\Dev\hyperspy\hyperspy\io.py:537 in load
objects = [
File ~\Dev\hyperspy\hyperspy\io.py:538 in <listcomp>
load_single_file(filename, lazy=lazy, **kwds) for filename in filenames
File ~\Dev\hyperspy\hyperspy\io.py:575 in load_single_file
raise FileNotFoundError(f"File: {path} not found!")
FileNotFoundError: File: C:\Users\M0041User\test_save.zarr not found!
And it shows that the error comes from: https://github.com/hyperspy/hyperspy/blob/3b350ff80b68b0928b284e76d3dc0a2b31e6dbf5/hyperspy/io.py#L572-L575
This can be fixed by checking that the extension is zarr and raising an improved error message. I am a bit on the fence about what is best to do here: there actually isn't a file with that name; instead, there is a folder with that name. At the same time, I am not sure it is worth making a special case for this, because hyperspy is not expected to read zarr files.
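A minimal sketch of what such a check could look like, using only the standard library. The helper name `check_loadable_path` and the `known_extensions` argument are hypothetical and not part of hyperspy; the hint about renaming to `.zspy` is based on the observation above that the same store loads once the folder name ends in `.zspy`.

```python
from pathlib import Path


def check_loadable_path(path, known_extensions=(".zspy",)):
    """Raise a descriptive error before trying to read `path`.

    Hypothetical helper, not part of hyperspy: `known_extensions` is an
    illustrative stand-in for the folder extensions that readers accept.
    """
    path = Path(path)
    if not path.exists():
        # Nothing at this location at all: keep the existing message.
        raise FileNotFoundError(f"File: {path} not found!")
    if path.is_dir() and path.suffix.lower() not in known_extensions:
        # A folder exists, but its extension does not map to any reader.
        raise ValueError(
            f"'{path}' is a directory with the unsupported extension "
            f"'{path.suffix}'. If it is a zarr store written by the zspy "
            "writer, rename it so that it ends in '.zspy'."
        )
    return path
```

With this, a folder named `test_save.zarr` would produce a message that says the folder was found but its extension is not recognised, instead of claiming that nothing exists at that path.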
I think this should raise a similar error to reading a hdf5 file where it will throw an error if you don't pass a specific reader.
One other thing to consider is that we might want to hard code a few more "stores". Specifically, I like the zip store, but I think that having people make their own zip store is a bit too much of an ask, in my experience.
I think this should raise a similar error to reading a hdf5 file where it will throw an error if you don't pass a specific reader.
Yes, indeed, the issue is more general: it should fail more gracefully when trying to load data whose extension is not supported, which is the issue here. At the moment (off the top of my head), if it doesn't find a reader, it falls back to trying to load the data as an image... this is a very old (and odd) behaviour (I have always seen it!). 😄
One other thing to consider is that we might want to hard code a few more "stores". Specifically, I like the zip store, but I think that having people make their own zip store is a bit too much of an ask, in my experience.
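To illustrate the more general behaviour, here is a rough sketch of a reader lookup that raises a clear error instead of silently falling back to an image reader. The `READERS_BY_EXTENSION` mapping and the `infer_reader` function are made up for this example; the real list of readers lives in RosettaSciIO and is not reproduced here.

```python
from pathlib import Path

# Illustrative mapping only (an assumption for this sketch), not the real
# reader registry.
READERS_BY_EXTENSION = {".hspy": "hspy", ".zspy": "zspy", ".png": "image"}


def infer_reader(filename, reader=None):
    """Return a reader name, or raise if none can be inferred."""
    if reader is not None:
        # An explicitly requested reader always wins over the extension.
        return reader
    extension = Path(filename).suffix.lower()
    try:
        return READERS_BY_EXTENSION[extension]
    except KeyError:
        supported = ", ".join(sorted(READERS_BY_EXTENSION))
        raise ValueError(
            f"No reader found for extension '{extension}' ({filename}). "
            f"Supported extensions: {supported}. "
            "Pass `reader=...` to select one explicitly."
        ) from None
```

In this sketch an unknown extension such as `.zarr` stops with a `ValueError` that lists the supported extensions, rather than being handed to an image reader.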
Maybe add some examples about this, with some narrative about the pros and cons of the various stores? The ZipStore is single threaded, right? If I recall correctly, when you implemented the zspy format, we did look at adding an option to specify the store when saving, and it was messy or not possible when trying to load it again, or something along those lines?
Maybe add some examples about this, with some narrative about the pros and cons of the various stores? The ZipStore is single threaded, right?
It's multi-threaded, but it only allows a single process for writing, so you can't write from distributed to a ZipStore. You can always create a zip file after the fact though. Reading the data isn't a problem in any case. I was actually going to support data output for our camera using the .zspy zip file format, and my solution was to write multiple files and then create the zip container after the fact.
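As a rough illustration of "create the zip container after the fact", the sketch below packs an existing directory store into an uncompressed zip archive using only the standard library. The function name `zip_directory_store` is hypothetical, and whether zarr or the zspy reader opens the resulting archive is an assumption that is not checked here.

```python
import zipfile
from pathlib import Path


def zip_directory_store(directory, zip_path):
    """Pack every file under `directory` into an uncompressed zip archive.

    Entry names are relative to `directory`, so the archive root mirrors the
    root of the directory store. `zip_path` should be outside `directory`.
    """
    directory = Path(directory)
    # ZIP_STORED: the chunks are typically already compressed by zarr.
    with zipfile.ZipFile(zip_path, mode="w", compression=zipfile.ZIP_STORED) as zf:
        for file in sorted(directory.rglob("*")):
            if file.is_file():
                zf.write(file, arcname=file.relative_to(directory).as_posix())
    return Path(zip_path)
```

Because the directory store is written first, several processes can write chunks in parallel, and the single-writer restriction only applies to the final, sequential zipping step.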
If I recall correctly, when you implemented the zspy format, we did look at adding an option to specify the store when saving, and it was messy or not possible when trying to load it again, or something along those lines?
It's not particularly difficult to support something like ZipStore, and we could just add zip=True and then use the zip store. Loading shouldn't be a problem. The other storage types start to get a bit more complicated.