xarray
Checking whether there is a chunk_store passed iterates over all files
What is your issue?
Investigating the performance of our service, I came across the following code:
https://github.com/pydata/xarray/blob/392a61484e80e6ccfd5774b68be51578077d4292/xarray/backends/zarr.py#L377
We store our zarr arrays in S3, using fsspec to wrap the client. Our chunk_store object is an FSMap (https://github.com/fsspec/filesystem_spec/blob/dcff551ed789f0cea4a5ca5a8eed208bc1d0fdc5/fsspec/mapping.py#L7), whose __len__ lists all files under the store root whenever it is called:
def __len__(self):
    return len(self.fs.find(self.root))
This happens when the chunk_store (FSMap) is checked for truthiness, like this:
if chunk_store:
    ....
Would it be equivalent to do the following check instead?
if chunk_store is not None:
    ....
Or is there a reason to check that the object is non-empty, rather than only that it is not None? It would be good to avoid extra calls to the S3 bucket when they are not required.
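To illustrate the difference between the two checks without touching S3, here is a minimal sketch. `CountingStore` is a hypothetical stand-in I made up for this issue, not fsspec's FSMap. It only mimics the relevant behaviour: it defines `__len__` but no `__bool__`, so Python falls back to `__len__` for truthiness.

```python
class CountingStore:
    """Hypothetical stand-in for an FSMap-like store (not fsspec itself)."""

    def __init__(self, keys):
        self._keys = list(keys)
        self.len_calls = 0  # how many times a "listing" was triggered

    def __len__(self):
        # In FSMap, this is where fs.find(root) would list the whole store.
        self.len_calls += 1
        return len(self._keys)


store = CountingStore(["0.0", "0.1"])

if store is not None:  # identity check: never calls __len__
    pass
calls_after_identity_check = store.len_calls

if store:  # truthiness check: calls __len__ because __bool__ is undefined
    pass
calls_after_truthiness_check = store.len_calls

# The two checks disagree only for an empty store.
empty_store_is_truthy = bool(CountingStore([]))
empty_store_is_not_none = CountingStore([]) is not None
```

So the two checks are not strictly equivalent. An empty store is falsy but is still not None, so switching would only change behaviour if an empty chunk_store is meant to be treated the same as a missing one.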