rosettasciio
Adding Documentation About Dask-Distributed Support for file types
Describe the functionality you would like to see.
I would like to add information to the documentation about which file loaders support the dask-distributed backend. Mostly this just means adding an extra column here. Currently I believe that only the zspy format and the new file loader in #11 support it, but we can think about adding support for the hspy file format as well as any of the other binary files.
Describe the context
I have defined a function in #11 that works as a drop-in replacement for np.memmap and allows for distributed loading of some data. This is particularly useful for large datasets and does a much better job of handling the available resources.
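
For illustration only, here is a minimal sketch of what a distributed-friendly stand-in for np.memmap could look like. It is not the function from #11: the names `lazy_memmap` and `_read_chunk` and the `rows_per_chunk` parameter are hypothetical, and the sketch assumes a raw, C-ordered binary file chunked along the first axis. The idea is that each dask task opens the file itself and returns an in-memory copy of one chunk, so no open memmap handle has to be passed between workers.

```python
import os
import tempfile

import dask.array as da
import numpy as np
from dask import delayed


def _read_chunk(filename, dtype, offset, shape):
    """Open the file inside the task and copy one chunk into memory."""
    mm = np.memmap(filename, dtype=dtype, mode="r", offset=offset, shape=shape)
    return np.array(mm)  # plain ndarray copy, safe to send between workers


def lazy_memmap(filename, dtype, shape, offset=0, rows_per_chunk=2):
    """Hypothetical helper: lazy dask array over a C-ordered binary file.

    The array is split along axis 0 into blocks of `rows_per_chunk` rows,
    and each block is read by its own delayed task.
    """
    dtype = np.dtype(dtype)
    row_shape = tuple(shape[1:])
    # Number of bytes occupied by one "row" (one index along axis 0).
    row_bytes = dtype.itemsize * int(np.prod(row_shape, dtype=np.int64))
    blocks = []
    for start in range(0, shape[0], rows_per_chunk):
        n_rows = min(rows_per_chunk, shape[0] - start)
        chunk_shape = (n_rows,) + row_shape
        block = delayed(_read_chunk)(
            filename, dtype, offset + start * row_bytes, chunk_shape
        )
        blocks.append(da.from_delayed(block, shape=chunk_shape, dtype=dtype))
    return da.concatenate(blocks, axis=0)


# Small demo: write a raw binary file, then load it lazily.
data = np.arange(5 * 3 * 4, dtype=np.float32).reshape(5, 3, 4)
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
data.tofile(path)

lazy = lazy_memmap(path, np.float32, data.shape)
total = lazy.sum(axis=0).compute()  # chunks are only read at compute time
```

With a dask.distributed Client active, the same `compute()` call should run the read tasks on the workers, provided every worker can see the file at the same path.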
Additional information
Using the dask-distributed scheduler is the preferred way to interact with dask in most cases. Supporting distributed schedulers at the loading level is important for larger datasets and allows for much more scalable performance.