Support backup_url kwarg

Open Zethson opened this issue 3 years ago • 2 comments

Is your feature request related to a problem? Please describe. As far as I can see, it is not possible to supply a backup_url for mudata objects via muon.read the way scanpy's read function allows. I am opening the issue here because the IO code primarily lives in this repository.

Describe the solution you'd like Support for it :) I guess that the download code already exists here: https://github.com/PMBio/mudatasets/blob/main/mudatasets/core.py#L28

Zethson avatar Jan 27 '22 09:01 Zethson

An alternative solution I've been thinking about would be adding fsspec support for the input paths (https://github.com/theislab/anndata/issues/657)

This would look like:

data = muon.read_h5mu("filecache::https://ebi.ac.uk/...")

Where the location to cache is some configured cache directory, or controlled with kwargs passed to fsspec, like filecache={'cache_storage':'/tmp/files'}

ivirshup avatar Jan 27 '22 15:01 ivirshup

@gtca what would your preferred, simple solution be? Having the download code in mudata instead of mudatasets? I might be able to file a PR, but it would be good if you could tell me first what you'd want.

Zethson avatar Oct 29 '22 11:10 Zethson

v0.3 supports this via fsspec, namely this should work:

import fsspec
from mudata import read

# OpenFile and BufferedReader from fsspec are supported for remote storage, e.g.:
mdata = read(fsspec.open("s3://bucket/file.h5mu"))

# or
with fsspec.open("s3://bucket/file.h5mu") as f:
    mdata = read(f)

# or
with fsspec.open("https://server/file.h5ad") as f:
    adata = read(f)

For backup_url, if you think it's still needed, I guess it's worth a new issue on https://github.com/scverse/muon.

gtca avatar Jul 03 '24 07:07 gtca
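
For anyone who still wants backup_url-style behaviour in the meantime, here is a minimal sketch of a fallback in user code, using only the standard library. It is not part of mudata or muon: the helper name `ensure_local` and its parameters are hypothetical, and it only mirrors the general idea behind scanpy's backup_url (download the file when the local path does not exist).

```python
import shutil
import urllib.request
from pathlib import Path


def ensure_local(path, backup_url):
    """Return `path` as a Path, downloading it from `backup_url` if it is missing.

    Hypothetical helper for illustration; not an API of mudata, muon, or scanpy.
    """
    path = Path(path)
    if not path.exists():
        # Create the target directory on first use.
        path.parent.mkdir(parents=True, exist_ok=True)
        # Download to a temporary name so an interrupted transfer
        # never leaves a truncated file at the final location.
        tmp = path.with_name(path.name + ".part")
        with urllib.request.urlopen(backup_url) as response, open(tmp, "wb") as out:
            shutil.copyfileobj(response, out)
        tmp.replace(path)
    return path
```

The returned path can then be handed to the reader, for example `read(str(ensure_local("data/file.h5mu", "https://server/file.h5mu")))`, where both the local path and the URL are placeholders. An existing local file is never re-downloaded, so the URL only acts as a backup.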