vitessce-python
Use anywidget-based IPC for Zarr gets, which does not require serving data on localhost
This change enables using the Vitessce widget in HPC environments, on Google Colab, in VSCode, and in HuBMAP Workspaces with data that is local to the Python kernel. It uses the new API from https://github.com/manzt/anywidget/pull/453
In these environments it can be difficult or impossible to proxy requests from the browser in which the notebook is running to the server on which the Python kernel is running. Jupyter-server-proxy can only get us so far. This may also fix #255, which involves another challenging environment (e.g., it is not a web browser, so we cannot rely on the structure of the notebook URL to help us construct the data URLs).
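To make the idea concrete, here is a minimal sketch of what the kernel-side handler for a Zarr "get" could look like. It is not the code in this PR: the function name `handle_zarr_get`, the message fields `store` and `key`, and the response shape are all hypothetical, and a plain `dict` stands in for a Zarr store (Zarr v2 stores are mapping-like, so key lookup works the same way).

```python
def handle_zarr_get(stores, msg):
    """Answer one Zarr "get" request using stores held in the Python kernel.

    stores: dict mapping a (hypothetical) store id to a mapping-like Zarr store.
    msg: (hypothetical) request payload, e.g. {"store": "adata", "key": "X/.zarray"}.
    Returns (response, buffers): a small JSON-able dict plus a list of binary
    buffers, which is the general shape of a widget message with attachments.
    """
    store = stores.get(msg.get("store"))
    if store is None:
        return {"success": False, "error": "unknown store"}, []

    # Zarr keys are relative paths, so drop any leading slash from the request.
    key = msg["key"].lstrip("/")
    try:
        value = store[key]
    except KeyError:
        # A missing chunk or metadata key is a normal outcome, not a crash.
        return {"success": False, "error": "missing key"}, []

    return {"success": True}, [bytes(value)]


# A dict stands in for a real store in this demo.
demo_stores = {"adata": {".zgroup": b'{"zarr_format": 2}'}}
ok_response, ok_buffers = handle_zarr_get(
    demo_stores, {"store": "adata", "key": "/.zgroup"}
)
missing_response, missing_buffers = handle_zarr_get(
    demo_stores, {"store": "adata", "key": "X/0.0"}
)
```

Because the bytes travel over the widget's existing kernel connection, nothing in this path depends on the browser being able to reach a port on the kernel's host.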
TODO:
- [ ] Document this on https://vitessce.github.io/vitessce-python/data_options.html
- [ ] Register store in all Zarr-based Wrapper classes once we decide how the store will be passed/instantiated
One question is how to expose this in the API. Maybe we require the user to pass the store, like this:
  dataset = vc.add_dataset(name='Brain').add_object(AnnDataWrapper(
-         adata_path=zarr_filepath,
+         adata_store=zarr.DirectoryStore(zarr_filepath),
          obs_embedding_paths=["obsm/X_tsne"],
          obs_embedding_names=["UMAP"],
          obs_set_paths=["obs/CellType"],
          obs_set_names=["Cell Type"],
          obs_feature_matrix_path="X",
          initial_feature_filter_path="var/top_highly_variable"
      )
  )
This would allow more than just DirectoryStores but would require more work from the user.
Or maybe we keep adata_path and add something like as_store (should it be True by default?). This would not allow any other store types, but maybe that is OK?
  dataset = vc.add_dataset(name='Brain').add_object(AnnDataWrapper(
          adata_path=zarr_filepath,
+         as_store=False,
          obs_embedding_paths=["obsm/X_tsne"],
          obs_embedding_names=["UMAP"],
          obs_set_paths=["obs/CellType"],
          obs_set_names=["Cell Type"],
          obs_feature_matrix_path="X",
          initial_feature_filter_path="var/top_highly_variable"
      )
  )
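The two options are not mutually exclusive. As a rough sketch only, a wrapper could accept either argument and normalize them in one place. The helper name `resolve_store` and the `store_factory` parameter are hypothetical and not part of the current API; the default factory assumes Zarr v2, where `zarr.DirectoryStore` exists and stores are `MutableMapping` instances.

```python
from collections.abc import MutableMapping


def resolve_store(adata_path=None, adata_store=None, as_store=True, store_factory=None):
    """Return the store to register for IPC, or None to keep serving by path.

    Hypothetical helper: none of these names come from the PR itself.
    """
    if adata_path is not None and adata_store is not None:
        raise ValueError("Pass either adata_path or adata_store, not both.")

    if adata_store is not None:
        # Option 1: the user built the store, so any mapping-like store works.
        if not isinstance(adata_store, MutableMapping):
            raise TypeError("adata_store must be a mapping-like Zarr store.")
        return adata_store

    if adata_path is None:
        raise ValueError("One of adata_path or adata_store is required.")

    if not as_store:
        # Option 2 with as_store=False: the caller falls back to HTTP serving.
        return None

    if store_factory is None:
        # Assumption: Zarr v2 is installed; imported lazily so the sketch
        # can be exercised with a custom factory and no Zarr dependency.
        import zarr
        store_factory = zarr.DirectoryStore
    return store_factory(adata_path)


# Demo with a stand-in factory so no files or Zarr install are needed.
user_store = {}
passed_through = resolve_store(adata_store=user_store)
from_path = resolve_store(adata_path="brain.zarr", store_factory=lambda p: {"path": p})
http_fallback = resolve_store(adata_path="brain.zarr", as_store=False)
```

Keeping `adata_path` as the simple default while still accepting a prebuilt store would cover the common case without ruling out other store types.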
cc @manzt
Maybe default to something like zarr.storage.FSStore, which is based on fsspec and automatically infers stores?