Add path parameter to write_zarr method

Open antoinegaston opened this issue 8 months ago • 8 comments

Please describe your wishes and possible alternatives to achieve the desired result.

This feature would allow writing an AnnData object to a specific path within a Zarr store. It requires only small changes:

First, in anndata/_io/zarr.py:

def write_zarr(
    store: MutableMapping | str | Path,
    adata: AnnData,
    path: str | None = None,
    chunks=None,
    **ds_kwargs,
) -> None:
    if isinstance(store, Path):
        store = str(store)
    adata.strings_to_categoricals()
    if adata.raw is not None:
        adata.strings_to_categoricals(adata.raw.var)
    # TODO: Use spec writing system for this
    # NOTE: mode="w" clears any existing contents of the store, including other paths.
    f = zarr.open(store, mode="w")
    f.attrs.setdefault("encoding-type", "anndata")
    f.attrs.setdefault("encoding-version", "0.1.0")

    def callback(func, s, k, elem, dataset_kwargs, iospec):
        if chunks is not None and not isinstance(elem, sparse.spmatrix) and k == "/X":
            func(s, k, elem, dataset_kwargs=dict(chunks=chunks, **dataset_kwargs))
        else:
            func(s, k, elem, dataset_kwargs=dataset_kwargs)

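    # Write at the store root when no path is given; f"/{path}" would yield "/None".
    key = "/" if path is None else f"/{path.strip('/')}"
    write_dispatched(f, key, adata, callback=callback, dataset_kwargs=ds_kwargs)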
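
One detail worth pinning down is how the optional path maps to a key inside the store, since the default is None and callers may pass leading, trailing, or doubled slashes. Below is a minimal sketch of that normalization. The helper name resolve_group_key is hypothetical and not part of anndata or zarr; it only illustrates the intended mapping, with None (or an empty string) meaning the store root.

```python
from __future__ import annotations


def resolve_group_key(path: str | None) -> str:
    """Map an optional sub-path to an absolute key inside a store.

    Hypothetical helper for illustration only; not an anndata or zarr API.
    """
    if path is None:
        # No path requested: write at the root of the store.
        return "/"
    # Drop empty segments so "/a//b/" and "a/b" resolve to the same key.
    parts = [segment for segment in path.split("/") if segment]
    return "/" + "/".join(parts)


print(resolve_group_key(None))      # /
print(resolve_group_key("test"))    # /test
print(resolve_group_key("/a//b/"))  # /a/b
```

Normalizing once, before the key reaches the writer, keeps the path=None case identical to the current behavior of writing at the root.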

In anndata/_core/anndata.py:

class AnnData(metaclass=utils.DeprecationMixinMeta):
    ...
    def write_zarr(
        self,
        store: MutableMapping | PathLike,
        path: str | None = None,
        chunks: bool | int | tuple[int, ...] | None = None,
    ):
        """\
        Write a hierarchical Zarr array store.

        Parameters
        ----------
        store
            The filename, a :class:`~typing.MutableMapping`, or a Zarr storage class.
        path
            Path within the store at which to write the data.
        chunks
            Chunk shape.
        """
        from .._io import write_zarr

        write_zarr(store, self, path=path, chunks=chunks)

Finally, add a small test to test_readwrite.py:

def test_zarr_path(tmp_path):
    zarr_pth = Path(tmp_path) / "test.zarr"
    adata = gen_adata((100, 100), X_type=np.array)
    adata.write_zarr(zarr_pth, path="test")

    from_zarr = ad.read_zarr(zarr_pth / "test")
    assert_equal(from_zarr, adata)

antoinegaston avatar Jul 01 '24 08:07 antoinegaston