Start backed sparse support for zarr

Open ivirshup opened this issue 2 years ago • 1 comments

Initial draft of backed sparse array support for zarr.

  • [ ] I intend to export the sparse_dataset, CSRDataset, and CSCDataset classes from experimental.
  • [ ] Get tests passing
  • [ ] Figure out if I'm going to expose any arguments through read_zarr

ivirshup avatar Apr 29 '22 15:04 ivirshup

Codecov Report

Merging #765 (ad016a6) into main (88dd129) will decrease coverage by 2.09%. The diff coverage is 92.74%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #765      +/-   ##
==========================================
- Coverage   84.88%   82.79%   -2.09%     
==========================================
  Files          36       36              
  Lines        5153     5197      +44     
==========================================
- Hits         4374     4303      -71     
- Misses        779      894     +115     
Flag Coverage Δ
gpu-tests ?

Files Changed Coverage Δ
anndata/_core/anndata.py 83.60% <66.66%> (ø)
anndata/_core/file_backing.py 90.32% <87.50%> (-1.15%) :arrow_down:
anndata/_io/specs/methods.py 87.52% <90.00%> (-0.19%) :arrow_down:
anndata/_core/sparse_dataset.py 92.73% <93.15%> (+1.06%) :arrow_up:
anndata/_core/raw.py 79.28% <100.00%> (-4.29%) :arrow_down:
anndata/_io/h5ad.py 92.89% <100.00%> (ø)
anndata/_io/utils.py 76.47% <100.00%> (ø)
anndata/experimental/__init__.py 100.00% <100.00%> (ø)
anndata/experimental/merge.py 87.56% <100.00%> (ø)
anndata/experimental/multi_files/_anncollection.py 70.56% <100.00%> (ø)
... and 2 more

... and 3 files with indirect coverage changes

codecov[bot] avatar Apr 29 '22 15:04 codecov[bot]

@ivirshup It seems we settled on Friday on not exposing anything for read_zarr. Beyond that, I wasn't clear what the goal of assignment would be here: did you want to include it or not? I remember you saying that writing in backed mode probably wasn't so great anyway, so is the goal here to get rid of it?

ilan-gold avatar Mar 21 '23 13:03 ilan-gold

Initial draft of backed sparse array support for zarr.

  • [ ] I intend to export the sparse_dataset, CSRDataset, and CSCDataset classes from experimental.
  • [x] Get tests passing
  • [ ] Figure out if I'm going to expose any arguments through read_zarr

At the moment, the first and last are still open-ended, as is the question of setting the data. Otherwise, tests pass and you can convert this from a draft, IMO. Thanks again for getting this started!

ilan-gold avatar Jul 31 '23 19:07 ilan-gold

Decisions:

  1. don't expose arguments to read_zarr
  2. do export the classes from experimental.
  3. DeprecationWarning for sparse setting - eventually remove probably, but leave space for a new implementation.

ilan-gold avatar Aug 01 '23 13:08 ilan-gold
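
Decision 3 above (a DeprecationWarning on sparse setting) could look roughly like the sketch below. This is only an illustration of the warning pattern: `BackedSparseSketch` and its dict-based storage are hypothetical stand-ins, not anndata's actual classes or warning text.

```python
import warnings


class BackedSparseSketch:
    """Hypothetical stand-in for a backed sparse dataset (not anndata's class)."""

    def __init__(self, shape):
        self.shape = shape
        self._values = {}  # in-memory stand-in for on-disk storage

    def __getitem__(self, index):
        # Unset entries read back as zero, as in a sparse matrix.
        return self._values.get(index, 0.0)

    def __setitem__(self, index, value):
        # Warn on every assignment, but still perform it so that
        # existing code keeps working during the deprecation period.
        warnings.warn(
            "Setting values on a backed sparse dataset is deprecated "
            "and may be removed in a future release.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        self._values[index] = value
```

Note that Python's default warning filters hide DeprecationWarning unless it is triggered by code running in `__main__`, so a test suite would typically need to check for it explicitly.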

@ivirshup Where is the changelog? Is it the release-notes? And if so, which release?

ilan-gold avatar Aug 29 '23 11:08 ilan-gold

> Where is the changelog?

release-notes/0.10.0.md

ivirshup avatar Aug 29 '23 11:08 ivirshup

@ivirshup Have a look at the deprecation warning! Thanks!

ilan-gold avatar Sep 05 '23 09:09 ilan-gold

I'm not sure where I got the idea in my head that this branch was passing CI, but in any case, should we make zarr a dep of anndata now? It's only used for type checking right now in sparse_dataset.py.

The zarr import there is what is causing the failures.

ilan-gold avatar Sep 08 '23 08:09 ilan-gold

In anndata.compat there are dummy classes defined for this. Basically, replace zarr.Array and zarr.Group with anndata.compat.ZarrArray and anndata.compat.ZarrGroup.

ivirshup avatar Sep 08 '23 11:09 ivirshup
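
The dummy-class approach suggested above can be sketched as follows. This is a minimal illustration of the optional-dependency pattern, not a copy of what anndata.compat contains; the helper `is_zarr_backed` is a hypothetical name introduced here.

```python
try:
    # Use the real classes when zarr is installed.
    from zarr import Array as ZarrArray, Group as ZarrGroup
except ImportError:
    # Placeholder classes: no real data is ever an instance of them,
    # so isinstance checks simply return False when zarr is missing.
    class ZarrArray:
        def __repr__(self):
            return "mock zarr.Array"

    class ZarrGroup:
        def __repr__(self):
            return "mock zarr.Group"


def is_zarr_backed(obj) -> bool:
    """Hypothetical helper: True if obj is a zarr array or group."""
    return isinstance(obj, (ZarrArray, ZarrGroup))
```

With this in place, type checks and annotations can refer to `ZarrArray` and `ZarrGroup` without making zarr a hard dependency.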