zellkonverter
`readH5AD(..., reader="R")` fails with recent AnnData formats?
cellxgene provides H5AD files for each data set. A recent download of this one (sorry, there is no direct URL; click on the cloud download button) has content like the following (from `rhdf5::h5ls()`):
10 /obs assay_ontology_term_id H5I_GROUP
11 /obs/assay_ontology_term_id categories H5I_DATASET STRING 1
12 /obs/assay_ontology_term_id codes H5I_DATASET INTEGER 46500
whereas older downloads have
5 /obs __categories H5I_GROUP
6 /obs/__categories assay H5I_DATASET STRING 1
7 /obs/__categories assay_ontology_term_id H5I_DATASET STRING 1
8 /obs/__categories author_cell_type H5I_DATASET STRING 30
9 /obs/__categories cell_type H5I_DATASET STRING 28
I guess this is a change in the AnnData on-disk representation? This causes `h5ad <- readH5AD(local_file, reader = "R", use_hdf5 = TRUE)` to fail (the error is translated to a warning; the net result is that no colData is added to the SummarizedExperiment):
Warning message:
In value[[3L]](cond) : setting 'colData' failed for
'/Users/ma38727/Library/Caches/org.R-project.R/R/cellxgenedp/f69ba4b3-fc45-483c-8a7c-434fd056aeed.H5AD':
cannot coerce class "list" to a DataFrame
Will the R-based reader be updated, or is the best strategy to switch to the python reader?
I haven't looked into it, but I'm guessing this file uses the AnnData v0.8 format. At the moment the safest and most reliable approach is to use the Python reader. The R reader is currently neglected and needs a fair bit of work, but that won't happen before the next release.
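For anyone who needs a single column in the meantime, here is a minimal sketch of how a `categories`/`codes` pair like the one in the newer listing could be turned back into an R factor. It is not how zellkonverter does it: `decode_categorical` is a hypothetical helper, the input values are made up, and it assumes that `codes` are zero-based indices into `categories` with `-1` marking a missing value.

```r
# Hypothetical helper (not part of zellkonverter): rebuild a factor from a
# categories/codes pair.
# Assumption: codes are zero-based indices into categories, and -1 marks a
# missing value.
decode_categorical <- function(categories, codes) {
  idx <- as.integer(codes) + 1L   # shift to R's one-based indexing
  idx[idx < 1L] <- NA_integer_    # a code of -1 becomes NA
  factor(categories[idx], levels = categories)
}

# Made-up values for illustration only
categories <- c("type_a", "type_b")
codes <- c(0L, 1L, 1L, -1L, 0L)
decoded <- decode_categorical(categories, codes)
```

With a real file, the two vectors would come from reading the `categories` and `codes` datasets under the relevant `/obs` group (for example with rhdf5), and the resulting factors could then be assembled into the colData by hand.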