
storing dict in .h5ad should raise warning when keys include '/'

Open MxMstrmn opened this issue 4 years ago • 2 comments

I was not aware of how .h5ad files store dicts, and it took me quite a while to figure out why my stored adata.uns['key'] differed from the original adata.uns['key']. Part of the problem was the size of the dataset, which made it difficult to examine potential failure cases.

Eventually, I figured out that some molecular descriptors include the sequence '(+/-)', which caused anndata to store the dict in a nested structure. I would suggest checking whether keys include '/' and raising a warning so that the user is aware that the stored dictionary will not be identical to the original one.

Minimal code example that clarified the problem for me:

import scanpy as sc
from anndata import AnnData

# Store a dict in .uns where one key contains '/'
adata = AnnData()
bucket_list = {
    'remember': 'trivia',
    'forget/whatIwantedtoremember': 42,
}
adata.uns['bucket_list'] = bucket_list
sc.write('adata_test.h5ad', adata)

# Read the file back: the key containing '/' comes back as a nested dict
adata = sc.read('adata_test.h5ad')
adata.uns['bucket_list']

 {'forget': {'whatIwantedtoremember': 42}, 'remember': 'trivia'}
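
The check suggested above could look roughly like the following. This is only a sketch in plain Python: `find_slash_keys` and `warn_on_slash_keys` are hypothetical helper names, not part of the anndata API, and the sketch assumes the dict is inspected before it is written.

```python
import warnings
from collections.abc import Mapping


def find_slash_keys(d, path=()):
    """Return the key paths (as tuples) of all string keys that contain '/'."""
    hits = []
    for key, value in d.items():
        here = path + (key,)
        if isinstance(key, str) and "/" in key:
            hits.append(here)
        # Descend into nested dicts so deeper keys are checked as well
        if isinstance(value, Mapping):
            hits.extend(find_slash_keys(value, here))
    return hits


def warn_on_slash_keys(d, name="uns"):
    """Emit one UserWarning per offending key and return the key paths found."""
    hits = find_slash_keys(d)
    for hit in hits:
        warnings.warn(
            f"Key {hit[-1]!r} in {name} contains '/'; it may be read back "
            "from .h5ad as a nested dict.",
            UserWarning,
            stacklevel=2,
        )
    return hits


bucket_list = {
    "remember": "trivia",
    "forget/whatIwantedtoremember": 42,
}
print(find_slash_keys({"bucket_list": bucket_list}))
# -> [('bucket_list', 'forget/whatIwantedtoremember')]
```

Calling `warn_on_slash_keys(adata.uns)` right before writing would flag such keys without changing what gets stored.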

MxMstrmn avatar Aug 09 '21 09:08 MxMstrmn

It would be better to fix this rather than just raise a warning, possibly by escaping and unescaping '/' in dict keys on save and load, but maybe there's a better solution.
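
A rough sketch of what such escaping could look like, in plain Python. The helper names (`escape_keys`, `unescape_keys`) and the percent-style escape scheme are assumptions made for illustration; this is not how anndata currently behaves. The '%' character is escaped as well so that a key which already contains '%2F' still round-trips.

```python
def _escape(key):
    # Escape '%' first so that a literal '%2F' in a key stays distinguishable
    return key.replace("%", "%25").replace("/", "%2F")


def _unescape(key):
    # Reverse order of _escape: restore '/' first, then '%'
    return key.replace("%2F", "/").replace("%25", "%")


def _map_keys(d, fn):
    """Apply fn to every string key, recursing into nested dicts."""
    out = {}
    for key, value in d.items():
        new_key = fn(key) if isinstance(key, str) else key
        out[new_key] = _map_keys(value, fn) if isinstance(value, dict) else value
    return out


def escape_keys(d):
    """Return a copy of d with '/' in keys replaced, intended for use before saving."""
    return _map_keys(d, _escape)


def unescape_keys(d):
    """Return a copy of d with the original keys restored, intended for use after loading."""
    return _map_keys(d, _unescape)


bucket_list = {
    "remember": "trivia",
    "forget/whatIwantedtoremember": 42,
}
escaped = escape_keys(bucket_list)
print(escaped)
# -> {'remember': 'trivia', 'forget%2FwhatIwantedtoremember': 42}
assert unescape_keys(escaped) == bucket_list
```

Until something like this exists in the library, a user could apply `escape_keys` to the dict before assigning it to adata.uns and `unescape_keys` after reading, at the cost of less readable keys in the file.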

grst avatar May 31 '23 05:05 grst

This issue has been automatically marked as stale because it has not had recent activity. Please add a comment if you want to keep the issue open. Thank you for your contributions!

github-actions[bot] avatar Aug 26 '25 02:08 github-actions[bot]