
Writing H5AD fails when `uns` contains a list with dictionary items

Open lazappi opened this issue 3 years ago • 2 comments

Originally noticed this with errors in {zellkonverter} (see https://github.com/theislab/zellkonverter/issues/59) but can confirm it happens with straight Python.

import anndata
import numpy as np

X = np.random.random((100, 1000))
adata = anndata.AnnData(X)
adata.uns["my_list"] = [{"X" : 1, "Y" : 2}]
adata.write_h5ad("test.h5ad")
Error message
Traceback (most recent call last):
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/anndata/_io/utils.py", line 209, in func_wrapper
    return func(elem, key, val, *args, **kwargs)
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/anndata/_io/h5ad.py", line 185, in write_array
    f.create_dataset(key, data=value, **dataset_kwargs)
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/h5py/_hl/group.py", line 149, in create_dataset
    dsid = dataset.make_new_dset(group, shape, dtype, data, name, **kwds)
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/h5py/_hl/dataset.py", line 145, in make_new_dset
    dset_id.write(h5s.ALL, h5s.ALL, data)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 232, in h5py.h5d.DatasetID.write
  File "h5py/_proxy.pyx", line 145, in h5py._proxy.dset_rw
  File "h5py/_conv.pyx", line 444, in h5py._conv.str2vlen
  File "h5py/_conv.pyx", line 95, in h5py._conv.generic_converter
  File "h5py/_conv.pyx", line 249, in h5py._conv.conv_str2vlen
TypeError: Can't implicitly convert non-string objects to strings

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/anndata/_io/utils.py", line 209, in func_wrapper
    return func(elem, key, val, *args, **kwargs)
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/anndata/_io/h5ad.py", line 161, in write_list
    write_array(f, key, np.array(value), dataset_kwargs=dataset_kwargs)
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/anndata/_io/utils.py", line 212, in func_wrapper
    raise type(e)(
TypeError: Can't implicitly convert non-string objects to strings

Above error raised while writing key 'uns/my_list' of <class 'h5py._hl.files.File'> from /.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/anndata/_core/anndata.py", line 1912, in write_h5ad
    _write_h5ad(
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/anndata/_io/h5ad.py", line 118, in write_h5ad
    write_attribute(f, "uns", adata.uns, dataset_kwargs=dataset_kwargs)
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/functools.py", line 875, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/anndata/_io/h5ad.py", line 130, in write_attribute_h5ad
    _write_method(type(value))(f, key, value, *args, **kwargs)
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/anndata/_io/h5ad.py", line 294, in write_mapping
    write_attribute(f, f"{key}/{sub_key}", sub_value, dataset_kwargs=dataset_kwargs)
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/functools.py", line 875, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/anndata/_io/h5ad.py", line 130, in write_attribute_h5ad
    _write_method(type(value))(f, key, value, *args, **kwargs)
  File "/Users/luke.zappia/miniconda/envs/anndata/lib/python3.8/site-packages/anndata/_io/utils.py", line 212, in func_wrapper
    raise type(e)(
TypeError: Can't implicitly convert non-string objects to strings

Above error raised while writing key 'uns/my_list' of <class 'h5py._hl.files.File'> from /.

Above error raised while writing key 'uns/my_list' of <class 'h5py._hl.files.File'> from /.

Versions

  • python==3.8.0
  • anndata==0.7.8

lazappi, Feb 14 '22 09:02

This has been reported before (#493), and is currently unsupported.

It could be a reasonable feature request. Would need use cases and potential implementation ideas though.

ivirshup, Feb 14 '22 10:02

Hmmm...ok. I don't really have a use case other than trying to avoid errors. I do think it's pretty unexpected that something simple like this would fail though. I guess for {zellkonverter} I will look into whether there is an easy way to check for this before it happens.
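
A check of this kind could be sketched as follows. This is only an illustration, not code from {zellkonverter} or anndata: `find_dict_lists` is a hypothetical helper name, and it only looks for the specific pattern reported in this issue (a list or tuple holding at least one dictionary). Other values that fail to write may exist and would not be caught.

```python
from collections.abc import Mapping


def find_dict_lists(value, path="uns"):
    """Return key paths of lists (or tuples) that hold at least one mapping.

    Hypothetical helper: it only detects the pattern from this issue.
    """
    found = []
    if isinstance(value, Mapping):
        # Walk into nested dictionaries, extending the key path as we go.
        for key, sub_value in value.items():
            found.extend(find_dict_lists(sub_value, f"{path}/{key}"))
    elif isinstance(value, (list, tuple)):
        if any(isinstance(item, Mapping) for item in value):
            found.append(path)
        # Lists can themselves hide nested lists of dictionaries.
        for idx, item in enumerate(value):
            found.extend(find_dict_lists(item, f"{path}[{idx}]"))
    return found


# A plain dict stands in for adata.uns so the sketch runs without anndata.
uns = {"my_list": [{"X": 1, "Y": 2}], "params": {"n": 10, "names": ["a", "b"]}}
problems = find_dict_lists(uns)
print(problems)
```

Calling this on `adata.uns` before `write_h5ad` would make it possible to warn about, or convert, the offending entries instead of failing partway through the write.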

lazappi, Feb 14 '22 10:02

This issue has been automatically marked as stale because it has not had recent activity. Please add a comment if you want to keep the issue open. Thank you for your contributions!

github-actions[bot], Jun 21 '23 02:06

Closing for now. If a use case comes up, we can reconsider.

flying-sheep, Jun 22 '23 08:06

Ran into this issue today. My use-case is "adding arbitrary metadata to an h5ad file, so that downstream users have it to work with." The specific metadata we're adding is hierarchical, and is naturally represented this way.

My current workaround is to instead recursively convert all lists into dictionaries with dummy keys before storing. (Also considered: store the output of json.dumps(data)).
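
For comparison, the `json.dumps` alternative mentioned above could look roughly like this. It is a minimal sketch: the helper names `to_json_string` and `from_json_string` are hypothetical, and it assumes the metadata is JSON-serializable (plain dicts, lists, strings, numbers, booleans and `None`).

```python
import json


def to_json_string(data):
    """Encode nested metadata as one JSON string (hypothetical helper)."""
    return json.dumps(data)


def from_json_string(text):
    """Decode the stored string; str() guards against string subclasses."""
    return json.loads(str(text))


metadata = [{"X": 1, "Y": 2}]
encoded = to_json_string(metadata)  # a plain string instead of a list of dicts
decoded = from_json_string(encoded)
print(encoded)
```

The string would be stored with something like `adata.uns["my_list"] = to_json_string(metadata)`. The trade-off is that downstream users have to know the value is JSON and decode it themselves.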

Snippet, if it's useful for anyone else:

def lists_to_dicts(data):
    # Recursively converts lists to dicts
    # Required because of this issue https://github.com/scverse/anndata/issues/708
    if isinstance(data, list):
        return {
            f"_{idx}": lists_to_dicts(elem)
            for idx, elem in enumerate(data)
        }
    if isinstance(data, dict):
        for key in list(data.keys()):
            data[key] = lists_to_dicts(data[key])
    return data
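
Reading the data back means undoing that conversion. The following is a minimal sketch and not part of the original snippet: `dicts_to_lists` is a hypothetical name, and it assumes every dict whose keys are exactly `_0`, `_1`, ... was produced by `lists_to_dicts`. Empty lists cannot be recovered this way, since they are stored as empty dicts, and values read back from an H5AD file may come back as NumPy types rather than plain Python ones.

```python
def dicts_to_lists(data):
    """Rebuild lists from dicts keyed "_0", "_1", ... (hypothetical inverse)."""
    if isinstance(data, dict):
        # Convert the children first so nested lists are restored too.
        converted = {key: dicts_to_lists(value) for key, value in data.items()}
        expected_keys = [f"_{idx}" for idx in range(len(converted))]
        # Only non-empty dicts with exactly the dummy keys are treated as lists.
        if converted and set(converted) == set(expected_keys):
            return [converted[key] for key in expected_keys]
        return converted
    return data


stored = {"samples": {"_0": {"X": 1}, "_1": {"X": 2}}, "name": "run"}
restored = dicts_to_lists(stored)
print(restored)
```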

jemma-nelson, Sep 24 '23 22:09