TypeError: Object dtype dtype('O') has no native HDF5 equivalent
Hi there,
I am running into this error when I try to write an h5ad file from my AnnData object. I downloaded a dataset from the Allen Brain Atlas [1] and loaded it using this code:
import anndata
import pandas as pd

M1_matrix = pd.read_csv('/path/matrix.csv', index_col=0)
M1_rows = pd.read_csv('/path/human_MTG_2018-06-14_genes-rows.csv')
M1_rows.index = M1_rows['gene']
M1_columns = pd.read_csv('/path/Human_M1_data/metadata.csv')
M1_columns.index = M1_columns['sample_name']
adata = anndata.AnnData(X=M1_matrix.to_numpy(), obs=M1_columns, var=M1_rows)
And then I run the following to try and convert objects to strings:
adata.obs.columns = adata.obs.columns.astype(str)
adata.var.columns = adata.var.columns.astype(str)
adata.var=adata.var.convert_dtypes()
adata.obs=adata.obs.convert_dtypes()
And then when I tried to write it with:
adata.write('path/M1.h5ad')
Then I got the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/anndata/_io/utils.py in func_wrapper(elem, key, val, *args, **kwargs)
208 try:
--> 209 return func(elem, key, val, *args, **kwargs)
210 except Exception as e:
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/anndata/_io/h5ad.py in write_array(f, key, value, dataset_kwargs)
184 value = _to_hdf5_vlen_strings(value)
--> 185 f.create_dataset(key, data=value, **dataset_kwargs)
186
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/h5py/_hl/group.py in create_dataset(self, name, shape, dtype, data, **kwds)
148
--> 149 dsid = dataset.make_new_dset(group, shape, dtype, data, name, **kwds)
150 dset = dataset.Dataset(dsid)
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/h5py/_hl/dataset.py in make_new_dset(parent, shape, dtype, data, name, chunks, compression, shuffle, fletcher32, maxshape, compression_opts, fillvalue, scaleoffset, track_times, external, track_order, dcpl, allow_unknown_filter)
88 dtype = numpy.dtype(dtype)
---> 89 tid = h5t.py_create(dtype, logical=1)
90
h5py/h5t.pyx in h5py.h5t.py_create()
h5py/h5t.pyx in h5py.h5t.py_create()
h5py/h5t.pyx in h5py.h5t.py_create()
TypeError: Object dtype dtype('O') has no native HDF5 equivalent
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/anndata/_io/utils.py in func_wrapper(elem, key, val, *args, **kwargs)
208 try:
--> 209 return func(elem, key, val, *args, **kwargs)
210 except Exception as e:
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/anndata/_io/h5ad.py in write_series(group, key, series, dataset_kwargs)
288 else:
--> 289 write_array(group, key, series.values, dataset_kwargs=dataset_kwargs)
290
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/anndata/_io/utils.py in func_wrapper(elem, key, val, *args, **kwargs)
211 parent = _get_parent(elem)
--> 212 raise type(e)(
213 f"{e}\n\n"
TypeError: Object dtype dtype('O') has no native HDF5 equivalent
Above error raised while writing key 'cluster_order' of <class 'h5py._hl.group.Group'> from /.
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/anndata/_io/utils.py in func_wrapper(elem, key, val, *args, **kwargs)
208 try:
--> 209 return func(elem, key, val, *args, **kwargs)
210 except Exception as e:
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/anndata/_io/h5ad.py in write_dataframe(f, key, df, dataset_kwargs)
262 for col_name, (_, series) in zip(col_names, df.items()):
--> 263 write_series(group, col_name, series, dataset_kwargs=dataset_kwargs)
264
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/anndata/_io/utils.py in func_wrapper(elem, key, val, *args, **kwargs)
211 parent = _get_parent(elem)
--> 212 raise type(e)(
213 f"{e}\n\n"
TypeError: Object dtype dtype('O') has no native HDF5 equivalent
Above error raised while writing key 'cluster_order' of <class 'h5py._hl.group.Group'> from /.
Above error raised while writing key 'cluster_order' of <class 'h5py._hl.group.Group'> from /.
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
/tmp/ipykernel_29925/3214377814.py in <module>
----> 1 adata.write('/wynton/group/pollen/arnar/Scanpy/Scanpy/data/Human_M1_data/Human_M1_data.h5ad')
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/anndata/_core/anndata.py in write_h5ad(self, filename, compression, compression_opts, force_dense, as_dense)
1903 filename = self.filename
1904
-> 1905 _write_h5ad(
1906 Path(filename),
1907 self,
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/anndata/_io/h5ad.py in write_h5ad(filepath, adata, force_dense, as_dense, dataset_kwargs, **kwargs)
109 else:
110 write_attribute(f, "raw", adata.raw, dataset_kwargs=dataset_kwargs)
--> 111 write_attribute(f, "obs", adata.obs, dataset_kwargs=dataset_kwargs)
112 write_attribute(f, "var", adata.var, dataset_kwargs=dataset_kwargs)
113 write_attribute(f, "obsm", adata.obsm, dataset_kwargs=dataset_kwargs)
~/utils/miniconda3/envs/scanpy/lib/python3.9/functools.py in wrapper(*args, **kw)
875 '1 positional argument')
876
--> 877 return dispatch(args[0].__class__)(*args, **kw)
878
879 funcname = getattr(func, '__name__', 'singledispatch function')
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/anndata/_io/h5ad.py in write_attribute_h5ad(f, key, value, *args, **kwargs)
128 if key in f:
129 del f[key]
--> 130 _write_method(type(value))(f, key, value, *args, **kwargs)
131
132
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/anndata/_io/utils.py in func_wrapper(elem, key, val, *args, **kwargs)
210 except Exception as e:
211 parent = _get_parent(elem)
--> 212 raise type(e)(
213 f"{e}\n\n"
214 f"Above error raised while writing key {key!r} of {type(elem)}"
TypeError: Object dtype dtype('O') has no native HDF5 equivalent
Above error raised while writing key 'cluster_order' of <class 'h5py._hl.group.Group'> from /.
Above error raised while writing key 'cluster_order' of <class 'h5py._hl.group.Group'> from /.
Above error raised while writing key 'obs' of <class 'h5py._hl.files.File'> from /.
Thank you so much for your help!
Dataset [1] https://portal.brain-map.org/atlases-and-data/rnaseq/human-m1-10x
Just FYI, here is the output from logging.print_versions():
anndata 0.7.6
scanpy 1.8.1
sinfo 0.3.4
-----
PIL 8.4.0
beta_ufunc NA
binom_ufunc NA
bottleneck 1.3.2
cycler 0.10.0
cython_runtime NA
dateutil 2.8.2
h5py 3.5.0
igraph 0.9.7
joblib 1.0.1
kiwisolver 1.3.1
leidenalg 0.8.8
llvmlite 0.37.0
matplotlib 3.4.3
mkl 2.4.0
mpl_toolkits NA
natsort 7.1.1
nbinom_ufunc NA
numba 0.54.1
numexpr 2.7.3
numpy 1.20.1
packaging 21.0
pandas 1.3.3
pkg_resources NA
pyexpat NA
pyparsing 2.4.7
pytz 2021.3
scipy 1.7.1
six 1.16.0
sklearn 1.0.1
tables 3.6.1
texttable 1.6.4
threadpoolctl 2.2.0
wcwidth 0.2.5
-----
Python 3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0]
Linux-3.10.0-1160.36.2.el7.x86_64-x86_64-with-glibc2.17
32 logical CPU cores, x86_64
Hey!
You can check which columns are causing this issue by running adata.obs.dtypes and adata.var.dtypes and finding the columns whose dtype is reported as object. You can cast those to integers or strings using adata.obs[col_name] = adata.obs[col_name].astype(int) or .astype(str).
I think this is related to #504, but it is a bit different because I think the column giving the error is pd.Int64Dtype but doesn't have any null values. We could either:
- Check these don't have null values, convert to np.int64, write these, and call it a bugfix
- Write them as nullable integers once #504 is implemented
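As a user-side workaround in the spirit of the first option, something like the following might work. This is only a sketch: downcast_nullable_ints is a hypothetical helper, not part of anndata, and the example frame is made up.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for adata.obs with nullable-integer columns.
obs = pd.DataFrame(
    {
        "cluster_order": pd.array([1, 2, 3], dtype="Int64"),
        "with_missing": pd.array([1, None, 3], dtype="Int64"),
    }
)


def downcast_nullable_ints(df):
    """Return a copy in which nullable-integer columns that contain
    no missing values are converted to plain np.int64."""
    out = df.copy()
    for col in out.columns:
        dtype = out[col].dtype
        is_nullable_int = isinstance(
            dtype, pd.api.extensions.ExtensionDtype
        ) and pd.api.types.is_integer_dtype(dtype)
        if is_nullable_int and not out[col].isna().any():
            out[col] = out[col].astype(np.int64)
    return out


fixed = downcast_nullable_ints(obs)
print(fixed.dtypes)
```

Columns that do contain missing values are left as they are, so they would still need separate handling (for example, filling the gaps or waiting for the nullable-integer support discussed in #504).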
@AB1995UCSF, you should be fine to write this if you just don't call .convert_dtypes(). E.g. just:
adata = anndata.AnnData(
...,
obs=pd.read_csv('/path/Human_M1_data/metadata.csv').set_index("sample_name"),
...,
)
This issue has been automatically marked as stale because it has not had recent activity. Please add a comment if you want to keep the issue open. Thank you for your contributions!
@ivirshup any idea what solution we want to go with?