scaden
scaden simulate errors out when celltype names are only numbers; a text prefix is required for it to run correctly.
I ran scaden simulate with celltype names being the Leiden cluster numbers. Got the following error message and the data.h5ad file was not created.
INFO Datasets: ['testdata_all_bat']    bulk_simulator.py:84
INFO Simulating data from testdata_all_bat    bulk_simulator.py:89
INFO Loading testdata_all_bat dataset ...    bulk_simulator.py:141
INFO Merging unknown cell types: ['unknown']    bulk_simulator.py:107
INFO Subsampling testdata_all_bat ...    bulk_simulator.py:110
/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_core/anndata.py:120: ImplicitModificationWarning: Transforming to str index.
  warnings.warn("Transforming to str index.", ImplicitModificationWarning)
... storing 'ds' as categorical
Traceback (most recent call last):
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/utils.py", line 209, in func_wrapper
    return func(elem, key, val, *args, **kwargs)
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/h5ad.py", line 247, in write_dataframe
    col_names = [check_key(c) for c in df.columns]
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/h5ad.py", line 247, in <listcomp>
    col_names = [check_key(c) for c in df.columns]
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/utils.py", line 109, in check_key
    raise TypeError(f"{key} of type {typ} is an invalid key. Should be str.")
TypeError: 0 of type <class 'int'> is an invalid key. Should be str.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ku_user/scadendl/bin/scaden", line 8, in <module>
    sys.exit(main())
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/scaden/main.py", line 48, in main
    cli()
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/scaden/main.py", line 215, in simulate
    fmt=data_format,
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/scaden/simulate.py", line 22, in simulation
    bulk_simulator.simulate()
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/scaden/simulation/bulk_simulator.py", line 90, in simulate
    self.simulate_dataset(dataset)
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/scaden/simulation/bulk_simulator.py", line 130, in simulate_dataset
    ann_data.write(os.path.join(self.out_dir, dataset + ".h5ad"))
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_core/anndata.py", line 1911, in write_h5ad
    as_dense=as_dense,
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/h5ad.py", line 111, in write_h5ad
    write_attribute(f, "obs", adata.obs, dataset_kwargs=dataset_kwargs)
  File "/usr/lib64/python3.6/functools.py", line 807, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/h5ad.py", line 130, in write_attribute_h5ad
    _write_method(type(value))(f, key, value, *args, **kwargs)
  File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/utils.py", line 216, in func_wrapper
    ) from e
TypeError: 0 of type <class 'int'> is an invalid key. Should be str.

Above error raised while writing key 'obs' of <class 'h5py._hl.files.File'> from /.
I then prepended "celltype_" to the Leiden cluster numbers (e.g. celltype_13) in the celltype file, and simulate then ran correctly, generating the data.h5ad file. I still get the following warning message though.
/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_core/anndata.py:120: ImplicitModificationWarning: Transforming to str index.
  warnings.warn("Transforming to str index.", ImplicitModificationWarning)
... storing 'ds' as categorical
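For anyone hitting the same error, the prefixing workaround described above could be scripted roughly as follows. This is only a sketch, not part of scaden: the function name `prefix_numeric_celltypes`, the file paths, and the assumption that the celltype file is tab-separated with a `Celltype` column are illustrative and should be adjusted to the actual input.

```python
import csv


def prefix_numeric_celltypes(in_path, out_path, column="Celltype", prefix="celltype_"):
    """Copy a tab-separated celltype file, prefixing labels that are only digits.

    Hypothetical helper: the column name and prefix are assumptions.
    Returns the list of labels as written to out_path.
    """
    with open(in_path, newline="") as src:
        reader = csv.DictReader(src, delimiter="\t")
        fieldnames = reader.fieldnames
        rows = list(reader)

    if not fieldnames or column not in fieldnames:
        raise ValueError(f"column {column!r} not found in {in_path}")

    for row in rows:
        label = (row[column] or "").strip()
        # Only purely numeric labels (e.g. Leiden cluster ids) are rewritten.
        if label.isdigit():
            row[column] = prefix + label

    with open(out_path, "w", newline="") as dst:
        writer = csv.DictWriter(
            dst, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)

    return [row[column] for row in rows]
```

Labels that already contain text (such as `unknown`) are left untouched, so the script can be rerun safely on a file that has already been fixed.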
Hi @nagendraKU,
thanks for reporting that. Yes, using only numbers can cause problems. I will try to catch that and issue a better warning.
You can ignore the ImplicitModificationWarning though; that shouldn't cause a problem.
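As a rough illustration of what such a check might look like (this is not scaden's actual code; the function names and the message text are hypothetical), celltype names could be validated before simulation so the user gets a clear message instead of the anndata TypeError. Treating digit-only strings as invalid is a conservative assumption, since they are likely to be parsed as integers when read back from a file.

```python
def find_invalid_celltype_names(names):
    """Return the names that are not strings or consist only of digits."""
    return [
        name
        for name in names
        if not isinstance(name, str) or name.strip().isdigit()
    ]


def check_celltype_names(names):
    """Raise a descriptive error if any celltype name is purely numeric."""
    invalid = find_invalid_celltype_names(names)
    if invalid:
        raise ValueError(
            "Celltype names must be text, but these are numeric: "
            f"{invalid}. Consider adding a prefix such as 'celltype_'."
        )
```

Calling `check_celltype_names([0, 13, "unknown"])` would raise a `ValueError` listing `0` and `13`, while a list of prefixed names passes silently.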