ValueError: DataFrame constructor not properly called!

Open fcgportal opened this issue 4 years ago • 1 comments

Hi there,

Thank you for providing such a great tool for scRNA-seq analysis. When I run the GSM3689776 data, there is an error and the run cannot finish. The output is:

Traceback (most recent call last):
  File "/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/.local/lib/python3.7/site-packages/alona/main.py", line 9, in <module>
    main()
  File "/.local/lib/python3.7/site-packages/alona/main.py", line 5, in main
    run(prog_name='alona.py')
  File "/.local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/.local/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/.local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/.local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/.local/lib/python3.7/site-packages/alona/alona.py", line 187, in run
    alonacell.analysis()
  File "/.local/lib/python3.7/site-packages/alona/cell.py", line 439, in analysis
    self.CTA_RANK_F(marker_plot=True)
  File "/.local/lib/python3.7/site-packages/alona/celltypes.py", line 177, in CTA_RANK_F
    _df = pd.DataFrame(k)
  File "/.local/lib/python3.7/site-packages/pandas/core/frame.py", line 528, in __init__
    raise ValueError("DataFrame constructor not properly called!")
ValueError: DataFrame constructor not properly called!

fcgportal avatar Jul 30 '20 21:07 fcgportal

I found the cause of this. You have to change a couple of things, so check out the fork of the repo I made here: https://github.com/deontaepharr/alona/

The error arises because the nested function, _guess_cell_type(x), does not return a list of dicts. So, in the celltypes.py file, use

_df = pd.DataFrame(ret[k].to_list())

instead of

_df = pd.DataFrame(k)

Also, if you're working with mouse data, you will have to change the regex in the get_gene_symbols() function (I used the sample data in the README, which is how I found the issue). Use

if data_norm.index.str.match(r'^ENS(G|MUS(G){0,1})\d+$').any():

instead of

if data_norm.index.str.match('^ENS(G|MUS)[0-9]+$').any():
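As a quick check of the two patterns, the sketch below runs them through Python's `re` module instead of pandas. Both anchor at the start of the string, so the patterns behave the same way. The IDs are examples in the Ensembl gene ID format, chosen for illustration.

```python
import re

# Original pattern: after "ENSMUS" it expects digits immediately,
# so the "G" in mouse gene IDs makes the match fail.
OLD_PATTERN = re.compile(r'^ENS(G|MUS)[0-9]+$')

# Proposed pattern: allows an optional "G" after "ENSMUS".
NEW_PATTERN = re.compile(r'^ENS(G|MUS(G){0,1})\d+$')

# Illustrative inputs: a human-style ID, a mouse-style ID, a gene symbol.
ids = ["ENSG00000141510", "ENSMUSG00000059552", "Trp53"]

old_hits = [bool(OLD_PATTERN.match(i)) for i in ids]
new_hits = [bool(NEW_PATTERN.match(i)) for i in ids]

for gene_id, old, new in zip(ids, old_hits, new_hits):
    print(f"{gene_id}: old={old} new={new}")
```

Only the mouse-style ID changes outcome between the two patterns. The human-style ID matches both, and the plain gene symbol matches neither.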

Again, I made a fork of the repo that you can pull the solution from until @oscar-franzen makes a fix.

deontaepharr avatar Oct 11 '20 22:10 deontaepharr