'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

Open rachel662 opened this issue 4 years ago • 16 comments

Hi there,

I've just tried this with an .h5ad counts file and I get this error. Is this something to do with my version of pandas? With a .txt file it works fine.

thanks Rachel

rachel662 avatar Feb 24 '21 13:02 rachel662

Hi @rachel662, could you please provide the command you used to launch cellphonedb? Additionally, it would be useful to know which versions of anndata/pandas/cellphonedb you're using. You can check that with: pip show cellphonedb pandas anndata
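The same version check can also be scripted from inside the virtual environment, which helps confirm that the interpreter running cellphonedb sees the packages you expect. This is a minimal sketch, assuming Python 3.8 or newer (where `importlib.metadata` is in the standard library); `installed_versions` is a hypothetical helper name, not part of CellPhoneDB.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Return {package name: version string, or None if not installed}."""
    found = {}
    for name in package_names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not visible to this interpreter.
            found[name] = None
    return found


if __name__ == "__main__":
    # Package names taken from this thread.
    for name, ver in installed_versions(
        ["cellphonedb", "pandas", "anndata", "h5py"]
    ).items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Run it with the same `python` that the virtual environment activates, so the reported versions match what cellphonedb actually imports.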

prete avatar Feb 24 '21 13:02 prete

hi there, thanks so much for your help - here are the versions i'm running

CellPhoneDB Version: 2.1.6

Name: pandas Version: 1.2.2

Name: anndata Version: 0.7.5

cheers Rachel

rachel662 avatar Feb 24 '21 14:02 rachel662

CellPhoneDB 2.1.6 pandas 1.2.2 anndata 0.7.5

That looks about right, could you please provide the command you used to launch cellphonedb?

prete avatar Feb 24 '21 14:02 prete

i did:
python -m venv cpdb
source cpdb/bin/activate
pip install cellphonedb
cellphonedb method analysis meta.txt adata.h5ad
(adata.h5ad is what i've called the counts file)

again, thanks for your help Rachel

rachel662 avatar Feb 24 '21 14:02 rachel662

It may be related to the encoding of your file. Could you try with this test meta and h5ad and see if you get the same error? test_meta_and_count_h5ad.zip

That will help us rule out dependency errors or similar issues.

prete avatar Feb 24 '21 20:02 prete

Hi there, I have tried this with the test counts h5ad and test meta file and I still get the same error

thanks Rachel

rachel662 avatar Feb 25 '21 11:02 rachel662

Hi @rachel662, what is your h5py version?

zktuong avatar Mar 05 '21 17:03 zktuong

Hi there, sorry it's taken me so long to get back to you, here's my h5py version

Name: h5py Version: 2.10.0

thanks so much! Rachel

rachel662 avatar Mar 18 '21 12:03 rachel662

Hi @rachel662 could you try upgrading to the beta version (pip install -U CellPhoneDB==2.1.8b3) and see if you're still facing this issue?

prete avatar May 04 '21 11:05 prete

Hi there I tried this, however it didn't work, I think this might be because I haven't written the metadata file as a .h5ad file though?

cheers Rachel

rachel662 avatar May 25 '21 08:05 rachel662

Hi @rachel662 meta should still be a .txt/csv/.tsv file. Is there any chance you could share your meta and counts files with us so we can have a better look at this issue?

prete avatar May 25 '21 08:05 prete

I tried this, however it didn't work, I think this might be because I haven't written the metadata file as a .h5ad file though?

Hi @rachel662 not sure what you meant by that. Did you eventually manage to get it working?

prete avatar Jun 04 '21 13:06 prete

Hi @prete, I am also getting the same error when I try to read in my h5 or h5ad files.

CellphoneDB works on the command line when I use the test meta and counts data, however when I try to use my own data, I get the error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

The command I try running when I get this error is: cellphonedb method statistical_analysis s_8944_meta_tab.txt s_8944_counts.h5ad

I get the same error when I also try running: cellphonedb method statistical_analysis s_8944_meta_tab.txt S_8944_filtered_feature_bc_matrix.h5

The metadata covers only a subset of the barcodes in the full counts data. I am only trying to observe interactions between two clusters, so the metadata contains only the barcodes in these two annotated clusters, while the counts data has all barcodes. I'm not sure if this matters.

I've also tried inputting a folder ("raw_feature_bc_matrix") that contains my barcodes.tsv, features.tsv, and matrix.mtx data, however when I try running: cellphonedb method statistical_analysis s_8944_meta_tab.txt /raw_feature_bc_matrix ...

... I get the error: [ ][APP][22/07/21-14:51:18][ERROR] Can not read /raw_feature_bc_matrix

Please help with any of these issues if possible.

Thank you

mibo1996 avatar Aug 11 '21 16:08 mibo1996

Hi @mibo1996, could you please confirm which versions of cellphonedb and anndata you are using (pip show cellphonedb anndata)?

I've tried reproducing this error but failed. Would you be able to share any of your input files with us for debugging?

prete avatar Aug 11 '21 16:08 prete

hi @prete here is the output:

WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
Name: CellPhoneDB
Version: 2.1.1
Summary: UNKNOWN
Home-page: https://cellphonedb.org
Author: TeichLab
Author-email: [email protected]
License: MIT
Location: /usr/local/lib/python3.7/site-packages
Requires: Flask-RESTful, Flask-Testing, SQLAlchemy, pandas, PyYAML, pika, flask, tqdm, boto3, geosketch, rpy2, click, requests
Required-by:

Yes I could share my input file with you

mibo1996 avatar Aug 11 '21 16:08 mibo1996

Thank you for the fast reply. I can see you're using v2.1.1, and h5ad support was introduced in version 2.1.6. I'd first recommend updating CellPhoneDB to the latest version (2.1.7) using pip install -U cellphonedb and then running your command again. If that also fails, we can have a look at your input file.
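For readers hitting the same traceback: HDF5 files (which is what .h5 and .h5ad files are) normally begin with the 8-byte signature `\x89HDF\r\n\x1a\n`, so any code that tries to read such a file as UTF-8 text fails on byte 0x89 at position 0. The sketch below only illustrates that mechanism; it is not how CellPhoneDB reads its inputs, and `looks_like_hdf5` and `utf8_error_for` are hypothetical helper names. It assumes the signature sits at offset 0, which is the usual case when the file has no user block.

```python
import os
import tempfile

# Standard HDF5 format signature (first 8 bytes of a typical HDF5 file).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file starts with the HDF5 signature."""
    with open(path, "rb") as fh:
        return fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


def utf8_error_for(path):
    """Return the UnicodeDecodeError message raised when the file is
    decoded as UTF-8, or None if it decodes cleanly."""
    with open(path, "rb") as fh:
        data = fh.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A stand-in file carrying only the signature, not a real h5ad.
        fake = os.path.join(tmp, "fake_counts.h5ad")
        with open(fake, "wb") as fh:
            fh.write(HDF5_SIGNATURE + b"\x00" * 8)
        print(looks_like_hdf5(fake))
        print(utf8_error_for(fake))
```

If `looks_like_hdf5` returns True for your counts file and you still see the UTF-8 error, the file is most likely being parsed as text, which is consistent with running a version that predates h5ad support.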

prete avatar Aug 11 '21 16:08 prete