
Guidance for using Nanostring CosMx RNA input

Open markdane opened this issue 3 years ago • 1 comments

Hello, I appreciate your work in making CellTypist available. I have been able to use the Python API to assign predicted_labels and majority_voting types to our data, but I get the warning message below when running celltypist.annotate:

⚠️ Warning: the input file seems not a raw count matrix. The prediction result may not be accurate

The CosMx data contains counts for 960 genes and 20 negative probes. It is sparse, with an average of 250 unique genes per cell. I processed the data by normalizing each cell to a target of 10,000 counts and then computing the log1p values. The input file is attached. I have also tried using an AnnData object as input, but this throws an error instead of just a warning.
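For reference, the preprocessing described above could be sketched roughly as follows with plain NumPy. This is only a minimal illustration on a made-up toy matrix, not the attached CosMx data; the function name `normalize_log1p`, the zero-total guard, and the toy counts are all hypothetical.

```python
import numpy as np

def normalize_log1p(counts, target_sum=10_000):
    """Scale each cell (row) to `target_sum` total counts, then apply log1p."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    # Hypothetical guard: leave all-zero cells as zeros instead of dividing by 0.
    totals[totals == 0] = 1.0
    return np.log1p(counts / totals * target_sum)

# Toy cells-by-genes raw count matrix (made-up values).
raw = np.array([[3, 0, 1, 0],
                [0, 2, 0, 6]])
norm = normalize_log1p(raw)
```

Reversing the log with `np.expm1(norm)` and summing per cell should give back roughly 10,000 for every non-empty cell, which is a quick way to confirm the matrix is in this normalized form.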

Can you comment on whether this data is a good match for CellTypist and if I have it in the best or correct format?

-Mark Dane

CT_sample_file_values.csv

markdane avatar Oct 26 '22 21:10 markdane

@markdane, if you want to use a CSV file as the input, a raw count matrix (containing only integers) is required. If you use an AnnData object as input, the data need to be normalized as you have already done. So in your case, just input a CSV file with raw counts and no normalization. Also, because your data is sparse, many informative genes in the model may go unused with your data.
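To tell the two formats apart before calling celltypist.annotate, something like the sketch below may help. It is not CellTypist's internal validation, only an illustration of the two expectations described above; the helper names `looks_like_raw_counts` and `looks_like_log1p_10k`, the tolerance, and the toy matrix are assumptions.

```python
import numpy as np

def looks_like_raw_counts(x):
    """True if every value is a non-negative integer (raw count matrix)."""
    x = np.asarray(x, dtype=float)
    return bool(np.all(x >= 0) and np.all(x == np.round(x)))

def looks_like_log1p_10k(x, target_sum=10_000, tol=1.0):
    """True if undoing log1p gives about `target_sum` counts in every cell."""
    x = np.asarray(x, dtype=float)
    totals = np.expm1(x).sum(axis=1)
    return bool(np.all(np.abs(totals - target_sum) < tol))

# Toy cells-by-genes matrix (made-up values) and its normalized counterpart.
raw = np.array([[3, 0, 1, 0],
                [0, 2, 0, 6]])
norm = np.log1p(raw / raw.sum(axis=1, keepdims=True) * 10_000)

print(looks_like_raw_counts(raw), looks_like_log1p_10k(raw))
print(looks_like_raw_counts(norm), looks_like_log1p_10k(norm))
```

Under the guidance above, a matrix passing the first check is what a CSV input should contain, and a matrix passing the second is what an AnnData input should contain.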

ChuanXu1 avatar Oct 27 '22 11:10 ChuanXu1

This issue should be solved. Please reopen it if you still have problems :).

ChuanXu1 avatar Dec 12 '22 23:12 ChuanXu1