Using selected cell type markers doesn't improve accuracy
Hi, thank you for your great work! I've run into a few questions while using BayesPrism and would like your advice. I have thousands of bulk RNA-seq samples and two scRNA-seq samples as a reference. To evaluate the estimated TME composition, we also measured the proportions of CD8+ T and CD4+ T cells in ~30 samples by immunostaining and used these as the gold standard.
-
I compared three gene sets: all genes, markers from the Seurat function FindAllMarkers(), and a subset of markers that are expressed at nearly 0 in other cell types. Using all genes seems to perform better than the other two, which is strange. I am confused about why more informative genes don't help. Here are the correlation and RMSE for the different gene sets.
-
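For clarity on how the two metrics can be computed, here is a minimal sketch in base R. The numbers are toy values, not my real data, and `eval_fraction` is a hypothetical helper name; the estimated fractions would come from the deconvolution output and the gold standard from immunostaining.

```r
# Toy values only (hypothetical): replace with real immunostaining fractions
# and the fractions estimated for the same samples.
gold          <- c(0.10, 0.20, 0.15, 0.30, 0.25)
est_all_genes <- c(0.12, 0.18, 0.17, 0.27, 0.26)

# Hypothetical helper: Pearson correlation and RMSE between two vectors
# of per-sample cell type fractions.
eval_fraction <- function(estimated, truth) {
  stopifnot(length(estimated) == length(truth))
  c(pearson = cor(estimated, truth, method = "pearson"),
    rmse    = sqrt(mean((estimated - truth)^2)))
}

res <- eval_fraction(est_all_genes, gold)
print(res)
```

The same helper can be applied once per gene set (all genes, FindAllMarkers() markers, strict markers) so the three results are directly comparable.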
I noticed that when I use selected genes, BayesPrism computes the expression matrix only for the chosen genes. Is there a parameter that lets me estimate cell proportions from the selected genes while still obtaining the full expression matrix?
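To make "markers expressed at nearly 0 in other cell types" concrete, here is a rough sketch of how such a strict subset could be defined in base R. This runs on a toy matrix, not my exact pipeline; `strict_markers` and the thresholds `on_min` and `off_max` are hypothetical names and values chosen for illustration.

```r
# Toy genes-by-cells matrix (hypothetical values): 3 genes x 6 cells.
expr <- matrix(
  c(5, 6, 0, 0, 0, 0,    # geneA: expressed only in "T" cells
    0, 0, 4, 5, 0, 0,    # geneB: expressed only in "B" cells
    3, 2, 3, 4, 2, 3),   # geneC: expressed everywhere
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("cell", 1:6))
)
labels <- c("T", "T", "B", "B", "NK", "NK")

# Mean expression of each gene within each cell type (genes x types).
type_mean <- sapply(split(seq_along(labels), labels),
                    function(idx) rowMeans(expr[, idx, drop = FALSE]))

# Hypothetical helper: keep genes that are "on" in one cell type
# (mean >= on_min) and nearly 0 in every other type (max mean <= off_max).
strict_markers <- function(type_mean, on_min = 1, off_max = 0.1) {
  out <- lapply(colnames(type_mean), function(ct) {
    others <- type_mean[, colnames(type_mean) != ct, drop = FALSE]
    on  <- type_mean[, ct] >= on_min
    off <- apply(others, 1, max) <= off_max
    rownames(type_mean)[on & off]
  })
  setNames(out, colnames(type_mean))
}

markers <- strict_markers(type_mean)
print(markers)
```

A filter like this can leave some cell types with very few or no markers (the toy "NK" type has none), which may be worth checking before comparing gene sets.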
The cell type labels and cell counts are shown below:
`
table(sc$cell.type.label)
        B cell    Cancer cell    Cd4+ T cell    Cd8+ T cell            DCs       DNT cell     Epithelial         M-MDSC
          2057           2062           1921           3590            464            174             28           2153
Macrophage(AM) Macrophage(MM)        NK cell       PMN-MDSC
          1072           1005            770           3436
`
The correlation plot (corr.plot) of the cell types: type_cor_phi.pdf
I have observed the same trends in some datasets and would love to know the answer/opinions of the developers.