
How to assess the similarity between programs across samples

Open Dragonlongzhilin opened this issue 3 years ago • 1 comments

Hi cNMF team, thanks for developing a nice tool! I extracted programs from several samples and want to assess how similar the programs are across samples. I noticed there are two gene expression program matrices (Z-score and TPM). Which matrix is suitable for calculating the correlation between programs from different samples?

Dragonlongzhilin avatar Nov 14 '22 13:11 Dragonlongzhilin

Hi, thanks for your comment. Either the Z-score or the TPM spectra matrix can be used to compare programs by correlation. However, if you use the TPM matrix, you will want to variance-normalize it so that genes expressed at different scales contribute equally to the program correlations; the gene variances have already been output in the tpm_stats file. We also suggest using either the union or the intersection of the HVGs (highly variable genes) across samples for these comparisons, especially when using the variance-normalized TPM spectra. The TPM spectra matrix tends to have more baseline correlation across the non-differentially expressed genes (whereas these genes will tend to have zero values in the Z-score matrix).
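As a rough illustration of the TPM route described above, here is a minimal Python sketch, not cNMF's own code. It assumes each sample's TPM spectra are already loaded as a pandas DataFrame (programs in rows, genes in columns) and that the per-gene standard deviations from the tpm_stats file are available as a Series; file loading is left out because the file layout is not described in this thread. The function names `var_normalize` and `cross_program_correlation` and all of the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def var_normalize(tpm_spectra, gene_std):
    """Divide each gene column by that gene's standard deviation.

    Genes with zero (or missing) standard deviation become NaN and are
    dropped later, before correlating.
    """
    std = gene_std.reindex(tpm_spectra.columns)
    std = std.where(std > 0)  # zero std -> NaN, avoids dividing by zero
    return tpm_spectra.divide(std, axis="columns")


def cross_program_correlation(spectra_a, spectra_b, genes):
    """Pearson correlation of every program in A against every program in B,
    computed over the requested genes that exist in both matrices."""
    shared = [g for g in genes if g in spectra_a.columns and g in spectra_b.columns]
    a, b = spectra_a[shared], spectra_b[shared]
    # Keep only genes with finite values in both samples.
    ok = a.notna().all(axis=0) & b.notna().all(axis=0)
    a = a.loc[:, ok].to_numpy(dtype=float)
    b = b.loc[:, ok].to_numpy(dtype=float)
    # Center and scale each program (row), so the dot product is Pearson r.
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    return pd.DataFrame(a @ b.T, index=spectra_a.index, columns=spectra_b.index)


# Toy data standing in for two samples (all values are made up).
rng = np.random.default_rng(0)
genes = [f"gene{i}" for i in range(50)]
tpm_a = pd.DataFrame(rng.gamma(2.0, 10.0, size=(3, 50)),
                     index=["A1", "A2", "A3"], columns=genes)
tpm_b = pd.DataFrame(rng.gamma(2.0, 10.0, size=(2, 50)),
                     index=["B1", "B2"], columns=genes)
std_a = pd.Series(rng.uniform(0.5, 5.0, size=50), index=genes)
std_b = pd.Series(rng.uniform(0.5, 5.0, size=50), index=genes)

# Intersection of the per-sample HVG lists; the union works the same way.
hvgs_a = set(genes[:30])
hvgs_b = set(genes[20:])
shared_hvgs = sorted(hvgs_a & hvgs_b)

corr = cross_program_correlation(var_normalize(tpm_a, std_a),
                                 var_normalize(tpm_b, std_b),
                                 shared_hvgs)
print(corr.round(2))
```

Here each sample is normalized with its own gene standard deviations, which is one reasonable reading of the advice; a pooled estimate across samples would be another option.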

michelle-curtis avatar Dec 02 '22 20:12 michelle-curtis

Yes, like @michelle-curtis said, either can work. I think using the Z-score output by default seems to work the best. And, as Michelle also said, it is good to subset to high-variance genes.
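For the Z-score route, a minimal pandas sketch might look like the following. It is only an illustration under stated assumptions: the Z-score spectra are already loaded as DataFrames (programs in rows, genes in columns), program labels are unique across samples, and each sample's HVG list is available as a plain collection of gene names. The helper `zscore_program_correlation` and the toy data are hypothetical, not part of cNMF.

```python
import numpy as np
import pandas as pd


def zscore_program_correlation(z_a, z_b, hvgs_a, hvgs_b):
    """Correlate Z-score spectra of two samples over the union of their HVGs.

    Only union genes present in both matrices are used. Program labels
    must be unique across the two samples.
    """
    union = sorted((set(hvgs_a) | set(hvgs_b)) & set(z_a.columns) & set(z_b.columns))
    stacked = pd.concat([z_a[union], z_b[union]])
    # DataFrame.corr correlates columns, so transpose to put programs in columns.
    full = stacked.T.corr(method="pearson")
    # Keep only the A-versus-B block of the full correlation matrix.
    return full.loc[z_a.index, z_b.index]


# Toy data (made up): program B_p1 is a noisy copy of A_p2.
rng = np.random.default_rng(1)
genes = [f"gene{i}" for i in range(60)]
z_a = pd.DataFrame(rng.normal(size=(3, 60)),
                   index=["A_p1", "A_p2", "A_p3"], columns=genes)
z_b = pd.DataFrame(rng.normal(size=(2, 60)),
                   index=["B_p1", "B_p2"], columns=genes)
z_b.loc["B_p1"] = z_a.loc["A_p2"].to_numpy() + 0.1 * rng.normal(size=60)

hvgs_a = genes[:25]
hvgs_b = genes[15:40]

corr = zscore_program_correlation(z_a, z_b, hvgs_a, hvgs_b)
print(corr.round(2))
print(corr.idxmax(axis=0))  # best-matching program in A for each program in B
```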

dylkot avatar May 07 '24 23:05 dylkot