MAESTRO
MAESTRO copied to clipboard
gene_peak matrix in RP_model
The value range of gene_peak matrix in RP_model should between 0 and 1. Why does "genes_peaks_score_csr" in scATAC_Genescore.py have a value greater than 1, with a maximum of 1600+.
Hi, how did you get the gene activity score? Did you use the enhanced model or simple model? If you get the activity score from running the MAESTRO snakemake pipeline, the count matrix is binarized. The gene activity score is calculated by summing up the peaks (within 120kb) nearby a gene using an exponential decay model. details here https://github.com/liulab-dfci/MAESTRO/blob/master/example/Gene_activity_modelling/Gene_activity_modelling.md
if a peak is at the promoter, it will have a weight of 1, and then adding up other more distal peaks' regulatory potential, so it could be great than 1.
If you used the MAESTRO scatac-genescore
command line tool, and feed it with the raw count matrix, it could have big gene activity scores because now you can have a big count in a peak. You can binarize that count matrix and rerun the same command.
I am doing a benchmark for those different ways to calculate gene activity score and will update here.
Thanks!
thanks, but i mean the gene_peak without multiplication by count matrix, the element in this gene_peak matrix represent the peak regulatory the gene potential. Should max(gene_peak) is equal to 1? But i got some values greater than 1.
how did you get the gene_peak matrix? you are right, for every peak, the max weight should be 1.