Understanding of the read-level prediction file output

Open gsukrit opened this issue 1 year ago • 1 comment

If we want read-level information for a methylated transcript, can we use the second column of the read_level_prediction_m6A_sorted file (the read-level, k-mer probability for the middle A) to filter for a methylation probability >= 0.9? In other words, is the site-level methylation probability in the site-level prediction file the same as the probability in the second column of the read-level prediction file?

Feb 06 '24 20:02 gsukrit

Hi, yes, the second column in the read_level_prediction_m6A_sorted file gives the read-level prediction probability from model 1. Model 2 then takes those probabilities and predicts site-level probabilities. The site-level predictions also provide the stoichiometry, which is calculated from the read-level probabilities of model 1. We used a double cutoff of 0.7 and 0.3 for read-level predictions, so you can consider reads with probability >= 0.7 to be methylated and reads with probability <= 0.3 to be non-methylated. I hope it helps! Thanks, Akanksha

Feb 06 '24 23:02 Akanksha2511
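
For anyone applying the double cutoff from the reply above, here is a minimal Python sketch. It is not taken from the tool itself. The only facts it relies on from this thread are that the probability is in the second column and that the cutoffs are 0.7 and 0.3. The tab delimiter, the "ambiguous" label for reads between the cutoffs, the function names, and the example rows are all assumptions made for illustration, so adjust them to the actual file layout.

```python
import csv
import io

# Cutoffs quoted in the reply above.
METHYLATED_CUTOFF = 0.7
UNMETHYLATED_CUTOFF = 0.3


def classify_read(prob, upper=METHYLATED_CUTOFF, lower=UNMETHYLATED_CUTOFF):
    """Label one read-level probability using the double cutoff."""
    if prob >= upper:
        return "methylated"
    if prob <= lower:
        return "non-methylated"
    # Label chosen here for reads that fall between the two cutoffs.
    return "ambiguous"


def classify_rows(lines, delimiter="\t"):
    """Yield (first_column, probability, label) for each parsable row.

    Assumes a delimited text file with the probability in column 2.
    Rows with fewer than two columns, or with a non-numeric second
    column (for example a header line), are skipped.
    """
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 2:
            continue
        try:
            prob = float(row[1])
        except ValueError:
            continue
        yield row[0], prob, classify_read(prob)


# Made-up rows standing in for the real file contents.
example = io.StringIO("read_a\t0.95\nread_b\t0.10\nread_c\t0.50\n")
labels = [label for _, _, label in classify_rows(example)]
print(labels)
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` object, for example `with open(path, newline="") as handle: rows = list(classify_rows(handle))`, where `path` points to your read-level prediction file.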