
Can't reproduce SegmentNT's MCC score when segmenting 5'UTR on the test set (chr20 and chr21)

Open JasonLinjc opened this issue 1 year ago • 0 comments

Dear Team,

Thank you for your excellent work. I'm currently trying to reproduce SegmentNT (InstaDeepAI/segment_nt) on the test data (chr20 and chr21) for 5'UTR segmentation. As labels, I used the 5'UTR annotation from GENCODE v44 (excluding level 3 transcripts), and as input I used sequences from a sliding window of length 30,000 over chromosomes 20 and 21. I binarized SegmentNT's predictions at a probability threshold of 0.5 to compute the MCC. However, my average MCC across all 30-kb sequences was 0.11, well below the MCC of 0.48 reported in your paper. Could you please help me understand this discrepancy?
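For concreteness, here is a minimal sketch of the evaluation described above, next to one alternative way of aggregating it. It is an illustration under assumptions, not the paper's evaluation code: the helper names (`confusion_counts`, `per_window_mean_mcc`, `pooled_mcc`) are hypothetical, the toy arrays stand in for real per-nucleotide probabilities and labels, and scoring a window with no positive labels or predictions as 0 is my own convention. How the paper aggregates MCC is not stated in this issue.

```python
import numpy as np


def confusion_counts(probs, labels, threshold=0.5):
    """Binarize per-nucleotide probabilities and count TP, TN, FP, FN."""
    preds = np.asarray(probs) >= threshold
    truth = np.asarray(labels).astype(bool)
    tp = int(np.sum(preds & truth))
    tn = int(np.sum(~preds & ~truth))
    fp = int(np.sum(preds & ~truth))
    fn = int(np.sum(~preds & truth))
    return tp, tn, fp, fn


def mcc_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = np.sqrt(float(tp + fp) * float(tp + fn) * float(tn + fp) * float(tn + fn))
    if denom == 0.0:
        # Assumed convention: a degenerate window (e.g. no 5'UTR at all) scores 0.
        return 0.0
    return float((tp * tn - fp * fn) / denom)


def per_window_mean_mcc(prob_windows, label_windows, threshold=0.5):
    """Compute one MCC per window, then average (the approach described above)."""
    scores = [
        mcc_from_counts(*confusion_counts(p, l, threshold))
        for p, l in zip(prob_windows, label_windows)
    ]
    return float(np.mean(scores))


def pooled_mcc(prob_windows, label_windows, threshold=0.5):
    """Sum confusion counts over all windows, then compute a single MCC."""
    totals = [0, 0, 0, 0]
    for p, l in zip(prob_windows, label_windows):
        counts = confusion_counts(p, l, threshold)
        totals = [t + c for t, c in zip(totals, counts)]
    return mcc_from_counts(*totals)


# Toy data: one window with a perfectly predicted 5'UTR, one with no 5'UTR at all.
toy_probs = [np.array([0.9, 0.8, 0.1, 0.2]), np.array([0.1, 0.1, 0.1, 0.1])]
toy_labels = [np.array([1, 1, 0, 0]), np.array([0, 0, 0, 0])]

mean_score = per_window_mean_mcc(toy_probs, toy_labels)
pooled_score = pooled_mcc(toy_probs, toy_labels)
print(mean_score, pooled_score)
```

On this toy input the per-window average is 0.5 while the pooled MCC is 1.0, because the window without any 5'UTR contributes a score of 0 to the average. Since 5'UTRs are sparse, many 30-kb windows on chr20 and chr21 may contain none, so the aggregation choice alone could shift the reported number; whether that accounts for the gap here is something I have not verified.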

Thank you very much for your attention.

Kind Regards, Jiecong

JasonLinjc · May 12 '24 16:05