6mA modification missing after modkit callmod

Open DexinYang1998 opened this issue 7 months ago • 3 comments

Hi,

The top track is the raw BAM file (grouped by read strand) after dorado base and modification calling. After I ran the default modkit call-mods (bottom track), I found that some of the modification information at A bases was missing.

modkit: 0.5.0 dorado-model: 5.2.0-sup

Thanks so much!

Image

DexinYang1998 avatar May 22 '25 07:05 DexinYang1998

Hello @DexinYang1998,

I'm going to guess that the light blue is 6mA and the red is 5mC - but correct me if I'm wrong.

When you run modkit call-mods, it estimates a "filter threshold" (details in the modkit documentation) and discards the 10% lowest-confidence calls, so it is expected that some calls will be removed. Could you show me the whole command you used?
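To make the 10% figure concrete, here is a minimal sketch of a percentile-based filter. It only illustrates the idea and is not modkit's actual implementation; the function names and the sample confidence values are hypothetical.

```python
def estimate_pass_threshold(confidences, filter_fraction=0.10):
    """Return the confidence value below which `filter_fraction` of calls fall.

    Hypothetical helper: a plain percentile over per-call confidences.
    """
    if not confidences:
        raise ValueError("need at least one confidence value")
    ordered = sorted(confidences)
    # Index of the first call that survives the filter.
    cutoff_index = min(int(len(ordered) * filter_fraction), len(ordered) - 1)
    return ordered[cutoff_index]


def passing_calls(confidences, threshold):
    """Keep only calls whose confidence is at or above the threshold."""
    return [c for c in confidences if c >= threshold]


# Made-up confidences for ten calls at A positions.
example = [0.51, 0.55, 0.62, 0.70, 0.81, 0.88, 0.93, 0.97, 0.99, 0.995]
threshold = estimate_pass_threshold(example)
kept = passing_calls(example, threshold)
print(threshold, len(kept), "of", len(example), "calls kept")
```

With a 10% filter, roughly one call in ten should disappear, so a track that loses far more than that is worth checking.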

ArtRand avatar May 22 '25 14:05 ArtRand

Thanks for your help! The light blue is the unmethylated A, and red is the 6mA. The command I used is modkit call-mods -t 40 in.bam out.bam

DexinYang1998 avatar May 22 '25 16:05 DexinYang1998

Hello @DexinYang1998,

In that case it makes sense that some of the calls will be removed (the lower confidence ones). It's hard to tell by eye, but it does look like more than 10% of the calls are being removed. You could check by first finding the threshold used by modkit call-mods; it will appear in the log file like this:

estimated pass threshold <threshold> for primary sequence base A
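If the log is long, a small script can pull the value out. This is a sketch that assumes the log line has exactly the wording shown above; the function name and the regular expression are illustrative and not part of modkit.

```python
import re

# Matches lines of the form shown above, e.g.
# "estimated pass threshold 0.53554684 for primary sequence base C"
PASS_THRESHOLD_RE = re.compile(
    r"estimated pass threshold (?P<threshold>[0-9.eE+-]+) "
    r"for primary sequence base (?P<base>[A-Za-z])"
)


def thresholds_from_log(lines):
    """Return {primary_base: threshold} for every matching log line."""
    thresholds = {}
    for line in lines:
        match = PASS_THRESHOLD_RE.search(line)
        if match:
            thresholds[match.group("base")] = float(match.group("threshold"))
    return thresholds


sample_log = [
    "some unrelated log output",
    "estimated pass threshold 0.53554684 for primary sequence base C",
]
print(thresholds_from_log(sample_log))
```

To use it on a real log, pass an open file handle instead of `sample_log`.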

Alternatively, you could use modkit sample-probs (see the docs) to get the threshold value. Once you have that number, run modkit extract calls --filter-threshold ${threshold_value} --region ${this_region} and see whether the number of calls with column 20 == true is >10% of the total.
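Counting those rows can be scripted. The sketch below assumes the extract output is a tab-separated table with one header line and that column 20 (1-based) holds the literal strings true or false; the function name is hypothetical, so check the column layout against your own file before relying on it.

```python
import csv


def fraction_flagged(lines, column_number=20):
    """Fraction of data rows whose `column_number` (1-based) equals "true"."""
    reader = csv.reader(lines, delimiter="\t")
    next(reader, None)  # skip the header line
    total = 0
    flagged = 0
    for row in reader:
        if len(row) < column_number:
            continue  # ignore short or blank rows
        total += 1
        if row[column_number - 1].strip().lower() == "true":
            flagged += 1
    return flagged / total if total else 0.0


# Tiny synthetic table with 23 placeholder columns, one flagged row in four.
header = "\t".join(f"c{i}" for i in range(1, 24))


def make_row(flag):
    return "\t".join(["x"] * 19 + [flag] + ["y"] * 3)


demo = [header, make_row("true"), make_row("false"), make_row("false"), make_row("false")]
print(fraction_flagged(demo))
```

A result well above 0.10 for the region would suggest that more calls are being removed than the default filter should account for.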

Let me know what you get.

ArtRand avatar May 27 '25 23:05 ArtRand