Issue with coordinates in modkit pileup
Hi,
I'm running modkit pileup (v0.2.4) as follows:
modkit pileup -t 8 --filter-threshold 0.9 {$bam} {$out_bed} --ref /ref/hg38.fa --preset traditional
Basecalling has been done using dorado basecaller (modified-bases 5mCG_5hmCG and [email protected]), and alignment has been done using minimap2.
Unfortunately, when looking at the output bed file from modkit pileup, the coordinates seem to be consistently off by one base pair upstream. The read counts for modified/canonical bases seem correct; it's just that they are all shifted by one position.
Is this something others have experienced before? (I couldn't find any other issue about this.) Any suggestions or recommendations to fix this would be highly appreciated!
Thanks :)
Hello @ccastignani,
Could you tell me how you're observing the off-by-one error? It may be an artifact of the viewer. For example, the BED format uses zero-based, half-open coordinates, whereas many viewers (such as IGV) display one-based positions. This can make everything appear off by one. Also, with the --preset traditional option set, the base modifications for the (+) strand C and the (-) strand C in CpG motifs are combined onto the (+) strand position (see the documentation for details), which may make the (-) strand positions seem off by one. Happy to help debug if you can get me an example.
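To make the two effects concrete, here is a minimal Python sketch. It is not modkit code: the function names and the example position (100) are made up for illustration, and it assumes that the (-) strand C of a CpG sits one base downstream of the (+) strand C in reference coordinates, so a combined record lands one base upstream of where the (-) strand call alone would be.

```python
def bed_to_one_based(start, end):
    """Convert a 0-based, half-open BED interval to a 1-based, inclusive one."""
    return start + 1, end


def combined_cpg_position(start, strand):
    """Return the 0-based (+) strand position under which a CpG call is
    reported once both strands are merged (hypothetical helper)."""
    # The (-) strand C pairs with the G of the CpG, one base downstream
    # of the (+) strand C, so it shifts back by one when merged.
    return start if strand == "+" else start - 1


# Hypothetical CpG whose (+) strand C is at 0-based position 100.
print(bed_to_one_based(100, 101))        # (101, 101): what a 1-based viewer shows
print(combined_cpg_position(100, "+"))   # 100
print(combined_cpg_position(101, "-"))   # 100: merged onto the (+) strand C
```

If the positions only look shifted in a one-based viewer, the first conversion explains it; if only (-) strand sites look shifted, the strand merging is the more likely cause.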