
Issue with coordinates in modkit pileup

Open ccastignani opened this issue 1 year ago • 1 comments

Hi,

I'm running modkit pileup (v0.2.4) as follows:

modkit pileup -t 8 --filter-threshold 0.9 {$bam} {$out_bed} --ref /ref/hg38.fa --preset traditional

Basecalling has been done using dorado basecaller (modified-bases 5mCG_5hmCG and [email protected]), and alignment has been done using minimap2.

Unfortunately, when looking at the output BED file from modkit pileup, the coordinates seem to be consistently off by one base pair upstream. The read counts for modified/canonical bases seem correct; it's just that they are all shifted by one position.

Is this something others have experienced before? (couldn't find any other issue about this). Any suggestions or recommendations to fix this would be highly appreciated!

Thanks :)

ccastignani avatar Feb 19 '24 18:02 ccastignani

Hello @ccastignani,

Could you tell me how you're observing the off-by-one error? It may be an artifact of the viewer. For example, the BED specification uses zero-based, half-open coordinates, whereas many viewers (such as IGV) display one-based positions. This can make everything appear off-by-one. Also, with the --preset traditional option set, the base modifications for the (+) strand C and the (-) strand C in CpG motifs are combined onto the (+) strand position (see the documentation for details), which may make the (-) strand positions seem off-by-one. Happy to help debug if you can get me an example.
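To make the two effects concrete, here is a minimal Python sketch of the coordinate arithmetic. It is only an illustration, not modkit code: the function names `bed_to_one_based` and `combined_cpg_position` are hypothetical, and the example positions are made up.

```python
def bed_to_one_based(start0, end0):
    """Convert a 0-based, half-open BED interval to a 1-based, inclusive one.

    Only the start changes: BED's exclusive end already equals the
    1-based inclusive end.
    """
    return start0 + 1, end0


def combined_cpg_position(pos0, strand):
    """Return the 0-based position a CpG cytosine is reported at when
    both strands are combined onto the (+) strand C.

    Assumption for illustration: pos0 is the 0-based reference position
    of the cytosine on the given strand. The (-) strand C pairs with the
    G of the CpG, one base downstream of the (+) strand C.
    """
    if strand == "+":
        return pos0
    if strand == "-":
        return pos0 - 1
    raise ValueError(f"unknown strand: {strand!r}")


# A BED record with start=10467, end=10468 is shown as position 10468
# in a one-based viewer.
print(bed_to_one_based(10467, 10468))

# A (-) strand C at 0-based position 10468 is folded onto the (+) strand
# C at 10467 when strands are combined.
print(combined_cpg_position(10468, "-"))
```

Either shift on its own gives a consistent one-base offset, so it helps to check which coordinate system each tool in the comparison uses before assuming the pileup is wrong.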

ArtRand avatar Feb 20 '24 15:02 ArtRand