GraphAligner

Simulated pacbio reads vg graph alignment problem

Open liaoherui opened this issue 4 years ago • 5 comments

First of all, thanks for your wonderful tool! It's really helpful for my research!

I built a vg graph from a collection of virus strain genomes (~10,000 bp per genome, some of them very similar) and simulated error-free PacBio reads (1500 bp per read) from one of the genomes. I aligned these reads to the graph with GraphAligner (1.0.9) and found that the best alignment of some reads seems to be wrong, because these alignments do not contain the nodes of the path that corresponds to the genome the reads were simulated from.

In other words, the default best alignment of these reads does not cover the full 1500 bp region of the simulated reference (one path in the graph). So I picked one read and output all of its alignments to see whether any of them covers the whole 1500 bp region of the reference. However, among all the alignments, the most that is aligned to that region of the reference is 1368 bp.
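The coverage check described above can be sketched roughly as follows. This is only an illustration, not the exact script used: it assumes the alignment has already been parsed into a list of `(node, aligned_bases)` steps and the reference path into a list of oriented node names such as `"4-"`. The names `Step`, `strip_orientation`, and `bases_on_path` are hypothetical, and the numbers are toy values rather than data from the real graph.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: one alignment step is (oriented node name, aligned bases).
Step = Tuple[str, int]


def strip_orientation(node: str) -> str:
    """Drop a trailing '+' or '-' so forward and reverse visits compare equal."""
    return node.rstrip("+-")


def bases_on_path(alignment: Iterable[Step], path_nodes: Iterable[str]) -> int:
    """Sum the aligned bases that fall on nodes belonging to the reference path."""
    wanted: Set[str] = {strip_orientation(n) for n in path_nodes}
    return sum(bases for node, bases in alignment if strip_orientation(node) in wanted)


# Toy data (made up for illustration): node 3 is not on the reference path.
reference_path = ["1+", "2+", "4-", "5+"]
alignment = [("1+", 10), ("3+", 5), ("4-", 20), ("5+", 15)]

on_path = bases_on_path(alignment, reference_path)
total = sum(bases for _, bases in alignment)
print(f"{on_path} of {total} aligned bases lie on the reference path")
```

Orientation is ignored on purpose here, so a node visited as `-` in the path still counts as covered; whether that is the right comparison for this graph is an assumption.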

In theory, the read should align to the reference path in the graph over its whole 1500 bp, but I cannot get this ideal result. I also tried the different seeding modes, 'Mum' and 'Mem'. Neither of them outputs a 1500 bp alignment.

Is this a limitation of GraphAligner, or even a bug? I would be really grateful if you could suggest some possible reasons. (Btw, the path containing the simulated reference genome has '-' nodes. I wonder whether the problem is related to this?)

liaoherui avatar Dec 12 '19 14:12 liaoherui