graphmap
align -x overlap produces .sam file CIGAR_MAPS_OFF_REFERENCE
Hello,
Using graphmap 0.5.1, I want to overlap Illumina MiSeq reads derived from enrichment sequencing. The reads are enriched for a large gene family. I would hope for large numbers of relatively short contigs to be produced, each corresponding to a gene family member. I don't know if graphmap is an appropriate tool for this, but from reading the docs it seemed promising. As a test I used ~200,000 reads (the reads are paired-end, but I'm ignoring the pairing for now).
graphmap align -x overlap -r Blb_S11.cleaned.R1.fastq -d Blb_S11.cleaned.R1.fastq -o Blb_S11R1_gmap_test.sam
The resulting .sam file cannot be converted to .bam, sorted, etc. by samtools, which warns:
[W::sam_parse1] mapped query cannot have zero coordinate; treated as unmapped
Picard tools ValidateSamFile identifies the following problems:
ERROR:CIGAR_MAPS_OFF_REFERENCE 174420
ERROR:INVALID_ALIGNMENT_START 5514
ERROR:INVALID_MAPPING_QUALITY 18417
I'm confused, as it seems to me that when overlapping one would expect many reads to map off the reference at one end or the other. If I take the same reads and map them to a reference sequence (a region of one of the gene family members, obtained by sequencing a PCR product) using graphmap align -x illumina, the resulting .sam alignment has reads that overhang the reference at both the start and the end, and samtools doesn't complain. This work was done with graphmap v0.4.
I was wondering whether you think my strategy is plausible in the first place, and whether graphmap is an appropriate tool. Are there any changes between v0.4 and v0.5.1 which might explain why samtools doesn't like the overhangs?
Thanks for reading,
Miles
Hi Miles,
I found similar problems with v0.5.1 where alignments were starting and/or ending off the reference. A few days ago I submitted a pull request for a fix that has been working for me: https://github.com/isovic/graphmap/pull/64
Basically it just drops the reads that graphmap was aligning with invalid coordinates on the reference, so I'm hoping that isovic can fix the underlying problem and get proper alignments for those reads in the future.
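For anyone who cannot rebuild graphmap, roughly the same effect can be had by post-processing the .sam file. The following is only a minimal Python sketch of that idea, not the code from the pull request. It assumes the reference lengths are available from the @SQ header lines and treats a mapped record as invalid if its POS is below 1 or if its CIGAR-derived end lies past the reference end. The function names reference_span and drop_off_reference are hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def reference_span(cigar):
    """Number of reference bases covered by an alignment, per its CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def drop_off_reference(sam_lines):
    """Return (kept_lines, dropped_count) for an iterable of SAM text lines."""
    ref_lengths = {}
    kept, dropped = [], 0
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            if line.startswith("@SQ"):
                # Collect reference names and lengths from the header.
                tags = dict(f.split(":", 1) for f in line.split("\t")[1:])
                ref_lengths[tags["SN"]] = int(tags["LN"])
            kept.append(line)
            continue
        fields = line.split("\t")
        flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 0x4:
            kept.append(line)  # unmapped reads have no coordinates to check
            continue
        end = pos + reference_span(cigar) - 1  # 1-based, inclusive end position
        ref_len = ref_lengths.get(rname)
        # Assumption: a reference missing from the header also counts as invalid.
        if pos < 1 or ref_len is None or end > ref_len:
            dropped += 1
            continue
        kept.append(line)
    return kept, dropped
```

Applied to the lines of the graphmap output, the kept lines can be written to a new .sam file before running samtools. The dropped count gives a rough idea of how many alignments were affected.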
-Rob
Hi,
Thanks for the suggestion, Rob. Your comment made me wonder if the version of graphmap was significant. Interestingly, when I repeated the experiment using graphmap v0.4.1, the .sam file generated was converted and sorted by samtools without any problem. Also, v0.4 produced an 11 GB .sam file while v0.5.1 produced a 1.3 GB file.
Miles