minimap2
Reference Weirdness
Hello Dr. Li,
We are running minimap2 as part of a viral haplotype detection workflow and have noticed some weirdness between different reference files.
all_viruses.fasta contains viral variants (many accessions), say 1-90, and we align a single sample to this file using the commands:
minimap2 -t 32 -o sample1.sam -a all_viruses.fasta sample1.cleaned.fastq.gz;
samtools view -Sb sample1.sam | samtools sort -@ 32 -o sample1.bam;
samtools index -@ 32 sample1.bam;
- In this case cleaned means:
- we have trimmed the reads
- deduplicated the sequences
- and removed host reads
- So there should only be enriched viral reads present
In this case we get alignments to many of the viruses, but we noticed a small number of reads aligning to one or two particular strains across most of the samples.
We then took a subset of all_viruses.fasta (containing maybe 10 of the viruses we noticed the 'weirdness' with) and ran the same commands on sample1 as above.
However, in this case we get 0 reads aligning to any of the viruses, even though the sequences are the same. We were wondering what may be causing this and whether you have any suggestions.
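For anyone who wants to confirm that the subset records really are identical to the ones in the full reference, here is a minimal sketch of one way to check. The helper names fasta_tab and subset_mismatches and the file name subset.fasta are hypothetical (they are not part of minimap2 or samtools), and the comparison is deliberately case-insensitive:

```shell
# Hypothetical helper: print one "name<TAB>sequence" line per FASTA record.
# The name is the header up to the first whitespace, without the leading ">".
fasta_tab() {
  awk '/^>/ { if (name != "") print name "\t" seq; name = substr($1, 2); seq = ""; next }
       { seq = seq toupper($0) }
       END { if (name != "") print name "\t" seq }' "$1"
}

# Hypothetical helper: print subset records ($1) whose name and sequence
# do not exactly match any record in the full reference ($2).
subset_mismatches() {
  full_tab=$(mktemp)
  fasta_tab "$2" > "$full_tab"
  # grep exits non-zero when nothing is printed; that is the "all identical" case.
  fasta_tab "$1" | grep -Fxv -f "$full_tab" || true
  rm -f "$full_tab"
}

# Usage (file names are assumptions):
#   subset_mismatches subset.fasta all_viruses.fasta
```

No output would mean that every subset record has an identically named, identical sequence in the full file.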
If needed I can provide the sample and both of the references used.
We ran both minimap2 2.24 and minimap2 2.26 (versions as reported by minimap2 -V) with the same results.
We checked the read counts using: samtools idxstats sample1.bam
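To make the per-reference counts easier to compare between the two runs, the idxstats output can be filtered down to references with at least one mapped read. This is only a sketch: mapped_refs is a hypothetical helper name, and it relies on the documented idxstats layout of tab-separated reference name, length, mapped count and unmapped count:

```shell
# Hypothetical helper: read `samtools idxstats` output on stdin and print
# "reference<TAB>mapped_reads" for every reference with mapped reads,
# skipping the "*" line that holds reads with no reference assigned.
mapped_refs() {
  awk -F'\t' '$1 != "*" && $3 > 0 { print $1 "\t" $3 }'
}

# Usage:
#   samtools idxstats sample1.bam | mapped_refs
```

Running this on the BAM from each reference gives two short lists that can be compared side by side.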
Thank you for your time.
Cheers, Johnny