
Reference Weirdness

Open jonathan-bravo opened this issue 10 months ago • 0 comments

Hello Dr. Li,

We are running minimap2 as part of a viral haplotype detection workflow and have noticed some weirdness between different reference files.

all_viruses.fasta contains viral variants (many accessions, say 1-90), and we align a single sample against this file using the following commands:

minimap2 -t 32 -o sample1.sam -a all_viruses.fasta sample1.cleaned.fastq.gz;

samtools view -Sb sample1.sam | samtools sort -@ 32 -o sample1.bam;

samtools index -@ 32 sample1.bam;
  • In this case cleaned means:
    • we have trimmed the reads
    • deduplicated the sequences
    • and removed host reads
  • So there should only be enriched viral reads present

In this case we get alignments to many of the viruses, but we noticed a small number of reads aligning to one or two particular strains across most of the samples. We then took a subset of all_viruses.fasta (containing maybe 10 of the viruses we noticed the 'weirdness' with) and ran the same commands on sample1 as above.

However, in this case we get 0 reads aligning to any of the viruses, even though the sequences are the same. We were wondering what may be causing this, and whether you have any suggestions.
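One way to double-check that the subset records are identical to the corresponding records in the full reference is sketched below. This is an illustrative check only, not part of the workflow described above: `fasta_to_tsv` and `subset_mismatches` are hypothetical helper names, and `subset.fasta` stands in for whatever the subset file is actually called. It compares only the first word of each header and the case-folded sequence.

```shell
# Flatten a FASTA file to "name<TAB>SEQUENCE" lines (one per record).
# The name is the first word of the header; the sequence is uppercased
# and joined across lines; trailing carriage returns are stripped.
fasta_to_tsv() {
    awk '
        { sub(/\r$/, "") }
        /^>/ { if (name != "") print name "\t" seq
               name = substr($1, 2); seq = ""; next }
        { seq = seq toupper($0) }
        END { if (name != "") print name "\t" seq }
    ' "$1"
}

# Print every record of the subset (second argument) that has no
# identical name+sequence record in the full reference (first argument).
# No output means every subset record matches exactly.
subset_mismatches() {
    full_tsv=$(mktemp)
    fasta_to_tsv "$1" > "$full_tsv"
    fasta_to_tsv "$2" | grep -Fxv -f "$full_tsv"
    rm -f "$full_tsv"
}

# Usage (subset.fasta is a placeholder name):
# subset_mismatches all_viruses.fasta subset.fasta
```

If this prints nothing, the two references agree on those records, and any difference in alignment would have to come from somewhere other than the sequences themselves.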

If needed I can provide the sample and both of the references used.

We ran both minimap2 version 2.24 and version 2.26, with the same results.

We checked the read counts using: samtools idxstats sample1.bam
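To make the per-reference counts easier to compare between the two runs, the idxstats output can be filtered down to references with at least one mapped read. The snippet below is a minimal sketch; `mapped_per_ref` is a hypothetical helper name. It relies on the tab-separated idxstats layout of reference name, sequence length, mapped count, and unmapped count, with `*` as the name of the final unplaced-reads line.

```shell
# Read `samtools idxstats` output on stdin and print
# "reference<TAB>mapped_reads" for each reference with mapped reads > 0.
# The trailing "*" line (reads with no reference) is skipped.
mapped_per_ref() {
    awk -F'\t' '$1 != "*" && $3 > 0 { print $1 "\t" $3 }'
}

# Usage (file name as in the commands above):
# samtools idxstats sample1.bam | mapped_per_ref
```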

Thank you for your time.

Cheers, Johnny

jonathan-bravo, Sep 06 '23 17:09