bowtie2
Alignment of pairs missing individuals
I am using bowtie2 to find reads or read pairs that are chimeric, aligning partly to a virus and partly to human. I initially align the data to the virus, trim the virus from the read or pair and then align the remnant to human. I do this both as paired and unpaired data.
This can be slow, so when doing this for multiple viruses, I create a database of all the viruses and align paired to it and select all reads that either align or whose mate aligned creating a paired subset of the initial data.
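For anyone following along, the selection step described above can be sketched as a filter on the SAM flags of the paired alignment. This is only an illustration, not the exact script used here: the function name `select_hit_pairs` is made up, and it assumes the usual SAM flag bits (0x4 = this read unaligned, 0x8 = mate unaligned).

```shell
# select_hit_pairs (hypothetical helper): read SAM on stdin and print the
# names of pairs in which at least one mate aligned, one name per pair.
# A pair is dropped only when both 0x4 (read unaligned) and 0x8 (mate
# unaligned) are set on the record.
select_hit_pairs() {
    awk -F '\t' '
        /^@/ { next }                       # skip SAM header lines
        {
            flag = $2
            read_unal = int(flag / 4) % 2   # bit 0x4
            mate_unal = int(flag / 8) % 2   # bit 0x8
            if (!(read_unal && mate_unal)) print $1
        }
    ' | sort -u
}

# Example (not run here): select_hit_pairs < paired.sam > selected_names.txt
```

The resulting name list can then be used to pull both mates out of the original FASTQ files, so the subset stays usable as paired data.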
The problem is that the unpaired alignment of the complete data contains alignments not in the selected subset. I have lowered the minimum score threshold used in the selection alignment from the default to "G,1,3" and I get many more. But even if I change the minimum score to "C,1,0" they still don't align in paired mode.
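To put those thresholds in perspective, here is a rough sketch that evaluates a Bowtie 2 function string of the form T,A,B at a given read length, using the function types from the manual (C constant, L linear, S square root, G natural log). The helper name `bt2_func` and the 101-bp read length are assumptions for illustration; G,20,8 is the documented default for --score-min in --local mode.

```shell
# bt2_func SPEC READLEN (hypothetical helper)
# Evaluate a Bowtie 2 function string "T,A,B" at read length X as
# A + B * g(X), where g depends on the type letter T.
bt2_func() {
    awk -v spec="$1" -v x="$2" 'BEGIN {
        split(spec, p, ",")
        if      (p[1] == "C") g = 0        # constant: only A is used
        else if (p[1] == "L") g = x        # linear
        else if (p[1] == "S") g = sqrt(x)  # square root
        else if (p[1] == "G") g = log(x)   # natural log
        else { print "unknown function type: " p[1] > "/dev/stderr"; exit 1 }
        printf "%.2f\n", p[2] + p[3] * g
    }'
}

bt2_func G,20,8 101   # local-mode default threshold for a 101-bp read
bt2_func G,1,3 101    # lowered threshold
bt2_func C,1,0 101    # constant threshold of 1
```

With a threshold as low as 1, the score filter is effectively out of the way, which suggests the missing paired alignments are being lost somewhere other than at --score-min.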
In my selection process, I want to align paired so that the dataset remains usable as paired data, but I want a pair to be selected even if only a small part of one of the reads aligns.
Any suggestions? Am I missing an option?
bowtie2 -x SVAs_and_HERVs_KWHE --very-sensitive-local -1 fastq/reads_R1.fastq -2 fastq/reads_R2.fastq > paired.sam
590 reads; of these:
590 (100.00%) were paired; of these:
590 (100.00%) aligned concordantly 0 times
0 (0.00%) aligned concordantly exactly 1 time
0 (0.00%) aligned concordantly >1 times
----
590 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
590 pairs aligned 0 times concordantly or discordantly; of these:
1180 mates make up the pairs; of these:
1180 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
bowtie2 -x SVAs_and_HERVs_KWHE --very-sensitive-local -U fastq/reads_R1.fastq -U fastq/reads_R2.fastq > unpaired.sam
1180 reads; of these:
1180 (100.00%) were unpaired; of these:
579 (49.07%) aligned 0 times
4 (0.34%) aligned exactly 1 time
597 (50.59%) aligned >1 times
50.93% overall alignment rate
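When comparing many paired and unpaired runs like the two above, it can help to pull just the final rate out of each summary. A minimal sketch, assuming the summary (which bowtie2 writes to stderr) was saved to a file; the helper name `overall_rate` and the log file names are made up.

```shell
# overall_rate (hypothetical helper): print the percentage from the
# "overall alignment rate" line of a Bowtie 2 summary read on stdin.
overall_rate() {
    sed -n 's/^[[:space:]]*\([0-9.]*\)% overall alignment rate.*$/\1/p'
}

# Example (not run here):
#   bowtie2 ... -1 R1.fastq -2 R2.fastq > paired.sam   2> paired.log
#   bowtie2 ... -U R1.fastq -U R2.fastq > unpaired.sam 2> unpaired.log
#   overall_rate < paired.log
#   overall_rate < unpaired.log
```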
It seems that -L and -i are key.
Replacing --very-sensitive-local with --local -D 20 -R 3 -N 0 -L 10 -i S,1,0.25 is making a huge difference.
Using --local -D 30 -R 5 -N 0 -L 10 -i S,1,0.15 --score-min G,1,1 instead of just --very-sensitive-local includes all reads, so I guess I just need to scale it back; otherwise my selection will include all of the reads.
Ultimately going with --local -D 85 -R 5 -N 0 -L 10 -i S,1,0 for the moment.
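A rough way to see why -L and -i matter so much: per the manual, --very-sensitive-local is shorthand for -D 20 -R 3 -N 0 -L 20 -i S,1,0.50, so for a 101-bp read the seed interval is about 1 + 0.5 * sqrt(101), roughly 6, whereas S,1,0 gives an interval of exactly 1. The sketch below counts how many seeds fit on one strand, following the seed-extraction example in the manual. The helper name `seed_count`, the 101-bp read length, and treating the interval as a plain integer are all assumptions; how Bowtie 2 rounds the interval internally is not covered here.

```shell
# seed_count READLEN SEEDLEN INTERVAL (hypothetical helper)
# Number of seed start positions on one strand when seeds of SEEDLEN bases
# are taken every INTERVAL bases and must fit entirely inside the read.
seed_count() {
    echo $(( 1 + ($1 - $2) / $3 ))
}

seed_count 101 20 6   # roughly --very-sensitive-local on a 101-bp read
seed_count 101 10 1   # -L 10 -i S,1,0 on the same read
```

Shorter seeds at every position give far more chances to anchor a read that only partly matches, at the cost of more work per read.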
Does anyone understand how the aligning algorithm differs when run paired vs unpaired?
Hi @jakewendt ,
I have a similar issue. It appears that bowtie2 mixed mode is disabled. According to the documentation: "If Bowtie 2 cannot find a paired-end alignment for a pair, by default it will go on to look for unpaired alignments for the constituent mates. This is called 'mixed mode.'" You can reproduce this bug with these commands:
# Download the human genome
$ wget https://genome-idx.s3.amazonaws.com/bt/GRCh38_noalt_as.zip
# Decompress it
$ unzip GRCh38_noalt_as.zip
# Read pairs for which only one mate aligns to the human genome
$ cat 1.fastq
@SRR2221473.42/1
GGCAACAAGAGTGAAACTCCATCTCAAAAAAAAAAAATATATATATATATGTGTGTATATATATATGTATATATATGTGTGTATATATATATGTATATATA
+
CCCFFFFFHHHCFIJIJJJJJJJJJJJJJIJJJJJJJCEEECEHFFFEEFFCDCCBCEDCDCDDDFD@CDCCDDDCDCCCBBDDDDDCDDCDECCDEEEE3
@SRR2221473.595/1
TCCATTGCATTCCATTCCATTCCATTCCATTCCAATCCGTTGCATTCCATTCCATTACATTCGGATTGATTCTATTCAACTCCCTTACTCTCCATTACATT
+
CCCFFFFFHHHHHJJJJJJJIJJJJJJJIJJJJIJJJIJJJGIIIIJJJJJJIIJJJJJJJJGJJJIJJJJIIJJIJJJJIIJIJJHHFHGHDEFFFDFE;
@SRR2221473.766/1
AGTGAAATGGAATGGAAGGGAATGGAATGAAATTGAATGGAATGGAATGGAATCAACCCGAGTGCAATGTAATGGAATTGAATGGAATGGAATGGAATGGA
+
@@@FFFFFHHHFHJJHGIIJJIIJJJJGJJJJJFIJJJIJGIIIIIJJJIJJIJIJJIIJGIIIJGGEHCHFFHF@DFFEEEEEDECDCCCCCCDDCACDC
$ cat 2.fastq
@SRR2221473.42/2
ACATATATATATACACACATATATATACATATATATATACACACATATATATACATATATATATACACACATATATATATATATTTTTTTTTTTTGAGATG
+
CCCFFFFFHHHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJIIJGIIIHJJJIJIIIIJIIJJIJJGIJIIJIJIIIJIJJIHDDDDDD5<?BDC
@SRR2221473.595/2
GGAACAACCTGAATGGAATGGAATGTAATGGAGAGTAAGGGAGTTGAATAGAATCAATCCGAATGTAATGGAATGGAATGCAACGGATTGGAATGGAATGG
+
B@@FFFFFHHHHHGJJGIGIJJJJJHIJJJJIIGI?FHIJJHIFHIJJIJHIJJJJJJJJJJIJJJJJJGHHHHHGFEFFFEEEDDDDDDDDDDCDDDDD>
@SRR2221473.766/2
CATTCCATTCCTTTCCATTCTATTAGGGTTAATTCCATTCCATTCCATTCCATTCCATTCAATTCCATTCCATTCTATTCCATTGCAATCGAGTTGATTCC
+
CCCFFFFFHHHHHJJJIJJIJIJJJJJJEFHIJJGGIJJJJJJIIJJJJJHIIJJJJJJIJJJJHIIIGIIIIIJDIIJJIIIJGEIHHIHGIJIIJHHE3
# Unpaired alignment. All forward reads are aligned
$ bowtie2 -U 1.fastq,2.fastq -x GRCh38_noalt_as/GRCh38_noalt_as -p 32 --no-unal --no-sq --no-hd
6 reads; of these:
6 (100.00%) were unpaired; of these:
3 (50.00%) aligned 0 times
3 (50.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
50.00% overall alignment rate
SRR2221473.42/1 16 chr5 124767433 24 65M1D1M1D35M * 0 0 TATATATACATATATATATACACACATATATATACATATATATATACACACATATATATATATATTTTTTTTTTTTGAGATGGAGTTTCACTCTTGTTGCC 3EEEEDCCEDCDDCDDDDDBBCCCDCDDDCCDC@DFDDDCDCDECBCCDCFFEEFFFHECEEECJJJJJJJIJJJJJJJJJJJJJIJIFCHHHFFFFFCCC AS:i:-22 XN:i:0 XM:i:1 XO:i:2 XG:i:2 NM:i:3 MD:Z:65^A1^A15A19 YT:Z:UU
SRR2221473.595/1 0 chr5 49659837 0 101M * 0 0 TCCATTGCATTCCATTCCATTCCATTCCATTCCAATCCGTTGCATTCCATTCCATTACATTCGGATTGATTCTATTCAACTCCCTTACTCTCCATTACATT CCCFFFFFHHHHHJJJJJJJIJJJJJJJIJJJJIJJJIJJJGIIIIJJJJJJIIJJJJJJJJGJJJIJJJJIIJJIJJJJIIJIJJHHFHGHDEFFFDFE; AS:i:-45 XN:i:0 XM:i:8 XO:i:0 XG:i:0 NM:i:8 MD:Z:34T6C14C7G14T3T4A7C4 YT:Z:UU
SRR2221473.766/1 0 chr17_KI270729v1_random 25552 0 101M * 0 0 AGTGAAATGGAATGGAAGGGAATGGAATGAAATTGAATGGAATGGAATGGAATCAACCCGAGTGCAATGTAATGGAATTGAATGGAATGGAATGGAATGGA @@@FFFFFHHHFHJJHGIIJJIIJJJJGJJJJJFIJJJIJGIIIIIJJJIJJIJIJJIIJGIIIJGGEHCHFFHF@DFFEEEEEDECDCCCCCCDDCACDC AS:i:-54 XN:i:0 XM:i:10 XO:i:0 XG:i:0 NM:i:10 MD:Z:1A2G12T29A11T4G4G8G7C6T7 YT:Z:UU
# Paired alignment. Bowtie2 does not switch to mixed mode
$ bowtie2 -1 1.fastq -2 2.fastq -x GRCh38_noalt_as/GRCh38_noalt_as -p 32 --no-unal --no-sq --no-hd
3 reads; of these:
3 (100.00%) were paired; of these:
3 (100.00%) aligned concordantly 0 times
0 (0.00%) aligned concordantly exactly 1 time
0 (0.00%) aligned concordantly >1 times
----
3 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
3 pairs aligned 0 times concordantly or discordantly; of these:
6 mates make up the pairs; of these:
6 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
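One way to check from the SAM output itself whether mixed mode produced anything is to tally the YT:Z tag on aligned records. Per the manual, UU marks an unpaired read, CP a concordant pair, DP a discordant pair, and UP a read from a pair that failed to align as a pair, so an aligned record tagged UP is a mixed-mode alignment. This is a sketch only; the helper name `yt_summary` is made up.

```shell
# yt_summary (hypothetical helper): read SAM on stdin and count aligned
# records per YT:Z value (UU, CP, DP or UP).
yt_summary() {
    awk -F '\t' '
        /^@/ { next }                  # skip SAM header lines
        int($2 / 4) % 2 { next }       # skip unaligned records (flag 0x4)
        {
            for (i = 12; i <= NF; i++)
                if ($i ~ /^YT:Z:/) count[substr($i, 6)]++
        }
        END { for (t in count) print t, count[t] }
    ' | sort
}

# Example (not run here):
#   bowtie2 -1 1.fastq -2 2.fastq -x GRCh38_noalt_as/GRCh38_noalt_as | yt_summary
```

If mixed mode were working for these three pairs, the paired run would be expected to report aligned UP records; an empty result matches the 0.00% rate in the summary.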
Something happened between versions 2.1.0 and 2.3.4.1
version 2.1.0 has the same behaviour on my side
$ bowtie2-2.1.0/bowtie2 -1 1.fastq -2 2.fastq -x GRCh38_noalt_as/GRCh38_noalt_as -p 32 -S /dev/null
3 reads; of these:
3 (100.00%) were paired; of these:
3 (100.00%) aligned concordantly 0 times
0 (0.00%) aligned concordantly exactly 1 time
0 (0.00%) aligned concordantly >1 times
----
3 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
3 pairs aligned 0 times concordantly or discordantly; of these:
6 mates make up the pairs; of these:
6 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
bowtie2-2.1.0/bowtie2 -U 1.fastq,2.fastq -x GRCh38_noalt_as/GRCh38_noalt_as -p 32 -S /dev/null
6 reads; of these:
6 (100.00%) were unpaired; of these:
3 (50.00%) aligned 0 times
3 (50.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
50.00% overall alignment rate
I have been looking into this issue, but have not found the cause. I have confirmed
that mixed mode is enabled, and some of the alignments can be found by including the -a
option in the command line. I will continue looking and update this thread when I have
more information.
$ ./bowtie2-align-s -2 read1.fq -1 read2.fq -x GRCh38_noalt_as/GRCh38_noalt_as --no-unal --no-sq --no-hd -a
3 reads; of these:
3 (100.00%) were paired; of these:
1 (33.33%) aligned concordantly 0 times
1 (33.33%) aligned concordantly exactly 1 time
1 (33.33%) aligned concordantly >1 times
----
1 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
1 pairs aligned 0 times concordantly or discordantly; of these:
2 mates make up the pairs; of these:
1 (50.00%) aligned 0 times
1 (50.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
83.33% overall alignment rate
Thank you very much. This will be very helpful.
Here's the output from a variety of versions ...
module load bowtie2/2.3.4.1
bowtie2 --version
/home/shared/cbc/software_cbc/bowtie2-2.3.4.1/bowtie2-align-s version 2.3.4.1
64-bit
Built on 14231912a8bd
Sat Feb 3 13:04:04 UTC 2018
Compiler: gcc version 4.8.2 20140120 (Red Hat 4.8.2-15) (GCC)
Options: -O3 -m64 -msse2 -funroll-loops -g3 -g -O2 -fvisibility=hidden -I/hbb_exe/include -std=c++98 -DPOPCNT_CAPABILITY -DWITH_TBB -DNO_SPINLOCK -DWITH_QUEUELOCK=1
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
bowtie2 -x SVAs_and_HERVs_KWHE --very-sensitive-local -1 fastq/reads_R1.fastq -2 fastq/reads_R2.fastq > /dev/null
590 reads; of these:
590 (100.00%) were paired; of these:
590 (100.00%) aligned concordantly 0 times
0 (0.00%) aligned concordantly exactly 1 time
0 (0.00%) aligned concordantly >1 times
----
590 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
590 pairs aligned 0 times concordantly or discordantly; of these:
1180 mates make up the pairs; of these:
1180 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
module load bowtie2/2.3.3.1
The following have been reloaded with a version change:
1) bowtie2/2.3.4.1 => bowtie2/2.3.3.1
bowtie2 --version
/home/shared/cbc/software_cbc/bowtie2-2.3.3.1/bowtie2-align-s version 2.3.3.1
64-bit
Built on c1045ed0e5f3
Thu Oct 5 16:59:35 UTC 2017
Compiler: gcc version 4.8.2 20140120 (Red Hat 4.8.2-15) (GCC)
Options: -O3 -m64 -msse2 -funroll-loops -g3 -DPOPCNT_CAPABILITY -DWITH_TBB -DNO_SPINLOCK -DWITH_QUEUELOCK=1
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
bowtie2 -x SVAs_and_HERVs_KWHE --very-sensitive-local -1 fastq/reads_R1.fastq -2 fastq/reads_R2.fastq > /dev/null
590 reads; of these:
590 (100.00%) were paired; of these:
563 (95.42%) aligned concordantly 0 times
7 (1.19%) aligned concordantly exactly 1 time
20 (3.39%) aligned concordantly >1 times
----
563 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
563 pairs aligned 0 times concordantly or discordantly; of these:
1126 mates make up the pairs; of these:
604 (53.64%) aligned 0 times
16 (1.42%) aligned exactly 1 time
506 (44.94%) aligned >1 times
48.81% overall alignment rate
module load bowtie2/2.2.9
The following have been reloaded with a version change:
1) bowtie2/2.3.3.1 => bowtie2/2.2.9
bowtie2 --version
/home/shared/cbc/software_cbc/bowtie2-2.2.9/bowtie2-align-s version 2.2.9
64-bit
Built on localhost.localdomain
Thu Apr 21 18:36:37 EDT 2016
Compiler: gcc version 4.1.2 20080704 (Red Hat 4.1.2-54)
Options: -O3 -m64 -msse2 -funroll-loops -g3 -DPOPCNT_CAPABILITY
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
bowtie2 -x SVAs_and_HERVs_KWHE --very-sensitive-local -1 fastq/reads_R1.fastq -2 fastq/reads_R2.fastq > /dev/null
590 reads; of these:
590 (100.00%) were paired; of these:
553 (93.73%) aligned concordantly 0 times
11 (1.86%) aligned concordantly exactly 1 time
26 (4.41%) aligned concordantly >1 times
----
553 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
553 pairs aligned 0 times concordantly or discordantly; of these:
1106 mates make up the pairs; of these:
594 (53.71%) aligned 0 times
15 (1.36%) aligned exactly 1 time
497 (44.94%) aligned >1 times
49.66% overall alignment rate
module load bowtie2/2.2.6
The following have been reloaded with a version change:
1) bowtie2/2.2.9 => bowtie2/2.2.6
bowtie2 --version
/home/shared/cbc/software_cbc/bowtie2-2.2.6/bowtie2-align-s version 2.2.6
64-bit
Built on localhost.localdomain
Wed Jul 22 16:18:32 EDT 2015
Compiler: gcc version 4.1.2 20080704 (Red Hat 4.1.2-54)
Options: -O3 -m64 -msse2 -funroll-loops -g3 -DPOPCNT_CAPABILITY
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
bowtie2 -x SVAs_and_HERVs_KWHE --very-sensitive-local -1 fastq/reads_R1.fastq -2 fastq/reads_R2.fastq > /dev/null
590 reads; of these:
590 (100.00%) were paired; of these:
553 (93.73%) aligned concordantly 0 times
11 (1.86%) aligned concordantly exactly 1 time
26 (4.41%) aligned concordantly >1 times
----
553 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
553 pairs aligned 0 times concordantly or discordantly; of these:
1106 mates make up the pairs; of these:
594 (53.71%) aligned 0 times
15 (1.36%) aligned exactly 1 time
497 (44.94%) aligned >1 times
49.66% overall alignment rate
module load bowtie2/2.1.0
The following have been reloaded with a version change:
1) bowtie2/2.2.6 => bowtie2/2.1.0
bowtie2 --version
/home/shared/cbc/software_cbc/bowtie2-2.1.0/bowtie2-align version 2.1.0
64-bit
Built on do-dmxp-mac.win.ad.jhu.edu
Tue Feb 26 13:34:02 EST 2013
Compiler: gcc version 4.1.2 20080704 (Red Hat 4.1.2-54)
Options: -O3 -m64 -msse2 -funroll-loops -g3
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
bowtie2 -x SVAs_and_HERVs_KWHE --very-sensitive-local -1 fastq/reads_R1.fastq -2 fastq/reads_R2.fastq > /dev/null
590 reads; of these:
590 (100.00%) were paired; of these:
553 (93.73%) aligned concordantly 0 times
11 (1.86%) aligned concordantly exactly 1 time
26 (4.41%) aligned concordantly >1 times
----
553 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
553 pairs aligned 0 times concordantly or discordantly; of these:
1106 mates make up the pairs; of these:
594 (53.71%) aligned 0 times
15 (1.36%) aligned exactly 1 time
497 (44.94%) aligned >1 times
49.66% overall alignment rate
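As a sanity check on these summaries, the overall rate appears to be derived from the counts above it: each concordantly or discordantly aligned pair counts as two aligned reads, each mate aligned on its own counts as one, and the total is divided by twice the number of pairs. A small sketch of that arithmetic follows; the helper name `overall_from_counts` is made up, and the formula is inferred from the logs in this thread rather than taken from the source code.

```shell
# overall_from_counts PAIRS CONCORDANT DISCORDANT MATES_ALIGNED
# (hypothetical helper) Recompute the overall alignment rate of a paired
# run from the counts printed in the summary.
overall_from_counts() {
    awk -v n="$1" -v c="$2" -v d="$3" -v m="$4" \
        'BEGIN { printf "%.2f\n", 100 * (2 * c + 2 * d + m) / (2 * n) }'
}

overall_from_counts 590 27 0 522   # 2.3.3.1: 7+20 concordant, 16+506 mates
overall_from_counts 590 37 0 512   # 2.2.9:   11+26 concordant, 15+497 mates
```

Both calls reproduce the rates reported above (48.81 and 49.66), so most of the aligned reads in the older versions come from mates aligned individually, which is exactly what 2.3.4.1 loses.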
Hello @jakewendt,
Your inputs show differences in behavior between versions. Would it be possible to share these files so that I can recreate the issue?
Here are the reads and index.
I pushed a potential fix for the issue to the bug_fixes branch. Here's my output:
$ ./bowtie2-align-s-debug --version
Warning: Running in debug mode. Please use debug mode only for diagnosing errors, and not for typical use of Bowtie 2.
./bowtie2-align-s-debug version 2.4.1
64-bit
Built on
Fri Jan 22 16:25:31 UTC 2021
Compiler: InstalledDir: /usr/bin
Options: -O0 -g3 -msse2 -std=c++11 -DPOPCNT_CAPABILITY
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
$ ./bowtie2-align-s-debug --very-sensitive-local -x bowtie2-testing/SVAs_and_HERVs_KWHE -1 bowtie2-testing/reads_R1.fastq.gz -2 bowtie2-testing/reads_R2.fastq.gz > /dev/null
Warning: Running in debug mode. Please use debug mode only for diagnosing errors, and not for typical use of Bowtie 2.
590 reads; of these:
590 (100.00%) were paired; of these:
563 (95.42%) aligned concordantly 0 times
7 (1.19%) aligned concordantly exactly 1 time
20 (3.39%) aligned concordantly >1 times
----
563 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
563 pairs aligned 0 times concordantly or discordantly; of these:
1126 mates make up the pairs; of these:
604 (53.64%) aligned 0 times
16 (1.42%) aligned exactly 1 time
506 (44.94%) aligned >1 times
48.81% overall alignment rate
Thanks. No improvements on my side.
$ bowtie2-bug_fixes/bowtie2 -U 1.fastq,2.fastq -x GRCh38_noalt_as/GRCh38_noalt_as -p 32 --no-unal --no-sq --no-hd -S /dev/null
6 reads; of these:
6 (100.00%) were unpaired; of these:
3 (50.00%) aligned 0 times
3 (50.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
50.00% overall alignment rate
$ bowtie2-bug_fixes/bowtie2 -1 1.fastq -2 2.fastq -x GRCh38_noalt_as/GRCh38_noalt_as -p 32 --no-unal --no-sq --no-hd -S /dev/null
3 reads; of these:
3 (100.00%) were paired; of these:
3 (100.00%) aligned concordantly 0 times
0 (0.00%) aligned concordantly exactly 1 time
0 (0.00%) aligned concordantly >1 times
----
3 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
3 pairs aligned 0 times concordantly or discordantly; of these:
6 mates make up the pairs; of these:
6 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
@fplaza, this commit won't fix the issue completely. It was mostly directed at @jakewendt's issue. I should have made that clear in my last message.
Thank you.
Hello @ch4rr0 ,
Any news about the investigation of this issue?
Thanks
Ultimately going with --local -D 85 -R 5 -N 0 -L 10 -i S,1,0 for the moment. Does anyone understand how the aligning algorithm differs when run paired vs unpaired?
Hello, did you make any progress on this? I am also looking into the differences.
Upgrading to the latest version mostly resolved my issues