paired end read name problems

Open: dpellow opened this issue 4 years ago • 1 comment

With the latest version of bwa mem (0.7.17), it seems that paired-end reads split across two files must be named like read1/1 and read1/2 (using the "/" character), whereas an older version I used (0.7.5) also accepted names like read1_1 and read1_2, or read1.1 and read1.2.

What is the most recent version that still supports these other naming conventions? Is there a way to keep bwa mem from erroring out on reads named this way?
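One possible workaround, if the only difference is the mate suffix, is to rewrite the read names before alignment. The sketch below is a minimal illustration in Python, not part of bwa: `normalize_mate_suffix` and `normalize_header` are hypothetical helper names, and the code assumes the mate number is a single trailing `1` or `2` preceded by `_` or `.`. Apply it only when you know the files follow that convention, since some read names legitimately end in `.1` (SRA-style names such as `SRR19880797.1`, for example) without that being a mate suffix.

```python
import re

# Hypothetical helper, not a bwa feature: matches a trailing "_1", "_2",
# ".1" or ".2" at the very end of a read name.
_MATE_SUFFIX = re.compile(r"[._]([12])$")


def normalize_mate_suffix(read_name: str) -> str:
    """Rewrite a trailing _1/_2 or .1/.2 as /1 or /2; leave other names unchanged."""
    return _MATE_SUFFIX.sub(r"/\1", read_name)


def normalize_header(header_line: str) -> str:
    """Normalize the read name in a FASTQ header line.

    Anything after the first space (the comment) is kept as-is.
    """
    body = header_line.rstrip("\n")
    if not body.startswith("@"):
        raise ValueError("not a FASTQ header line: " + repr(header_line))
    name, sep, comment = body[1:].partition(" ")
    return "@" + normalize_mate_suffix(name) + sep + comment
```

For example, `normalize_header("@read1_2 extra")` returns `"@read1/2 extra"`, while a name that already ends in `/1` is returned unchanged.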

dpellow, Jul 15 '20 11:07

I am running into this error as well.

less SRR19880797_sort.1.fastp.fastq.gz

(reference)
@SRR19880797.5023185 5023185/1
CTGTGGCCCTGTGCCAAACCTGGAGCAGCTGCCTTTAGAGGCCAGGAGGGCTACTTCCCGTTTCCTGAGCACTGTCCCTCTGTCTGCAGGAGTGCTGCTG
+
FF@FFFFFFFFGFFFFFFFGAFFEFFFFFFGFFFFGCFFGFEFGGF<FF>>GGFFFFGGFFGFFFGGBGGFGGGFFGGFFGFFFBGFGFFFFFDFFFGFC
(error)
@SRR19880797.5023186 5023186/1
CTGGGAAGAAGCACAGACCACCAGGCCCCCTGTTCTCCTCCTCAGATCCCCTTCCTGCCACCTCTTCCCATTCCCAGGACTCAGCCCAGGTCACCTCGCT
+
GGFFFFFFFFFFGFFGEFFGGFFFGGFFFFFGGG>FFGFGEGGEEFGFGGFGGFFGGGFEFGFFEFFGFGFGFGGFFFFFFD>EFFGGDFFGG@GFA;FF
(reference)
@SRR19880797.5023187 5023187/1
AGGACACGGTACAAAAGGGCAGCCAGGCAGGGTTGGAAGGTGGGGTCTGAGGGGTTTCCACCTGCCCTCTCCCATCCTTCCAGGTTTTGGCGGCAGATGG
+
F?FFFFGFF/FGFGFF>FFFFFFFFFFFFFFFFFFFFEFFFFEFFD@FFCFFEFFFFGGFFFFDFDFFFGFFFFFFFFFEFFGFBFGFFECFF:DFBFFF

less SRR19880797_sort.2.fastp.fastq.gz

(reference)
@SRR19880797.5023185 5023185/2
GGCTGGCCCAGCGCCAGCGTCGGAGCGCCGGCCCCCTCCCCGGGCCGCCCCCACCCAACCAGACCCTCCAGCGCGTGCCACCGGACCTCGTGTCCTAGAC
+
)<;7@CDB1B:AA=DCB3AE;D?>C:5=61?469@19+9*7&&>A'@4;)9&8?>&8E3>76*='(BB,>&<&2EC'4;?=9.4>+5
(error)
@SRR19880797.5023186 5023186/2
TCCTTGAACACAGCAGGGTTGGAGGCCATGAGGCTCTGGGCCTCCGTGAAGCTGAGCTGCACAGGGTAGTAGCCGCCATTGAACGGGTTGTGGCAGGATG
+
FFFFFGDFFGFFEFFFFFFEFF@FFFFFEFFFFFFFFFFFFFGFDGFGF;FFGGEGFFGEFFFFFFFF>FFFFFFEFFFFGFFDFFG@FF<FFFDF=@FG
(reference)
@SRR19880797.5023187 5023187/2
AAATTCCACAAGAGGGTCATTAAGTGTGATAGTGGAAATGCCCTAACCTCCACCCTTACTTCTCAAATATTCTAGCTATTGGAGATAAAGTACCATATAC
+
GFFFFFGFF?FGFFFFFFFGFGFGFGFFFFFGFGFFFFFGFFFGFFFFFF>GFFFFFFFFFFFGGFFFGFFFFCEFFGGFGFFFFFFFFFFFFGFGGFFF

Check of SRR19880797:

[mem_sam_pe] paired reads have different names: "SRR19880797.5358018", "SRR19880797.10839728"
[E::sam_parse1] CIGAR and query sequence are of different length
[W::sam_read1] Parse error at line 9982028
[main_samview] truncated file.
Mapping failed
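The "paired reads have different names" message reports two names that bwa mem saw at the same position in the two inputs, so one way to narrow this down is to check whether the two FASTQ files list their reads in the same order. The Python sketch below is an illustration only: the function names are hypothetical, it assumes strict four-line FASTQ records with no blank lines, and its name trimming (first whitespace-separated token, minus a trailing `/1` or `/2`) is an approximation of how an aligner compares names, not bwa's actual code.

```python
import gzip
from itertools import zip_longest
from typing import Iterable, Iterator, Optional, Tuple


def base_name(header: str) -> str:
    """Read name without '@', without the comment, and without a trailing /1 or /2."""
    fields = header[1:].split()
    name = fields[0] if fields else ""
    if len(name) > 2 and name[-2] == "/" and name[-1] in "12":
        name = name[:-2]
    return name


def read_names(lines: Iterable[str]) -> Iterator[str]:
    """Yield the base name from every fourth line (assumes 4-line FASTQ records)."""
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield base_name(line)


def first_mismatch(
    lines1: Iterable[str], lines2: Iterable[str]
) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (record number, name1, name2) for the first pair that differs, else None.

    A name of None means that file ran out of records first.
    """
    pairs = zip_longest(read_names(lines1), read_names(lines2))
    for record_number, (a, b) in enumerate(pairs, start=1):
        if a != b:
            return record_number, a, b
    return None


def check_pair(path1: str, path2: str):
    """Compare two gzipped FASTQ files record by record."""
    with gzip.open(path1, "rt") as f1, gzip.open(path2, "rt") as f2:
        return first_mismatch(f1, f2)
```

Calling `check_pair("SRR19880797_sort.1.fastp.fastq.gz", "SRR19880797_sort.2.fastp.fastq.gz")` would return `None` if every record pairs up by name, or the position and names of the first pair that does not.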

samtools view -h ./SRR19880797/SRR19880797.bam |less +9982028 -SN

(reference) line 3370
SRR19880797.1 65 chr8 143932417 60 100M chr22 20819568 0 TGGCGGTCATGTTGGTGTTGCGGTCGCTCCAGTCGAAGCCCACCTCCTCCTCCTCCTTCTCATTCAGCCACATTAGCTCCTTAGTGGCGGTTGCCACAAA FFFGFCFFGFFFGFFFFFFFFFFFGGFFGFFFEFFDFGFFFFFFGFFFFFGFGGFGGFF@GGGFGFFFFFBFFGAFFGFFFGGDAGFGDG+@D9@/?=59 NM:i:1 MD:Z:90C9 AS:i:95 XS:i:23 RG:Z:SRR19880797
(reference) line 3371
SRR19880797.1 129 chr22 20819568 60 100M chr8 143932417 0 AGAGGGATTTTCTTCGCAGGGGAGCTTAACAGGGTCTTTCTCCTCTGCTCTTTCCCCAGTAGCCCAGGCCCACCTGAGAGATGCTGGACACACTGCTGGT GFDFFFFFF;FFF9FFFFGFFFBFFFFGFFFFFFFFFEFFFFFFFFFFFFFFFFFFFFFFEFFFFFFFFF>FFFGFFFEFFFFFFFF@FFDFFFFEECG: NM:i:0 MD:Z:100 AS:i:100 XS:i:20 RG:Z:SRR19880797

(line before the error) line 9982027
SRR19880797.5023186 81 chr22 22643043 0 100M chr3 126500710 0 AGCGAGGTGACCTGGGCTGAGTCCTGGGAATGGGAAGAGGTGGCAGGAAGGGGATCTGAGGAGGAGAACAGGGGGCCTGGTGGTCTGTGCTTCTTCCCAG FF;AFG@GGFFDGGFFE>DFFFFFFGGFGFGFGFFEFFGFEFGGGFFGGFGGFGFEEGGEGFGFF>GGGFFFFFGGFFFGGFFEGFFGFFFFFFFFFFGG NM:i:0 MD:Z:100 AS:i:100 XS:i:100 RG:Z:SRR19880797
(error line) line 9982028
SRR19880797.5023186 161 chr3 126500710 60 100M chr22 22643043 0 TCCTTGAACACAGCAGGGTTGGAGGCCATGAGGCTCTGGGCCTCCGTGAAGCTGAGCTGCACAGGGTAGTAGCCGCCATTGAACGGGTTGTGGCAGGATG FFFFFGDFFGFFEFFFFFFEFF@FFFFFEFFFFFFFFFFFFFGFDGFGF;FFGGEGFFGEFFFFFFFF>FFFFFFEFFFFGFFDFFG@FF<FFFDF=@FG NM:i:0 MD:Z:100 AS:i:100 XS:i:0 RG:Z:SRR19880797
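To see which record actually triggers "CIGAR and query sequence are of different length", it can help to run the same consistency check by hand on the raw SAM text. The sketch below is a rough Python illustration, not samtools code: `cigar_query_length` and `check_sam_record` are hypothetical names, and the input is assumed to be a tab-separated SAM alignment line. It relies on the SAM rule that only the M, I, S, = and X operations consume query bases.

```python
import re
from typing import List

_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
_QUERY_CONSUMING = set("MIS=X")  # CIGAR operations that consume query bases


def cigar_query_length(cigar: str) -> int:
    """Number of query bases implied by a CIGAR string."""
    return sum(int(n) for n, op in _CIGAR_OP.findall(cigar) if op in _QUERY_CONSUMING)


def check_sam_record(line: str) -> List[str]:
    """Return a list of length inconsistencies found in one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return ["expected at least 11 tab-separated fields, found %d" % len(fields)]
    cigar, seq, qual = fields[5], fields[9], fields[10]
    problems = []
    # "*" means the field is not stored, so there is nothing to compare.
    if cigar != "*" and seq != "*" and cigar_query_length(cigar) != len(seq):
        problems.append(
            "CIGAR implies %d bases but SEQ has %d" % (cigar_query_length(cigar), len(seq))
        )
    if qual != "*" and seq != "*" and len(qual) != len(seq):
        problems.append("QUAL has %d characters but SEQ has %d" % (len(qual), len(seq)))
    return problems
```

Running `check_sam_record` over each non-header line of the SAM text written by bwa mem, and printing the line number whenever the returned list is non-empty, would point at the records whose lengths disagree.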

jingydz, Mar 16 '23 07:03