Invalid record in bam.unproper.uniq.interval.bed
Hi,
I get this error when running "TEMP_Absence.sh":
bedtools intersect -a /cluster/project/gdc/people/crimjhim/TEPID_final.bed.sort -b merged.Ma99.bam.unproper.uniq.interval.bed -f 1.0 -wo
Error: Invalid record in file merged.Ma99.bam.unproper.uniq.interval.bed. Record is
chr1 2287009 2287008 HWI-700523F:21:C6KJ9ANXX:4:2301:14610:90784/2
Any idea why the coordinates are inverted in the BED file, and how I can fix this? I am working with paired-end Illumina sequencing data, and the average insert size is 250 bp.
Thanks, Rimjhim
Hi Rimjhim,
It's hard to know exactly what happened without knowing anything about "merged.Ma99.bam". Would you mind posting a few entries (including read "HWI-700523F:21:C6KJ9ANXX:4:2301:14610:90784/2") from the BAM file?
Jiali
Hi Jiali,
Thank you very much for your reply. I'm sorry, I should have included more details.
Here is a snippet from merged.Ma99.bam for the region chr1:2286800-2287050:
HWI-700523F:21:C6KJ9ANXX:4:1203:13109:25753 163 chr1 2286830 22 77S49M = 2286868 203 TTTGAAGCAAACAGATATGTCACCGAAAGGGCTATTAAAAGGCTCAAAAGCAGAGATAACAAACACAATGTGTCCTTAAACTTGAATCAATTTATTAACCAAGAAAGAGATCTGAATCGTAACATG BBBBBGGGFD>FEBDGDGGGGFGG/<>CDGGGGGGGGEBFCBGGGGGGGGEDGGGGGGGGGGGGGGGGGEB@GGGGGGGGGGGGEFFGGFGGGGGGGGG@FGGGBFGGGGFGGGGGGCBFEGGGGF AS:i:56 XN:i:0 XM:i:6 XO:i:0 XG:i:0 NM:i:6 MD:Z:16C2C1T14A0G6C4 YS:i:104 YT:Z:CP
HWI-700523F:21:C6KJ9ANXX:4:2205:5836:73918 145 chr1 2286830 36 45S53M28S = 2286577 -351 TATTAAAAGGCTCAAAAGCAGAGATAACAAACACAATGTGTCCTTAAACTTGAATCAATTTATTAACCAAGAAAGAGATCTGAATCGTAACATGAATGCACAAAGTACTAAAAAAATCAAGCTTTT 5DFGGGGGGG=40GGGGC@GEGGGDGGBGGGGGGGGGGBFGGF@FGGFF@GGGGD@GGFGGGGGFGCEGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGGGGGGGGGGGGGA<ABB AS:i:64 XN:i:0 XM:i:6 XO:i:0 XG:i:0 NM:i:6 MD:Z:16C2C1T14A0G6C8 YS:i:252 YT:Z:DP
HWI-D00418:56:C6KLUANXX:8:2102:8126:85711 83 chr1 2286830 22 37S64M1I23M1S = 2286699 -255 GGCTCAAAAGCAGAGATAACAAACACAATGTGTCCTTAAACTTGAATCAATTTATTAACCAAGAAAGAGATCTGAATCGTAACATGAATGCACAAAGTACTAAAAAAATCAAGCTTTTAGATTCAA FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBBBBB AS:i:103 XN:i:0 XM:i:9 XO:i:1 XG:i:1 NM:i:10 MD:Z:16C2C1T14A0G6C8A1G23C7 YS:i:68 YT:Z:CP
HWI-D00418:56:C6KLUANXX:8:2107:16542:25469 145 chr1 2286832 41 62M1I42M21S = 2286623 -313 ACTTGAATCAATTTATTAACCAAGAAAGAGATCTGAATCGTAACATGAATGCACAAAGTACTAAAAAAATCAAGCTTTTAGATTCAACAAAAGGAATCAAGTCAAACCCTAGATTGATTTACCCTA FBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBBBBB AS:i:123 XN:i:0 XM:i:11 XO:i:1 XG:i:1 NM:i:12 MD:Z:14C2C1T14A0G6C8A1G23C7G14A3 YS:i:220 YT:Z:DP
HWI-700523F:21:C6KJ9ANXX:4:2205:11059:54702 69 chr1 2286852 0 * = 2286852 0 TTTGTAAGATGATCAAAAACAGGAATATCTGAGAAGCTTGTAAACATATGAACAGTGAACTTTGAAGCAAACAGATATGTCACCAAAAGGGCTATTAAAAGGCTCAAAAGCAGAGATAACAAACAC CCCCCGGFGGGGGGEGGGFGGGGGGGGGGGG1><DGCEGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGBFG@GGGGCGGGGEGGGGGGGGGGGGGGGGGGGFGG0 YT:Z:UP
HWI-D00418:56:C6KLUANXX:8:1101:13471:25839 161 chr1 2286852 36 7S42M1I50M1D8M18S = 2287239 509 TATTAACCAAGAAAGAGATCTGAATCGTAACATGAATGCACAAAGTACTAAAAAAATCAAGCTTTTAGATTCAACAAAAGGAATCAAGTCAAACCCTAGATTGATTTACCCTAGATATGCTAAGGT BBBBBFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFF< AS:i:121 XN:i:0 XM:i:9 XO:i:2 XG:i:2 NM:i:11 MD:Z:14A0G6C8A1G23C7G14A3G7^T8 YS:i:205 YT:Z:DP
HWI-700523F:21:C6KJ9ANXX:4:2205:11059:54702 153 chr1 2286852 24 8S42M1I50M1D8M17S = 2286852 0 TTATTAACCAAGAAAGAGATCTGAATCGTAACATGAATGCACAAAGTACTAAAAAAATCAAGCTTTTAGATTCAACAAAAGGAATCAAGTCAAACCCTAGATTGATTTACCCTAGATATGCTAAGG @GGGCFEGGGCGGGGGGGGGGEC@FDGGGGGGGGGGGGGGGCGGGGGGGGGGGGGGGGGGEGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGBGGGGGGGGGGFGBBBBB AS:i:121 XN:i:0 XM:i:9 XO:i:2 XG:i:2 NM:i:11 MD:Z:14A0G6C8A1G23C7G14A3G7^T8 YT:Z:UP
HWI-700523F:21:C6KJ9ANXX:4:1203:13109:25753 83 chr1 2286868 22 1S26M1I50M1D8M40S = 2286830 -203 AATCGTAACATGAAAGCACAAAGTACTAAAAAAATCAAGCTTTTAGATTCAACAAAAGGAATCAAGTCAAACCCTAGATTGATTTACCCTAGATATGCTAAGGTTCTAATTCAAATCAGATCTAAC =GD.F@@C0B;>F<000F>FGCGGDB0C>D@FCGFFDB:0DCGGGFCGEGGGGGGGFF>GGGE<=11<DF>F>GF1@BC1CGGGGGGGGGGGCGGGF@CGEF>E@>GGGGGGGGGGGCGGGBBBBA AS:i:104 XN:i:0 XM:i:8 XO:i:2 XG:i:2 NM:i:10 MD:Z:6C6T1A1G23C7G14A3G7^T8 YS:i:56 YT:Z:CP
HWI-700523F:21:C6KJ9ANXX:4:2301:14610:90784 161 chr1 2286876 36 18M1I50M1D7M2D3M1D1M3D1M5D17M2D28M = 2287003 252 ATGAATGCACAAAGTACTAAAAAAATCAAGCTTTTAGATTCAACAAAAGGAATCAAGTCAAACCCTAGATTGATTTACCCTAGATATGCTAAGGTTCTAATTCAAATCAGATCTAACCTAATAGAA BBA=?FGG>GD@F=BDFFEGG1CGGGGFGCGCEC1FGGG>1EGGGGGGGGGGGDFGGGGGGGCBFGGDD0FFC00FGGCFGGGGDGD000=FFG@0:0FB@007CF@>@FC@F?CFG>:F4BA@C= AS:i:99 XN:i:0 XM:i:11 XO:i:7 XG:i:15 NM:i:26 MD:Z:7A1G23C7G14A3G7^T7^AG3^A1^AAA1^CAAAC2T1T12^CA2A0C0A23 YS:i:201 YT:Z:DP
HWI-700523F:21:C6KJ9ANXX:4:2211:13774:52164 97 chr1 2286886 28 2S8M1I50M1D7M2D3M1D1M3D1M5D17M2D36M = 2287159 393 ACAAAGTACTAAAAAAATCAAGCTTTTAGATTCAACAAAAGGAATCAAGTCAAACCCTAGATTGATTTACCCTAGATATGCTAAGGTTCTAATTCAAATCAGATCTAACCTAATAGAATATCCTCA @BBCC@D=EB>BCDF;ED:11EGGDG>DD1:<FEFB1EDGFGBGGCB1FFGG@FFGGDFGFEGDBDGGG@EFB1FEGG1BC>D0FG00FGGCDE0;F0E@G>CFGG0<F>@FGCD@C0CFFFGGG8 AS:i:99 XN:i:0 XM:i:10 XO:i:7 XG:i:15 NM:i:25 MD:Z:23C7G14A3G7^T7^AG3^A1^AAA1^CAAAC2T1T12^CA2A0C0A27A3 YS:i:194 YT:Z:DP
HWI-D00418:56:C6KLUANXX:8:2109:9866:47127 69 chr1 2286992 0 * = 2286992 0 ATGATCAAAAACAGGAATATCTGAGAAGCTTGTAAACATATGAACAGTGAACTTTGAAGCAAACAGATATGTCACCAAAAGGGCTATTAAAAGGCTCAAAAGCAGAGATAACAAACACAATGTGTC BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF YT:Z:UP
HWI-D00418:56:C6KLUANXX:8:2109:9866:47127 153 chr1 2286992 42 87M1I4M1I10M1D22M = 2286992 0 AAATCAGATCTAACCTAATAGAATATCCTCAAAGAAGAGATCTAAACGAAACCCTAGTCCGTGAAAACAGAGAAACAGATCGATACGAAAAGAGAGGATGAAAAGAAACTCACATCTGCCAAGCG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBBBBB AS:i:194 XN:i:0 XM:i:4 XO:i:3 XG:i:3 NM:i:7 MD:Z:27A60G7G4^C1A20 YT:Z:UP
HWI-700523F:21:C6KJ9ANXX:4:2301:14610:90784 81 chr1 2287003 36 76M1I4M1I10M1D34M = 2286876 -252 AACCTAATAGAATATCCTCAAAGAAGAGATCTAAACGAAACCCTAGTCCGTGAAAACAGAGAAACAGATCGATACGAAAAGAGAGGATGAAAAGAAACTCACATCTGCCAAGCGGAGAGGATGAAT 6:000090@0700000;0000=0800808<00C=0/=E:000=0/>E/EDC0F1@DGGGC@CF=1DE00CF@F:<GGF1:BF>G@GF1F>F11F1EGGG>CGGGGCGBGGGGGGGGGGF@CBBCB@ AS:i:201 XN:i:0 XM:i:4 XO:i:3 XG:i:3 NM:i:7 MD:Z:16A60G7G4^C1A32 YS:i:99 YT:Z:DP
HWI-D00418:56:C6KLUANXX:8:2302:14361:100328 69 chr1 2287016 0 * = 2287016 0 ACACAATGTGTCCTTAAACTTGAATCAATTTATTAACCAAGAAAGAGATCTGAATCGTAACATGAATGCACAAAGTACTAAAAAAATCAAGCTTTTAGATTCAACAAAAGGAATCAAGTCAAACCC BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF YT:Z:UP
HWI-D00418:56:C6KLUANXX:8:2302:14361:100328 153 chr1 2287016 37 63M1I4M1I10M1D46M = 2287016 0 ATCCTCAAAGAAGAGATCTAAACGAAACCCTAGTCCGTGAAAACAGAGAAACAGATCGATACGAAAAGAGAGGATGAAAAGAAACTCACATCTGCCAAGCGGAGAGGATGAATAGAGAAGCGAAG FFFFBFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBBBBB AS:i:187 XS:i:61 XN:i:0 XM:i:5 XO:i:3 XG:i:3 NM:i:8 MD:Z:3A60G7G4^C1A40A3 YT:Z:UP
HWI-700523F:21:C6KJ9ANXX:4:2206:19732:85187 133 chr1 2287024 0 * = 2287024 0 AAACAGATATGTCACCAAAAGGGCTATTAAAAGGCTCAAAAGCAGAGATAACAAACACAATGTGTCCTTAAACTTGAATCAATTTATTAACCAAGAAAGAGATCTGAATCGTAACATGAATGCACA ?AA@BBGGGGGG>GGGGGG>1FDGGGGG1FGGGGGGG1FGGGFGGGGGGGGGGGBGGGGBC@FGGGGGGGGGGGGGGGCGGGGGGGGGGGGEDGBGGGGGEGF0FFGCF@FFFGGGCG0CGGGEGF YT:Z:UP
HWI-700523F:21:C6KJ9ANXX:4:2206:19732:85187 89 chr1 2287024 25 55M1I4M1I10M1D55M = 2287024 0 AGAAGAGATCTAAACGAAACCCTAGTCCGTGAAAACAGAGAAACAGATCGATACGAAAAGAGAGGATGAAAAGAAACTCACATCTGCCAAGCGGAGAGGATGAATAGAGAAGCGAAGAGAACTCTT GFD@9C<<008CC0.8C>FC;0DE9/F=00CGGGGDF0D@GGGFE@GEGGFGGGGGGGFEEGDFFGGGGGGGGGFEF@GGEF:>>GBGGFBGGGGGGEGGGCGGFFEF;GGDGFGFC1CGDA=CCB AS:i:196 XS:i:80 XN:i:0 XM:i:4 XO:i:3 XG:i:3 NM:i:7 MD:Z:56G7G4^C1A40A12 YT:Z:UP
HWI-D00418:56:C6KLUANXX:8:1308:13410:70628 97 chr1 2287031 21 48M1I4M1I10M1D62M = 2287182 0 ATCTAAACGAAACCCTAGTCCGTGAAAACAGAGAAACAGATCGATACGAAAAGAGAGGATGAAAAGAAACTCACATCTGCCAAGCGGAGAGGATGAATAGAGAAGCGAAGAGAACTCTTCCAAGAA BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF AS:i:189 XS:i:93 XN:i:0 XM:i:5 XO:i:3 XG:i:3 NM:i:8 MD:Z:49G7G4^C1A40A14G4 YT:Z:UP
The two records for read HWI-700523F:21:C6KJ9ANXX:4:2301:14610:90784 (the read and its mate) are:
HWI-700523F:21:C6KJ9ANXX:4:2301:14610:90784 161 chr1 2286876 36 18M1I50M1D7M2D3M1D1M3D1M5D17M2D28M = 2287003 252 ATGAATGCACAAAGTACTAAAAAAATCAAGCTTTTAGATTCAACAAAAGGAATCAAGTCAAACCCTAGATTGATTTACCCTAGATATGCTAAGGTTCTAATTCAAATCAGATCTAACCTAATAGAA BBA=?FGG>GD@F=BDFFEGG1CGGGGFGCGCEC1FGGG>1EGGGGGGGGGGGDFGGGGGGGCBFGGDD0FFC00FGGCFGGGGDGD000=FFG@0:0FB@007CF@>@FC@F?CFG>:F4BA@C= AS:i:99 XN:i:0 XM:i:11 XO:i:7 XG:i:15 NM:i:26 MD:Z:7A1G23C7G14A3G7^T7^AG3^A1^AAA1^CAAAC2T1T12^CA2A0C0A23 YS:i:201 YT:Z:DP
HWI-700523F:21:C6KJ9ANXX:4:2301:14610:90784 81 chr1 2287003 36 76M1I4M1I10M1D34M = 2286876 -252 AACCTAATAGAATATCCTCAAAGAAGAGATCTAAACGAAACCCTAGTCCGTGAAAACAGAGAAACAGATCGATACGAAAAGAGAGGATGAAAAGAAACTCACATCTGCCAAGCGGAGAGGATGAAT 6:000090@0700000;0000=0800808<00C=0/=E:000=0/>E/EDC0F1@DGGGC@CF=1DE00CF@F:<GGF1:BF>G@GF1F>F11F1EGGG>CGGGGCGBGGGGGGGGGGF@CBBCB@ AS:i:201 XN:i:0 XM:i:4 XO:i:3 XG:i:3 NM:i:7 MD:Z:16A60G7G4^C1A32 YS:i:99 YT:Z:DP
Please let me know if you need more lines.
Thanks,
Rimjhim
Rimjhim,
This happens when the reads are long relative to the fragment, so that the two mates of a pair actually overlap on the reference.
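To illustrate the overlap for the pair posted above, here is a minimal Python sketch (not TEMP's own code) that computes where the first mate ends on the reference from its POS and CIGAR, and compares that with the start of the second mate. The function name `reference_end` is made up for this example; the positions and CIGAR string are taken from the two SAM records in this thread.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")

def reference_end(pos, cigar):
    """Return the 1-based, inclusive reference end of an alignment."""
    span = sum(
        int(length)
        for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
        if op in REF_CONSUMING
    )
    return pos + span - 1

# First mate (flag 161) and the start of its mate (flag 81) from the snippet.
first_end = reference_end(2286876, "18M1I50M1D7M2D3M1D1M3D1M5D17M2D28M")
mate_start = 2287003

# The first mate ends after the second mate starts, so the mates overlap.
print(first_end, mate_start, first_end >= mate_start)
# prints: 2287014 2287003 True
```

Because the first mate ends at 2287014 and the second starts at 2287003, any interval defined as the gap between the two mates would have a start greater than its end, which is the kind of record bedtools rejects.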
I've modified the TEMP code so that this case should now be handled. Let me know if it still doesn't work.
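As a quick check before rerunning bedtools, a small Python sketch like the one below can list any BED records whose start is greater than their end. This is only an illustration and not part of TEMP: the helper name `inverted_records` is made up, and the second sample record (`some_read/1`) is a hypothetical well-formed line added for contrast.

```python
def inverted_records(lines):
    """Yield (line_number, record) for BED records whose start exceeds their end."""
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        # Skip blank, comment, and header lines, which carry no coordinates.
        if len(fields) < 3 or line.startswith(("#", "track", "browser")):
            continue
        start, end = int(fields[1]), int(fields[2])
        if start > end:
            yield number, line.rstrip("\n")

sample = [
    # The record reported in the bedtools error message.
    "chr1\t2287009\t2287008\tHWI-700523F:21:C6KJ9ANXX:4:2301:14610:90784/2\n",
    # Hypothetical well-formed record, for contrast.
    "chr1\t100\t200\tsome_read/1\n",
]

for number, record in inverted_records(sample):
    print(number, record)
```

To run it on a real file, pass an open file handle instead of `sample`, for example the handle returned by `open("merged.Ma99.bam.unproper.uniq.interval.bed")`. If nothing is printed, the file contains no inverted intervals.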
Jiali