SequelTools
Crash running QC mode
Hello, I'm running into some issues trying to run SequelTools for quality control:
Beginning quality control function
Running in WITH_SCRAPS mode
Traceback (most recent call last):
File "/home/amanda/programas/SequelTools/Scripts/generateReadLenStats_wScraps.py", line 93, in <module>
szData = lineLst[2].strip().strip("sz:")
IndexError: list index out of range
ERROR: Calculation of read length statistics failed!
Running in NO_SCRAPS mode
Traceback (most recent call last):
File "/home/amanda/programas/SequelTools/Scripts/generateReadLenStats_noScraps.py", line 94, in <module>
start = int(coord.split("_")[0]); stop = int(coord.split("_")[1])
ValueError: invalid literal for int() with base 10: 'ccs'
ERROR: Calculation of read length statistics failed!
I'm not sure what could be the cause. Thanks.
@J-Calvelo:
Thanks for trying SequelTools!
From the error message, it seems like you are not using the actual scraps file (the tags the script parses were missing from the file, I think?). If you are certain that you used the scraps BAM file, would it be possible for you to share a few lines of the BAM file? The output of this command, perhaps?
samtools view your.bam | head -n 1
Thanks,
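For anyone checking their own file, here is a minimal sketch of how the optional tags of one record could be listed from the text output of the command above. It assumes standard SAM text layout (11 mandatory tab-separated columns followed by `TAG:TYPE:VALUE` fields). The function name `optional_tags` and the example record are hypothetical, and this is not how SequelTools itself parses the file.

```python
def optional_tags(sam_line):
    """Return {tag: (type, value)} for the optional fields of one SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # The first 11 tab-separated columns are mandatory; the rest are TAG:TYPE:VALUE.
    for field in fields[11:]:
        parts = field.split(":", 2)
        if len(parts) == 3:
            tag, type_code, value = parts
            tags[tag] = (type_code, value)
    return tags


# Hypothetical record, shortened for illustration (not taken from a real file).
example = "\t".join([
    "movie/1/0_8", "4", "*", "0", "255", "*", "*", "0", "0",
    "ACGTACGT", "!!!!!!!!", "zm:i:1", "rq:f:0.8",
])
print(sorted(optional_tags(example)))
```

If the record carries no PacBio-style tags at all (for example no `zm` tag), the BAM was probably not produced by the instrument software.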
Hello, I figured it out a few hours after posting. It was indeed a problem with the BAM file: I had used Picard to convert from FASTQ to BAM. Sorry for the inconvenience.
Hello there, I have encountered the same error when running QC without a scraps file.
Running in NO_SCRAPS mode
Traceback (most recent call last):
File "/home/threadripper/Downloads/SequelTools/Scripts/generateReadLenStats_noScraps.py", line 94, in <module>
start = int(coord.split("_")[0]); stop = int(coord.split("_")[1])
ValueError: invalid literal for int() with base 10: 'ccs'
ERROR: Calculation of read length statistics failed!
This is the first line of my bam file
m64047_220526_230626/0/ccs 4 * 0 255 * * 00 ACACTAGATCGCGTGTTGAATTGGTGTACTCAATTTACATTTAAACACAATCAATAGTGAGGACGGATAACAACGCAATGAATTCAAGAACACCACATATATTTACAAAAGCGCTGCAGCTCGCCCCAAGCATTTGTAAATATTTGTGTTTTTTTTTTTTGCTAACCTTGGCGCCATGACAAATGACCGTCAGCTTTATATCGATCGTAAGACCGAGGAAGCAGATTACATCTACCGGGCTGATCGATCTGGTCACTGTTCATATACACTACTGGATTGTCATATTTTAGCATTTTGGCGATGACCATCCCATTAGTTTCCAGAGCCGTGGACGGTCGAGTCTCTCGGTCGAGTCTCTCGGTCGAGTCTCTCGGTCGAGTCTCTCGGTCGAGTCTCTCGGTCGAGTATCCTAACAGGCTAACACAACACTTGCAATACGTCACATTTTTTTATAAATCGACAGCTCGTTCGGAAAGCAGTGTTCGCTCTAGTCTTCTGCGCGAGTAAGAAACTTTTTAAGTTAAATTATACACAATCATAATCAGACACGAACCATTATATTAAGAGACAGCCAGGTAGATGTTGAGCTGGTTATACCTATCCTCTTAATATTCGGAGATCGTATAAAACCCTTTTTTATAAGATATTTAAATTTACTTACGGAAAAATTCTGTATATTTATAGGGACCTTAATGAACGTGCGCGCACAGAAAGCCCCATAAAATAATTTTTGCCTTTATGTAAAGGACTGACGGGAAAACATAGTTTATCTTTATTTGTAATATTGTAATTGTGTCGATAGTGATTCTATCTAAATATTCAATATTTGTTTTGACATACATTAAATTTTTATAACGACCTGTGATGCCAAGGTCAAGACTCCATCTTTCCAAATTTAAGTCTAAGAGATACTCGAGGTCTTAGCTTGTAAATTTCCCGAACGGTCCGCTTCTGTATCTATGAACATCGGTGACATGCCACCAAATCAAGGTGCCGTATGTAATCAGGCTATGGAAATAGCCAAAGTAAACTAATCTAGTAGTTTGAATATCTTGTAACCTCTAGTTGCTTGTTTTATAATATTTGAGGAAACGAAGTGAACAATTTTTATTGTCACCCCGGTAGAACTCGAGACACGATATACAAATATCAGGGCGTTGTGTAGTACTTATGACCTCGTAGACATTAAAGTTTCTATATAAGTTTAATAAGCGCAATACCTCTTAGTATTTTAAATAATTATTAAGTACACAGTATTCCTGTTAACGTTTATACAAGAATTAAAAAGGGCACCACCCGGTTTCAAAATGTGTCGAGAATAAATTGTTTTACTTAGAGCGTGCAAGCAACATGCAAAACATGCAACATCCAGACCCGCCTCTGACCCTCGCATTCGTTTCACCACTGCGAATCTCTATCCAGATTATTGGATACTGATTCGAGAAGATCCACGAATTCGAACTCCTTGAGCAAGCGTTAAGAAGATTATTTAGCCAAAATCCGCCACCAAGAAGGTGGGTTTGCATGCTGCATGTTGCTTGCACGCTCCAAGTAATGGATTACCGCAAATTAAAATATACAAATTATAATTTATATTATTTGTTTTACTTATGTAAAATTAAAAACAATGTACAATGAAAACAAGTAAAAGATTTTTTCAGACGTTTGTTATCCTTCCTGTAGAATTTGGGCGTGTCGACTTCGATTTAAGCACCGACCTTCTTTCCAAATTACTGAAATTCGGACTGTTAGTTTCGGTCTGCCTTCGAAGATTACAAGAAATCTGCATGTCAGAACAAAAGCGTGCCTCCATTTTAATGTTATATATTATAATTTCAAAACAATTCCAAATTGACAGGTTTTAATTCTTGGTGGCGGATTTTGGCTAAATAATCTTAACGCTTGCTCTAGGAGTTCGAATTTGTGGATCTTCTCAAATCAACATCCAATAATTTG
GATAGGGATTTGCAGTGGTGAAACGAATTCGAGGGTTAGAGGCGGGTTTGCATGCTGCATGTTTTGCATGTTGCTTGCTTGCTCTTCAAAACAATTCGCGAACAGATAAACAAAGCGCTTCGCGAAGTATGATTTTTATTTTAACATACGAAACCGAATTAATTTATGATAATGTCGTAAAGTGTATAGCGTCTAGGGGAGAGCGGGGCGCTGCGGGACGTCACAGATACACAAGACAAAGGGATCGTTTCGGTAAGGTCATAAAACTGTAGTGCGCGCTGGGCGGGCGGGCGACGGGGCGGGGGCGCGGAGGCGGGACGCGCTACGGCGCTTGTATTGCTTTATTGAGTGGAATATCTCCTTTCCGAGCTGACTCGCAGATTTATTATTCATTTATTTATAACTTACGCTGGCGTGTGCGCGGTGGCGCCCGGTGCACGTCGGTTCTTGGATTTTATATTTCGTAATTTTATTTTTCAGCGTTTTGTAGTATGTCTGAGCTGATGAAACCAGATGACAAATGACGTCTGCCTTGTGACGGCGCGTCGTACCTCTAAGAACCGAATGATCAGTTCAGTGTATTTTTGAATAGAGAAAATAGAAGGAAAAGTAGAAAAAATTATAAGCGAAAAATGATTATTTATTTTATAACACGTAATGAATATTATCATTACAGAACACGCGATCTAGTGT ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Q~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Z~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~m~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ RG:Z:ebc74f4b ac:B:i,59,1,60,0 ec:f:59.8636 ma:i:0 np:i:60 rq:f:1 sn:B:f,12.4602,18.3831,3.90846,7.87343 we:i:7800687 ws:i:77168 zm:i:0
I used the BAM file output in the Q20 folder given by the sequencer. May I know if this tool is suitable for PacBio HiFi datasets?
@senzei-21 It looks like you are running SequelTools on CCS/HiFi reads - unfortunately, this tool only works on subreads. Please try this on your subreads and let us know if you still have the issue.
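To illustrate why the traceback mentions 'ccs': the failing line splits the last part of the read name on "_" and converts both pieces to integers. That works for subread names ending in `start_stop`, but not for CCS names ending in `ccs`. Below is a rough sketch of a pre-check, assuming the usual PacBio naming patterns `<movie>/<zmw>/<start>_<stop>` and `<movie>/<zmw>/ccs`. The function name `classify_read_name` and the subread name in the usage lines are hypothetical, and this check is not part of SequelTools.

```python
import re

# Assumed naming patterns:
#   subread:  <movie>/<zmw>/<start>_<stop>
#   CCS/HiFi: <movie>/<zmw>/ccs
SUBREAD_NAME = re.compile(r"^[^/]+/\d+/(\d+)_(\d+)$")
CCS_NAME = re.compile(r"^[^/]+/\d+/ccs$")


def classify_read_name(name):
    """Return (kind, coords), where coords is (start, stop) for subreads, else None."""
    match = SUBREAD_NAME.match(name)
    if match:
        return "subread", (int(match.group(1)), int(match.group(2)))
    if CCS_NAME.match(name):
        return "ccs", None
    return "unknown", None


# Read name from the record posted above, plus a made-up subread name.
print(classify_read_name("m64047_220526_230626/0/ccs"))
print(classify_read_name("m64047_220526_230626/0/0_1500"))
```

Running this on the first read name of a BAM file before starting QC shows quickly whether the file contains subreads or CCS reads.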
Thanks,
I managed to run this tool on my subreads. However, my raw data doesn't include any scraps files, so I can't proceed with the filtering option.