
Finder on PacBio data

Open WietseHR opened this issue 11 months ago • 3 comments

Hello, I am currently trying to run Finder on three whole-genome samples:

  1. Sequenced with Illumina HiSeq X Ten
  2. Sequenced with Illumina NovaSeq 6000
  3. Sequenced with PacBio SMRT

Samples 1 and 2 are running fine so far, but sample 3 fails in STAR with the following error:

EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length
@SRR12124361.1
GCGTCGGATAAGCCTGTCATAAGTCATAAATTACACAATACACATCAGCCATTTTGGAAGACCCGATGATTGGTTTGTTTGACCATACCATCTTCATCGCGGAAGATCTCCATCATCGCATGTCCCAACCAAAATTCCGATCCTCCGGCAACCTCGTGTAGCCCCCTCTTGGAATAAAACCTAGTTACAGGAGAAGCGGCCGGCATGGTCCATTTCCGATCAAAGCTCACCGCTCTCACATGGACGGGAATATCGCAGTGTTCCGGTTTGCCTGTATATAGCTTCTGTTATGTAGCGGTAACTGTGAGGGAAATGTCGCATGACGATATAACGAAAGCTTACCTTGCCTTACGCGAAGGGGTAGTGTGCGAGACTGTGAAGGTAGGCTGACGTGGACTACGCCAAGTAGCCATCGATAGCGACAGCCCATGTATATAGGTATAAACTAAGCCATATTACTATATCCAATCTCGCGTTGAACATCTTGGTGAGCGAAATGAGTCTTCCGCCGTACATAATGGGATGTCAGCGAGAGTCATCTGTGCGAGAGCACAGGGTAAAATCTCCAAGCCAAATAGGAATACATTTTGTTACAGGGATCAGACGTCGTCCTTCACTTCGGGGGGACAAAACCAGTCCTGTGAGGCAAA
SOLUTION: fix your fastq file

Jul 12 09:43:06 ...... FATAL ERROR, exiting
Segmentation fault (core dumped)

When I look up this read ID in the FASTQ file, the sequence and the quality string have the same length: 1979. I think the problem is related to the long reads from PacBio sequencing (the sequence shown in the error is only a small part of the original read). Is there a workaround that lets Finder work with long-read data? Thanks in advance!
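For anyone who wants to repeat the length check over the whole file rather than a single read, here is a minimal Python sketch. It assumes standard four-line FASTQ records (no wrapped sequence lines) and an optional `.gz` extension for compressed input. The function names and the `max_len` threshold are hypothetical; they are not part of Finder or STAR.

```python
import gzip
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def check_fastq(path, max_len=None):
    """Scan a four-line-per-record FASTQ file.

    Returns (n_records, mismatched_ids, longest_read, n_over_max_len).
    max_len is a user-chosen threshold (an assumption, not a tool setting).
    """
    n_records = 0
    mismatched = []   # read IDs whose sequence and quality lengths differ
    longest = 0       # length of the longest sequence seen
    n_over = 0        # number of reads longer than max_len
    with open_fastq(path) as handle:
        while True:
            record = [line.rstrip("\n") for line in islice(handle, 4)]
            if not record or not any(record):
                break  # end of file (or only trailing blank lines)
            if len(record) < 4:
                raise ValueError(f"truncated record after {n_records} records")
            header, seq, _plus, qual = record
            n_records += 1
            if len(seq) != len(qual):
                mismatched.append(header.split()[0].lstrip("@"))
            longest = max(longest, len(seq))
            if max_len is not None and len(seq) > max_len:
                n_over += 1
    return n_records, mismatched, longest, n_over
```

Calling `check_fastq("sample3.fastq")` (a placeholder file name) returns the record count, the IDs of any reads with mismatched lengths, and the longest read length. An empty mismatch list alongside a large longest-read value would suggest the FASTQ itself is well-formed and the failure is tied to read length.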

WietseHR, Jul 13 '23 09:07