fastQValidator
Returning: 6 : FASTQ_NO_SEQUENCE_ERROR is non-obvious
When validating one of my files, I get this error:
    [ec2-user@master scratch]$ $FASTQVALIDATOR --disableSeqIDCheck --file sample_2.clipped.fastq
    Finished processing M2GEN-BMS-0715-11192013-LU-FT_L6.D706_2.clipped.fastq with 230767016 lines containing 57691754 sequences.
    There were a total of 0 errors.
    Returning: 6 : FASTQ_NO_SEQUENCE_ERROR
The "Finished processing" line indicates everything is okay, but I get a return value of 6, with no indication as to what the problem is.
If I do a 'wc -l' on the fastq file, the line count is correct and matches its mate.
It appears the last sequence in my file is a partial sequence. This causes lines 354-358 of FastQFile.cpp in libStatGen to return FASTQ_NO_SEQUENCE_ERROR.
Since valid is true and it's at the end of the file, I think it is safe to assume that:
- either the last sequence is a partial sequence, or
- there is no EOF at the end of the quality string.
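To tell these two cases apart without the validator, a quick check of the end of the file may help. The following is a minimal sketch, not libStatGen's logic: the function name `inspect_fastq_end` is hypothetical, and it assumes a plain (uncompressed) FASTQ file with exactly four lines per record.

```python
def inspect_fastq_end(path):
    """Summarize how a FASTQ file ends (hypothetical helper, not part of libStatGen).

    Assumes an uncompressed file with four lines per record.
    """
    line_count = 0
    last_line = b""
    # Stream in binary mode so very large files are never held in memory.
    with open(path, "rb") as handle:
        for line in handle:
            line_count += 1
            last_line = line
    return {
        # Unlike `wc -l`, this also counts a final line with no newline.
        "line_count": line_count,
        # A partial last record leaves a remainder when dividing by 4.
        "complete_records": line_count % 4 == 0,
        # False if the quality string is not newline-terminated.
        "ends_with_newline": last_line.endswith(b"\n"),
    }
```

Calling `inspect_fastq_end("sample_2.clipped.fastq")` returns a small dictionary; a `False` in either of the last two fields points at one of the two cases listed above.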
I'm not convinced there should be an error here, or perhaps there should be an option to report this type of error.
Sorry for the delayed response. I was able to recreate the "error messages" by putting a blank line at the end of the fastq file. I have updated the handling of that situation so that it no longer returns an ERROR message and just returns success. Multiple blank lines at the end of the file will still result in an error message, but will also indicate "sequence too short", etc. (which is not what you were seeing). The update is found in the libStatGen master branch.
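For anyone who wants to check whether a file falls into the blank-line case before re-running the validator, here is a rough sketch that counts blank lines at the end of a file. The helper name `count_trailing_blank_lines` is hypothetical, and treating whitespace-only lines as blank is an assumption of this sketch, not necessarily how libStatGen decides.

```python
def count_trailing_blank_lines(path):
    """Count consecutive blank lines at the end of a file (hypothetical helper).

    Assumption: a line containing only whitespace counts as blank.
    """
    trailing = 0
    # Stream in binary mode; reset the counter whenever a non-blank line appears.
    with open(path, "rb") as handle:
        for line in handle:
            if line.strip():
                trailing = 0
            else:
                trailing += 1
    return trailing
```

Going by the description above, a result of 1 corresponds to the single trailing blank line that now returns success, while a result of 2 or more corresponds to the case that still reports an error.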