Two FASTQ files are not correctly formatted

Open alexyfyf opened this issue 1 year ago • 3 comments

Hi team,

I downloaded some cDNA FASTQ files from your S3 repo and found that two of them are not correctly formatted when I ran QC with NanoPlot.

SGNex_MCF7_cDNAStranded_replicate2_run1/SGNex_MCF7_cDNAStranded_replicate2_run1.fastq.gz
SGNex_K562_cDNAStranded_replicate3_run3/SGNex_K562_cDNAStranded_replicate3_run3.fastq.gz

The first one has additional strings before the @ character of the first read.

fastq_fail/FAK34234_679ea2e77287c6ea3bab84c69ca16d29e5d9c760_228.fastq000666 001750 001750 00010735421 13424777162 023424 0ustar00gridgrid000000 000000 @0185f0c7-c4a5-40fb-9ac2-6907653a86a5 runid=679ea2e77287c6ea3bab84c69ca16d29e5d9c760 read=46243 ch=61 start_time=2019-02-01T08:06:48Z flow_cell_id=FAK34234 protocol_group_id=010219_MCF7_mRNA_PCS109 sample_id=010219_MCF7_mRNA_PCS109
ACGGTAATACTTCGGTCTTGTTTCGACAATCGGTCGCTCAGACCGACCGTGGAAC
+
#"*%&$#%"$&"""""$&&#"""""""++*++)/+%#%##'+*$%&'%"##("&$

The second one has a read whose quality string length does not match its sequence length.

@09f55d50-803e-4048-899d-bb2fbdbf9c33 runid=446e90283984afd70d3f9af90262644290c7fca2 read=1796 ch=64 start_time=2019-01-07T07:56:26Z flow_cell_id=FAK11042 protocol_group_id=070119_K562_mRNA_PCS109 sample_id=070119_K562_mRNA_PCS109
TCGGTGATAAAGTGTTAATCGTCGG
+
%"-$&%""""""""$"""""""""
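
Both problems can be detected without NanoPlot by walking through the file record by record. Below is a minimal sketch, assuming plain four-line FASTQ records (no wrapped sequences). The function names `check_fastq` and `check_fastq_gz` are made up for illustration and are not part of NanoPlot or the SG-NEx tooling.

```python
import gzip
import io


def check_fastq(handle):
    """Yield (record_number, problem) for each malformed 4-line FASTQ record.

    Assumes every record is exactly four lines: header, sequence,
    separator, quality.
    """
    record = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        record += 1
        # Problem 1: extra text in front of the '@' on the header line.
        if not header.startswith("@"):
            yield record, "header does not start with '@'"
        if not plus.startswith("+"):
            yield record, "separator line does not start with '+'"
        # Problem 2: quality string and sequence differ in length.
        if len(seq) != len(qual):
            yield record, (
                f"sequence length {len(seq)} != quality length {len(qual)}"
            )


def check_fastq_gz(path):
    """Run check_fastq on a gzip-compressed FASTQ file and return all problems."""
    # errors="replace" keeps the scan going if stray binary bytes are present.
    with gzip.open(path, "rt", errors="replace") as handle:
        return list(check_fastq(handle))


if __name__ == "__main__":
    # Small synthetic example (not data from the SG-NEx files).
    demo = io.StringIO("junk@r1\nACGT\n+\n!!!!\n@r2\nACGT\n+\n!!!\n")
    for number, problem in check_fastq(demo):
        print(number, problem)
```

Calling `check_fastq_gz` on each downloaded `.fastq.gz` path returns a list of record numbers and descriptions, which makes it easy to report exactly which reads are affected.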

Can you confirm this? Cheers, Alex

alexyfyf, May 12 '23 06:05