
Wrong Illumina TruSeq adapter sequences in contaminant_list.txt?

Open aushev opened this issue 4 years ago • 0 comments

I have the impression that the Illumina TruSeq adapter sequences from Index 13 onward are wrong. I checked a couple of sources, including the Illumina website, and from Index 13 onward their sequences differ from those in contaminant_list.txt, for example:

GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAAC  TCTCGTATGCCGTCTTCTGCTTG    # Index 13, current contaminant_list.txt
GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTATGCCGTCTTCTGCTTG    # Index 13, Illumina file

GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTTCCG  TCTCGTATGCCGTCTTCTGCTTG    # Index 14, current contaminant_list.txt
GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTTCCGTATCTCGTATGCCGTCTTCTGCTTG    # Index 14, Illumina file

(I have added spaces to make the alignment easier to see.) Just after the 6-letter barcode, the Illumina sequences contain two extra bases that are missing from the current list. Additionally, for Index 23 (https://github.com/s-andrews/FastQC/blob/45c9977c69deb0341e086cceec49145211c4dcb0/Configuration/contaminant_list.txt#L114), the barcode itself differs: CCACTC in the current contaminant list versus GAGTGG in Illumina's list.
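The comparison above can be reproduced with a short Python sketch. It is only an illustration: the helper names `find_insertion` and `reverse_complement` are made up for this example and are not part of FastQC, and the sequences are the four quoted above with the alignment spaces removed.

```python
# Sequences quoted in this issue, alignment spaces removed.
# Each pair is (current contaminant_list.txt, Illumina file).
PAIRS = {
    "Index 13": (
        "GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAAC" "TCTCGTATGCCGTCTTCTGCTTG",
        "GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTATGCCGTCTTCTGCTTG",
    ),
    "Index 14": (
        "GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTTCCG" "TCTCGTATGCCGTCTTCTGCTTG",
        "GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTTCCGTATCTCGTATGCCGTCTTCTGCTTG",
    ),
}


def find_insertion(short, long):
    """Return (offset, inserted) if `long` equals `short` with one contiguous
    run of extra bases inserted, else None. The leftmost offset is reported
    when several placements are equally valid."""
    extra = len(long) - len(short)
    if extra <= 0:
        return None
    for pos in range(len(short) + 1):
        if long[:pos] == short[:pos] and long[pos + extra:] == short[pos:]:
            return pos, long[pos:pos + extra]
    return None


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


for name, (current, illumina) in PAIRS.items():
    print(name, find_insertion(current, illumina))

# Index 23 barcodes as quoted above: current list vs. Illumina's list.
print("Index 23:", reverse_complement("CCACTC"), "vs GAGTGG")
```

For both pairs the helper reports a two-base insertion at 0-based offset 40 ("AA" for Index 13, "TA" for Index 14), and everything else in the sequences matches. The last line shows that GAGTGG is the reverse complement of CCACTC, which would be consistent with the two lists recording the Index 23 barcode in opposite orientations, although that is only a guess.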

Overall, it might be helpful to indicate the source of each block of sequences in a comment line placed before that block.

aushev, Jun 30 '20 06:06