
Demultiplexing "N"-barcode as no-op

Open mschubert opened this issue 1 year ago • 1 comments

With https://github.com/fulcrumgenomics/fqtk/pull/30 (and release v0.3.0) fqtk allows Ns in barcodes.

I tried to run demultiplexing accepting any sequence for a sample (with a barcode containing only Ns), but all reads are written to unmatched.R1.fq.gz instead of the sample fq.

Is this intended?

mschubert avatar Jul 24 '24 12:07 mschubert

@mschubert would you be willing to share some of your data, or one FASTQ record that should have been matched to a sample along with the expected barcode?

nh13 avatar Jul 25 '24 08:07 nh13

My apologies for the late reply. The issue seems to be with FASTQ records that themselves contain Ns:

# my.fq.gz
@M00872:1070:000000000-GLPWM:1:1101:15776:1330 1:Y:0:1
AAGANNATNGNNGNNANNNTNNNAACGTAGTGCGCCAGCCTATTTCAGTGCTCAATCTTGCAGAGAATACTCTTGAGAGCG
+
AA1A##>>#>##A##A###A###ABBFFFHGGHEGGGGGGHHFHHHHHHHHHGFHHHHHHHHHHHCGHHFHHHGHHHHHHE
@M00872:1070:000000000-GLPWM:1:1101:15866:1331 1:Y:0:1
AAGANNATNGNNGNNANNNTNNNAACGTAGTGCGCATAAGCCGTTCAAGAGGAGCCATTGTGGGGAGGCCCTGGGGACTGG
+
AAAA##>>#>##A##A###A###BABFFHHGGHEEEEGHFFHGEEGHGFHEHHEHGFHHHFGFC>FCGGCEHHHGGAEFG/

# meta.tsv
sample_id  barcode
test       NNNNNNN

fqtk demux --inputs my.fq.gz --max-mismatches 0 --read-structures 7B+T --sample-metadata meta.tsv --output out
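
To make the reproduction concrete, here is a minimal Python sketch of what the read structure 7B+T selects from the first record, plus a toy mismatch counter. It is not fqtk's implementation: `split_7b_t`, `mismatches`, and the `observed_n_mismatches` switch are hypothetical names introduced only for illustration, and the two conventions for an N in the read are assumptions about how a demultiplexer might behave.

```python
def split_7b_t(bases: str, barcode_len: int = 7) -> tuple[str, str]:
    """Split a read as the read structure 7B+T describes: 7 barcode bases, then template."""
    return bases[:barcode_len], bases[barcode_len:]


def mismatches(observed: str, expected: str, observed_n_mismatches: bool) -> int:
    """Toy mismatch count (hypothetical, not fqtk's code).

    An N in the expected (sample) barcode always matches. An N in the
    observed read counts as a mismatch only if observed_n_mismatches is True.
    """
    count = 0
    for obs, exp in zip(observed, expected):
        if exp == "N":
            continue  # wildcard position in the sample barcode
        if obs == "N":
            count += int(observed_n_mismatches)
        elif obs != exp:
            count += 1
    return count


# Bases of the first record in my.fq.gz, copied from the report above.
read = "AAGANNATNGNNGNNANNNTNNNAACGTAGTGCGCCAGCCTATTTCAGTGCTCAATCTTGCAGAGAATACTCTTGAGAGCG"
barcode, template = split_7b_t(read)

print(barcode, barcode.count("N"))
print(mismatches(barcode, "NNNNNNN", observed_n_mismatches=True))
```

In this toy model the observed barcode is AAGANNA, which itself contains two Ns, and it has zero mismatches against NNNNNNN under either convention. That is consistent with the unmatched result coming from how Ns in the read are handled rather than from the wildcard comparison, although the sketch cannot confirm it.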

mschubert avatar Aug 19 '24 11:08 mschubert

Thank you @mschubert for the clear report!

nh13 avatar Aug 19 '24 22:08 nh13