kallisto
Mapping Sample Tags to Cellular Barcodes
Hi there,
I am trying to process amplicon-based (targeted panel) BD Rhapsody scRNA-seq data using kallisto/kallistobustools. Is there a way to map sample tags to cellular barcodes in order to identify which cells belong to which condition? In other words, I am using BD's 'Single-Cell Multiplex Kit - Human' for my Sample Tags, and it would be very useful to be able to provide them as an input to kb in order to identify cells.
I have now asked BD to provide us with the sequences of the sample tags used in their kit.
Thank you very much in advance!
Where are the sample tag sequences in your fastq files? Are they in the R1 file, the R2 file, or in some separate fastq file?
They are in the R2 file
Do you know what position they are in the R2 file and how long the sequences are?
Yes! I received the SampleTagSequences fasta file from BD this morning and was able to find them in R2 using the grep command.
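For anyone following along, here is a rough Python sketch of how a grep-style check could be extended to report the position at which each tag occurs in R2, which is what the position question above is asking for. The tag names and sequences are made-up placeholders, not BD's real Sample Tag sequences, and the reads are toy strings rather than a parsed FASTQ file.

```python
from collections import Counter


def tag_offsets(tags, reads):
    """For each tag, count the offsets at which it first appears in each read."""
    counts = {name: Counter() for name in tags}
    for read in reads:
        for name, seq in tags.items():
            pos = read.find(seq)  # -1 when the tag is absent from this read
            if pos != -1:
                counts[name][pos] += 1
    return counts


# Placeholder tag sequences (hypothetical, for illustration only).
example_tags = {"TagA": "ACGTACGT", "TagB": "TTGGCCAA"}

# Toy R2 sequences; real input would come from the R2 FASTQ file.
example_reads = [
    "GGGACGTACGTCCCC",
    "GGGTTGGCCAAAAAA",
    "CCCCCCCCCCCC",
]

offsets = tag_offsets(example_tags, example_reads)
for name, counter in offsets.items():
    print(name, dict(counter))
```

If most hits for every tag pile up at a single offset, that offset and the tag length are the two numbers needed to describe where the tags sit in R2.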
You could make the sample tags part of the barcodes.
The default scheme for -x BDWTA is -x 0,0,9,0,21,30,0,43,52:0,52,60:1,0,0 which means taking 9+9+9 base pair barcodes from file #0 (aka the R1 file) and an 8-bp UMI from that file, and then using the entire sequence in file #1 (aka the R2 file) as your biological reads file.
If you change that to -x 1,0,5,0,0,9,0,21,30,0,43,52:0,52,60:1,0,0 then the first 5 base pairs of your R2 file will be inserted at the beginning of each barcode (assuming the sample tags are the first 5 base pairs of the R2 file). Thus, the final barcode will be 32 bp long (5-bp tag + 27-bp barcode). Note: 32 bp is the maximum length that field can be, so don't exceed it; if your sample tags are longer than 5 bp, then maybe just use the first 5 bp of those tags.
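To sanity-check a custom -x string before running anything, the arithmetic above can be reproduced with a small Python sketch. This is not part of kallisto or kb; the helper names are hypothetical. It only assumes the layout described in this thread: three colon-separated fields (barcode, UMI, read), each a list of file,start,stop triplets, with a stop of 0 meaning "to the end of the read".

```python
def parse_technology_string(spec):
    """Split 'barcode:umi:read' into three lists of (file, start, stop) triplets."""
    parts = spec.split(":")
    if len(parts) != 3:
        raise ValueError("expected three colon-separated fields")
    fields = []
    for part in parts:
        nums = [int(x) for x in part.split(",")]
        if len(nums) % 3 != 0:
            raise ValueError("each field must be a list of file,start,stop triplets")
        fields.append([tuple(nums[i:i + 3]) for i in range(0, len(nums), 3)])
    return fields


def segment_length(segments):
    """Total length of the segments, or None if any runs to the end of the read."""
    if any(stop == 0 for _, _, stop in segments):
        return None
    return sum(stop - start for _, start, stop in segments)


def extract(segments, reads):
    """Concatenate the segments; reads[i] is the sequence from file #i."""
    out = ""
    for file_index, start, stop in segments:
        seq = reads[file_index]
        out += seq[start:] if stop == 0 else seq[start:stop]
    return out


default_spec = "0,0,9,0,21,30,0,43,52:0,52,60:1,0,0"
tagged_spec = "1,0,5,0,0,9,0,21,30,0,43,52:0,52,60:1,0,0"

default_bc, default_umi, _ = parse_technology_string(default_spec)
tagged_bc, _, _ = parse_technology_string(tagged_spec)

print(segment_length(default_bc))   # 27
print(segment_length(default_umi))  # 8
print(segment_length(tagged_bc))    # 32

# Toy read pair: R1 is 60 bp laid out as in the default scheme, R2 starts with a 5-bp tag.
r1 = "A" * 9 + "N" * 12 + "C" * 9 + "N" * 13 + "G" * 9 + "T" * 8
r2 = "ACGTA" + "G" * 20
tagged_barcode = extract(tagged_bc, [r1, r2])
print(tagged_barcode)
```

Checking that the barcode length comes out at 32 or less is a quick way to catch a tag segment that is too long for the barcode field.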