CITE-seq-Count
CITE-seq-Count 100% unmapped
Hello,
Thank you for developing such an innovative package! I've been trying to run CITE-seq-Count
on my 10X V3 data, but keep on getting 100% unmapped returned.
This is the command I used:
CITE-seq-Count -R1 d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L003_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L004_R1_001.fastq.gz -R2 d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L003_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L004_R2_001.fastq.gz -t tags.csv -cbf 1 -cbl 16 -umif 17 -umil 28 -cells 5000 -o cite_seq_results/ --start-trim 10
I chose --start-trim 10 based on grepping our antibody tags in the reads (screenshot attached).
I have also tried --start-trim 0 and 1, as well as --sliding-window, and they all return 100% unmapped. I've attached the run_report.yaml below.
Date: 2021-10-19
Running time: 7.0 hours, 19.0 minutes, 16.43 seconds
CITE-seq-Count Version: 1.4.3
Reads processed: 259436768
Percentage mapped: 0
Percentage unmapped: 100
Uncorrected cells: 3266
Correction:
Cell barcodes collapsing threshold: 1
Cell barcodes corrected: 217758
UMI collapsing threshold: 2
UMIs corrected: 8244869
Run parameters:
Read1_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R1_001.fa$
Read2_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R2_001.fa$
Cell barcode:
First position: 1
Last position: 16
UMI barcode:
First position: 17
Last position: 28
Expected cells: 5000
Tags max errors: 2
Start trim: 1
Do you have any idea what I can do to fix this? Thanks so much for your help!
Hi,
As far as I can see in the screenshot attached, --start-trim should equal 10, as in --start-trim 10, since your tag sequence starts at the 11th base.
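To double-check the trim value without relying on a screenshot, something like the following could tally where the tag sequences actually start in R2. This is only a rough sketch: `tag_offsets` is a hypothetical helper (not part of CITE-seq-Count), and the reads and tag below are made up for illustration. The 0-based offset it reports is the number of bases that would need trimming, so a tag starting at the 11th base shows up as offset 10.

```python
from collections import Counter


def tag_offsets(reads, tags):
    """Tally the 0-based position at which a known tag first occurs in each read.

    Hypothetical helper for illustration; exact matching only, no mismatches.
    """
    offsets = Counter()
    for read in reads:
        for tag in tags:
            pos = read.find(tag)
            if pos != -1:
                offsets[pos] += 1
                break  # count each read at most once
    return offsets


# Made-up tag and reads, NOT taken from the data in this thread.
example_tag = "ACGTACGTACGTACG"
example_reads = [
    "NNNNNNNNNN" + example_tag + "GGGG",   # tag starts at the 11th base
    "TTTTTTTTTT" + example_tag + "AAAA",   # tag starts at the 11th base
    "CCCCCCCCC" + example_tag + "AAAAA",   # tag starts at the 10th base
]
offsets = tag_offsets(example_reads, [example_tag])
most_common_offset = offsets.most_common(1)[0][0]
```

If most reads agree on one offset, that offset is the candidate for --start-trim. If very few reads contain any tag at all, the problem is more likely the tag list than the trim.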
(1) Do you mind posting your tag.csv file?
(2) Also, did you try to run CITE-seq-Count on just a single fastq file (e.g. L001 only) or a merged fastq file instead of comma separation? Just trying to eliminate potential issues.
(3) Ideally, if you have a whitelist from cellranger it really helps to run CITE-seq-Count with that so it can assign the antibody barcodes/tags to the relevant cells.
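For point (2), one possible way to build the merged file is to concatenate the per-lane gzipped fastq files byte-for-byte, since back-to-back gzip members still form a valid gzip stream. A minimal sketch, assuming all lanes are gzipped and are passed in the same order for R1 and R2; `merge_fastq_gz` is a hypothetical helper name, not something CITE-seq-Count provides.

```python
import shutil


def merge_fastq_gz(lane_paths, merged_path):
    """Concatenate gzipped FASTQ files into one gzipped FASTQ.

    The compressed bytes are copied unchanged, so nothing is decompressed
    or re-compressed along the way.
    """
    with open(merged_path, "wb") as out:
        for path in lane_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
```

Run it once with the R1 lane files and once with the R2 lane files, keeping the lane order identical so that read pairs stay aligned between the two merged files.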
Thank you for your reply.
- I've attached tags.csv here.
- I tried re-running the command (this time including --whitelist 3M-february-2018.txt and --start-trim 10) on merged fastq files and still get 100% unmapped. run_report.yaml:
Date: 2021-10-31
Running time: 19.0 hours, 42.0 minutes, 7.002 seconds
CITE-seq-Count Version: 1.4.3
Reads processed: 259436768
Percentage mapped: 0
Percentage unmapped: 100
Uncorrected cells: 3246
Correction:
Cell barcodes collapsing threshold: 1
Cell barcodes corrected: 282864
UMI collapsing threshold: 2
UMIs corrected: 8391407
Run parameters:
Read1_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_R1.fastq.gz
Read2_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_R2.fastq.gz
Cell barcode:
First position: 1
Last position: 16
UMI barcode:
First position: 17
Last position: 28
Expected cells: 5000
Tags max errors: 2
Start trim: 10
I also tried on just one file. Still 100% unmapped. run_report.yaml:
Date: 2021-10-27
Running time: 1.0 hour, 28.0 minutes, 24.34 seconds
CITE-seq-Count Version: 1.4.3
Reads processed: 65028569
Percentage mapped: 0
Percentage unmapped: 100
Uncorrected cells: 584
Correction:
Cell barcodes collapsing threshold: 1
Cell barcodes corrected: 41861
UMI collapsing threshold: 2
UMIs corrected: 3376189
Run parameters:
Read1_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz
Read2_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz
Cell barcode:
First position: 1
Last position: 16
UMI barcode:
First position: 17
Last position: 28
Expected cells: 1000
Tags max errors: 2
Start trim: 10
Do you know what could be going on? Thanks so much!
The fact that you still get 100% unmapped after the trim change is odd. It seems like that should have fixed it.
Try a run without the whitelist, on only the first 10000 reads. The trim is 10 for sure.
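One way to get a small test input is to write the first 10000 records of each fastq to a new file and point CITE-seq-Count at those. A minimal sketch, assuming standard 4-line FASTQ records; `head_fastq_gz` is a hypothetical helper, not a CITE-seq-Count function.

```python
import gzip
from itertools import islice


def head_fastq_gz(in_path, out_path, n_reads=10000):
    """Copy the first n_reads FASTQ records (4 lines each) to a new gzipped file."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        dst.writelines(islice(src, 4 * n_reads))
```

Apply it to R1 and R2 with the same n_reads so the two subsets still contain matching read pairs.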
I tried running the command without --whitelist on just the first 10000 reads, and this is what I get. run_report.yaml:
Date: 2021-11-03
Running time: 5.645 seconds
CITE-seq-Count Version: 1.4.3
Reads processed: 10000
Percentage mapped: 1
Percentage unmapped: 249
Uncorrected cells: 0
Correction:
Cell barcodes collapsing threshold: 1
Cell barcodes corrected: 341
UMI collapsing threshold: 2
UMIs corrected: 29
Run parameters:
Read1_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R1_001.fastq$
Read2_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R2_001.fastq$
Cell barcode:
First position: 1
Last position: 16
UMI barcode:
First position: 17
Last position: 28
Expected cells: 5000
Tags max errors: 2
Start trim: 10
I think the trim is at 9 actually, not 10
@Hoohm Thanks for that suggestion. I tried with merged R1/R2 files and with separate R1/R2 files, both specifying a trim of 9, but still got 100% unmapped.
@Hoohm may I ask why the trim of 9? There are 10 initial nucleotides. @carbycrab could it be that the fastq file contains ADTs in addition to the CMOs (CMOs for multiplexing, ADTs for cell surface/identity)? Usually the percentage of the latter is very high.