CITE-seq-Count 100% unmapped

Open carbycrab opened this issue 2 years ago • 9 comments

Hello,

Thank you for developing such an innovative package! I've been trying to run CITE-seq-Count on my 10X V3 data, but I keep getting 100% unmapped reads.

This is the command I used:

CITE-seq-Count   -R1   d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L003_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L004_R1_001.fastq.gz   -R2 d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L003_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L004_R2_001.fastq.gz   -t tags.csv -cbf 1 -cbl 16 -umif 17 -umil 28 -cells 5000 -o cite_seq_results/ --start-trim 10

I chose these settings based on grepping for our antibody tags in the reads, which returned:

[Screenshot: Screen Shot 2021-10-24 at 1 26 03 PM]

I have also tried --start-trim values of 0 and 1, as well as --sliding-window, and they all return 100% unmapped. I've attached the run_report.yaml below.

Date: 2021-10-19
Running time: 7.0 hours, 19.0 minutes, 16.43 seconds
CITE-seq-Count Version: 1.4.3
Reads processed: 259436768
Percentage mapped: 0
Percentage unmapped: 100
Uncorrected cells: 3266
Correction:
        Cell barcodes collapsing threshold: 1
        Cell barcodes corrected: 217758
        UMI collapsing threshold: 2
        UMIs corrected: 8244869
Run parameters:
        Read1_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R1_001.fa$
        Read2_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R2_001.fa$
        Cell barcode:
                First position: 1
                Last position: 16
        UMI barcode:
                First position: 17
                Last position: 28
        Expected cells: 5000
        Tags max errors: 2
        Start trim: 1

Do you have any idea what I can do to fix this? Thanks so much for your help!

carbycrab avatar Oct 24 '21 17:10 carbycrab

Hi,

As far as I can see in the screenshot attached, --start-trim should equal 10 as in --start-trim 10 since your tag sequence starts at the 11th base.
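One way to double-check the trim value is to count where the tag sequences actually start in the R2 reads. The sketch below is only an illustration: the function names, the example tag, and the example reads are made up, and the real sequences would come from tags.csv and the R2 FASTQ files. An offset of 10 (0-based) means the tag starts at the 11th base, which would correspond to --start-trim 10.

```python
import gzip
from collections import Counter
from itertools import islice


def fastq_sequences(path, limit=10000):
    """Yield the sequence line of the first `limit` records in a gzipped FASTQ."""
    with gzip.open(path, "rt") as handle:
        for i, line in enumerate(islice(handle, limit * 4)):
            if i % 4 == 1:  # the second line of each 4-line record is the sequence
                yield line.strip()


def tag_start_offsets(reads, tags):
    """Count the 0-based offset at which a tag first appears in each read.

    Reads that contain none of the tags are counted under the key None.
    Only exact matches are considered (no mismatches allowed).
    """
    offsets = Counter()
    for read in reads:
        hits = [pos for pos in (read.find(tag) for tag in tags) if pos != -1]
        offsets[min(hits) if hits else None] += 1
    return offsets


# Hypothetical tag and reads, for illustration only (not taken from this issue).
example_tags = ["ACGTACGTACGTACG"]
example_reads = [
    "NNNNNNNNNN" + "ACGTACGTACGTACG" + "TTTT",
    "GGGGGGGGGG" + "ACGTACGTACGTACG" + "AAAA",
    "CCCCCCCCCCCCCCCCCCCCCCCCCCCCC",
]
offsets = tag_start_offsets(example_reads, example_tags)
print(offsets.most_common())
```

On real data, something like `tag_start_offsets(fastq_sequences("path/to/R2.fastq.gz"), tags)` would show whether most tags sit at one offset, at several offsets, or are not found at all.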

fjrossello avatar Oct 26 '21 09:10 fjrossello

(1) Do you mind posting your tag.csv file?

(2) Also, did you try to run CITE-seq-Count on just a single fastq file (e.g. L001 only) or on a merged fastq file instead of a comma-separated list? Just trying to eliminate potential issues.

(3) Ideally, if you have a whitelist from cellranger it really helps to run CITE-seq-Count with that so it can assign the antibody barcodes/tags to the relevant cells.
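Regarding point (3), Cell Ranger's filtered barcode list (barcodes.tsv.gz) usually carries a GEM-well suffix such as "-1" on every barcode, while the barcodes read from R1 are plain 16-base sequences. A minimal sketch of stripping that suffix before using the list as a whitelist follows; the function name and example barcodes are made up, and whether CITE-seq-Count needs the suffix removed is an assumption to verify against its documentation.

```python
def strip_gem_suffix(barcodes):
    """Remove a trailing '-<digits>' GEM-well suffix from each barcode.

    Blank lines are skipped; barcodes without a suffix are kept unchanged.
    """
    cleaned = []
    for barcode in barcodes:
        barcode = barcode.strip()
        if not barcode:
            continue
        base, sep, suffix = barcode.rpartition("-")
        cleaned.append(base if sep and suffix.isdigit() else barcode)
    return cleaned


# Hypothetical barcodes, for illustration only.
example_barcodes = ["AAACCCAAGAAACACT-1", "AAACCCAAGAAACCAT-1", "AAACCCAAGAAACCCA"]
whitelist = strip_gem_suffix(example_barcodes)
print("\n".join(whitelist))
```

Writing one barcode per line to a text file would then give a cell-specific whitelist, as opposed to the full 10x barcode list.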

cpflueger2016 avatar Oct 27 '21 05:10 cpflueger2016

Thank you for your reply.

  1. I've attached tags.csv here. tags.csv

  2. I tried re-running the command (this time including: --whitelist 3M-february-2018.txt and --start-trim 10) on merged fastq files and still get 100% unmapped. run_report.yaml:

Date: 2021-10-31
Running time: 19.0 hours, 42.0 minutes, 7.002 seconds
CITE-seq-Count Version: 1.4.3
Reads processed: 259436768
Percentage mapped: 0
Percentage unmapped: 100
Uncorrected cells: 3246
Correction:
        Cell barcodes collapsing threshold: 1
        Cell barcodes corrected: 282864
        UMI collapsing threshold: 2
        UMIs corrected: 8391407
Run parameters:
        Read1_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_R1.fastq.gz
        Read2_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_R2.fastq.gz
        Cell barcode:
                First position: 1
                Last position: 16
        UMI barcode:
                First position: 17
                Last position: 28
        Expected cells: 5000
        Tags max errors: 2
        Start trim: 10

I also tried on just one file. Still 100% unmapped. run_report.yaml:

Date: 2021-10-27
Running time: 1.0 hour, 28.0 minutes, 24.34 seconds
CITE-seq-Count Version: 1.4.3
Reads processed: 65028569
Percentage mapped: 0
Percentage unmapped: 100
Uncorrected cells: 584
Correction:
        Cell barcodes collapsing threshold: 1
        Cell barcodes corrected: 41861
        UMI collapsing threshold: 2
        UMIs corrected: 3376189
Run parameters:
        Read1_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz
        Read2_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz
        Cell barcode:
                First position: 1
                Last position: 16
        UMI barcode:
                First position: 17
                Last position: 28
        Expected cells: 1000
        Tags max errors: 2
        Start trim: 10

Do you know what could be going on? Thanks so much!

carbycrab avatar Nov 01 '21 17:11 carbycrab

The fact that you still get 100% unmapped after the trim change is odd. It seems like that should have fixed it.

Try a run without the whitelist, on only the first 10000 reads. The trim is 10 for sure.
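A quick test run like this needs a small input. The sketch below shows one way to copy the first N records of a gzipped FASTQ file into a new file; the function name and the five-record demo file are made up. R1 and R2 would need to be subset with the same N so that the read pairs stay in sync. CITE-seq-Count may also offer its own option for limiting the number of reads, which `CITE-seq-Count --help` would confirm.

```python
import gzip
import os
import tempfile
from itertools import islice


def write_first_reads(src_path, dst_path, n_reads):
    """Copy the first n_reads FASTQ records (4 lines each) from src to dst.

    Returns the number of lines written.
    """
    written = 0
    with gzip.open(src_path, "rt") as src, gzip.open(dst_path, "wt") as dst:
        for line in islice(src, n_reads * 4):
            dst.write(line)
            written += 1
    return written


# Self-contained demo on a made-up five-record file (not real data).
with tempfile.TemporaryDirectory() as tmp:
    full = os.path.join(tmp, "full_R2.fastq.gz")
    subset = os.path.join(tmp, "subset_R2.fastq.gz")
    with gzip.open(full, "wt") as handle:
        for i in range(5):
            handle.write(f"@read{i}\nACGT\n+\nIIII\n")
    lines_written = write_first_reads(full, subset, n_reads=2)
    with gzip.open(subset, "rt") as handle:
        subset_headers = [line.strip() for line in handle if line.startswith("@")]

print(lines_written, subset_headers)
```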

Hoohm avatar Nov 03 '21 08:11 Hoohm

I tried running the command without --whitelist on just the first 10000 reads and this is what I get. run_report.yaml:

Date: 2021-11-03
Running time: 5.645 seconds
CITE-seq-Count Version: 1.4.3
Reads processed: 10000
Percentage mapped: 1
Percentage unmapped: 249
Uncorrected cells: 0
Correction:
        Cell barcodes collapsing threshold: 1
        Cell barcodes corrected: 341
        UMI collapsing threshold: 2
        UMIs corrected: 29
Run parameters:
        Read1_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R1_001.fastq$
        Read2_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R2_001.fastq$
        Cell barcode:
                First position: 1
                Last position: 16
        UMI barcode:
                First position: 17
                Last position: 28
        Expected cells: 5000
        Tags max errors: 2
        Start trim: 10

carbycrab avatar Nov 04 '21 19:11 carbycrab

I think the trim is actually 9, not 10.

Hoohm avatar Nov 21 '21 10:11 Hoohm

@Hoohm Thanks for that suggestion. I tried with merged R1/R2 files and with separate R1/R2 files, both specifying a trim of 9, but still got 100% unmapped.

carbycrab avatar Nov 23 '21 15:11 carbycrab

@Hoohm may I ask why the trim would be 9? There are 10 initial nucleotides before the tag. @carbycrab could it be that the fastq file contains ADTs in addition to the CMOs (CMOs for multiplexing, ADTs for cell surface/identity)? Usually the percentage of the latter is very high.

sunta3iouxos avatar Nov 24 '21 21:11 sunta3iouxos