CITE-seq-Count

Percentage unmapped: 100

Open sopenaml opened this issue 2 years ago • 18 comments

Hi, sorry for asking again, but after going through the previous answers I have not found one that solves my problem. I have data that contains HTOs, ADTs, 5' GEX and VDJ. I'm trying to use CITE-seq-Count to get count matrices for the HTOs and ADTs in order to demultiplex my data. To do so, I've opted for two separate runs, one for the HTOs and one for the ADTs, providing the tags for each in tags.csv

ACCCACCAGTAAGAC,hashtag1
CTTGCCGCATGTCAT,hashtag3
AAAGCATTCTTCACG,hashtag4

or cellsurface.barcodes.csv

GACCCGGTGTCATTT,CD80
CACATCGTTTGTGTA,CD95
CACTCCTTGTAGTCA,PD-L2

All my antibodies are TotalSeq-C, and when I grep for the tags I can see that they start at position 11, so I've added --start-trim 10


When I run the script with the tags.csv file I get 96% mapped, but when I run it with cellsurface.barcodes.csv I get 100% unmapped, even though I can grep the tags in R2. Would you know why the ADT tags are not mapped? Since the libraries should contain both cell surface barcodes and HTOs, I would expect the mapping to be split between the two, or am I wrong? Can anyone help, please? Thank you very much. Miriam

my running commands

CITE-seq-Count -R1 BEN4535A3_R1_001.fastq.gz -R2 BEN4535A3_R2_001.fastq.gz --tags cellsurface.barcodes.csv --cell_barcode_first_base 1 --cell_barcode_last_base 16 --umi_first_base 17 --umi_last_base 26 -cells 10000 --start-trim 10 --threads 24 -o citeseqcount/BEN4535A3.adt


CITE-seq-Count -R1 BEN4535A3_R1_001.fastq.gz -R2 BEN4535A3_R2_001.fastq.gz --tags tags.csv --cell_barcode_first_base 1 --cell_barcode_last_base 16 --umi_first_base 17 --umi_last_base 26 -cells 10000 --start-trim 10 --threads 24 -o citeseqcount/BEN4535A3.adt

sopenaml avatar Apr 05 '22 11:04 sopenaml

Hi, I have the same problem here. How did you extract that sequence from the fastq and determine the start trim point? Do we use the R2 fastq for it? Thank you

dianitasusilo avatar Apr 12 '22 04:04 dianitasusilo

grep "barcode_of_interest" your.fastq_R2.file | head -n 20

That should print reads containing your barcode. To know where to trim, count the bases from the left of the sequence up to the first nucleotide of the barcode.
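The same count can be done programmatically. The following is only a sketch, not part of CITE-seq-Count: the helper names `read_sequences` and `tag_offsets` are made up for illustration, and the toy reads are invented. It tallies the 0-based offset at which a known tag first appears in each read; an offset of 10 corresponds to a tag starting at position 11, i.e. --start-trim 10 as in the first post.

```python
import gzip
from collections import Counter
from itertools import islice


def read_sequences(fastq_path, limit=100_000):
    """Yield up to `limit` read sequences (second line of each 4-line FASTQ record)."""
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        # Sequence lines sit at file line indices 1, 5, 9, ...
        for line in islice(handle, 1, limit * 4, 4):
            yield line.strip()


def tag_offsets(sequences, tags):
    """Count the 0-based offset at which a known tag first appears in each read."""
    offsets = Counter()
    for seq in sequences:
        for tag in tags:
            pos = seq.find(tag)
            if pos != -1:
                offsets[pos] += 1
                break  # one hit per read is enough for this tally
    return offsets


# Toy reads (invented): 10 filler bases, then a tag from cellsurface.barcodes.csv.
tags = ["GACCCGGTGTCATTT", "CACATCGTTTGTGTA", "CACTCCTTGTAGTCA"]
reads = [
    "NNNNNNNNNN" + tags[0] + "TTTT",
    "ACGTACGTAC" + tags[1],
    "GGGGGGGGGGGGGGGGGGGGGGGGG",  # no tag in this read
]
offsets = tag_offsets(reads, tags)
print(offsets.most_common())
```

On real data, something like `tag_offsets(read_sequences("your_R2.fastq.gz"), tags)` would show whether the tags sit at one dominant offset (a fixed --start-trim fits) or are spread over several positions (the case where a sliding window is usually suggested).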

Hope it helps,

Miriam


From: Dianita Susilo Saputri @.> Sent: Tuesday, April 12, 2022 5:28 AM To: Hoohm/CITE-seq-Count @.> Cc: Miriam Llorian Sopena @.>; Author @.> Subject: Re: [Hoohm/CITE-seq-Count] Percentage unmapped: 100 (Issue #167)


The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 1 Midland Road London NW1 1AT

sopenaml avatar Apr 12 '22 09:04 sopenaml

Thanks for your quick reply but in my case they are not in the same position...

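Here I looked for my hashtag nucleotide in the raw R2 fastq files, and it turned out like this: [image] https://user-images.githubusercontent.com/61182046/163080450-d121a66c-2f51-402a-8565-9c68eee923f4.png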

Or did I use wrong fastq file as input?

dianitasusilo avatar Apr 13 '22 01:04 dianitasusilo

In that case I'm not sure; maybe the developers can answer? I'll try the sliding window, as I've seen messages in the channel suggesting that.



sopenaml avatar Apr 13 '22 08:04 sopenaml

Happy to see fellow users help each other.

@dianitasusilo yes, a sliding window approach might help: --sliding-window is the option you are looking for.

@sopenaml Could you check whether your barcodes in R1 are overlapping? It might be a mapping between barcodes, similar to TotalSeq-B.

Hoohm avatar Apr 13 '22 09:04 Hoohm

Hi Patrick, I'm not sure which barcodes you are referring to, are you asking me to check if the antibody barcodes appear in R1 somewhere?

Thank you for your help, Miriam



sopenaml avatar Apr 13 '22 09:04 sopenaml

The cell barcodes that should be in R1, just before the UMI


Hoohm avatar Apr 13 '22 10:04 Hoohm

Hi Patrick,

I've checked my CITE-seq antibody barcodes against R1 and I don't see any matches. If I check my cell hashing barcodes, one of them finds a few (7) matches in R1, but the rest find none. So it's not that my barcodes are overlapping with cell barcodes. Any other ideas of what the problem may be? Thanks

sopenaml avatar Apr 21 '22 13:04 sopenaml

Hi, I have the same problem, where I can grep my HTO out of read 2 but still get 100% reads unmapped. I am running 1.4.5 using Python 3.9. Do we know what the solution to this issue is?

drlaurenwasson avatar May 10 '22 20:05 drlaurenwasson

Hi, I'm afraid I didn't get an answer/solution. My suspicion is that, at least in my case, the ADTs only label a small proportion of the cells, so I was wondering if that could be the problem. I've tried cellranger multi instead, and I do get counts for the ADTs, so I think I'm going to use cellranger multi for now.



sopenaml avatar May 11 '22 08:05 sopenaml

Hey @sopenaml, I need to rephrase what I mentioned earlier. Depending on which chemistry kit you used, it's possible that, within the same cell, the R1 barcodes (cell barcodes) of one set of libraries (GEX, VDJ, ADTs) are one cell barcode while your HTOs are linked to another cell barcode. This means that when you compute the overlap, it's going to be very low, because the barcodes need to be translated first.

Here is the translation matrix. https://github.com/10XGenomics/cellranger/blob/master/lib/python/cellranger/barcodes/translation/3M-february-2018.txt.gz
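To make the overlap check concrete, here is a rough sketch. It assumes the translation file is a gzipped text file with two whitespace-separated barcode columns per line; which column corresponds to which library, and whether translation applies to your chemistry at all, is something to verify yourself. The function names and the toy barcodes are invented for illustration.

```python
import gzip


def load_translation(path):
    """Read a gzipped two-column barcode table into a dict (column 1 -> column 2).

    Assumption: each line holds one barcode pair; check the file and the
    direction you need before relying on this.
    """
    mapping = {}
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:
                mapping[fields[0]] = fields[1]
    return mapping


def overlap_fraction(hto_barcodes, rna_barcodes, translation=None):
    """Fraction of HTO cell barcodes that are also present among the RNA cell barcodes."""
    hto = set(hto_barcodes)
    if translation is not None:
        # Barcodes missing from the table are left unchanged.
        hto = {translation.get(bc, bc) for bc in hto}
    if not hto:
        return 0.0
    return len(hto & set(rna_barcodes)) / len(hto)


# Toy data (made-up barcodes, not taken from the real translation file).
rna_cells = {"AAACCCAAGAAACACT", "AAACCCAAGAAACCAT"}
hto_cells = {"AAACCCATCAAACACT", "AAACCCATCAAACCAT"}
toy_translation = {
    "AAACCCATCAAACACT": "AAACCCAAGAAACACT",
    "AAACCCATCAAACCAT": "AAACCCAAGAAACCAT",
}

before = overlap_fraction(hto_cells, rna_cells)
after = overlap_fraction(hto_cells, rna_cells, toy_translation)
print(before, after)
```

If the overlap jumps from near zero to something high after translating, the translation issue is a likely explanation; if it is already high without translation, the barcodes probably do not need translating.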

Is it a bit clearer?

Hoohm avatar May 15 '22 07:05 Hoohm

Hi everyone,

I am running into an issue where about 35% of my reads are unmapped. Is there a way to bring the mapped percentage up? Grepping the R2 file (image: Grep_R2) shows that the start trim needs to be --start-trim 0. I used 10x v3. The tags are attached: tags.csv

Here is what I run to get that output:

CITE-seq-Count -T ${numThreads} \
  -R1 ${fastq_path}${nova_id}outs/fastq_path/${fcid}/sham_hto/${outname[$index]}_KS220601_batch1_HTO_${snum[$index]}_L001_R1_001.fastq.gz,${fastq_path}${nova_id}outs/fastq_path/${fcid}/sham_hto/${outname[$index]}_KS220601_batch1_HTO_${snum[$index]}_L002_R1_001.fastq.gz,${fastq_path}${nova_id}outs/fastq_path/${fcid}/sham_hto/${outname[$index]}_KS220601_batch1_HTO_${snum[$index]}_L003_R1_001.fastq.gz,${fastq_path}${nova_id}outs/fastq_path/${fcid}/sham_hto/${outname[$index]}_KS220601_batch1_HTO_${snum[$index]}_L004_R1_001.fastq.gz \
  -R2 ${fastq_path}${nova_id}outs/fastq_path/${fcid}/sham_hto/${outname[$index]}_KS220601_batch1_HTO_${snum[$index]}_L001_R2_001.fastq.gz,${fastq_path}${nova_id}outs/fastq_path/${fcid}/sham_hto/${outname[$index]}_KS220601_batch1_HTO_${snum[$index]}_L002_R2_001.fastq.gz,${fastq_path}${nova_id}outs/fastq_path/${fcid}/sham_hto/${outname[$index]}_KS220601_batch1_HTO_${snum[$index]}_L003_R2_001.fastq.gz,${fastq_path}${nova_id}outs/fastq_path/${fcid}/sham_hto/${outname[$index]}_KS220601_batch1_HTO_${snum[$index]}_L004_R2_001.fastq.gz \
  -t tags.csv -cbf 1 -cbl 16 -umif 17 -umil 28 -cells 5000 --sliding-window --start-trim 0 \
  -o /project/CiteSeq7_sham

Here is the output I get:

Date: 2022-07-13
Running time: 6.0 minutes, 37.25 seconds
CITE-seq-Count Version: 1.4.5
Reads processed: 3575668
Percentage mapped: 64
Percentage unmapped: 36
Uncorrected cells: 0
Correction:
  Cell barcodes collapsing threshold: 1
  Cell barcodes corrected: 16075
  UMI collapsing threshold: 2
  UMIs corrected: 12836
Run parameters:
  Read1_paths: _S17_L004_R1_001.fastq.gz
  Read2_paths: _S17_L004_R2_001.fastq.gz
  Cell barcode:
    First position: 1
    Last position: 16
  UMI barcode:
    First position: 17
    Last position: 28
  Expected cells: 5000
  Tags max errors: 2
  Start trim: 0

Thank you in advance!

stepanovacz avatar Jul 13 '22 19:07 stepanovacz

I am able to bring the percentage of mapped reads above 90 by setting --max-error 6 or higher. However, I do not think that is a good solution, as I get plenty of doublets and negatives (Doublet: 1868, Negative: 35, Singlet: 394). Any idea what else I can do? Thank you!
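One way to see why a large mismatch allowance can hurt is to look at the Hamming distances between the tags themselves. If the closest pair of tags differs at d positions, a read can be guaranteed to match at most one tag only while the allowed mismatches stay at or below (d - 1) // 2. The sketch below illustrates that bound; it is not a description of how CITE-seq-Count resolves ties internally. The hashtag sequences are the three from the first post, used only as an example, and the short `toy_tags` are invented.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(tags):
    """Smallest Hamming distance between any two tag sequences."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))


def safe_max_error(tags):
    """Largest mismatch count for which a read can match at most one tag."""
    return (min_pairwise_distance(tags) - 1) // 2


def candidates(read, tags, max_error):
    """Names of all tags within `max_error` mismatches of the start of `read`."""
    return [
        name
        for seq, name in tags.items()
        if hamming(read[: len(seq)], seq) <= max_error
    ]


# Example tags from the first post in this thread.
hashtags = ["ACCCACCAGTAAGAC", "CTTGCCGCATGTCAT", "AAAGCATTCTTCACG"]
print(min_pairwise_distance(hashtags), safe_max_error(hashtags))

# Invented short tags, 3 mismatches apart, to show an ambiguous read.
toy_tags = {"AAAAAA": "tagA", "AAATTT": "tagB"}
print(candidates("AAAATT", toy_tags, max_error=1))  # only one tag qualifies
print(candidates("AAAATT", toy_tags, max_error=2))  # both tags qualify
```

Running the same check on your own tags.csv would show whether 6 mismatches is already beyond the point where a read can sit within range of two hashtags.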

stepanovacz avatar Jul 20 '22 15:07 stepanovacz

Would you be able to send me a sample of your data so that I can run it and have a look?

Hoohm avatar Aug 10 '22 08:08 Hoohm

Hi Patrick,

Thank you for your reply. Yes, here is the data and the tags (hashtags used). Please let me know if I am missing anything! I appreciate you looking into this! sham_hto.zip https://drive.google.com/file/d/1ynqVw-74I9gw6NErchiv0ix1jFSnP4kC/view?usp=drive_web tags (3).csv https://drive.google.com/file/d/1ahBW4U7tgkDH_qXLr_nVv051hncM0MiM/view?usp=drive_web


-- Katya Stepanova Graduate Student Deppmann & Campbell Labs

stepanovacz avatar Aug 10 '22 12:08 stepanovacz

I asked for access

Hoohm avatar Aug 13 '22 16:08 Hoohm

results/unmapped.csv 
tag,count
AAGCAGTGGTATCAA,38893
GGGGGGGGGGGGGGG,20759
CCGTACCTCAAAAAA,17644
GCAGTGGTATCAACG,10879
TTCCTGCCAAAAAAA,5855
GTGGTATCAACGCAG,5442
AGCAGTGGTATCAAC,4087
CCGTACCCCAAAAAA,3959
CAGTGGTATCAACGC,3894

It seems pretty reasonable from what I see in the first sample. The unmapped.csv gives you the top sequences that are not mapping. 22% polyG means there was no sequence there or it could not be read.
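For anyone who wants to break down their own unmapped.csv, here is a small sketch using only the rows shown above. The grouping rules (all-G reads, reads ending in a run of A) are my own illustrative choices, and the shares it computes are relative to the listed rows only, so they need not match a percentage taken from the full file.

```python
import csv
import io

# The rows listed above, copied as-is.
UNMAPPED_CSV = """tag,count
AAGCAGTGGTATCAA,38893
GGGGGGGGGGGGGGG,20759
CCGTACCTCAAAAAA,17644
GCAGTGGTATCAACG,10879
TTCCTGCCAAAAAAA,5855
GTGGTATCAACGCAG,5442
AGCAGTGGTATCAAC,4087
CCGTACCCCAAAAAA,3959
CAGTGGTATCAACGC,3894
"""


def load_unmapped(text):
    """Parse 'tag,count' CSV text into a dict of sequence -> read count."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["tag"]: int(row["count"]) for row in reader}


def share(counts, predicate):
    """Fraction of listed reads whose sequence satisfies `predicate`."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for tag, n in counts.items() if predicate(tag)) / total


counts = load_unmapped(UNMAPPED_CSV)
poly_g_share = share(counts, lambda tag: set(tag) == {"G"})
poly_a_tail_share = share(counts, lambda tag: tag.endswith("AAAAAA"))

print(f"polyG: {poly_g_share:.1%}, ending in polyA: {poly_a_tail_share:.1%}")
```

Several of the remaining sequences (AAGCAGTGGTATCAA, AGCAGTGGTATCAAC, GCAGTGGTATCAACG, ...) are shifted windows of one longer sequence, which suggests a single recurring non-tag sequence rather than many different ones.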

Why do you need to get higher?

I want to make sure about the translation issue. Do you have a high overlap between the cells from the RNA side and the HTO?

Hoohm avatar Aug 13 '22 17:08 Hoohm

Hi, is the translation matrix used only with v3 chemistry? I seem to have a similar problem, where grep doesn't return barcodes in my R2 fastq that I know exist in my data after cellranger. I used the 5' v2 chemistry with GEX, VDJ and feature barcode libraries.

Thanks

Leeana

leeanapeters avatar Aug 17 '22 16:08 leeanapeters