Single-end read with UMI

Open caleblareau opened this issue 6 years ago • 3 comments

Does gencore work with data like this, which is common in scRNA-seq?

My read names look like this: A00439:244:HFHN5DRXX:1:2139:21052:22357_CGAGAGGTACCTGTGAATTG

where the first part of the string is what the sequencer provides; what follows the underscore is the UMI.

I've tried running the following:

gencore -i mix_for_gencore.bam -o gencore_dedup.bam -r $fasta

and

gencore -i mix_for_gencore.bam -o gencore_dedup.bam -r $fasta -u _

but neither command filters out any reads (i.e. the output BAM is the same size as the input).

caleblareau, Nov 05 '19 18:11

https://github.com/OpenGene/gencore#umi-examples

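(Translated from Chinese.) Take a look at this example. In your read IDs there is a random sequence, "22357", in front of the "_". You need to find a way to turn it into a fixed sequence, such as "umi", and then add -u umi to the gencore arguments.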

zhujiaqi2014, Nov 28 '19 07:11

@zhujiaqi2014

Hi man, @caleblareau may not understand Chinese. Better to reply in English :)

sfchen, Nov 28 '19 07:11

https://github.com/OpenGene/gencore#umi-examples

Look at this example. In your read IDs, the text in front of "_" is "22357", which differs from read to read. You have to find a way to make it a fixed string, such as "umi", and then add "-u umi" to the command line.
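
The renaming could be sketched roughly as follows. This is only an illustration, not something taken from gencore itself: the helper name `add_umi_prefix` is hypothetical, and the target layout (`...:umi_<UMI>`, i.e. a fixed prefix directly in front of the underscore) is an assumption based on the UMI examples in the README linked above, so check it against that page before relying on it. The sketch inserts a fixed prefix field instead of overwriting "22357", which is one reading of the suggestion and keeps the original read name intact.

```python
def add_umi_prefix(read_name: str, prefix: str = "umi") -> str:
    """Rewrite 'NAME_UMI' as 'NAME:<prefix>_UMI'.

    Assumption: the UMI is everything after the last underscore,
    as in the read names shown in this issue.
    """
    base, sep, umi = read_name.rpartition("_")
    if not sep or not base or not umi:
        raise ValueError(f"no '_<UMI>' suffix found in read name: {read_name!r}")
    # Put a fixed prefix in front of the underscore so that the same
    # string precedes the UMI in every read name.
    return f"{base}:{prefix}_{umi}"


if __name__ == "__main__":
    name = "A00439:244:HFHN5DRXX:1:2139:21052:22357_CGAGAGGTACCTGTGAATTG"
    print(add_umi_prefix(name))
```

The function only handles the name string. Applying it to every record of the BAM (for example with a BAM library, or by editing the first column of the SAM text) is left out here, and the renamed file would then be passed to gencore with "-u umi".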

zhujiaqi2014, Nov 28 '19 07:11