zUMIs
Question about analysing pattern reads and internal reads together
Hi,
I am confused about the results and the usage of the zUMIs pipeline.
Here is my YAML file:
pm1-1.yaml.txt
First, my input is paired-end reads in which read 1 starts with a specific pattern sequence. The read 1 file contains 1657623 reads in total, while the STAR log (filtered.tagged.Log.final.out) reports that the number of input reads is 846533. Q1: I think this is because of the filtering conditions and the cDNA range set for read 1 in my YAML file. Am I right?
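For reference, the gap between the two counts quoted above can be checked directly. The following is a minimal Python sketch, not part of zUMIs: `count_fastq_reads` and `retained_fraction` are hypothetical helper names, and the counter assumes a well-formed FASTQ with exactly four lines per record.

```python
import gzip

def count_fastq_reads(path):
    """Count records in a FASTQ file (plain or .gz), assuming 4 lines per record."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4

def retained_fraction(n_raw, n_star_input):
    """Fraction of raw reads that were passed on to STAR."""
    return n_star_input / n_raw

# Numbers quoted in this post
raw_reads = 1657623
star_input = 846533
dropped = raw_reads - star_input
print(f"dropped before STAR: {dropped}")
print(f"retained fraction: {retained_fraction(raw_reads, star_input):.3f}")
```

Running `count_fastq_reads` on the read 1 file would confirm the raw count before comparing it with the STAR log.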
Secondly, I found that the read IDs in
"pm1.filtered.tagged.unmapped.bam" (flags include: 4),
"pm1.filtered.tagged.Aligned.out.bam" (flags include: 0, 16) and
"pm1.filtered.Aligned.GeneTagged.sorted.bam" (flags include: 0, 16) are the same.
Q2: Why are the read IDs in pm1.filtered.tagged.unmapped.bam the same as those in pm1.filtered.tagged.Aligned.out.bam and pm1.filtered.Aligned.GeneTagged.sorted.bam?
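To interpret the flag values listed above, a small decoder can help. This is a sketch only: the bit meanings follow the SAM specification, while the function name `decode_flag` and the label strings are chosen here for illustration.

```python
# Bit meanings as defined in the SAM specification
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}

def decode_flag(flag):
    """Return the names of all bits set in a SAM flag value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]

for flag in (0, 4, 16):
    print(flag, decode_flag(flag))
```

A flag of 0 has no bits set (a forward-strand alignment of an unpaired read), 4 marks the record as unmapped, and 16 marks a reverse-strand alignment.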
Furthermore, "pm1.filtered.tagged.Aligned.toTranscriptome.out.bam" (flags include: 0, 16, 252, 276) is missing some reads that are present in the three BAM files above. The attached file lists the records from pm1.filtered.Aligned.GeneTagged.sorted.bam for the reads that are missing from pm1.filtered.tagged.Aligned.toTranscriptome.out.bam:
miss-in-toTranscriptome.bam.txt
I also checked the mapping position of the first missing read against the ENSG0000014267 position in my reference transcriptome, and found no problem.
Q3: Why are these reads missing from pm1.filtered.tagged.Aligned.toTranscriptome.out.bam?
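A comparison of read IDs between two alignment files, as described above, could be done along these lines. This is a hedged sketch that works on SAM text (for example the output of `samtools view`), not on BAM directly; `read_ids` and `missing_from` are hypothetical helper names, and the toy records are invented for illustration rather than taken from the files in this post.

```python
def read_ids(sam_lines):
    """Collect read names (column 1) from SAM text, skipping '@' header lines."""
    ids = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        ids.add(line.split("\t", 1)[0])
    return ids

def missing_from(reference_ids, other_ids):
    """Read names present in reference_ids but absent from other_ids."""
    return sorted(reference_ids - other_ids)

# Toy SAM records (hypothetical)
gene_tagged = [
    "@HD\tVN:1.6",
    "read1\t0\tchr1\t100\t255\t50M\t*\t0\t0\t*\t*",
    "read2\t16\tchr1\t200\t255\t50M\t*\t0\t0\t*\t*",
    "read3\t0\tchr2\t300\t255\t50M\t*\t0\t0\t*\t*",
]
to_transcriptome = [
    "read1\t0\tENST_A\t10\t255\t50M\t*\t0\t0\t*\t*",
    "read3\t256\tENST_B\t20\t255\t50M\t*\t0\t0\t*\t*",
]

missing = missing_from(read_ids(gene_tagged), read_ids(to_transcriptome))
print(missing)
```

With real data, the two lists would be replaced by the lines of each file as printed by `samtools view`.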
Finally, I separated my raw reads (PE150) into paired patterned_reads.
Below is my YAML file:
pm1-2.yaml.txt
Below is my new STAR log: filtered.tagged.Log.final.out again shows that the number of input reads is 846533, so it seems that file3 and file4 were not included in the analysis. At the same time, the number of uniquely mapped reads is lower than when the files were not analysed together.
I am quite puzzled by all of the above. Looking forward to your reply, thanks a lot! Dka
Hi,
As mentioned in your other issue, the use of the particular 11 bp pattern "ATTGCGCAATG" is reserved for the processing of Smart-seq3 data. Our pipeline is hardcoded in this case, and I am unfortunately unable to provide support for custom protocols that you might be trying to process. Sorry about this,
Christoph
I am still puzzled by your answer.
Below is the Smart-seq3 YAML.
What is the function of file3 and file4 in this pipeline? What if I do not separate my data into pattern reads and internal reads, and instead just set up file1 and file2 like this:

    file1:
      name: /home/ccy/1-scrna-data-2023-11-14/rawdata/star-test-1/patterns_and_internal_1.fq.gz
      base_definition:
        - BC(12-17,33-40,56-63)
        - UMI(64-69)
      find_pattern: ATTGCGCAATG
    file2:
      name: /home/ccy/1-scrna-data-2023-11-14/rawdata/star-test-1/patterns_and_internal_2.fq.gz
      base_definition:
        - cDNA(1-150)

What happens to the reads in file1 that do not start with the pattern? Will they be used for mapping, or will they be dropped?
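To see how many reads in file1 are affected by this question, the reads can be split by whether they start with the pattern. The sketch below uses exact prefix matching, which is an assumption made here for illustration; it does not reproduce how zUMIs itself detects the pattern or what it does with either group. `split_by_pattern` is a hypothetical helper and the toy sequences are invented.

```python
PATTERN = "ATTGCGCAATG"  # the 11 bp pattern quoted in this thread

def split_by_pattern(sequences, pattern=PATTERN):
    """Split sequences into those that start with the pattern and those that do not.

    Exact prefix matching is an assumption for illustration only.
    """
    with_pattern, without_pattern = [], []
    for seq in sequences:
        if seq.startswith(pattern):
            with_pattern.append(seq)
        else:
            without_pattern.append(seq)
    return with_pattern, without_pattern

# Toy read 1 sequences (hypothetical)
toy_reads = [
    "ATTGCGCAATGACGTACGTAC",
    "GGGACGTACGTACGTACGTAC",
    "ATTGCGCAATGTTTTTTTTTT",
]
pattern_reads, other_reads = split_by_pattern(toy_reads)
print(len(pattern_reads), "reads start with the pattern;", len(other_reads), "do not")
```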
@Dkaaaaa zUMIs filters out some low-quality reads based on their barcodes and UMIs before passing the rest to STAR, and I think that is why the number of input reads is lower than the number of reads in the read 1 file.
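As a rough illustration of what a quality-based barcode/UMI filter looks like, here is a minimal sketch. It is not zUMIs' actual implementation: the thresholds `max_low_bases` and `min_phred`, the function names, and the exact pass/fail rule are assumptions, and Phred+33 quality encoding is assumed.

```python
def phred_scores(quality_string, offset=33):
    """Convert a FASTQ quality string to Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]

def passes_filter(quality_string, max_low_bases=1, min_phred=20):
    """Keep a read if at most max_low_bases bases fall below min_phred.

    Both thresholds are hypothetical values chosen for illustration.
    """
    low = sum(1 for q in phred_scores(quality_string) if q < min_phred)
    return low <= max_low_bases

# Toy barcode quality strings: 'I' is Phred 40, '#' is Phred 2
for qual in ("IIII", "I#II", "I#I#"):
    print(qual, passes_filter(qual))
```

Reads failing a filter of this kind would never reach the aligner, which is consistent with the STAR input count being lower than the raw read count.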