cDNA_Cupcake

No output with get_abundance_post_collapse.py

Open B10inform opened this issue 3 years ago • 7 comments

Hi,

I tried get_abundance_post_collapse.py cDNA_Cupcake/cupcake/test_data/test.collapsed cDNA_Cupcake/cupcake/test_data/cluster_report.csv

My output:

test.collapsed.read_stat.txt

id is_fl stat pbid
m64012_190727_053041/105120134/ccs Y unmapped NA
m64012_190727_053041/21628950/ccs Y unmapped NA
m64012_190727_053041/114950281/ccs Y unmapped NA
m64012_190727_053041/87229073/ccs Y unmapped NA
m64012_190727_053041/153355478/ccs Y unmapped NA
m64012_190727_053041/26019010/ccs Y unmapped NA
m64012_190727_053041/78905991/ccs Y unmapped NA
m64012_190727_053041/5179482/ccs Y unmapped NA
m64012_190727_053041/65865665/ccs Y unmapped NA
m64012_190727_053041/62521911/ccs Y unmapped NA
m64012_190727_053041/80284111/ccs Y unmapped NA
m64012_190727_053041/16974171/ccs Y unmapped NA
m64012_190727_053041/64291197/ccs Y unmapped NA

test.collapsed.abundance.txt


Field explanation

count_fl: Number of associated FL reads
norm_fl: count_fl / total number of FL reads, mapped or unmapped
Total Number of FL reads: 0

pbid count_fl norm_fl

Thank you so much

B10inform avatar Jan 10 '22 18:01 B10inform

That's because they are all unmapped! Please make sure you actually have mapped transcripts. -Liz

Magdoll avatar Jan 12 '22 13:01 Magdoll

Hi Liz,

I used the test data from Cupcake: https://github.com/Magdoll/cDNA_Cupcake/tree/master/cupcake/test_data

get_abundance_post_collapse.py cDNA_Cupcake/cupcake/test_data/test.collapsed cDNA_Cupcake/cupcake/test_data/cluster_report.csv

Thanks

B10inform avatar Jan 12 '22 21:01 B10inform

Hi Liz, Were you able to look into it?

The test data transcripts seem to be mapped:

test.collapsed.gff: https://github.com/Magdoll/cDNA_Cupcake/blob/master/cupcake/test_data/test.collapsed.gff
test.collapsed.read_stat.txt: https://github.com/Magdoll/cDNA_Cupcake/blob/master/cupcake/test_data/test.collapsed.read_stat.txt
test.collapsed.group.txt: https://github.com/Magdoll/cDNA_Cupcake/blob/master/cupcake/test_data/test.collapsed.group.txt
cluster_report.csv: https://github.com/Magdoll/cDNA_Cupcake/blob/master/cupcake/test_data/cluster_report.csv

Thanks

B10inform avatar Jan 19 '22 18:01 B10inform

Hi Liz,

I was wondering if you have had time to look into this? I tried it as described here, using the test_data samples: https://github.com/Magdoll/cDNA_Cupcake/wiki/Cupcake:-supporting-scripts-for-Iso-Seq-after-clustering-step

minimap2 -ax splice -t 30 -uf --secondary=no -C5 hg38.fa hq_isoforms.fastq > hq_isoforms.fastq.sam

sort -k 3,3 -k 4,4n hq_isoforms.fastq.sam > hq_isoforms.fastq.sorted.sam

collapse_isoforms_by_sam.py --input hq_isoforms.fastq --fq -s hq_isoforms.fastq.sorted.sam --dun-merge-5-shorter -o test

get_abundance_post_collapse.py test.collapsed cluster_report.csv

I used cluster_report.csv (https://github.com/Magdoll/cDNA_Cupcake/blob/master/cupcake/test_data/cluster_report.csv)

I am really stuck here. I would really appreciate any help.

Thanks

B10inform avatar Feb 11 '22 01:02 B10inform

I'm going through these data and have encountered the same issue where the abundance script is reporting unmapped reads, but the .sam shows mapped reads. Did you ever find a solution?

KrRi4 avatar Apr 27 '22 18:04 KrRi4

I was able to get this working by editing the values in the "cluster_id" column of the cluster_report file.
I changed each entry from "transcript/#" to "sample1_HQ_transcript/#". This makes the cluster_id in the cluster_report match the transcript name in the .sam header.

KrRi4 avatar Apr 28 '22 19:04 KrRi4
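
For anyone who would rather script that edit than do it by hand, here is a minimal sketch. It is not part of Cupcake. The function name `prefix_cluster_ids` is hypothetical, and the default prefix `sample1_HQ_` is simply the one reported in the comment above; your own prefix depends on how the transcripts are named in your FASTA/FASTQ and .sam files. The only column name it relies on is `cluster_id`; all other columns are copied through unchanged.

```python
import csv


def prefix_cluster_ids(in_csv, out_csv, prefix="sample1_HQ_", column="cluster_id"):
    """Copy in_csv to out_csv, prepending `prefix` to every value in `column`.

    Hypothetical helper, not a Cupcake function. The prefix is an assumption:
    check the transcript names in your own .sam file before choosing it.
    Returns the number of rows that were changed.
    """
    changed = 0
    with open(in_csv, newline="") as fin, open(out_csv, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        if column not in (reader.fieldnames or []):
            raise ValueError(f"column {column!r} not found in {in_csv}")
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames, lineterminator="\n")
        writer.writeheader()
        for row in reader:
            # Leave rows alone if they already carry the prefix, so the
            # script is safe to run twice on the same file.
            if not row[column].startswith(prefix):
                row[column] = prefix + row[column]
                changed += 1
            writer.writerow(row)
    return changed
```

Writing to a new file (for example `cluster_report.prefixed.csv`) keeps the original report intact, and the new file can then be passed to get_abundance_post_collapse.py in place of the original.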

I had the same issue with the test data and my own data. It seems the issue is due to a cluster ID mismatch between the FASTA/FASTQ file (and all the downstream files Cupcake generates from it) and cluster_report.csv. I created a new issue to handle such cluster ID mismatches (#229). Other related issues: #61 #146

ssutharzan avatar Jun 15 '22 19:06 ssutharzan
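
A quick way to check for this kind of mismatch before running the abundance script is to compare the IDs in the two files directly. The sketch below is not part of Cupcake and rests on two assumptions about file layout: that each line of the `.group.txt` file is a collapsed ID, a tab, and a comma-separated list of member transcript IDs, and that cluster_report.csv has a `cluster_id` column. The function names are hypothetical. It compares raw strings only, so treat the result as a rough indication; Cupcake may normalize IDs internally.

```python
import csv


def ids_in_group_file(path):
    """Collect member transcript IDs from a .group.txt file.

    Assumes each line looks like: <collapsed id><TAB><id1>,<id2>,...
    """
    ids = set()
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            _collapsed_id, members = line.split("\t", 1)
            ids.update(members.split(","))
    return ids


def ids_in_cluster_report(path, column="cluster_id"):
    """Collect the distinct values of the cluster_id column."""
    with open(path, newline="") as fh:
        return {row[column] for row in csv.DictReader(fh)}


def compare_ids(group_path, report_path):
    """Return (shared, only_in_group, only_in_report) as sets of IDs."""
    group_ids = ids_in_group_file(group_path)
    report_ids = ids_in_cluster_report(report_path)
    return group_ids & report_ids, group_ids - report_ids, report_ids - group_ids
```

If `shared` comes back empty while the other two sets differ only by a prefix (such as `transcript/0` versus `sample1_HQ_transcript/0`), that points to the ID mismatch described in this thread.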