cDNA_Cupcake
No output with get_abundance_post_collapse.py
Hi,
I tried:
get_abundance_post_collapse.py cDNA_Cupcake/cupcake/test_data/test.collapsed cDNA_Cupcake/cupcake/test_data/cluster_report.csv
My output:
test.collapsed.read_stat.txt
id is_fl stat pbid
m64012_190727_053041/105120134/ccs Y unmapped NA
m64012_190727_053041/21628950/ccs Y unmapped NA
m64012_190727_053041/114950281/ccs Y unmapped NA
m64012_190727_053041/87229073/ccs Y unmapped NA
m64012_190727_053041/153355478/ccs Y unmapped NA
m64012_190727_053041/26019010/ccs Y unmapped NA
m64012_190727_053041/78905991/ccs Y unmapped NA
m64012_190727_053041/5179482/ccs Y unmapped NA
m64012_190727_053041/65865665/ccs Y unmapped NA
m64012_190727_053041/62521911/ccs Y unmapped NA
m64012_190727_053041/80284111/ccs Y unmapped NA
m64012_190727_053041/16974171/ccs Y unmapped NA
m64012_190727_053041/64291197/ccs Y unmapped NA
test.collapsed.abundance.txt
Field explanation
count_fl: Number of associated FL reads
norm_fl: count_fl / total number of FL reads, mapped or unmapped
Total Number of FL reads: 0
pbid count_fl norm_fl
Thank you so much
That's because they are all unmapped! Please make sure you actually have mapped transcripts. -Liz
Hi Liz,
I used the test data from cupcake. https://github.com/Magdoll/cDNA_Cupcake/tree/master/cupcake/test_data.
get_abundance_post_collapse.py cDNA_Cupcake/cupcake/test_data/test.collapsed cDNA_Cupcake/cupcake/test_data/cluster_report.csv
Thanks
Hi Liz, were you able to look into it?
The test data transcripts seem to be mapped:
test.collapsed.gff: https://github.com/Magdoll/cDNA_Cupcake/blob/master/cupcake/test_data/test.collapsed.gff
test.collapsed.read_stat.txt: https://github.com/Magdoll/cDNA_Cupcake/blob/master/cupcake/test_data/test.collapsed.read_stat.txt
test.collapsed.group.txt: https://github.com/Magdoll/cDNA_Cupcake/blob/master/cupcake/test_data/test.collapsed.group.txt
cluster_report.csv: https://github.com/Magdoll/cDNA_Cupcake/blob/master/cupcake/test_data/cluster_report.csv
Thanks
Hi Liz,
I was wondering if you have had time to look into this? I tried it as described here, using the test_data samples: https://github.com/Magdoll/cDNA_Cupcake/wiki/Cupcake:-supporting-scripts-for-Iso-Seq-after-clustering-step
minimap2 -ax splice -t 30 -uf --secondary=no -C5 hg38.fa hq_isoforms.fastq > hq_isoforms.fastq.sam
sort -k 3,3 -k 4,4n hq_isoforms.fastq.sam > hq_isoforms.fastq.sorted.sam
collapse_isoforms_by_sam.py --input hq_isoforms.fastq --fq -s hq_isoforms.fastq.sorted.sam --dun-merge-5-shorter -o test
get_abundance_post_collapse.py test.collapsed cluster_report.csv
I used cluster_report.csv (https://github.com/Magdoll/cDNA_Cupcake/blob/master/cupcake/test_data/cluster_report.csv)
I am really stuck here. I would really appreciate any help.
Thanks
I'm going through these data and have run into the same issue: the abundance script reports every read as unmapped, but the .sam shows mapped reads. Did you ever find a solution?
I was able to get this working by editing the values in the cluster_report file column "cluster_id".
I changed each entry from "transcript/#" to "sample1_HQ_transcript/#". This makes the cluster_id in cluster_report match the transcript name in the .sam header.
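In case it helps anyone else, here is a rough Python sketch of that edit so it does not have to be done by hand. It assumes the report is a CSV with a "cluster_id" column (as described above) and that "sample1_HQ_" is the prefix your transcript names carry; check your own FASTQ/SAM for the actual prefix. The function name prefix_cluster_ids and the sample row are made up for illustration and are not part of Cupcake.

```python
import csv
import io


def prefix_cluster_ids(src, dst, prefix="sample1_HQ_"):
    """Copy a cluster report from src to dst, prepending `prefix` to each cluster_id.

    src and dst are open text file objects. All other columns are copied unchanged.
    Returns the number of data rows written.
    """
    reader = csv.DictReader(src)
    if reader.fieldnames is None or "cluster_id" not in reader.fieldnames:
        raise ValueError("expected a 'cluster_id' column in the cluster report")
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    rows = 0
    for row in reader:
        # Skip IDs that already carry the prefix so the script is safe to re-run.
        if not row["cluster_id"].startswith(prefix):
            row["cluster_id"] = prefix + row["cluster_id"]
        writer.writerow(row)
        rows += 1
    return rows


if __name__ == "__main__":
    # Illustrative in-memory input; a real run would pass open file handles instead.
    sample = io.StringIO(
        "cluster_id,read_id,read_type\n"
        "transcript/0,m64012_190727_053041/105120134/ccs,FL\n"
    )
    out = io.StringIO()
    prefix_cluster_ids(sample, out)
    print(out.getvalue(), end="")
```

With real files, open cluster_report.csv for reading and a new file for writing (with newline=""), pass both to prefix_cluster_ids, and then give the new file to get_abundance_post_collapse.py in place of the original report.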
I had the same issue with the test data and my own. It seems the issue is a cluster ID mismatch between the FASTA/FASTQ file (and all the downstream files Cupcake generates from it) and cluster_report.csv. I created a new issue to handle such cluster ID mismatches (#229). Other related issues: #61 #146
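A quick way to check whether you are hitting this mismatch before running the abundance script is to compare the two sets of IDs directly. The sketch below is only an approximation under assumptions: it treats the .group.txt file as tab-separated lines of a PB ID followed by comma-separated member IDs, and it does a plain string comparison against the "cluster_id" column, which may be stricter than whatever matching Cupcake does internally. The function names are hypothetical.

```python
import csv
import io


def load_cluster_ids(report):
    """Return the set of cluster_id values from an open cluster report CSV."""
    return {row["cluster_id"] for row in csv.DictReader(report)}


def load_group_members(group):
    """Return the set of member IDs from an open .group.txt file.

    Assumed layout: '<pbid>\\t<member1>,<member2>,...' on each line.
    """
    members = set()
    for line in group:
        line = line.strip()
        if not line:
            continue
        _pbid, ids = line.split("\t", 1)
        members.update(ids.split(","))
    return members


def unmatched_members(report, group):
    """List group members that have no identical cluster_id in the report."""
    return sorted(load_group_members(group) - load_cluster_ids(report))


if __name__ == "__main__":
    # Illustrative in-memory data showing the kind of mismatch discussed here.
    report = io.StringIO("cluster_id,read_id,read_type\ntranscript/0,r1,FL\n")
    group = io.StringIO("PB.1.1\tsample1_HQ_transcript/0\n")
    print(unmatched_members(report, group))
```

If the returned list is non-empty, the IDs in the collapsed output and in cluster_report.csv do not line up as plain strings, which would be consistent with every read being reported as unmapped.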