cDNA_Cupcake

List index out of range error in collapse_isoforms_by_sam.py

Open hvbakel opened this issue 5 years ago • 3 comments

Dear Liz, when running cDNA_Cupcake, I'm encountering the following error:

Traceback (most recent call last):
  File "/hpc/users/pintod02/.conda/envs/pbisoseq/bin/collapse_isoforms_by_sam.py", line 4, in <module>
    __import__('pkg_resources').run_script('cupcake==7.0', 'collapse_isoforms_by_sam.py')
  File "/hpc/users/pintod02/.conda/envs/pbisoseq/lib/python2.7/site-packages/pkg_resources/__init__.py", line 666, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/hpc/users/pintod02/.conda/envs/pbisoseq/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1453, in run_script
    exec(code, namespace, namespace)
  File "/hpc/users/pintod02/.conda/envs/pbisoseq/lib/python2.7/site-packages/cupcake-7.0-py2.7-linux-x86_64.egg/EGG-INFO/scripts/collapse_isoforms_by_sam.py", line 248, in <module>
    main(args)
  File "/hpc/users/pintod02/.conda/envs/pbisoseq/lib/python2.7/site-packages/cupcake-7.0-py2.7-linux-x86_64.egg/EGG-INFO/scripts/collapse_isoforms_by_sam.py", line 207, in main
    collapse_fuzzy_junctions(f_good.name, f_txt.name, args.allow_extra_5exon, internal_fuzzy_max_dist=args.max_fuzzy_junction)
  File "/hpc/users/pintod02/.conda/envs/pbisoseq/lib/python2.7/site-packages/cupcake-7.0-py2.7-linux-x86_64.egg/EGG-INFO/scripts/collapse_isoforms_by_sam.py", line 152, in collapse_fuzzy_junctions
    _size = get_fl_from_id(group_info[pbid])
  File "/hpc/users/pintod02/.conda/envs/pbisoseq/lib/python2.7/site-packages/cupcake-7.0-py2.7-linux-x86_64.egg/EGG-INFO/scripts/collapse_isoforms_by_sam.py", line 92, in get_fl_from_id
    return sum(int(_id.split('/')[1].split('p')[0][1:]) for _id in members)
  File "/hpc/users/pintod02/.conda/envs/pbisoseq/lib/python2.7/site-packages/cupcake-7.0-py2.7-linux-x86_64.egg/EGG-INFO/scripts/collapse_isoforms_by_sam.py", line 92, in <genexpr>
    return sum(int(_id.split('/')[1].split('p')[0][1:]) for _id in members)
IndexError: list index out of range

Any idea what the issue could be?

hvbakel, Jun 03 '19 13:06

This is usually a sequence ID mismatch issue. That said, you are on an older version of Cupcake. Can you please update to the latest (v7.5) first? If the error remains, can you share the command you used and the sequence format? (Just give me the first 5 sequence IDs or so.)

Magdoll, Jun 03 '19 18:06
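
A minimal sketch of the ID mismatch described above, for anyone checking their own sequence IDs. It reuses the expression shown on line 92 of the first traceback, which takes the second '/'-separated field of each ID and raises IndexError when an ID contains no '/'. The helper names fl_count_from_id and find_unparseable_ids and the example IDs are hypothetical, chosen for illustration only, and are not part of Cupcake.

```python
def fl_count_from_id(seq_id):
    """Apply the expression from the traceback to one ID.

    Returns the parsed integer, or None where the original expression
    would raise IndexError (no second '/' field) or ValueError
    (the field is not of the form <letter><digits>p...).
    """
    try:
        return int(seq_id.split('/')[1].split('p')[0][1:])
    except (IndexError, ValueError):
        return None


def find_unparseable_ids(seq_ids):
    """Return the IDs that the traceback's expression cannot parse."""
    return [s for s in seq_ids if fl_count_from_id(s) is None]


if __name__ == "__main__":
    # Hypothetical IDs: the first has the slash-separated shape the
    # expression expects, the second has no '/' at all.
    example_ids = ["c12/f3p0/1234", "plain_id_without_slash"]
    print(find_unparseable_ids(example_ids))
```

Running a check like this over the first few IDs of the input file is one way to confirm whether the headers match the format the script expects before re-running the collapse step.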

I am getting the same thing. Traceback below:

Traceback (most recent call last):
  File "/opt/conda/bin/collapse_isoforms_by_sam.py", line 235, in <module>
    main(args)
  File "/opt/conda/bin/collapse_isoforms_by_sam.py", line 185, in main
    for recs in iter: # recs is {'+': list of list of records, '-': list of list of records}
  File "/opt/conda/lib/python3.7/site-packages/cupcake/tofu/branch/branch_simple2.py", line 81, in iter_gmap_sam
    records = [next(quality_alignments)]
  File "/opt/conda/lib/python3.7/site-packages/cupcake/tofu/branch/branch_simple2.py", line 108, in get_quality_alignments
    for r in gmap_sam_reader:
  File "/opt/conda/lib/python3.7/site-packages/cupcake/io/BioReaders.py", line 377, in __next__
    return GMAPSAMRecord(line, self.ref_len_dict, self.query_len_dict)
  File "/opt/conda/lib/python3.7/site-packages/cupcake/io/BioReaders.py", line 182, in __init__
    self.process(record_line, ref_len_dict, query_len_dict)
  File "/opt/conda/lib/python3.7/site-packages/cupcake/io/BioReaders.py", line 400, in process
    self.sID = raw[2]
IndexError: list index out of range

Here is my command:

collapse_isoforms_by_sam.py --input m64120_200619_171832.flnc.clustered.fasta -s mapped_isoseq_reads/m64120_200619_171832.flnc.clustered.fasta.sorted.sam --dun-merge-5-shorter -o m64120_200619_171832

Does collapse_isoforms_by_sam.py take fasta as input?

dtlyfoung, Jul 24 '20 03:07
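
The second traceback fails at self.sID = raw[2] while reading the SAM file, which is consistent with an alignment line that has fewer than three fields. The sketch below assumes that raw is the line split on tabs (the traceback does not show this) and relies on the SAM format requiring 11 mandatory tab-separated fields per alignment line. The function name find_short_sam_lines is hypothetical and not part of Cupcake.

```python
def find_short_sam_lines(lines, min_fields=11):
    """Report alignment lines with fewer than min_fields tab-separated fields.

    Header lines (starting with '@') are skipped. Returns a list of
    (line_number, field_count) tuples, with line numbers starting at 1.
    Blank lines are reported with a field count of 1.
    """
    problems = []
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip('\n')
        if line.startswith('@'):
            continue
        field_count = len(line.split('\t'))
        if field_count < min_fields:
            problems.append((line_number, field_count))
    return problems


if __name__ == "__main__":
    # Made-up records for illustration: one header line, one complete
    # alignment line, one truncated line, and one blank line.
    sample = [
        "@HD\tVN:1.6\n",
        "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*\n",
        "r2\t4\n",
        "\n",
    ]
    print(find_short_sam_lines(sample))
```

For a real file, an open file handle can be passed directly, for example find_short_sam_lines(open(path)), where path is the sorted SAM file given to -s.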

My issue turned out to be an upstream isoseq3 issue. I was on version 3.2.2 and upgraded to 3.3.0. It must have been something in the headers produced by v3.2.2 versus v3.3.0.

dtlyfoung, Jul 24 '20 23:07