GFFtools-GX

gff_to_bed TypeError: len() of unsized object

Open tzintzuni opened this issue 8 years ago • 0 comments

I encountered this error while converting GCF_000001405.33_GRCh38.p7_genomic.gff with gff_to_bed.py.

Traceback (most recent call last):
  File "/.../gff_to_bed.py", line 119, in <module>
    __main__() 
  File "/.../gff_to_bed.py", line 116, in __main__
    writeBED(Transcriptdb)
  File "/.../gff_to_bed.py", line 55, in writeBED
    exon_cnt = len(ent1['exons'][idx])
TypeError: len() of unsized object

Some features have no records that would go into BED blocks, so their entry in ent1['exons'] is a 0-dimensional numpy array containing nan instead of an array of coordinate pairs. Calling len() on a 0-dimensional array raises the TypeError above. There may be a deeper problem here, but guarding the exon loop with an ndim check seems to have worked:

def writeBED(tinfo):
    """
    writing result files in bed format 

    @args tinfo: list of genes 
    @type tinfo: numpy object  
    """

    for ent1 in tinfo:
        child_flag = False  

        for idx, tid in enumerate(ent1['transcripts']):
            child_flag = True 
            exon_cnt = 0
            exon_len = ''
            exon_cod = '' 
            rel_start = None 
            rel_stop = None 
            if ent1['exons'][idx].ndim > 0:
                exon_cnt = len(ent1['exons'][idx])
                for idz, ex_cod in enumerate(ent1['exons'][idx]):#check for exons of corresponding transcript  
                    exon_len += '%d,' % (ex_cod[1]-ex_cod[0]+1)
                    if idz == 0: #calculate the relative start position 
                        exon_cod += '0,'
                        rel_start = int(ex_cod[0])-1 
                        rel_stop = int(ex_cod[1])
                    else:
                        exon_cod += '%d,' % (ex_cod[0]-1-rel_start) ## shifting the coordinates to zero 
                        rel_stop = int(ex_cod[1])
...
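
For anyone who wants to reproduce the failure outside of gff_to_bed.py, here is a minimal sketch. It assumes the empty entry is `np.array(np.nan)`, which is my reading of the data rather than something confirmed in the parser. The names `count_exons`, `placeholder` and `real` are made up for illustration and are not part of GFFtools-GX.

```python
import numpy as np


def count_exons(exons):
    """Return the number of exon rows, or 0 for a 0-dimensional placeholder.

    Hypothetical helper: it applies the same ndim check used in the fix above.
    """
    exons = np.asarray(exons)
    if exons.ndim == 0:
        # A 0-d array (for example a lone nan) has no length, so treat it as "no exons".
        return 0
    return len(exons)


# Assumed shape of a transcript entry that has no exon records.
placeholder = np.array(np.nan)
# Assumed shape of a normal entry: one (start, stop) row per exon.
real = np.array([[100, 200], [300, 400]])

# Calling len() directly on the placeholder fails with the error from the traceback.
try:
    len(placeholder)
    message = None
except TypeError as err:
    message = str(err)

print(message)
print(count_exons(placeholder), count_exons(real))
```

With the check in place, transcripts without exons simply get an exon count of 0 instead of crashing the conversion. Whether such transcripts should be written to the BED file at all is a separate question.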

tzintzuni, Jul 20 '16 17:07