GFFtools-GX
gff_to_bed TypeError: len() of unsized object
I encountered this issue when converting GCF_000001405.33_GRCh38.p7_genomic.gff:
Traceback (most recent call last):
  File "/.../gff_to_bed.py", line 119, in <module>
    __main__()
  File "/.../gff_to_bed.py", line 116, in __main__
    writeBED(Transcriptdb)
  File "/.../gff_to_bed.py", line 55, in writeBED
    exon_cnt = len(ent1['exons'][idx])
TypeError: len() of unsized object
In some cases a feature has no records that would go into BED blocks, so its exon entry is a 0-dimensional numpy array containing nan rather than an array of coordinate pairs. I suspect there could be deeper problems here, but in the end the following fix seems to have worked:
def writeBED(tinfo):
    """
    writing result files in bed format
    @args tinfo: list of genes
    @type tinfo: numpy object
    """
    for ent1 in tinfo:
        child_flag = False
        for idx, tid in enumerate(ent1['transcripts']):
            child_flag = True
            exon_cnt = 0
            exon_len = ''
            exon_cod = ''
            rel_start = None
            rel_stop = None
            if ent1['exons'][idx].ndim > 0:
                exon_cnt = len(ent1['exons'][idx])
                for idz, ex_cod in enumerate(ent1['exons'][idx]):  # check for exons of corresponding transcript
                    exon_len += '%d,' % (ex_cod[1]-ex_cod[0]+1)
                    if idz == 0:  # calculate the relative start position
                        exon_cod += '0,'
                        rel_start = int(ex_cod[0])-1
                        rel_stop = int(ex_cod[1])
                    else:
                        exon_cod += '%d,' % (ex_cod[0]-1-rel_start)  # shifting the coordinates to zero
                        rel_stop = int(ex_cod[1])
...
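For anyone who wants to reproduce the failure outside the script, here is a minimal sketch of why the `ndim` check helps. The arrays and the `exon_count` helper are made up for illustration and are not part of gff_to_bed.py; I am assuming the empty case looks like `np.array(np.nan)`, which matches the error message above.

```python
import numpy as np

# Assumed shapes, for illustration only (not taken from the real GFF parse):
no_exons = np.array(np.nan)                               # 0-dimensional array holding nan
with_exons = np.array([[11869, 12227], [12613, 12721]])   # two made-up exon (start, stop) pairs


def exon_count(exons):
    """Hypothetical helper: number of exon rows, or 0 for a 0-dimensional placeholder."""
    exons = np.asarray(exons)
    # A 0-dimensional array has no length, so check ndim before calling len().
    return len(exons) if exons.ndim > 0 else 0


# Calling len() directly on the 0-dimensional array reproduces the reported error.
try:
    len(no_exons)
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
print(exon_count(no_exons), exon_count(with_exons))
```

Guarding on `ndim` only skips the block calculation for such transcripts; it does not explain why they have no exon records in the first place.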