Creating gff from genePred returns error

Open adomingues opened this issue 9 years ago • 4 comments

I was trying to create a custom MISO annotation using these instructions, but got the following error:

python /local/data/home/adomingu/env/lib/python2.7/site-packages/rnaseqlib/gff/gff_make_annotation.py ./ ./gff --flanking-rule commonshortest --genome-label Zv9.junker
Making GFF alternative events annotation...
  - UCSC tables read from: /fsimb/groups/imb-kettinggr/genomes/Danio_rerio/Ensembl/Zv9/Annotation/Archives/archive-2014-05-23-16-02-26/Genes/miso
  - Output dir: /fsimb/groups/imb-kettinggr/genomes/Danio_rerio/Ensembl/Zv9/Annotation/Archives/archive-2014-05-23-16-02-26/Genes/miso/gff
Loaded 1 UCSC tables.
Loading tables...
Traceback (most recent call last):
  File "/local/data/home/adomingu/env/lib/python2.7/site-packages/rnaseqlib/gff/gff_make_annotation.py", line 65, in <module>
    main()
  File "/local/data/home/adomingu/env/lib/python2.7/site-packages/rnaseqlib/gff/gff_make_annotation.py", line 61, in main
    make_annotation(args)
  File "/local/data/home/adomingu/env/lib/python2.7/site-packages/rnaseqlib/gff/gff_make_annotation.py", line 35, in make_annotation
    sanitize=args.sanitize)
  File "/local/data/home/adomingu/env/local/lib/python2.7/site-packages/rnaseqlib/events/defineEvents.py", line 933, in defineAllSplicing
    sg = splicegraph.SpliceGraph(table_fnames)
  File "/local/data/home/adomingu/env/local/lib/python2.7/site-packages/rnaseqlib/events/SpliceGraph.py", line 446, in __init__
    self.load_tables()
  File "/local/data/home/adomingu/env/local/lib/python2.7/site-packages/rnaseqlib/events/SpliceGraph.py", line 496, in load_tables
    self.tables[table_label] = parseTables.readTable(table_fname)
  File "/local/data/home/adomingu/env/local/lib/python2.7/site-packages/rnaseqlib/events/parseTables.py", line 56, in readTable
    for col_num in range(len(header))])
IndexError: list index out of range

I converted the GTF to genePred using gtfToGenePred -genePredExt gtf genepred, and the output was not exactly the same as the format shown in rnaseqlib's instructions:

gtfToGenePred -genePredExt Danio_rerio.Zv9.74.ImprovedUTRs.Junker_et_al_2014.noZv.so.gtf ensGene.txt
head ensGene.txt
ENSDART00000152494      chr1    -       2286    4806    4806    4806    2       2286,4578,      2340,4806,      0       ENSDARG00000096573      none     none    -1,-1,
ENSDART00000152276      chr1    +       3799    15148   15148   15148   8       3799,4645,5177,5532,6246,9049,9239,13072,       3930,4809,5202,5646,6393,9141,9360,15148,        0       ENSDARG00000076045      none    none    -1,-1,-1,-1,-1,-1,-1,-1,
ENSDART00000109479      chr1    +       3799    15138   15138   15138   9       3799,4645,5177,5532,6246,11049,11239,13072,14304,       3930,4809,5202,5646,6393,11141,11360,13958,15138,        0       ENSDARG00000076045      none    none    -1,-1,-1,-1,-1,-1,-1,-1,-1,

I tested it further using (i) the ensGene track from UCSC downloaded for both Zv9 and hg19; (ii) a file with the example genePred in the help page. All failed with the same message.

I have no idea what the problem is.
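One way to narrow this down is to check the input table itself. The traceback ends in a loop over range(len(header)), which suggests (this is only inferred from the traceback, not confirmed from the rnaseqlib source) that readTable treats the first line as a header and indexes every later row by the header's column positions. Under that assumption, any row with fewer fields than the first line would raise exactly this IndexError. The sketch below is a standalone check and not part of rnaseqlib; the function name find_short_rows is hypothetical.

```python
def find_short_rows(lines, delimiter="\t"):
    """Compare every row against the field count of the first non-empty line.

    Returns (header_len, short), where header_len is the number of fields in
    the first non-empty line (None if there is none) and short is a list of
    (line_number, n_fields) for later rows that have fewer fields.
    Assumption: the table is tab-delimited and its first line sets the
    expected number of columns.
    """
    header_len = None
    short = []
    for line_num, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        n_fields = len(line.split(delimiter))
        if header_len is None:
            header_len = n_fields  # first non-empty line defines the width
            continue
        if n_fields < header_len:
            short.append((line_num, n_fields))
    return header_len, short


# Small in-memory demonstration: the third line is one field short.
sample = ["a\tb\tc\n", "1\t2\t3\n", "1\t2\n"]
header_len, short = find_short_rows(sample)
print(header_len, short)
```

To run it on the real file, pass an open file handle, for example find_short_rows(open("ensGene.txt")). If no short rows are reported, the field count itself is worth checking: the gtfToGenePred -genePredExt rows shown above have 15 fields and no header line, whereas ensGene tables downloaded from the UCSC database normally carry an extra leading bin column.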

adomingues, Apr 17 '15 12:04