isotools
gff file incompatibility?
Hi Matthias, Thank you for making such a nice tool! I would like to use it, but I cannot seem to format my GFF file so that it is compatible. Which version of tabix should I be using?
I'm getting the following error:
annotation_fn=f'sorted_fixed_input.gff3.gz'
#create the IsoTools transcriptome object from the reference annotation
isoseq=Transcriptome.from_reference(annotation_fn)
0%| | 0.00/15.1M [00:00<?, ?B/s]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[8], line 3
1 annotation_fn=f'sorted_fixed_input.gff3.gz'
2 #create the IsoTools transcriptome object from the reference annotation
----> 3 isoseq=Transcriptome.from_reference(annotation_fn)
File ~/anaconda3/envs/metacell/lib/python3.11/site-packages/isotools/transcriptome.py:55, in Transcriptome.from_reference(cls, reference_file, file_format, **kwargs)
53 tr = cls()
54 tr.chimeric = {}
---> 55 tr.data = import_ref_transcripts(reference_file, tr, file_format, **kwargs)
56 tr.infos = {'reference_file': reference_file, 'isotools_version': __version__}
57 tr.filter = {'gene': DEFAULT_GENE_FILTER.copy(),
58 'transcript': DEFAULT_TRANSCRIPT_FILTER.copy(),
59 'reference': DEFAULT_REF_TRANSCRIPT_FILTER.copy()}
File ~/anaconda3/envs/metacell/lib/python3.11/site-packages/isotools/_transcriptome_io.py:1064, in import_ref_transcripts(fn, transcriptome, file_format, chromosomes, gene_categories, short_exon_th, **kwargs)
1062 exons, transcripts, gene_infos, cds_start, cds_stop, skipped = _read_gtf_file(fn, chromosomes, **kwargs)
1063 else: # gff/gff3
-> 1064 exons, transcripts, gene_infos, cds_start, cds_stop, skipped = _read_gff_file(fn, chromosomes, **kwargs)
1066 if skipped:
1067 logger.info('skipped the following categories: %s', skipped)
File ~/anaconda3/envs/metacell/lib/python3.11/site-packages/isotools/_transcriptome_io.py:1012, in _read_gff_file(file_name, chromosomes, progress_bar)
1010 with tqdm(total=path.getsize(file_name), unit_scale=True, unit='B', unit_divisor=1024, disable=not progress_bar) as pbar, TabixFile(file_name) as gff:
1011 chrom_ids = get_gff_chrom_dict(gff, chromosomes)
-> 1012 for line in gff.fetch():
1013 file_pos = gff.tell() >> 16 # the lower 16 bit are the position within the zipped block
1014 if pbar.n < file_pos:
File ~/anaconda3/envs/metacell/lib/python3.11/site-packages/pysam/libctabix.pyx:499, in pysam.libctabix.TabixFile.fetch()
ValueError: could not create iterator, possible tabix version mismatch
Thank you very much for your help, Best, Ruth
Hi, A more recent htslib version (HTSlib/1.17-GCC-12.2.0) solved my issue.
Thanks, Best, Ruth
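For anyone else who hits this error: in my case the htslib version was the cause, but the same message may also appear for other reasons. Two things that seem worth ruling out first are that the file was compressed with plain gzip rather than bgzip, and that the tabix index next to it is missing. Below is a minimal sketch of such a check using only the Python standard library. The helper names `is_bgzf` and `has_tabix_index` are made up for illustration and are not part of IsoTools or pysam, and the assumption that the index sits next to the file with a `.tbi` or `.csi` suffix is mine.

```python
import os
import struct

# The standard 28-byte empty BGZF block that bgzip appends as an end-of-file marker.
BGZF_EOF = bytes.fromhex("1f8b08040000000000ff0600424302001b0003000000000000000000")


def is_bgzf(path):
    """Return True if the file starts with a BGZF (bgzip) block header.

    Hypothetical helper: it only inspects the first block, not the whole file.
    """
    with open(path, "rb") as handle:
        header = handle.read(12)
        if len(header) < 12:
            return False
        # gzip magic bytes, deflate method, and the FEXTRA flag (bit 2) must be set
        if header[0] != 0x1F or header[1] != 0x8B or header[2] != 8:
            return False
        if not header[3] & 4:
            return False
        # XLEN: length of the extra field, little-endian 16 bit
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = handle.read(xlen)

    # Walk the extra subfields and look for the 'BC' subfield of length 2
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2 = extra[pos], extra[pos + 1]
        (slen,) = struct.unpack("<H", extra[pos + 2:pos + 4])
        if (si1, si2, slen) == (ord("B"), ord("C"), 2):
            return True
        pos += 4 + slen
    return False


def has_tabix_index(path):
    """Return True if a .tbi or .csi file exists next to the given file (assumed location)."""
    return os.path.exists(path + ".tbi") or os.path.exists(path + ".csi")
```

Calling `is_bgzf('sorted_fixed_input.gff3.gz')` and `has_tabix_index('sorted_fixed_input.gff3.gz')` before `Transcriptome.from_reference` should show whether the input itself is the problem. If both return True and the error persists, a version mismatch like mine is the more likely cause.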
Hi, thank you for reporting. I will leave this open until I have fixed the versions of the dependencies.