ribotricer
ORF Annotations
Hello,
I was wondering if there is a way to further break down the ORF annotations reported by the ribotricer algorithm.
I would like to get a sense of the proportions of different non-canonical ORF types, such as extended, truncated, polycistronic, or internal ORFs.
This is not currently supported internally, but it is doable if you pass ribotricer a custom index file that follows the format of its own index file and carries these annotations.
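As a rough illustration of the tallying step, here is a minimal sketch that counts ORF types in a tab-separated table. It assumes the table has a header row with an `ORF_type` column, as in a ribotricer-style index; the function name `orf_type_proportions` and the example rows are hypothetical and not part of ribotricer.

```python
import csv
import io
from collections import Counter


def orf_type_proportions(handle):
    """Return {ORF_type: fraction of rows} for a tab-separated table.

    Assumes a header row containing an 'ORF_type' column.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter(row["ORF_type"] for row in reader)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {orf_type: n / total for orf_type, n in counts.items()}


# Hypothetical example rows, for illustration only.
example = io.StringIO(
    "ORF_ID\tORF_type\n"
    "orf1\textended\n"
    "orf2\textended\n"
    "orf3\ttruncated\n"
    "orf4\tinternal\n"
)
props = orf_type_proportions(example)
for orf_type, fraction in sorted(props.items()):
    print(f"{orf_type}\t{fraction:.2f}")
```

For a real file, the same function can be called on `open(path, newline="")` instead of the in-memory example.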
Hello,
I would like to provide an index file of custom ORFs and check whether they are actually translated. I created a TSV file that tries to imitate the ribotricer index:
ORF_ID ORF_type transcript_id transcript_type gene_id gene_name gene_type chrom strand start_codon coordinate
ENSG00000000971|ENST00000695973:ORF-62539:240:1977:1937:1977 novel ENST00000695973 novel ENSG00000000971 xxx xxx 1 +NNN 196720308-196720347
ENSG00000000971|ENST00000695986:ORF-65386:62:491:412:491 novel ENST00000695986 novel ENSG00000000971 xxx xxx 1 +NNN 196675450-196675528
ENSG00000000971|ENST00000696026:ORF-57440:84:1800:1780:1800 novel ENST00000696026 novel ENSG00000000971 xxx xxx 1 +NNN 196715770-196715773,196725127-196725142
As you can see, I left out some information, e.g. the start codon, gene name, and gene type, and the ORF ID also looks different. I think ribotricer cannot use this file: I now get "0lines [00:00, ?lines/s]" in the log file, which did not happen before when I used ribotricer to detect the ORFs, and I do not get any translated ORFs. So I think I need to adjust the format. Do you know which part of the file I need to change, or whether all columns need to look exactly as ribotricer defines them?
Thanks a lot and best regards,
Christina
Hi @ChrissiKalk97 - custom index formats are currently not supported. My suggestion would be to create an index file that has exactly the same columns as the one produced by ribotricer index. The most important ones are the chromosome, start, and end locations. You can fill the start codon, gene_name, and gene_type with dummy values (though they should not be that difficult to annotate). Please assign a unique ORF_ID to each ORF.
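A minimal sketch of this suggestion, under stated assumptions: the column names and their order are copied from the header shown earlier in this thread and should be checked against an index generated by ribotricer itself. The helper `write_index`, the dummy values, and the example ORF are hypothetical. The sketch writes one tab-separated row per ORF, fills the three optional fields with placeholders, and rejects duplicate ORF_IDs.

```python
import csv
import io

# Assumed column layout, taken from the header posted in this thread.
COLUMNS = [
    "ORF_ID", "ORF_type", "transcript_id", "transcript_type", "gene_id",
    "gene_name", "gene_type", "chrom", "strand", "start_codon", "coordinate",
]

# Placeholder values for the fields that may be left as dummies.
DUMMY = {"gene_name": "xxx", "gene_type": "xxx", "start_codon": "NNN"}


def write_index(orfs, handle):
    """Write ORF dicts as a tab-separated table with one field per column."""
    writer = csv.DictWriter(
        handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    seen = set()
    for orf in orfs:
        row = {**DUMMY, **orf}
        missing = [c for c in COLUMNS if c not in row]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        if row["ORF_ID"] in seen:
            raise ValueError(f"duplicate ORF_ID: {row['ORF_ID']}")
        seen.add(row["ORF_ID"])
        writer.writerow(row)


# Hypothetical example ORF, for illustration only.
orfs = [{
    "ORF_ID": "custom_orf_1", "ORF_type": "novel",
    "transcript_id": "ENST00000695973", "transcript_type": "novel",
    "gene_id": "ENSG00000000971", "chrom": "1", "strand": "+",
    "coordinate": "196720308-196720347",
}]
out = io.StringIO()
write_index(orfs, out)
lines = out.getvalue().splitlines()
```

Every written line then has the same number of tab-separated fields as the header, which makes it easy to spot rows where two values (for example strand and start codon) have run together.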