Output files do not always contain a CDS field

Open vccarneiro opened this issue 2 years ago • 3 comments

Hello,

I was checking the output files because I want to generate a protein FASTA in order to run BUSCO. I noticed that some files, such as "combined_with_CDS_high_and_low_confidence_merged.gtf", contain CDS records for some genes but not for others. Why is that? Exon and CDS coordinates are not always identical.

I would appreciate it if you could tell me how to obtain the amino acid sequences corresponding only to the CDS coordinates of all genes in the above file. So far I have used the Augustus Python script "getAnnoFastaFromJoingenes.py", but it can only extract sequences from .gtf files with annotated CDS coordinates.

Any help would be appreciated.

Best wishes,
Vitor
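To make the problem concrete, here is a minimal sketch of how one might list the transcripts in a GTF file that have exon records but no CDS records. It assumes standard nine-column, tab-separated GTF lines with a `transcript_id` attribute. The function name `transcripts_without_cds` is hypothetical and not part of any tool mentioned here.

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs of the form: key "value"
_ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def transcripts_without_cds(gtf_lines):
    """Return sorted transcript IDs that have exon records but no CDS records.

    Assumes standard 9-column, tab-separated GTF lines (hypothetical helper).
    """
    features = defaultdict(set)  # transcript_id -> set of feature types seen
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # skip malformed lines
        attrs = dict(_ATTR_RE.findall(cols[8]))
        tid = attrs.get("transcript_id")
        if tid is None:
            continue  # e.g. gene-level records without a transcript_id
        features[tid].add(cols[2])
    return sorted(
        tid for tid, kinds in features.items()
        if "exon" in kinds and "CDS" not in kinds
    )
```

It could be run on the file above with something like `with open("combined_with_CDS_high_and_low_confidence_merged.gtf") as fh: print(transcripts_without_cds(fh))`. The resulting list only shows which transcripts lack CDS annotation; it does not explain why they lack it.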

vccarneiro, Jan 26 '22 15:01