BRAKER
The easiest way to convert BRAKER output to EVM
Run BRAKER with the --gff3 parameter. The output file braker.gff3 can be converted to EVM GFF3 format using augustus_GFF3_to_EVM_GFF3.pl. However, before converting, you need to delete the ; at the end of each line of braker.gff3. For example:
$ head braker.gff3
HiC_scaffold_3 AUGUSTUS gene 1720719 1722807 . + . ID=jg51092;
HiC_scaffold_3 AUGUSTUS mRNA 1720719 1722807 . + . ID=jg51092.t1;Parent=jg51092;
HiC_scaffold_3 AUGUSTUS start_codon 1720719 1720721 . + 0 ID=jg51092.t1.start1;Parent=jg51092.t1;
HiC_scaffold_3 AUGUSTUS CDS 1720719 1720866 0.78 + 0 ID=jg51092.t1.CDS1;Parent=jg51092.t1;
$ sed -i 's/;$//g' braker.gff3
$ head braker.gff3
HiC_scaffold_3 AUGUSTUS gene 1720719 1722807 . + . ID=jg51092
HiC_scaffold_3 AUGUSTUS mRNA 1720719 1722807 . + . ID=jg51092.t1;Parent=jg51092
HiC_scaffold_3 AUGUSTUS start_codon 1720719 1720721 . + 0 ID=jg51092.t1.start1;Parent=jg51092.t1
HiC_scaffold_3 AUGUSTUS CDS 1720719 1720866 0.78 + 0 ID=jg51092.t1.CDS1;Parent=jg51092.t1
$ EVidenceModeler-1.1.1/EvmUtils/misc/augustus_GFF3_to_EVM_GFF3.pl braker.gff3 > braker.evm.gff3
$ head braker.evm.gff3
HiC_scaffold_99 Braker gene 5882 6218 . + . ID=gene.file_1_file_1_jg16.t1;Name=Braker%20prediction
HiC_scaffold_99 Braker mRNA 5882 6218 . + . ID=model.file_1_file_1_jg16.t1;Parent=gene.file_1_file_1_jg16.t1;Name=Braker%20prediction
HiC_scaffold_99 Braker exon 5882 6218 . + . ID=model.file_1_file_1_jg16.t1.exon1;Parent=model.file_1_file_1_jg16.t1
HiC_scaffold_99 Braker CDS 5882 6218 . + . ID=cds.model.file_1_file_1_jg16.t1;Parent=model.file_1_file_1_jg16.t1
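The semicolon-stripping step above can also be done without editing braker.gff3 in place, followed by a quick check that nothing was missed. This is only a minimal sketch: the temporary directory, the two sample lines, and the output name braker.nosemi.gff3 are assumptions made for illustration, and the pattern also tolerates trailing whitespace after the ;.

```shell
# Sample input mimicking BRAKER --gff3 output (tab-separated, trailing ';').
tmp=$(mktemp -d)
printf 'HiC_scaffold_3\tAUGUSTUS\tgene\t1720719\t1722807\t.\t+\t.\tID=jg51092;\n' > "$tmp/braker.gff3"
printf 'HiC_scaffold_3\tAUGUSTUS\tmRNA\t1720719\t1722807\t.\t+\t.\tID=jg51092.t1;Parent=jg51092;\n' >> "$tmp/braker.gff3"

# Write to a new file (hypothetical name) so the original stays untouched.
sed 's/;[[:space:]]*$//' "$tmp/braker.gff3" > "$tmp/braker.nosemi.gff3"

# Count lines that still end in ';'. grep exits non-zero when the count is 0,
# hence the '|| true'.
remaining=$(grep -c ';[[:space:]]*$' "$tmp/braker.nosemi.gff3" || true)
echo "lines still ending in ';': $remaining"
```

Keeping the original file makes it easy to rerun the conversion if the cleaned file turns out to be wrong.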
If you skip this step, the ; at the end of each line causes errors in the converted EVM GFF3 file:
$ EVidenceModeler-1.1.1/EvmUtils/misc/augustus_GFF3_to_EVM_GFF3.pl braker.gff3|head
HiC_scaffold_96 Augustus gene 13129 13184 . - . ID=gene.file_1_file_1_jg32.t1;;Name=Augustus%20prediction
HiC_scaffold_96 Augustus mRNA 13129 13184 . - . ID=model.file_1_file_1_jg32.t1;;Parent=gene.file_1_file_1_jg32.t1;;Name=Augustus%20prediction
HiC_scaffold_96 Augustus exon 13129 13184 . - . ID=model.file_1_file_1_jg32.t1;.exon1;Parent=model.file_1_file_1_jg32.t1
HiC_scaffold_96 Augustus CDS 13129 13184 . - . ID=cds.model.file_1_file_1_jg32.t1;;Parent=model.file_1_file_1_jg32.t1
As you can see, in the exon line an incorrect ; appears in the middle of the ID: ID=model.file_1_file_1_jg32.t1;.exon1;. This stray ; causes the error "ERROR, CDS cds.model.file_1_file_1_jg32.t1.HiC_scaffold_14:43163129-43163185 does not fully map within an exon record." when running the validator from EVM.
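Before running the EVM validator, a quick scan of the converted file can flag this kind of damage. The following is a rough sketch, not part of EVM: the file name converted.gff3 and the two sample lines are illustrative assumptions, and it only looks for the two patterns visible in the output above (;; and ;. in the 9th column).

```shell
tmp=$(mktemp -d)
# First sample line is malformed like the exon line above; the second is clean.
printf 'HiC_scaffold_96\tAugustus\texon\t13129\t13184\t.\t-\t.\tID=model.file_1_file_1_jg32.t1;.exon1;Parent=model.file_1_file_1_jg32.t1\n' > "$tmp/converted.gff3"
printf 'HiC_scaffold_99\tBraker\texon\t5882\t6218\t.\t+\t.\tID=model.file_1_file_1_jg16.t1.exon1;Parent=model.file_1_file_1_jg16.t1\n' >> "$tmp/converted.gff3"

# Count lines whose attribute column contains ';;' or ';.'.
bad=$(awk -F'\t' '$9 ~ /;;|;\./' "$tmp/converted.gff3" | wc -l | tr -d ' ')
echo "suspicious lines: $bad"
```

A non-zero count suggests the trailing ; was not removed from braker.gff3 before conversion.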
The braker.gtf file can also be converted to GFF3 format with the method in #123; then convert the resulting GFF3 file following the steps above.
Alternatively, braker.gtf can be converted directly using augustus_GTF_to_EVM_GFF3.pl, provided you have manually corrected the order of gene_id and transcript_id in the 9th column of the GTF. An incorrect order causes the error "Error, cannot parse gene_id and transcript_id from HiC_scaffold_13".
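The manual reordering could probably be scripted. The sketch below assumes that the converter expects transcript_id before gene_id and that your braker.gtf has them the other way round; check the parsing regex in your copy of augustus_GTF_to_EVM_GFF3.pl first, and swap the two capture groups if it wants the opposite order. The sample GTF line and the output file name are made up for illustration.

```shell
tmp=$(mktemp -d)
# Illustrative GTF line (not real BRAKER output), with gene_id first.
printf 'HiC_scaffold_13\tAUGUSTUS\tCDS\t100\t200\t.\t+\t0\tgene_id "jg1"; transcript_id "jg1.t1";\n' > "$tmp/braker.gtf"

# Swap the two attributes so that transcript_id comes first (assumed target order).
sed -E 's/(gene_id "[^"]*";)[[:space:]]*(transcript_id "[^"]*";)/\2 \1/' \
    "$tmp/braker.gtf" > "$tmp/braker.reordered.gtf"

attrs=$(cut -f9 "$tmp/braker.reordered.gtf")
echo "$attrs"
```

Lines that do not contain both attributes in the gene_id-first order are passed through unchanged.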
However, the EVM GFF3 file produced by augustus_GTF_to_EVM_GFF3.pl assigns the same ID to different genes, which causes the error "Error, feature: HiC_scaffold_5-jg35973 is described multiple times with different data values:" when using the validator from EVM.
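To see how many gene IDs are affected, the converted file can be scanned for IDs that occur on more than one gene line. This is a minimal sketch under stated assumptions: the sample lines and the file name braker.gtf.evm.gff3 are invented for illustration, and the awk code assumes ID= is the first attribute in the 9th column.

```shell
tmp=$(mktemp -d)
# Illustrative converted file: one gene ID is reused for two different loci.
printf 'HiC_scaffold_5\tAugustus\tgene\t100\t500\t.\t+\t.\tID=HiC_scaffold_5-jg35973;Name=a\n' > "$tmp/braker.gtf.evm.gff3"
printf 'HiC_scaffold_5\tAugustus\tgene\t900\t1500\t.\t+\t.\tID=HiC_scaffold_5-jg35973;Name=b\n' >> "$tmp/braker.gtf.evm.gff3"
printf 'HiC_scaffold_5\tAugustus\tgene\t2000\t2500\t.\t+\t.\tID=HiC_scaffold_5-jg35974;Name=c\n' >> "$tmp/braker.gtf.evm.gff3"

# Collect the ID of every gene line and print those seen more than once.
dups=$(awk -F'\t' '$3 == "gene" {
           split($9, attr, ";")          # attr[1] is assumed to be "ID=..."
           sub(/^ID=/, "", attr[1])
           seen[attr[1]]++
       }
       END { for (id in seen) if (seen[id] > 1) print id }' "$tmp/braker.gtf.evm.gff3")
echo "duplicated gene IDs: $dups"
```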
wonderful