funannotate
What are the differences between the --augustus_species and --busco_seed_species options of the predict subcommand?
I am learning to use funannotate to annotate a genome without any other evidence such as RNA sequencing data. Reading the documentation, I found two usages of funannotate predict that annotate a genome from only a genome sequence file:
First, at
https://funannotate.readthedocs.io/en/latest/predict.html#explanation-of-inputs-and-options
funannotate predict -i mygenome.fa -o output_folder -s "Aspergillus nidulans" \
--augustus_species anidulans
Second, at
https://funannotate.readthedocs.io/en/latest/tutorials.html#genome-assembly-only
funannotate predict -i MyAssembly.fa -o fun \
--species "Pseudogenus specicus" --strain JMP12345 \
--busco_seed_species botrytis_cinerea --cpus 12
The help output of funannotate predict describes the two options like this:
--augustus_species Augustus species config. Default: uses species name
--busco_seed_species Augustus pre-trained species to start BUSCO. Default: anidulans
First, I do not know whether I should use both options or only one of them. Second, the arguments for both options can be chosen from the first column of the funannotate species output:
$ funannotate species
Species Augustus GeneMark Snap GlimmerHMM CodingQuarry Date
Conidiobolus_coronatus augustus pre-trained None None None None 2021-09-11
E_coli_K12 augustus pre-trained None None None None 2021-09-11
Xipophorus_maculatus augustus pre-trained None None None None 2021-09-11
However, I do not know what the default of --augustus_species (Default: uses species name) means. Could you please explain the differences between the two options, or point me to where they are described in the documentation at
https://funannotate.readthedocs.io/en/latest/index.html
Thank you!
--busco_seed_species is the pre-trained Augustus species used to run BUSCO (it defaults to anidulans); the BUSCO results are then used to train Augustus de novo. --augustus_species specifies a pre-trained species with which to run Augustus directly: if you set it, funannotate will not train Augustus but will simply run Augustus with those parameters. If --augustus_species is not set, the training set name is derived from the --species, --strain, and --isolate parameters. For example, if you passed:
funannotate predict --species "Aspergillus nidulans" --isolate ABC123
then the script would turn this into aspergillus_nidulans_ABC123 as the value for --augustus_species. Since that species doesn't exist, it will then run BUSCO and use those results to train Augustus.
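To make the logic above concrete, here is a rough Python sketch. It is only an illustration of the behaviour described in this reply, not funannotate's actual code: the function names, the single `tag` argument (standing in for a strain or isolate value), and the `pretrained` set are all hypothetical, and the case where the derived name already exists as a trained species is an assumption on my part.

```python
def derived_species_name(species, tag=None):
    """Hypothetical helper: build the default Augustus species name.

    Mimics the example above: "Aspergillus nidulans" plus isolate ABC123
    becomes "aspergillus_nidulans_ABC123". How a strain and an isolate
    are combined when both are given is not covered here.
    """
    name = species.lower().replace(" ", "_")
    if tag:
        name += "_" + tag
    return name


def augustus_plan(species, tag=None, augustus_species=None, pretrained=()):
    """Hypothetical decision sketch: run pre-trained parameters or train.

    `pretrained` stands in for the names listed in the first column of
    the `funannotate species` output.
    """
    if augustus_species:
        # --augustus_species given: no training, run Augustus directly.
        return ("run_pretrained", augustus_species)
    name = derived_species_name(species, tag)
    if name in pretrained:
        # Assumption: an existing parameter set with this name is reused.
        return ("run_pretrained", name)
    # Name does not exist: BUSCO is run (seeded by --busco_seed_species)
    # and its results are used to train Augustus under this name.
    return ("train_with_busco", name)


if __name__ == "__main__":
    print(augustus_plan("Aspergillus nidulans", tag="ABC123"))
    print(augustus_plan("Aspergillus nidulans", augustus_species="anidulans"))
```

In other words, the two options answer different questions: --augustus_species chooses parameters to use as they are, while --busco_seed_species only chooses the starting point for the BUSCO run that precedes training.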