GTDBTk
de_novo_workflow fails to parse gtdb classify output
Hello,
I am trying to build a tree that integrates my genomes with the GTDB reference genomes. After the tree is built, the workflow fails with the following error:
ERROR: Not a tab-separated line: user_genome classification fastani_reference fastani_reference_radius fastani_taxonomy fastani_ani fastani_af closest_placement_reference closest_placement_radius closest_placement_taxonomy closest_placement_ani closest_placement_af pplacer_taxonomy classification_method note other_related_references(genome_id,species_name,radius,ANI,AF) msa_percent translation_table red_value warnings
I provided the file produced by gtdbtk classify via the --gtdbtk_classification_file flag.
Hello, thanks for submitting this issue. This will be fixed in the next release.
As a workaround, you can manually specify the taxonomy using the --custom_taxonomy_file flag instead.
Hi @aaronmussig
> As a workaround, you can manually specify the taxonomy using the --custom_taxonomy_file instead.
Do you mean by this: cut columns 1 and 2 (user_genome and classification) from the classify_wf output file, and pass those via --custom_taxonomy_file?
More generally, do these options have the same purpose, i.e. to decorate the tree?
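For anyone wanting to try the column-cutting approach asked about above, here is a minimal Python sketch. It is not confirmed by the maintainers in this thread. It assumes the classify summary file is tab-separated with a header line, that user_genome and classification are the first two columns (as in the header shown in the error message), and that --custom_taxonomy_file accepts headerless lines of the form genome ID, tab, taxonomy string. The function names and file names are hypothetical.

```python
import csv


def summary_to_custom_taxonomy(summary_lines, skip_header=True):
    """Yield 'genome_id<TAB>taxonomy' strings from classify summary lines.

    Assumption: columns 1 and 2 are user_genome and classification.
    """
    reader = csv.reader(summary_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for index, row in enumerate(reader):
        # Assumption: the header line must not be passed on, since the
        # reported error was raised on the header line.
        if skip_header and index == 0:
            continue
        # Ignore blank or truncated lines rather than failing on them.
        if len(row) < 2:
            continue
        yield f"{row[0]}\t{row[1]}"


def write_custom_taxonomy(summary_path, output_path):
    """Write a two-column taxonomy file derived from a summary file."""
    with open(summary_path, newline="") as src, open(output_path, "w") as dst:
        for line in summary_to_custom_taxonomy(src):
            dst.write(line + "\n")


# Hypothetical usage (file names are placeholders, adjust to your run):
# write_custom_taxonomy("gtdbtk.bac120.summary.tsv", "custom_taxonomy.tsv")
```

The resulting file could then be passed via --custom_taxonomy_file. Whether that flag tolerates extra or missing genomes is not stated in this thread, so check the output before relying on it.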
Hi @zwets, I am facing a similar issue. Were you able to solve it by cutting the first two columns and passing them via --custom_taxonomy_file?
@adityabandla
I am facing the same issue. Could you let me know whether you were able to solve this problem?