gi2taxonomy seems outdated

Open bernt-matthias opened this issue 8 years ago • 3 comments

The tool depends on package_taxonomy_1_0_0, which seems to have been replaced by data_manager_fetch_ncbi_taxonomy.

Furthermore, the tool's data is hard-coded in the Python file: see GI2TAX, NAME_FILE, NODE_FILE. It would be nice if

  • GI2TAX could be taken from the tool data of gi2taxonomy or from the history, and
  • the latter two (NAME_FILE, NODE_FILE) were somehow provided by data_manager_fetch_ncbi_taxonomy.

This would be nice because the tool could then map arbitrary IDs (e.g. UniProt IDs) to the taxonomy, given a mapping from those IDs to NCBI taxids.
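
To illustrate the ID-to-taxid step, here is a minimal sketch in plain Python. It assumes the mapping is a two-column, tab-separated table (ID, NCBI taxid); that layout, the function names `load_id_to_taxid` and `annotate`, and the sample rows are only illustrative and are not taken from the existing tool.

```python
import csv
import io


def load_id_to_taxid(handle):
    """Read a two-column, tab-separated mapping: <id> <NCBI taxid>."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank, short, or comment lines.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        mapping[row[0]] = int(row[1])
    return mapping


def annotate(ids, mapping):
    """Pair each ID with its taxid, or None when the ID is unmapped."""
    return [(i, mapping.get(i)) for i in ids]


# Illustrative sample rows (assumed layout, not real tool data).
example = io.StringIO("P69905\t9606\nP01942\t10090\n")
id_to_taxid = load_id_to_taxid(example)
print(annotate(["P69905", "P01942", "Q00000"], id_to_taxid))
```

Unmapped IDs are kept with `None` rather than dropped, so the caller can decide how to report them.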

One could also think about simplifying the tool so that it creates the table from the taxids only; the mapping from GIs (or other IDs) to taxids could then be done with other tools (e.g. join).
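
A rough sketch of what such a taxid-only table builder could look like, reading the NCBI taxdump `nodes.dmp` and `names.dmp` files (fields separated by `\t|\t`, each line ending in `\t|`). The function names are hypothetical, and the sample lines are a toy tree with shortened records and intermediate ranks left out; this is not the code of the current tool.

```python
import io


def parse_dmp(handle):
    """Yield the list of fields for each line of an NCBI taxdump .dmp file."""
    for line in handle:
        line = line.rstrip("\n")
        if line.endswith("\t|"):
            line = line[:-2]  # drop the line terminator
        if line:
            yield line.split("\t|\t")


def load_taxonomy(nodes_handle, names_handle):
    """Return (parents, ranks, names) dictionaries keyed by taxid."""
    parents, ranks, names = {}, {}, {}
    for fields in parse_dmp(nodes_handle):
        taxid = int(fields[0])
        parents[taxid] = int(fields[1])
        ranks[taxid] = fields[2]
    for fields in parse_dmp(names_handle):
        # Keep only the scientific name; skip synonyms and common names.
        if fields[3] == "scientific name":
            names[int(fields[0])] = fields[1]
    return parents, ranks, names


def lineage(taxid, parents, ranks, names):
    """Return [(rank, name), ...] ordered from the root down to taxid."""
    path = []
    while True:
        path.append((ranks[taxid], names[taxid]))
        parent = parents[taxid]
        if parent == taxid:  # the root node is its own parent
            break
        taxid = parent
    return path[::-1]


# Toy data: real nodes.dmp lines carry more fields and more ranks.
nodes = io.StringIO(
    "1\t|\t1\t|\tno rank\t|\n"
    "9605\t|\t1\t|\tgenus\t|\n"
    "9606\t|\t9605\t|\tspecies\t|\n"
)
names_dmp = io.StringIO(
    "1\t|\troot\t|\t\t|\tscientific name\t|\n"
    "9605\t|\tHomo\t|\t\t|\tscientific name\t|\n"
    "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n"
    "9606\t|\thuman\t|\t\t|\tgenbank common name\t|\n"
)
parents, ranks, names = load_taxonomy(nodes, names_dmp)
print(lineage(9606, parents, ranks, names))
```

With a builder like this, the tool only needs a column of taxids as input, and any ID-to-taxid join can happen upstream.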

bernt-matthias avatar Aug 28 '17 12:08 bernt-matthias

The package_* should be replaced by a Conda package. It compiles a C program. The data_manager is there to retrieve the data used by the compiled tool. That said, the mentioned improvements are much needed!

bgruening avatar Aug 28 '17 20:08 bgruening

OK. I would prefer to extend the ete package instead (https://toolshed.g2.bx.psu.edu/view/earlhaminst/ete/a4ba317fc713). The API has the same functionality and can even search for names and taxids. I will try to do this in the next few days. I just have two questions:

  • How do I use the data from data_manager_fetch_ncbi_taxonomy in the tool? Maybe you can point me to a tool that already uses it.
  • How do I create the ete3 SQLite DB from the NCBI taxonomy dump? Maybe it would be an idea to extend the data manager?

bernt-matthias avatar Aug 29 '17 08:08 bernt-matthias

I just started; see https://github.com/TGAC/earlham-galaxytools/pull/90

bernt-matthias avatar Aug 29 '17 13:08 bernt-matthias