usegalaxy-playbook
Please add DADA2-formatted reference databases to test/main
The dada2 tools are currently installed on Galaxy test and will soon be installed on Galaxy main. Please add the dada2 reference datasets (https://benjjneb.github.io/dada2/training.html) so that the tools that require them are functional. I believe that the General FASTA release will be sufficient, but others may be requested. Here is the download link for the General FASTA release: https://doi.org/10.15156/BIO/786343.
I guess Silva is quite popular. Sometimes users prefer RDP because it comes with copy number variation data (if I remember correctly), but it's older.
There is also quite a bit of extra info in the data manager's help.
@jennaj I have confirmed with the lab testing this pipeline that the General FASTA release https://doi.org/10.15156/BIO/786343 is what they need for reference datasets for their testing.
Is this already in the data manager (aka dada manager)?
I also just asked @martenson how to get these fixes https://github.com/galaxyproject/tools-iuc/pull/2705 applied to the tools on Galaxy test. I'm working with a lab doing some critical work with this pipeline. ;)
I ran the data manager and it appeared to succeed, but I couldn't find the data on Test. It looks like all the DMs we've installed lately are going to be messed up, e.g.:
<table comment_char="#" name="dada2_species">
<columns>value, name, path</columns>
<file path="/tmp/tool-data/toolshed.g2.bx.psu.edu/repos/iuc/dada2_filterandtrim/cc41546adf56/dada2_species.loc"/>
<tool_shed_repository>
<tool_shed>toolshed.g2.bx.psu.edu</tool_shed>
<repository_name>dada2_filterandtrim</repository_name>
<repository_owner>iuc</repository_owner>
<installed_changeset_revision>cc41546adf56</installed_changeset_revision>
</tool_shed_repository>
</table>
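A quick way to spot other affected entries is to scan the shed tool data table config for `.loc` paths that point into a temporary directory. The following is only a minimal sketch: the `<tables>` root wrapper, the `SAMPLE` string, and the helper name `find_temp_loc_paths` are assumptions for illustration, and the `/tmp/` prefix is taken from the entry above.

```python
import xml.etree.ElementTree as ET

# Sample config; the <tables> wrapper is an assumption, the <table> entry
# is the one quoted in this thread (repository details trimmed).
SAMPLE = """<tables>
<table comment_char="#" name="dada2_species">
<columns>value, name, path</columns>
<file path="/tmp/tool-data/toolshed.g2.bx.psu.edu/repos/iuc/dada2_filterandtrim/cc41546adf56/dada2_species.loc"/>
</table>
</tables>"""


def find_temp_loc_paths(xml_text, temp_prefixes=("/tmp/",)):
    """Return (table name, path) pairs whose .loc file lives under a temp prefix."""
    root = ET.fromstring(xml_text)
    flagged = []
    for table in root.iter("table"):
        for file_el in table.iter("file"):
            path = file_el.get("path", "")
            # str.startswith accepts a tuple of prefixes
            if path.startswith(temp_prefixes):
                flagged.append((table.get("name"), path))
    return flagged


if __name__ == "__main__":
    for name, path in find_temp_loc_paths(SAMPLE):
        print(f"{name}: {path}")
```

Against a real file, the text could be read with `open(...).read()` instead of using `SAMPLE`; the location of that file depends on the deployment and is not stated here.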
This is discussed in #31, except that this time it is even more of a problem: we don't have the tool-data files in CVMFS to copy as described in step 3, because they were discarded after installation.
@natefoo @davebx thanks for everything you've done on this. Sorry this has created some issues.
I fixed all the paths and whatnot, but the DM fails. The handler logs:
galaxy.tools.data_manager.manager WARNING 2020-01-15 14:30:16,408 No values for data table "dada2_taxonomy" were returned by the data manager "toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/data_manager/dada2_fetcher/0.0.1".
However, the DM's primary output appears to return a data table entry:
{"data_tables": {"dada2_taxonomy": {"name": "UNITE: General Fasta release 8.0 for Fungi", "path": "unite_8.0_fungi.taxonomy", "taxlevels": "Kingdom,Phylum,Class,Order,Family,Genus,Species", "value": "unite_8.0_fungi"}}}
Anyone with a better understanding of DMs know what's going on here?
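One thing that can be ruled out offline is a malformed output: the JSON above can be compared against the columns declared for each data table. This is a rough sketch, not Galaxy's own logic. The helper `check_dm_output` is hypothetical, the expected columns are copied from the table definitions in this thread, and treating a single dict like a one-element list is an assumption made here.

```python
import json

# The data manager's primary output as quoted above.
DM_OUTPUT = (
    '{"data_tables": {"dada2_taxonomy": {"name": "UNITE: General Fasta release 8.0 for Fungi", '
    '"path": "unite_8.0_fungi.taxonomy", '
    '"taxlevels": "Kingdom,Phylum,Class,Order,Family,Genus,Species", '
    '"value": "unite_8.0_fungi"}}}'
)

# Columns per table, copied from the definitions shown in this thread.
EXPECTED_COLUMNS = {
    "dada2_taxonomy": ["value", "name", "path", "taxlevels"],
    "dada2_species": ["value", "name", "path"],
}


def check_dm_output(raw_json, expected_columns):
    """Map each table to None (no entries) or a sorted list of missing columns."""
    data_tables = json.loads(raw_json).get("data_tables", {})
    report = {}
    for table, columns in expected_columns.items():
        entries = data_tables.get(table)
        if not entries:
            report[table] = None
            continue
        # Assumption: a lone dict is handled like a list with one entry.
        if isinstance(entries, dict):
            entries = [entries]
        missing = {col for entry in entries for col in columns if col not in entry}
        report[table] = sorted(missing)
    return report


if __name__ == "__main__":
    print(check_dm_output(DM_OUTPUT, EXPECTED_COLUMNS))
```

For the output above, this reports no missing columns for `dada2_taxonomy` and no entries at all for `dada2_species`, so the JSON itself looks complete for the table named in the warning.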
Interestingly... the log message references an old version of the DM (0.0.1) which I don't believe is even installed (both 0.0.7 and 0.0.8 appear to be installed, and 0.0.8 is the one that ran). It appears to come from the entry in shed_data_manager_conf.xml:
<data_manager guid="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/data_manager/dada2_fetcher/0.0.1" id="dada2_fetcher" shed_conf_file="/cvmfs/test.galaxyproject.org/config/shed_tool_conf.xml">
<tool file="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/f57c13f5878b/data_manager_dada2/data_manager/dada2_fetcher.xml" guid="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/dada2_fetcher/0.0.7">
<tool_shed>toolshed.g2.bx.psu.edu</tool_shed>
<repository_name>data_manager_dada2</repository_name>
<repository_owner>iuc</repository_owner>
<installed_changeset_revision>f57c13f5878b</installed_changeset_revision>
<id>toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/dada2_fetcher/0.0.7</id>
<version>0.0.7</version>
</tool>
<data_table name="dada2_taxonomy">
<output>
<column name="value" />
<column name="name" />
<column name="path" output_ref="out_file">
<move relativize_symlinks="True" type="file">
<source>${path}</source>
<target base="${GALAXY_DATA_MANAGER_DATA_PATH}">dada2/${path}</target>
</move>
<value_translation>${GALAXY_DATA_MANAGER_DATA_PATH}/dada2/${path}</value_translation>
<value_translation type="function">abspath</value_translation>
</column>
<column name="taxlevels" />
</output>
</data_table>
<data_table name="dada2_species">
<output>
<column name="value" />
<column name="name" />
<column name="path" output_ref="out_file">
<move relativize_symlinks="True" type="file">
<source>${path}</source>
<target base="${GALAXY_DATA_MANAGER_DATA_PATH}">dada2/${path}</target>
</move>
<value_translation>${GALAXY_DATA_MANAGER_DATA_PATH}/dada2/${path}</value_translation>
<value_translation type="function">abspath</value_translation>
</column>
</output>
</data_table>
</data_manager>
<data_manager guid="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/data_manager/dada2_fetcher/0.0.1" id="dada2_fetcher" shed_conf_file="/cvmfs/test.galaxyproject.org/config/shed_tool_conf.xml">
<tool file="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/bf7b2c14cabc/data_manager_dada2/data_manager/dada2_fetcher.xml" guid="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/dada2_fetcher/0.0.8">
<tool_shed>toolshed.g2.bx.psu.edu</tool_shed>
<repository_name>data_manager_dada2</repository_name>
<repository_owner>iuc</repository_owner>
<installed_changeset_revision>bf7b2c14cabc</installed_changeset_revision>
<id>toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/dada2_fetcher/0.0.8</id>
<version>0.0.8</version>
</tool>
<data_table name="dada2_taxonomy">
<output>
<column name="value" />
<column name="name" />
<column name="path" output_ref="out_file">
<move relativize_symlinks="True" type="file">
<source>${path}</source>
<target base="${GALAXY_DATA_MANAGER_DATA_PATH}">dada2/${path}</target>
</move>
<value_translation>${GALAXY_DATA_MANAGER_DATA_PATH}/dada2/${path}</value_translation>
<value_translation type="function">abspath</value_translation>
</column>
<column name="taxlevels" />
</output>
</data_table>
<data_table name="dada2_species">
<output>
<column name="value" />
<column name="name" />
<column name="path" output_ref="out_file">
<move relativize_symlinks="True" type="file">
<source>${path}</source>
<target base="${GALAXY_DATA_MANAGER_DATA_PATH}">dada2/${path}</target>
</move>
<value_translation>${GALAXY_DATA_MANAGER_DATA_PATH}/dada2/${path}</value_translation>
<value_translation type="function">abspath</value_translation>
</column>
</output>
</data_table>
</data_manager>
The correct version appears in the tool tag but not the data_manager tag. No idea if this is the problem, though.
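The mismatch can be checked mechanically. The sketch below compares the version at the end of each `data_manager` guid with the nested `tool` version, and also reports guids that occur more than once (both entries above share the same 0.0.1 guid). The `<data_managers>` root wrapper, the trimmed `SAMPLE`, and the helper name `audit_data_managers` are assumptions for illustration; whether either finding explains the warning is not established here.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed copy of the two entries above; the root wrapper is an assumption.
SAMPLE = """<data_managers>
<data_manager guid="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/data_manager/dada2_fetcher/0.0.1" id="dada2_fetcher">
<tool guid="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/dada2_fetcher/0.0.7"><version>0.0.7</version></tool>
</data_manager>
<data_manager guid="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/data_manager/dada2_fetcher/0.0.1" id="dada2_fetcher">
<tool guid="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/dada2_fetcher/0.0.8"><version>0.0.8</version></tool>
</data_manager>
</data_managers>"""


def audit_data_managers(xml_text):
    """List version mismatches and duplicated guids among <data_manager> entries."""
    root = ET.fromstring(xml_text)
    managers = root.findall("data_manager")
    problems = []
    for dm in managers:
        guid = dm.get("guid", "")
        guid_version = guid.rsplit("/", 1)[-1]  # last path segment of the guid
        tool_version = dm.findtext("tool/version")
        if tool_version is not None and guid_version != tool_version:
            problems.append(
                f"{guid}: data_manager version {guid_version} != tool version {tool_version}"
            )
    for guid, count in Counter(dm.get("guid") for dm in managers).items():
        if count > 1:
            problems.append(f"{guid}: appears {count} times")
    return problems


if __name__ == "__main__":
    for line in audit_data_managers(SAMPLE):
        print(line)
```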
I fixed the version and it's the same thing:
galaxy.tools.data_manager.manager WARNING 2020-01-15 15:25:18,524 No values for data table "dada2_taxonomy" were returned by the data manager "toolshed.g2.bx.psu.edu/repos/iuc/data_manager_dada2/data_manager/dada2_fetcher/0.0.8".
Hmm... strange. Thanks @natefoo for your help!
Btw, a new data_manager with Silva 138 is available.