tools-iuc
Add format feature for downloading multiple files with PyEGA
FOR CONTRIBUTOR:
- [x] - I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
- [x] - License permits unrestricted use (educational + commercial)
- [ ] - This PR adds a new tool or tool collection
- [x] - This PR updates an existing tool or tool collection
- [ ] - This PR does something else (explain below)
I have added a parameter to set the format of the downloaded files when downloading multiple files with pyega. Previously, every file was interpreted as the 'data' format. I tested whether auto_format="true" is possible in a collection, but it raises an error. For now I've added the data types I see most often when working on Galaxy, but please add any that you think are important. I think adding all data types would be overkill.
raise an error
What is the error?
I meant that when I run planemo lint, it raises the following errors.
When I add auto_format="true" to the collection element:
ERROR: Invalid XML found in file: pyega3.xml. Errors [/mnt/e/CINECA/tools-iuc/tools/pyega3/tmpn8thf0ni:151:0:ERROR:SCHEMASV:SCHEMAV_CVC_COMPLEX_TYPE_3_2_1: Element 'collection', attribute 'auto_format': The attribute 'auto_format' is not allowed.
When I add auto_format="true" to the discover_datasets element:
.. ERROR: Invalid XML found in file: pyega3.xml. Errors [/mnt/e/CINECA/tools-iuc/tools/pyega3/tmpkqql5p5r:153:0:ERROR:SCHEMASV:SCHEMAV_CVC_COMPLEX_TYPE_3_2_1: Element 'discover_datasets', attribute 'auto_format': The attribute 'auto_format' is not allowed.]
So basically there is no way to auto format files in a collection, right?
You are right, this feature is missing https://github.com/galaxyproject/galaxy/pull/11754
But actually pattern="__designation_and_ext__" should do the trick. For which file names does Galaxy detect 'data'?
I'm only sure about vcf.gz files, which are interpreted as gz files instead of vcf_bgzip.
An example filename: Case5_F.17.g.vcf.gz. Maybe it is because of the .g?
I think the tool would work as is, if you would add a command to the command block which renames all vcf.gz to vcf_bgzip.
The same should be done for all files that we may get from EGA where the file extension does not match the extension of the corresponding Galaxy datatype.
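A minimal sketch of such a rename step, written as plain shell so it can be tried outside Galaxy. The helper name rename_ext and the scratch directory are hypothetical and not taken from the pyega3 wrapper; whether Galaxy then assigns the intended datatype still depends on the discovery pattern used in the tool XML.

```shell
# Hypothetical helper: rename every file in DIR whose name ends in .OLD
# so that it ends in .NEW instead. Not part of the actual pyega3 wrapper.
rename_ext() {
    dir="$1"; old="$2"; new="$3"
    for f in "$dir"/*."$old"; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$f" ] || continue
        # Strip the old suffix and append the new one.
        mv -- "$f" "${f%.$old}.$new"
    done
}

# Demo in a scratch directory (file names are illustrative only).
demo_dir="$(mktemp -d)"
touch "$demo_dir/Case5_F.17.g.vcf.gz" "$demo_dir/notes.txt"
rename_ext "$demo_dir" "vcf.gz" "vcf_bgzip"
ls "$demo_dir"
```

In a tool's command block the same loop would run on the download directory after pyega3 finishes, with one call per extension that needs remapping.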
The workflow keeps failing because of a timeout:
Timed out after 900.25 seconds waiting on tool test run.
I guess EGA is just slow sometimes, since the test finishes correctly locally.
@bernt-matthias is it possible to increase the timeout threshold, or is this error raised for a different reason?
This timeout applies to all tool tests. If we change it then the change applies to all tools.