galaxy
galaxy copied to clipboard
composite data wo primary overwrites other test parameters
This loop https://github.com/galaxyproject/galaxy/blob/e1befbd9c570844f1a305d2192c9a8c1a08354f2/lib/galaxy/tool_util/verify/interactor.py#L397
iterates over the inputs of the test and tests if the value is found in the uploads. For composite inputs without primary file this does lead to a bug:
- inputs_tree is
{'input': [''], 'select_in': ['']} - self.uploads is
{'': {'src': 'hda', 'id': '2891970512fa2d5a'}, '_COMPOSITE_RENAMED_t0_d49209a9b0ae11ea82f4b88a60b2bbaa': {'src': 'hda', 'id': '2891970512fa2d5a'}}
So, after the loop inputs_tree is {'input': {'src': 'hda', 'id': '2891970512fa2d5a'}, 'select_in': {'src': 'hda', 'id': '2891970512fa2d5a'}}.
What comes to my mind as first is to allow for special entries to the value attribute that are not treated as files, e.g. everything starting with double underscore.
Seems that setting any value to value already fixes this, but I do not understand why? Isn't it possible to upload the primary file.
If this is the solution I guess its sufficient to test in the tool parser/linter that value must not be empty.
Upload of basic composite files seems not to work. I created composite_blastdbp.xml (to test create simply create 4 empty files).
Error is: 'A required composite file (blastdb.psq) was not specified.'. If one adds one of the optional composite files the psq file appears.
Going back some years, I never got upload of a BLAST DB to work. Functionally it was enough to upload a FASTA file and rebuild the DB within Galaxy.
@bernt-matthias I think https://github.com/galaxyproject/galaxy/pull/9885/commits/1553b07b7b2e530f31d50d4dd921b600b6dd9e59 should fix the parameter collision so that the __MOCKENTRY__ workaround shouldn't be necessary. If you can add test/functional/tools/composite_blastdbp.xml to test/functional/tools/samples_tool_conf.xml and assert that the select_in parameter doesn't get overwritten I think this is ready to go.
Thanks @mvdbeek .. lets see if the test works.
@bernt-matthias I fixed a couple of bugs here (seems like we were never able to test composite datatypes that were not auto_primary_file), but one thing I'm not sure about ... are blastdb databases really basic datatypes ? If so, composite_blastdbp.xml isn't really testing the original issue, right?
Yep its a basic data type: https://github.com/galaxyproject/galaxy/blob/2fd2412fe42c4a396619db4611613126f6e00f08/lib/galaxy/datatypes/blast.py#L271
test/functional/tools/metadata.xml
not ok 107 test/functional/tools/metadata.xml
-:13: element composite_data: Schemas validity error : Element 'composite_data': The attribute 'name' is required but missing.
-:14: element composite_data: Schemas validity error : Element 'composite_data': The attribute 'name' is required but missing.
-:15: element composite_data: Schemas validity error : Element 'composite_data': The attribute 'name' is required but missing.
- fails to validate
Seems that the name attribute is not covered by the xsd file. Also the composite_data tag is not shown in https://docs.galaxyproject.org/en/master/dev/schema.htm .. is this desired?
I added name as required, fixing that now
Ok. Then please drop my last commit and I will wait until further notice.
Well this turned into implementing tests for basic type composite datasets. Unfortunately there is a bunch of auto_primary_file composite tests that wrongly specify a primary file to upload, which breaks the assumption in https://github.com/galaxyproject/galaxy/blob/71f82958ba04f5259858dc414e70ac925586ca2b/lib/galaxy/tool_util/verify/interactor.py#L320-L335
in which I thought that auto_primary_filetests are not going to specify a primary file to upload.
And it doesn't look like there is a good way around it, since the test parsing does not have access to the datatype registry (which is good, I think), and the job execution slices the job execution up before we can special-case the inputs.
So ... should we break these "faulty" auto_primary_file composite tests?
The symptoms here would be that when you upload these you would now have the uploaded primary composite file replace the first composite_data element, and the last composite_data element would be in a different execution slice 🔥 ... so without a Galaxy developer at hand that would be difficult to debug.
So I grepped for composite_data in the IUC, devteam and bgruening repo. Looks like there's just velvet that would break, so I'm inclined to do that.
Since this would also be in agreement with the docs (https://docs.galaxyproject.org/en/master/dev/data_types.html#defining-auto-primary-file-composite-datatypes) I also think that breaking (aka fixing) these tests is the way to proceed.
At some point it might be nice to have also docs for the test definition (maybe as a first step we could link to the test tools)
At some point it might be nice to have also docs for the test definition (maybe as a first step we could link to the test tools)
👍
I'll move the milestone for now, I don't think I'll have time to get this done before the freeze, but we can always get this in since this fixes a bug.
Pushing the milestone, but this is a bug fix and we could merge whenever I think. Want to rebase and were the last of Marius' comments addressed?