toil
toil copied to clipboard
CWL job chaining tests
This PR is a prototype for the test suite to determine if Toil is chaining CWL jobs as we expect / would like.
I'm using a list of Toil tasks, each containing a list of the CWL jobs completed within that task, to test if chaining has occurred or not. The list of tasks is compiled from the stats outputs (requiring the --stats and --logDebug flags) - if a more robust method for getting this information could be suggested I'd be very happy to change this code.
To show the expected outputs I've included an example of two CWL workflows, and a test which one fails and the other passes (because of the subworkflow wrapping the first 2 CWL jobs). While we're working out the best way to write these tests I'll work on more substantial tests for the final setup.
Changelog Entry
To be copied to the draft changelog by merger:
- PR submitter writes their recommendation for a changelog entry here
Reviewer Checklist
- [ ] Make sure it is coming from
issues/XXXX-fix-the-thing
in the Toil repo, or from an external repo.- [ ] If it is coming from an external repo, make sure to pull it in for CI with:
contrib/admin/test-pr otheruser theirbranchname issues/XXXX-fix-the-thing
- [ ] If there is no associated issue, create one.
- [ ] If it is coming from an external repo, make sure to pull it in for CI with:
- [ ] Read through the code changes. Make sure that it doesn't have:
- [ ] Addition of trailing whitespace.
- [ ] New variable or member names in
camelCase
that want to be insnake_case
. - [ ] New functions without type hints.
- [ ] New functions or classes without informative docstrings.
- [ ] Changes to semantics not reflected in the relevant docstrings.
- [ ] New or changed command line options for Toil workflows that are not reflected in
docs/running/cliOptions.rst
- [ ] New features without tests.
- [ ] Comment on the lines of code where problems exist with a review comment. You can shift-click the line numbers in the diff to select multiple lines.
- [ ] Finish the review with an overall description of your opinion.
Merger Checklist
- [ ] Make sure the PR passes tests.
- [ ] Make sure the PR has been reviewed since its last modification. If not, review it.
- [ ] Merge with the Github "Squash and merge" feature.
- [ ] If there are multiple authors' commits, add Co-authored-by to give credit to all contributing authors.
- [ ] Copy its recommended changelog entry to the Draft Changelog.
- [ ] Append the issue number in parentheses to the changelog entry.
@douglowe Are both tests supposed to pass?
@douglowe Are both tests supposed to pass?
The second test should fail. Really I should have made the test not equal, but for this prototype I wanted it to display what was being tested, so made it fail the test, rather than testing for failure.
@douglowe Are both tests supposed to pass?
The second test should fail. Really I should have made the test not equal, but for this prototype I wanted it to display what was being tested, so made it fail the test, rather than testing for failure.
Okay, I added https://github.com/DataBiosphere/toil/pull/3696/commits/72df69dcbbec6b8c92a33c3eb15dc00d63c45aa5 so your new test would get run as part of the CI
Just to see if the recent changes in Toil 5.5.0 fixed this, I ran the tests without the other code changes and test_biobb_fail
still fails:
AssertionError: Lists differ: [['editconf'], ['genion'], ['grompp'], ['pdb2gmx'], ['solvate']] != [['genion', 'grompp', 'pdb2gmx', 'editconf', 'solvate']]
First differing element 0:
['editconf']
['genion', 'grompp', 'pdb2gmx', 'editconf', 'solvate']
First list contains 4 additional elements.
First extra element 1:
['genion']
- [['editconf'], ['genion'], ['grompp'], ['pdb2gmx'], ['solvate']]
+ [['genion', 'grompp', 'pdb2gmx', 'editconf', 'solvate']]
I pushed this to our repo as issues/3696-cwl-job-chaining-tests
so that GitLab CI will run
This is supposed to fix #3697.
@douglowe It seems like it might make sense for us to schedule a meeting with us and @DailyDreaming to talk about getting this feature actually merged, and how it changes the scheduling design. Is that something you would be interested in?