Update RFC to Include Test cases

patmagee opened this issue on Oct 22, 2019 • 7 comments

Before making a PR to the RFC process, I wanted to start a conversation here to see what you guys think.

Whenever a new change is submitted, we currently have no way of testing whether the feature has been implemented or not, or whether the implemented feature meets the requirements of the WDL specification. This makes it difficult for engine implementors to know whether they are adding the requested features as intended.

It would be good for each PR to include a set of positive and negative test cases that could help guide engine implementors when developing new features. These test cases should be simple WDLs written specifically to test a single feature against the current release (or maybe development?) specification, assuming the new feature has been added to the spec.
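
For illustration, a minimal positive test case might look something like the sketch below (the names and layout here are hypothetical; the actual conventions would be settled by the RFC process). It exercises exactly one behavior, basic task output capture, and should succeed on a conformant engine:

    version 1.0

    # Hypothetical positive test case: exercises a single feature
    # (task output capture via stdout()) and should run successfully.
    task say_hello {
        input {
            String name
        }
        command <<<
            echo "Hello, ~{name}!"
        >>>
        output {
            String greeting = read_string(stdout())
        }
    }

    workflow test_say_hello {
        call say_hello { input: name = "WDL" }
        output {
            String result = say_hello.greeting
        }
    }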

The questions we should focus on are:

  1. Is this necessary?
  2. How will this benefit the Engine implementors?
  3. Where do we put test cases?
  4. How many test cases are sufficient to accept a single PR?
  5. How do we create test cases that will actually work when the development specification is rapidly changing?

patmagee avatar Oct 22 '19 15:10 patmagee

Yeah, +1 to this. I've been mostly working out of the development spec, and there have definitely been a few times where I've been unsure whether a feature hadn't been implemented or just wasn't quite working. I guess it's not super related to this PR, but I have a few thoughts:

  • there should be an element of "Engine implements Feature X". I do like CWL's approach to this with Requirements (even though they sometimes feel clunky): it's very clear whether an engine supports the "feature" you're using. It also gives you a basic metric like "this engine has 100% conformance to the WDL test cases".

  • The contributor providing simple test cases should be part of the issue, or one of the first things that occurs. More test cases should mean less accidental regression of functionality, and if a failure turns out to be a legitimate issue, it can become part of the test suite. To address (5), this will make it explicit how the functionality is changing.

I'm not an engine implementor, but as a pipeline author (and tester of different engines), it's hard to know whether an engine properly supports the spec without running your workflows through it, so these things are important to me.

illusional avatar Oct 22 '19 20:10 illusional

1. Is this necessary?

2. How will this benefit the Engine implementors?

3. Where do we put test cases?

4. How many test cases are sufficient to accept a single PR?

5. How do we create test cases that will actually work when the development specification is rapidly changing?

  1. Yes. I think this is a great idea. This will be beneficial in multiple ways:
    • Writing actual WDL will make it clearer whether a feature is needed, and it will force the change requester to think more deeply about the request.
    • It will make it harder to request changes. This is a good thing, since I think (personal opinion, not wanting to start a flame war here) that the WDL spec has been suffering from feature creep. By putting more requirements on requests for change, the pace of change will be slowed somewhat.
    • Since there is actual WDL, it is easier to review the request for change.
  2. I am not an engine developer, but I do write code and pipelines that I test. It is much easier to test against existing cases. Creating new tests that conform to the spec requires interpreting the spec and translating it into actual WDL. That job is probably much easier for the person who wrote the request for change than for people who have to interpret what the requester wrote (with the attendant risk of misinterpretation).
  3. My proposal: a test_cases folder as a subfolder of the folder for that particular version of WDL. Test cases should be .wdl files, and comments should clearly state the added value of each test case.
  4. This depends on the PR, but at least one test case that should succeed. For more complex PRs we can also require some cases that should fail (see the sketch after this list). But I think the openwdl core team can best judge this on a PR-by-PR basis.
  5. My opinion on this is that we should use semantic versioning. We are now on stable 1.0.0. The next version will be 2.0.0, because objects are being removed. We are writing a pipeline language here; making backwards-incompatible changes is a very bad thing. It is much nicer if things are stable. If a PR breaks existing test cases, we can evaluate whether the feature is worth incrementing the major version number again. (I sincerely hope that 2.x.x will remain the current version for years.)
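
To make point 4 concrete, here is a minimal sketch of what a negative test case could look like (the file name, comment convention, and the specific error it exercises are hypothetical, not part of any proposal above):

    ## test_bool_to_int_should_fail.wdl
    ## Hypothetical negative test case: a conformant engine must reject
    ## this file at validation time, because Boolean is not coercible to Int.
    version 1.0

    workflow test_bool_to_int_should_fail {
        Boolean b = true
        Int i = b   # type error: Boolean cannot be coerced to Int
    }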

A great idea @patmagee, thanks for putting so much thought into it already!

rhpvorderman avatar Oct 23 '19 07:10 rhpvorderman

Adding test cases is a great idea, but there may be some cases where it will be hard to write tests that actually test the spec. For example, import behavior differs depending on whether the WDL is called using an HTTP(S) URL or a local path. As such, simply writing a WDL file alone would not necessarily prove that an engine is spec-compliant, since that one file should produce different behavior depending on how it is run.
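
For example (a hypothetical sketch, with placeholder file and task names), the very same test file would need to be run at least twice, once from a local checkout and once from a URL, to cover both import resolution paths:

    version 1.0

    # Hypothetical import test case. How the relative import below is
    # resolved depends on how this file itself was obtained: relative to
    # the local directory when run from a local path, or relative to the
    # base URL when fetched over HTTP(S).
    import "tasks/common.wdl" as common

    workflow test_relative_import {
        call common.echo_task
    }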

DavyCats avatar Oct 23 '19 07:10 DavyCats

Bringing @mlin into this conversation. He had a great suggestion for an additional way we could help test and propel features into the specification. It seems they will be writing documentation shortly on how to fork miniwdl and test new WDL features that someone wants in the specification. I do not think this would replace the test cases, but it would provide a lower barrier to feature implementation.

@rhpvorderman I agree about the backwards-incompatible changes. They are not my favorite and generally have quite a lag time before everyone migrates over. It is my hope that v2.0.0 will be a stable long-term version. To me it seems like we have structurally addressed a lot of the major issues we previously had, so I hope we will not need to make any more breaking changes any time soon.

In terms of the structure, what do you guys think makes the most sense? New test cases should be part of an RFC. To that end, we can use the PR number to define an RFC number as well, so PR 138 == RFC-138. I wonder if a structure like the following would make sense:

versions/
    |- 2.0.0/
        |- parsers/
        |- SPEC.md
        |- test_cases/
            |- RFC-<PR #>/
                |- test_feature_x_should_do_y.wdl

patmagee avatar Oct 23 '19 14:10 patmagee

#2 -- I think the test cases have a twofold benefit: they make it clear to an implementor what they're solving for, and they make it clear to a user exactly how to leverage a feature. Sometimes the examples don't include a full task/workflow or the whole story around a WDL feature, and this would make it easier for new features to get adopted since it makes the documentation more complete.

#4 -- I think this can be handled on a case-by-case basis; it's possible some features have a lot of logic flows and so benefit from a more complex test case. Others less so, so one case is plenty.

@DavyCats interesting point -- I think in that case we would just need multiple test cases (aka multiple WDLs) to have thorough coverage for a feature like imports.

ruchim avatar Nov 01 '19 13:11 ruchim

Delighted to see this proposal taking form & getting support! Beyond testing, this will be very helpful for pipeline authors, for example, since it will double as a convenient recipe book. On the Broad side we could host the WDL test cases in a Terra workspace with test data & everything preconfigured to work out of the box.

vdauwera avatar Nov 05 '19 21:11 vdauwera

We now have a specification for how to write examples in the spec that can also serve as test cases. This answers most of your original questions, @patmagee. I think we can just update RFC.md to say that all PRs for non-trivial additions require at least one properly formatted example.
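
For reference, the testable-example format pairs a named WDL snippet with its expected inputs and outputs, roughly along these lines (a paraphrased sketch with hypothetical names; the spec itself is the authoritative description of the required markup):

    Example: hello_test

        version 1.1

        task hello {
            command <<<
                printf "hello"
            >>>
            output {
                String out = read_string(stdout())
            }
        }

        workflow hello_test {
            call hello
            output {
                String result = hello.out
            }
        }

    Example input:

        {}

    Example output:

        {
            "hello_test.result": "hello"
        }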

jdidion avatar Feb 07 '24 17:02 jdidion