tasty Add exact dependency matching

trafficstars

Tasty 1.2 introduced a way for tests to specify dependencies. That is, what tests should run before themselves. These dependencies are specified using an AWK-like expression annotated to a TestTree. These expressions can match against any test in the full TestTree. This approach has a few rough edges:

Any pattern has to be tested against any test in the tree. If your test tree mostly consists of tests specifying dependencies for each other, this means calculating your test tree is of quadratic complexity.
It's easy to introduce dependency cycles, or other mistakes introducing needless sequentiality. The latter being especially insidious as it easily goes unnoticed.

This commit introduces the ability to specify dependencies by using a combinator taking a list of TestTrees. This way of specifying dependencies removes the quadratic complexity in favor of a linear one, while eliminating the ability to accidentally introduce cycles or unintended sequentiality.

Jul 19 '22 19:07 martijnbastiaan

@Bodigrim @VictorCMiraldo I'm eager to get this PR merged, my aim is to fix all of the gotchas listed in Tasty's README regarding dependencies - these affect the main project I'm working on:

If Test B depends on Test A, remember that either of them may be filtered out using the --pattern option. Collecting the dependency info happens after filtering. Therefore, if Test A is filtered out, Test B will run unconditionally, and if Test B is filtered out, it simply won't run.

Tasty does not currently check whether the pattern in a dependency matches anything at all, so make sure your patterns are correct and do not contain typos. Fortunately, misspecified dependencies usually lead to test failures and so can be detected that way.

Dependencies shouldn't form a cycle, otherwise Tasty with fail with the message "Test dependencies have cycles." A common cause of this is a test matching its own dependency pattern.

Using dependencies may introduce quadratic complexity. Specifically, resolving dependencies is O(number_of_tests × number_of_dependencies), since each pattern has to be matched against each test name. As a guideline, if you have up to 1000 tests, the overhead will be negligible, but if you have thousands of tests or more, then you probably shouldn't have more than a few dependencies.

Additionally, it is recommended that the dependencies follow the natural order of tests, i.e. that the later tests in the test tree depend on the earlier ones and not vice versa. If the execution order mandated by the dependencies is sufficiently different from the natural order of tests in the test tree, searching for the next test to execute may also have an overhead quadratic in the number of tests.

This PR together with a followup PR should achieve that. I know it's quite a bit to review, so I've tried to add enough tests to hopefully convince you that it's at least somewhat sound. If you'd like a video call to talk this through, I'm open for that too! If there's anything else I can do, ask away.

Jul 26 '22 19:07 martijnbastiaan

@martijnbastiaan sorry, I don't have bandwidth for this right now. Could you possibly summarise all breaking changes here (both API and behaviour)?

CC @VictorCMiraldo @adamgundry @andreasabel as other maintainers.

Jul 26 '22 20:07 Bodigrim

No worries @Bodigrim, thanks for the reply.

The API changes are as follows:

Added exports: afterTree, sequentialTestGroup.
Added TreeFold.foldAfterTree. This allows folding AfterTree (a constructor not part of the public API). Given the existence of trivalFold, I don't expect this to affect existing programs.
TreeFold.foldSingle takes an extra argument ExactPath. I would actually like every function in TreeFold to take an ExactPath (its position in the TestTree), but I decided against is to minimize API changes. Given its usefulness I'd like to ask the maintainers here to make the same consideration - and let me know if they'd opt for adding the ExactPath everywhere.

There are no behavioral changes to existing programs.

Jul 27 '22 08:07 martijnbastiaan

Thanks for the PR @martijnbastiaan! That is a large one! :) Unfortunately, I don't have the bandwidth nor the familiarity with the mechanisms surrounding tasty's test dependencies. Hence, I'm not sure I can give a good review on this anytime soon.

Jul 27 '22 15:07 VictorCMiraldo

Understood!

Hmm. So to give some background: I wrote this patch to resolve some problems we face with our 4K large testsuite over at clash-lang/clash-compiler, a project me and my colleagues at QBayLogic maintain.

What I could do is ask a colleague for a review on this patch (and the follow-up one). Would that be enough -together with the listed API changes I posted earlier- to accept this patch? I'm of course willing to maintain the feature too, and deal with the fallout if any.

Jul 27 '22 17:07 martijnbastiaan

TreeFold.foldSingle takes an extra argument ExactPath.

Can this breaking change be avoided by introducing a new fold and expressing existing foldSingle via it?

Jul 27 '22 18:07 Bodigrim

Yes, we could add foldSingleExact that will (by default) simply call foldSingle.

Jul 27 '22 19:07 martijnbastiaan

@martijnbastiaan could you please share what kind of dependency graph is typical for Clash test suite? Is it chains of depending tests?

Jul 27 '22 21:07 Bodigrim

Sure. To give a bit of context: Clash is a Haskell to silicon compiler - meaning Haskell can be used to build chips that could be synthesized by say a TSMC (or more realistically; it's used to program a reconfigurable chip). Silicon is typically written in a Hardware Description Language (HDL) which is what Clash targets as a compiler output. There are currently three major HDLs: Verilog, SystemVerilog, and VHDL. Each of these languages have various simulators that can be used to predict chip behavior before actually burning it into silicon. Clash itself can also be simulate (simply by leveraging GHC!).

So, one test looks something like:

  graph TD;
      FooTest-->VHDL;
      FooTest-->Verilog;
      FooTest-->SystemVerilog;
      VHDL-->Clash_VHDL;
      VHDL-->Clash_Sim_VHDL;
      Verilog-->Clash_Verilog;
      Verilog-->Clash_Sim_Verilog;
      SystemVerilog-->Clash_SV;
      SystemVerilog-->Clash_Sim_SV;
      Clash_SV-->ModelSim;
      Clash_SV-->Verilator;
      Clash_VHDL-->GHDL;
      Clash_VHDL-->Vivado_VDHL;
      Clash_Verilog-->Vivado_Verilog;
      Clash_Verilog-->IcarusVerilog;

I'm glossing over some things, but each new (regression) test looks a bit like this. This makes the number of dependencies grow linearly with the number of tests, hence us running into this caveat mentioned in the README:

Using dependencies may introduce quadratic complexity. Specifically, resolving dependencies is O(number_of_tests × number_of_dependencies), since each pattern has to be matched against each test name. As a guideline, if you have up to 1000 tests, the overhead will be negligible, but if you have thousands of tests or more, then you probably shouldn't have more than a few dependencies.

With regards to:

Changing TreeFold is likely to break all third-party tasty ingredients.

I'm willing to help out patching them, of course.

Jul 28 '22 07:07 martijnbastiaan

I agree that the infrastructure for dependencies between tests is wanting, but I feel that your approach covers only a fraction of cases, but required changes are quite profound. My use case is dependencies between disjoint, potentially remote trees, see https://hackage.haskell.org/package/tasty-bench-0.3.2/candidate/docs/Test-Tasty-Bench.html#g:4

If the shape of your dependency graphs is relatively stable, might your use case be better served with a custom IsTest provider?

Jul 28 '22 19:07 Bodigrim

The only real limitation is that you can't make a DAG, you can only make a tree. I've checked out the example of bcompare and the examples it links (chimera, fast-digits, unicode-data). Correct me if I'm wrong, but it seems like none of those examples need the TestTree to be a DAG. I think all of these examples just want to run a single test as a baseline (left part of afterTree) and compare it against a bunch of other tests (right part of afterTree). I'll respond to the other points too, but I'd like to understand the need for a DAG better before I do.

Jul 29 '22 10:07 martijnbastiaan

Correct, none of my examples needs a DAG. However, the shape of TestTree is not necessarily in correspondence with a dependency tree.

Jul 31 '22 22:07 Bodigrim

Sure, that's true. I'll be honest, it's kinda hard to respond to this without sounding very negative/argumentative. It seems to me that you're paying a pretty high price for the ability to do so. The examples need to construct patterns such as:

"$NF == \"" ++ show n ++ "\" && $(NF-1) == \"Chimera\""

..which would break if there's ever another test group called Chimera anywhere in the tree, quite seriously contradicting the composability of TestTrees. It has all the downsides mentioned in the README. Last but not least, for a casual reader it's pretty hard to even read the code - given that you need to do the same process Tasty does when finding a dependency: scan the whole tree.

So all in all, it seems somewhat "obvious" to me that you'd want something like afterTree. And as far as the complexity goes.. well that's kinda imposed to us by the existence of after itself. If we only had afterTree, there would have been no need for Trie or DependencyTree (from the follow-up PR). All would naturally follow from the trees construction and traversal - the code would be even simple than what's now in main.

So, in a way I feel that this remark:

but I feel that your approach covers only a fraction of cases

turns everything on its head. It seems to me that after should try and justify its costs. Is it really worth having this pattern matching when it messes with composability, computational complexity, correct-by-constructionness, and code complexity?

I guess what I'm asking is: is there any chance of this getting merged? If not, do you see an alternative way of tackling the issues in the README, or will Tasty be forever stuck with them?

Aug 01 '22 08:08 martijnbastiaan

@martijnbastiaan to be very clear, I agree that the dependency design is inconvenient and I also agree that if we were doing it from the scratch and with our current experience, then AfterTree would be a saner option. My problem is that we already have a mechanism (constructor After) for this pretty niche feature, and adding another one (constructor AfterTree) for even more niche use cases does not strike me as the most desirable outcome. I'm happy to be overruled by other maintainers, my cautiousness is because I'm a caretaker only.

So, answering your question directly, yes, there is a chance, but we must consider the design and long-term consequences. This can take more time than one might like, sorry for that. I do appreciate your work, your efforts and this particular contribution.

What is the prior art for dependencies between tests in other frameworks?

Could you define something like

data ClashTest = ClashTest 
  { ghdlTest :: Assertion
  , vivadoVdhlTest :: Assertion
  , vivadoVerilogTest :: Maybe Assertion
  , icarusVerilogTest :: [Assertion]
  , ...
  }

and provide instance IsTest ClashTest to manage dependency chain? More generally, you can define newtype AssertionChain = AssertionChain [Assertion] with instance IsTest AssertionChain executing tests in order.

Aug 01 '22 18:08 Bodigrim

Understood. If necessary, I'd be willing to maintain it. My ideal would be to deprecate After in the very long term, but I understand that takes a lot of time, effort, and commitment - hence not always worth it.

I'll have a look at your proposal and dig into the prior art, hopefully this weekend.

Aug 01 '22 18:08 martijnbastiaan

What is the prior art for dependencies between tests in other frameworks?

I've looked into this:

sydtest: exposes sequential and parallel declaring that all tests in a test tree need to be executed sequentially or in parallel. (Where parallel is the default.) It does not seem to offer a way to chain two tests together.
hspec: exposes parallel declaring all tests in a test tree to be executed in parallel. It does not seem to offer a way to modify this locally - i.e. further down in the test case.

To me it looks like sydtest is very close to doing TheRightThing(tm).

Could you define something like

This can work, but it has quite a few drawbacks by virtue of the tests being opaque to Tasty:

It's not rendering them in the progress report, also making it harder to read test failures
It's hard(er) to run specific simulators individually / replay tests
It defeats parallelism between different simulators

So all in all this would be a pretty big step back IMO.

I could do two things to make this series of patches more acceptable:

I could reverse the order of patches, making the first patch (fixing test filtering, a PR I intended to publish after merging this one) a pure bugfix. Having that infrastructure would make adding AfterTree almost trivial.
I could make the patches fully backwards compatible by introducing a bunch of "exact" functions, like:

-- | An algebra for folding a `TestTree`.
--
-- Instead of constructing fresh records, build upon `trivialFold`
-- instead. This way your code won't break when new nodes/fields are
-- indroduced.
data TreeFold b = TreeFold
  { foldSingle :: forall t . IsTest t => OptionSet -> TestName -> t -> b
  , foldGroup :: OptionSet -> TestName -> b -> b
  , foldResource :: forall a . OptionSet -> ResourceSpec a -> (IO a -> b) -> b
  , foldAfter :: OptionSet -> DependencyType -> Expr -> b -> b

    -- "Exact" versions of functions defined above. In 'trivialFold', these simply
    -- call the non-exact versions. These functions are here for backwards compatiblity
    -- reasons.
    --
    -- Note: instead of taking a `TreeFold`, these functions could take their non-exact
    -- function directly, but this makes the type signatures quite long.
  , exactFoldSingle :: forall t . IsTest t => TreeFold b -> ExactPath -> OptionSet -> TestName -> t -> b
  , exactFoldGroup :: TreeFold b -> ExactPath -> OptionSet -> TestName -> b -> b
  , exactFoldResource :: forall a . TreeFold b -> ExactPath -> OptionSet -> ResourceSpec a -> (IO a -> b) -> b
  , exactFoldAfter :: TreeFold b -> ExactPath -> OptionSet -> DependencyType -> Expr -> b -> b
  }

I personally don't think it's worth it, but I'm happy to do it if that makes things more acceptable/reviewable.

I do appreciate your work, your efforts and this particular contribution.

Thanks btw, happy to work with you guys.

Aug 07 '22 07:08 martijnbastiaan

To me it looks like sydtest is very close to doing TheRightThing(tm).

Would sequential combinator on its own be sufficient to match your requirements?

Aug 07 '22 15:08 Bodigrim

For my specific use case it would be both a step forward and backward at the same time.

The positives:

We could remove the code dealing with patterns, which is a fairly ugly/confusing part of the test code now.
Starting the testsuite would (if properly implemented) again be a matter of milliseconds, instead of half a minute.

The neutrals:

Tasty would continue to show tests separately.

The negatives:

We'd lose the ability to run simulators in parallel.

I'm somewhat biased as I already put in the work of course, but it strikes me as a band-aid instead of a proper solution. Do you think it would be easier to implement?

Aug 08 '22 20:08 martijnbastiaan

Well, adding TestTreeSequential constructor or maybe extending existing TextTree with a switch Parallel | Sequential seems to me a better option. You do not lose ability to run certain parts in parallel, because TestTreeSequential can affect only one level of tree.

Aug 09 '22 23:08 Bodigrim

Sure, I can amend the patch to do this.

Aug 10 '22 06:08 martijnbastiaan

@martijnbastiaan before you do more work, I'd like to hear from other maintainers.

@VictorCMiraldo @andreasabel Existing TestGroup constructor groups subtrees for (potentially) parallel execution. The proposal is either to add TestGroupSequential constructor, or to extend TestGroup with a field data Parallelism = Parallel | Sequential, which mandates parallel execution of subtrees. How do you feel about this? Currently a similar (but not quite the same) effect can be achieved with a sequence of After constructors, which is very tedious to write and unbearably slow to execute.

Edited: ~~TestTree~~ TestGroup.

Aug 10 '22 18:08 Bodigrim

@Bodigrim and @martijnbastiaan, first of all, thanks for the great work both making the PR and the thoughtful comments! :)

Currently a similar (but not quite the same) effect can be achieved with a sequence of After constructors, which is very tedious to write and unbearably slow to execute.

Yeah, that doesn't sound like a very good solution and I can see how running tests sequentially is an interesting feature. I do tend to agree with @Bodigrim, however. Adding a AfterTree constructor feels weird given that we already have an After. Considering the semantic interplay between them seems like a potential nightmare for me. Can we have weird deadlocks with things like (in pseudo-tasty) t = AfterTree (After "/a/" $ singleTest "b") (singleTest "a")? IIUC, plain tasty already has this type of problem with regular TestTrees.

It does feel more natural to to add a TestTreeSequential constructor instead of using the dependency mechanism to enforce sequential execution. Also, it doesn't introduce new quirks w.r.t. dependencies such as t above: a TestTree u has a deadlock iff TestTreeSequential u has a deadlock.

Would a TestTreeSequential need this change? If not, then it would be fully backwards compatible, which is a huge plus IMO

Aug 11 '22 08:08 VictorCMiraldo

Potentially we can leave TreeFold as is, with foldGroup processing both TestGroup and TestGroupSequential. Adding a new constructor to TestTree is breaking any way, but potentially with less impact on clients.

Aug 11 '22 21:08 Bodigrim

Fundamentally, TestTreeSequential will differ very little from AfterTree. Instead of having two TestTree fields, it will simply take a list of them. I do think it's the better option: I only ended up using sequentialTestGroup (a function this PR introduces) in practice. However, it will still need the changes to TreeFold, unless I find a better way to implement it.

The core of the problem is that this variable:

https://github.com/UnkindPartition/tasty/blob/df7ccab50d361c7d1a3c00960faac883d98c2530/core/Test/Tasty/Run.hs#L327-L332

needs to take TestTreeSequential (or AfterTree) into account. While patterns need to scan the whole tree there, AfterTree/TestTreeSequential can define much more efficient lookup strategies.

Aug 13 '22 13:08 martijnbastiaan

Hmm.. An alternative implementation could implement a custom folder that propagates dependencies up while it is folding. I.e., it would not use foldTestTree to run the tests. This would avoid the complexities of introducing the Trie and would also make for a natural solution to the very first caveat mentioned in the README. I'm not sure how this would fit into the rest of the ecosystem though.

Aug 13 '22 13:08 martijnbastiaan

I would do this (please find a better name for TestGroup'):

data ExecutionOrder = Concurrent | Sequential

data TestTree
  = forall t . IsTest t => SingleTest TestName t
  | TestGroup' ExecutionOrder TestName [TestTree]
  | PlusTestOptions (OptionSet -> OptionSet) TestTree
  | forall a . WithResource (ResourceSpec a) (IO a -> TestTree)
  | AskOptions (OptionSet -> TestTree)
  | After DependencyType Expr TestTree

pattern TestGroup = TestGroup' Concurrent

This is a breaking change, but a very minor one. And you can leave TreeFold as is, ignoring ExecutionOrder, so there is no breakage here.

Indeed, createTestActions can fold TestTree by its own means, not necessarily via foldTestTree.

Aug 18 '22 00:08 Bodigrim

FWIW, I like to compromise suggested by @Bodigrim above. Seems like a good option to minimize breakage while enabling @martijnbastiaan to have the necessary feature that motivated the PR in the first place.

Aug 24 '22 16:08 VictorCMiraldo

Great, thank you both. I've got a patch now that implements the suggestions (adding TestGroup field + custom traversal). While it doesn't try to fix any of the issues raised in the README for pattern dependencies, it does solve them for the new way of specifying deps in a fairly straightforward way.

I'm pretty happy with the patch, but I'm still working on writing a proper testsuite. I'm spending only a few hours on it every weekend so it'll probably take a while..

Aug 24 '22 16:08 martijnbastiaan

I've just pushed an alternative implementation based on the comments above. This one doesn't try to fix any issues with dependencies specified using patterns, but does solve them all for dependencies specified using sequentialTestGroup. TestGroup is now the primitive setting parallelization/sequentiality. Two backwards incompatible changes have been applied:

1. TreeFold no longer has a field foldResource. Its type signature had to change in order for the fold to account for dependencies while filtering. However, it doesn't seem likely downstream will be affected at all:

Searching for foldResource on all of GitHub only yields tasty using it.
Searching through all libraries with tasty in their name on Hackage does not yield any relevant usages of TreeFold either.

If this change proves controversial, I suggest we move the second commit in this PR to a separate PR and discuss there.

2. TestGroup has an additional field ExecutionMode. @Bodigrim, you suggested making a pattern synonym like this:

pattern TestGroup = TestGroup' Concurrent

But, libraries using this pattern will silently fail:

case tree of
  TestGroup _ -> ...
  _           -> ...

Libraries using the following pattern will fail with just a GHC warning:

case tree of
  TestGroup _ -> ..
  [.. exhaustive list of constructors ..]

To get a feel for the breakage, I've cloned all the libraries published on Hackage related to tasty and searched for TestGroup. Of all these libraries, 7 use TestGroup directly: tasty-bench, tasty-expected-failure, tasty-silver, tasty-inspection-testing, tasty-focus, tasty-focus, and tasty-bdd. Of these, 6 would fail silently as their usage falls in one of the two patterns mentioned. Only tasty-inspection-testing would continue to be correct.

I'm proposing to let static typing do its thing to prevent silent breakage.

Aug 28 '22 10:08 martijnbastiaan

@Bodigrim I've applied your suggestions in separate commits. I'll squash them into the first two if you're happy with them.

Aug 29 '22 12:08 martijnbastiaan

tasty tasty copied to clipboard

Add exact dependency matching

tasty
tasty copied to clipboard