chiseltest
chiseltest copied to clipboard
TimeoutException should dump better debug information
Is there a way to catch TimeoutException
in user code? It seems like the exception gets thrown and there is no way to catch it (user code is not even part of the stacktrace) and process it in user code (e.g. to reformat the message, provide extra context, etc).
- should connect ports and work *** FAILED ***
chisel3.tester.TimeoutException: timeout on Clock(IO clock in MyModule) at 1000 idle cycles
at chisel3.tester.TreadleBackend.$anonfun$run$5(TreadleBackend.scala:188)
at chisel3.tester.TreadleBackend.$anonfun$run$5$adapted(TreadleBackend.scala:185)
at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:145)
at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:235)
at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:228)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:145)
at chisel3.tester.TreadleBackend.run(TreadleBackend.scala:185)
at chisel3.tester.internal.Context$.$anonfun$run$1(Testers2.scala:25)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
If I recall correctly, the timeout is being thrown in the infrastructure code, which is a layer above the main test body (which I think is also being run in another thread). I think you can wrap the entire test (new MuDut()) { ... }
construct in a try-catch, and it might work?
Otherwise, can you give me an example of the use case? I'm not sure how this is best handled especially given the multithreaded nature of tests. Currently, uncaught exceptions in any thread are propagated up to the main thread, then all other threads interrupted and terminated. If you wanted to catch the TimeoutException, in which thread should it fire? Just the main thread? Or is it fine to it have it as an infrastructure-level exception that can't be caught within the test body?
Also, the intent behind TimeoutException is to allow recovery from infinite loops without killing and restarting sbt - it's essentially a last-ditch save. We could also think of other mechanisms to support your use case.
The short-term problem which I ran into is the case of writing a test library. I wanted a way to present a friendlier message to the user about some more details of the test that failed, with the current stacktrace it's not immediately obvious where the fault came from. (i.e. yes, the test failed, but what was the test library doing that led to it?)
Code wise I wanted to do something like: (not super fixed on this API, just need some way to express this intent)
try {
fork {
...
} .fork {
...
} .join
} catch {
case e: TimeoutException: reportError("Your foo bar test failed")
}
One idea for debugging was to allow named timescopes, which would be useful in localizing assertions (so you could get something like "assertion x != 1 failure in MyTransaction: SPI: bit 0: high", and not just a bunch of line numbers / stack trace). That might also provide more context for timeout failures.
That being said, this strategy certainly can't provide as much information as catch-rethrow, but it's also more lightweight. If you're already using timescopes, you just need to tack a name onto them, and you get assertion and timeout context essentially free. I think wrapping everything in try-catch is a bit cumbersome.
Anyways, I still don't think catching a TimeoutException is a good pattern (feel free to convince me otherwise?) - instead, if something is expected to finish within some time, that should be expressed with more intent-exposing constructs, eg expectWithin.
I'm fine with using named timescopes; then, is there a way to get the extra information out of it? It would make test libraries more pleasant to use if test libraries can help provide extra information. The main thing is to make the user aware of what the test library was doing that caused this.
val t = timescope(name="my_foo_bar", timeout=1000) {
fork { ...} .fork { ... } .join
}
if (t.hasExceptions) {
// get exception and report it
}
The idea is for the named timescope to provide information on what the library was doing (eg, SPI transaction, on bit 0, on the clock-high half-cycle). It wouldn't allow you to put in logic after-the-fact (eg, post-processing error messages) as you might when catching an exception, but you can have generated timescope names beforehand since the timescope name would take a string argument.
Is it possible to expand the named timescope idea to take a function or some user logic? e.g.
timescope(reportingFunc=myReportingFunc, timeout=1000) { ... }
It could also be a trait, object, or something else. Which ever API is fine, as long as there's a relatively easy way for test library writers to add information beyond just a string.
Also, if there's an experimental
workaround where it's possible to just catch the TimeoutException
directly, that's fine too. I understand developing a clean API takes time and thought.
So in my view there's two orthogonal issues here:
- Should TimeoutException be catchable in user test code? And which thread should get the exception if there are multiple threads? Note if all threads get the exception, and nothing in the thread catches it, you will get console spam indicating uncaught exception(s), which could just add to the confusion.
- How should the timescope name stack (or other context-dependent testers2 data) be obtained for the purposes of reporting exceptions? Is this sufficiently beneficial to outweigh the cost of adding special-case constructs? One possibility that I think is relatively clean would be to have a special Testers2Exception (which could be extended by more specific exceptions) that captures the current context-specific testers2 data that is known, like timescope name stack. I'm not convinced adding a reporting function is the best solution, since it doesn't seem super composable - it seems like it's replacing a try/catch block without looking like a try/catch block.
Can we have TimeoutException
as a subclass of Testers2Exception
?
Sure, but you still need to determine which thread gets the TimeoutException
, or, if it stays an infrastructure-level exception not visible from user test code, but rather the code invoking the driver, which thread's context does it capture from.
Typing up resolutions from two weeks ago... When a TimeoutException fires, dump debugging info from all the threads. It should provide good debugging trace information (especially if we have waitFor as in #39), though its won't be catchable from within a test.