GPU compilation error detection or troubleshooting
When searching for GPU-compatibility bugs in #300, I found that the bugs are actually quite trivial:
- the use of default function arguments (Numba currently does not support this), and
- the use of NumPy functions.
However, the error messages do not clearly indicate what went wrong, so we had to hunt for possible issues line by line through the changed code.
For future debugging, it would be very helpful if we could detect these known issues during compilation and report them in the error message.
Alternatively, a troubleshooting list of frequently occurring bugs may be good enough.
What do you think? @braxtoncuneo @jpmorgan98 @clemekay
If we are adding in the trace decorator, we could add in checks for default arguments through function inspection, and raise an error if a default argument is found.
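A minimal sketch of what such a check could look like, using only the standard `inspect` module. The name `reject_default_args` is hypothetical and this is not the actual trace decorator; it only assumes the check runs on the plain Python function before it is handed to the compiler.

```python
import inspect


def reject_default_args(func):
    """Hypothetical check: raise early if `func` declares default arguments."""
    signature = inspect.signature(func)
    # Collect every parameter that carries a default value.
    offenders = [
        name
        for name, param in signature.parameters.items()
        if param.default is not inspect.Parameter.empty
    ]
    if offenders:
        raise TypeError(
            f"{func.__name__} uses default argument(s) {offenders}, "
            "which may break GPU compilation"
        )
    # Return the function unchanged so the check can be stacked under other decorators.
    return func
```

Applied as `@reject_default_args` directly above a function definition, this fails at import time with a message naming the function and the offending parameters, rather than during compilation.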
If we want something that doesn't use decorators, it could be accomplished by inspecting each function for the functions it calls and recursively searching those for any that use default arguments.
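A rough sketch of the decorator-free variant, assuming the functions are still plain Python functions defined at module level (not yet wrapped by a JIT decorator). The helper name `find_default_arg_functions` and the example functions are hypothetical. It relies on `__code__.co_names`, which lists the global and attribute names a function references, so it only follows callees that resolve to plain functions in the caller's module globals.

```python
import inspect


def find_default_arg_functions(func, _seen=None):
    """Hypothetical helper: names of `func` and its callees that use defaults."""
    seen = set() if _seen is None else _seen
    if id(func) in seen:  # guard against recursion cycles
        return []
    seen.add(id(func))

    found = []
    params = inspect.signature(func).parameters.values()
    if any(p.default is not inspect.Parameter.empty for p in params):
        found.append(func.__name__)

    # Follow every referenced global name that resolves to a plain function.
    for name in func.__code__.co_names:
        callee = func.__globals__.get(name)
        if inspect.isfunction(callee):
            found.extend(find_default_arg_functions(callee, seen))
    return found


# Hypothetical call chain: kernel -> middle -> leaf (leaf has a default).
def leaf(x, scale=2.0):
    return x * scale


def middle(x):
    return leaf(x)


def kernel(x):
    return middle(x) + 1


offenders = find_default_arg_functions(kernel)
```

This misses functions reached through closures, methods, or already-compiled wrappers, so it would be a first-pass filter rather than a complete check.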
I think having a list of known/frequently-appearing bugs is a good idea, perhaps on the website? It's not exactly a robust solution, but at the very least, it gives someone debugging a place to start.
@braxtoncuneo is there any downside to adding in the trace decorator? If not, I think that sounds like a nice solution.
Also, hopefully modularizing the code will help with tracking down changes.
In fact, a few weeks ago I asked Braxton about anything I should keep in mind or look out for while developing to make sure I'm not causing GPU issues; maybe we should have a general page for "Seemingly innocuous things that nevertheless break GPU compatibility".
It could be a list of things that developers wouldn't necessarily think to avoid because they don't cause CPU issues but that we know will cause GPU issues, such as the use of some NumPy functions or passing objects outside of arrays.
It can then double as a place to start when trying to debug GPU issues!
If these bugs have been fixed, does that mean we are ready for v0.12.0?
@jpmorgan98 As it stands, the bug fix PR (#310) was merged yesterday, but the GPU regression tests for it failed.
Yes, the GitHub workflow run for the GPU regression test failed. Is that expected, @jpmorgan98? I have confirmed, though, that the GPU regression test passes when run from our OSU CI machine.