alpaka
Add CI jobs with `alpaka_DEBUG=2`
Currently, all Debug CI jobs (except one analysis job) run with `alpaka_DEBUG=0`. This means that the extra debugging code is never tested by the CI. We should add at least a few CI runs testing different debug levels.
I will fix it. When I created the job generator, I was not aware of the CMake flag.
Thx a lot! This could cause some tests to become very noisy, so we may need to look at the logs to check whether this is still manageable.
You are right: https://gitlab.com/hzdr/crp/alpaka/-/jobs/5021535754
Do you have an idea how to handle it?
Well, we could have a discussion whether anything but the 4th line here is meaningful to anyone:
29: [-] BufCpuImpl
29: [-] allocBuf
29: [+] operator()
29: printDebug e: (1) ewb: 1 de: (1) dptr: 0x55ba59952570 dpitchb: (1) se: (1) sptr: 0x55ba59962ba0 spitchb: (1)
29: [-] operator()
29: [+] ~BufCpuImpl
29: [-] ~BufCpuImpl
29: [+] ~BufCpuImpl
29: [-] ~BufCpuImpl
29: [+] getDevByIdx
29: [+] getDevCount
29: [-] getDevCount
29: [-] getDevByIdx
29: [+] getDevByIdx
29: [+] getDevCount
29: [-] getDevCount
29: [-] getDevByIdx
29: [+] QueueGenericThreadsBlocking
29: [-] QueueGenericThreadsBlocking
29: [+] allocBuf
29: [+] BufCpuImpl
and just reduce the amount of output. In particular, simple queries like `getDevCount` are just noise IMO. I can see a use for the logs when buffer implementations are destroyed, because the shared pointers make it harder to understand lifetimes.
But in general, we should have only very few CI runs with `alpaka_DEBUG=2`. I don't know whether you can steer that in the job generator.
If I add the BUILD_TYPES `CMAKE_DEBUG1` and `CMAKE_DEBUG2` and map them to `alpaka_DEBUG=1` and `alpaka_DEBUG=2`, the jobs will be split in nearly equal numbers between the two levels. But I can also implement custom rules, e.g. set `alpaka_DEBUG=2` for one job per device compiler version and let the rest use `alpaka_DEBUG=1`.
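To make the custom rule concrete, here is a minimal sketch of the selection logic only. It is not the job generator's code: `Job` and `assignDebugLevels` are hypothetical names, and the sketch assumes each job can be identified by a compiler name and version.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical job description; the real job generator has its own types.
struct Job
{
    std::string compiler; // e.g. "gcc"
    std::string version; // e.g. "11"
    int alpakaDebug; // value that would be passed as -Dalpaka_DEBUG=<n>
};

// The first job seen for every (compiler, version) pair gets debug level 2,
// every further job with the same pair gets debug level 1.
inline void assignDebugLevels(std::vector<Job>& jobs)
{
    std::set<std::pair<std::string, std::string>> seen;
    for(auto& job : jobs)
    {
        // emplace() reports in .second whether the pair was newly inserted.
        bool const firstOfItsKind = seen.emplace(job.compiler, job.version).second;
        job.alpakaDebug = firstOfItsKind ? 2 : 1;
    }
}
```

With this rule the number of `alpaka_DEBUG=2` jobs grows with the number of compiler versions, not with the total number of jobs.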
Maybe we should think about adding a debug level 3 where all entries and exits of alpaka functions are visible, and disable this output for debug level 2.
But then we will need CI jobs for `alpaka_DEBUG=3`...
FYI in debug mode with the CUDA back-end I see:
93% tests passed, 2 tests failed out of 30
Total Test time (real) = 316.42 sec
The following tests FAILED:
3 - mandelbrotTest (ILLEGAL)
4 - matMulTest (ILLEGAL)
Both tests failed with: 'cudaErrorLaunchOutOfResources': 'too many resources requested for launch'!
> But then we will need CI jobs for `alpaka_DEBUG=3`...
Actually, yes and no. It only tests whether a `std::cout` works, so this is not critical. On the other hand, we know that printing to the terminal can change the execution order in a parallel program. But a `std::cout` should not fix our application. Therefore, I'm in favor of an `alpaka_DEBUG=3`. This level is only for human developers ;-)
> > But then we will need CI jobs for `alpaka_DEBUG=3`...
>
> Actually, yes and no. It only tests whether a `std::cout` works, so this is not critical.
The `std::cout << getWidth(extent) ...;` was exactly what was broken, so we must include it in the tests.