testing for portblas portfft
Description
Add Unit tests for portfft, portblas to CI
We'll look into ways to reduce the time it takes to run the tests. We may need to cache the SYCL kernels across the tests.
We'll look into ways to reduce the time it takes to run the tests. We may need to cache the SYCL kernels across the tests.
The build times are comparable to MKL backend. MKL tests run in < 5 minutes. CI failed with a timeout because both portblas and portfft tests take > 4 hours to run. Are you saying that driver compile time is the cause? I can add a regex to select a subset of the tests to run, with a target of running for 5 minutes. What do you recommend?
The build times are comparable to MKL backend. MKL tests run in < 5 minutes. CI failed with a timeout because both portblas and portfft tests take > 4 hours to run. Are you saying that driver compile time is the cause? I can add a regex to select a subset of the tests to run, with a target of running for 5 minutes. What do you recommend?
Yes the JIT compilation may be the issue, we'll need to confirm. I think in the current situation we would need to disable too many tests so I would suggest to wait until we know more. Running all the tests under 5 minutes seems very optimistic for portFFT or portBLAS though.
I changed it so we build all the unit tests for portfft/portblas but only run a handful of tests. Everything passes and port* finishes before MKL. It is not what we want for the long term, but we can enable more as we get faster machines and hopefully find a way to reduce the JIT time. Are you OK with committing what is there now?
An indirectly related question: I noticed that previous workflows were run on my forked repo. Do you think a condition like this can avoid the runs on forked repos?
Do you think a condition like this can avoid the runs on forked repos?
That will work, but you have to put the condition in every job for every workflow. I don't think there is a file-level way to do it. An alternative is to disable github actions for your fork. It is in settings/actions/general, then look near the top.
I see. I hope GitHub can provide a more maintainable way for this. At the same time, we may need to communicate this with contributors, but this is out of scope of this PR.
@rscohn2 from our local testing we have confirmed that using AOT compilation of the kernels speeds up the testing a lot. This is best to use if we know we will run on a specific device.
For the portBLAS backend it turns out this can be done by setting the cmake flag -DPORTBLAS_TUNING_TARGET=INTEL_CPU. This sets the SYCL target to spir64_x86_64. With this we were able to compile in 32 minutes and run all the tests in 8 minutes on a i9-12900K. I think it would take a bit longer with the current GitHub runner though.
For the portFFT backend you might as well directly set the target to spir64_x86_64 and enable more tests if that's ok.
We'll try to improve the documentation for this. We need to find a meaningful default behavior while still letting the user tune for more specific use-cases.
The github runner has 2 cores/4 threads, but I think we will have more cores available soon. For the 8 minute test, were they parallel or serial? @dnhsieh-intel prefers serial tests because the logs are easier to read.
We just ran ctest so the tests should have been run serially when we measured 8 minutes.
portfft tests are not passing with aot
https://github.com/oneapi-src/oneMKL/actions/runs/9208889271/job/25332103601#:~:text=48-,unknown%20file%3A%20Failure,%5B%20%20FAILED%20%20%5D,-ComputeTestSuite/ComputeTests_in_place_COMPLEX.COMPLEX_SINGLE_in_place_buffer
Right, that makes sense actually portFFT is using spec constants. Let's revert the target to spir64 for portFFT then.
With AOT, portblas takes 1.5 hours to compile and 5 minutes to run. Runners only have 2 cores, but we are getting runners with more cores so parallel build will be faster. 2nd longest is mkl blas, which takes 1 hour with most time spent in compile. portfft is back to not using aot and running a couple tests. Everything passes and runs in an acceptable amount of time.