oneMKL icon indicating copy to clipboard operation
oneMKL copied to clipboard

testing for portblas portfft

Open rscohn2 opened this issue 1 year ago • 13 comments

Description

Add Unit tests for portfft, portblas to CI

rscohn2 avatar May 21 '24 20:05 rscohn2

We'll look into ways to reduce the time it takes to run the tests. We may need to cache the SYCL kernels across the tests.

Rbiessy avatar May 22 '24 09:05 Rbiessy

We'll look into ways to reduce the time it takes to run the tests. We may need to cache the SYCL kernels across the tests.

The build times are comparable to MKL backend. MKL tests run in < 5 minutes. CI failed with a timeout because both portblas and portfft tests take > 4 hours to run. Are you saying that driver compile time is the cause? I can add a regex to select a subset of the tests to run, with a target of running for 5 minutes. What do you recommend?

rscohn2 avatar May 22 '24 10:05 rscohn2

The build times are comparable to MKL backend. MKL tests run in < 5 minutes. CI failed with a timeout because both portblas and portfft tests take > 4 hours to run. Are you saying that driver compile time is the cause? I can add a regex to select a subset of the tests to run, with a target of running for 5 minutes. What do you recommend?

Yes the JIT compilation may be the issue, we'll need to confirm. I think in the current situation we would need to disable too many tests so I would suggest to wait until we know more. Running all the tests under 5 minutes seems very optimistic for portFFT or portBLAS though.

Rbiessy avatar May 22 '24 13:05 Rbiessy

I changed it so we build all the unit tests for portfft/portblas but only run a handful of tests. Everything passes and port* finishes before MKL. It is not what we want for the long term, but we can enable more as we get faster machines and hopefully find a way to reduce the JIT time. Are you OK with committing what is there now?

rscohn2 avatar May 22 '24 16:05 rscohn2

An indirectly related question: I noticed that previous workflows were run on my forked repo. Do you think a condition like this can avoid the runs on forked repos?

dnhsieh-intel avatar May 22 '24 17:05 dnhsieh-intel

Do you think a condition like this can avoid the runs on forked repos?

That will work, but you have to put the condition in every job for every workflow. I don't think there is a file-level way to do it. An alternative is to disable github actions for your fork. It is in settings/actions/general, then look near the top.

rscohn2 avatar May 22 '24 18:05 rscohn2

I see. I hope GitHub can provide a more maintainable way for this. At the same time, we may need to communicate this with contributors, but this is out of scope of this PR.

dnhsieh-intel avatar May 22 '24 19:05 dnhsieh-intel

@rscohn2 from our local testing we have confirmed that using AOT compilation of the kernels speeds up the testing a lot. This is best to use if we know we will run on a specific device. For the portBLAS backend it turns out this can be done by setting the cmake flag -DPORTBLAS_TUNING_TARGET=INTEL_CPU. This sets the SYCL target to spir64_x86_64. With this we were able to compile in 32 minutes and run all the tests in 8 minutes on a i9-12900K. I think it would take a bit longer with the current GitHub runner though.

For the portFFT backend you might as well directly set the target to spir64_x86_64 and enable more tests if that's ok.

We'll try to improve the documentation for this. We need to find a meaningful default behavior while still letting the user tune for more specific use-cases.

Rbiessy avatar May 23 '24 11:05 Rbiessy

The github runner has 2 cores/4 threads, but I think we will have more cores available soon. For the 8 minute test, were they parallel or serial? @dnhsieh-intel prefers serial tests because the logs are easier to read.

rscohn2 avatar May 23 '24 11:05 rscohn2

We just ran ctest so the tests should have been run serially when we measured 8 minutes.

Rbiessy avatar May 23 '24 13:05 Rbiessy

portfft tests are not passing with aot

https://github.com/oneapi-src/oneMKL/actions/runs/9208889271/job/25332103601#:~:text=48-,unknown%20file%3A%20Failure,%5B%20%20FAILED%20%20%5D,-ComputeTestSuite/ComputeTests_in_place_COMPLEX.COMPLEX_SINGLE_in_place_buffer

rscohn2 avatar May 23 '24 16:05 rscohn2

Right, that makes sense actually portFFT is using spec constants. Let's revert the target to spir64 for portFFT then.

Rbiessy avatar May 24 '24 08:05 Rbiessy

With AOT, portblas takes 1.5 hours to compile and 5 minutes to run. Runners only have 2 cores, but we are getting runners with more cores so parallel build will be faster. 2nd longest is mkl blas, which takes 1 hour with most time spent in compile. portfft is back to not using aot and running a couple tests. Everything passes and runs in an acceptable amount of time.

rscohn2 avatar May 26 '24 11:05 rscohn2