Run GitHub action with all allowed configurations and cache the build result
Several bugs that break the build were detected recently, e.g., #581, #582. The reason is that CI builds the project with only one configuration. We should build all accepted configurations in the GitHub Actions workflow to catch build errors as soon as possible. Below are the combinations I suggest.
CPU:
| HPTT | BLAS |
|---|---|
| enabled | OpenBLAS |
| enabled | MKL |
| disabled | OpenBLAS |
| disabled | MKL |
CUDA:
| CUTT | cuTENSOR | cuQuantum |
|---|---|---|
| enabled | enabled | enabled |
| enabled | enabled | disabled |
| enabled | disabled | enabled |
| enabled | disabled | disabled |
| disabled | enabled | enabled |
| disabled | enabled | disabled |
| disabled | disabled | enabled |
| disabled | disabled | disabled |
Some CUTT features are hidden behind the cuTENSOR or cuQuantum feature flag, so we can't separate CUTT from the combinations above. In addition, the codebase has separate feature flags for cuTENSOR and cuQuantum, so we can't assume the cuTENSOR flag is always on when the cuQuantum flag is enabled, even though cuTENSOR is a dependency of cuQuantum.
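Both tables above are full Cartesian products of their options, so the job list can be generated rather than written by hand. Here is a minimal Python sketch of that enumeration; the dictionary keys and the `build_matrix` helper are illustrative names for this example, not the project's actual build option names.

```python
from itertools import product

# Option names and values mirror the tables above; they are labels for
# illustration, not the project's real build flags.
CPU_OPTIONS = {
    "HPTT": ["enabled", "disabled"],
    "BLAS": ["OpenBLAS", "MKL"],
}
CUDA_OPTIONS = {
    "CUTT": ["enabled", "disabled"],
    "cuTENSOR": ["enabled", "disabled"],
    "cuQuantum": ["enabled", "disabled"],
}


def build_matrix(options):
    """Return every combination of option values as a list of dicts."""
    names = list(options)
    return [dict(zip(names, values)) for values in product(*options.values())]


cpu_matrix = build_matrix(CPU_OPTIONS)
cuda_matrix = build_matrix(CUDA_OPTIONS)
print(len(cpu_matrix), len(cuda_matrix))  # prints "4 8"
```

Each resulting dict corresponds to one row of the tables and could be mapped to one CI job.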
If cuTENSOR is a dependency of cuQuantum, then why don't we force cuTENSOR to be enabled whenever cuQuantum is enabled?
Or we could expose only the cuQuantum option, so that users cannot enable/disable cuTENSOR separately.
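To make the first suggestion concrete, here is a rough Python sketch of what forcing the dependency would do to the set of configurations to test. The `resolve_flags` function is hypothetical and only models the rule "cuQuantum implies cuTENSOR"; it is not how the build system currently behaves.

```python
from itertools import product


def resolve_flags(cutensor, cuquantum):
    """Hypothetical rule: enabling cuQuantum forces cuTENSOR on."""
    return {"cuTENSOR": cutensor or cuquantum, "cuQuantum": cuquantum}


# Collapse the four raw on/off pairs into the distinct effective configurations.
effective = []
for cutensor, cuquantum in product([True, False], repeat=2):
    flags = resolve_flags(cutensor, cuquantum)
    if flags not in effective:
        effective.append(flags)

for flags in effective:
    print(flags)
```

Under this assumed rule, the combination "cuTENSOR disabled, cuQuantum enabled" never occurs, so the four cuTENSOR/cuQuantum pairs reduce to three.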
With CUTT removed, I think we are converging on having both cuTENSOR and cuQuantum as dependencies. I am trying to set up a self-hosted GPU runner to test the GPU CI/CD.