DLA-Future
Reduction To Tridiagonal "1-stage" (local)
This effort evaluates a possible improvement for the GPU backend: using a 1-stage reduction instead of the currently implemented 2-stage reduction, which seems better suited to the MC backend.
At the time of writing, this includes:
- a basic local MC implementation (not refined, just "make it work")
- a basic test
- a trivial swap of the algorithm in the eigensolver from 2-stage to 1-stage, since no miniapp has been implemented yet
Next steps:
- making it work on the GPU backend
- profiling and analysis to identify the main problems and better define optimization strategies
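
For readers unfamiliar with the term, a 1-stage reduction brings a symmetric matrix directly to tridiagonal form with Householder reflectors, without the intermediate band matrix used by the 2-stage approach. The sketch below is a minimal, unblocked, serial textbook version for a dense real symmetric matrix. It is not the DLA-Future implementation and does not use its API; the names `Matrix` and `reduce_to_tridiagonal` are hypothetical and chosen only for this illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense storage used only for this illustration.
using Matrix = std::vector<std::vector<double>>;

// Overwrites the symmetric matrix `a` with the tridiagonal matrix
// T = Q^T A Q, where Q is a product of n-2 Householder reflectors.
// Assumes `a` is square and symmetric; Q itself is not stored.
void reduce_to_tridiagonal(Matrix& a) {
  const std::size_t n = a.size();
  for (std::size_t k = 0; k + 2 < n; ++k) {
    // x = a[k+1..n-1][k] is the part of column k below the diagonal.
    double norm_x = 0.0;
    for (std::size_t i = k + 1; i < n; ++i) norm_x += a[i][k] * a[i][k];
    norm_x = std::sqrt(norm_x);
    if (norm_x == 0.0) continue;  // column is already in the desired form

    // Reflector H = I - 2 v v^T maps x to alpha * e1. The sign of alpha
    // is chosen opposite to x[0] to avoid cancellation.
    const double alpha = a[k + 1][k] > 0.0 ? -norm_x : norm_x;
    std::vector<double> v(n, 0.0);  // zero in entries 0..k
    for (std::size_t i = k + 1; i < n; ++i) v[i] = a[i][k];
    v[k + 1] -= alpha;
    double norm_v = 0.0;
    for (std::size_t i = k + 1; i < n; ++i) norm_v += v[i] * v[i];
    norm_v = std::sqrt(norm_v);
    for (std::size_t i = k + 1; i < n; ++i) v[i] /= norm_v;

    // p = A v, kappa = v^T p, w = p - kappa v
    std::vector<double> w(n, 0.0);
    double kappa = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
      for (std::size_t j = k + 1; j < n; ++j) w[i] += a[i][j] * v[j];
      kappa += v[i] * w[i];
    }
    for (std::size_t i = 0; i < n; ++i) w[i] -= kappa * v[i];

    // Two-sided update: H A H = A - 2 (v w^T + w v^T)
    for (std::size_t i = 0; i < n; ++i)
      for (std::size_t j = 0; j < n; ++j)
        a[i][j] -= 2.0 * (v[i] * w[j] + w[i] * v[j]);
  }
}
```

Each step is dominated by the matrix-vector product `p = A v` followed by a rank-2 update of the trailing matrix.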