DLA-Future Tridiagonal Solver (dist): Migrate permutation of local eigenvectors to GPU

Open albestro opened this issue 1 year ago • 0 comments

In #967, a new "special" permutation was added. In the end it is just a local permutation, but it starts from global reasoning. Currently, it runs on MC for both the MC and GPU variants of the tridiagonal solver. To get it running on GPU, there are two main ways:

In order to re-use the local permutation:

We can "preprocess" the permutation array on Backend::MC, extracting just the local parts and converting global indices to local indices.
- Problem: currently the (local) permutation can only deal with local matrices
- Option 1: use local indices to access the local part
- Option 2: create a new object (e.g. MatrixRef) that refers just to the local part (i.e. the new object is no longer aware of the distribution)
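The preprocessing step above can be sketched as follows. This is only an illustration under stated assumptions: a 1D block-cyclic layout with source rank 0, and a permutation that never moves a column across ranks. `BlockCyclic1D` and `extract_local_permutation` are hypothetical names, not DLA-Future types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1D block-cyclic layout descriptor (NOT a DLA-Future type).
// Assumes the first tile lives on rank 0.
struct BlockCyclic1D {
  std::size_t block_size;  // elements per tile
  std::size_t nranks;      // number of ranks along this dimension
  std::size_t rank;        // coordinate of the calling rank

  // True if the global index is stored on this rank.
  bool is_local(std::size_t global) const {
    return (global / block_size) % nranks == rank;
  }

  // Convert a global index owned by this rank to its local index.
  std::size_t to_local(std::size_t global) const {
    assert(is_local(global));
    const std::size_t local_tile = (global / block_size) / nranks;
    return local_tile * block_size + global % block_size;
  }
};

// perm_global[i] == j means "global destination i takes global source j".
// Returns the same permutation restricted to this rank, in local indices.
// Assumption: source and destination of every entry are on the same rank.
std::vector<std::size_t> extract_local_permutation(
    const std::vector<std::size_t>& perm_global, const BlockCyclic1D& dist) {
  std::vector<std::size_t> perm_local;
  for (std::size_t i = 0; i < perm_global.size(); ++i) {
    if (!dist.is_local(i))
      continue;  // destination belongs to another rank
    assert(dist.is_local(perm_global[i]));
    // Local destinations are visited in increasing local order,
    // so push_back places each entry at the right local position.
    perm_local.push_back(dist.to_local(perm_global[i]));
  }
  return perm_local;
}
```

The resulting array contains only local indices, so it could be handed to a purely local permutation (Option 1) or to one operating on an object that hides the distribution (Option 2).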

Permutation on GPU: currently it is implemented by passing a "simplified" distribution (pointer + horizontal and vertical distance between tiles)

Since we are going to support "randomly" placed allocations:
- (preferred) Option 1: send a vector of pointers, where each element is the beginning of a tile
- Option 2: force the layout on the matrix used
- @rasolca does not like how the position is currently computed
  - It is going to be implemented differently (currently a CUDA thread works on a single element)
  - cudaMemcpy is not an alternative, since it would require too many small copy calls
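To make Option 1 concrete, here is a host-side sketch of addressing elements through a vector of tile pointers instead of a base pointer plus fixed tile distances. It is not the existing kernel: `TiledView`, `permute_element` and `permute_columns` are hypothetical names, square tiles with a common leading dimension are assumed, and the body of `permute_element` only mirrors what a single CUDA thread working on one element would do.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view over a tiled matrix (NOT a DLA-Future type).
// Tile (ti, tj) starts at tiles[ti + tj * tile_rows]; tiles may live
// anywhere in memory. Each tile is column-major.
struct TiledView {
  std::vector<double*> tiles;  // one pointer per tile
  std::size_t tile_rows;       // number of tile rows
  std::size_t nb;              // tile size (square tiles assumed)
  std::size_t ld;              // leading dimension, assumed equal for all tiles

  // Element access by (row, column) index within the viewed matrix.
  double& at(std::size_t i, std::size_t j) const {
    double* tile = tiles[(i / nb) + (j / nb) * tile_rows];
    return tile[(i % nb) + (j % nb) * ld];
  }
};

// Work for one element: out(i, j) = in(i, perm[j]).
// On GPU this would correspond to one thread.
void permute_element(const TiledView& in, const TiledView& out,
                     const std::vector<std::size_t>& perm,
                     std::size_t i, std::size_t j) {
  out.at(i, j) = in.at(i, perm[j]);
}

// Column permutation of an m x n matrix; the loops stand in for the grid.
void permute_columns(const TiledView& in, const TiledView& out,
                     const std::vector<std::size_t>& perm,
                     std::size_t m, std::size_t n) {
  assert(perm.size() == n);
  for (std::size_t j = 0; j < n; ++j)
    for (std::size_t i = 0; i < m; ++i)
      permute_element(in, out, perm, i, j);
}
```

With this kind of lookup no assumption is made about the distance between tiles, which is what would allow "randomly" placed allocations; the cost is one extra pointer load per element.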

Nov 30 '23 17:11 albestro
