Pratik Nayak
@klausbu, if I understand it correctly, you have one fp64 problem which you want to solve by splitting the work between the CPU and the GPU. What I meant previously was that...
If you set the residual tolerance to 1e-20, then the solver will try to reduce the error to 1e-20 or below. For CG, methods 1. and...
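To illustrate the stopping behaviour described in this comment, here is a minimal sketch of a conjugate gradient loop with an absolute residual tolerance. It is plain C++ and not Ginkgo's API: `cg_solve`, `CgResult`, `matvec`, `dot`, and the dense row-major storage are hypothetical names and choices made only for this example. The loop stops as soon as the residual norm is at or below `tol`, so the value reported at exit can be smaller than the tolerance that was requested.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Result of the hypothetical cg_solve helper below.
struct CgResult {
    std::vector<double> x;  // approximate solution
    double residual_norm;   // recurrence-based ||r||_2 at exit
    int iterations;         // number of iterations performed
};

// y = A * v for a dense, row-major n x n matrix (illustration only).
std::vector<double> matvec(const std::vector<double>& a,
                           const std::vector<double>& v)
{
    const std::size_t n = v.size();
    std::vector<double> y(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            y[i] += a[i * n + j] * v[j];
        }
    }
    return y;
}

// Euclidean inner product of two vectors of equal length.
double dot(const std::vector<double>& u, const std::vector<double>& v)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i) {
        sum += u[i] * v[i];
    }
    return sum;
}

// Unpreconditioned CG for a symmetric positive definite matrix.
// Iterates while ||r||_2 > tol and fewer than max_iters steps were taken.
CgResult cg_solve(const std::vector<double>& a, const std::vector<double>& b,
                  double tol, int max_iters)
{
    const std::size_t n = b.size();
    assert(a.size() == n * n);

    std::vector<double> x(n, 0.0);
    std::vector<double> r = b;  // r = b - A * 0
    std::vector<double> p = r;
    double rr = dot(r, r);
    int it = 0;

    while (it < max_iters && std::sqrt(rr) > tol) {
        const std::vector<double> ap = matvec(a, p);
        const double alpha = rr / dot(p, ap);
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * ap[i];
        }
        const double rr_new = dot(r, r);
        const double beta = rr_new / rr;
        for (std::size_t i = 0; i < n; ++i) {
            p[i] = r[i] + beta * p[i];
        }
        rr = rr_new;
        ++it;
    }
    return {x, std::sqrt(rr), it};
}
```

A call such as `cg_solve({4.0, 1.0, 1.0, 3.0}, {1.0, 2.0}, 1e-10, 10)` returns the iterate together with the residual norm at exit. In fp64 a tolerance like 1e-20 may never be reached on a realistically scaled problem, in which case this sketch simply stops at `max_iters`.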
Another approach we could always take is to add some macros to reduce the duplication and generate some of the boilerplate code that most kernel tests have. It would probably not...
This is definitely a good idea. I think we already had some effort from @tcojean using PAPI with software-defined events. Additionally, there are a few libraries that can abstract...
I agree that there is a semantic difference between Dense and Multivector. While we currently use these interchangeably, we definitely should not, and there are implications, for example,...
I would prefer 2, but instead of deprecating `Preconditioner::Ilu`, I think it may be better to deprecate `factorization::Ilu`. Everything could be the same. You could have...
@upsj What do you think of two separate classes, one for distributed and one for single GPU, differing in their namespaces: `gko::preconditioner::Schwarz` and `gko::distributed::preconditioner::Schwarz`?
For the different algorithms, I think ARPACK is a pretty good source. The original library is in FORTRAN, but there is a beta version, [ARPACK++](https://www.ime.unicamp.br/~chico/arpack++/), written in templated C++.