Thomas Baumann
@pancetta, what the hell were you going from when you implemented this MPI version? The stuff in question was removed by you in the non-MPI version in 2018, see...
> Relax. "might" means that these things are also called during MLSDC sweeps, but there the `u0` is not changed. PFASST however changes the `u0` during its iterations and that's...
I don't really know what's going on. I just took the current version and MPI-ied it. The tests check that MPI and non-MPI versions produce the same results. Let me...
I was experimenting with CUDA graphs and it seems to work really well as the following plots show: I record separate graphs for `forward` and `backward` operations for...
I want to comment a bit more on Alltoallw, since I anticipate a comment on this. To quote the [paper](https://www.sciencedirect.com/science/article/abs/pii/S074373151830306X) you wrote to go along with mpi4py-fft: > MPI_ALLTOALL(V) works on...
By the way, I talked to a member of the MPI Forum and developer of OpenMPI at a conference and showed him the plots with poor performance of Alltoallw. He...
It seems there have been some developments in testing open source code on GPUs, see [here](https://quansight.com/post/building-a-gpu-ci-service-for-conda-forge/). If I understand correctly, you could apply for this program to have the code...
What's the status of this? Support for subclassing has since been added to cupy (see the issue mentioned above) and cupy has FFTs as well. I am just starting to...
I started out with CuPyfying the DistArray class. There were a few issues, but I think so far nothing catastrophic. It is not as simple as replacing `np` with `cp`...
I made some progress, but there is still plenty of room for improvement and I was hoping you had some suggestions. First of all, I made some scaling tests with...