Liam Adams
Also, it looks like work on a couple of these has already been started in PR #925
Following @tgymnich's suggestion, I've had a look at FMS, and it also uses MPI_MIN and MPI_MAX reductions, though these are only used with MPI_Allreduce.
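For reference, here is a minimal sketch of the MPI_Allreduce usage pattern being discussed. This is illustrative only, not code taken from FMS; the variable names and values are hypothetical:

```c
/* Illustrative sketch of MPI_MIN / MPI_MAX reductions via MPI_Allreduce:
 * every rank contributes a local value and every rank receives the
 * global minimum and maximum. Requires an MPI implementation and
 * launching with mpirun/mpiexec. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each rank contributes a local value (here just its rank id). */
    double local = (double)rank;
    double global_min, global_max;

    /* All ranks receive the reduced results. */
    MPI_Allreduce(&local, &global_min, 1, MPI_DOUBLE, MPI_MIN, MPI_COMM_WORLD);
    MPI_Allreduce(&local, &global_max, 1, MPI_DOUBLE, MPI_MAX, MPI_COMM_WORLD);

    if (rank == 0)
        printf("min = %g, max = %g\n", global_min, global_max);

    MPI_Finalize();
    return 0;
}
```

Any GPU-offloading support would presumably need to cover this same pattern (device buffers passed to the reduction), in addition to the MPI_SUM cases.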
> Perhaps this PR is adding too many ingredients at once. Maybe we could just focus on the gpu-offloading first and add the 'multiply-add' later?
>
> In that respect...
That sounds like a great plan, and thanks for taking a look so far! I'll work on submitting those PRs.
I've submitted PRs for the first two items (https://github.com/ecmwf/atlas/pull/240 and https://github.com/ecmwf/atlas/pull/241). I'll submit the remaining two PRs as their dependencies pass review and get merged in.
Hi @wdeconinck, yes that's right. I have a branch (on a fork) that contains an implementation of that adaptation. I just haven't opened a PR because priorities changed...
Hi @wdeconinck, I'm wondering what the current status of this PR is. The reason I ask is that I've tried to build and run it locally with a GPU...
> Hi @l90lpa I have just tested this with NVHPC 22.11 and saw no issues like that.
>
> My loaded modules:
>
> 1. cmake/3.28.3
> 2. prgenv/nvidia
> 3. gcc/11.2.0...
Hi @wdeconinck, thanks again for sharing your build environment. I was able to get Atlas+ecTrans working using NVHPC 22.11. However, I've been having trouble building some of our code (and...