Arturo Vargas
PAConvectionApply3D/T uses quite a bit of CUDA global memory. This is because of the large number of thread-local arrays that do not fit in register space. PR https://github.com/mfem/mfem/pull/3054...
Folks, it looks like we rebuild `R_transpose = new TransposeOperator(R);` on every GetRestrictionOperator() call, as shown here: https://github.com/mfem/mfem/blob/23cd6a78177e0bc17eef26da3f74e2c08ec2d9f1/fem/pfespace.cpp#L1211-L1213 Is this right, and could it be a memory leak?
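To make the concern concrete, here is a minimal sketch of one way to avoid rebuilding a wrapper on every call: build it lazily once and let a `std::unique_ptr` own it. All names below (`Operator`, `TransposeWrapper`, `Space`, `GetTranspose`, `builds`) are hypothetical stand-ins for illustration and are not MFEM's classes or its actual fix.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for illustration only; not MFEM's classes.
struct Operator {
    virtual ~Operator() = default;
};

// Wraps another operator by reference, as a transpose wrapper might.
struct TransposeWrapper : Operator {
    explicit TransposeWrapper(const Operator &op) : base(op) {}
    const Operator &base;
};

class Space {
public:
    // Builds the wrapper on the first call and reuses it afterwards,
    // so repeated calls neither reallocate nor orphan a previous object.
    const Operator *GetTranspose() {
        if (!transpose_) {
            transpose_ = std::make_unique<TransposeWrapper>(restriction_);
            ++builds_;
        }
        assert(transpose_ != nullptr);
        return transpose_.get();
    }

    // Number of times the wrapper was actually constructed.
    int builds() const { return builds_; }

private:
    Operator restriction_;
    std::unique_ptr<Operator> transpose_;  // owning; freed with the Space
    int builds_ = 0;
};
```

Calling `GetTranspose()` twice on the same `Space` returns the same pointer and constructs the wrapper only once. By contrast, assigning a fresh `new` result to a raw pointer on every call would leak the earlier object unless it is deleted first.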
Hi all, it would be nice to have general support for AddMult for the different assembly types in bilinearform: https://github.com/mfem/mfem/blob/baf9090c887e08058f0e93c01274b4cb4e069a19/fem/bilinearform.hpp#L273-L275
Building on user feedback, this is a first pass at introducing run-time policy selection to forall. See the example run-time-forall.cpp. Developers can provide any number of execution policies and then select at run...
Introduces the SYCL backend for launch and dynamic shared memory (for all backends). This PR will introduce API changes. - [x] Keep prototype of perfectly nested loop interface in expt.
Various packages have been creating a layer over RAJA which instantiates both host and device policies. Based on a run-time compute-policy the correct forall method is then launched (see example...
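The layering described here can be sketched in plain C++ without RAJA: each policy gets its own compile-time `forall` overload, and a thin wrapper switches on a run-time value to pick one. Everything below (`ComputePolicy`, `seq_exec`, `rev_exec`, `forall_dynamic`) is a hypothetical illustration and not RAJA's API; the reversed loop merely stands in for a second backend such as a device policy.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical run-time selector; not a RAJA type.
enum class ComputePolicy { Sequential, Reversed };

// Hypothetical compile-time policy tags.
struct seq_exec {};
struct rev_exec {};

// One overload per policy, each instantiated at compile time.
template <typename Body>
void forall(seq_exec, std::size_t n, Body &&body) {
    for (std::size_t i = 0; i < n; ++i) { body(i); }
}

template <typename Body>
void forall(rev_exec, std::size_t n, Body &&body) {
    for (std::size_t i = n; i-- > 0;) { body(i); }
}

// Run-time dispatch: all overloads are compiled, one is chosen per call.
template <typename Body>
void forall_dynamic(ComputePolicy policy, std::size_t n, Body &&body) {
    switch (policy) {
        case ComputePolicy::Sequential:
            forall(seq_exec{}, n, body);
            break;
        case ComputePolicy::Reversed:
            forall(rev_exec{}, n, body);
            break;
        default:
            assert(false && "unknown policy");
    }
}
```

A caller would pass the policy as an ordinary value, for example `forall_dynamic(ComputePolicy::Sequential, n, [&](std::size_t i) { /* loop body */ });`. The trade-off is that the loop body is compiled once per listed policy, which increases build time and binary size.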
Currently, RAJA::ListSegments are not supported within RAJA Teams. This is because a ListSegment is captured by value within the kernel execution space, triggering CUDA mallocs and frees (on the...
The RAJA Teams functional test can be based on the Kernel functional test.
As observed in RAJA plugins, passing a type-erased Resource into RAJA adds overhead, so we would like to avoid doing so when using single policies.