Opt Backend Assembly

Open jeremylt opened this issue 2 years ago • 0 comments

The /cpu/self/opt/* backends should implement their own version of diagonal/full assembly that assembles element by element. Most of the pieces already exist in the code, but they are spread out.

Current:

Assemble QFunction
for (elem in l-vec) Assemble Operator element

New:

for (elem in l-vec) {
  Assemble QFunction element
  Assemble Operator element
}

This is very similar to our approach with the operator application, except we would probably want to keep the block size set to 1 for simplicity. Then we can set /cpu/self/opt/serial as the operator fallback for /cpu/self/opt/blocked.

This would hopefully significantly decrease the assembly memory footprint (and speed things up) for the Opt, AVX, and XSMM backends.
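
To make the memory argument concrete, here is a minimal, self-contained sketch of the two loop structures for a toy 1D mass operator with linear elements. It does not use the libCEED API; every name in it (`basis_init`, `assemble_qf_elem`, `assemble_op_elem`, `assemble_diag_two_pass`, `assemble_diag_fused`) is hypothetical and only stands in for the corresponding step in the pseudocode above. Each assembly routine returns the number of scratch doubles it needed for quadrature data.

```c
#include <stddef.h>

#define NUM_ELEM 4 /* elements in the toy 1D mesh (assumption) */
#define P 2        /* nodes per element (linear basis) */
#define Q 2        /* quadrature points per element */

/* Fill B (Q x P, row-major) with linear basis values at the 2-point Gauss nodes. */
void basis_init(double *B, double *w) {
  const double g = 0.5773502691896258; /* 1/sqrt(3) */
  const double x[Q] = {-g, g};
  for (int q = 0; q < Q; q++) {
    B[q * P + 0] = 0.5 * (1.0 - x[q]);
    B[q * P + 1] = 0.5 * (1.0 + x[q]);
    w[q] = 1.0;
  }
}

/* Stand-in for "Assemble QFunction element": mass-operator quadrature data. */
static void assemble_qf_elem(const double *w, double h, double *qdata) {
  for (int q = 0; q < Q; q++) qdata[q] = w[q] * 0.5 * h;
}

/* Stand-in for "Assemble Operator element": add diag(B^T D B) into the l-vec. */
static void assemble_op_elem(const double *B, const double *qdata, int e, double *diag) {
  for (int i = 0; i < P; i++) {
    double v = 0.0;
    for (int q = 0; q < Q; q++) v += B[q * P + i] * qdata[q] * B[q * P + i];
    diag[e + i] += v; /* element e owns nodes e and e + 1 */
  }
}

/* Current structure: assemble all quadrature data first, then loop over elements. */
size_t assemble_diag_two_pass(const double *B, const double *w, double h, double *diag) {
  double qdata[NUM_ELEM * Q]; /* scratch grows with the number of elements */
  for (int i = 0; i <= NUM_ELEM; i++) diag[i] = 0.0;
  for (int e = 0; e < NUM_ELEM; e++) assemble_qf_elem(w, h, &qdata[e * Q]);
  for (int e = 0; e < NUM_ELEM; e++) assemble_op_elem(B, &qdata[e * Q], e, diag);
  return sizeof(qdata) / sizeof(qdata[0]);
}

/* Proposed structure: both steps inside a single element loop. */
size_t assemble_diag_fused(const double *B, const double *w, double h, double *diag) {
  double qdata[Q]; /* scratch for one element only, reused every iteration */
  for (int i = 0; i <= NUM_ELEM; i++) diag[i] = 0.0;
  for (int e = 0; e < NUM_ELEM; e++) {
    assemble_qf_elem(w, h, qdata);
    assemble_op_elem(B, qdata, e, diag);
  }
  return sizeof(qdata) / sizeof(qdata[0]);
}
```

Calling both routines with the same `B`, `w`, and element size `h` and a `diag` array of length `NUM_ELEM + 1` should give the same diagonal, while the scratch size drops from `NUM_ELEM * Q` doubles to `Q` doubles. In a real backend the saving would apply to the assembled QFunction data, which is the buffer this issue proposes to stop materializing for the whole l-vec at once.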

jeremylt, Oct 10 '23 15:10