cross-loop blocking in staggered tti and *elastic propagators

Open FabioLuporini opened this issue 5 years ago • 1 comments

This should improve runtime performance of TTI staggered.

The basic infrastructure is already in the codebase (used, for example, by the CIRE algorithm), but it currently doesn't support the case in which all of the involved loop nests write to user-provided data (in the typical CIRE use case, all but one of the loop nests write to DSE-generated temporary Arrays).

FabioLuporini avatar Jul 24 '18 09:07 FabioLuporini

the description in the original message is now obsolete, but the (performance) issue is still there:

for x
  for y
    ...
for x
  for y
    ...

there could be reuse across these loops, and we're currently dropping it on the floor

some tiling technique could/should be used instead
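
A rough sketch of what reuse across two such loop nests could look like, in plain Python rather than generated code. Everything here is an assumption made for illustration: the function names `two_nests` and `cross_loop_blocked`, the block sizes `bx`/`by`, and the toy computation (the second nest reads the first nest's output at `x-1` and `x+1`). It uses overlapped tiles with a recomputed one-point halo, which is only one possible tiling technique and not necessarily what Devito does or should do.

```python
def two_nests(u):
    """Baseline: two separate full sweeps, as in the pseudocode above."""
    nx, ny = len(u), len(u[0])
    a = [[0.0] * ny for _ in range(nx)]
    b = [[0.0] * ny for _ in range(nx)]
    # First loop nest: writes `a` everywhere.
    for x in range(nx):
        for y in range(ny):
            a[x][y] = 2.0 * u[x][y]
    # Second loop nest: reads `a` at x-1 and x+1 (interior points only).
    for x in range(1, nx - 1):
        for y in range(ny):
            b[x][y] = a[x - 1][y] + a[x + 1][y]
    return a, b


def cross_loop_blocked(u, bx=4, by=4):
    """Same result, but both nests run tile by tile (hypothetical sketch)."""
    nx, ny = len(u), len(u[0])
    a = [[0.0] * ny for _ in range(nx)]
    b = [[0.0] * ny for _ in range(nx)]
    for x0 in range(0, nx, bx):
        for y0 in range(0, ny, by):
            x1, y1 = min(x0 + bx, nx), min(y0 + by, ny)
            # First nest on the tile plus a one-point halo in x.
            # Halo points are recomputed by neighbouring tiles; the
            # writes are idempotent, so the redundancy is harmless.
            for x in range(max(x0 - 1, 0), min(x1 + 1, nx)):
                for y in range(y0, y1):
                    a[x][y] = 2.0 * u[x][y]
            # Second nest on the tile itself, while `a` is still hot.
            for x in range(max(x0, 1), min(x1, nx - 1)):
                for y in range(y0, y1):
                    b[x][y] = a[x - 1][y] + a[x + 1][y]
    return a, b
```

The trade-off in this sketch is redundant work on the halo in exchange for consuming `a` shortly after it is produced; whether that pays off depends on the stencil radius and block shape.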

FabioLuporini avatar Apr 27 '20 08:04 FabioLuporini

closing as nonsensical in retrospect

elastic requires skewing; tti-staggered works just like any other code with cross-derivatives assigned to temps by CIRE

FabioLuporini avatar Oct 31 '23 08:10 FabioLuporini