CarpetX icon indicating copy to clipboard operation
CarpetX copied to clipboard

CarpetX: Added interpolation caches: Stencils and Particles

Open lucass-carneiro opened this issue 1 month ago • 7 comments

Overview

This PR provides two optimizations for CarpetX's interpolator, based on caching:

  1. Compute interpolation stencil positions and offsets for all particles. This information can be reused across multiple variables.
  2. Caches AMReX particle containers used in interpolation. When interpolation targets don't change (i.e., when the grid does not change), these containers can be reused. This skips the expensive operations of creating and distributing AMReX particles across domain.

I expect this optimization to be important for multipatch runs that don't change their grids very often (or at all)

Benchmarks

These are Cactus timer based benchmarks that measure Multipatch1_Interpolate, which indirectly calls CarpetX_Interpolate.

The benchmark used runs for 4 BSSN iterations with 7 levels of mesh refinement and no regrinding. It uses 8 MPI processes and 5 OpenMP threads per process.

Cactus timer results for benchmark before optimization.

======================================================================
    %    Time/s   Min/s   Max/s   Timer (gettimeofday)
======================================================================
100.0   170.931 170.930 170.931   CCTK total time
. . .
48.6    83.122  82.376  84.132   Sync
. . .
6.3    10.854  10.233  11.643   CapyrX::MultiPatch1_Interpolate

Cactus timer results for benchmark after optimization

======================================================================
    %    Time/s   Min/s   Max/s   Timer (gettimeofday)
======================================================================
100.0   164.690 164.690 164.690   CCTK total time
. . .
46.6    76.758  76.050  77.963   Sync
. . .
4.5     5.685   5.068   6.478   CapyrX::MultiPatch1_Interpolate

Speedups using avg. time (old / new):

Total Runtime: 3.8%
Sync         : 8.3%
Interpolation: 90.9%

lucass-carneiro avatar Nov 11 '25 08:11 lucass-carneiro

Ticket is here

lucass-carneiro avatar Nov 11 '25 08:11 lucass-carneiro

I like this idea. We did the same in Carpet.

Instead of using an automatic cache I recommend splitting the interpolation function into two. The first only sets up things, without interpolating anything, and returns an "interpolation setup". The second function would take an interpolation setup as input and perform the interpolation. In this way there is no automatic mechanism and no checking needed.

In Carpet we also have a "world age" which increases whenever the grid setup changes. The world age is stored in the interpolation setup. When the world age does not match then the interpolation setup is recreated automatically.

eschnett avatar Nov 12 '25 16:11 eschnett

I'm failing to see how an InterpolationSetup data structure that holds things that the interpolator needs and has an age parameter that automatically recreates the setup when needed is any different than an automatic cache that holds the AMReX particles and recreates/redistributes them when necessary? It seems to me that these are achieving the same thing in a conceptual level, only with different names?

Once I understand the difference I can begin working on that. It will probably be significantly different than the current status of this PR though (I think I will have to basically start from scratch), so maybe change it to a draft?

lucass-carneiro avatar Nov 12 '25 19:11 lucass-carneiro

Maybe see: https://bitbucket.org/einsteintoolkit/tickets/issues/2898/carpetx-interpolator-caches?iframe=true&spa=0#comment-69249901

rhaas80 avatar Nov 12 '25 19:11 rhaas80

The main difference is that an interpolation setup can be used right away by the interpolator, it does not need to search the cache, and does not need to ensure that the cache doesn't overflow.

eschnett avatar Nov 12 '25 20:11 eschnett

Ok, so I hadn't seen Roland's bitbucket comment, sorry about that. I need to look at CarpetInterp2 more thoroughly to try and understand what it is doing exactly.

But I think it would be useful (for my understanding) if I could get a higher level understanding for what you would like to have so I can plan how the architecture of this would work.

  1. CarpetX has a global variable called epoch. It starts at 0. It automatically tracks which epoch it is in by increasing the epoch value whenever it has to (grid changes or something else).
  2. There are global data structures (say particle_cache and interp_stencil_cache) that hold interpolation data. These structures hold the properly re-distributed AMReX particle data for interpolation targets and interpolation stencil. They also hold which epoch this interpolation data is valid for. One could, prob. make them a single structure (say interp_data)
  3. There's a new function (say, interp_setup) that accesses interp_data and reads its stored epoch. It then compares it against CarpetX's current epoch. If they differ this function recreates the necessary data and updates the global data store.
  4. CarpetX_Interpolate would now call interp_setup, access the stored interpolation data and perform the interpolation using that. No changes to the CarpetX_Interpolate API would then be necessary

If the above is correct, then I think I would know how to get started. But I would also say that this is still a cache and is still automatic, in some sense, even though I understand why it is different than what I did and how simpler the epoch way would potentially be than what I came up with originally

lucass-carneiro avatar Nov 12 '25 20:11 lucass-carneiro

Converted to draft while I rework the implementation

lucass-carneiro avatar Nov 13 '25 17:11 lucass-carneiro

Based on feedback, I've significantly changed this PR. I think the caching change is complicated and we ought to do it in phases.

In this PR, I'm implementing the simplest change: I've split CarpetX_Interpolate into two:

  1. Relevant interpolation data is gathered and saved in InterpolationSetup. Currently, I'm saving only the AMReX particles, but saving more stuff in the future (such as the interpolation stencils themselves) should be easy to do
  2. InterpolateFromSetup is called with an interpolation setup and result pointers and the interpolation is performed.

Currently, I've changed CarpetX_Interpolate to use this mechanism internally. This makes sure that no external user of the CarpetX interpolation routines will note any changes unless they opt in to use the new API.

lucass-carneiro avatar Nov 29 '25 09:11 lucass-carneiro