Provide fast splint in OneDimCubicSpline for PBC Coulomb Potentials
Since commit 1bc275373377871b921de37a661ac98ad5197340, splint calls in evaluateSR in PBC Coulomb potentials use cubic spline interpolation. This leads to a substantial slowdown, 6% for NiO-a512 calculations, on SPR on Aurora.
diff --git a/src/Numerics/OneDimCubicSpline.h b/src/Numerics/OneDimCubicSpline.h
index f58245e12..004d7b0cc 100644
--- a/src/Numerics/OneDimCubicSpline.h
+++ b/src/Numerics/OneDimCubicSpline.h
- return
- m_grid->cubicInterpolateFirst(m_Y[Loc],m_Y[Loc+1],m_Y1[Loc],m_Y1[Loc+1]);
Additional context
Since the potential grid spacing is small, the cubic spline is not necessary for the accuracy. I understand the change was made to handle a wide range of grids and forces. A possible solution is to add splint_fast.
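To make the splint_fast idea concrete, here is a minimal sketch of what a linear-interpolation fast path could look like on a uniform grid. It is not the QMCPACK implementation: `UniformGridTable`, `make_table`, and the test function exp(-r)/r are hypothetical names and choices for illustration only, and the real OneDimCubicSpline grid handling may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tabulated 1D function on a uniform grid.
// This is NOT the QMCPACK OneDimCubicSpline class.
struct UniformGridTable
{
  double r0;             // first grid point
  double delta;          // grid spacing
  std::vector<double> y; // tabulated values y[i] = f(r0 + i * delta)

  // Linear interpolation between the two bracketing grid points.
  // Assumes r0 <= r <= r0 + (y.size() - 1) * delta and y.size() >= 2.
  double splint_fast(double r) const
  {
    const double x  = (r - r0) / delta;
    std::size_t loc = static_cast<std::size_t>(x);
    if (loc + 1 >= y.size())
      loc = y.size() - 2; // clamp so that r at the last grid point stays valid
    const double t = x - static_cast<double>(loc);
    return y[loc] + t * (y[loc + 1] - y[loc]);
  }
};

// Tabulate f(r) = exp(-r) / r on [r_first, r_last] with n points.
// The function is an arbitrary smooth test case chosen for this sketch.
inline UniformGridTable make_table(double r_first, double r_last, std::size_t n)
{
  UniformGridTable t{r_first, (r_last - r_first) / static_cast<double>(n - 1), std::vector<double>(n)};
  for (std::size_t i = 0; i < n; ++i)
  {
    const double r = r_first + static_cast<double>(i) * t.delta;
    t.y[i]         = std::exp(-r) / r;
  }
  return t;
}
```

With a spacing of 0.001, for example `make_table(1.0, 5.0, 4001)`, the linear interpolation error scales with the square of the spacing, which is the sense in which a fine grid may not need the cubic terms. Whether that holds for the actual potential grids and for forces would have to be checked.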
Hi Jeongnim, thanks for the report. Can you please provide a bit more info so that we can gauge the importance?
- 6% slowdown on Intel SPR relative to what? Which versions of the software did you compare?
- Do other Intel platforms show similar issues, or is there some SPR quirk we should be aware of?
- How did you measure the 6%? Inbuilt timers, profiler?
Comments:
- 6% is a surprisingly large figure. It suggests that the spline code itself is actually inefficient: not well optimized, doing redundant initialization, or hitting some other performance bug. The difference in FLOP count between linear interpolation (which I think you are asking for) and cubic splines should not be that large. So perhaps we should improve the spline implementation?
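
For reference, a rough side-by-side of the per-point arithmetic, using the textbook cubic-spline evaluation from node values and second derivatives. This is a sketch under assumptions: `eval_linear` and `eval_cubic` are hypothetical helpers, and the formula is the standard one, not necessarily what OneDimCubicSpline does internally.

```cpp
#include <cassert>
#include <cmath>

// Linear interpolation on one interval; t in [0, 1] is the fractional position.
inline double eval_linear(double y_lo, double y_hi, double t)
{
  return y_lo + t * (y_hi - y_lo); // 1 subtract, 1 multiply, 1 add
}

// Textbook cubic-spline evaluation on one interval of width h.
// y2_lo and y2_hi are the second derivatives at the two end points.
inline double eval_cubic(double y_lo, double y_hi, double y2_lo, double y2_hi, double h, double t)
{
  const double a    = 1.0 - t;
  const double b    = t;
  const double h2_6 = h * h / 6.0; // could be precomputed once for a uniform grid
  return a * y_lo + b * y_hi + ((a * a * a - a) * y2_lo + (b * b * b - b) * y2_hi) * h2_6;
}
```

For f(x) = x^3 on [1, 2], calling `eval_cubic(1.0, 8.0, 6.0, 12.0, 1.0, 0.5)` reproduces 1.5^3 exactly, while `eval_linear(1.0, 8.0, 0.5)` gives 4.5. The cubic form costs roughly a dozen extra floating-point operations and two extra loads per point, so if the measured gap is much larger than that would suggest, the overhead is probably somewhere else (grid lookup, branching, memory access).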