WarpX
Apparent particle density spikes with SP particles, DP fields
Configured with -DWarpX_PRECISION=DOUBLE -DAMReX_PARTICLES_PRECISION=SINGLE, we're seeing some very anomalous reported particle densities at particular coordinates compared to the default double precision:
Those are at z=325, 650, 1300
The domain is 0.0025^2, with nx=nz=1664. The domain starts with zero offsets, no moving window.
With those parameters, those coordinates are (some of the) points at which the single and double precision representations of iz*dz are exactly equal. We observe similar spikes across the x axis at positions ±650 cells from the origin.
Here's the bit of code I wrote to recognize the representation coincidence:
#include <iostream>
#include <cstdio>
using namespace std;

int main(int argc, char** argv)
{
    // Cell size in double precision, and the same value rounded to single precision
    double dz = 0.0025/1664.0;
    float sdz = dz;
    double diff_dz = dz - sdz;
    printf("Double %a, Mixed %a, Diff %a %e\n\n", dz, sdz, diff_dz, diff_dz);
    printf("Index\tDouble\tMixed\tDiff\n");
    // Compare the cell position i*dz computed in double and in single precision
    for (int i = 0; i < 1664; ++i) {
        printf("%d\t%e\t%e\t%e\n", i, i*dz, i*sdz, i*dz - i*sdz);
    }
    return 0;
}
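The full table printed by that program is long, so here is a minimal sketch of the same comparison that keeps only the indices where the two products agree exactly. The helper name `coincident_indices` and its parameters are made up for illustration and are not part of WarpX or AMReX.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not WarpX/AMReX code): returns every cell index i in
// [0, n) for which i*dz computed in double equals i*dz computed in float,
// where dz = length / n.
std::vector<int> coincident_indices(double length, int n)
{
    assert(n > 0);
    const double dz = length / n;
    const float sdz = static_cast<float>(dz);            // cell size rounded to single precision
    std::vector<int> hits;
    for (int i = 0; i < n; ++i) {
        const double dp = i * dz;                         // double-precision product
        const float sp = static_cast<float>(i) * sdz;     // single-precision product
        if (dp == static_cast<double>(sp)) {
            hits.push_back(i);
        }
    }
    return hits;
}
```

For the run described here the call would be `coincident_indices(0.0025, 1664)`. Index 0 is always returned, since both products are exactly zero there.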
When I wrote that, I was expecting those to be points at which the representation difference was greatest, rather than zero.
To be more precise, 325 is the first such point with identical SP/DP values, and then 2*325 and 4*325, but we didn't observe errors at 3*325=975 or 5*325=1625, which also have identical SP/DP values.
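One arithmetic observation that may or may not be relevant: 1664 = 128 * 13 and 325 = 25 * 13, so in exact arithmetic 325 * 0.0025 / 1664 = 0.0625 / 128 = 2^-11. The five coincident positions are therefore k * 2^-11 for k = 1..5, and only k = 1, 2, 4 are powers of two, which is where the spacing between adjacent single-precision values doubles. The sketch below checks that property with standard library calls; the function names are made up for illustration, and nothing in the thread establishes that this is the cause of the spikes.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Gap between x and the next representable float above it.
float gap_above(float x)
{
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}

// Gap between x and the next representable float below it.
float gap_below(float x)
{
    return x - std::nextafter(x, -std::numeric_limits<float>::infinity());
}

// True when the float grid is twice as coarse just above x as just below it,
// which is the case at powers of two in the normal range.
bool spacing_changes_at(float x)
{
    assert(x > 0.0f);
    return gap_above(x) == 2.0f * gap_below(x);
}

// Exact-arithmetic position of cell k*325 on the 0.0025 / 1664 grid: k * 2^-11.
// (Assumes the idealized value, not the rounded product from the run.)
float ideal_position(int k)
{
    return std::ldexp(static_cast<float>(k), -11);
}
```

Calling `spacing_changes_at(ideal_position(k))` for k = 1..5 separates the positions where spikes were reported (325, 650, 1300) from the two where they were not (975, 1625).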
Ok, here's some zoomed-in scatter plots of the points around z=1300
[Two images: zoomed-in scatter plots of the points around z=1300]
Given that the anomaly extends several points away from z=1300, I doubt that it's purely a diagnostic/analysis artifact. There's a similar but smaller spread around z=325,650 as well.
My suspicion is that charge deposition is misbehaving at these points, and there's a physical concentration that's resulting. I've got a rough sense that the adjacent 'excess' and 'deficit' points should balance out to about the average density difference between the two runs.
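A rough way to test that balance, assuming the two runs' densities have been reduced to 1D per-cell profiles along z: average the SP minus DP difference over a window around a spike and compare it with the average over the whole profile. The helper `mean_difference` and its arguments are hypothetical and not taken from WarpX diagnostics.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: mean of (sp - dp) over the cells
// [center - half_width, center + half_width], clipped to the array bounds.
// sp and dp are assumed to be 1D density profiles from the two runs.
double mean_difference(const std::vector<double>& sp,
                       const std::vector<double>& dp,
                       std::size_t center, std::size_t half_width)
{
    assert(sp.size() == dp.size());
    assert(center < sp.size());
    const std::size_t lo = (center > half_width) ? center - half_width : 0;
    const std::size_t hi = std::min(center + half_width, sp.size() - 1);
    double sum = 0.0;
    for (std::size_t i = lo; i <= hi; ++i) {
        sum += sp[i] - dp[i];   // positive where SP shows an excess, negative for a deficit
    }
    return sum / static_cast<double>(hi - lo + 1);
}
```

If the spike is only a local redistribution of particles, the window average around z=1300 should come out close to the whole-profile average (window centered anywhere, with `half_width` at least the profile length).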
These were both run on CUDA GPU with 1 grid covering the entire domain, so this isn't an issue related to Redistribute()
A note of discussion from @KZhu-ME and @peterscherpelz - the same anomaly appears for electron density as well, but to a lesser extent.
The effect takes some time from the beginning of the run to appear, and so, per @peterscherpelz:
If it's not as prominent for electrons that would be consistent with a gradual effect over time, because the slower movement of the ions implies they wouldn't move away to smooth out the spike as quickly.
@ax3l Any idea how to move forward on this?
May be a consequence of -ffast-math on GPU.
Does the same issue occur with AMReX_CUDA_FASTMATH=OFF?
Thanks; we'll check with fast math off next - it'll probably be next week on our end that we can check it.
Sorry for the late update. Turning fastmath off did not fix the issue unfortunately.
It looks like that changed the position of the spikes, though. What indices are those happening at? Can you see anything interesting about the output of my little analysis program at those positions?
Kevin told me that he did that run in a slightly lower-resolution domain, 1472^2, to get results a bit faster. The spikes consistently occur where the zero differences appear at that resolution, based on my utility above.
Ok, now that #3397 has been addressed, I'm back to looking at this. Here's a more recent plot of the effect
That shows its absence in double precision, and its growth over time