Need help on doing a benchmark run using WarpX
Hi guys, as suggested by @RemiLehe Remi, I am opening an issue here to ask for help with running a benchmark test using WarpX, regarding ExB instabilities that occur in Hall thrusters. The benchmark is described in this paper: 10.1088/1361-6595/ab46c5; I attached the PDF as well: 2D axial-azimuthal particle-in-cell benchmark for low-temperature partially magnetized plasmas.pdf
One special thing I need to do is that, after solving Poisson's equation, the potential needs to be modified according to
phi(x, y) = phi(x, y) - (x/xe) * U_average,
where U_average is the average of phi over y at x = xe. This modification corresponds to Eq. (9) in the paper.
I therefore added a function in the PICMI script:
```python
def phi_fix():
    phi = fields.PhiFPWrapper(0, True)[...]
    phi_ave = np.sum(phi[i_xin][1:nz+1]) / nz
    for i in range(len(phi)-2):
        x = i*dx
        phi[i+1][:] = phi[i+1][:] - x/xin*phi_ave
        # or phi[i+1][1:nz+1] = phi[i+1][1:nz+1] - x/xin*phi_ave
    fields.PhiFPWrapper(0, True)[...] = phi

callbacks.installafterEsolve(phi_fix)
```
Remi mentioned it should be installed with afterEsolve, but I also need to update the E field accordingly in the Python function.
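For reference, since Ex = -dphi/dx (ignoring the beta terms in the solver), subtracting the linear ramp (x/xe)*U_average from phi amounts to adding the constant U_average/xe to Ex everywhere. Below is a minimal NumPy sketch of Eq. (9); the names apply_eq9, i_xe, and xe are illustrative only (xe plays the role of xin in the script), and plain 2D arrays without guard cells are assumed:

```python
import numpy as np

def apply_eq9(phi, Ex, dx, xe, i_xe):
    # Sketch: subtract the linear ramp of Eq. (9) from phi and shift Ex accordingly.
    U_average = phi[i_xe, :].mean()              # azimuthal (y) average of phi at x = xe
    x = dx*np.arange(phi.shape[0])               # axial coordinate of each grid line
    phi = phi - (x/xe)[:, np.newaxis]*U_average  # Eq. (9)
    Ex = Ex + U_average/xe                       # the ramp contributes a constant to Ex = -dphi/dx
    return phi, Ex
```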
Axel @ax3l mentioned that maybe we can add a new callback position between the Poisson solve and the stencil for E.
Therefore, based on what I have in the PICMI script, can we add a new callback position to solve this problem?
This is the full PICMI script: picmi.txt
Currently, I'm trying to use the piece of code below to do the job, but I'm a little confused about the indexing of phi and Ex. The simulation has nx=500, and phi has length 503 along x. I would say it contains guard grid points: nx=500 means 500 cells, i.e. grid points 0, 1, ..., 500, thus 501 grid points, plus two guard grid points, which equals 503. But Ex seems to be longer than 503, so I'm confused.
```python
def phi_fix():
    phi = fields.PhiFPWrapper(0, True)[...]
    Ex = fields.ExFPWrapper(0, True)[...]
    phi_ave = np.mean(phi[i_xin][:])
    for i in range(len(phi)-2):
        x = i*dx
        phi[i+1][:] = phi[i+1][:] - x/xin*phi_ave
    for i in range(len(phi)-2):
        Ex[i+1][2:nz+2] = -(phi[i+2][1:nz+1]-phi[i][1:nz+1])/(dx*2)
    fields.PhiFPWrapper(0, True)[...] = phi
    fields.ExFPWrapper(0, True)[...] = Ex

callbacks.installafterEsolve(phi_fix)
```
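A quick way to see where the extra points in Ex come from is to print the raw shapes returned by the wrappers. This is only a diagnostic sketch, using the same wrapper calls as above:

```python
# Diagnostic sketch: compare the array sizes (including guard cells) of phi and Ex.
phi_g = fields.PhiFPWrapper(0, True)[...]
Ex_g = fields.ExFPWrapper(0, True)[...]
print("phi shape:", phi_g.shape)  # 503 along x here: 501 nodes + 2 guard points
print("Ex shape:", Ex_g.shape)    # longer, since Ex carries more guard cells than phi
```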
Well, because I don't quite know if the above is doing the job correctly (note that I was using a staggered grid, but have switched to a collocated grid now), and the simulation speed drops a lot, I hard-coded the following for now in ElectrostaticSolver.cpp:
```cpp
#elif defined(WARPX_DIM_XZ) || defined(WARPX_DIM_RZ)
    int rank, nprocs, n, nc, ns;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    n = 256/nprocs;
    nc = 0;
    Real phi0 = 0._rt, phis = 0._rt;
    for (int i=rank*n; i<=(rank+1)*n; i++) {
        phi0 = phi0 + phi_arr(480,i,0);
        nc = nc + 1;
    };
    MPI_Allreduce(&phi0, &phis, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    MPI_Allreduce(&nc, &ns, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    phis = phis / double(ns);
    amrex::ParallelFor( tbx, tby, tbz,
        [=] AMREX_GPU_DEVICE (int i, int j, int k) {
            Real f = i*dx[0]/2.4e-2*phis*2.0;
            Ex_arr(i,j,k) +=
                +(beta_x*beta_x-1._rt)*0.5_rt*inv_dx*(phi_arr(i+1,j  ,k)-phi_arr(i-1,j  ,k)-f)
                + beta_x*beta_z       *0.5_rt*inv_dz*(phi_arr(i  ,j+1,k)-phi_arr(i  ,j-1,k)-f);
        },
        [=] AMREX_GPU_DEVICE (int i, int j, int k) {
            Real f = i*dx[0]/2.4e-2*phis*2.0;
            Ey_arr(i,j,k) +=
                +beta_x*beta_y*0.5_rt*inv_dx*(phi_arr(i+1,j  ,k)-phi_arr(i-1,j  ,k)-f)
                +beta_y*beta_z*0.5_rt*inv_dz*(phi_arr(i  ,j+1,k)-phi_arr(i  ,j-1,k)-f);
        },
        [=] AMREX_GPU_DEVICE (int i, int j, int k) {
            Real f = i*dx[0]/2.4e-2*phis*2.0;
            Ez_arr(i,j,k) +=
                + beta_z*beta_x       *0.5_rt*inv_dx*(phi_arr(i+1,j  ,k)-phi_arr(i-1,j  ,k)-f)
                +(beta_z*beta_z-1._rt)*0.5_rt*inv_dz*(phi_arr(i  ,j+1,k)-phi_arr(i  ,j-1,k)-f);
        }
    );
```
Sorry for the late reply! Here are a few answers to your questions:

- Regarding the update of E: as you indicated in your second message, you can simply update E directly in afterEsolve.
- Regarding the number of points of phi and E: the difference is because E typically has more guard cells than phi. This is because phi requires only one guard cell (for the parallel Poisson solver), while E needs enough guard cells for particles with a wide shape factor to gather values from the guard cells (if they are close to a boundary between sub-domains). You can try to pass include_ghosts=False in ExWrapper and PhiWrapper to avoid this problem.
- As a side note, using explicit for loops in Python code like this:

```python
for i in range(len(phi)-2):
    x = i*dx
    phi[i+1][:] = phi[i+1][:] - x/xin*phi_ave
for i in range(len(phi)-2):
    Ex[i+1][2:nz+2] = -(phi[i+2][1:nz+1]-phi[i][1:nz+1])/(dx*2)
```

is very inefficient. It is much more efficient to use numpy array-wise operations, e.g.

```python
x = dx*np.arange(len(phi)-2)
phi[1:-1, :] = phi[1:-1, :] - (x/xin)[:, np.newaxis]*phi_ave
Ex[1:-1, 2:-2] = -( phi[2:, 1:-1] - phi[:-2, 1:-1] )/(2*dx)
```
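Putting those two suggestions together (no guard cells plus array-wise numpy), a minimal version of phi_fix might look like the sketch below. This assumes, as in the earlier snippets, that the wrappers take (level, include_ghosts) as positional arguments, that they accept assignment without guard cells, and that the grid is collocated so phi and Ex share the same nodes; i_xin, xin, and dx are the quantities from the PICMI script.

```python
# Sketch only: vectorized Eq. (9) correction, working on arrays without guard cells.
def phi_fix():
    phi = fields.PhiFPWrapper(0, False)[...]    # shape (nx+1, nz+1), no guard cells
    Ex = fields.ExFPWrapper(0, False)[...]
    phi_ave = np.mean(phi[i_xin, :])            # azimuthal average of phi at x = xe
    x = dx*np.arange(phi.shape[0])
    phi = phi - (x/xin)[:, np.newaxis]*phi_ave  # Eq. (9)
    Ex[1:-1, :] = -(phi[2:, :] - phi[:-2, :])/(2*dx)  # centered difference on interior nodes
    fields.PhiFPWrapper(0, False)[...] = phi
    fields.ExFPWrapper(0, False)[...] = Ex

callbacks.installafterEsolve(phi_fix)
```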
@RemiLehe Hi Remi, thanks for your detailed reply!
I have a few more questions.
1. Within the above-defined function phi_fix(), when I use domain decomposition, each MPI rank will only handle part of the whole phi array, right? But when I print(len(phi)) and print(len(phi[0][:])), I get the same full length on all ranks. Why is that? In other words, does your suggested code still work correctly if I turn on domain decomposition by setting, say, numprocs=[1,2]?
2. My comment above yours, which hard-codes that piece of code in ElectrostaticSolver.cpp: will that do the job I want correctly? And will it work correctly for whatever parallelization is used?
3. Do I need to give correct values to the guard cells?
Well, I guess I made a mistake; this may be better:
```cpp
#elif defined(WARPX_DIM_XZ) || defined(WARPX_DIM_RZ)
    int rank, nprocs, n, nc, ns;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    n = 256/nprocs;
    nc = 0;
    Real phi0 = 0._rt, phis = 0._rt;
    for (int i=rank*n; i<=(rank+1)*n; i++) {
        phi0 = phi0 + phi_arr(480,i,0);
        nc = nc + 1;
    };
    MPI_Allreduce(&phi0, &phis, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    MPI_Allreduce(&nc, &ns, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    phis = phis / double(ns);
    const Box& tb = mfi.tilebox( phi[lev]->ixType().toIntVect() );
    amrex::ParallelFor( tb,
        [=] AMREX_GPU_DEVICE (int i, int j, int k) {
            phi_arr(i,j,k) = phi_arr(i,j,k) - i*dx[0]/2.4e-2*phis;
        } // loop ijk
    );
    amrex::ParallelFor( tbx, tby, tbz,
        [=] AMREX_GPU_DEVICE (int i, int j, int k) {
            Ex_arr(i,j,k) +=
                +(beta_x*beta_x-1._rt)*0.5_rt*inv_dx*(phi_arr(i+1,j  ,k)-phi_arr(i-1,j  ,k))
                + beta_x*beta_z       *0.5_rt*inv_dz*(phi_arr(i  ,j+1,k)-phi_arr(i  ,j-1,k));
        },
        [=] AMREX_GPU_DEVICE (int i, int j, int k) {
            Ey_arr(i,j,k) +=
                +beta_x*beta_y*0.5_rt*inv_dx*(phi_arr(i+1,j  ,k)-phi_arr(i-1,j  ,k))
                +beta_y*beta_z*0.5_rt*inv_dz*(phi_arr(i  ,j+1,k)-phi_arr(i  ,j-1,k));
        },
        [=] AMREX_GPU_DEVICE (int i, int j, int k) {
            Ez_arr(i,j,k) +=
                + beta_z*beta_x       *0.5_rt*inv_dx*(phi_arr(i+1,j  ,k)-phi_arr(i-1,j  ,k))
                +(beta_z*beta_z-1._rt)*0.5_rt*inv_dz*(phi_arr(i  ,j+1,k)-phi_arr(i  ,j-1,k));
        }
    );
```
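Whichever implementation is used (the Python callback or the hard-coded change in ElectrostaticSolver.cpp), one simple sanity check is that, after the correction, the azimuthally averaged potential at x = xe should be numerically zero. Below is a minimal sketch, reusing i_xin and nz from the PICMI script and assuming the installafterstep callback hook:

```python
# Sanity-check sketch: after Eq. (9) is applied, the y-averaged phi at x = xe
# should vanish up to round-off.
def check_phi_fix():
    phi = fields.PhiFPWrapper(0, True)[...]
    phi_ave = np.mean(phi[i_xin][1:nz+1])
    print("y-averaged phi at x = xe:", phi_ave)

callbacks.installafterstep(check_phi_fix)
```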
Hi Yin - For your question 1: when you do phi = fields.PhiFPWrapper(0, True)[...], the phi that is returned is a full-sized array covering the full domain and is broadcast to all processors. This will work fine, but it may not be efficient, since every processor will be doing the same operation on the whole array.
If you want to do the operation locally on each processor, you would need to use the Python equivalent of a loop over MFIters. For example:
```python
phi = fields.PhiFPWrapper()
Ex = fields.ExFPWrapper()
for mfi in phi:
    phiarr = phi.mf.array(mfi)
    Exarr = Ex.mf.array(mfi)
    phiarr += ...
```
I see, thank you so much Dave, I will give it a try!
I will close this issue, because I already have a working script. Thanks for all the help!