stella Bug with electromagnetic stella at low ky

I believe I have found a bug with electromagnetic stella at low kys. I have been doing a linear comparison of stella, GX, GS2, CGYRO for an electromagnetic version of CBC, and I have been seeing STELLA disagreeing at low kys. Specifically, it appears that the eigenfunctions disagree entirely with the other codes (despite agreeing at higher ky values), as can be seen from the plots below. In the final plot, I am plotting the quasilinear weights (~qflux/phi**2), and you can see that STELLA consistently gets the parititioning between electrons and ions incorrect. I also attach the input files for all of the codes used in the comparison at the lowest ky.

input_files.zip

Apr 04 '25 15:04 physycola

On the suggestion of @mrhardman , I tested the various implict vs explicit options available in STELLA. Below, I plot the growthrates from this comparison (the points are as labelled, with "scan_ky_nvgrid_24" corresponding to mirror_implicit = .true. and streaming_implicit = .true.. Not using stream_implicit = .true. gives behaviour more consistent with the other codes (not shown here), except at the lowest ky, where the growthrate is completely off. This is despite the .omega file outputting numerical growthrates that agree with the other codes. The source of the difference seems to be due to the fields, which are also plotted below for the smallest ky. This suggests to be that somehow STELLA is not saving the fields correctly at this ky.

Apr 05 '25 20:04 physycola

I would pick two ky, one which you regard as high (and agreeing with other codes), and one which you regard as "low". For these two ky, I would scan in dt, for runs with and without the various fields. Running with multiple ky should rule out general normalisation issues. I believe (although @GeorgiaActon @HanneThienpondt may need to correct due to the refactoring), that you can switch off apar and bpar in turn with, e.g.,

&physics_flags
  include_apar = .false.
  include_bpar = .false.

I would suspect that bpar, being treated like bpar in GS2, very similar to phi, does not cause any disagreement. I expect you will find that running your preferred code for comparison without apar will show that stella has a problem with the implementation of apar. I suspect this because apar appears through a time derivative in the GKE that stella solves, which complicates the algorithm used to solve for g. I would expect that the fully explicit algorithm works well if dt is small enough. See https://github.com/stellaGK/stella/pull/129#issuecomment-1874010992 for examples that I used to convince myself that some features of the EM stella were working. Note that I could not get the MTM simulation to converge to GS2 with drifts_implicit = .true..

I do not expect there is any problem with stella diagnostics -- this is very likely an algorithmic issue that can only be solved by 1) resolution or 2) changes to the algorithm or bugfixes.

Apr 07 '25 08:04 mrhardman

I generally agree with these comments. It would be interesting to see if stella agrees with the other codes at small dt in both the explicit and implicit versions of the code.

Another thing to try is to set time_upwind to 1.0 and run implicit, decreasing dt to see if that gives better agreement. Georgia found this helped with numerical stability at low ky.

Apr 07 '25 09:04 mabarnes

Thank you both for your comments. I may have to delay drilling down these into these issues for a few days as I am at a conference, but they are the top of my priority list.

From the testing that I have done, I would say that I expect the fully explicit and mirror_implicit = .true. algorithms to agree wth the other codes given a sufficiently small dt (though I must emphasise that said dt seems to be very small). Turning on stream_implicit = .true. seems to render things more stable, but then one obtains the "wrong" answer. I will substantiate these claims with plots when I have had the chance to investigate further.

Apr 07 '25 11:04 physycola

After a lot of playing around, I think I may have some clarity on the minimal ingredients required to observe the discrepancy from the other codes.

To do this, I considered two values of ky from the scan below; ky = 0.141 (second lowest value) and ky = 9.899 (highest value). I would have used the lowest value (ky = 0.071), but the timstep restriction required to ensure that streaming_implicit = .false. remained stable was so large that runs were taking 12+ hours, preventing progress. For each of these ky values, I performed runs corresponding to the "outer product" of varying the number of evolved fields and the various implicit options in the code. In all of the following drifts_implicit = .false. due to the known issues there.

For ky = 9.899, STELLA shows good agreement with GS2 (and CGYRO, where present) irrespective of fields being evolved/implicit options; I show the results for all fields being evolved (but the other cases show similarly good agreement).

However, for ky = 0.141, the agreement is not so good. In particular, we find that whenever the algorithm is run with streaming_implicit = .true. and include_apar = .true. simultaneously, it gives incorrect behaviour when compared to the other codes. With include_apar = .false., it seems to recover the correct behaviour. I've included various figures that hopefully illustrate this below. The colours/markers are shared between the plots.

`include_apar = .true.` and `include_bpar = .true.`

`include_apar = .true.` and `include_bpar = .false.`

`include_apar = .false.` and `include_bpar = .true.`

`include_apar = .false.` and `include_bpar = .false.`

Thus, it appears that the option streaming_implicit = .true. gives incorrect behaviour when include_apar = .true., as @mrhardman suggested might be the case. This is corroborated by the fact that the numerical instability observed for streaming_implicit = .false. at ky = 0.071 (shown in my comment above) was only present for include_apar = .true., suggesting that there is something wrong with the implementation of the apar-dependant part of the parallel-streaming algorithm. The same behaviour, shown above for ky = 0.141, is observed for ky = 0.071 when comparing streaming_implicit = .true. and include_apar = .true. to the results obtained using GS2 (in fact, the difference is larger).

Apr 14 '25 00:04 physycola

Thank you for making this detailed comparison. It also looks like the frequency $\omega$ is off when include_bpar = .true. or include_bpar = .false., when include_apar = .false., but as the eigenfunctions look close, perhaps this might be fixed by increased resolution?

Apr 14 '25 07:04 mrhardman

I can confirm that this disagreement in frequency was due to runtime. The growthrates with include_apar = .false. are significantly smaller than in the previous case, and the STELLA runs had not been run for long enough (as it doesn't have automatic growthrate convergence checking). Running for longer resolves this issue in all cases; the one for include_apar = include_bpar = .false. is shown below.

Apr 14 '25 15:04 physycola

Hi, Toby. This is related to the beginning of your post where you tried to do benchmark with ky = 0.0707 in stella. I tried the following inputs and I got frequency = 0.0547 and growth rate = 1.311 in stella normalisation which I think agree pretty well with other codes you used. The main changes I made are resolutions (lower than yours) and upwinding parameters. You had the default upwindings which is 0.02, but I set them to be zero as you can see below. I think this may be why the agreement is better now.

&zgrid_parameters boundary_option = 'default' nperiod = 4 ntubes = 1 nzed = 64 /

&geo_knobs geo_option = 'miller' /

&millergeo_parameters betadbprim = 0.0 betaprim = 0.0 d2psidr2 = 0.0 d2qdr2 = 0.0 kappa = 1.0 kapprim = 0.0 nzed_local = 128 qinp = 2.0 rgeo = 3.0 rhoc = 0.5 rmaj = 3.0 shat = 1.0 shift = 0.0 tri = 0.0 triprim = 0.0 /

&parameters_physics adiabatic_option = 'field-line-average-term' beta = 0.05 full_flux_surface = .false. g_exb = 0.0 include_apar = .true. include_bpar = .true. nonlinear = .false. rhostar = 0.0 vnew_ref = 0.0002827725233882337 zeff = 1.0 /

&dissipation collision_model = 'fokker-planck' collisions_implicit = .false. hyper_dissipation = .false. include_collisions = .true. /

&hyper d_hyper = 0.0 /

&vpamu_grids_parameters nmu = 12 nvgrid = 48 vpa_max = 3.0 /

&parameters_numerical zed_upwind = 0.0 time_upwind = 0.0 vpa_upwind = 0.0 delt = 0.007071067811865476 delt_max = 1.0 delt_min = 1e-10 drifts_implicit = .false. explicit_option = 'rk3' fphi = 1.0 mirror_implicit = .true. mirror_semi_lagrange = .true. stream_implicit = .true. tend = 20 /

&kt_grids_knobs grid_option = 'range' /

&kt_grids_range_parameters aky_max = 0.07071067811865477 aky_min = 0.07071067811865477 nakx = 1 naky = 1 theta0_max = 0.0 theta0_min = 0.0 /

&init_g_knobs chop_side = .false. ginit_option = 'default' phiinit = 0.001 restart_file = 'nc/example.nc' width0 = 1.0 /

&species_knobs nspec = 2 species_option = 'stella' /

&species_parameters_1 dens = 1.0 fprim = 1.0 mass = 0.0002723085105 temp = 1.0 tprim = 3.0 type = 'electron' z = -1.0 /

&species_parameters_2 dens = 1.0 fprim = 1.0 mass = 1.0 temp = 1.0 tprim = 3.0 type = 'ion' z = 1.0 /

&stella_diagnostics_knobs nwrite = 500 nsave = 5000 save_for_restart = .true. write_omega = .true. write_phi_vs_time = .true. write_apar_vs_time = .true. write_bpar_vs_time = .true. write_gvmus = .false. write_gzvs = .false. write_fluxes_kxkyz = .true. / /

&layouts_knobs vms_layout = 'vms' xyzs_layout = 'yxzs' /

&neoclassical_input include_neoclassical_terms = .false. neo_option = 'sfincs' /

&sfincs_input nproc_sfincs = 2 nx = 5 nxi = 16 /

Aug 07 '25 14:08 Scottyujia

Thanks for coming back on this. The disagreement in the growthrate is pretty small even in the cases I showed, but the eigenfuntions are very different. Perhaps it would be useful for you to share the data so I could compare?

Aug 07 '25 14:08 physycola

Entire directory as requested

entire_dir.zip

Aug 07 '25 22:08 Scottyujia

As anticipated by @Scottyujia , the run with the different upwinding agrees much better with CGYRO and GS2, as can be seen from the eigenfunctions below.

There is still disagreement between STELLA and the other codes when it comes to the "quasilinear" particle and heat fluxes, but this is perhaps a separate/minor issue.

Before closing this issue, might it be worth adding a warning message when a user attempts to run at low ky with non-zero upwinding? It may avoid users encountering issues like this in the future. @mabarnes

Aug 08 '25 00:08 physycola

I think if you just turn off zed_upwinding, it will agree very well. I will up load some data and plots soon.

Aug 08 '25 00:08 Scottyujia

only_zed_upwinding=0.zip

Here is the case with only zed_upwinding set to zero. As you can see the stella eigenfunctions agree well with those from gs2. I have also attached the whole directory where the sim is run which includes all the inputs and outputs.

Aug 08 '25 01:08 Scottyujia

Great, it appears there is agreement. The only one that remains is between measures of the heat and particle fluxes, but this is not what I opened the issue for.

Aug 08 '25 01:08 physycola

Bug with electromagnetic stella at low ky

include_apar = .true. and include_bpar = .true.

include_apar = .true. and include_bpar = .false.

include_apar = .false. and include_bpar = .true.

include_apar = .false. and include_bpar = .false.

`include_apar = .true.` and `include_bpar = .true.`

`include_apar = .true.` and `include_bpar = .false.`

`include_apar = .false.` and `include_bpar = .true.`

`include_apar = .false.` and `include_bpar = .false.`