simsopt icon indicating copy to clipboard operation
simsopt copied to clipboard

Overflow error in Mgrid write method

Open mishapadidar opened this issue 3 months ago • 5 comments

Summary Using the .write method of the Mgrid class can lead to an Overflow error (see screenshot).

Context The Mgrid class is used to write netcdf files which can be used in free boundary VMEC. When setting the resolution parameters of the Mgrid class, nr, nphi, nz, to high values, the write method can overflow (see screenshot)

Image

As seen in the screenshot, the write method attempts to write values to an array that is allocated for single precision. The error occurs because the values being written cannot be represented in 32 bits.

This overflow error is problematic because it limits users to only using coarse grids which poorly represent magnetic fields.

Fix The Mgrid class uses Scipy's netcdf_file class to write the file. The netcdf_file class relies on the version argument, which defaults to version=1. The version argument sets the array format for writing, where 1 means Classic format (32 bit in our case) and 2 means 64-bit offset format. Switching the Mgrid.write method to rely on netcdf_file(... version=2) forces the values to be written to a 64-bit array, which solves the problem.

Question/Uncertainty While setting version=2 seems to prevent the overflow error from appearing, I do not understand the fortran side of the code well enough to know if this change will lead to misrepresentation of the magnetic fields in the Mgrid files. In other words, is using 64-bit offset format appropriate? Perhaps you have a better sense, @mbkumar @landreman.

mishapadidar avatar Sep 24 '25 21:09 mishapadidar

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 92.56%. Comparing base (0da1f07) to head (de27e0d). :warning: Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #557   +/-   ##
=======================================
  Coverage   92.56%   92.56%           
=======================================
  Files          83       83           
  Lines       16229    16229           
=======================================
  Hits        15023    15023           
  Misses       1206     1206           
Flag Coverage Δ
unittests 92.56% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Sep 24 '25 22:09 codecov[bot]

I see that simsopt only uses the MGrid class in one place, in magneticfield.py in the to_mgrid function, i.e. for generating the files which will be inputs for VMEC to run on.

The question is thus whether VMEC can handle (read in) mgrid files which have been generated with this option (and next, whether our bound version of VMEC can). This I am pretty sure of, I know people who use insanely large MGRID resolutions.

The input file handling of our bound VMEC should be similar to raw VMEC, as it just wraps and interfaces the Fortran routines.

for safety, you should test if the mgrid file generated like this works with raw VMEC and on our bound version, and if it passes we can be confident there is no issue. If it fails on raw VMEC, it is an issue for them I would say.

smiet avatar Sep 30 '25 10:09 smiet

On quick look at the VMEC fortran source code, most of the integers used in mgrid code are 32 bit wide. Going for higher resolution may lead to failures. I may not be understanding the problem here. Are you talking about 32 bit integers being the problem or the grid is represented with 32 bit floats?

mbkumar avatar Oct 16 '25 02:10 mbkumar

@mbkumar I believe that the problem is that the mgrid file is being written with the incorrect version argument. I am not 100% positive if the error is arising from the float type.

One reason to suspect this is that DESC also writes mgrid files using the 64-bit offset format.

Another is that the simsopt .write method cannot write files as large as the ones often used in the community, such as the ones here, unless we use the 64 bit offset format.

To get a better understanding of the problem, I am consulting with the DESC developers.

mishapadidar avatar Oct 16 '25 13:10 mishapadidar

Then I think the issue is using 32 bit floats in place of 64 bit doubles.

Bharat Medasani

On Thu, Oct 16, 2025 at 9:31 AM Misha Padidar @.***> wrote:

mishapadidar left a comment (hiddenSymmetries/simsopt#557) https://github.com/hiddenSymmetries/simsopt/pull/557#issuecomment-3410902969

@mbkumar https://github.com/mbkumar I believe that the problem is that the mgrid file is being written with the incorrect version argument. I am not 100% positive if the error is arising from the float type.

One reason to suspect this is that DESC also writes mgrid files https://github.com/PlasmaControl/DESC/blob/4e527932c898b6058a4134fe7e57157e37e80b78/desc/magnetic_fields/_core.py#L642 using the 64-bit offset format.

Another is that the simsopt .write method cannot write files as large as the ones often used in the community, such as the ones here https://www.osti.gov/servlets/purl/2458165, unless we use the 64 bit offset format.

To get a better understanding of the problem, I am consulting with the DESC developers.

— Reply to this email directly, view it on GitHub https://github.com/hiddenSymmetries/simsopt/pull/557#issuecomment-3410902969, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA62VEH2M475OXWYBHCAWKT3X6M3BAVCNFSM6AAAAACHNLWG62VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTIMJQHEYDEOJWHE . You are receiving this because you were mentioned.Message ID: @.***>

mbkumar avatar Oct 16 '25 14:10 mbkumar

Establishing file format correctness I believe I have been able to establish that community has been using NETCDF files with the 64-bit offset format by using the ncdump command. According to the ncdump documentation, "ncdump may also be used to determine what kind of netCDF file is used (which variant of the netCDF file format) with the -k option."

I ran the command

ncdump -k some_file.nc

using wout files for W7-X, HSX, and ARIES-CS, as well as an mgrid file from STELLOPT. They all return the same answer:64-bit offset.

This is a strong signal to suggest that we should be using the version=2 argument in the NETCDF writer class, i.e. netcdf_file(... version=2). Using the version=1 argument, will not write a file in 64-bit offset format.

Test runs Following @smiet's recommendation, I tested that VMEC in free boundary mode can handle large mgrid files: nphi=720, nR=nZ=544. Here are the takeaways:

  • The Mgrid.write operation is successful when we use the netcdf_file(... version=2), as suggested in this PR. This is not the case if we use netcdf_file(... version=1).
  • VMEC runs and converges to ftol=1.00E-15
  • The attached image compares the rotational transform profile produced by free boundary VMEC compared to the profile produced using BoozerSurface. They agree extremely well, suggesting that the results are accurate.
Screenshot 2025-11-25 at 1 46 09 PM

Discussion with DESC group Additionally, I discussed this code change with the DESC group (Daniel Dudt and Dario Panici), since they also use netcdf_file(... version=2). They couldn't recall why they also use this setting, but believe that it is the right choice.

mishapadidar avatar Nov 25 '25 19:11 mishapadidar