`neutral_mixed` MMS tests and expanded differential operator MMS tests
The changes included here aim to significantly expand the tests available in tests/mms. The following features are added:
- symbolic operators in the
tests/mms/perpendicular_laplacian.pymodule covering all of the differential operators that are present inneutral_mixed. - extension of
nonorthogonal_test.pyandorthogonal_test.pyto test the newly added differential operators. This requires some changes tomain.cxx. - an automatic test of the right hand side of the equation solved in
neutral_mixed. We check that the definition of the right hand side matches the expected convergence order. Note that the expected convergence order is second order (2) whenevolve_momentum = false, but the observed convergence order is first order or worse whenevolve_momentum = true. - ability to add an external source of momentum to neutrals in
neutral_mixed. - ability to read normalised sources in
neutral_mixed.
The major addition is the test script neutral_mixed_ddt_test.py, which runs the following four tests with prescribed functions Nd, Pd, NVd that are constructed to be functions of (x,y) and satisfy the boundary conditions imposed in neutral_mixed.
-
neutral_mixed/evolve_momentum_False_conservation_test_False. Check the definition of the RHS whenevolve_momentum = false, by calculating symbolicallyddt(Nd)andddt(Pd), and using these symbolic expressions as sources to cancel the numericalddt(Nd)andddt(Pd), such that the outputddt(Nd)andddt(Pd)are of order the discretisation error. The spatial resolution is scanned to check convergence, with the following results. -
neutral_mixed/evolve_momentum_False_conservation_test_True. Check that the RHS whenevolve_momentum = falseconservesNdandPd. The spatial resolution is scanned to check convergence, with the following results. Note that density is exactly conserved but pressure is only conserved within numerical error. A volume integration is done overddt(Nd)andddt(Pd)(without a symbolic source) to obtain these errors. -
neutral_mixed/evolve_momentum_True_conservation_test_False. Check the definition of the RHS whenevolve_momentum =true, by calculating symbolicallyddt(Nd),ddt(NVd)andddt(Pd), and using these symbolic expressions as sources to cancel the numericalddt(Nd),ddt(NVd)andddt(Pd), such that the outputddt(Nd),ddt(NVd)andddt(Pd)are of order the discretisation error. The spatial resolution is scanned to check convergence, with the following results. -
neutral_mixed/evolve_momentum_True_conservation_test_True. Check that the RHS whenevolve_momentum = trueconservesNdand energy. The spatial resolution is scanned to check convergence, with the following results. Note that conservation is first order accurate for density and less than first order for energy.
These tests can be readily extended to test the ddt() of other physics modules, or to 3D (x,y,z), allowing for rapid testing of the definition of the equations solved.
Potential issues requiring further work:
- [x] Flag documentation issues, with respect to dimensionally incorrect documentation for momentum equation for neutrals, factors of mass in viscous coefficients. (See #328 for notes).
- [ ] Determine why energy conservation was not observed.
- [ ] Extend nonorthogonal operators to charged species.
- [ ] Demonstrate
neutral_mixedworking smoothly with nonorthogonal mesh in realistic physics case. - [ ] Improve readme and documentation for tests -- reviewers please make suggestions.
- [ ] Anything else raised by reviewers.
The reason the CIs keep failing on this is that you're using the CVODE solver but SUNDIALS isn't available in your CI image. I've switched it over to PVODE, which seems to work. I've also updated the MMS testing script to print the output when there is an error, which should make such issues easier to diagnose in future.
I'm struggling to work out why the CI is continuing to fail for neutral_mixed_ddt_test.py. It always seems to occur in the post-processing stages, but not consistently in the same place. When error messages are produced they indicate that it is related to the xarray objects reading netCDF data. I can not reproduce any of these locally (even when running with act).
Run 1947
Error seems to occur after reading in and printing the second BOUT++ dataset for the final test case. Presumably it actually happens while reading the third (and final) dataset. Reported as a segfault. No backtrace is provided.
...
[Pd]
function = (0.2*x^3 - 0.3*x^2 + 0.1)*sin(2*y) + 1.0
bndry_core = neumann
bndry_all = neumann
[NVd]
function = (x^4 - 2*x^3 + x^2)*sin(y)
bndry_core = neumann
bndry_all = neumann
93% tests passed, 1 tests failed out of 14
Total Test time (real) = 220.70 sec
The following tests FAILED:
14 - rhs_mms_neutral_mixed (SEGFAULT)
Run 1948
Error seemed to occurred when converting l2norm to a NumPy array for the final test case. This is after all of the datasets have been read in.
Traceback (most recent call last):
File "/home/runner/.local/lib/python3.10/site-packages/xarray/core/common.py", line 158, in __float__
return float(self.values)
File "/home/runner/.local/lib/python3.10/site-packages/xarray/core/dataarray.py", line 797, in values
return self.variable.values
File "/home/runner/.local/lib/python3.10/site-packages/xarray/core/variable.py", line 530, in values
return _as_array_or_item(self._data)
File "/home/runner/.local/lib/python3.10/site-packages/xarray/core/variable.py", line 315, in _as_array_or_item
data = np.asarray(data)
File "/home/runner/.local/lib/python3.10/site-packages/dask/array/core.py", line 1729, in __array__
x = self.compute()
File "/home/runner/.local/lib/python3.10/site-packages/dask/base.py", line 373, in compute
(result,) = compute(self, traverse=False, **kwargs)
File "/home/runner/.local/lib/python3.10/site-packages/dask/base.py", line 681, in compute
results = schedule(expr, keys, **kwargs)
File "/home/runner/.local/lib/python3.10/site-packages/xarray/core/indexing.py", line 577, in __array__
return np.asarray(self.get_duck_array(), dtype=dtype)
File "/home/runner/.local/lib/python3.10/site-packages/xarray/core/indexing.py", line 580, in get_duck_array
return self.array.get_duck_array()
File "/home/runner/.local/lib/python3.10/site-packages/xarray/core/indexing.py", line 799, in get_duck_array
return self.array.get_duck_array()
File "/home/runner/.local/lib/python3.10/site-packages/xarray/core/indexing.py", line 654, in get_duck_array
array = self.array[self.key]
File "/home/runner/.local/lib/python3.10/site-packages/xarray/backends/netCDF4_.py", line 103, in __getitem__
return indexing.explicit_indexing_adapter(
File "/home/runner/.local/lib/python3.10/site-packages/xarray/core/indexing.py", line 1023, in explicit_indexing_adapter
result = raw_indexing_method(raw_key.tuple)
File "/home/runner/.local/lib/python3.10/site-packages/xarray/backends/netCDF4_.py", line 116, in _getitem
array = getitem(original_array, key)
File "src/netCDF4/_netCDF4.pyx", line 5055, in netCDF4._netCDF4.Variable.__getitem__
File "src/netCDF4/_netCDF4.pyx", line 4631, in netCDF4._netCDF4.Variable.shape.__get__
File "src/netCDF4/_netCDF4.pyx", line 3786, in netCDF4._netCDF4.Dimension.__len__
File "src/netCDF4/_netCDF4.pyx", line 2164, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: HDF error
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/runner/work/hermes-3/hermes-3/build/tests/mms/neutral_mixed_ddt_test.py", line 176, in <module>
success, output_message_3 = run_neutral_mixed_manufactured_solutions_test(test_input)
File "/home/runner/work/hermes-3/hermes-3/build/tests/mms/job_functions.py", line 442, in run_neutral_mixed_manufactured_solutions_test
l2norm = np.array(l2norm)
ValueError: setting an array element with a sequence.
93% tests passed, 1 tests failed out of 14
Total Test time (real) = 220.37 sec
The following tests FAILED:
14 - rhs_mms_neutral_mixed (Failed)
Run 1949
Same as Run 1947.
Run 1950
Error in final test case. I think this happens when turning the l2error into a NumPy array, again.
Exception ignored in: <function CachingFileManager.__del__ at 0x7fcd20379ab0>
Traceback (most recent call last):
File "/home/runner/.local/lib/python3.10/site-packages/xarray/backends/file_manager.py", line 250, in __del__
self.close(needs_lock=False)
File "/home/runner/.local/lib/python3.10/site-packages/xarray/backends/file_manager.py", line 234, in close
file.close()
File "src/netCDF4/_netCDF4.pyx", line 2669, in netCDF4._netCDF4.Dataset.close
File "src/netCDF4/_netCDF4.pyx", line 2636, in netCDF4._netCDF4.Dataset._close
File "src/netCDF4/_netCDF4.pyx", line 2164, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: HDF error
93% tests passed, 1 tests failed out of 14
Total Test time (real) = 220.34 sec
The following tests FAILED:
14 - rhs_mms_neutral_mixed (SEGFAULT)
Run 1955
Same as 1947 and 1950, except this time there is a backtrace:
Exception ignored in: <function CachingFileManager.__del__ at 0x7facddacdab0>
Traceback (most recent call last):
File "/home/runner/.local/lib/python3.10/site-packages/xarray/backends/file_manager.py", line 250, in __del__
self.close(needs_lock=False)
File "/home/runner/.local/lib/python3.10/site-packages/xarray/backends/file_manager.py", line 234, in close
file.close()
File "src/netCDF4/_netCDF4.pyx", line 2669, in netCDF4._netCDF4.Dataset.close
File "src/netCDF4/_netCDF4.pyx", line 2636, in netCDF4._netCDF4.Dataset._close
File "src/netCDF4/_netCDF4.pyx", line 2164, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: HDF error
93% tests passed, 1 tests failed out of 14
Total Test time (real) = 222.76 sec
The following tests FAILED:
14 - rhs_mms_neutral_mixed (SEGFAULT)
Run 1956
Again, seems to happen when converting the l2norm to a NumPy array for the final test case. This time there is no backtrace, just a low-level memory error.
malloc_consolidate(): unaligned fastbin chunk detected
93% tests passed, 1 tests failed out of 14
Total Test time (real) = 219.69 sec
The following tests FAILED:
14 - rhs_mms_neutral_mixed (Subprocess aborted)
Error: Process completed with exit code 8.
Some reflections
The evenly numbered runs are those labelled as being for the "pull request", while the odd number of runs are labelled as for the "push". In reality, both run exactly the same workflow. However, the "pull request" jobs fail when converting the l2norm to a NumPy array while the "push" jobs fail slightly earlier, while reading in the third dataset. The error messages that get printed vary from run to run, but not are particularly helpful in figuring out what is going wrong.
Thanks @cmacmackin for your investigation so far. Maybe @ZedThree could chime in once he's back from leave - this seems particularly subtle..