Mesh Tally Crashes the Simulation
Bug Description
There is a serious bug in calculations involving a mesh tally, specifically with the track-length estimator. The simulation freezes without producing any further statepoints. When this happens, CPU usage drops from all available OpenMP threads to a single thread. Similar behavior has been reported several times on the OpenMC discussion forum:
- openmc-freezes-when-simulating-batches
- simulation-freezing
- particle-lost-without-warning-error-or-being-killed-stalling-the-run
Steps to Reproduce
Attached is a minimal example that reproduces this problem. The input files can be generated with the Python code below, which creates a wall made of pure H1 and a mesh filter surrounding it. An attached figure illustrates the simple geometry, with an arrow indicating the beam direction. In this example (with seed=17) the simulation crashes around the 6th batch.
import openmc

# Materials: a wall made of pure H1
H = openmc.Material(name='Hydrogen')
H.add_nuclide('H1', 1)
materials = openmc.Materials([H])
materials.export_to_xml()

# Geometry: a cylindrical wall inside a vacuum world sphere
z1 = openmc.ZPlane(z0=0)
z2 = openmc.ZPlane(z0=200)
cyl = openmc.ZCylinder(r=200)
wall_reg = -cyl & +z1 & -z2
wall_cell = openmc.Cell(name='wall', region=wall_reg, fill=H)
world_sphere = openmc.Sphere(r=1000, name='world_sphere', boundary_type='vacuum')
world = openmc.Cell(region=-world_sphere & ~wall_reg, name='world')
univ = openmc.Universe(cells=[wall_cell, world])
geometry = openmc.Geometry(univ)
geometry.export_to_xml()

# Settings: fixed-source run, writing a statepoint after every batch
settings = openmc.Settings()
settings.run_mode = 'fixed source'
settings.particles = 1000000
bat = 200
settings.batches = bat
settings.statepoint = {'batches': list(range(1, bat + 1))}
angle = openmc.stats.Monodirectional((0, 0, 1))
energy = openmc.stats.Discrete([10e6], [1])
point = openmc.stats.Point((0, 0, -0.5))
source = openmc.Source(space=point, angle=angle, energy=energy, particle='neutron')
settings.source = source
settings.seed = 17
settings.export_to_xml()

# Tallies: track-length flux tally on a regular mesh around the wall
mesh = openmc.RegularMesh()
mesh.dimension = [50, 50, 50]
mesh.lower_left = [-200, -200, -10]
mesh.upper_right = [200, 200, 300]
mesh_filter = openmc.MeshFilter(mesh)
tallies = openmc.Tallies()
mesh_tally = openmc.Tally(name='mesh_tally')
mesh_tally.filters = [mesh_filter]
mesh_tally.scores = ['flux']
mesh_tally.estimator = 'tracklength'
tallies.append(mesh_tally)
tallies.export_to_xml()
Environment
The tests were done using the latest Docker image:
{
    "Id": "sha256:1b0e57ac2bdddda83097c9e3370c173f5dd9e92b4f4c818fc31d441a4dc25bbb",
    "Digest": null,
    "RepoDigests": [
        "openmc/openmc@sha256:56a36944da2cf5f2c2cb4054c8389344ec3068f36b8134af77c63e5f9abc6f57"
    ],
    "Labels": null
}
And some more details:
OpenMC version 0.14.0
Git SHA1: fa2330103de61a864c958d1a7250f11e5dd91468
Copyright (c) 2011-2023 MIT, UChicago Argonne LLC, and contributors
MIT/X license at <https://docs.openmc.org/en/latest/license.html>
Build type: RelWithDebInfo
Compiler ID: GNU 10.2.1
MPI enabled: yes
Parallel HDF5 enabled: yes
PNG support: yes
DAGMC support: no
libMesh support: no
MCPL support: no
NCrystal support: no
Coverage testing: no
Profiling flags: no
Data library: fendl-3.2-hdf5
Hi @itay-space! Thank you very much for reporting this and for the simple working example. I'll dig into it as soon as I can.
Hi @pshriwise
I wanted to follow up on the issue I reported last week. Were you able to reproduce the problem?
Your assistance is greatly appreciated. Thank you for your help!
Hi @itay-space!
Thanks for your patience. I found the cause of the problem for the example you've provided above. The underlying issue was actually resolved in https://github.com/openmc-dev/openmc/pull/2811, which should be included in the latest OpenMC release. In this case, it was causing nan values in a particle position that led to an infinite loop in the mesh filter routine.
This doesn't seem to be the cause of all the hanging simulations you linked above, however. I'll be looking into those separately, but I wanted to update you on this one sooner rather than later.
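The NaN failure mode described above is easy to see in isolation. Under IEEE 754 semantics, every ordered comparison with NaN evaluates to false, so once a NaN enters a computation, a guard like `distance > traveled_distance` can never fire, and a loop that relies on that update to make progress spins forever. A minimal Python sketch of the comparison semantics (not OpenMC code):

```python
nan = float('nan')

# Ordered comparisons involving NaN are always False (IEEE 754),
# and NaN is not even equal to itself.
print(nan > 0.0)   # False
print(nan < 0.0)   # False
print(nan == nan)  # False

# A progress loop guarded by such a comparison never advances:
traveled_distance = 0.0
distance = nan  # e.g. derived from a NaN particle position
steps = 0
while steps < 5:  # capped here; the real loop had no cap
    if distance > traveled_distance:
        traveled_distance = distance  # never reached with NaN
    steps += 1
print(traveled_distance)  # 0.0
```

This is why a single NaN in a particle's position is enough to stall the whole mesh filter routine rather than raise an error.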
Hello. I left a comment in the forum here. I am pasting it here too in case it is helpful. All the following comments are related to the example provided in that forum thread.
I tried to track this problem down with openmc-0.15.0.
In file mesh.cpp:669 there is the following loop:
// For all directions outside the mesh, find the distance that we need to
// travel to reach the next surface. Use the largest distance, as only
// this will cross all outer surfaces.
int k_max {0};
for (int k = 0; k < n; ++k) {
  if ((ijk[k] < 1 || ijk[k] > shape_[k]) &&
      (distances[k].distance > traveled_distance)) {
    traveled_distance = distances[k].distance;
    k_max = k;
  }
}
For particle 23228, traveled_distance is always equal to 0, so the particle never advances in the subsequent code, and the run hangs in an infinite loop.
For particle 23228 the involved variables in this loop have the following values:
- ijk = {-7, 1, 1}
- shape_ = {1, 1, 1}
- distances[0] = {next_index = -6, max_surface = true, distance = 0}
- distances[1] = {next_index = 2, max_surface = true, distance = 1614.1689884883433}
- distances[2] = {next_index = 0, max_surface = false, distance = 4916.5287060087403}
So, in the case of particle 23228, only the k == 0 direction satisfies the index condition (ijk[0] = -7 is outside 1..shape_[0]), but its distance is 0, so the comparison distances[k].distance > traveled_distance (0 > 0) fails and traveled_distance is never updated.
I am not sure if the problem is in ijk or in the distances variables.
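The stall can be reproduced with a small Python mock of the loop above, using the values reported for particle 23228 (a sketch of the logic only, not OpenMC code):

```python
# State reported for particle 23228 (from the debugger)
ijk = [-7, 1, 1]
shape = [1, 1, 1]
distances = [0.0, 1614.1689884883433, 4916.5287060087403]

traveled_distance = 0.0
k_max = 0
for k in range(3):
    # Direction k is "outside the mesh" if its index is not in 1..shape[k]
    outside = ijk[k] < 1 or ijk[k] > shape[k]
    if outside and distances[k] > traveled_distance:
        traveled_distance = distances[k]
        k_max = k

# Only k == 0 is outside the mesh, but its distance is exactly 0,
# so 0 > 0 is False and traveled_distance stays 0 forever.
print(traveled_distance)  # 0.0
```

With traveled_distance stuck at 0, the subsequent code that moves the particle by traveled_distance makes no progress, which matches the observed hang.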
distance is calculated in mesh.cpp:1050 as:
d.distance = (positive_grid_boundary(ijk, i) - r0[i]) / u[i];
where:
- positive_grid_boundary(ijk, i) == 100
- r0[i] == 100
giving as a result a distance of 0.
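In other words, the particle sits exactly on the grid boundary it is supposed to cross, so the computed crossing distance is zero and the loop above can never make progress. A quick sketch of the arithmetic (the direction cosine u is a hypothetical nonzero value; any nonzero value gives the same result):

```python
# Particle coordinate r0 coincides exactly with the grid boundary
positive_grid_boundary = 100.0
r0 = 100.0
u = 0.5  # hypothetical direction cosine, nonzero

distance = (positive_grid_boundary - r0) / u
print(distance)  # 0.0

# A zero distance can never satisfy `distance > traveled_distance`
# when traveled_distance starts at 0, so the particle never advances.
print(distance > 0.0)  # False
```

This suggests the fix needs to handle the degenerate case where a particle lies exactly on a mesh surface, either by nudging it across or by accepting zero-distance crossings.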
@itay-space sorry I didn't have time to get to this! Glad to see it's been resolved