Interface Walberla LB
Current status:
- All LB features except the pressure tensor are in
- The pressure tensor can be merged as soon as the corresponding PRs in lbmpy and Walberla are merged
- Electrokinetics is WIP and tracked in a separate ticket
- Documentation is not yet adapted
- GPU support and support for switching to single precision on the CPU are not yet done.
The current plan is to merge this after the release of 4.2
Codecov Report
Merging #2701 into python will decrease coverage by 7%. The diff coverage is 38%.
```diff
@@           Coverage Diff            @@
##           python   #2701     +/-   ##
=========================================
- Coverage      89%     81%       -8%
=========================================
  Files         557     559        +2
  Lines       24326   25963     +1637
=========================================
- Hits        21698   21100      -598
- Misses       2628    4863     +2235
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/core/MpiCallbacks.hpp | 97% <ø> (ø) | |
| src/core/communication.hpp | 100% <ø> (ø) | |
| src/core/electrostatics_magnetostatics/coulomb.cpp | 79% <ø> (-1%) | :arrow_down: |
| ...d_based_algorithms/FluctuatingMRT_LatticeModel.cpp | 0% <0%> (ø) | |
| ...rid_based_algorithms/FluctuatingMRT_LatticeModel.h | 0% <0%> (ø) | |
| ...based_algorithms/LbWalberlaD3Q19FluctuatingMRT.hpp | 0% <0%> (ø) | |
| ...ore/grid_based_algorithms/lb_particle_coupling.hpp | 100% <ø> (ø) | |
| .../grid_based_algorithms/lbboundaries/LBBoundary.hpp | 73% <ø> (-27%) | :arrow_down: |
| src/core/grid_based_algorithms/philox_rand.h | 0% <0%> (ø) | |
| src/core/grid_based_algorithms/lb_interface.cpp | 36% <45%> (-35%) | :arrow_down: |
| ... and 54 more | | |
Continue to review the full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 95a9464...379b1d3. Read the comment docs.
The implementation status of Espresso's LB interface is tracked here: https://github.com/espressomd/espresso/wiki/Walberla_Integartion
@fweik, this could be a starting point for integrating boundary support. Particle coupling only works on a single core for now, if skin > 0.
The LbWalberla class has

```cpp
get_node_velocity_at_boundary(const Utils::Vector3i &node) const;
bool set_node_velocity_at_boundary(const Utils::Vector3i node, const Utils::Vector3d v);
bool remove_node_from_boundary(const Utils::Vector3i &node);
```

all of which are parallel calls. Velocities are in LB units, indices are global.
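A hypothetical usage sketch (assuming an LbWalberla instance named lb; the node index and velocity are made-up values):

```cpp
// All three calls are parallel, i.e. they must be made on all MPI ranks;
// the node index is global and velocities are in LB units.
Utils::Vector3i const node{0, 5, 5};
Utils::Vector3d const v{0.01, 0.0, 0.0};
lb.set_node_velocity_at_boundary(node, v);                  // mark as boundary
auto const v_read = lb.get_node_velocity_at_boundary(node); // read back
lb.remove_node_from_boundary(node);                         // unmark again
```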
@mkuron, there still seems to be an issue with the ghost communication. Could you please take a look at the constructor of LbWalberla? The intention was to have a ghost layer of size int(skin/agrid +1) and to ghost-communicate the full set of populations. Then, velocity interpolation of particles outside the box domain and on ghosts should work. Unfortunately, that is not what happens. The lb_momentum_conservation test cannot be run on more than one node.
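For reference, a sketch of the intended ghost-layer count described above (variable names assumed, not the actual constructor code):

```cpp
// Number of ghost layers needed to cover the Verlet skin, so velocity
// interpolation for particles slightly outside the local domain can be
// served from ghost-layer data.
int ghost_layers_for_skin(double skin, double agrid) {
  return static_cast<int>(skin / agrid + 1.0);
}
```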
The communication looks fine. Are you sure it's not the domain decomposition where the error is coming from? If PBCs work, then the communication is also working.
@mkuron, I investigated the NaN proliferation from Walberla's UBB() further. It only appears when nodes are marked as boundary at one of the domain boundaries. As far as I can tell, this case is not covered by Walberla's testing of the UBB; there, SimpleUBB is always used for channels.
I extracted the channel test to src/core/unit_tests/LbWalberla.cpp (boundary_test_shear) to get rid of potential Espresso interference. Could you please take a look at the constructor of the LbWalberla class in src/core/grid_based_algorithms/LbWalberla.cpp and at the LbWalberla::LB_Boundary_handling class in src/core/grid_based_algorithms/LbWalberla.hpp? Maybe a boundary-related communication or the like is still missing? The UBB uses ghost-layer fields. Do we have to add that to the communication in the LbWalberla constructor?
https://github.com/RudolfWeeber/espresso/tree/walberla/src/core/grid_based_algorithms
> when nodes are marked as boundary at one of the domain boundaries
You need to consistently mark cells on both sides of the PBC yourself. Communication cannot take care of that for you. So when you flag a boundary at x=0, you need to do the same at x=L.
On Thu, Aug 15, 2019 at 04:21:07AM -0700, Michael Kuron wrote:
> when nodes are marked as boundary at one of the domain boundaries

> You need to consistently mark cells on both sides of the PBC yourself. Communication cannot take care of that for you. So when you flag a boundary at x=0, you need to do the same at x=L.

Where L is the grid dimension and the index runs from 0 to L-1?
Exactly. So your ghost cells would be at x=-1 and x=L.
OK, so it's not pretty, but it works. Since the opposing ghost cells can be on a different MPI rank, all ranks now go through the full LB grid and also mark boundary cells shifted by a full lattice size, in all combinations of coordinates.
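For concreteness, a sketch of that marking loop (not the actual implementation; std::array stands in for Utils::Vector3i, and set_boundary_flag is a hypothetical callback that ignores cells outside the local domain and its ghost layer):

```cpp
#include <array>
#include <functional>

using Node = std::array<int, 3>;

// For every boundary cell, also flag its images shifted by +/- one full
// lattice size along each axis, so the ghost layers on the opposite side
// of the box see a consistent boundary flag.
void mark_with_periodic_images(
    Node const &node, Node const &grid_dim,
    std::function<void(Node const &)> const &set_boundary_flag) {
  for (int sx : {-1, 0, 1})
    for (int sy : {-1, 0, 1})
      for (int sz : {-1, 0, 1})
        set_boundary_flag({node[0] + sx * grid_dim[0],
                           node[1] + sy * grid_dim[1],
                           node[2] + sz * grid_dim[2]});
}
```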
In the steady-state case, the shear profiles now match (testsuite/python/lb_shear.py with 500 integration steps). The time evolution is not yet correct. Maybe the viscosity used in LbWalberla::LbWalberla has to be converted to lattice units.
With the viscosity converted to lattice units, the time-dependent shear profile is now reproduced.
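For reference, the conversion follows from dimensional analysis, since kinematic viscosity carries units of length²/time (variable names assumed):

```cpp
// Convert a kinematic viscosity from MD units to lattice units: multiply
// by the LB time step and divide by the grid spacing squared.
double viscosity_to_lattice_units(double nu_md, double tau, double agrid) {
  return nu_md * tau / (agrid * agrid);
}
```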
So, the boundary force from Walberla for the Couette flow along the shear direction matches the expected result (lb_shear.py).
@mkuron, the UBB also reports a force perpendicular to the wall. The Walberla test BoundaryForceCouette does not check this. What is the expected value? It is of similar magnitude to the equilibrium pressure times the surface area, but off by a factor of not quite 2-3, depending on parameters.
Equilibrium pressure is: p_eq = DENS * AGRID^2 / TIME_STEP^2 / 3
agrid, density, and viscosity are all != 1, but setting them to 1 doesn't help, so I don't think it is a unit-conversion problem.
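For comparison, a sketch of the quantity the perpendicular force is being compared against (hypothetical names; the wall is assumed to span the full x-y cross-section of the box):

```cpp
// Equilibrium pressure p_eq = rho * c_s^2 with c_s^2 = (agrid/tau)^2 / 3,
// multiplied by the wall area.
double expected_perpendicular_force(double dens, double agrid, double tau,
                                    double box_l_x, double box_l_y) {
  double const p_eq = dens * agrid * agrid / (tau * tau) / 3.0;
  return p_eq * box_l_x * box_l_y;
}
```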
@RudolfWeeber you seem to have forgotten to commit grid_based_algorithms/lbboundaries/LBBoundary.cpp
Added it.
Current state (to my understanding)
Working:
- Basic LB setup and most property getters/setters
- parallelization up to 2 cores (Walberla enumerates MPI ranks differently)
- Velocity boundary conditions validated via onset of Couette flow
- partial support for forces on boundaries (tangential force on boundaries for Couette flow)
- standard particle coupling (there are slight differences between what Walberla and Espresso do with regard to the forces applied to the LB)
- inertialess tracer coupling
- LB electrohydrodynamics
Not working:
- ENGINE (probably affected by the same bug as ENGINE for Espresso's CPU LB)
- setting bulk viscosity independently
- non-equal time steps for MD and Walberla LB
- Thermalization
This is the current state as per the test suite. Thermalization is waiting for the type erasure work. That will come tomorrow, if all goes well.
Walberla is currently limited to 2 MPI ranks.
Working:
- lb_buoyancy_force.py passes on 2 cores
- lb_poiseuille.py passes on 2 cores
- lb_thermo_virtual.py works on 2 cores
- lb_walberla.py passes on 2 cores, but is probably obsolete by now
- linear_momentum_lb.py passes on 2 cores
- lb_electrohydrodynamics.py
Working on 1 core:
- lb_boundary.py On 2 nodes, the head node doesn't see the correct boundary flags for boundaries stored on the 2nd node. MPI ranks should probably only answer if the site is in their local domain (see the sketch after this list).
- lb_boundary_volume_force.py passes on 1 core. Wrong values on 2 cores. Issue with the boundary force reduction?
Partial:
- lb.py passes on 2 cores with the thermalization tests disabled
- lb_momentum_conservation.py Worse results on 2 nodes than on 1. Likely a bug. It is generally unclear what tolerance is acceptable; a unit test for viscous coupling is needed.
- lb_shear.py The velocity profile is correct on 2 cores, but the stress tensor is wrong on 1 and 2 cores (#3464)
Broken:
- engine_lb.py Broken. Slight mismatch. This should be looked at after the unit testing for particle coupling is done. Note that the total momentum used in the test now contains dt * F/2 for all point forces applied to the LB in the previous time step. This is a behaviour change.
- lb_boundary_velocity.py Broken. Not entirely clear why it ever worked, though.
- lb_interpolation.py Broken (not examined)
- lb_poiseuille_cylinder.py Broken, but the values appear to be proportional
- lb_stokes_sphere.py Results off by some 10 percent. Either an issue with the different boundary handling or with the different LB model
- virtual_sites_tracers_walberla.py Broken. Likely, the case of inertialess tracers with no LB active is not handled correctly
Waiting for thermalization:
- lb_thermostat.py Waiting for thermalization
- lb_density.py Needs thermalization. Waiting for the type erasure work
- lb.py thermalization part
Other:
- lb_streaming.py Not applicable until the same LB model as in Espresso is used.
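Regarding the lb_boundary.py item above, a sketch of the suggested reply pattern (hypothetical helper names; std::optional stands in for whatever optional type the interface uses):

```cpp
#include <array>
#include <optional>

using Node = std::array<int, 3>;

// Hypothetical stand-ins for the real domain-decomposition query and the
// local boundary-flag lookup.
bool node_in_local_domain(Node const &node);
bool local_boundary_flag(Node const &node);

// Only the rank whose domain contains the site answers; the head node can
// then pick the single non-empty reply after gathering.
std::optional<bool> get_node_is_boundary(Node const &node) {
  if (!node_in_local_domain(node))
    return std::nullopt; // this rank does not own the site
  return local_boundary_flag(node);
}
```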
@itischler, I merged your changes moving the interpolation to Espresso. Thanks.
I applied 2 corrections:
- Removed an extra factor of 8 in get_velocity_at_pos()
- The offset of the LB lattice is agrid/2, i.e. 1/2 in lattice units. I changed that (from 0) in the calls to the B-spline interpolation.
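For concreteness, the index mapping implied by that offset (a sketch; LB node i sits at position (i + 1/2) * agrid):

```cpp
#include <cmath>

// With cell centers at (i + 1/2) * agrid, the lowest node entering the
// interpolation stencil for a given position is found by subtracting the
// half-cell offset before flooring.
int lower_interpolation_node(double pos, double agrid) {
  return static_cast<int>(std::floor(pos / agrid - 0.5));
}
```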
I also added exceptions for the case that interpolation source nodes are inaccessible.
With these changes, the code works in the middle of the box (lb_momentum_conservation.py). But as soon as the particle gets closer than agrid/2 to a box boundary, the total momentum derails. The lb_momentum_conservation.py test prints the momentum over time and a message once the boundary gets close.
Would you be willing to investigate this further? Preferably by mimicking the components of the particle coupling in the unit test, or by making the machinery in lb_particle_coupling.cpp usable in the unit test (if it isn't already).
There are two possible causes:
- velocity interpolation (e.g., outdated info in the PDF field ghost layer)
- loss or double-counting of applied forces. If F_c is the coupling force in a time step, then sum_i f_i should equal F_c, where i runs over all NON-GHOST lattice sites on all MPI ranks. To my understanding, the two are only equal for the entire system, not per node, so an MPI reduction would be needed in the unit test.
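A sketch of that reduction (names assumed; each rank sums only its non-ghost sites):

```cpp
#include <functional>
#include <numeric>
#include <vector>

#include <boost/mpi/collectives.hpp>
#include <boost/mpi/communicator.hpp>

// Sum one force component over the local NON-GHOST lattice sites, then
// reduce over all MPI ranks; the global total should match the coupling
// force F_c applied during the time step.
double total_applied_force(std::vector<double> const &local_site_forces) {
  double const local_sum =
      std::accumulate(local_site_forces.begin(), local_site_forces.end(), 0.0);
  boost::mpi::communicator world;
  double global_sum = 0.0;
  boost::mpi::all_reduce(world, local_sum, global_sum, std::plus<double>());
  return global_sum; // compare against F_c within a tolerance
}
```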
@RudolfWeeber the error in test lb_shear is random, with all the characteristics of a dangling pointer. I could reproduce it three times in the Docker container on coyote10, but not systematically, only with 2 MPI ranks, and only the first time make check_python was executed. Reproducing the error with a debug build of espresso yielded the following backtrace:
[1][ERROR ]-----(18.519 sec) Assertion failed!
[1] File: /home/espresso/espresso/build/walberla-prefix/src/walberla/src/field/allocation/FieldAllocator.h:149
[1] Expression: referenceCounts_.find(mem) != referenceCounts_.end()
[1]
[1]
[1] Fatal error came from /home/espresso/espresso/build/walberla-prefix/src/walberla/src/core/debug/CheckFunctions.cpp:41
[1] Aborting now ...
[1]
[1] Stack backtrace:
[1] Backtrace:
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::debug::printStacktrace(std::ostream&)
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::Abort::defaultAbort(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, bool)
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::Abort::abort(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::debug::check_functions_detail::ExitHandler::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
[1] /home/espresso/espresso/build/src/core/EspressoCore.so void walberla::debug::check_functions_detail::check<walberla::debug::check_functions_detail::ExitHandler>(char const*, char const*, int, walberla::debug::check_functions_detail::ExitHandler)
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::field::FieldAllocator<double>::decrementReferenceCount(double*)
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::field::Field<double, 19ul>::~Field()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::field::GhostLayerField<double, 19ul>::~GhostLayerField()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::lbm::PdfField<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> >::~PdfField()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::lbm::PdfField<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> >::~PdfField()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::domain_decomposition::internal::BlockData::Data<walberla::lbm::PdfField<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> > >::~Data()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::domain_decomposition::internal::BlockData::Data<walberla::lbm::PdfField<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> > >::~Data()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::domain_decomposition::IBlock::~IBlock()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::blockforest::BlockForest::~BlockForest()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so walberla::blockforest::StructuredBlockForest::~StructuredBlockForest()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so std::__shared_ptr<walberla::blockforest::StructuredBlockForest, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr()
[1] /home/espresso/espresso/build/src/core/EspressoCore.so std::shared_ptr<walberla::blockforest::StructuredBlockForest>::~shared_ptr()
[1]
[1]
[1]
[1] (from: /home/espresso/espresso/build/walberla-prefix/src/walberla/src/core/debug/CheckFunctions.cpp:41)
I couldn't reproduce it again after that, so I haven't got a GDB backtrace.
Finally managed to catch the exception in GDB using
/usr/bin/mpiexec -n 1 ./pypresso --gdb testsuite/python/lb_shear.py : -n 1 ./pypresso testsuite/python/lb_shear.py
However, the backtrace is completely different:
#0 0x00007f2b29d42ced in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1 0x00007f2b2ae676c8 in walberla::domain_decomposition::internal::BlockData::thrower<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> > (ptr=0x2179df0) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/domain_decomposition/IBlock.h:158
#2 0x00007f2b2ae27ba9 in walberla::domain_decomposition::internal::BlockData::get<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> > (this=0x2179270) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/domain_decomposition/IBlock.h:101
#3 0x00007f2b2ae1dea4 in walberla::domain_decomposition::internal::BlockData::get<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> > (this=0x2179270) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/domain_decomposition/IBlock.h:117
#4 0x00007f2b2ae11ed7 in walberla::domain_decomposition::IBlock::getData<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> > (this=0x2178df0, index=...) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/domain_decomposition/IBlock.h:342
#5 0x00007f2b2ae04f7d in walberla::domain_decomposition::IBlock::getData<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> > (this=0x2178df0, index=...) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/domain_decomposition/IBlock.h:355
#6 0x00007f2b2ae6d87b in walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >::configure (this=0x20d8498, block=...) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/lbm/lattice_model/ForceModel.h:573
warning: RTTI symbol not found for class 'walberla::blockforest::StructuredBlockForest'
#7 0x00007f2b2ae67e5c in walberla::lbm::LatticeModelBase<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2>::configure (this=0x20d8468, block=..., sbs=...) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/lbm/lattice_model/LatticeModelBase.h:102
#8 0x00007f2b2ae61cc8 in walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> >::allocateDispatch (this=0x20d8430, block=0x2178df0, _initialize=true, initialDensity=1) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/lbm/field/AddToStorage.h:136
#9 0x00007f2b2ae508e8 in walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> >::allocate (this=0x20d8430, block=0x2178df0) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/lbm/field/AddToStorage.h:93
#10 0x00007f2b2ae50600 in walberla::field::BlockDataHandling<walberla::lbm::PdfField<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> >, false>::initialize (this=0x20d8430, block=0x2178df0) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/field/blockforest/BlockDataHandling.h:54
#11 0x00007f2b2ae4fce8 in walberla::blockforest::internal::BlockDataHandlingHelper<walberla::lbm::PdfField<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> > >::initialize (this=0x2179b70, block=0x2178df0) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/blockforest/BlockDataHandling.h:126
#12 0x00007f2b2aecee8d in walberla::domain_decomposition::BlockStorage::addBlockData(walberla::selectable::SetSelectableObject<std::shared_ptr<walberla::domain_decomposition::internal::BlockDataHandlingWrapper>, walberla::uid::UID<walberla::uid::suidgenerator::S> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /home/espresso/es3/espresso/build/src/core/EspressoCore.so
#13 0x00007f2b2adc730f in walberla::blockforest::BlockForest::addBlockData (this=0x2178a70, dataHandling=..., identifier="pdf field") at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/blockforest/BlockForest.h:364
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr_inplace<walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> >, std::allocator<walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> > >, (__gnu_cxx::_Lock_policy)2>'
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr_inplace<walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> >, std::allocator<walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> > >, (__gnu_cxx::_Lock_policy)2>'
#14 0x00007f2b2adea479 in walberla::blockforest::BlockForest::addBlockData<walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> > > (this=0x2178a70, dataHandling=std::shared_ptr<walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1> >, 2> >> (use count 3, weak count 0) = {...}, identifier="pdf field", requiredSelectors=..., incompatibleSelectors=...) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/blockforest/BlockForest.h:867
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr_inplace<walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> >, std::allocator<walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> > >, (__gnu_cxx::_Lock_policy)2>'
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr_inplace<walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> >, std::allocator<walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> > >, (__gnu_cxx::_Lock_policy)2>'
#15 0x00007f2b2ade06de in walberla::blockforest::StructuredBlockForest::addBlockData<walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> > > (this=0x20f32d0, dataHandling=std::shared_ptr<walberla::lbm::internal::PdfFieldHandling<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1> >, 2> >> (use count 3, weak count 0) = {...}, identifier="pdf field", requiredSelectors=..., incompatibleSelectors=...) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/blockforest/StructuredBlockForest.h:139
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr_inplace<walberla::blockforest::StructuredBlockForest, std::allocator<walberla::blockforest::StructuredBlockForest>, (__gnu_cxx::_Lock_policy)2>'
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr_inplace<walberla::blockforest::StructuredBlockForest, std::allocator<walberla::blockforest::StructuredBlockForest>, (__gnu_cxx::_Lock_policy)2>'
#16 0x00007f2b2add8380 in walberla::lbm::addPdfFieldToStorage<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2>, walberla::blockforest::StructuredBlockForest> (blocks=std::shared_ptr<walberla::blockforest::StructuredBlockForest> (use count 2, weak count 3) = {...}, identifier="pdf field", latticeModel=warning: RTTI symbol not found for class 'walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2>'
..., initialVelocity=..., initialDensity=1, ghostLayers=2, layout=@0x7ffc109d0380: walberla::field::zyxf, requiredSelectors=..., incompatibleSelectors=...) at /home/espresso/es3/espresso/build/walberla-prefix/src/walberla/src/lbm/field/AddToStorage.h:212
#17 0x00007f2b2add0faa in walberla::LbWalberla<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> >::setup_with_valid_lattice_model (this=0x7ffc109d0d50) at /home/espresso/es3/espresso/src/core/grid_based_algorithms/LbWalberla_impl.hpp:344
#18 0x00007f2b2adcad58 in walberla::LbWalberlaD3Q19TRT::LbWalberlaD3Q19TRT (this=0x7ffc109d0d50, viscosity=0.28888888888888892, density=0.49679999999999991, agrid=0.59999999999999998, tau=0.02, box_dimensions=..., node_grid=..., n_ghost_layers=2) at /home/espresso/es3/espresso/src/core/grid_based_algorithms/LbWalberlaD3Q19TRT.hpp:17
#19 0x00007f2b2adc2905 in init_lb_walberla (viscosity=0.28888888888888892, density=0.49679999999999991, agrid=0.59999999999999998, tau=0.02, box_dimensions=..., node_grid=..., skin=0.23999999999999999) at /home/espresso/es3/espresso/src/core/grid_based_algorithms/lb_walberla_instance.cpp:46
#20 0x00007f2b2add281c in Communication::MpiCallbacks::call_all<double, double, double, double, Utils::Vector<double, 3ul> const&, Utils::Vector<int, 3ul> const&, double, double&, double, double&, double&, Utils::Vector<double, 3ul> const&, Utils::Vector<int, 3ul>&, double&> (this=0x20d7300, fp=0x7f2b2adc2856 <init_lb_walberla(double, double, double, double, Utils::Vector<double, 3ul> const&, Utils::Vector<int, 3ul> const&, double)>, args#0=@0x7ffc109d0fd8: 0.28888888888888892, args#1=@0x7ffc109d0fe0: 0.49679999999999991, args#2=@0x7ffc109d0fc8: 0.59999999999999998, args#3=@0x7ffc109d0fc0: 0.02, args#4=..., args#5=..., args#6=@0x7f2b2b435a70: 0.23999999999999999) at /home/espresso/es3/espresso/src/core/MpiCallbacks.hpp:573
#21 0x00007f2b2adc2c16 in mpi_init_lb_walberla (viscosity=0.28888888888888892, density=2.2999999999999998, agrid=0.59999999999999998, tau=0.02) at /home/espresso/es3/espresso/src/core/grid_based_algorithms/lb_walberla_instance.cpp:60
#22 0x00007f2b09888d8f in __pyx_pf_10espressomd_2lb_15LBFluidWalberla_10_activate_method (__pyx_v_self=0x7f2b18037b08) at /home/espresso/es3/espresso/build/src/python/espressomd/lb.cpp:7529
#23 0x00007f2b098882be in __pyx_pw_10espressomd_2lb_15LBFluidWalberla_11_activate_method (__pyx_v_self=0x7f2b18037b08, unused=0x0) at /home/espresso/es3/espresso/build/src/python/espressomd/lb.cpp:7434
#24 0x00007f2b2823c3f6 in __Pyx_CyFunction_CallMethod (func=0x7f2b231b3df0, self=0x7f2b18037b08, arg=0x7f2b2e7bd048, kw=0x0) at /home/espresso/es3/espresso/build/src/python/espressomd/script_interface.cpp:15940
#25 0x00007f2b2823c704 in __Pyx_CyFunction_CallAsMethod (func=0x7f2b231b3df0, args=0x7f2b18038550, kw=0x0) at /home/espresso/es3/espresso/build/src/python/espressomd/script_interface.cpp:15992
#26 0x00007f2b09ae0dc2 in __Pyx_PyObject_Call (func=0x7f2b231b3df0, arg=0x7f2b18038550, kw=0x0) at /home/espresso/es3/espresso/build/src/python/espressomd/actors.cpp:9403
#27 0x00007f2b09ae1043 in __Pyx__PyObject_CallOneArg (func=0x7f2b231b3df0, arg=0x7f2b18037b08) at /home/espresso/es3/espresso/build/src/python/espressomd/actors.cpp:9442
#28 0x00007f2b09ae116a in __Pyx_PyObject_CallOneArg (func=0x7f2b231b3df0, arg=0x7f2b18037b08) at /home/espresso/es3/espresso/build/src/python/espressomd/actors.cpp:9461
#29 0x00007f2b09ac76f3 in __pyx_pf_10espressomd_6actors_5Actor_6_activate (__pyx_v_self=0x7f2b18037b08) at /home/espresso/es3/espresso/build/src/python/espressomd/actors.cpp:2491
#30 0x00007f2b09ac661e in __pyx_pw_10espressomd_6actors_5Actor_7_activate (__pyx_v_self=0x7f2b18037b08, unused=0x0) at /home/espresso/es3/espresso/build/src/python/espressomd/actors.cpp:2265
#31 0x00007f2b2823c3f6 in __Pyx_CyFunction_CallMethod (func=0x7f2b231ab1b8, self=0x7f2b18037b08, arg=0x7f2b2e7bd048, kw=0x0) at /home/espresso/es3/espresso/build/src/python/espressomd/script_interface.cpp:15940
#32 0x00007f2b2823c704 in __Pyx_CyFunction_CallAsMethod (func=0x7f2b231ab1b8, args=0x7f2b18038400, kw=0x0) at /home/espresso/es3/espresso/build/src/python/espressomd/script_interface.cpp:15992
#33 0x00007f2b09ae0dc2 in __Pyx_PyObject_Call (func=0x7f2b231ab1b8, arg=0x7f2b18038400, kw=0x0) at /home/espresso/es3/espresso/build/src/python/espressomd/actors.cpp:9403
#34 0x00007f2b09ae1043 in __Pyx__PyObject_CallOneArg (func=0x7f2b231ab1b8, arg=0x7f2b18037b08) at /home/espresso/es3/espresso/build/src/python/espressomd/actors.cpp:9442
#35 0x00007f2b09ae116a in __Pyx_PyObject_CallOneArg (func=0x7f2b231ab1b8, arg=0x7f2b18037b08) at /home/espresso/es3/espresso/build/src/python/espressomd/actors.cpp:9461
#36 0x00007f2b09ad3ea9 in __pyx_pf_10espressomd_6actors_6Actors_4add (__pyx_self=0x7f2b231ac048, __pyx_v_self=0x7f2b2d345e10, __pyx_v_actor=0x7f2b18037b08) at /home/espresso/es3/espresso/build/src/python/espressomd/actors.cpp:6033
#37 0x00007f2b09ad3a2d in __pyx_pw_10espressomd_6actors_6Actors_5add (__pyx_self=0x7f2b231ac048, __pyx_args=0x7f2b180906c8, __pyx_kwds=0x0) at /home/espresso/es3/espresso/build/src/python/espressomd/actors.cpp:5965
#38 0x00007f2b2823c385 in __Pyx_CyFunction_CallMethod (func=0x7f2b231ac048, self=0x7f2b231ac048, arg=0x7f2b180906c8, kw=0x0) at /home/espresso/es3/espresso/build/src/python/espressomd/script_interface.cpp:15935
#39 0x00007f2b2823c5f6 in __Pyx_CyFunction_Call (func=0x7f2b231ac048, arg=0x7f2b180906c8, kw=0x0) at /home/espresso/es3/espresso/build/src/python/espressomd/script_interface.cpp:15974
#40 0x00007f2b2823c759 in __Pyx_CyFunction_CallAsMethod (func=0x7f2b231ac048, args=0x7f2b180906c8, kw=0x0) at /home/espresso/es3/espresso/build/src/python/espressomd/script_interface.cpp:15995
I cut at frame 40 because frames 41 to 106 are in the Python executable and don't have debug info. If you need the full GDB log, it's in gdb-bt.log
@RudolfWeeber I can break on the assertion line, but can't do much from there: with continue, GDB will exit on the assertion; with catch throw followed by continue, GDB will catch the irrelevant __cxa_throw in the next loop. If I set a breakpoint on the body of the assertion macro before the MPI abort (shown below), GDB won't actually catch it.
(gdb) set breakpoint pending on
break /home/espresso/es4/espresso/build/walberla-prefix/src/walberla/src/core/debug/CheckFunctions.cpp:41
run
(gdb) No symbol table is loaded. Use the "file" command.
Breakpoint 1 (/home/espresso/es4/espresso/build/walberla-prefix/src/walberla/src/core/debug/CheckFunctions.cpp:41) pending.
(gdb) warning: Error disabling address space randomization: Operation not permitted
Starting program: /usr/bin/python3 testsuite/python/lb_shear.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fda97198700 (LWP 47878)]
[New Thread 0x7fda96997700 (LWP 47879)]
[1][ERROR ]-----(18.393 sec) Assertion failed!
[1] File: /home/espresso/es4/espresso/build/walberla-prefix/src/walberla/src/field/allocation/FieldAllocator.h:149
[1] Expression: referenceCounts_.find(mem) != referenceCounts_.end()
...
[1] (from: /home/espresso/es4/espresso/build/walberla-prefix/src/walberla/src/core/debug/CheckFunctions.cpp:41)
The assertion is raised on rank 1 almost every time; no wonder I couldn't catch it from a GDB session on rank 0. When starting 2 GDB sessions, mpiexec automatically passes 'quit' to the extra GDB process. I cannot start two GDB sessions in separate windows with xterm -e because Docker has no GUI. I tried starting two GDB sessions in a terminal multiplexer with mpirun -np 2 screen -AdmS mpi ./pypresso --gdb="-ex run" testsuite/python/lb_shear.py, but got an ompi_mpi_init: ompi_rte_init failed error in Docker (and outside of Docker too). Running out of ideas.
It is apparently possible to forward X out of a container:
http://fabiorehm.com/blog/2014/09/11/running-gui-apps-with-docker/
Unable to forward an X window from a Docker container through an SSH connection despite considerable effort. The issue is reproducible on the institute machines, although much less frequently.
To simplify things, I added std::raise(SIGABRT); (from #include <csignal>) at line 40 of walberla/src/core/debug/CheckFunctions.cpp, right before WALBERLA_ABORT(). On Linux, GDB stops at the abort signal without the need to set up a catch throw or a breakpoint. Next, simply run GDB in X windows, many times, until the error triggers:
mpirun -np 2 xterm -fa 'Monospace' -fs 13 -e ./pypresso --gdb="-ex run" testsuite/python/lb_shear.py
GDB backtrace:
#0 __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00001555515e7017 in walberla::debug::check_functions_detail::ExitHandler::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /work/jgrad/es-walberla/espresso/build/src/core/EspressoCore.so
#2 0x00001555514f9128 in walberla::debug::check_functions_detail::check<walberla::debug::check_functions_detail::ExitHandler> (
expression=0x15555174d048 "referenceCounts_.find(mem) != referenceCounts_.end()",
filename=0x15555174cfe0 "/work/jgrad/es-walberla/espresso/build/walberla-prefix/src/walberla/src/field/allocation/FieldAllocator.h", line=149, failFunc=...)
at /work/jgrad/es-walberla/espresso/build/walberla-prefix/src/walberla/src/core/debug/CheckFunctions.impl.h:288
#3 0x000015555156f2ca in walberla::field::FieldAllocator<walberla::math::Vector3<double> >::decrementReferenceCount (this=0xe87ea0, mem=0x1375730)
at /work/jgrad/es-walberla/espresso/build/walberla-prefix/src/walberla/src/field/allocation/FieldAllocator.h:149
#4 0x000015555156e7b2 in walberla::field::Field<walberla::math::Vector3<double>, 1ul>::~Field (this=0x13568f0, __in_chrg=<optimized out>)
at /work/jgrad/es-walberla/espresso/build/walberla-prefix/src/walberla/src/field/Field.impl.h:324
#5 0x0000155551589114 in walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul>::~GhostLayerField (this=0x13568f0, __in_chrg=<optimized out>)
at /work/jgrad/es-walberla/espresso/build/walberla-prefix/src/walberla/src/field/GhostLayerField.h:88
#6 0x0000155551589130 in walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul>::~GhostLayerField (this=0x13568f0, __in_chrg=<optimized out>)
at /work/jgrad/es-walberla/espresso/build/walberla-prefix/src/walberla/src/field/GhostLayerField.h:88
#7 0x00001555515aaa66 in walberla::domain_decomposition::internal::BlockData::Data<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >::~Data
(this=0x13a60b0, __in_chrg=<optimized out>) at /work/jgrad/es-walberla/espresso/build/walberla-prefix/src/walberla/src/domain_decomposition/IBlock.h:61
#8 0x00001555515aaa8e in walberla::domain_decomposition::internal::BlockData::Data<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >::~Data
(this=0x13a60b0, __in_chrg=<optimized out>) at /work/jgrad/es-walberla/espresso/build/walberla-prefix/src/walberla/src/domain_decomposition/IBlock.h:61
#9 0x00001555516000ac in walberla::domain_decomposition::IBlock::~IBlock() () from /work/jgrad/es-walberla/espresso/build/src/core/EspressoCore.so
#10 0x0000155551310cc6 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x129e930) at /usr/include/c++/7/bits/shared_ptr_base.h:154
#11 0x00001555516547e8 in walberla::blockforest::BlockForest::~BlockForest() () from /work/jgrad/es-walberla/espresso/build/src/core/EspressoCore.so
#12 0x0000155551630961 in walberla::blockforest::StructuredBlockForest::~StructuredBlockForest() () from /work/jgrad/es-walberla/espresso/build/src/core/EspressoCore.so
#13 0x0000155551310cc6 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x127f8b0) at /usr/include/c++/7/bits/shared_ptr_base.h:154
#14 0x0000155551310c81 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x13574a8, __in_chrg=<optimized out>)
at /usr/include/c++/7/bits/shared_ptr_base.h:684
#15 0x00001555514f7b94 in std::__shared_ptr<walberla::blockforest::StructuredBlockForest, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x13574a0,
__in_chrg=<optimized out>) at /usr/include/c++/7/bits/shared_ptr_base.h:1123
#16 0x00001555514f7bb0 in std::shared_ptr<walberla::blockforest::StructuredBlockForest>::~shared_ptr (this=0x13574a0, __in_chrg=<optimized out>)
at /usr/include/c++/7/bits/shared_ptr.h:93
#17 0x00001555514f7d1a in walberla::LbWalberla<walberla::lbm::D3Q19<walberla::lbm::collision_model::TRT, false, walberla::lbm::force_model::GuoField<walberla::field::GhostLayerField<walberla::math::Vector3<double>, 1ul> >, 2> >::~LbWalberla (this=0x1357420, __in_chrg=<optimized out>)
at /work/jgrad/es-walberla/espresso/src/core/grid_based_algorithms/LbWalberla_impl.hpp:770
#18 0x0000155551507da4 in walberla::LbWalberlaD3Q19TRT::~LbWalberlaD3Q19TRT (this=0x1357420, __in_chrg=<optimized out>)
at /work/jgrad/es-walberla/espresso/src/core/grid_based_algorithms/LbWalberlaD3Q19TRT.hpp:9
#19 0x0000155551507dc0 in walberla::LbWalberlaD3Q19TRT::~LbWalberlaD3Q19TRT (this=0x1357420, __in_chrg=<optimized out>)
at /work/jgrad/es-walberla/espresso/src/core/grid_based_algorithms/LbWalberlaD3Q19TRT.hpp:9
#20 0x0000155551507cfe in std::default_delete<LbWalberlaBase>::operator() (this=0x155551b682f0 <(anonymous namespace)::lb_walberla_instance>, __ptr=0x1357420)
at /usr/include/c++/7/bits/unique_ptr.h:78
#21 0x00001555514ff25b in std::unique_ptr<LbWalberlaBase, std::default_delete<LbWalberlaBase> >::~unique_ptr (
this=0x155551b682f0 <(anonymous namespace)::lb_walberla_instance>, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:268
#22 0x0000155554f7e041 in __run_exit_handlers (status=0, listp=0x155555326718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true,
run_dtors=run_dtors@entry=true) at exit.c:108
#23 0x0000155554f7e13a in __GI_exit (status=<optimized out>) at exit.c:139
#24 0x00000000006384f7 in Py_Exit (sts=sts@entry=0) at ../Python/pylifecycle.c:1565
#25 0x00000000006385c0 in handle_system_exit () at ../Python/pythonrun.c:626
#26 0x00000000006385ec in PyErr_PrintEx () at ../Python/pythonrun.c:636
#27 0x0000000000638ab3 in PyErr_Print () at ../Python/pythonrun.c:532
#28 PyRun_SimpleFileExFlags () at ../Python/pythonrun.c:425
#29 0x0000000000638c65 in PyRun_AnyFileExFlags () at ../Python/pythonrun.c:81
#30 0x0000000000639631 in run_file (p_cf=0x7fffffffdb7c, filename=<optimized out>, fp=<optimized out>) at ../Modules/main.c:340
#31 Py_Main () at ../Modules/main.c:810
#32 0x00000000004b0f40 in main (argc=2, argv=0x7fffffffdd78) at ../Programs/python.c:69
Following @mkuron's advice, I added bookkeeping printf statements to every memory allocation and deallocation in walberla/src/field/allocation/FieldAllocator.h to track down which object got its reference counter decremented too far. The output for both threads is attached in decref-thread0.log (where the error occurred) and decref-thread1.log (which hung). Each function call prints the full function signature, the pointer, the call to deallocate() if it happened, and the reference counter value after the increment/decrement (in decrementReferenceCount(T*) I actually print that value before and after the decrement, prefixed with a - resp. + sign, because this is where the assertion error is triggered). If anything is unclear, or if you need to reproduce the logs locally, the code diff is available in decref-printf-patch.txt.
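The gist of the instrumentation, as a standalone model (the actual diff is the attached decref-printf-patch.txt; names follow the walberla allocator):

```cpp
#include <cassert>
#include <cstdio>
#include <map>

// Model of the instrumented decrementReferenceCount(): print the counter
// before and after the decrement, so a decrement on an unknown or
// already-freed pointer shows up in the log right before the assertion.
static std::map<void *, unsigned long> referenceCounts_;

bool decrementReferenceCount(void *mem) {
  std::printf("decrementReferenceCount\nmem=%p\n", mem);
  auto const it = referenceCounts_.find(mem);
  assert(it != referenceCounts_.end()); // the assertion that fires
  std::printf("-referenceCounts_[%p] = %lu\n", mem, it->second);
  --(it->second);
  std::printf("+referenceCounts_[%p] = %lu\n", mem, it->second);
  return it->second == 0; // caller deallocates when this returns true
}
```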
The relevant part of the first log is this:
T* walberla::field::FieldAllocator<T>::allocate(walberla::uint_t, walberla::uint_t, walberla::uint_t, walberla::uint_t, walberla::uint_t&, walberla::uint_t&, walberla::uint_t&) [with T = walberla::math::Vector3<double>; walberla::uint_t = long unsigned int]
mem=0x13400d0
referenceCounts_[0x13400d0] = 1
... more instructions ...
bool walberla::field::FieldAllocator<T>::decrementReferenceCount(T*) [with T = walberla::math::Vector3<double>]
mem=0x13400d0
-referenceCounts_[0x13400d0] = 0
rank 0: ExitHandler::operator()
The assertion catches the incorrect reference count. However, it seems impossible to free the memory twice, because the deallocate() call is guarded by refCount == 0, which is skipped if the reference counter is negative, so there shouldn't be a segfault.
In any case, the address 0x13400d0 doesn't appear anywhere else in that log, nor in the other log, so it's unclear to me how its reference counter got decremented to 0. The referenceCounts_ member isn't accessed in other C++ files. Could it be that during garbage collection, parts of the std::map<T*, uint_t> referenceCounts_ get accidentally overwritten with zeros?
@jngrad, thanks for testing. I would have liked to also see backtraces of the two points that print mem=0x13400d0, but I already have a suspicion what might be happening.
In https://i10git.cs.fau.de/walberla/walberla/-/blob/master/src/field/allocation/FieldAllocator.h#L201, the reference counts map is declared as a static member variable of the allocator. In https://i10git.cs.fau.de/walberla/walberla/-/blob/master/src/field/allocation/FieldAllocator.h#L206, it is defined. Perhaps the referenceCounts_ symbol ends up in multiple shared object files, and Espresso's linker flags somehow suppress the duplicate-symbol warning? Could you please print out the memory address of referenceCounts_ every time you print the reference count, to check whether it changes?
An alternative cause could be that the destruction order is wrong, though I see no reason why it should be. The reference count map is constructed when the shared object that creates the field is loaded. The field is constructed after the blockforest has been constructed and is destroyed as part of the blockforest's destruction. Only when the shared object is unloaded should the reference count map be destroyed.
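A minimal illustration of the suspected setup (paraphrasing the linked walberla header, not quoting it):

```cpp
#include <map>

// A static member declared and defined in a header is instantiated in
// every translation unit that includes it. If two shared objects end up
// with separate copies of referenceCounts_, an increment in one copy is
// invisible to the decrement in the other.
template <typename T> class FieldAllocator {
protected:
  static std::map<T *, unsigned long> referenceCounts_; // declaration
};

template <typename T>
std::map<T *, unsigned long> FieldAllocator<T>::referenceCounts_ =
    std::map<T *, unsigned long>(); // definition, lives in the header
```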
@mkuron I'll try that. @fweik had the same suspicion.
@RudolfWeeber we probably should also resolve any merge conflicts ASAP. Otherwise we will have a (citing @jngrad) "merge-party".
@mkuron @fweik I couldn't reproduce the error when walberla::debug::printStacktrace() prints the stack trace. Printing the std::map address reveals no collision, i.e. the map address is the same in both threads, and memory is freed only in the thread where the allocation occurred. The output is in thread0.log, thread1.log. The relevant part is:
T* walberla::field::FieldAllocator<T>::allocate(walberla::uint_t) [with T = double; walberla::uint_t = long unsigned int]
mem=0x13d1670
referenceCounts_[0x13d1670] = 1 (std::map at 0x155551b689c0)
bool walberla::field::FieldAllocator<T>::decrementReferenceCount(T*) [with T = double]
mem=0x13d1670
-referenceCounts_[0x13d1670] = 1 (std::map at 0x155551b689c0)
+referenceCounts_[0x13d1670] = 0 (std::map at 0x155551b689c0)
deallocate()
...
T* walberla::field::FieldAllocator<T>::allocate(walberla::uint_t) [with T = double; walberla::uint_t = long unsigned int]
mem=0x13d1670
referenceCounts_[0x13d1670] = 1 (std::map at 0x155551b689c0)
bool walberla::field::FieldAllocator<T>::decrementReferenceCount(T*) [with T = double]
mem=0x13d1670
-referenceCounts_[0x13d1670] = 0 (std::map at 0x155551b689c0)
Thread 1 "python3" received signal SIGABRT, Aborted
OK, but this would mean that there is a halo reduction (of the force density) anyway?
No, I don’t think so. Each particle within the halo region of a node is coupled exactly once. The distribution-function communication in Walberla needs to make sure that the total amount arrives on the correct cells, across all nodes.
Let’s discuss in person, once I’m back.