Python gas-water simulator
Added a Python gas-water simulator to be able to run e.g. CO2STORE/H2STORE data files. Also added the option to pass command-line arguments to the constructor, so both ways of instantiating the simulator classes have that ability.
I open this PR in draft mode because most of the code is duplicated from the PyBlackOilSimulator implementation, which is not optimal. I've made unsuccessful attempts at making a base simulator class that the other Python simulators can inherit from. If anybody has suggestions on how to make a base simulator (or wants to take up the mantle?), I'm happy to hear them!
most of the code is duplicated from the PyBlackOilSimulator implementation, which is not optimal. I've made unsuccessful attempts at making a base simulator class that the other Python simulators can inherit from.
@svenn-t Interesting. It is not immediately clear to me what the difficulty is with making a base simulator class. Can you elaborate on that?
I made an attempt here: 45be9b86ed7dd094278c384239993f3816df41c7, which compiled but gave "undefined symbol" errors when I imported either BlackOilSimulator or GasWaterSimulator in Python. It might have something to do with how I made the base simulator, but I also have next to no experience with pybind11, so it could be an issue with how I exported the classes.
which compiled but gave "undefined symbol" errors
@svenn-t I think we need to instantiate the templated base class explicitly for the two template values we are using, see this commit: https://github.com/hakonhagland/opm-simulators-1/commit/b551864f330c91972dacb6004e2b0355e5d8418b. This fixed the undefined symbol error for me.
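For reference, a rough sketch of what I mean. The class, method and type-tag names here are illustrative assumptions, not the exact code in the commit:

// Sketch of PyBaseSimulator.cpp (illustrative names only).
// The member functions of the templated base class are defined here, in a
// separate translation unit from where the derived simulators use them.
#include "PyBaseSimulator.hpp"   // hypothetical header declaring the template

template <class TypeTag>
int PyBaseSimulator<TypeTag>::stepInit()
{
    // ... implementation shared by all Python simulators ...
    return 0;
}

// Without these explicit instantiations, no object code is emitted for the
// member functions in this translation unit, and the Python extension fails
// to load with "undefined symbol" errors. The type-tag names are assumptions.
template class PyBaseSimulator<Opm::Properties::TTag::FlowBlackOilProblem>;
template class PyBaseSimulator<Opm::Properties::TTag::FlowGasWaterProblem>;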
You are correct, this fixed the "undefined symbol" errors for me as well. I did get some other issues when running examples using the python classes that I need to look into before updating this PR. But thanks for fixing the immediate errors!
I did get some other issues when running examples using the python classes that I need to look into before updating this PR
@svenn-t Yes, I am getting this error when I run the unit tests: "The MPI_Comm_free() function was called after MPI_FINALIZE was invoked." I will have a look at why this happens.
The MPI_Comm_free() function was called after MPI_FINALIZE was invoked.
It seems this happens because the derived class PyBlackOilSimulator destructor runs before the base class PyBaseSimulator destructor. The derived class destructor implicitly frees its Opm::Main pointer, which causes MPI_Finalize() to be called. The base class destructor then runs and implicitly frees its Opm::FlowMain pointer, which in turn destroys the communicator, see:
https://github.com/OPM/opm-simulators/blob/abfe4223d2539bfacd6e7174136db3991752042d/opm/simulators/wells/ParallelWellInfo.hpp#L339
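To illustrate the destruction order with stand-in types (a minimal sketch, not OPM code):

#include <cstdio>
#include <memory>

// Stand-ins for Opm::Main and Opm::FlowMain<TypeTag>; their destructors print
// the MPI calls that happen implicitly in the real classes.
struct MainStandIn     { ~MainStandIn()     { std::puts("MPI_Finalize()"); } };
struct FlowMainStandIn { ~FlowMainStandIn() { std::puts("MPI_Comm_free()"); } };

struct PyBaseSimulator {
    std::unique_ptr<FlowMainStandIn> flow_main_ = std::make_unique<FlowMainStandIn>();
    virtual ~PyBaseSimulator() = default;
};

struct PyBlackOilSimulator : PyBaseSimulator {
    std::unique_ptr<MainStandIn> main_ = std::make_unique<MainStandIn>();
};

int main()
{
    PyBlackOilSimulator sim;
    // When sim goes out of scope, the derived class members (main_) are
    // destroyed before the base class members (flow_main_), so the output is:
    //   MPI_Finalize()
    //   MPI_Comm_free()
    // i.e. the communicator is freed after MPI has already been finalized,
    // which matches the error from the unit tests.
}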
It seems this happens because the derived class PyBlackOilSimulator destructor runs before the base class PyBaseSimulator destructor. The derived class destructor implicitly frees its Opm::Main pointer, which causes MPI_Finalize() to be called. The base class destructor then runs and implicitly frees its Opm::FlowMain pointer, which in turn destroys the communicator, see
Ok, so the issue is having the main_ pointer located in the derived classes, e.g. PyBlackOilSimulator, and flow_main_ located in the base class PyBaseSimulator? I couldn't figure out how to have main_ in the base class since it seemed to be very simulator specific...
I couldn't figure out how to have main_ in the base class
@svenn-t The following seems to work for me: https://github.com/hakonhagland/opm-simulators-1/commit/36ccf4ac8298225f32422b2c3144193eacb0777c
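Roughly, the idea is that when both pointers live in the base class, their relative destruction order follows the member declaration order. A stand-in sketch of that (the actual commit may differ in the details):

#include <cstdio>
#include <memory>

// Same stand-ins as in the sketch above.
struct MainStandIn     { ~MainStandIn()     { std::puts("MPI_Finalize()"); } };
struct FlowMainStandIn { ~FlowMainStandIn() { std::puts("MPI_Comm_free()"); } };

// With main_ declared before flow_main_ in the base class, members are
// destroyed in reverse declaration order: flow_main_ first (frees the
// communicator), then main_ (finalizes MPI), i.e. the correct order.
struct PyBaseSimulator {
    std::unique_ptr<MainStandIn>     main_      = std::make_unique<MainStandIn>();
    std::unique_ptr<FlowMainStandIn> flow_main_ = std::make_unique<FlowMainStandIn>();
    virtual ~PyBaseSimulator() = default;
};

struct PyGasWaterSimulator : PyBaseSimulator {};

int main()
{
    PyGasWaterSimulator sim;
    // Output on destruction:
    //   MPI_Comm_free()
    //   MPI_Finalize()
}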
Indeed, this worked for me as well. Thanks a lot for the fixes! Unfortunately, I get a lot of "violates the C++ One Definition Rule" errors when I compile, and running cases with the GasWaterSimulator crashes. Since there are so many ODR errors, I suspect there is a fundamental issue, e.g. that BlackOilTwoPhaseIndices for the gas-water simulator has not been properly included, or that some header file is missing. I'll look into it.
running cases with the GasWaterSimulator crashes
@svenn-t Do you have any test cases for the gas water simulator? Then I can also have a look at it
Yes, I have a modified SPE1CASE1 file here: SPE1CASE1_GW.zip
@svenn-t Thanks for the test case!
I get a lot of "violates the C++ One Definition Rule" errors when I compile
I do not get any errors when I compile. Can you give more details? When I run a Python script with these statements:
from opm.simulators import GasWaterSimulator
sim = GasWaterSimulator(filename="SPE1CASE1_GW.DATA")
sim.step_init()
sim.step()
sim.step_cleanup()
The step_init() above runs fine, but then sim.step() crashes with:
===============Saturation Functions Diagnostics===============
System: Water-Gas system.
Relative permeability input format: Saturation Family III.
Only valid for CO2STORE or H2STORE cases with GAS and WATER.
Relative permeability input format: Saturation Family III (GSF/WSF).
Number of saturation regions: 1
================ Starting main simulation loop ===============
Report step 0/10 at day 0/3650, date = 01-Jan-2015
Using Newton nonlinear solver.
Starting time step 0, stepsize 1 days, at day 0/365, date = 01-Jan-2015
python3.12: /home/hakon/test/opm8/opm/opm-simulators/opm/simulators/wells/StandardWell_impl.hpp:66: Opm::StandardWell<TypeTag>::StandardWell(const Opm::Well&, const Opm::ParallelWellInfo<typename Opm::WellInterface<TypeTag>::Scalar>&, int, const typename Base::ModelParameters&, const typename Base::RateConverterType&, int, int, int, int, const std::vector<Opm::PerforationData<typename Opm::WellInterface<TypeTag>::Scalar> >&) [with TypeTag = Opm::Properties::TTag::FlowGasWaterProblem; typename Opm::WellInterface<TypeTag>::Scalar = double; typename Base::ModelParameters = Opm::BlackoilModelParameters<double>; Base = Opm::WellInterface<Opm::Properties::TTag::FlowGasWaterProblem>; typename Base::RateConverterType = Opm::RateConverter::SurfaceToReservoirVoidage<Opm::BlackOilFluidSystem<double, Opm::BlackOilDefaultIndexTraits, Opm::VectorWithDefaultAllocator, std::shared_ptr>, std::vector<int> >]: Assertion `this->num_components_ == numWellConservationEq' failed.
[hakon-Precision-3570:556886] *** Process received signal ***
[hakon-Precision-3570:556886] Signal: Aborted (6)
[hakon-Precision-3570:556886] Signal code: (-6)
I do not get any errors when I compile. Can you give more details?
I see now that I get warnings and not errors. I have the build output here: Build_log.zip if you are interested, but essentially there are a lot of statements like these:
[build] /home/AD.NORCERESEARCH.NO/svtv/simulators/mupsi/opm-simulators/opm/simulators/flow/BlackoilModel.hpp:61:7: warning: type ‘struct BlackoilModel’ violates the C++ One Definition Rule [-Wodr]
[build] 61 | class BlackoilModel
[build] | ^
[build] /home/AD.NORCERESEARCH.NO/svtv/simulators/mupsi/opm-simulators/opm/simulators/flow/BlackoilModel.hpp:61: note: a different type is defined in another translation unit
[build] 61 | class BlackoilModel
[build] |
and
[build] /home/AD.NORCERESEARCH.NO/svtv/simulators/mupsi/opm-simulators/opm/simulators/wells/StandardWell_impl.hpp:2493: warning: type of ‘computeCurrentWellRates’ does not match original declaration [-Wlto-type-mismatch]
[build] 2493 | StandardWell<TypeTag>::
[build] |
[build] /home/AD.NORCERESEARCH.NO/svtv/simulators/mupsi/opm-simulators/opm/simulators/wells/StandardWell_impl.hpp:2493:5: note: ‘computeCurrentWellRates’ was previously declared here
[build] 2493 | StandardWell<TypeTag>::
[build] | ^
[build] /home/AD.NORCERESEARCH.NO/svtv/simulators/mupsi/opm-simulators/opm/simulators/wells/StandardWell_impl.hpp:2493:5: note: code may be misoptimized unless ‘-fno-strict-aliasing’ is used
I just do a simple make all (via the VS Code CMake extension, but it amounts to the same) and make install afterwards.
The step_init() above runs fine, but then sim.step() crashes with:
I get the same crash report when I run the case.
I see now that I get warnings
@svenn-t Interesting. I do not get any warnings. Here is my build log build.zip. I compiled like this from the terminal:
$ cmake -DCMAKE_BUILD_TYPE=Debug -DOPM_ENABLE_PYTHON=ON -DOPM_ENABLE_EMBEDDED_PYTHON=ON ..
$ make -j4
Assertion `this->num_components_ == numWellConservationEq' failed.
This is from https://github.com/OPM/opm-simulators/blob/78a2265717a79460ccce3e9f5dfe2bc31a20061e/opm/simulators/wells/StandardWell_impl.hpp#L66 and in gdb I can see that this->num_components_ is 2, whereas numWellConservationEq is 3.
@svenn-t Interesting. I do not get any warnings. Here is my build log build.zip. I compiled like this from the terminal
When I build the Debug variant, I also get no compilation warnings. The build log I attached was from the Release variant.
and in gdb I can see that this->num_components_ is 2, whereas numWellConservationEq is 3.
I suspect that the part in PyGasWaterSimulator.cpp where we set BlackOilTwoPhaseIndices is not done properly, or is not set at all. I tried to move that part to PyBaseSimulator.cpp, but I get the same crash when running the test case.
The build log I attached was from the Release variant.
@svenn-t I see, then I am able to reproduce the warnings also. There are many, the first one I get is:
/home/hakon/test/opm8/opm/opm-simulators/opm/simulators/wells/WellInterface.hpp:75:7: warning: type ‘struct WellInterface’ violates the C++ One Definition Rule [-Wodr]
75 | class WellInterface : public WellInterfaceIndices<GetPropType<TypeTag, Properties::FluidSystem>,
| ^
/home/hakon/test/opm8/opm/opm-simulators/opm/simulators/wells/WellInterface.hpp:75:7: note: a type with the same name but different base type is defined in another translation unit
75 | class WellInterface : public WellInterfaceIndices<GetPropType<TypeTag, Properties::FluidSystem>,
| ^
/home/hakon/test/opm8/opm/opm-simulators/opm/simulators/wells/WellInterfaceIndices.hpp:33:7: note: type name ‘Opm::WellInterfaceIndices<Opm::BlackOilFluidSystem<double, Opm::BlackOilDefaultIndexTraits, Opm::VectorWithDefaultAllocator, std::shared_ptr>, Opm::BlackOilTwoPhaseIndices<0u, 0u, 0u, 0u, false, false, 0u, 0u, 0u> >’ should match type name ‘Opm::WellInterfaceIndices<Opm::BlackOilFluidSystem<double, Opm::BlackOilDefaultIndexTraits, Opm::VectorWithDefaultAllocator, std::shared_ptr>, Opm::BlackOilIndices<0u, 0u, 0u, 0u, false, false, 0u, 0u> >’
33 | class WellInterfaceIndices : public WellInterfaceFluidSystem<FluidSystem>
| ^
I believe this happens when building the Python shared library simulators.cpython-312-x86_64-linux-gnu.so, since we are linking it with both PyBlackOilSimulator.cpp.o and PyGasWaterSimulator.cpp.o; the former uses Opm::BlackOilIndices and the latter uses Opm::BlackOilTwoPhaseIndices, so both variants of WellInterface get linked into the shared object. An idea could be to create two separate shared objects, but there might be other solutions that let us keep everything in a single shared object.
An idea could be to create two separate shared objects
@svenn-t Here is an example of splitting into two separate modules: https://github.com/hakonhagland/opm-simulators-1/commit/1ff143a8cf78a2a4412b4e623ee1f768848280cd. This compiles without ODR violations for me.
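Schematically, the split looks like the following. The second module name, the C++ method names and the exact binding code are illustrative assumptions; see the commit for the actual naming:

#include <pybind11/pybind11.h>
namespace py = pybind11;

// --- PyBlackOilSimulator.cpp: built into its own extension module, and only
// --- pulls in the Opm::BlackOilIndices instantiations.
PYBIND11_MODULE(simulators, m)   // imported as "from opm.simulators import BlackOilSimulator"
{
    py::class_<PyBlackOilSimulator>(m, "BlackOilSimulator")
        .def(py::init<const std::string&>(), py::arg("filename"))
        .def("step_init", &PyBlackOilSimulator::stepInit)
        .def("step", &PyBlackOilSimulator::step)
        .def("step_cleanup", &PyBlackOilSimulator::stepCleanup);
}

// --- PyGasWaterSimulator.cpp: built into a second, separate extension module,
// --- and only pulls in the Opm::BlackOilTwoPhaseIndices instantiations, so
// --- the two WellInterface variants never end up in the same shared object.
PYBIND11_MODULE(gaswater_simulators, m)   // hypothetical module name
{
    py::class_<PyGasWaterSimulator>(m, "GasWaterSimulator")
        .def(py::init<const std::string&>(), py::arg("filename"))
        .def("step_init", &PyGasWaterSimulator::stepInit)
        .def("step", &PyGasWaterSimulator::step)
        .def("step_cleanup", &PyGasWaterSimulator::stepCleanup);
}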
If we are going to split the Python module into two, one issue is that each of the two Python modules will duplicate much of the opm-simulators and opm-common static libraries. To avoid this, we can compile opm-common, opm-grid and opm-simulators as shared libraries (option -DBUILD_SHARED_LIBS=ON); then the code will not be duplicated in the Python libraries.
Here is an example of splitting into two separate modules: hakonhagland@1ff143a.
To also make the unit tests pass I needed to make some further changes to the pybind11 module names and python import statements, see: https://github.com/hakonhagland/opm-simulators-1/commit/6d24fe1091988824c05471e5d35017c5f2dd3d10
@svenn-t Here is an example of splitting into two separate modules: https://github.com/hakonhagland/opm-simulators-1/commit/1ff143a8cf78a2a4412b4e623ee1f768848280cd. This compiles without ODR violations for me.
To also make the unit tests pass I needed to make some further changes to the pybind11 module names and python import statements, see: https://github.com/hakonhagland/opm-simulators-1/commit/6d24fe1091988824c05471e5d35017c5f2dd3d10
With the new commits, I no longer get the ODR warnings, and running both the gas-water and blackoil versions of the SPE1CASE1 case no longer crashes. Thanks a lot for making these fixes and getting the simulators to work!
If we are going to split the Python module into two, one issue is that each of the two Python modules will duplicate much of the opm-simulators and opm-common static libraries. To avoid this, we can compile opm-common, opm-grid and opm-simulators as shared libraries (option -DBUILD_SHARED_LIBS=ON); then the code will not be duplicated in the Python libraries.
I admit this is beyond my knowledge of pybind, cmake, etc. Are there any issues with doing this?
I will update this PR with the new commits since it seems to be compiling and working correctly.
Relying on dynamic linking to the opm libraries is problematic for pip deployment. I would much prefer if you track down the ODR violation instead. Something is instantiated in the Python module that should not be instantiated there.
Relying on dynamic linking to the opm libraries is problematic for pip deployment.
@akva2 Is it auditwheel that is the problem? In principle we should be able to install all dependencies in a manylinux container, right? However, I assume it can be difficult to find the correct versions that fit together?
I would much prefer if you track down the ODR violation instead
@akva2 There are quite a lot of ODR violations, so I think it can become a difficult task. As a workaround we could build the two Python modules with the static opm libraries (i.e. include them in the shared objects). The drawback is then only the code duplication in the two modules (that is, the modules will be larger than strictly necessary).
While we can in principle install shared dependencies with the Python module, it is opening a can of worms I'm very reluctant to take aboard. The problem is that in the wild west of Python package management, there are ways users can screw themselves over and end up loading wrong versions of the shared libraries from system paths instead of the shipped libs, in particular if they don't use venvs and such.
While I'm usually a fan of shared linking, in this case I'll take larger modules any day to avoid those problems, so if there is no other way, that is what I'd do.
there are ways users can screw themselves over and end up loading wrong versions of the shared libraries from system paths instead of the shipped libs, in particular if they don't use venvs and such.
@akva2 Interesting. Looking at the source, it seems auditwheel adjusts the rpath, so the wheel's internally shipped shared objects should take precedence over any equivalent libraries from the user's system path. Also, it looks like it renames the libraries, see repair.py, so a name conflict should then be even less likely, right?
It would be interesting to test this. Can I just run docker build . from the python directory?
Right, so there are new hacks in the tools to try to fix the situation, then. My trust in Python packaging tools is still exactly zero though, so I'm sure there is some tool that screws it up somehow, somewhere in the world. Anyway,
Basically yes, the invocation on Jenkins is:
DOCKER_BUILDKIT=1 docker build -t manylinux_2_28_opm:built --build-arg build_jobs=16 --build-arg version_tag="$2" --build-arg version="$3" . -f Dockerfile --output $1/wheelhouse
where $1 is the output directory, $2 is the tag to append to the package version, and $3 is the tag to build from (could be "master"). The build args default to tag="master" and version="". I usually give .dev$DATE for the nightly builds; if not given, it will use the version (if given, it's the version from dune.module-$version).
DOCKER_BUILDKIT=1 docker build -t manylinux_2_28_opm:built --build-arg build_jobs=16 --build-arg version_tag="$2" --build-arg version="$3" . -f Dockerfile --output $1/wheelhouse
@akva2 Yes, this worked for static opm libraries after a small change to the generate-pypi-package.sh script, see https://github.com/hakonhagland/opm-simulators-1/commit/1b690f91cb24ee08a1a2887f70f4f97b02622d73. However, adding -DBUILD_SHARED_LIBS=ON to the cmake flags and trying to build shared opm libraries failed. It seems like the first problem occurs when trying to build a shared libopmgrid.so, because only the static lapack.a is available in the manylinux container, see https://github.com/OPM/opm-simulators/blob/44eb826766f39b1928e34abcb82c91db467a61c4/python/setup-docker-image.sh#L11
I will try to replace these packages with ones including the shared versions and see how far I get.
I will try to replace these packages with ones including the shared versions and see how far I get.
It seems to work after these changes: https://github.com/hakonhagland/opm-simulators-1/commit/84a9041b024914c1b31f1ee4e8ac95a107bfac3b. I will try to install the generated wheels in some containers to check that they also install successfully.
I am no longer able to build in the container since it now uses cmake 4.0, see: https://github.com/OPM/opm-simulators/issues/6122
I will try to install the generated wheels in some containers to check that they also install successfully.
I have now tested the wheels with both shared and static opm libraries in an Ubuntu 22.04 docker container, see #6136. All unit tests passed there.
@akva2 Thanks a lot for your comments! The update hopefully addresses all your comments regarding formatting and build targets for the new Python simulators.
I also marked the PR as ready for review as I hope we can get this in before the new release.