OpenModelica
Assert in CVODE FMU causing segmentation fault
Description
Co-simulation crashes with both CVODE and explicit Euler on Ubuntu 22.04, and I'm not sure how to narrow it down or debug it further.
The issue was first reported at FMPy, but it is OpenModelica related (https://github.com/CATIA-Systems/FMPy/issues/447).
Steps to Reproduce
make_fmu.mos
loadModel(Modelica, {"4.0.0"}); getErrorString();
setCommandLineOptions("--fmiFlags=s:cvode"); getErrorString();
buildModelFMU(Modelica.Fluid.Examples.PumpingSystem, version="2.0", fmuType="me", platforms={"static"}, fileNamePrefix="PumpingSystem_cvode_me" ); getErrorString();
buildModelFMU(Modelica.Fluid.Examples.PumpingSystem, version="2.0", fmuType="cs", platforms={"static"}, fileNamePrefix="PumpingSystem_cvode_cs" ); getErrorString();
Running the ME simulation:
> OMSimulator PumpingSystem_cvode_me.fmu (jupy)
warning: PumpingSystem_cvode_me (logStatusWarning): /home/*****/.openmodelica/libraries/Modelica 4.0.0+maint.om/Fluid/Interfaces.mo:1027: The following assertion has been violated at time 0.000000
pipe.flowModel.m_flows[1] >= 0.0 and pipe.flowModel.m_flows[1] <= 100000.0
assert | warning | Variable violating min/max constraint: 0.0 <= pipe.flowModel.m_flows[1] <= 100000.0, has value: -141.424
info: maximum step size for 'model.root': 0.001000
info: Result file: model_res.mat (bufferSize=10)
info: Final Statistics for 'model.root':
NumSteps = 1019 NumRhsEvals = 1048 NumLinSolvSetups = 65
NumNonlinSolvIters = 1047 NumNonlinSolvConvFails = 1 NumErrTestFails = 2
info: 1 warnings
info: 0 errors
Running the CS simulation crashes:
> OMSimulator PumpingSystem_cvode_cs.fmu (jupy)
LOG_SOLVER | info | CVODE linear multistep method CV_BDF
LOG_SOLVER | info | CVODE maximum integration order CV_ITER_NEWTON
LOG_SOLVER | info | CVODE use equidistant time grid YES
LOG_SOLVER | info | CVODE Using relative error tolerance 1.000000e-06
LOG_SOLVER | info | CVODE Using dense internal linear solver SUNLinSol_Dense.
LOG_SOLVER | info | CVODE Use internal dense numeric jacobian method.
LOG_SOLVER | info | CVODE uses internal root finding method NO
LOG_SOLVER | info | CVODE maximum absolut step size 0
LOG_SOLVER | info | CVODE initial step size is set automatically
LOG_SOLVER | info | CVODE maximum integration order 5
LOG_SOLVER | info | CVODE maximum number of nonlinear convergence failures permitted during one step 10
LOG_SOLVER | info | CVODE BDF stability limit detection algorithm OFF
warning: PumpingSystem_cvode_cs (logStatusWarning): /home/*****/.openmodelica/libraries/Modelica 4.0.0+maint.om/Fluid/Interfaces.mo:1027: The following assertion has been violated at time 0.000000
pipe.flowModel.m_flows[1] >= 0.0 and pipe.flowModel.m_flows[1] <= 100000.0
assert | warning | Variable violating min/max constraint: 0.0 <= pipe.flowModel.m_flows[1] <= 100000.0, has value: -141.424
info: Result file: model_res.mat (bufferSize=10)
error: [fmi2logger] PumpingSystem_cvode_cs (logStatusError): /home/*****/.openmodelica/libraries/Modelica 4.0.0+maint.om/Media/Water/IF97_Utilities.mo:2900: IF97 medium function tsat called with too low pressure
p = -1.70619e+08 Pa <= 611.657 Pa (triple point pressure)
error: [fmi2logger] PumpingSystem_cvode_cs (logFmi2Call): fmi2CompletedIntegratorStep: terminated by an assertion.
error: [fmi2logger] PumpingSystem_cvode_cs (logStatusError): /home/*****/.openmodelica/libraries/Modelica 4.0.0+maint.om/Media/Water/IF97_Utilities.mo:2900: IF97 medium function tsat called with too low pressure
p = -1.70619e+08 Pa <= 611.657 Pa (triple point pressure)
error: [fmi2logger] PumpingSystem_cvode_cs (logFmi2Call): fmi2GetBoolean: terminated by an assertion.
error: [updateSignals] failed to fetch variable PumpRPMGenerator.u
error: [fmi2logger] PumpingSystem_cvode_cs (logStatusError): /home/*****/.openmodelica/libraries/Modelica 4.0.0+maint.om/Media/Water/IF97_Utilities.mo:2900: IF97 medium function tsat called with too low pressure
p = -1.70619e+08 Pa <= 611.657 Pa (triple point pressure)
error: [fmi2logger] PumpingSystem_cvode_cs (logFmi2Call): fmi2GetBoolean: terminated by an assertion.
error: [updateSignals] failed to fetch variable PumpRPMGenerator.u
error: [fmi2logger] PumpingSystem_cvode_cs (logStatusError): /home/*****/.openmodelica/libraries/Modelica 4.0.0+maint.om/Media/Water/IF97_Utilities.mo:2900: IF97 medium function tsat called with too low pressure
p = -1.70619e+08 Pa <= 611.657 Pa (triple point pressure)
fish: Job 1, 'OMSimulator PumpingSystem_cvode…' terminated by signal SIGSEGV (Address boundary error)
Version and OS
Ubuntu 22.04
> omc --version
OpenModelica 1.20.0~dev-250-gb17e1a0
Backtrace from GDB:
Thread 1 "OMSimulator" received signal SIGSEGV, Segmentation fault.
__longjmp_chk (env=0x0, val=1) at ../setjmp/longjmp.c:32
32 ../setjmp/longjmp.c: No such file or directory.
(gdb) bt
#0 __longjmp_chk (env=0x0, val=1) at ../setjmp/longjmp.c:32
#1 0x00007ffff4c983a8 in omc_assert_fmi () from /path/to/Testitesttest/issue-9362/temp/model-afp9qhol/temp/0001_PumpingSystem_cvode_cs/binaries/linux64/PumpingSystem_cvode_cs.so
#2 0x00007ffff4bf98a6 in omc_Modelica_Media_Water_IF97__Utilities_BaseIF97_Basic_tsat ()
from /path/to/Testitesttest/issue-9362/temp/model-afp9qhol/temp/0001_PumpingSystem_cvode_cs/binaries/linux64/PumpingSystem_cvode_cs.so
#3 0x00007ffff4bcdac7 in PumpingSystem_cvode_cs_eqFunction_457 ()
from /path/to/Testitesttest/issue-9362/temp/model-afp9qhol/temp/0001_PumpingSystem_cvode_cs/binaries/linux64/PumpingSystem_cvode_cs.so
#4 0x00007ffff4bca954 in PumpingSystem_cvode_cs_functionDAE ()
from /path/to/Testitesttest/issue-9362/temp/model-afp9qhol/temp/0001_PumpingSystem_cvode_cs/binaries/linux64/PumpingSystem_cvode_cs.so
#5 0x00007ffff4c986e2 in internalEventUpdate () from /path/to/Testitesttest/issue-9362/temp/model-afp9qhol/temp/0001_PumpingSystem_cvode_cs/binaries/linux64/PumpingSystem_cvode_cs.so
#6 0x00007ffff4c98aad in internalEventIteration ()
from /path/to/Testitesttest/issue-9362/temp/model-afp9qhol/temp/0001_PumpingSystem_cvode_cs/binaries/linux64/PumpingSystem_cvode_cs.so
#7 0x00007ffff4c9bb8d in fmi2DoStep () from /path/to/Testitesttest/issue-9362/temp/model-afp9qhol/temp/0001_PumpingSystem_cvode_cs/binaries/linux64/PumpingSystem_cvode_cs.so
#8 0x00005555556bd037 in oms::ComponentFMUCS::stepUntil(double) ()
#9 0x000055555566e6e8 in oms::SystemWC::doStep() ()
#10 0x000055555566872f in oms::SystemWC::stepUntil(double) ()
#11 0x000055555570dd32 in oms::Model::simulate() ()
#12 0x00005555555d4403 in oms_simulate ()
#13 0x00005555555d48dc in do_simulation(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::chrono::duration<double, std::ratio<1l, 1l> >) ()
#14 0x00005555555f2774 in oms_RunFile ()
#15 0x0000555555702cf5 in oms::Flags::SetCommandLineOption(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#16 0x00005555555d326e in oms_setCommandLineOption ()
#17 0x00005555555ca438 in main ()
Using OMSimulator: v2.1.1.post184-g60ecea0-linux
I'm not sure what could be the cause in this case. But according to our library testing CVODE is able to solve this model when not using FMI, so it should be possible to fix the issue. My first guess would be that the default simulation can handle the long jump (when we encounter an error) and the CVODE integrator in the CS FMU can't handle it correctly.
The failing equation is
pumps.medium.sat.Tsat = Modelica.Media.Water.IF97_Utilities.BaseIF97.Basic.tsat(pumps.medium.p)
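The GDB backtrace above shows `__longjmp_chk (env=0x0, val=1)`, so the jump was attempted with a null jump buffer. The following is a minimal sketch of that failure mode and of a guard against it. It assumes the assertion handler reaches its jump target through a pointer stored in per-thread data; `demo_thread_data`, `demo_throw`, and the status codes are hypothetical names for illustration and are not taken from the OpenModelica runtime.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for per-thread runtime data: it only holds a pointer
 * to the jump buffer of the innermost active "try" block. */
typedef struct {
  jmp_buf *jumper; /* NULL when no handler has been installed */
} demo_thread_data;

/* Hypothetical status codes for this sketch. */
enum demo_status { DEMO_NO_HANDLER = 2 };

/* Throw side of an assertion. Calling longjmp on an invalid buffer is
 * undefined behavior, so the pointer is checked before jumping. */
static enum demo_status demo_throw(demo_thread_data *td, const char *msg)
{
  assert(td != NULL);
  fprintf(stderr, "assertion violated: %s\n", msg);
  if (td->jumper == NULL) {
    /* No handler installed: report the problem instead of jumping. */
    return DEMO_NO_HANDLER;
  }
  longjmp(*td->jumper, 1); /* does not return */
}
```

With no handler installed, `demo_throw` returns `DEMO_NO_HANDLER`, which a caller could map to an error status rather than crashing. Whether the real runtime can take such a path is an open question here; the sketch only illustrates why a null `env` ends in SIGSEGV.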
We are also seeing an access violation, seemingly happening in omc_assert_fmi, when simulating an FMU with pyfmi:
Thread 1 received signal SIGSEGV, Segmentation fault.
0x00007ffb5d6aae32 in ucrtbase!.intrinsic_setjmpex () from C:\Windows\System32\ucrtbase.dll
(gdb) bt
#0 0x00007ffb5d6aae32 in ucrtbase!.intrinsic_setjmpex () from C:\Windows\System32\ucrtbase.dll
#1 0x00007ffa8fb618db in omc_assert_fmi (threadData=0x41a50f0, info=..., msg=<optimized out>)
at <snip>.fmutmp/sources/fmi-export/fmu2_model_interface.c.inc:228
...
The crash seems to happen when an assert is triggered.
Line 228 is the MMC_THROW_INTERNAL() call marked in the following function:
static void omc_assert_fmi(threadData_t *threadData, FILE_INFO info, const char *msg, ...) __attribute__ ((noreturn));
static void omc_assert_fmi(threadData_t *threadData, FILE_INFO info, const char *msg, ...)
{
va_list args;
va_start(args, msg);
omc_assert_fmi_common(threadData, fmi2Error, LOG_STATUSERROR, info, msg, args);
va_end(args);
MMC_THROW_INTERNAL(); // <==HERE
}
It seems a longjmp is going wrong somehow. Unfortunately I don't have a small reproducer, and we are stumped on how to troubleshoot this further. Any pointers? Versions: pyfmi 2.13.1, Windows 10, OMEdit 1.24.0-dev.beta.0.
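One thing worth checking when a longjmp misbehaves is whether every entry point that can trigger an assert installs a valid jump buffer first and restores the previous one afterwards. The sketch below shows that save/install/restore discipline under stated assumptions: the jump target lives in per-thread data, and all names (`sketch_thread_data`, `sketch_guarded_call`, `sketch_check_positive`) are hypothetical and not part of the generated FMU code.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical per-thread data holding the active jump target. */
typedef struct {
  jmp_buf *jumper;
} sketch_thread_data;

enum sketch_status { SKETCH_OK = 0, SKETCH_ERROR = 1 };

typedef void (*sketch_eval_fn)(sketch_thread_data *td, double x);

/* Throw side: jump to the installed handler. */
static void sketch_fail(sketch_thread_data *td)
{
  assert(td->jumper != NULL); /* a handler must have been installed */
  longjmp(*td->jumper, 1);
}

/* Example evaluation that asserts on a non-positive argument,
 * similar in spirit to a domain check before calling log(). */
static void sketch_check_positive(sketch_thread_data *td, double x)
{
  if (!(x > 0.0)) {
    sketch_fail(td);
  }
}

/* Catch side: install a buffer that lives in this stack frame, run the
 * evaluation, and always restore the previous target before returning so
 * that no pointer to a dead frame is left behind. */
static enum sketch_status sketch_guarded_call(sketch_thread_data *td,
                                              sketch_eval_fn eval, double x)
{
  jmp_buf env;
  jmp_buf *const previous = td->jumper;
  enum sketch_status status;

  if (setjmp(env) == 0) {
    td->jumper = &env;
    eval(td, x);
    status = SKETCH_OK;
  } else {
    status = SKETCH_ERROR; /* reached via longjmp from sketch_fail */
  }
  td->jumper = previous;
  return status;
}
```

Calling `sketch_guarded_call(&td, sketch_check_positive, -7144.65)` returns `SKETCH_ERROR` and leaves `td.jumper` as it was before the call. A pointer that is left set after the installing function has returned would refer to a dead stack frame, which is one possible (unconfirmed) way for a later jump to crash.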
@arun3688 could you please have a look?
@arun3688 can you perchance help out with how to troubleshoot this crash?
I have since found out that the segfault happens when this assert is thrown from one of our functions:
omc_assert(threadData, info, "Model error: Argument of log(T) was %g should be > 0", tmp1);
The argument was -7144.6514982773951, but I don't think that will help a lot.
The problem is really sensitive to the stop time: a tiny variation already avoids the crash, which suggests it will be hard to reproduce (I can only speculate about the reasons). Still, the fact that a segfault happens in the first place is quite concerning.
Right now we just have an FMU whose solver crashes with a segfault instead of failing the assert and exiting cleanly, and I don't know how to proceed from there.
The same model also crashes when running it in OMEdit (so without even exporting an FMU). Further debugging indicates that #9681 could be behind this, and that #12381 and #12440 are related.
I will try to look into this and see if I can find something, but I don't have any idea how to troubleshoot this crash.
@arun3688 see comments in #9681. Maybe you can have a quick chat with @phannebohm about it.
@arun3688 @phannebohm I think I might have located the source of the crash, if it's the same underlying reason as in #9681: https://github.com/OpenModelica/OpenModelica/issues/9681#issuecomment-2393186548
Now on #13056