Random test failures in PRs and on the master

Open AnHeuermann opened this issue 3 years ago • 12 comments

Description

We seem to have some tests in our testsuite that fail at random in our PRs. I looked through my recent PRs and found the following tests, which fail and pass on different Jenkins runs without any changes to the code base:

PR-916, commit e6f58b8

  • build / linux64-asan / test / import_parameter_mapping_inline.lua – OMSimulator
  • build / mingw64-gcc / test / import_parameter_mapping_from_ssm.lua – OMSimulator
  • build / linux64-asan / test / tlm1d.lua – tlm

Some tests also fail on the master branch, e.g. master, commit 2f00f94

  • build / linux64-asan / test / import_parameter_mapping_inline.lua – OMSimulator
  • build / mingw64-gcc / test / import_parameter_mapping_inline.lua – OMSimulator

master, commit e814c71

  • build / mingw32-gcc / test / exportSSMTemplate.py – OMSimulator

master, commit 6616083

  • build / mingw64-gcc / test / Enumeration.lua – OMSimulator

Expected behavior

When a PR passes the tests, the build on master should also succeed.

AnHeuermann avatar Jan 27 '21 14:01 AnHeuermann

And of course I can't reproduce the exact error with asan:

$ docker run --rm -it -v /home/andreas/workspace/OMSimulatorStandalone:/OMSimulator docker.openmodelica.org/build-deps:v1.13 bash

# export ASAN=ON
# echo $ASAN
# cd /OMSimulator/
# make config-3rdParty -j
# make config-OMSimulator -j ASAN=ON
# make OMSimulator -j DISABLE_RUN_OMSIMULATOR_VERSION=1
# ./install/linux/bin/OMSimulator --version
# exit

$ docker run --rm -it -v /home/andreas/workspace/OMSimulatorStandalone:/OMSimulator --cap-add SYS_PTRACE --privileged --oom-kill-disable -m 1024m --memory-swap 1024m docker.openmodelica.org/build-deps:v1.13 bash

# export ASAN=ON
# echo $ASAN
# cd /OMSimulator/
# ./install/linux/bin/OMSimulator --version
# make -C testsuite difftool resources
# cd testsuite/partest
# ./runtests.pl -asan -nocolour -with-xml -j16

Only tlm3d.lua is failing (or too slow).

But if I add ulimit -v 6291456 I start to get problems. Of course I don't have 6GB memory per processor available, but could it be that the limit on memory per processor is a problem?

We are only doing it for ASAN:

  ${env.ASAN ? "" : "ulimit -v 6291456" /* Max 6GB per process */}

Googling reveals there are a lot of issues when using AddressSanitizer with ulimit -v, so I will create a PR without it and see if this solves anything.
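
A minimal sketch of why the two can clash, assuming Linux: AddressSanitizer reserves a very large region of virtual address space for its shadow memory (on the order of terabytes on x86-64), and `ulimit -v` caps exactly that address space, not the memory actually used. The helper `reservation_refused` below is hypothetical and not part of OMSimulator or ASan. It applies the same kind of limit in a child process via `setrlimit(RLIMIT_AS, ...)` and then tries to reserve address space without committing any memory.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper for illustration only.
// Forks a child, caps its virtual address space at `limit_bytes` (what
// `ulimit -v` does, although ulimit takes KiB), then tries to reserve
// `reserve_bytes` of address space without committing memory.
// Returns true if the kernel refused the reservation.
bool reservation_refused(rlim_t limit_bytes, std::size_t reserve_bytes)
{
  pid_t pid = fork();
  if (pid < 0)
    return false; // fork failed, nothing was measured

  if (pid == 0)
  {
    struct rlimit rl;
    rl.rlim_cur = limit_bytes;
    rl.rlim_max = limit_bytes;
    if (setrlimit(RLIMIT_AS, &rl) != 0)
      _exit(2); // could not apply the limit

    // PROT_NONE + MAP_NORESERVE: pure address-space reservation, similar in
    // spirit to how a sanitizer runtime sets aside shadow memory.
    void* p = mmap(nullptr, reserve_bytes, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    _exit(p == MAP_FAILED ? 1 : 0);
  }

  int status = 0;
  if (waitpid(pid, &status, 0) < 0)
    return false;
  return WIFEXITED(status) && WEXITSTATUS(status) == 1;
}
```

With the value from the pipeline, `reservation_refused(rlim_t{6291456} * 1024, std::size_t{1} << 40)` asks for 1 TiB under a 6 GiB cap. The reservation is expected to be refused even though no physical memory is touched, which is the kind of failure an ASan build can hit at startup or during large allocations.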

AnHeuermann avatar Jan 27 '21 15:01 AnHeuermann

My next best theory: We are running out of memory at https://github.com/OpenModelica/OMSimulator/blob/master/src/OMSimulatorLib/MatVer4.cpp#L93 and the Jenkins test reports:

OMSimulator: /var/lib/jenkins2/ws/OMSimulator_PR-916/src/OMSimulatorLib/MatVer4.cpp:96: void oms::appendMatVer4Matrix(FILE*, long int, const char*, size_t, size_t, const void*, oms::MatVer4Type_t): Assertion `header.mrows == rows' failed.
Aborted (core dumped)

AnHeuermann avatar Jan 27 '21 15:01 AnHeuermann

If you think it's due to running out of memory, increase it to 10GB or so. If it stops crashing at that location, perhaps you are right? But that might also mean that we aren't checking if malloc returns 0 everywhere...
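
As a sketch of what such a check could look like (the helper `copy_doubles` is hypothetical and not taken from OMSimulator): the allocation result is tested before use and the failure is reported to the caller instead of being dereferenced. This matters more under `ulimit -v`, because with an address-space cap `malloc` can actually return a null pointer on Linux, whereas without one it rarely does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical helper for illustration only.
// Copies `count` doubles from `src` into a freshly allocated buffer.
// Returns nullptr if count is zero, if the byte size would overflow,
// or if the allocation fails. The caller owns the result and must free() it.
double* copy_doubles(const double* src, std::size_t count)
{
  if (count == 0 || count > SIZE_MAX / sizeof(double))
    return nullptr;

  double* dst = static_cast<double*>(std::malloc(count * sizeof(double)));
  if (dst == nullptr)
  {
    std::fprintf(stderr, "copy_doubles: out of memory (%zu elements)\n", count);
    return nullptr;
  }

  std::memcpy(dst, src, count * sizeof(double));
  return dst;
}
```

A caller would then propagate the failure, for example by returning an error status, rather than continuing with a null buffer and crashing somewhere unrelated later.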

sjoelund avatar Jan 27 '21 15:01 sjoelund

On what machines does that linux-asan pipeline run? Are they all the same? Maybe the test fails only on a specific subset of those machines. I should check that as well.

AnHeuermann avatar Jan 27 '21 15:01 AnHeuermann

On what machines does that linux-asan pipeline run? Are they all the same? Maybe the test fails only on a specific subset of those machines. I should check that as well.

Any machine with the linux label.

sjoelund avatar Jan 28 '21 07:01 sjoelund

Maybe #918 did something, maybe not. Next we should look at build / mingw64-gcc / test / Enumeration.lua – OMSimulator. It is failing on the master every now and then.

AnHeuermann avatar Jan 28 '21 20:01 AnHeuermann

And on build / mingw64-gcc / test / import_parameter_mapping_inline.lua – OMSimulator

info:    model doesn't contain any continuous state
info:    model doesn't contain any continuous state
info:      Instantiation
info:      import_parameter_mapping.co_sim.Input_1              : 20.0
info:      import_parameter_mapping.co_sim.Input_2              : 20.0
info:      import_parameter_mapping.co_sim.Input_3              : 50.0
info:      import_parameter_mapping.co_sim.parameter_1          : -30.0
info:      import_parameter_mapping.co_sim.parameter_2          : -40.0
info:      import_parameter_mapping.co_sim.System1.Input_1      : -100.0
info:      import_parameter_mapping.co_sim.System1.Input_2      : -100.0
info:      import_parameter_mapping.co_sim.System1.parameter_1  : -50.0
info:      import_parameter_mapping.co_sim.System1.parameter_2  : -50.0
info:      import_parameter_mapping.co_sim.System2.Input_1      : 70.0
info:      import_parameter_mapping.co_sim.System2.Input_2      : 70.0
info:      import_parameter_mapping.co_sim.System2.parameter_1  : 70.0
info:      import_parameter_mapping.co_sim.System2.parameter_2  : 70.0
info:    Result file: import_parameter_mapping_res.mat (bufferSize=10)
error:   [createFile] MATWriter::createFile: Permission denied
error:   [initialize] Creating result file failed
error:   [initialize] Initialization of system "import_parameter_mapping.co_sim" failed
info:    Initialization
info:      import_parameter_mapping.co_sim.Input_1              : 20.0
info:      import_parameter_mapping.co_sim.Input_2              : 20.0
info:      import_parameter_mapping.co_sim.Input_3              : 50.0
info:      import_parameter_mapping.co_sim.parameter_1          : -30.0
info:      import_parameter_mapping.co_sim.parameter_2          : -40.0
info:      import_parameter_mapping.co_sim.System1.Input_1      : -100.0
info:      import_parameter_mapping.co_sim.System1.Input_2      : -100.0
info:      import_parameter_mapping.co_sim.System1.parameter_1  : -50.0
info:      import_parameter_mapping.co_sim.System1.parameter_2  : -50.0
info:      import_parameter_mapping.co_sim.System2.Input_1      : 70.0
info:      import_parameter_mapping.co_sim.System2.Input_2      : 70.0
info:      import_parameter_mapping.co_sim.System2.parameter_1  : 70.0
info:      import_parameter_mapping.co_sim.System2.parameter_2  : 70.0
error:   [simulate] Model "import_parameter_mapping" is in wrong model state
info:    Simulation
info:      import_parameter_mapping.co_sim.Input_1              : 20.0
info:      import_parameter_mapping.co_sim.Input_2              : 20.0
info:      import_parameter_mapping.co_sim.Input_3              : 50.0
info:      import_parameter_mapping.co_sim.parameter_1          : -30.0
info:      import_parameter_mapping.co_sim.parameter_2          : -40.0
info:      import_parameter_mapping.co_sim.System1.Input_1      : -100.0
info:      import_parameter_mapping.co_sim.System1.Input_2      : -100.0
info:      import_parameter_mapping.co_sim.System1.parameter_1  : -50.0
info:      import_parameter_mapping.co_sim.System1.parameter_2  : -50.0
info:      import_parameter_mapping.co_sim.System2.Input_1      : 70.0
info:      import_parameter_mapping.co_sim.System2.Input_2      : 70.0
info:      import_parameter_mapping.co_sim.System2.parameter_1  : 70.0
info:      import_parameter_mapping.co_sim.System2.parameter_2  : 70.0
info:    0 warnings
info:    4 errors

AnHeuermann avatar Jan 29 '21 10:01 AnHeuermann

build / linux64-asan / test / tlm1d.lua – tlm

==304==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 80 byte(s) in 1 object(s) allocated from:
    #0 0x7f3aef65b458 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xe0458)
    #1 0x559354bcd8fe in __gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<oms::System* const, std::mutex> > >::allocate(unsigned long, void const*) /usr/include/c++/7/ext/new_allocator.h:111
    #2 0x559354bcc4e5 in std::allocator_traits<std::allocator<std::_Rb_tree_node<std::pair<oms::System* const, std::mutex> > > >::allocate(std::allocator<std::_Rb_tree_node<std::pair<oms::System* const, std::mutex> > >&, unsigned long) /usr/include/c++/7/bits/alloc_traits.h:436
    #3 0x559354bc81fa in std::_Rb_tree<oms::System*, std::pair<oms::System* const, std::mutex>, std::_Select1st<std::pair<oms::System* const, std::mutex> >, std::less<oms::System*>, std::allocator<std::pair<oms::System* const, std::mutex> > >::_M_get_node() (/var/lib/jenkins1/ws/OMSimulator_PR-931/install/linux/bin/OMSimulator+0x4001fa)
    #4 0x559354bc1b8c in std::_Rb_tree_node<std::pair<oms::System* const, std::mutex> >* std::_Rb_tree<oms::System*, std::pair<oms::System* const, std::mutex>, std::_Select1st<std::pair<oms::System* const, std::mutex> >, std::less<oms::System*>, std::allocator<std::pair<oms::System* const, std::mutex> > >::_M_create_node<std::piecewise_construct_t const&, std::tuple<oms::System*&&>, std::tuple<> >(std::piecewise_construct_t const&, std::tuple<oms::System*&&>&&, std::tuple<>&&) (/var/lib/jenkins1/ws/OMSimulator_PR-931/install/linux/bin/OMSimulator+0x3f9b8c)
    #5 0x559354bbc45b in std::_Rb_tree_iterator<std::pair<oms::System* const, std::mutex> > std::_Rb_tree<oms::System*, std::pair<oms::System* const, std::mutex>, std::_Select1st<std::pair<oms::System* const, std::mutex> >, std::less<oms::System*>, std::allocator<std::pair<oms::System* const, std::mutex> > >::_M_emplace_hint_unique<std::piecewise_construct_t const&, std::tuple<oms::System*&&>, std::tuple<> >(std::_Rb_tree_const_iterator<std::pair<oms::System* const, std::mutex> >, std::piecewise_construct_t const&, std::tuple<oms::System*&&>&&, std::tuple<>&&) /usr/include/c++/7/bits/stl_tree.h:2398
    #6 0x559354bb792f in std::map<oms::System*, std::mutex, std::less<oms::System*>, std::allocator<std::pair<oms::System* const, std::mutex> > >::operator[](oms::System*&&) /usr/include/c++/7/bits/stl_map.h:512
    #7 0x559354ba7005 in oms::SystemTLM::readFromSockets(oms::SystemWC*, double, oms::Component*) /var/lib/jenkins2/ws/OMSimulator_PR-931/src/OMSimulatorLib/TLM/SystemTLM.cpp:688
    #8 0x559354c26fa5 in oms::ComponentFMUCS::stepUntil(double) /var/lib/jenkins2/ws/OMSimulator_PR-931/src/OMSimulatorLib/ComponentFMUCS.cpp:633
    #9 0x559354b3f8a4 in oms::SystemWC::stepUntil(double, void (*)(char const*, double, oms_status_enu_t)) /var/lib/jenkins2/ws/OMSimulator_PR-931/src/OMSimulatorLib/SystemWC.cpp:606
    #10 0x559354ba363e in oms::SystemTLM::simulateSubSystem(oms::ComRef, double) /var/lib/jenkins2/ws/OMSimulator_PR-931/src/OMSimulatorLib/TLM/SystemTLM.cpp:587
    #11 0x559354bba71d in oms_status_enu_t std::__invoke_impl<oms_status_enu_t, oms_status_enu_t (oms::SystemTLM::*)(oms::ComRef, double), oms::SystemTLM*, oms::ComRef, double>(std::__invoke_memfun_deref, oms_status_enu_t (oms::SystemTLM::*&&)(oms::ComRef, double), oms::SystemTLM*&&, oms::ComRef&&, double&&) /usr/include/c++/7/bits/invoke.h:73
    #12 0x559354bb615c in std::__invoke_result<oms_status_enu_t (oms::SystemTLM::*)(oms::ComRef, double), oms::SystemTLM*, oms::ComRef, double>::type std::__invoke<oms_status_enu_t (oms::SystemTLM::*)(oms::ComRef, double), oms::SystemTLM*, oms::ComRef, double>(oms_status_enu_t (oms::SystemTLM::*&&)(oms::ComRef, double), oms::SystemTLM*&&, oms::ComRef&&, double&&) /usr/include/c++/7/bits/invoke.h:96
    #13 0x559354bcfb88 in decltype (__invoke((_S_declval<0ul>)(), (_S_declval<1ul>)(), (_S_declval<2ul>)(), (_S_declval<3ul>)())) std::thread::_Invoker<std::tuple<oms_status_enu_t (oms::SystemTLM::*)(oms::ComRef, double), oms::SystemTLM*, oms::ComRef, double> >::_M_invoke<0ul, 1ul, 2ul, 3ul>(std::_Index_tuple<0ul, 1ul, 2ul, 3ul>) /usr/include/c++/7/thread:234
    #14 0x559354bcfa54 in std::thread::_Invoker<std::tuple<oms_status_enu_t (oms::SystemTLM::*)(oms::ComRef, double), oms::SystemTLM*, oms::ComRef, double> >::operator()() /usr/include/c++/7/thread:243
    #15 0x559354bcf9cf in std::thread::_State_impl<std::thread::_Invoker<std::tuple<oms_status_enu_t (oms::SystemTLM::*)(oms::ComRef, double), oms::SystemTLM*, oms::ComRef, double> > >::_M_run() /usr/include/c++/7/thread:186
    #16 0x7f3aeec11732  (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xbe732)
SUMMARY: AddressSanitizer: 80 byte(s) leaked in 1 allocation(s).
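
For readers unfamiliar with the frames above: the leaked block is allocated by `std::map<oms::System*, std::mutex>::operator[]`, which creates a new node the first time a key is looked up. The sketch below reproduces just that behaviour with a stand-in `System` type and a hypothetical `mutex_for` helper; it is not the code from SystemTLM.cpp. On x86-64 with libstdc++ such a node is typically 80 bytes (a 32-byte tree header, an 8-byte key and a 40-byte `std::mutex`), which would match the reported leak size.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>

// Stand-in for oms::System; an assumption made for illustration.
struct System {};

using MutexTable = std::map<System*, std::mutex>;

// Hypothetical helper for illustration only.
// Returns the mutex associated with `sys`. The first lookup of a key
// default-constructs a std::mutex inside a newly allocated map node;
// later lookups of the same key return that same mutex.
std::mutex& mutex_for(MutexTable& table, System* sys)
{
  return table[sys];
}
```

The nodes are released when the map is destroyed. The trace only shows where the node was allocated, not why it became unreachable before the process exited. Note also that calling `operator[]` on one shared map from several threads needs external synchronization, since it may modify the tree.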

lochel avatar Feb 04 '21 15:02 lochel

https://test.openmodelica.org/jenkins/blue/organizations/jenkins/OMSimulator/detail/PR-953/1/tests

+error: [createFile] MATWriter::createFile: Permission denied
+error: [initialize] Creating result file failed
+error: [initialize] Initialization of system "import_parameter_mapping.co_sim" failed

lochel avatar Feb 17 '21 09:02 lochel

https://test.openmodelica.org/jenkins/blue/organizations/jenkins/OMSimulator/detail/PR-957/3/tests

==== Log /tmp/oms-rtest-unknown/tlmtmp_8940/log-tlm1d.lua
Starting TLM simulation.
 Fatal error: Timeout - failed to start all components, give up! (12112 > 12111)
info:    Logging information has been saved to "tlm1d.log"

lochel avatar Feb 19 '21 08:02 lochel

https://test.openmodelica.org/jenkins/blue/organizations/jenkins/OMSimulator/detail/PR-963/1/tests

==== Log /tmp/oms-rtest-unknown/tlmtmp_3598/log-tlm1dfg.lua
Starting TLM simulation.
 Fatal error: Failed to send message header. Aborting.
info:    Logging information has been saved to "tlm1dfg.log"

lochel avatar Feb 23 '21 14:02 lochel

https://test.openmodelica.org/jenkins/blue/organizations/jenkins/OMSimulator/detail/PR-966/8/tests

==== Log /tmp/oms-rtest-unknown/tlmtmp_5922/log-tlm1d.lua
Starting TLM simulation.
Monitoring thread finished.
Manager thread finished.
wc1.P.v [m/s] is equal
wc2.P.v [m/s] is equal
wc1.P.F [N] is equal
wc2.P.F [N] is equal
info:    Logging information has been saved to "tlm1d.log"
=================================================================
==263==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 80 byte(s) in 1 object(s) allocated from:
    #0 0x7f0daea17458 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xe0458)
    #1 0x560865db5e8e in __gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<oms::System* const, std::mutex> > >::allocate(unsigned long, void const*) /usr/include/c++/7/ext/new_allocator.h:111
    #2 0x560865db4a75 in std::allocator_traits<std::allocator<std::_Rb_tree_node<std::pair<oms::System* const, std::mutex> > > >::allocate(std::allocator<std::_Rb_tree_node<std::pair<oms::System* const, std::mutex> > >&, unsigned long) /usr/include/c++/7/bits/alloc_traits.h:436
    #3 0x560865db078a in std::_Rb_tree<oms::System*, std::pair<oms::System* const, std::mutex>, std::_Select1st<std::pair<oms::System* const, std::mutex> >, std::less<oms::System*>, std::allocator<std::pair<oms::System* const, std::mutex> > >::_M_get_node() /usr/include/c++/7/bits/stl_tree.h:588
    #4 0x560865daa1da in std::_Rb_tree_node<std::pair<oms::System* const, std::mutex> >* std::_Rb_tree<oms::System*, std::pair<oms::System* const, std::mutex>, std::_Select1st<std::pair<oms::System* const, std::mutex> >, std::less<oms::System*>, std::allocator<std::pair<oms::System* const, std::mutex> > >::_M_create_node<std::piecewise_construct_t const&, std::tuple<oms::System*&&>, std::tuple<> >(std::piecewise_construct_t const&, std::tuple<oms::System*&&>&&, std::tuple<>&&) /usr/include/c++/7/bits/stl_tree.h:642
    #5 0x560865da5275 in std::_Rb_tree_iterator<std::pair<oms::System* const, std::mutex> > std::_Rb_tree<oms::System*, std::pair<oms::System* const, std::mutex>, std::_Select1st<std::pair<oms::System* const, std::mutex> >, std::less<oms::System*>, std::allocator<std::pair<oms::System* const, std::mutex> > >::_M_emplace_hint_unique<std::piecewise_construct_t const&, std::tuple<oms::System*&&>, std::tuple<> >(std::_Rb_tree_const_iterator<std::pair<oms::System* const, std::mutex> >, std::piecewise_construct_t const&, std::tuple<oms::System*&&>&&, std::tuple<>&&) /usr/include/c++/7/bits/stl_tree.h:2398
    #6 0x560865da0b05 in std::map<oms::System*, std::mutex, std::less<oms::System*>, std::allocator<std::pair<oms::System* const, std::mutex> > >::operator[](oms::System*&&) (/var/lib/jenkins1/ws/OMSimulator_PR-966/install/linux/bin/OMSimulator+0x3ffb05)
    #7 0x560865d902d9 in oms::SystemTLM::readFromSockets(oms::SystemWC*, double, oms::Component*) /var/lib/jenkins2/ws/OMSimulator_PR-966/src/OMSimulatorLib/TLM/SystemTLM.cpp:693
    #8 0x560865e0f56b in oms::ComponentFMUCS::stepUntil(double) /var/lib/jenkins2/ws/OMSimulator_PR-966/src/OMSimulatorLib/ComponentFMUCS.cpp:633
    #9 0x560865d29437 in oms::SystemWC::doStep() /var/lib/jenkins2/ws/OMSimulator_PR-966/src/OMSimulatorLib/SystemWC.cpp:578
    #10 0x560865d2ea7f in oms::SystemWC::stepUntil(double) /var/lib/jenkins2/ws/OMSimulator_PR-966/src/OMSimulatorLib/SystemWC.cpp:821
    #11 0x560865d8c911 in oms::SystemTLM::simulateSubSystem(oms::ComRef, double) /var/lib/jenkins2/ws/OMSimulator_PR-966/src/OMSimulatorLib/TLM/SystemTLM.cpp:592
    #12 0x560865da38f3 in oms_status_enu_t std::__invoke_impl<oms_status_enu_t, oms_status_enu_t (oms::SystemTLM::*)(oms::ComRef, double), oms::SystemTLM*, oms::ComRef, double>(std::__invoke_memfun_deref, oms_status_enu_t (oms::SystemTLM::*&&)(oms::ComRef, double), oms::SystemTLM*&&, oms::ComRef&&, double&&) /usr/include/c++/7/bits/invoke.h:73
    #13 0x560865d9f3f2 in std::__invoke_result<oms_status_enu_t (oms::SystemTLM::*)(oms::ComRef, double), oms::SystemTLM*, oms::ComRef, double>::type std::__invoke<oms_status_enu_t (oms::SystemTLM::*)(oms::ComRef, double), oms::SystemTLM*, oms::ComRef, double>(oms_status_enu_t (oms::SystemTLM::*&&)(oms::ComRef, double), oms::SystemTLM*&&, oms::ComRef&&, double&&) /usr/include/c++/7/bits/invoke.h:96
    #14 0x560865db8118 in decltype (__invoke((_S_declval<0ul>)(), (_S_declval<1ul>)(), (_S_declval<2ul>)(), (_S_declval<3ul>)())) std::thread::_Invoker<std::tuple<oms_status_enu_t (oms::SystemTLM::*)(oms::ComRef, double), oms::SystemTLM*, oms::ComRef, double> >::_M_invoke<0ul, 1ul, 2ul, 3ul>(std::_Index_tuple<0ul, 1ul, 2ul, 3ul>) /usr/include/c++/7/thread:234
    #15 0x560865db7fe4 in std::thread::_Invoker<std::tuple<oms_status_enu_t (oms::SystemTLM::*)(oms::ComRef, double), oms::SystemTLM*, oms::ComRef, double> >::operator()() /usr/include/c++/7/thread:243
    #16 0x560865db7f5f in std::thread::_State_impl<std::thread::_Invoker<std::tuple<oms_status_enu_t (oms::SystemTLM::*)(oms::ComRef, double), oms::SystemTLM*, oms::ComRef, double> > >::_M_run() /usr/include/c++/7/thread:186
    #17 0x7f0dadfcd732  (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xbe732)
SUMMARY: AddressSanitizer: 80 byte(s) leaked in 1 allocation(s).
== 1 out of 1 tests failed [tlm, time: 3]

lochel avatar Feb 25 '21 22:02 lochel