flexbe_behavior_engine icon indicating copy to clipboard operation
flexbe_behavior_engine copied to clipboard

Debug/concurrent container test

Open LoyVanBeek opened this issue 4 years ago • 4 comments

We've noticed that sometimes the test for the ConcurrencyContainer failed, because the calculated sleep_duration was not accurate enough to satisfy the test but certainly looks correct. I think the issue is some rounding or floating point issue, but not sure.

This PR specifies a maximum deviation delta rather than the amount of decimal places.

LoyVanBeek avatar Jan 21 '21 11:01 LoyVanBeek

@pschillinger have you observed something similar?

some more information on this - for better understanding this issue has mainly been observed under Noetic and it results in the following error log:

   Start 2: _ctest_flexbe_core_rostest_test_flexbe_core.test
2: Test command: /root/target_ws/build/flexbe_core/catkin_generated/env_cached.sh "/usr/bin/python3" "/opt/ros/noetic/share/catkin/cmake/test/run_tests.py" "/root/target_ws/build/flexbe_core/test_results/flexbe_core/rostest-test_flexbe_core.xml" "--return-code" "/usr/bin/python3 /opt/ros/noetic/share/rostest/cmake/../../../bin/rostest --pkgdir=/root/target_ws/src/flexbe_behavior_engine/flexbe_core --package=flexbe_core --results-filename test_flexbe_core.xml --results-basedir "/root/target_ws/build/flexbe_core/test_results" /root/target_ws/src/flexbe_behavior_engine/flexbe_core/test/flexbe_core.test "
2: Test timeout computed to be: 10000000
2: [WARN] [1610834257.624376]: Wanted to lock /invalid, but could not find it in current path /subject.
2: [WARN] [1610834257.834378]: Requested outcome for state invalid but active state is subject
2: [WARN] [1610834264.277216]: /before_state
2: UserData object with 1 data entries:
2:   Input Keys: ['data_in']
2:   Output Keys: ['data_out']
2:   Data: {}
2:   Remapping: {'data_in': 'outside'}
2:   Reference: UserData object with 1 data entries:
2:     Input Keys: None
2:     Output Keys: None
2:     Data: {'outside': 'outside_data'}
2:     Remapping: {}
2:     Reference: {}
2: [WARN] [1610834264.278847]: /inner_sm/own_state
2: UserData object with 3 data entries:
2:   Input Keys: ['data_in']
2:   Output Keys: ['data_out']
2:   Data: {}
2:   Remapping: {'data_in': 'own', 'data_out': 'sm_out'}
2:   Reference: UserData object with 3 data entries:
2:     Input Keys: ['sm_in']
2:     Output Keys: ['sm_out']
2:     Data: {'own': 'own_data'}
2:     Remapping: {'sm_in': 'outside'}
2:     Reference: UserData object with 2 data entries:
2:       Input Keys: None
2:       Output Keys: None
2:       Data: {'outside': 'outside_data', 'data_out': 'test_data'}
2:       Remapping: {}
2:       Reference: {}
2: [WARN] [1610834264.280023]: /inner_sm/outside_state
2: UserData object with 5 data entries:
2:   Input Keys: ['data_in']
2:   Output Keys: ['data_out']
2:   Data: {}
2:   Remapping: {'data_in': 'sm_in', 'data_out': 'data_in'}
2:   Reference: UserData object with 5 data entries:
2:     Input Keys: ['sm_in']
2:     Output Keys: ['sm_out']
2:     Data: {'own': 'own_data', 'sm_out': 'inner_data'}
2:     Remapping: {'sm_in': 'outside'}
2:     Reference: UserData object with 3 data entries:
2:       Input Keys: None
2:       Output Keys: None
2:       Data: {'outside': 'outside_data', 'data_out': 'test_data', 'sm_out': 'inner_data'}
2:       Remapping: {}
2:       Reference: {}
2: [WARN] [1610834264.281100]: /inner_sm/internal_state
2: UserData object with 7 data entries:
2:   Input Keys: ['data_in']
2:   Output Keys: ['data_out']
2:   Data: {}
2:   Remapping: {}
2:   Reference: UserData object with 7 data entries:
2:     Input Keys: ['sm_in']
2:     Output Keys: ['sm_out']
2:     Data: {'own': 'own_data', 'sm_out': 'inner_data', 'sm_in': 'outside_data', 'data_in': 'test_data'}
2:     Remapping: {'sm_in': 'outside'}
2:     Reference: UserData object with 3 data entries:
2:       Input Keys: None
2:       Output Keys: None
2:       Data: {'outside': 'outside_data', 'data_out': 'test_data', 'sm_out': 'inner_data'}
2:       Remapping: {}
2:       Reference: {}
2: [WARN] [1610834264.282126]: /after_state
2: UserData object with 3 data entries:
2:   Input Keys: ['data_in']
2:   Output Keys: ['data_out']
2:   Data: {}
2:   Remapping: {'data_in': 'sm_out'}
2:   Reference: UserData object with 3 data entries:
2:     Input Keys: None
2:     Output Keys: None
2:     Data: {'outside': 'outside_data', 'data_out': 'test_data', 'sm_out': 'inner_data'}
2:     Remapping: {}
2:     Reference: {}
2: [WARN] [1610834264.283079]: /modify_state
2: UserData object with 3 data entries:
2:   Input Keys: ['data_in']
2:   Output Keys: ['data_out']
2:   Data: {}
2:   Remapping: {'data_out': 'outside', 'data_in': 'outside'}
2:   Reference: UserData object with 3 data entries:
2:     Input Keys: None
2:     Output Keys: None
2:     Data: {'outside': 'outside_data', 'data_out': 'last_data', 'sm_out': 'inner_data'}
2:     Remapping: {}
2:     Reference: {}
2: [WARN] [1610834264.284106]: /final_state
2: UserData object with 3 data entries:
2:   Input Keys: ['data_in']
2:   Output Keys: ['data_out']
2:   Data: {}
2:   Remapping: {'data_in': 'data_out'}
2:   Reference: UserData object with 3 data entries:
2:     Input Keys: None
2:     Output Keys: None
2:     Data: {'outside': 'test_data', 'data_out': 'last_data', 'sm_out': 'inner_data'}
2:     Remapping: {}
2:     Reference: {}
2: ... logging to /root/.ros/log/rostest-4c57cbbc5be0-18315.log
2: [ROSUNIT] Outputting test results to /root/target_ws/build/flexbe_core/test_results/flexbe_core/rostest-test_flexbe_core.xml
2: [Testcase: testtest_flexbe_core] ... ok
2: 
2: [ROSTEST]-----------------------------------------------------------------------
2: 
2: [flexbe_core.rosunit-test_flexbe_core/test_concurrency_container][FAILURE]------
2: 0.000545359 != 0.1 within 2 places (0.09945464100000001 difference)
2:   File "/usr/lib/python3.8/unittest/case.py", line 60, in testPartExecutor
2:     yield
2:   File "/usr/lib/python3.8/unittest/case.py", line 676, in run
2:     self._callTestMethod(testMethod)
2:   File "/usr/lib/python3.8/unittest/case.py", line 633, in _callTestMethod
2:     method()
2:   File "/root/target_ws/src/flexbe_behavior_engine/flexbe_core/test/test_core.py", line 361, in test_concurrency_container
2:     self.assertAlmostEqual(cc.sleep_duration, .1, places=2)
2:   File "/usr/lib/python3.8/unittest/case.py", line 966, in assertAlmostEqual
2:     raise self.failureException(msg)
2: --------------------------------------------------------------------------------
2: 
2: [flexbe_core.rosunit-test_flexbe_core/test_cross_combinations][passed]
2: [flexbe_core.rosunit-test_flexbe_core/test_event_state][passed]
2: [flexbe_core.rosunit-test_flexbe_core/test_lockable_state][passed]
2: [flexbe_core.rosunit-test_flexbe_core/test_manually_transitionable_state][passed]
2: [flexbe_core.rosunit-test_flexbe_core/test_operatable_state][passed]
2: [flexbe_core.rosunit-test_flexbe_core/test_preemptable_state][passed]
2: [flexbe_core.rosunit-test_flexbe_core/test_ros_state][passed]
2: [flexbe_core.rosunit-test_flexbe_core/test_user_data][passed]
2: 
2: SUMMARY
2:  * RESULT: FAIL
2:  * TESTS: 9
2:  * ERRORS: 0
2:  * FAILURES: 1
2: 
2: rostest log file is in /root/.ros/log/rostest-4c57cbbc5be0-18315.log
2: -- run_tests.py: execute commands
2:   /usr/bin/python3 /opt/ros/noetic/share/rostest/cmake/../../../bin/rostest --pkgdir=/root/target_ws/src/flexbe_behavior_engine/flexbe_core --package=flexbe_core --results-filename test_flexbe_core.xml --results-base-dir "/root/target_ws/build/flexbe_core/test_results" /root/target_ws/src/flexbe_behavior_engine/flexbe_core/test/flexbe_core.test 
2: -- run_tests.py: verify result "/root/target_ws/build/flexbe_core/test_results/flexbe_core/rostest-test_flexbe_core.xml"
2/3 Test mojin-robotics/flexbe_behavior_engine#2: _ctest_flexbe_core_rostest_test_flexbe_core.test .........***Failed    9.17 sec
test 3

ref https://github.com/mojin-robotics/flexbe_behavior_engine/pull/12

fmessmer avatar Jan 21 '21 11:01 fmessmer

Also, notice that the test failed for this specific reason:

2: [flexbe_core.rosunit-test_flexbe_core/test_concurrency_container][FAILURE]------
2: 0.000545359 != 0.1 within 2 places (0.09945464100000001 difference)

This PR only makes that assertion less strict, but the above would still be out of spec. But the test does succeed, so I'm still confused as to why this PR does succeed tests. Maybe some random-ness in ordering or timing going on?

LoyVanBeek avatar Jan 21 '21 12:01 LoyVanBeek

My first assessment would be the following: The test runs a concurrency container with two states both with 10Hz rate. In the first container execute, both states are executed and initialize their rate to execute again after 0.1s. The container then sleeps for the lowest sleep time of the states and in the second container execute, both states should execute again because they have the same rate. Now where this goes wrong could be that during the first container execute, there is a brief delay between the two state executes, e.g., due to an external process interrupt exactly at that moment, so when the second container execute starts, the second state still has like here 0.000545359 seconds on its sleep timer. This would not be notable in practice because the container schedules its next execute for the second state after this very short delay, but in this test case where the executes are called explicitly, it makes a difference.

pschillinger avatar Jan 23 '21 11:01 pschillinger

Aha!

Is there anything I can help with further to find the cause and maybe propose a fix?

LoyVanBeek avatar Jan 24 '21 09:01 LoyVanBeek