[BUG]: Crash of RoutingManagerd version 3.4.10

Open akhzarj opened this issue 1 year ago • 7 comments

vSomeip Version

v.3.4.10

Boost Version

1.78.0

Environment

Target: Test bench with automated tests running
OS: Embedded Linux

Describe the bug

During testing we observed several crashes (around 6) of routingmanagerd.

  • routingmanagerd core dumped with SIGSEGV, Segmentation fault
  • routingmanagerd core dumped with SIGABRT, Aborted

Details in provided back-traces.

Reproduction Steps

Run several hundred (300-400) test loops on the target, each running various applications.

Expected behaviour

routingmanagerd should not crash.

Logs and Screenshots

No response

akhzarj avatar Jun 26 '24 10:06 akhzarj

I am very interested in reproducing this. Could you provide some more details about the "Reproduction Steps", especially which applications were used and what exactly the test loops look like?

lutzbichler avatar Aug 09 '24 05:08 lutzbichler

@akhzarj can you give some indications on how we could reproduce it?

duartenfonseca avatar Sep 24 '24 16:09 duartenfonseca

Hi @duartenfonseca, @lutzbichler. We have already found the root cause: it is related to dangling pointers in implementation/endpoints/src/tcp_client_endpoint_impl.cpp, caused by the dual behavior of strand::dispatch(): https://www.boost.org/doc/libs/1_80_0/doc/html/boost_asio/reference/strand/dispatch.html

When the strand is busy, the passed function is scheduled and executes only after dispatch() and its calling function have returned, so the references to local variables that were passed to it are dangling by then. To reproduce the crash, the relevant strands need to be stressed until they become busy.
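The deferred case can be sketched without Boost. This is only a minimal illustration, assuming a toy `ToyStrand` class that mimics the two behaviors of dispatch() (run now, or queue for later); `ToyStrand`, `send_by_value` and `demo_busy_strand` are hypothetical names and not vsomeip or Boost.Asio code. It shows the safe pattern: the local buffer is captured by value, so the queued handler still has valid data after the caller has returned.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a strand; NOT Boost.Asio's implementation.
// It only models the two outcomes of dispatch(): run now, or run later.
class ToyStrand {
public:
    void set_busy(bool busy) { busy_ = busy; }

    // Idle: the handler runs inside this call.
    // Busy: the handler is queued and runs after the caller has returned.
    void dispatch(std::function<void()> handler) {
        if (busy_) {
            pending_.push(std::move(handler));
        } else {
            handler();
        }
    }

    // Runs every handler that was deferred while the strand was busy.
    void run_pending() {
        while (!pending_.empty()) {
            auto handler = std::move(pending_.front());
            pending_.pop();
            handler();
        }
    }

    std::size_t pending_count() const { return pending_.size(); }

private:
    bool busy_ = false;
    std::queue<std::function<void()>> pending_;
};

// Safe pattern: `buffer` is a local that dies when this function returns,
// so the handler captures it by value and owns its own copy.
// Capturing it as `&buffer` would leave the queued handler with a
// dangling reference in the busy case.
void send_by_value(ToyStrand& strand, std::vector<std::string>& sink) {
    std::string buffer = "payload";
    strand.dispatch([buffer, &sink] { sink.push_back(buffer); });
}

// Walks through the busy case and returns what the deferred handler wrote.
std::string demo_busy_strand() {
    ToyStrand strand;
    std::vector<std::string> sink;   // outlives the handler, so & is fine
    strand.set_busy(true);
    send_by_value(strand, sink);     // handler is queued, not executed
    assert(sink.empty());            // nothing has run yet
    strand.run_pending();            // runs after send_by_value() returned
    return sink.at(0);
}
```

With an idle strand the handler runs inside `dispatch()` and a reference capture would happen to work, which is presumably why the problem only shows up under load.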

The fix is to remove the references at the following locations:

  • https://github.com/COVESA/vsomeip/blob/0b83e24d16e1611958194e9b727136522f46556b/implementation/endpoints/src/tcp_client_endpoint_impl.cpp#L272
  • https://github.com/COVESA/vsomeip/blob/0b83e24d16e1611958194e9b727136522f46556b/implementation/endpoints/src/tcp_client_endpoint_impl.cpp#L773
  • https://github.com/COVESA/vsomeip/blob/0b83e24d16e1611958194e9b727136522f46556b/implementation/endpoints/src/tcp_client_endpoint_impl.cpp#L801
  • https://github.com/COVESA/vsomeip/blob/0b83e24d16e1611958194e9b727136522f46556b/implementation/endpoints/src/tcp_client_endpoint_impl.cpp#L951

Notes:

  • In the last one, the fix additionally replaces the lambda with std::bind() because values captured by a lambda are immutable by default; the alternative is to make the lambda mutable.
  • The fix has not been checked against the latest version of vsomeip, but you can easily check whether any new or updated strand::dispatch() calls contain the same issue.
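The first note can be illustrated with a small sketch. It assumes a hypothetical function `drain()` that needs a non-const reference to a buffer; none of these names come from vsomeip. A lambda that captures by value has a const call operator unless it is declared `mutable`, so its copy cannot be passed as a non-const reference. std::bind stores its own copy of the argument and passes it on as a non-const lvalue, which avoids the problem.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical consumer that needs a non-const buffer because it empties it.
int drain(std::vector<int>& buffer) {
    int sum = 0;
    for (int value : buffer) {
        sum += value;
    }
    buffer.clear();
    return sum;
}

// Option 1: capture by value and mark the lambda `mutable`.
// Without `mutable`, `buffer` would be const inside the body and the
// call to drain() would not compile.
std::function<int()> make_handler_mutable(std::vector<int> buffer) {
    return [buffer]() mutable { return drain(buffer); };
}

// Option 2: std::bind keeps a copy of `buffer` inside the bind object and
// passes that copy to drain() as a non-const lvalue on each call.
std::function<int()> make_handler_bind(std::vector<int> buffer) {
    return std::bind(&drain, buffer);
}

// Both handlers own their copy, so they stay valid after the factory
// function has returned, and both may modify that copy.
void check_handlers() {
    auto with_lambda = make_handler_mutable({1, 2, 3});
    auto with_bind = make_handler_bind({1, 2, 3});
    assert(with_lambda() == 6);
    assert(with_bind() == 6);
    assert(with_lambda() == 0);   // the owned copy was cleared by drain()
    assert(with_bind() == 0);
}
```

Either option keeps the by-value ownership that removes the dangling reference; which one reads better is a matter of taste.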

akhzarj avatar Oct 10 '24 07:10 akhzarj

Hi @akhzarj, if I remember correctly, this is the same issue we discussed some time ago in the monthly meeting. The fix is not yet in master, but I asked @kheaactua to create a PR with the fix. Can you have a look at #774 and update it? It seems it does not contain all the changes. Thanks! :)

fcmonteiro avatar Oct 10 '24 07:10 fcmonteiro

hmm, it looks like I am missing: https://github.com/COVESA/vsomeip/blob/0b83e24d16e1611958194e9b727136522f46556b/implementation/endpoints/src/tcp_client_endpoint_impl.cpp#L801

I'll add that now.

kheaactua avatar Oct 10 '24 12:10 kheaactua

Hi @fcmonteiro, yes, you remember correctly, and with the latest update from @kheaactua, PR #774 contains the fix that we have.

akhzarj avatar Oct 11 '24 10:10 akhzarj