vsomeip [BUG]: vsomeip slow to establish communication with lots of EventGroup

vSomeip Version

v3.4.10

Boost Version

1.82

Environment

Android and QNX

Describe the bug

My automotive system has a *.fidl with ~3500 attributes, one per CAN signal. My *.fdepl maps each attribute to its own EventGroup.

Any time the network connection is established, or broken and re-established, I get an avalanche of ~3500 subscribes followed by ~3500 acknowledgements, transmitted one per frame. The entire sequence does not fit inside a 2-second Service Discovery interval. When the work does not complete within the timeout interval, routingmanager issues StopSubscribe and SubscribeNack. The system retries, but recovery takes a long time, at least a couple of Service Discovery intervals.

The train logic is supposed to aggregate these messages, sending a train only once it is full or 5 ms have elapsed, but there are several places in the code that prevent this.
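To make the intended behavior concrete, here is a minimal sketch of "depart when full or when 5 ms have elapsed" batching. It is not vsomeip's actual train implementation: the `Train` class, its `add` and `poll` methods, and the byte limit are hypothetical names and parameters chosen for illustration only.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of train-style batching; not vsomeip's real class.
class Train {
public:
    using Clock = std::chrono::steady_clock;
    using Bytes = std::vector<std::uint8_t>;

    Train(std::size_t max_bytes, std::chrono::milliseconds max_wait)
        : max_bytes_(max_bytes), max_wait_(max_wait) {}

    // Queue one message. If the message does not fit into the current
    // train, the current train departs first and is returned to the
    // caller for sending; otherwise an empty vector is returned.
    Bytes add(const Bytes& msg, Clock::time_point now) {
        Bytes departed;
        if (!buffer_.empty() && buffer_.size() + msg.size() > max_bytes_) {
            departed = depart();
        }
        if (buffer_.empty()) {
            first_queued_ = now;  // the wait timer starts with the first passenger
        }
        buffer_.insert(buffer_.end(), msg.begin(), msg.end());
        return departed;
    }

    // Called periodically (for example from a timer). Returns the pending
    // train once its oldest message has waited at least max_wait_.
    Bytes poll(Clock::time_point now) {
        if (!buffer_.empty() && now - first_queued_ >= max_wait_) {
            return depart();
        }
        return {};
    }

private:
    // Hand over the buffered bytes and leave the train empty.
    Bytes depart() {
        Bytes out;
        out.swap(buffer_);
        return out;
    }

    std::size_t max_bytes_;
    std::chrono::milliseconds max_wait_;
    Bytes buffer_;
    Clock::time_point first_queued_{};
};
```

With a train like this, a burst of small SUBSCRIBE messages queued within the same 5 ms window would leave in one frame instead of one frame each, provided nothing else forces an early departure.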

Reproduction Steps

This behavior is easily reproduced when the system has a *.fidl with 1000s of attributes and *.fdepl puts each into a unique EventGroup.

Subscribe to all ~3500 attributes, then run ifconfig down; sleep 10; ifconfig up to break and re-establish the network connection. Capture the traffic with tcpdump and observe the network behavior.

Expected behaviour

The train logic should do a "pretty good job" of aggregating many SUBSCRIBE and many SUBSCRIBEACK into each Service Discovery packet.

Logs and Screenshots

With the existing code you should see 1000s of back-to-back SUBSCRIBE like:

5039	9.333908	10.6.0.3	10.6.0.10	SOME/IP-SD	86	SOME/IP Service Discovery Protocol [SubscribeNack]
5040	9.334271	10.6.0.10	10.6.0.3	SOME/IP-SD	104	SOME/IP Service Discovery Protocol [Subscribe]
5041	9.335307	10.6.0.10	10.6.0.3	SOME/IP-SD	98	SOME/IP Service Discovery Protocol [Subscribe]
5042	9.335710	10.6.0.10	10.6.0.3	SOME/IP-SD	114	SOME/IP Service Discovery Protocol [Subscribe]
5043	9.336492	10.6.0.10	10.6.0.3	SOME/IP-SD	98	SOME/IP Service Discovery Protocol [Subscribe]
5044	9.336762	10.6.0.10	10.6.0.3	TCP	66	36651 → 30510 [FIN, ACK] Seq=142 Ack=1 Win=64256 Len=0 TSval=269564273 TSecr=2

Each is ~98 bytes and travels in a separate packet, with nothing or almost nothing aggregated. In this region we see a SUBSCRIBENACK and a socket close because the entire sequence exceeded the 2 s Service Discovery timeout interval.
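As a rough sanity check on the timing, the four Subscribe frames in the capture above (timestamps 9.334271 through 9.336492) are spaced about 0.74 ms apart on average. Assuming that spacing is representative of the whole burst, which four samples cannot prove, ~3500 one-per-frame Subscribes alone would take roughly 2.6 s, before counting the acknowledgements. The helper names below are made up for this estimate.

```cpp
#include <cstddef>

// Mean gap between consecutive capture timestamps, in seconds.
// Returns 0 when fewer than two timestamps are given.
double mean_gap_seconds(const double* timestamps, std::size_t count) {
    if (count < 2) {
        return 0.0;
    }
    return (timestamps[count - 1] - timestamps[0]) /
           static_cast<double>(count - 1);
}

// Estimated duration of a burst of `frames` frames sent one per gap.
double burst_duration_seconds(double gap_seconds, std::size_t frames) {
    return gap_seconds * static_cast<double>(frames);
}
```

Feeding in the four timestamps and 3500 frames gives an estimate above the 2 s interval, which is consistent with the SUBSCRIBENACK seen in the capture.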

Apr 10 '24 18:04 joeyoravec

I've opened draft pull requests:

#671
#670

with the code changes that I've applied locally to address this issue. I would appreciate any feedback on the approach.

Apr 10 '24 20:04 joeyoravec

I've updated the pull request for 3.4.x (but not 3.1.x) with an additional commit for a problem discovered in testing. I was getting this warning:

Received an unreliable vSomeIP SD message with too short length field local: 10.6.0.10:30490 remote: 10.6.0.3:30490

and the root-cause was here: https://github.com/COVESA/vsomeip/blob/6c0e9db200fbcfd37879c4b2ff0c8523a29d8eb5/implementation/endpoints/src/udp_server_endpoint_impl.cpp#L682-L690

on_message_received supports multiple messages in a single UDP frame, but it only processes a message in one of two cases:

if the message is not SOMEIP-SD
else if the message is SOMEIP-SD and there’s no subsequent message in the frame

After changing the train logic to aggregate multiple SOMEIP-SD messages into a single UDP frame, we want it to process all messages found in the frame, no matter whether the messages are SOMEIP or SOMEIP-SD.
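The desired receive-side behavior can be sketched as a loop that walks every message in the frame using the SOME/IP length field, which counts the bytes following it (so a complete message is 8 bytes plus that value). This is an illustration only and not the code in udp_server_endpoint_impl.cpp; `MessageView` and `split_frame` are hypothetical names. The SD check relies on the SOME/IP-SD message ID 0xFFFF8100.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical view onto one SOME/IP message inside a received UDP frame.
struct MessageView {
    const std::uint8_t* data;  // start of the message's header
    std::size_t size;          // full message size, header included
    bool is_sd;                // true for SOME/IP-SD (message ID 0xFFFF8100)
};

constexpr std::size_t kHeaderSize = 16;             // full SOME/IP header
constexpr std::size_t kLengthOffset = 4;            // length follows the message ID
constexpr std::uint32_t kMinLength = 8;             // request ID through return code
constexpr std::uint32_t kSdMessageId = 0xFFFF8100u; // service 0xFFFF, method 0x8100

// Read a big-endian 32-bit value.
inline std::uint32_t read_be32(const std::uint8_t* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
           static_cast<std::uint32_t>(p[3]);
}

// Split a UDP frame into all complete messages it contains, SD or not.
// Stops at the first malformed or truncated message.
std::vector<MessageView> split_frame(const std::uint8_t* frame,
                                     std::size_t frame_size) {
    std::vector<MessageView> messages;
    std::size_t offset = 0;
    while (frame_size - offset >= kHeaderSize) {
        const std::uint8_t* msg = frame + offset;
        const std::uint32_t length = read_be32(msg + kLengthOffset);
        if (length < kMinLength) {
            break;  // length field too short to cover the rest of the header
        }
        // The length field excludes the 8 bytes of message ID and length.
        const std::uint64_t total = static_cast<std::uint64_t>(length) + 8;
        if (total > frame_size - offset) {
            break;  // message claims more bytes than the frame holds
        }
        const bool is_sd = read_be32(msg) == kSdMessageId;
        messages.push_back({msg, static_cast<std::size_t>(total), is_sd});
        offset += static_cast<std::size_t>(total);
    }
    return messages;
}
```

A caller would then dispatch every returned view, rather than handling an SD message only when it happens to be the last one in the frame.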

May 22 '24 16:05 joeyoravec

Hi @joeyoravec, I have been trying to reproduce your problem in my environment so that we can validate the fix, but I am having some problems. I used one of the CommonAPI examples (link) with the following configurations: example_configs.zip

Can you check if these make sense? Or provide the ones you used so that I can check them.

Thanks!

Aug 28 '24 17:08 duartenfonseca
