erpc
[BUG] Race condition: two threads get the expected-reply error
Describe the bug
In short: when two client threads send requests to the server at the same time, each may receive the reply meant for the other and therefore return kErpcStatus_ExpectedReply because of the wrong sequence number.
Deep dive into the scenario: when the simple (non-arbitrated) client is used, the function performClientRequest sends and receives without holding a mutex across both operations (the send and the receive are each locked individually inside the framed transport class).
That may lead to the following sequence of steps:
- thread A sends its request
- a context switch happens and thread B runs and sends its request (likely if B has higher priority, tried to send while A was in the middle of its send, and was blocked on the send lock until A finished sending)
- thread B enters receive and blocks or busy-waits for a response
- the server gets request A (as it was sent first) and responds to it
- thread B gets the response to A, returns kErpcStatus_ExpectedReply due to the wrong sequence number, and releases the lock
- thread A enters receive
- the server gets request B (as it was sent second) and responds to it
- thread A gets the response to B and returns kErpcStatus_ExpectedReply due to the wrong sequence number
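The steps above can be replayed deterministically without any real threads. This is only a minimal sketch: `FakeLink`, `receiveReply`, `replayInterleaved` and `replaySerialized` are hypothetical names for illustration, not eRPC types, and the fake server simply echoes the sequence number in arrival order.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for the status codes; not the eRPC enum.
enum class Status { Success, ExpectedReply };

// In-order link: the fake server answers requests in arrival order and
// echoes the sequence number. Whoever calls receive() first gets the
// oldest pending reply, regardless of who sent the request.
struct FakeLink {
    std::deque<uint32_t> replies;

    void send(uint32_t seq) { replies.push_back(seq); }

    uint32_t receive() {
        assert(!replies.empty());
        uint32_t seq = replies.front();
        replies.pop_front();
        return seq;
    }
};

// Receive half of a request: compare the reply's sequence number with ours.
Status receiveReply(FakeLink &link, uint32_t expectedSeq) {
    return link.receive() == expectedSeq ? Status::Success : Status::ExpectedReply;
}

struct Outcome {
    Status a;
    Status b;
};

// Replays the interleaving from the list: A sends, B sends, B receives, A receives.
Outcome replayInterleaved() {
    FakeLink link;
    link.send(1);                      // thread A sends (sequence 1)
    link.send(2);                      // thread B sends (sequence 2)
    Status b = receiveReply(link, 2);  // B receives first and gets reply 1
    Status a = receiveReply(link, 1);  // A receives next and gets reply 2
    return {a, b};
}

// Same two requests, but each send is immediately followed by its receive.
Outcome replaySerialized() {
    FakeLink link;
    link.send(1);
    Status a = receiveReply(link, 1);
    link.send(2);
    Status b = receiveReply(link, 2);
    return {a, b};
}
```

In this model `replayInterleaved()` reports the sequence mismatch for both threads, while `replaySerialized()` succeeds for both, which matches the behavior described in this report.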
It should be noted that this probably won't happen with the arbitrated client, where all clients assign their sequence numbers and the arbitrator wakes up the relevant thread according to the sequence number.
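To illustrate why routing by sequence number avoids the mix-up, here is a rough sketch of the idea. It is not the eRPC arbitrator implementation: `ReplyRouter`, `registerRequest`, `deliver`, `waitFor` and `demoRouting` are hypothetical names, and a single receiver is assumed to call `deliver` for every incoming reply.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical router: each waiting thread is woken only by the reply
// that carries its own sequence number.
class ReplyRouter {
public:
    // Called by a client thread before it sends; reserves a slot.
    uint32_t registerRequest() {
        std::lock_guard<std::mutex> lock(m_mutex);
        uint32_t seq = ++m_nextSeq;
        m_pending[seq];  // default-constructed, not-yet-ready slot
        return seq;
    }

    // Called by the single receiver for every incoming reply.
    void deliver(uint32_t seq, std::string payload) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            auto it = m_pending.find(seq);
            if (it == m_pending.end()) {
                return;  // nobody is waiting for this sequence number
            }
            it->second.payload = std::move(payload);
            it->second.ready = true;
        }
        m_cv.notify_all();
    }

    // Blocks the calling thread until the reply for seq has been delivered.
    std::string waitFor(uint32_t seq) {
        std::unique_lock<std::mutex> lock(m_mutex);
        assert(m_pending.count(seq) == 1);  // must be registered first
        m_cv.wait(lock, [&] { return m_pending[seq].ready; });
        std::string out = std::move(m_pending[seq].payload);
        m_pending.erase(seq);
        return out;
    }

private:
    struct Slot {
        bool ready = false;
        std::string payload;
    };
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::map<uint32_t, Slot> m_pending;
    uint32_t m_nextSeq = 0;
};

// Two threads wait while the replies arrive in the "wrong" order.
bool demoRouting() {
    ReplyRouter router;
    uint32_t seqA = router.registerRequest();
    uint32_t seqB = router.registerRequest();
    std::string gotA, gotB;
    std::thread a([&] { gotA = router.waitFor(seqA); });
    std::thread b([&] { gotB = router.waitFor(seqB); });
    router.deliver(seqB, "reply-B");  // B's reply arrives first
    router.deliver(seqA, "reply-A");
    a.join();
    b.join();
    return gotA == "reply-A" && gotB == "reply-B";
}
```

Because each reply is matched to its slot by sequence number, the order in which threads start waiting no longer matters in this sketch.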
To Reproduce
Run two client threads with different priorities in long send/receive loops.
Expected behavior
Each thread gets its own response.
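One way to get that behavior with a simple client would be to hold a single lock across the whole request, so the send and the matching receive cannot be separated. The following is a sketch under assumptions, not a patch for eRPC: `FakeTransport`, `SerializedClient` and `countMismatches` are hypothetical, and the fake transport just echoes sequence numbers in order, with send and receive each locked individually as in the scenario above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical transport: send and receive are each locked on their own,
// but nothing ties a send to the receive that follows it.
struct FakeTransport {
    std::mutex ioMutex;
    std::deque<uint32_t> replies;

    void send(uint32_t seq) {
        std::lock_guard<std::mutex> lock(ioMutex);
        replies.push_back(seq);  // fake server echoes the sequence number
    }

    uint32_t receive() {
        std::lock_guard<std::mutex> lock(ioMutex);
        assert(!replies.empty());
        uint32_t seq = replies.front();
        replies.pop_front();
        return seq;
    }
};

// Hypothetical client wrapper: one mutex covers send + receive together.
class SerializedClient {
public:
    // Returns true when the reply carries the sequence number we sent.
    bool request() {
        std::lock_guard<std::mutex> lock(m_requestMutex);
        uint32_t seq = ++m_sequence;
        m_transport.send(seq);
        return m_transport.receive() == seq;
    }

private:
    std::mutex m_requestMutex;
    FakeTransport m_transport;
    uint32_t m_sequence = 0;
};

// Runs several threads in long request loops and counts sequence mismatches.
int countMismatches(int threads, int iterations) {
    SerializedClient client;
    std::atomic<int> mismatches{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                if (!client.request()) {
                    ++mismatches;
                }
            }
        });
    }
    for (auto &w : workers) {
        w.join();
    }
    return mismatches.load();
}
```

The trade-off is that requests are fully serialized: a thread blocked in receive keeps every other thread out until its reply arrives.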
Screenshots
Not applicable
Desktop (please complete the following information)
- OS: linux
- eRPC Version: 1.10.0
Steps you didn't forget to do
- [x] I checked if there is no related issue opened/closed.
- [x] I checked that there doesn't exist opened PR which is solving this issue.
Additional context
I think it is related to #374
I have a related issue I haven't filed yet. It may be related to this one, though I haven't fully pinned down what is going on. But I had a question: is it supported behavior to call client methods from different threads, or should they be synchronized? I know the ArbitratedClientManager handles calls between the server thread and the client thread, but what about multiple client threads? Is that a supported use case?
If you are using the arbitrated client manager, I think you are OK with multiple threads calling in parallel (there is a theoretical race condition if a timeout occurs at the same time a response arrives).
Thanks, I'll continue to try to debug my problem and see if I can create a reproducing test case.