Fast-DDS
Deadlock in v2.6.2
Is there an already existing issue for this?
- [X] I have searched the existing issues
Expected behavior
No deadlock occurs at startup
Current behavior
A deadlock occurs frequently at startup.
Steps to reproduce
The deadlock occurs in the following scenario:

##1 Thread3 acquires a lock on mp_mutex.
##2 Thread2 acquires a shared lock on endpoints_list_mutex.
##3 Thread2 tries to acquire mp_mutex, but blocks because Thread3 already holds it.
##4 Thread1 tries to acquire the write lock on endpoints_list_mutex, but blocks because of the reader from ##2.
https://github.com/eProsima/Fast-DDS/blob/5076ebc0c5d030cac6225b94e18ef5b17c996ef3/include/fastrtps/utils/shared_mutex.hpp#L69-L72
The write_entered flag is now set, so subsequent shared-lock attempts on endpoints_list_mutex are blocked.
https://github.com/eProsima/Fast-DDS/blob/5076ebc0c5d030cac6225b94e18ef5b17c996ef3/include/fastrtps/utils/shared_mutex.hpp#L98-L101
##5 Thread3 tries to acquire a shared lock on endpoints_list_mutex, but blocks because of the write_entered flag.
The result is a cycle: Thread3 waits for Thread1's pending write, Thread1 waits for Thread2's shared lock, and Thread2 waits for Thread3's mp_mutex.
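To illustrate why a writer that is only waiting can already block new readers (##4 and ##5), here is a minimal sketch of a writer-preferring shared mutex. It is not the Fast-DDS implementation: the class WriterPreferringSharedMutex, its writer_pending() helper and the function pending_writer_blocks_new_readers() are hypothetical names made up for this example, and only the write_entered idea is taken from the linked shared_mutex.hpp lines. It uses try_lock_shared instead of a blocking lock_shared so the demonstration terminates instead of deadlocking.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical, simplified writer-preferring shared mutex (illustration only).
class WriterPreferringSharedMutex
{
public:
    void lock()
    {
        std::unique_lock<std::mutex> lk(mut_);
        // Wait for any earlier writer, then announce this writer.
        gate1_.wait(lk, [this] { return !write_entered_; });
        write_entered_ = true;
        // From here on new readers are refused; wait for active readers to drain.
        gate2_.wait(lk, [this] { return readers_ == 0; });
    }

    void unlock()
    {
        std::lock_guard<std::mutex> lk(mut_);
        write_entered_ = false;
        gate1_.notify_all();
    }

    void lock_shared()
    {
        std::unique_lock<std::mutex> lk(mut_);
        // Blocks while a writer holds the lock OR is merely waiting for it.
        gate1_.wait(lk, [this] { return !write_entered_; });
        ++readers_;
    }

    bool try_lock_shared()
    {
        std::lock_guard<std::mutex> lk(mut_);
        if (write_entered_)
        {
            return false;
        }
        ++readers_;
        return true;
    }

    void unlock_shared()
    {
        std::lock_guard<std::mutex> lk(mut_);
        if (--readers_ == 0 && write_entered_)
        {
            gate2_.notify_one();
        }
    }

    // Helper for the demo only: true once a writer has set write_entered_.
    bool writer_pending()
    {
        std::lock_guard<std::mutex> lk(mut_);
        return write_entered_;
    }

private:
    std::mutex mut_;
    std::condition_variable gate1_;
    std::condition_variable gate2_;
    unsigned readers_ = 0;
    bool write_entered_ = false;
};

// Returns true if a reader arriving after a pending writer is refused.
bool pending_writer_blocks_new_readers()
{
    WriterPreferringSharedMutex endpoints_list_mutex;
    endpoints_list_mutex.lock_shared();            // ##2: a reader holds the lock

    std::thread writer([&endpoints_list_mutex] {   // ##4: writer queues behind the reader
        endpoints_list_mutex.lock();
        endpoints_list_mutex.unlock();
    });

    // Wait until the writer has announced itself.
    while (!endpoints_list_mutex.writer_pending())
    {
        std::this_thread::yield();
    }

    // ##5: a new reader arrives while the writer is still waiting.
    bool admitted = endpoints_list_mutex.try_lock_shared();
    if (admitted)
    {
        endpoints_list_mutex.unlock_shared();
    }

    endpoints_list_mutex.unlock_shared();          // first reader leaves, writer proceeds
    writer.join();
    return !admitted;
}
```

In the reported scenario the first reader (Thread2) never reaches its unlock because it is stuck on mp_mutex, so the pending writer never proceeds and the late reader (Thread3) waits forever while holding mp_mutex.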
Fast DDS version/commit
v2.6.2
Platform/Architecture
Ubuntu Focal 20.04 amd64
Transport layer
Default configuration, UDPv4 & SHM
Additional context
With the same code, no deadlock occurs on v2.6.0.
XML configuration file
No response
Relevant log output
No response
Network traffic capture
No response
@MiguelCompany @eProsima/team this is a deadlock issue, just a friendly ping.
@Barry-Xu-2018 @fujitatomoya There's a proposed fix in #2976, could you check with it?
@MiguelCompany thanks! we will try that out and get back to you.
@Barry-Xu-2018 @fujitatomoya Did you have time to check whether #2976 fixes this?
@MiguelCompany I will check the evaluation status and get back to you soon.
@MiguelCompany Based on the changed code, I think it fixes this problem. Fujita-san will provide the final evaluation result from the real environment.
Fujita-san
that is me 😄 family name!
@fujitatomoya hello, how is the final evaluation of #2976 going?
Sorry, we confirmed that no deadlock is observed with this PR.
@fujitatomoya thx~
Closing based on https://github.com/eProsima/Fast-DDS/issues/2961#issuecomment-1292271447