Fast-DDS
Shared mem partition fills up after many runs: how to garbage collect?
Is there an already existing issue for this?
- [X] I have searched the existing issues
Expected behavior
I should be able to turn my system on and off many times with it operating the same every time.
Current behavior
After bringing up the system a number of times, my /dev/shm tmpfs partition fills up, no matter how big it is. Currently, mine is 32GB and it still fills up.
Steps to reproduce
Restart a participant many times in a loop until it can no longer create new shared mem segments.
Fast DDS version/commit
2.6.0-3jammy.20220520.002055
Platform/Architecture
Ubuntu Focal 20.04 amd64
Transport layer
Default configuration, UDPv4 & SHM
Additional context
When I run FastDDS participants many times, my /dev/shm partition fills up with segments until no more can be created. This is with /dev/shm being 32 GB in size. Does FastDDS not have any garbage collection capabilities, or does it just rely on machine restart?
This is running in containers with ipc: host and network: host
XML configuration file
No response
Relevant log output
[RTPS_TRANSPORT_SHM Error] Failed to create segment fastrtps_796f9c10effb4cf1: No such file or directory -> Function Segment
[RTPS_MSG_OUT Error] No such file or directory -> Function init
[RTPS_PARTICIPANT Error] Unable to Register SHM Transport. SHM Transport is not supported in the current platform.
Network traffic capture
No response
Hi @Aposhian,
The shared memory files are cleaned up if the application using Fast DDS exits cleanly using the corresponding API to delete the DDS DomainParticipant. If your application is crashing or you are closing it without calling DomainParticipantFactory::delete_participant then the files are going to be kept. Nevertheless, Fast DDS tries to reuse the shared memory files from previous runs. However, as the application has not exited cleanly, some of these files can be still marked as blocked (you may find this comment helpful).
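To illustrate what "exiting cleanly" means here, below is a minimal sketch of a shutdown pattern in which SIGINT/SIGTERM only request a stop and the teardown always runs afterwards. It does not use the Fast DDS API: `create`, `destroy`, and `step` are hypothetical callbacks. In a real application, `destroy` would be the place where `DomainParticipantFactory::delete_participant` ends up being called.

```python
import signal
import threading


def run_until_signalled(create, destroy, step,
                        signals=(signal.SIGINT, signal.SIGTERM)):
    """Call step(resource) until a signal arrives, then always call destroy(resource).

    create/destroy/step are hypothetical callbacks standing in for
    participant creation, participant deletion, and the application's work.
    """
    stop = threading.Event()

    def request_stop(signum, frame):
        # Only record the request; the real teardown happens in the main flow.
        stop.set()

    # Install our handlers and remember the previous ones so they can be restored.
    previous = {sig: signal.signal(sig, request_stop) for sig in signals}
    try:
        resource = create()
        try:
            while not stop.is_set():
                step(resource)
        finally:
            # Runs on signal-triggered exit and on exceptions raised by step().
            destroy(resource)
    finally:
        for sig, handler in previous.items():
            # signal.signal() may have returned None for handlers not set from Python.
            signal.signal(sig, handler if handler is not None else signal.SIG_DFL)
```

This must be called from the main thread, since Python only allows installing signal handlers there. It cannot help with SIGKILL or a crash, which is the case where the shared memory files are left behind.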
I am using this from ROS2. I am typically stopping applications with SIGINT. Does rmw_fastrtps make a call to DomainParticipantFactory::delete_participant in that event?
Is it safe to run fastdds shm clean while nodes are running?
Is it safe to run fastdds shm clean while nodes are running?
It is safe, the result looks like this:
root@csc:/work/ros2_ws# fastdds shm clean
shm.clean:
4 ports in use
2 segments in use
0 zombie ports cleaned
0 zombie segments cleaned
root@csc:/work/ros2_ws# ls /dev/shm/
fast_datasharing_01.0f.3d.85.a9.09.eb.fa.01.00.00.00_0.0.12.3 fastrtps_port7412 fastrtps_port7415
fast_datasharing_01.0f.3d.85.b6.09.19.4d.01.00.00.00_0.0.12.4 fastrtps_port7412_el fastrtps_port7415_el
fastrtps_1820ff8c0c530a42 fastrtps_port7413 sem.fastrtps_port7412_mutex
fastrtps_1820ff8c0c530a42_el fastrtps_port7413_el sem.fastrtps_port7413_mutex
fastrtps_b7e2475640426c8c fastrtps_port7414 sem.fastrtps_port7414_mutex
fastrtps_b7e2475640426c8c_el fastrtps_port7414_el sem.fastrtps_port7415_mutex
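To see how much of the partition these files occupy, a small script can group them by name prefix. This is only a sketch: the prefixes are taken from the listing above and may not cover every file Fast DDS creates, and the sizes are the apparent file sizes reported by `stat`, which on a tmpfs can differ from the memory actually committed.

```python
import os

# Prefixes as seen in the listing above; more specific prefixes come first
# so that "fastrtps_port..." is not counted under the generic "fastrtps_".
PREFIXES = ("fastrtps_port", "fastrtps_", "fast_datasharing_", "sem.fastrtps_")


def shm_usage_by_prefix(shm_dir="/dev/shm", prefixes=PREFIXES):
    """Return {prefix: (file_count, total_apparent_bytes)} for matching files."""
    totals = {}
    for entry in os.scandir(shm_dir):
        if not entry.is_file(follow_symlinks=False):
            continue
        for prefix in prefixes:
            if entry.name.startswith(prefix):
                count, size = totals.get(prefix, (0, 0))
                size += entry.stat(follow_symlinks=False).st_size
                totals[prefix] = (count + 1, size)
                break  # count each file under the first matching prefix only
    return totals


if __name__ == "__main__":
    for prefix, (count, size) in sorted(shm_usage_by_prefix().items()):
        print(f"{prefix:<20} {count:>5} files {size:>14} bytes")
```

Running this before and after a restart loop shows which kind of file is accumulating.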
@Aposhian
After bringing up the system a number of times...
fastdds shm clean is a Python script. You can call it when bringing the system down to ensure the temporary files are cleaned up.
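As a rough sketch of how this could be wired into a shutdown script, the helper below runs the command and parses the counts from its report. The `fastdds shm clean` command is the one discussed in this thread; the function names are made up for illustration, and the parser assumes the "N description" line format shown in the output pasted above, which may differ between versions.

```python
import re
import subprocess


def clean_fastdds_shm(command=("fastdds", "shm", "clean"), timeout=30):
    """Run the cleanup command; return (returncode, stdout) or (None, reason)."""
    try:
        result = subprocess.run(list(command), capture_output=True,
                                text=True, timeout=timeout)
    except FileNotFoundError:
        return None, f"command not found: {command[0]}"
    except subprocess.TimeoutExpired:
        return None, f"command timed out after {timeout} s"
    return result.returncode, result.stdout


def parse_clean_report(text):
    """Turn lines such as '0 zombie segments cleaned' into a {label: count} dict."""
    report = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(\d+)\s+(.+?)\s*$", line)
        if match:
            report[match.group(2)] = int(match.group(1))
    return report


if __name__ == "__main__":
    code, output = clean_fastdds_shm()
    if code is None:
        print("cleanup skipped:", output)
    else:
        print(parse_clean_report(output))
```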
@JLBuenoLopez-eProsima
Would it be possible to add a feature to Fast DDS that automatically cleans up broken shared memory files? I think this may be helpful for users.
Hi @llapx,
Including such a feature is not on the Fast DDS roadmap. If you are interested, you can open a ticket in the corresponding forum and see if there is enough community support. You may also be interested in contacting the Fast DDS support team for commercial support.
I am using this from ROS2. I am typically stopping applications with SIGINT. Does rmw_fastrtps make a call to DomainParticipantFactory::delete_participant in that event?
@Aposhian, rmw_fastrtps_shared_cpp::destroy_participant handles the DomainParticipant destruction cleanly. I do not know how the ROS 2 stack signal handling works and if this method is called in case of SIGINT. I suppose that question should be asked in some other place.
I think this issue has been answered and can be closed. Would you mind doing it, @Aposhian? Otherwise, let me know why you consider the issue should be kept open. Maybe we should consider moving it to the Q&A forum, where, according to the Fast DDS contributing guidelines, questions should be kept.
I can see there is a way forward by using fastdds shm clean, but I still think this presents a bad user experience. For someone who is using FastDDS with the default config, they may not even exactly know that shared memory is being used, or how shared memory is being used. They use the system for a while, and it works, and then someday it may break with the cryptic error messages that I posted above. The error messages could be better if it said something like "Unable to create new shared memory segments: is your shared memory partition full? Try running fastdds shm clean." Or, if a shared memory segment is unable to be created, then fastdds could automatically try fastdds shm clean and retry (with a corresponding warning message as to what is going on).
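The clean-and-retry idea could look roughly like the sketch below. This is not how Fast DDS behaves today and not its API: `create_segment` and `clean` are hypothetical callables supplied by the caller, and the error text is the wording proposed above.

```python
import logging

logger = logging.getLogger("shm_retry_sketch")

HINT = ("Unable to create new shared memory segments: is your shared memory "
        "partition full? Try running fastdds shm clean.")


def create_with_cleanup_retry(create_segment, clean):
    """Try to create a segment; on OSError, clean up once and retry once.

    create_segment and clean are hypothetical callables, not Fast DDS functions.
    """
    try:
        return create_segment()
    except OSError as first_error:
        # Tell the user what is happening instead of failing silently.
        logger.warning("Segment creation failed (%s); running cleanup and "
                       "retrying once.", first_error)
        clean()
        try:
            return create_segment()
        except OSError as second_error:
            # Still failing after cleanup: surface the actionable message.
            raise RuntimeError(HINT) from second_error
```

Retrying only once keeps the failure path bounded: if cleanup does not free enough space, the caller gets the actionable message instead of an endless loop.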
@JLBuenoLopez-eProsima How about updating the message to be more user friendly, as @Aposhian suggested above? I think the current message does not make it clear to the user what to do to solve the problem.
I encountered the same issue on Android, and I never found a solution. I had to disable shared memory. This problem occurred after compiling the security module, and the problem still exists after undoing the modification again.
I have created a ticket to track the enhancement of improving the log messages when initializing the SHM Transport (#3578). I am going to close this issue because the question was answered and there is a ticket tracking the improvement.