Draft: Support timed locking in PlanningSceneMonitor
This replaces the shared_mutex of the PlanningSceneMonitor with a shared_timed_mutex that allows specifying a timeout for lock attempts. The scoped planning scene wrappers LockedPlanningSceneRO and LockedPlanningSceneRW now support passing a timeout duration and may fail with a TryLockTimeoutException if the lock cannot be acquired in time. This is very useful for making sure the calling thread is not blocked for too long when the lock is contended.
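For illustration, a minimal sketch of the intended usage, assuming the timeout is passed as a std::chrono duration to the wrapper constructor and that TryLockTimeoutException lives in the planning_scene_monitor namespace (both are assumptions; the draft may differ):

```cpp
#include <chrono>
#include <moveit/planning_scene_monitor/planning_scene_monitor.h>
#include <ros/console.h>

void checkCollisionWithDeadline(const planning_scene_monitor::PlanningSceneMonitorPtr& psm)
{
  using namespace std::chrono_literals;
  try
  {
    // Read-lock the scene, but give up after 5 ms instead of blocking indefinitely
    planning_scene_monitor::LockedPlanningSceneRO scene(psm, 5ms);  // assumed constructor overload
    bool colliding = scene->isStateColliding();
    (void)colliding;
  }
  catch (const planning_scene_monitor::TryLockTimeoutException& ex)  // assumed exception type
  {
    ROS_WARN_STREAM("Could not acquire planning scene read lock in time: " << ex.what());
  }
}
```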
Additionally, the new scoped lock wrapper TimedLock allows tracking lock sources (i.e. caller names), start time, waiting time, and the duration for which the lock is held. An optional DiagnosticStatus publisher can be enabled for debugging these lock events at runtime.
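As a rough sketch of the shape such a wrapper can take (the member names, caller-tag mechanism, and exception type here are illustrative, not the PR's actual API):

```cpp
#include <chrono>
#include <shared_mutex>
#include <stdexcept>
#include <string>

// Illustrative sketch of a timed scoped lock that records who locked and for how long.
// For brevity this takes the exclusive lock; a read-side variant would use try_lock_shared_for().
class TimedLock
{
public:
  TimedLock(std::shared_timed_mutex& mutex, const std::string& caller, std::chrono::milliseconds timeout)
    : mutex_(mutex), caller_(caller), wait_start_(std::chrono::steady_clock::now())
  {
    if (!mutex_.try_lock_for(timeout))  // give up instead of blocking forever
      throw std::runtime_error("TimedLock: '" + caller_ + "' timed out waiting for lock");
    lock_start_ = std::chrono::steady_clock::now();
  }

  ~TimedLock()
  {
    mutex_.unlock();
    // Waiting time: lock_start_ - wait_start_; held time: now - lock_start_.
    // These are the per-caller values a DiagnosticStatus publisher could report.
  }

private:
  std::shared_timed_mutex& mutex_;
  std::string caller_;
  std::chrono::steady_clock::time_point wait_start_;
  std::chrono::steady_clock::time_point lock_start_;
};
```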
The scope-less locking functions (lockSceneRead(), etc.) are deprecated since they are error-prone and don't work well with the diagnostics feature.
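The hazard the deprecation addresses, in a small illustration (doWork() is a placeholder for arbitrary scene access):

```cpp
void doWork();  // placeholder

void unsafeVsSafe(const planning_scene_monitor::PlanningSceneMonitorPtr& psm)
{
  // Error-prone: if doWork() throws, the read lock is never released.
  psm->lockSceneRead();
  doWork();
  psm->unlockSceneRead();

  // Safe: the RAII wrapper releases the lock on every exit path, including exceptions.
  {
    planning_scene_monitor::LockedPlanningSceneRO scene(psm);
    doWork();
  }
}
```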
My motivation for this was a lengthy debugging session spent getting rid of spurious deadlocks and performance issues. Especially in scenarios where the planning scene monitor is used in a time-constrained loop (servo collision checking, online sampling for controller commands), it's critical to understand where time is being spent and which threads are being blocked. I'm not convinced that the DiagnosticStatus publisher should be merged as is (moveit_profiler would do a similar job), but the timed locking feature seems really useful and safer to me.
Remaining TODOs:
- [ ] Handle locking of octomap_monitor
- [ ] Keep track of lock owners and print names with warning/error/exception
- [ ] Move boilerplate code into separate header
Just a high-level comment. I wonder what the overhead of a shared_timed_mutex is compared to a regular shared_mutex. Answers here suggest it can be significant. Should benchmark.
Good point! I'd be surprised if there was a measurable difference between shared_timed_mutex (without timeouts) and shared_mutex, since they both implement the same reader/writer behavior. The wrapper class TimedLock, however, could actually have a performance impact; we should make sure that this is not the case.
> I wonder what the overhead of a shared_timed_mutex is compared to a regular shared_mutex. Answers here suggest it can be significant. Should benchmark.
Do you have an idea how to benchmark this? If so, we could integrate it into https://github.com/ros-planning/moveit/issues/2717
@captain-yoshi and @felixvd, here's a script that is good for benchmarking repeated, high-frequency reads of the planning scene: https://github.com/ros-planning/moveit/pull/2698#issuecomment-853091796
To properly test this PR, some writes to the planning scene should also be included (to mimic normal operation), for example updating joint angles at 100-200 Hz or so.
I'm also curious how std::mutex would perform compared to shared_mutex.
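A self-contained micro-benchmark sketch along these lines, comparing all three mutex types under a mixed read/write load (the thread counts and rates are illustrative, not taken from the linked script):

```cpp
#include <atomic>
#include <chrono>
#include <cstdio>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hammer the lock with readers while a writer updates at ~200 Hz,
// then report reads/second for each mutex type.
template <typename Mutex, typename ReadLock>
double measureReadsPerSecond(const char* name)
{
  Mutex mutex;
  std::atomic<bool> done{ false };
  std::atomic<long> reads{ 0 };
  int shared_value = 0;

  std::thread writer([&] {
    while (!done)
    {
      { std::unique_lock<Mutex> lock(mutex); ++shared_value; }
      std::this_thread::sleep_for(std::chrono::milliseconds(5));  // ~200 Hz writes
    }
  });

  std::vector<std::thread> readers;
  for (int i = 0; i < 4; ++i)
    readers.emplace_back([&] {
      while (!done)
      {
        ReadLock lock(mutex);
        volatile int v = shared_value;  // simulate a cheap read under the lock
        (void)v;
        ++reads;
      }
    });

  std::this_thread::sleep_for(std::chrono::seconds(2));
  done = true;
  writer.join();
  for (auto& t : readers)
    t.join();

  double rate = reads / 2.0;  // reads per second over the 2 s window
  std::printf("%s: %.0f reads/s\n", name, rate);
  return rate;
}

int main()
{
  // std::mutex has no shared mode, so its "readers" take the exclusive lock.
  measureReadsPerSecond<std::mutex, std::unique_lock<std::mutex>>("std::mutex");
  measureReadsPerSecond<std::shared_mutex, std::shared_lock<std::shared_mutex>>("std::shared_mutex");
  measureReadsPerSecond<std::shared_timed_mutex, std::shared_lock<std::shared_timed_mutex>>("std::shared_timed_mutex");
}
```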
From your description it looks like there are some things left to do to clean up this patch.
I only skimmed through the changes for now. My main concern is backward compatibility. Is this patch supposed to be API-compatible except for deprecating the non-RAII locking API (which I definitely agree with)?