SMACC
SMACC copied to clipboard
Segfault when transitioning from state with a high frequency callback client behaviour
I've come across what might be a bug.
In my system, I have a CB subscribed to a high-frequency teleop topic with the standard SMACC signal system, which then publishes messages onto another topic via another SMACC client. However, I've experienced segfaults occurring during state transitions when publishing a message from within the subscriber CB callback. GDB confirmed my suspicion that I'm referencing a nullptr
when trying to publish a message via the client pointer. I suspect the CB's onExit()
has been called and all the class members have been deallocated before the client pointer is dereferenced and the message is published, triggering a segfault.
The event timeline looks something like this:
- topic data published -> onMessageReceived() triggered -> dereference class pointer to client -> SEGFAULT
- transition triggered -> call onExit() -> destroy CB
Does this make sense? I've added an example node here that hopefully make things clearer. The issue seems to be that there's no way to tell when a pointer to a client has been deallocated inside a CB callback, except for maybe doing a check for nullptr
before attempting to deallocate e.g.
if(cl_pub_ == nullptr) return;
cl_pub_->publish(msg)
but this is effectively a race condition (the pointer can still be deallocated between these instructions). This issue only seems to become apparent with high frequency callback triggers - lower frequency CB's manage to destroy themselves more predictably
The obvious solution to me would be to replace the naked pointers with std::shared_ptr<TClient>
instead and return std::weak_ptr<TClient>
to the CBs via the requiresClient()
calls. On other words, a typical CB would go from
class CbExample : public smacc::ISmaccClientBehavior {
public:
void onEntry() override {
if(!this->requiresClient(cl_pub_)) {
ROS_ERROR("Client doesn't exist");
return;
}
std_msgs::Empty msg;
cl_pub_->publish(msg);
}
private:
ClPub* cl_pub_;
};
to
class CbExample : public smacc::ISmaccClientBehavior {
public:
void onEntry() override {
cl_pub_ = this->requiresClient();
if(auto lock = cl_pub_.lock()) {
std_msgs::Empty msg;
lock->publish(msg);
}
else {
ROS_WARN("ClPub went out of scope");
}
}
private:
std::weak_ptr<ClPub> cl_pub_;
};
In my opinion, this has significant safety benefits and will prevent silly errors like the one I described above. I'm happy to look at this and open a PR. However, this refactor will need significant changes to SMACC's API. Thoughts?