create publisher deadlock
Is there an already existing issue for this?
- [x] I have searched the existing issues
Expected behavior
Creating a publisher from a thread that is not a listener thread should succeed, but the call gets stuck.
Current behavior
Thread 80 (Thread 0xfffeb37fa900 (LWP 18039)):
#0 __lll_lock_wait (futex=futex@entry=0xaaab0537e8e0, private=0) at lowlevellock.c:52
#1 0x0000ffff83f71cd8 in __GI___pthread_mutex_lock (mutex=0xaaab0537e8e0) at pthread_mutex_lock.c:80
#2 0x0000ffff84f53f64 in eprosima::fastdds::rtps::RTPSParticipantImpl::createSendResources(eprosima::fastdds::rtps::Endpoint*) () at ../lib/libjaiotcppsdk.so
#3 0x0000ffff84f55e10 in eprosima::fastdds::rtps::RTPSParticipantImpl::create_writer(eprosima::fastdds::rtps::RTPSWriter**, eprosima::fastdds::rtps::WriterAttributes&, eprosima::fastdds::rtps::WriterHistory*, eprosima::fastdds::rtps::WriterListener*, eprosima::fastdds::rtps::EntityId_t const&, bool) () at ../lib/libjaiotcppsdk.so
#4 0x0000ffff84f7d758 in eprosima::fastdds::rtps::RTPSDomainImpl::create_rtps_writer(eprosima::fastdds::rtps::RTPSParticipant*, eprosima::fastdds::rtps::EntityId_t const&, eprosima::fastdds::rtps::WriterAttributes&, eprosima::fastdds::rtps::WriterHistory*, eprosima::fastdds::rtps::WriterListener*) () at ../lib/libjaiotcppsdk.so
#5 0x0000ffff84df5898 in eprosima::fastdds::dds::DataWriterImpl::enable() () at ../lib/libjaiotcppsdk.so
#6 0x0000ffff84ffd55c in eprosima::fastdds::statistics::dds::DataWriterImpl::enable() () at ../lib/libjaiotcppsdk.so
#7 0x0000ffff84debe28 in eprosima::fastdds::dds::DataWriter::enable() () at ../lib/libjaiotcppsdk.so
#8 0x0000ffff84e00938 in eprosima::fastdds::dds::PublisherImpl::create_datawriter(eprosima::fastdds::dds::Topic*, eprosima::fastdds::dds::DataWriterImpl*, eprosima::fastdds::dds::StatusMask const&) () at ../lib/libjaiotcppsdk.so
#9 0x0000ffff84e00c44 in eprosima::fastdds::dds::PublisherImpl::create_datawriter(eprosima::fastdds::dds::Topic*, eprosima::fastdds::dds::DataWriterQos const&, eprosima::fastdds::dds::DataWriterListener*, eprosima::fastdds::dds::StatusMask const&, std::shared_ptr<eprosima::fastdds::rtps::IPayloadPool>) () at ../lib/libjaiotcppsdk.so
#10 0x0000ffff84dfd630 in eprosima::fastdds::dds::Publisher::create_datawriter(eprosima::fastdds::dds::Topic*, eprosima::fastdds::dds::DataWriterQos const&, eprosima::fastdds::dds::DataWriterListener*, eprosima::fastdds::dds::StatusMask const&, std::shared_ptr<eprosima::fastdds::rtps::IPayloadPool>) () at ../lib/libjaiotcppsdk.so
#11 0x0000ffff84ca0c70 in ja::FastDDSPublisher::FastDDSPublisher(eprosima::fastdds::dds::DomainParticipant*, eprosima::fastdds::dds::TypeSupport const&, std::__cxx11::basic_string<char, std::char_traits
Steps to reproduce
- In a business-logic thread (not a listener callback):
- Publish a message; if no publisher exists for the topic yet, create one first.
- The create call gets stuck. It does not happen every time.
Fast DDS version/commit
Ubuntu 20.04.6 LTS
Platform/Architecture
Ubuntu Focal 20.04 arm64
Transport layer
Shared Memory Transport (SHM)
Additional context
create participant:
InfoL << "fast-dds ipc channel start initializing";
auto factory = DomainParticipantFactory::get_instance();
DomainParticipantQos pqos = PARTICIPANT_QOS_DEFAULT;
pqos.setup_transports(eprosima::fastdds::rtps::BuiltinTransports::LARGE_DATA);
pqos.transport().use_builtin_transports = false;
std::shared_ptr<SharedMemTransportDescriptor> shm_transport_ =
std::make_shared<SharedMemTransportDescriptor>();
pqos.transport().user_transports.push_back(shm_transport_);
pqos.wire_protocol().builtin.discovery_config.leaseDuration = Duration_t(3, 0);
pqos.wire_protocol().builtin.discovery_config.leaseDuration_announcementperiod = Duration_t(1, 0);
participant = factory->create_participant(0, pqos, this, StatusMask::none());
if (participant == nullptr) {
throw std::runtime_error("Participant initialization failed");
}
type.register_type(participant);
create publisher
FastDDSPublisher::FastDDSPublisher(DomainParticipant *participant, const TypeSupport &type,
const std::string &topic_name) {
InfoL << "creating fastdds publisher,topic=" << topic_name;
this->topic_name = topic_name;
// Create the publisher
PublisherQos pub_qos = PUBLISHER_QOS_DEFAULT;
participant->get_default_publisher_qos(pub_qos);
publisher = participant->create_publisher(pub_qos, nullptr, StatusMask::none());
if (publisher == nullptr) {
ErrorL << "publisher initialization failed,topic=" << topic_name;
}
// Create the topic
TopicQos topic_qos = TOPIC_QOS_DEFAULT;
participant->get_default_topic_qos(topic_qos);
topic = participant->create_topic(topic_name, type.get_type_name(), topic_qos);
if(topic == nullptr) {
InfoL << "create topic failed,try find,topic=" << topic_name;
topic = participant->find_topic(topic_name, {1, 0});
}
if (topic == nullptr) {
ErrorL << "topic initialization failed,topic=" << topic_name;
}
// Create the data writer
DataWriterQos writer_qos = DATAWRITER_QOS_DEFAULT;
publisher->get_default_datawriter_qos(writer_qos);
writer = publisher->create_datawriter(topic, writer_qos, this, StatusMask::all());
if (writer == nullptr) {
ErrorL << "dataWriter initialization failed,topic=" << topic_name;
}
matched = 0;
InfoL << "create fastdds publisher success,topic=" << topic_name;
}
XML configuration file
no xml
Relevant log output
Network traffic capture
No response
Hi @kubbo,
Thanks for your contribution. Could you please provide a reproducer for this issue, or at least share the stack of the second thread involved in the deadlock?
That mutex is mainly used during the creation of sending resources and later during sending, so it should not block in the way you are experiencing.
It is generally hard to find the cause from the top of a single thread's stack alone. Judging from your stack, the visible symptom is that PipelineDockerManager is blocked in createSendResources, waiting for a lock. It may be an ABBA deadlock, but nothing here shows which thread holds the m_send_resources_mutex_ used by createSendResources. You could attach gdb, select the frame with mutex=0xaaab0537e8e0, and run p *mutex to see which thread owns it. On glibc, the __owner field in that output should be the LWP id of the holding thread, which you can match against the LWP numbers in the thread list.
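For readers unfamiliar with the term, here is a minimal sketch of what an ABBA pattern looks like and one standard way to avoid it. It does not reproduce Fast DDS internals: `Resources`, `other_mutex`, `create_path`, `send_path`, and `run_demo` are hypothetical names, and `send_resources_mutex` is only a stand-in for the mutex in the stack above. Nothing in this issue confirms that the real hang is an ABBA deadlock. The two paths name the locks in opposite orders, which could deadlock with plain sequential `lock()` calls. `std::lock` acquires both with a deadlock-avoidance algorithm instead.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for two locks that two code paths both need.
struct Resources {
    std::mutex send_resources_mutex;  // stand-in for the mutex seen in the stack
    std::mutex other_mutex;           // stand-in for some second, unknown lock
    int counter = 0;                  // shared state guarded by both locks
};

// Path 1 names the locks in the order: other, then send resources.
// With two separate lock() calls this would be the "A then B" half.
void create_path(Resources& r, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        // std::lock acquires both mutexes without risking deadlock,
        // regardless of the order in which other threads name them.
        std::lock(r.other_mutex, r.send_resources_mutex);
        std::lock_guard<std::mutex> g1(r.other_mutex, std::adopt_lock);
        std::lock_guard<std::mutex> g2(r.send_resources_mutex, std::adopt_lock);
        ++r.counter;
    }
}

// Path 2 names the same locks in the opposite order ("B then A").
void send_path(Resources& r, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock(r.send_resources_mutex, r.other_mutex);
        std::lock_guard<std::mutex> g1(r.send_resources_mutex, std::adopt_lock);
        std::lock_guard<std::mutex> g2(r.other_mutex, std::adopt_lock);
        ++r.counter;
    }
}

// Runs both paths concurrently and returns the final counter value.
int run_demo(int iterations) {
    Resources r;
    std::thread t1(create_path, std::ref(r), iterations);
    std::thread t2(send_path, std::ref(r), iterations);
    t1.join();
    t2.join();
    // Both threads finished, so every increment happened under both locks.
    assert(r.counter == 2 * iterations);
    return r.counter;
}
```

Calling `run_demo(1000)` from a `main` should return once both threads finish. If each path instead called `lock()` on its first mutex and then on its second, the two threads could each hold one lock and wait forever for the other, which is the situation the `p *mutex` check is meant to confirm or rule out.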