
create publisher deadlock

Open kubbo opened this issue 8 months ago • 3 comments

Is there an already existing issue for this?

  • [x] I have searched the existing issues

Expected behavior

The publisher is not created in a listener thread, so creating it should complete, but the call gets stuck.

Current behavior

Thread 80 (Thread 0xfffeb37fa900 (LWP 18039)):
#0 __lll_lock_wait (futex=futex@entry=0xaaab0537e8e0, private=0) at lowlevellock.c:52
#1 0x0000ffff83f71cd8 in __GI___pthread_mutex_lock (mutex=0xaaab0537e8e0) at pthread_mutex_lock.c:80
#2 0x0000ffff84f53f64 in eprosima::fastdds::rtps::RTPSParticipantImpl::createSendResources(eprosima::fastdds::rtps::Endpoint*) () at ../lib/libjaiotcppsdk.so
#3 0x0000ffff84f55e10 in eprosima::fastdds::rtps::RTPSParticipantImpl::create_writer(eprosima::fastdds::rtps::RTPSWriter**, eprosima::fastdds::rtps::WriterAttributes&, eprosima::fastdds::rtps::WriterHistory*, eprosima::fastdds::rtps::WriterListener*, eprosima::fastdds::rtps::EntityId_t const&, bool) () at ../lib/libjaiotcppsdk.so
#4 0x0000ffff84f7d758 in eprosima::fastdds::rtps::RTPSDomainImpl::create_rtps_writer(eprosima::fastdds::rtps::RTPSParticipant*, eprosima::fastdds::rtps::EntityId_t const&, eprosima::fastdds::rtps::WriterAttributes&, eprosima::fastdds::rtps::WriterHistory*, eprosima::fastdds::rtps::WriterListener*) () at ../lib/libjaiotcppsdk.so
#5 0x0000ffff84df5898 in eprosima::fastdds::dds::DataWriterImpl::enable() () at ../lib/libjaiotcppsdk.so
#6 0x0000ffff84ffd55c in eprosima::fastdds::statistics::dds::DataWriterImpl::enable() () at ../lib/libjaiotcppsdk.so
#7 0x0000ffff84debe28 in eprosima::fastdds::dds::DataWriter::enable() () at ../lib/libjaiotcppsdk.so
#8 0x0000ffff84e00938 in eprosima::fastdds::dds::PublisherImpl::create_datawriter(eprosima::fastdds::dds::Topic*, eprosima::fastdds::dds::DataWriterImpl*, eprosima::fastdds::dds::StatusMask const&) () at ../lib/libjaiotcppsdk.so
#9 0x0000ffff84e00c44 in eprosima::fastdds::dds::PublisherImpl::create_datawriter(eprosima::fastdds::dds::Topic*, eprosima::fastdds::dds::DataWriterQos const&, eprosima::fastdds::dds::DataWriterListener*, eprosima::fastdds::dds::StatusMask const&, std::shared_ptr<eprosima::fastdds::rtps::IPayloadPool>) () at ../lib/libjaiotcppsdk.so
#10 0x0000ffff84dfd630 in eprosima::fastdds::dds::Publisher::create_datawriter(eprosima::fastdds::dds::Topic*, eprosima::fastdds::dds::DataWriterQos const&, eprosima::fastdds::dds::DataWriterListener*, eprosima::fastdds::dds::StatusMask const&, std::shared_ptr<eprosima::fastdds::rtps::IPayloadPool>) () at ../lib/libjaiotcppsdk.so
#11 0x0000ffff84ca0c70 in ja::FastDDSPublisher::FastDDSPublisher(eprosima::fastdds::dds::DomainParticipant*, eprosima::fastdds::dds::TypeSupport const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (this=0xfffebc058270, participant=0xaaab053ffb40, type=..., topic_name="sys/00000000002/34010000006011156400_acs800/thing/event/pipelineTaskEvent/info") at /tmp/tmp.86aVfdYZFw/3rdparty/ja-IPCChannel/src/channel/fastdds/FastDDSPublisher.cpp:57
#12 0x0000ffff84c958c8 in __gnu_cxx::new_allocator<ja::FastDDSPublisher>::construct<ja::FastDDSPublisher, eprosima::fastdds::dds::DomainParticipant*&, eprosima::fastdds::dds::TypeSupport&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(ja::FastDDSPublisher*, eprosima::fastdds::dds::DomainParticipant*&, eprosima::fastdds::dds::TypeSupport&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (this=0xfffeb37f8d10, __p=0xfffebc058270) at /usr/include/c++/9/ext/new_allocator.h:146

Steps to reproduce

  1. Run in a business (application) thread.
  2. Publish a message; if no publisher exists for the topic yet, create one.
  3. Publisher creation gets stuck. This does not happen every time.
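
The "create the publisher on first publish" pattern in the steps above can be sketched roughly as follows. This is only an illustration under assumptions: `StubPublisher` and `LazyPublisherRegistry` are hypothetical names that do not appear in the report, and the stub stands in for `ja::FastDDSPublisher` so that the sketch compiles without Fast DDS.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for ja::FastDDSPublisher. The real constructor would
// call create_publisher(), create_topic() and create_datawriter().
struct StubPublisher {
    explicit StubPublisher(std::string topic) : topic_name(std::move(topic)) {}
    void publish(const std::string& /*payload*/) { ++sent; }
    std::string topic_name;
    int sent = 0;
};

// Hypothetical registry that creates a publisher the first time a topic is used.
class LazyPublisherRegistry {
public:
    // Returns the publisher for `topic`, creating it on first use.
    std::shared_ptr<StubPublisher> get_or_create(const std::string& topic) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = publishers_.find(topic);
        if (it == publishers_.end()) {
            // In the reported setup, this is the point where the call chain
            // into create_datawriter() would start.
            it = publishers_.emplace(topic, std::make_shared<StubPublisher>(topic)).first;
            ++created_;
        }
        return it->second;
    }

    // Publishes `payload` on `topic`, creating the publisher if needed.
    void publish(const std::string& topic, const std::string& payload) {
        std::shared_ptr<StubPublisher> pub = get_or_create(topic);
        assert(pub != nullptr);
        pub->publish(payload);
    }

    // Number of publishers created so far.
    int created() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return created_;
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, std::shared_ptr<StubPublisher>> publishers_;
    int created_ = 0;
};
```

The issue does not say whether the application holds a lock of its own while constructing the publisher. If it does, as this sketch does with `mutex_`, that lock is one candidate to check when looking for a lock-ordering cycle.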

Fast DDS version/commit

Ubuntu 20.04.6 LTS

Platform/Architecture

Ubuntu Focal 20.04 arm64

Transport layer

Shared Memory Transport (SHM)

Additional context

create participant:


    InfoL << "fast-dds ipc channel start initializing";
    auto factory = DomainParticipantFactory::get_instance();

    DomainParticipantQos pqos = PARTICIPANT_QOS_DEFAULT;
    pqos.setup_transports(eprosima::fastdds::rtps::BuiltinTransports::LARGE_DATA);

    pqos.transport().use_builtin_transports = false;
    std::shared_ptr<SharedMemTransportDescriptor> shm_transport_ =
            std::make_shared<SharedMemTransportDescriptor>();
    pqos.transport().user_transports.push_back(shm_transport_);

    pqos.wire_protocol().builtin.discovery_config.leaseDuration = Duration_t(3, 0);
    pqos.wire_protocol().builtin.discovery_config.leaseDuration_announcementperiod = Duration_t(1, 0);

    participant = factory->create_participant(0, pqos, this, StatusMask::none());

    if (participant == nullptr) {
        throw std::runtime_error("Participant initialization failed");
    }

    type.register_type(participant);

create publisher:


    FastDDSPublisher::FastDDSPublisher(DomainParticipant *participant, const TypeSupport &type,
                                       const std::string &topic_name) {

        InfoL << "creating fastdds publisher,topic=" << topic_name;
        this->topic_name = topic_name;

        // Create the publisher
        PublisherQos pub_qos = PUBLISHER_QOS_DEFAULT;
        participant->get_default_publisher_qos(pub_qos);
        publisher = participant->create_publisher(pub_qos, nullptr, StatusMask::none());
        if (publisher == nullptr) {
            ErrorL << "publisher initialization failed,topic=" << topic_name;
        }

        // Create the topic
        TopicQos topic_qos = TOPIC_QOS_DEFAULT;
        participant->get_default_topic_qos(topic_qos);

        topic = participant->create_topic(topic_name, type.get_type_name(), topic_qos);
        if (topic == nullptr) {
            InfoL << "create topic failed,try find,topic=" << topic_name;
            topic = participant->find_topic(topic_name, {1, 0});
        }

        if (topic == nullptr) {
            ErrorL << "topic initialization failed,topic=" << topic_name;
        }

        // Create the data writer
        DataWriterQos writer_qos = DATAWRITER_QOS_DEFAULT;

        publisher->get_default_datawriter_qos(writer_qos);
        writer = publisher->create_datawriter(topic, writer_qos, this, StatusMask::all());
        if (writer == nullptr) {
            ErrorL << "dataWriter initialization failed,topic=" << topic_name;
        }
        matched = 0;
        InfoL << "create fastdds publisher success,topic=" << topic_name;
    }

XML configuration file

No XML configuration file is used.

Relevant log output


Network traffic capture

No response

kubbo commented on May 14 '25 05:05

Hi @kubbo,

Thanks for your contribution. Could you please provide a reproducer for this issue, or at least share the stack of the second thread involved in the deadlock?

That mutex is mainly used during the creation of sending resources and later during sending, so it should not block in the way you are experiencing.

cferreiragonz commented on May 14 '25 06:05

The stacks of all threads in the process are attached:

stack (6).txt

kubbo commented on May 16 '25 02:05

It is generally hard to find the cause from the lock frame at the top of a blocked stack alone. Judging from your stacks, the visible symptom is that the PipelineDockerManager thread is waiting for a lock inside createSendResources. This may be an ABBA deadlock, but nothing in the stacks shows which thread holds the m_send_resources_mutex_ that createSendResources is waiting on. You could attach gdb, select the frame with mutex=0xaaab0537e8e0, and run p *mutex to see which thread holds it. On glibc, the __data.__owner field of the printed mutex is the LWP id of the owning thread, which can be matched against the LWP numbers in the thread list.

fenggaobj commented on May 19 '25 03:05
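
For readers unfamiliar with the term used in the last comment: an ABBA deadlock arises when one thread takes lock A and then B while another takes B and then A, and each ends up waiting for the lock the other holds. The sketch below is a generic illustration, not Fast DDS code. `TwoLocks`, `worker_ab`, `worker_ba` and `run_both` are hypothetical names. It shows two threads that name the mutexes in opposite orders but acquire them through `std::lock`, which avoids the deadlock.

```cpp
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical shared state guarded by two independent mutexes.
struct TwoLocks {
    std::mutex a;
    std::mutex b;
    int counter = 0;
};

// Names the mutexes in the order (a, b).
inline void worker_ab(TwoLocks& s, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        // std::lock acquires both mutexes using a deadlock-avoidance
        // algorithm, so the argument order does not matter.
        std::lock(s.a, s.b);
        std::lock_guard<std::mutex> guard_a(s.a, std::adopt_lock);
        std::lock_guard<std::mutex> guard_b(s.b, std::adopt_lock);
        ++s.counter;
    }
}

// Names the mutexes in the opposite order (b, a). With two plain
// lock() calls in this order, this thread and worker_ab could deadlock.
inline void worker_ba(TwoLocks& s, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock(s.b, s.a);
        std::lock_guard<std::mutex> guard_b(s.b, std::adopt_lock);
        std::lock_guard<std::mutex> guard_a(s.a, std::adopt_lock);
        ++s.counter;
    }
}

// Runs both workers concurrently and returns the final counter value.
inline int run_both(int iterations) {
    TwoLocks s;
    std::thread t1(worker_ab, std::ref(s), iterations);
    std::thread t2(worker_ba, std::ref(s), iterations);
    t1.join();
    t2.join();
    return s.counter;
}
```

Whether the reported hang is actually of this kind cannot be decided from the single stack in the report. It depends on which thread holds the mutex and what that thread is itself waiting for.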