Fast-DDS icon indicating copy to clipboard operation
Fast-DDS copied to clipboard

Sleep duration at publisher side influences latency [13360]

Open Silvan97 opened this issue 2 years ago • 6 comments

Hi, I am trying to do some latency tests by cyclic publishing data every 100ms. At the subscriber I substract the source_timestamp from reception_timestamp provided by the SampleInfo. I noticed that the sleep duration at the publisher influences the meassured latencies enormously. As far as I understand in SYNCHRONOUS mode the write() function is blocking and in ASYNCHRONOUS a separate thread handles the write operation. So in both cases a sleep of the main thread should not pause the write operation and influence the latency.

This behaviour can be easily observed at the HelloWorldExample:

With a sleep duration of 1ms I get about 150us latency while a sleep of 1000ms causes a latency about 900us. I tried different combinations of reliability, durability, historydepth and publish_mode settings in the DataWriterQos and DataReaderQos without success. How can I achieve a cyclic publishing every 100ms with the lowest possible latency?

I am aware of the latency test in this repository but it's not suitable for me since the latency ist estimated by meassuring the round-trip time.

Thanks in advance, Silvan

Silvan97 avatar Dec 22 '21 10:12 Silvan97

I think I have a similar issue. Latency of the latency test tool is not representing a regular application.

deeplearningrobotics avatar Aug 03 '22 00:08 deeplearningrobotics

Hi @Silvan97 @deeplearningrobotics ,

I'm afraid I cannot reproduce this issue. I've applied the following patch to the v2.7.1 HelloWorldSubscriber.cpp:

diff --git a/examples/cpp/dds/HelloWorldExample/HelloWorldSubscriber.cpp b/examples/cpp/dds/HelloWorldExample/HelloWorldSubscriber.cpp
index 74977ccf0..24d4fb715 100644
--- a/examples/cpp/dds/HelloWorldExample/HelloWorldSubscriber.cpp
+++ b/examples/cpp/dds/HelloWorldExample/HelloWorldSubscriber.cpp
@@ -160,7 +160,7 @@ void HelloWorldSubscriber::SubListener::on_data_available(
         {
             samples_++;
             // Print your structure data here.
-            std::cout << "Message " << hello_.message() << " " << hello_.index() << " RECEIVED" << std::endl;
+            std::cout << "Latency = " << (info.reception_timestamp.to_ns() - info.source_timestamp.to_ns()) / 1000.0 << " us" << std::endl;
         }
     }
 }

Then, I have run a subscriber in one terminal and a publisher in another.

Sleep of 1 ms

Publisher terminal:

eduponz bin $ ./DDSHelloWorldExample publisher -i 1

Subscriber terminal

eduponz bin $ ./DDSHelloWorldExample subscriber
Starting 
Subscriber running. Please press enter to stop the Subscriber
Subscriber matched.
Latency = 238.209 us
Latency = 125.706 us
Latency = 129.859 us
Latency = 128.425 us
Latency = 136.916 us
Latency = 130.864 us
Latency = 127.312 us
Latency = 145.297 us
Latency = 129.633 us
Latency = 140.161 us
Subscriber unmatched.

Sleep of 10 ms

Publisher terminal:

eduponz bin $ ./DDSHelloWorldExample publisher -i 10

Subscriber terminal

eduponz bin $ ./DDSHelloWorldExample subscriber
Starting 
Subscriber running. Please press enter to stop the Subscriber
Subscriber matched.
Latency = 202.916 us
Latency = 193.928 us
Latency = 198.733 us
Latency = 153.034 us
Latency = 146.185 us
Latency = 155.412 us
Latency = 166.979 us
Latency = 173.183 us
Latency = 155.319 us
Latency = 209.886 us
Subscriber unmatched.

Sleep of 100 ms

Publisher terminal:

eduponz bin $ ./DDSHelloWorldExample publisher -i 100

Subscriber terminal

eduponz bin $ ./DDSHelloWorldExample subscriber
Starting 
Subscriber running. Please press enter to stop the Subscriber
Subscriber matched.
Latency = 303.392 us
Latency = 185.027 us
Latency = 160.752 us
Latency = 176.373 us
Latency = 167.81 us
Latency = 147.133 us
Latency = 177.989 us
Latency = 155.944 us
Latency = 157.806 us
Latency = 161.218 us
Subscriber unmatched.

Sleep of 1000 ms

Publisher terminal:

eduponz bin $ ./DDSHelloWorldExample publisher -i 1000

Subscriber terminal

eduponz bin $ ./DDSHelloWorldExample subscriber
Starting 
Subscriber running. Please press enter to stop the Subscriber
Subscriber matched.
Latency = 300.212 us
Latency = 177.066 us
Latency = 181.685 us
Latency = 187.643 us
Latency = 173.222 us
Latency = 187.84 us
Latency = 190.117 us
Latency = 186.378 us
Latency = 174.57 us
Latency = 207.296 us
Subscriber unmatched.

EduPonz avatar Aug 03 '22 05:08 EduPonz

@EduPonz: I looked more into this problem. Looking at the DDS timestamps looks good but looking at the real latency (set on the application level) show a different picture. In detail I just fill a timestamp into the message right before publishing and messure the difference right after receiving. And this shows a 5x(!) difference.

Starting 
Subscriber matched.
Subscriber running. Please press enter to stop the Subscriber
Latency Real = 1992.65 us
Latency DDS = 353.918 us
Latency Real = 1482.17 us
Latency DDS = 354.623 us
Latency Real = 1471.03 us
Latency DDS = 229.4 us
Latency Real = 1466.78 us
Latency DDS = 335.677 us
Latency Real = 1339.76 us
Latency DDS = 230.275 us
Latency Real = 1405.24 us
Latency DDS = 276.227 us
Latency Real = 1602.6 us
Latency DDS = 402.761 us
Latency Real = 1364.1 us
Latency DDS = 276.522 us
Latency Real = 1559.62 us
Latency DDS = 348.787 us
Latency Real = 1408.38 us
Latency DDS = 278.457 us
Subscriber unmatched.

Diff:

diff --git a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.cxx b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.cxx
index 543484890ccb..cdd4588b745e 100644
--- a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.cxx
+++ b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.cxx
@@ -36,11 +36,11 @@ using namespace eprosima::fastcdr::exception;
 
 HelloWorld::HelloWorld()
 {
-    // m_index com.eprosima.idl.parser.typecode.PrimitiveTypeCode@289d1c02
+    // m_index com.eprosima.idl.parser.typecode.PrimitiveTypeCode@636be97c
     m_index = 0;
-    // m_message com.eprosima.idl.parser.typecode.StringTypeCode@22eeefeb
+    // m_message com.eprosima.idl.parser.typecode.StringTypeCode@50a638b5
     m_message ="";
-    // m_data com.eprosima.idl.parser.typecode.ArrayTypeCode@17d0685f
+    // m_data com.eprosima.idl.parser.typecode.ArrayTypeCode@1817d444
     memset(&m_data, 0, (1024*1024) * 1);
 
:...skipping...
diff --git a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.cxx b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.cxx
index 543484890ccb..cdd4588b745e 100644
--- a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.cxx
+++ b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.cxx
@@ -36,11 +36,11 @@ using namespace eprosima::fastcdr::exception;
 
 HelloWorld::HelloWorld()
 {
-    // m_index com.eprosima.idl.parser.typecode.PrimitiveTypeCode@289d1c02
+    // m_index com.eprosima.idl.parser.typecode.PrimitiveTypeCode@636be97c
     m_index = 0;
-    // m_message com.eprosima.idl.parser.typecode.StringTypeCode@22eeefeb
+    // m_message com.eprosima.idl.parser.typecode.StringTypeCode@50a638b5
     m_message ="";
-    // m_data com.eprosima.idl.parser.typecode.ArrayTypeCode@17d0685f
+    // m_data com.eprosima.idl.parser.typecode.ArrayTypeCode@1817d444
     memset(&m_data, 0, (1024*1024) * 1);
 
 }
@@ -109,7 +109,7 @@ size_t HelloWorld::getMaxCdrSerializedSize(
     size_t initial_alignment = current_alignment;
 
 
-    current_alignment += 4 + eprosima::fastcdr::Cdr::alignment(current_alignment, 4);
+    current_alignment += 8 + eprosima::fastcdr::Cdr::alignment(current_alignment, 8);
 
 
     current_alignment += 4 + eprosima::fastcdr::Cdr::alignment(current_alignment, 4) + 255 + 1;
@@ -129,12 +129,15 @@ size_t HelloWorld::getCdrSerializedSize(
     size_t initial_alignment = current_alignment;
 
 
-    current_alignment += 4 + eprosima::fastcdr::Cdr::alignment(current_alignment, 4);
+    current_alignment += 8 + eprosima::fastcdr::Cdr::alignment(current_alignment, 8);
 
 
     current_alignment += 4 + eprosima::fastcdr::Cdr::alignment(current_alignment, 4) + data.message().size() + 1;
 
-    current_alignment += ((1024*1024) * 1) + eprosima::fastcdr::Cdr::alignment(current_alignment, 1);
+    if ((1024*1024) > 0)
+    {
+        current_alignment += ((1024*1024) * 1) + eprosima::fastcdr::Cdr::alignment(current_alignment, 1);
+    }
 
 
     return current_alignment - initial_alignment;
@@ -145,7 +148,7 @@ void HelloWorld::serialize(
 {
 
     scdr << m_index;
-    scdr << m_message;
+    scdr << m_message.c_str();
     scdr << m_data;
 
 
@@ -166,7 +169,7 @@ void HelloWorld::deserialize(
:...skipping...
diff --git a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.cxx b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.cxx
index 543484890ccb..cdd4588b745e 100644
--- a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.cxx
+++ b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.cxx
@@ -36,11 +36,11 @@ using namespace eprosima::fastcdr::exception;
 
 HelloWorld::HelloWorld()
 {
-    // m_index com.eprosima.idl.parser.typecode.PrimitiveTypeCode@289d1c02
+    // m_index com.eprosima.idl.parser.typecode.PrimitiveTypeCode@636be97c
     m_index = 0;
-    // m_message com.eprosima.idl.parser.typecode.StringTypeCode@22eeefeb
+    // m_message com.eprosima.idl.parser.typecode.StringTypeCode@50a638b5
     m_message ="";
-    // m_data com.eprosima.idl.parser.typecode.ArrayTypeCode@17d0685f
+    // m_data com.eprosima.idl.parser.typecode.ArrayTypeCode@1817d444
     memset(&m_data, 0, (1024*1024) * 1);
 
 }
@@ -109,7 +109,7 @@ size_t HelloWorld::getMaxCdrSerializedSize(
     size_t initial_alignment = current_alignment;
 
 
-    current_alignment += 4 + eprosima::fastcdr::Cdr::alignment(current_alignment, 4);
+    current_alignment += 8 + eprosima::fastcdr::Cdr::alignment(current_alignment, 8);
 
 
     current_alignment += 4 + eprosima::fastcdr::Cdr::alignment(current_alignment, 4) + 255 + 1;
@@ -129,12 +129,15 @@ size_t HelloWorld::getCdrSerializedSize(
     size_t initial_alignment = current_alignment;
 
 
-    current_alignment += 4 + eprosima::fastcdr::Cdr::alignment(current_alignment, 4);
+    current_alignment += 8 + eprosima::fastcdr::Cdr::alignment(current_alignment, 8);
 
 
     current_alignment += 4 + eprosima::fastcdr::Cdr::alignment(current_alignment, 4) + data.message().size() + 1;
 
-    current_alignment += ((1024*1024) * 1) + eprosima::fastcdr::Cdr::alignment(current_alignment, 1);
+    if ((1024*1024) > 0)
+    {
+        current_alignment += ((1024*1024) * 1) + eprosima::fastcdr::Cdr::alignment(current_alignment, 1);
+    }
 
 
     return current_alignment - initial_alignment;
@@ -145,7 +148,7 @@ void HelloWorld::serialize(
 {
 
     scdr << m_index;
-    scdr << m_message;
+    scdr << m_message.c_str();
     scdr << m_data;
 
 
@@ -166,7 +169,7 @@ void HelloWorld::deserialize(
  * @param _index New value for member index
  */
 void HelloWorld::index(
-        uint32_t _index)
+        uint64_t _index)
 {
     m_index = _index;
 }
@@ -175,7 +178,7 @@ void HelloWorld::index(
  * @brief This function returns the value of member index
  * @return Value of member index
  */
-uint32_t HelloWorld::index() const
+uint64_t HelloWorld::index() const
 {
     return m_index;
 }
@@ -184,7 +187,7 @@ uint32_t HelloWorld::index() const
  * @brief This function returns a reference to member index
  * @return Reference to member index
  */
-uint32_t& HelloWorld::index()
+uint64_t& HelloWorld::index()
 {
     return m_index;
 }
diff --git a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.h b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.h
index 28d5aaf13429..388d1ce37871 100644
--- a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.h
+++ b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.h
@@ -23,6 +23,8 @@
 #define _FAST_DDS_GENERATED_HELLOWORLD_H_
 
 
+#include <fastrtps/utils/fixed_size_string.hpp>
+
 #include <stdint.h>
 #include <array>
 #include <string>
@@ -77,7 +79,7 @@ public:
     /*!
      * @brief Default destructor.
      */
-    eProsima_user_DllExport ~HelloWorld();
+    eProsima_user_DllExport virtual ~HelloWorld();
 
     /*!
      * @brief Copy constructor.
@@ -126,19 +128,19 @@ public:
      * @param _index New value for member index
      */
     eProsima_user_DllExport void index(
-            uint32_t _index);
+            uint64_t _index);
 
     /*!
      * @brief This function returns the value of member index
      * @return Value of member index
      */
-    eProsima_user_DllExport uint32_t index() const;
+    eProsima_user_DllExport uint64_t index() const;
 
     /*!
      * @brief This function returns a reference to member index
      * @return Reference to member index
      */
-    eProsima_user_DllExport uint32_t& index();
+    eProsima_user_DllExport uint64_t& index();
 
     /*!
      * @brief This function copies the value in member message
@@ -170,26 +172,26 @@ public:
      * @param _data New value to be copied in member data
      */
     eProsima_user_DllExport void data(
-            const std::array<char, 1024 * 1024>& _data);
+            const std::array<char, 1024*1024>& _data);
 
     /*!
      * @brief This function moves the value in member data
      * @param _data New value to be moved in member data
      */
     eProsima_user_DllExport void data(
-            std::array<char, 1024 * 1024>&& _data);
+            std::array<char, 1024*1024>&& _data);
 
     /*!
      * @brief This function returns a constant reference to member data
      * @return Constant reference to member data
      */
-    eProsima_user_DllExport const std::array<char, 1024 * 1024>& data() const;
+    eProsima_user_DllExport const std::array<char, 1024*1024>& data() const;
 
     /*!
      * @brief This function returns a reference to member data
      * @return Reference to member data
      */
-    eProsima_user_DllExport std::array<char, 1024 * 1024>& data();
+    eProsima_user_DllExport std::array<char, 1024*1024>& data();
 
     /*!
      * @brief This function returns the maximum serialized size of an object
@@ -250,9 +252,9 @@ public:
 
 private:
 
-    uint32_t m_index;
+    uint64_t m_index;
     std::string m_message;
-    std::array<char, 1024 * 1024> m_data;
+    std::array<char, 1024*1024> m_data;
 };
 
 #endif // _FAST_DDS_GENERATED_HELLOWORLD_H_
\ No newline at end of file
diff --git a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.idl b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.idl
index 9799dca59843..88a8e9166e45 100644
--- a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.idl
+++ b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorld.idl
@@ -1,6 +1,6 @@
 struct HelloWorld
 {
-       unsigned long index;
+       unsigned long long index;
        string message;
        char data[1024*1024];
 };
diff --git a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorldPubSubTypes.h b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorldPubSubTypes.h
index 8b2348be67c7..fb31d872a625 100644
--- a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorldPubSubTypes.h
+++ b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorldPubSubTypes.h
@@ -45,7 +45,7 @@ public:
 
     eProsima_user_DllExport HelloWorldPubSubType();
 
-    eProsima_user_DllExport virtual ~HelloWorldPubSubType();
+    eProsima_user_DllExport virtual ~HelloWorldPubSubType() override;
 
     eProsima_user_DllExport virtual bool serialize(
             void* data,
diff --git a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorldPublisher.cpp b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorldPublisher.cpp
index 1b8f9d964e49..47eaad46df34 100644
--- a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorldPublisher.cpp
+++ b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorldPublisher.cpp
@@ -19,6 +19,7 @@
 
 #include "HelloWorldPublisher.h"
 
+#include <chrono>
 #include <fastdds/dds/domain/DomainParticipantFactory.hpp>
 #include <fastdds/rtps/transport/shared_mem/SharedMemTransportDescriptor.h>
 #include <fastdds/rtps/transport/UDPv4TransportDescriptor.h>
@@ -204,11 +205,13 @@ bool HelloWorldPublisher::publish(
 {
     if (listener_.first_connected_ || !waitForListener || listener_.matched_ > 0)
     {
-        hello_->index(hello_->index() + 1);
         size_t data_size = hello_->data().size();
         std::string s = "BigData" + std::to_string(hello_->index() % 10);
         strcpy(&hello_->data()[data_size - s.length() - 1], s.c_str());
-
+        uint64_t now_ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
+            std::chrono::high_resolution_clock::now().time_since_epoch())
+            .count();
+        hello_->index(now_ns);
         writer_->write(hello_.get());
 
         return true;
diff --git a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorldSubscriber.cpp b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorldSubscriber.cpp
index a22d83624ab6..9b0c9fe1158e 100644
--- a~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorldSubscriber.cpp
+++ b~/Fast-DDS/examples/cpp/dds/HelloWorldExampleSharedMem/HelloWorldSubscriber.cpp
@@ -149,9 +149,13 @@ void HelloWorldSubscriber::SubListener::on_data_available(
             samples_++;
             const size_t data_size = hello_->data().size();
             // Print your structure data here.
-            std::cout << "Message " << hello_->message() << " " << hello_->index()
-                      << " RECEIVED With " << data_size << "(bytes) of Data. DataEnd = "
-                      << (char*)&hello_->data()[data_size - 9] << std::endl;
+            uint64_t now_ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
+                std::chrono::high_resolution_clock::now().time_since_epoch())
+                .count();
+            uint64_t latency_real_ns = now_ns -  hello_->index();
+            std::cout << "Latency Real = " << latency_real_ns / 1000.0 << " us" << std::endl;
+            std::cout << "Latency DDS = " << (info.reception_timestamp.to_ns() - info.source_timestamp.to_ns()) / 1000.0 << " us" << std::endl;
+
         }
     }
 }

To give the complete picture here the output of the latency measurements form the same system:

Reading demands file:demands.csv
Performing test with 1 subscribers and 10000 samples
Overhead 0.0267702 ns
Printing round-trip times in us, statistics for 10000 samples
   Bytes, Samples,   stdev,    mean,     min,     50%,     90%,     99%,  99.99%,     max
--------,--------,--------,--------,--------,--------,--------,--------,--------,--------,
 1048576,   10000,  13.885, 202.693, 170.597, 201.432, 216.416, 243.941, 324.923, 838.136 
 1048576,   10000,  15.179, 206.578, 179.226, 204.063, 221.135, 252.549, 477.765, 970.094 

deeplearningrobotics avatar Aug 03 '22 19:08 deeplearningrobotics

The issue is that serialization (and most likely deserailzation) takes A LOT of time. Around 700 us:

Message: HelloWorld with index: 1659558122499278763 SENT
Serialize start us:11.448
Serialize done us:756.621

Your latency test does not use the normal serialization/deserialization functionality and does therefore not catch the issue.

deeplearningrobotics avatar Aug 03 '22 20:08 deeplearningrobotics

And this numbers make perfect sense: DDS Latency + Serialization Latency + Deserialization latency: 200 us + 700 us +700 us = 1600 us

deeplearningrobotics avatar Aug 03 '22 20:08 deeplearningrobotics

Hi @deeplearningrobotics ,

I think we are talking about different things here:

  1. The OP reported an issue that I have not been able to reproduce and that's why I've asked @Silvan97 to provide more information if possible; if that does not happen in two weeks time, we'll most likely close the issue (unless someone else provides the means for reproducing it).
  2. From what I understand, you are describing a different thing entirely. On that note, I have a couple of comments:
    1. The source and reception timestamps are set after and before serialization-deserialization respectively; both are populated whenever the sample is added to the corresponding history.
    2. I'd say that the latency test does use normal serialization/deserialization, it's just that the data type is different from the one you are using. The particularity of its type support is that is prepared for both zero-copy (loans + data-sharing) and running the tests with several sizes. If your data type were to be bounded and plain you'd see very different results.
    3. As not all cases can be covered, the performance test aims to find the best case for a given size, and it's actually a tool for us to look for performance degradation over time.
    4. I'd argue that serialization times are actually not a Fast DDS issue, since the type support implementation is external to the library.

All in all, I understand your concerns, but I'm unsure on whether they belong into this issue (or repository).

EduPonz avatar Aug 04 '22 05:08 EduPonz

Closing due to inactivity

EduPonz avatar Oct 28 '22 05:10 EduPonz