
Question about when sample lost callback is invoked

Open skysky97 opened this issue 1 year ago • 1 comments

Sorry, I don't really understand the SAMPLE_LOST communication status. The DDS spec just says "A sample has been lost (never received)".

At first I thought it might be invoked when samples are dropped because of resource limits, for example when the history depth is small and the writer writes quickly, or the reader takes slowly. But it is not.

Would you please explain when the sample lost callback is invoked, and whether there is a way to notify applications about sample drops caused by resource limits?

skysky97, Aug 09 '23 08:08

Here is just some general DDS knowledge on this topic.

In the underlying protocol, every sample carries a sequence number. The publisher increments this number every time a sample is sent. On the subscriber side, every time a sample is received, there is a check to see whether its sequence number is exactly 1 greater than the previous one.
Failing this test gives you a "SAMPLE LOST".
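
To make the sequence-number check concrete, here is a minimal sketch in plain Python. It is not Cyclone DDS code: `GapDetector` and `on_receive` are hypothetical names made up for illustration, and a real implementation tracks this per writer and handles more cases. It only shows the idea of inferring lost samples from a gap in the numbering.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GapDetector:
    """Hypothetical reader-side check for gaps in one writer's sequence numbers."""

    last_seq: Optional[int] = None  # highest sequence number seen so far
    total_lost: int = 0             # running count of samples inferred lost

    def on_receive(self, seq: int) -> int:
        """Return how many samples were skipped just before `seq`."""
        if self.last_seq is not None and seq <= self.last_seq:
            # Duplicate or late sample: not counted as a loss in this sketch.
            return 0
        # The first sample has nothing to compare against, so nothing is lost.
        lost = 0 if self.last_seq is None else seq - self.last_seq - 1
        self.last_seq = seq
        self.total_lost += lost
        return lost
```

Feeding it the sequence 1, 2, 5 reports a gap of two samples when 5 arrives, which is the point where a reader would raise SAMPLE LOST.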

You only get a SAMPLE LOST when using BEST_EFFORT. And what can cause this? Basically, anything. UDP (and multicast) have a lot of excuses to throw away data, plus there are physical ways of missing data.

Most of the time, excluding networking interruptions, the cause is kernel buffers that are too small.
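
As a rough way to see what the kernel gives a UDP socket, the sketch below uses Python's standard `socket` module to read the receive buffer size and, optionally, request a larger one. The helper name `udp_receive_buffer_size` is hypothetical, and this is not how Cyclone DDS itself sizes its sockets. The OS may clamp or adjust the requested value, so the number returned can differ from what was asked for.

```python
import socket
from typing import Optional


def udp_receive_buffer_size(requested: Optional[int] = None) -> int:
    """Return the kernel receive buffer size (bytes) of a fresh UDP socket.

    If `requested` is given, ask the kernel for that size first. The kernel
    is free to clamp or round the value, so always read back the result.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        if requested is not None:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

Comparing `udp_receive_buffer_size()` with `udp_receive_buffer_size(4 * 1024 * 1024)` shows whether the system-wide limit is capping the request.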

If you cannot tolerate losing data, use RELIABLE reliability.

DDS requires you to 'spec out the system'. If you need BEST EFFORT, then you have to design your system to ensure that there are enough resources in place to support the work being done. This includes history depth, processing capacity to handle incoming data, switch configuration, and OS setup.

To answer your question, every SAMPLE LOST is either caused by a resource limit or by a transient networking error.

Hope this gives you a little background. Note this is all 'DDS' experience, not specific to cyclonedds.

BudDavis, Aug 25 '23 09:08