How to clear ongoing observe notifications from transit on CoapEndpoint stop?

Open Ozame opened this issue 4 months ago • 15 comments

Background

We are using Leshan 2.0.0-M14 (Californium 3.9.1) in our LwM2M client implementation, communicating with a server that uses Leshan 2.0.0-M9 (Californium 3.7.0). We run multiple clients on the same machine, all connecting to the same server instance.

Posting this here because it concerns the way observations are handled in Californium.

The issue

In our use case, the server observes some resources on the client. We have noticed that notifications from the client are no longer sent to the server if the client (i.e. the client's CoapEndpoint) was stopped while the previous notification had not yet been ACKed.

For example:

  1. Client is started, and server observes a resource of the client.

  2. Client sends a notification on resource change, but the server, for some reason, does not answer immediately.

  3. Client is stopped, whilst the server has still not acked the notification.

  4. Later, client is started again. It tries to send a new notification to the server.

  5. The notification send fails, as the previous notification is still "in transit".
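
The sequence above can be modeled with a small, self-contained Java sketch. It is only a toy: `ToyNotification` and `ToyRelation` are hypothetical names, not Californium classes, and the blocking rule encodes the reading of the logs given in this report rather than verified library behavior.

```java
// Toy model only. These classes are hypothetical and are NOT Californium's API.
final class ToyNotification {
    final int observe;      // value of the Observe option
    boolean acknowledged;   // set when the peer ACKs the notification
    boolean canceled;       // set when the endpoint is stopped

    ToyNotification(int observe) {
        this.observe = observe;
    }
}

final class ToyRelation {
    private ToyNotification inTransit;  // last notification still waiting for an ACK
    private ToyNotification postponed;  // newest notification that was held back

    /** Returns true if the notification is sent, false if it is postponed. */
    boolean send(ToyNotification next) {
        if (inTransit != null && !inTransit.acknowledged) {
            postponed = next;  // only the newest postponed notification is kept
            return false;
        }
        inTransit = next;
        return true;
    }

    /** Stop as suspected in this report: the message is canceled but still referenced. */
    void stopKeepingState() {
        if (inTransit != null) {
            inTransit.canceled = true;
        }
    }

    /** Hypothetical cleanup: forget unacknowledged state when stopping. */
    void stopAndClear() {
        inTransit = null;
        postponed = null;
    }
}

final class ToyDemo {
    public static void main(String[] args) {
        ToyRelation relation = new ToyRelation();
        System.out.println(relation.send(new ToyNotification(1414))); // sent, never ACKed
        relation.stopKeepingState();                                   // client stopped
        System.out.println(relation.send(new ToyNotification(1624))); // postponed after restart
        relation.stopAndClear();                                       // hypothetical cleanup
        System.out.println(relation.send(new ToyNotification(1631))); // sent again
    }
}
```

In this model a canceled notification still counts as "in transit" because cancelation never clears the reference, which is the behavior suspected here. Whether the real `ObserveRelation` behaves this way is exactly what this issue asks.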

From our clients' logs:

2024-04-22 09:25:37,560 [executor] TRACE- o.e.c.c.o.ObserveRelation - in transit CON-2.05   MID=48271, Token=FFB1D80C889E8CA9, OptionSet={"Observe":1414, "Content-Format":"application/senml+cbor"}, canceled 86 A3 00 ....
2024-04-22 09:25:37,560 [executor] DEBUG- o.e.c.c.n.UdpMatcher - tracking open request [KeyMID[10.222.22.22:5683-48304], KeyToken[10.222.22.22:5683-B0A47C35FE9F3213]]
2024-04-22 09:25:37,560 [executor] TRACE- o.e.c.s.DTLSConnector - connection available for /10.222.22.22:5683,null
2024-04-22 09:25:37,560 [executor] TRACE- o.e.c.c.c.Message - Message transfer completed CON-2.05   MID=   -1, Token=null, OptionSet={"Observe":1624, "Content-Format":"application/senml+cbor"}, 86 A3 00 ....
2024-04-22 09:25:37,560 [executor] DEBUG- o.e.c.c.n.s.ObserveLayer - a former notification is still in transit. Postponing CON-2.05   MID=   -1, Token=null, OptionSet={"Observe":1631, "Content-Format":"application/senml+cbor"}, 86 A3 00 ....
2024-04-22 09:25:37,560 [executor] TRACE- o.e.c.s.DTLSConnector - Sending application layer message to [MAP(10.222.22.22:5683)]

When we stop the LwM2M client, it calls the stop method on the CoapServer (see the relevant Leshan snippet).

It looks to me like the ObserveRelation keeps the previous notification even though it is in the "canceled" state, which then causes future notifications to be postponed (see the ObserveLayer log line above).

Questions

  • Is there a way to know when a notification-related exchange has completed, either successfully or by timing out? If we could know when it, for example, times out, we could wait until then before stopping the client.
    • It looks like the observe relation is cancelled when a notification times out. Is there a way to change this behavior, so that observations are never cancelled?
  • Alternatively, can you suggest a way to clean up this "in transit" notification on Endpoint stop or start, so that it won't cause this on later notifications?
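
The first question amounts to "wait for ACK or timeout, then stop". A minimal sketch of that waiting logic, using only the JDK, could look as follows. Everything in it is an assumption made for illustration: `NotificationTracker`, `Outcome`, and the three `on...` methods are hypothetical names, and the sketch does not show which Californium or Leshan hook would actually call them.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: none of these names come from Californium or Leshan.
enum Outcome { ACKNOWLEDGED, TIMED_OUT }

final class NotificationTracker {
    // Starts out completed so that stopping with nothing in flight does not block.
    private volatile CompletableFuture<Outcome> pending =
            CompletableFuture.completedFuture(Outcome.ACKNOWLEDGED);

    /** Call when a confirmable notification is handed to the CoAP stack. */
    void onSent() {
        pending = new CompletableFuture<>();
    }

    /** Call from whatever hook reports that the notification was ACKed. */
    void onAcknowledged() {
        pending.complete(Outcome.ACKNOWLEDGED);
    }

    /** Call from whatever hook reports that retransmissions were exhausted. */
    void onTimedOut() {
        pending.complete(Outcome.TIMED_OUT);
    }

    /**
     * Waits up to maxWaitMillis for the pending notification to finish.
     * Returns true if nothing is in flight any more, false if the wait gave up.
     */
    boolean awaitQuiet(long maxWaitMillis) {
        try {
            pending.get(maxWaitMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;  // still waiting for an ACK or a timeout
        } catch (ExecutionException e) {
            return true;   // not expected: the future is never completed exceptionally
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt for the caller
            return false;
        }
    }
}
```

A client shutdown routine would then call `awaitQuiet(...)` with some upper bound before stopping the endpoint, and decide what to do when it returns `false`. The sketch tracks a single outstanding notification, which matches the one-in-transit-at-a-time behavior visible in the logs.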

Ozame, Apr 22 '24 08:04