go-coap
go-coap copied to clipboard
Connection from client not reestablished when server restarts using DTLS
Using server from this example: https://github.com/go-ocf/go-coap/blob/master/examples/dtls/server/main.go i.e. "go run main.go"
And slightly modified client to continuously call coap.Get() every 3 seconds to send/receive data https://github.com/go-ocf/go-coap/blob/master/examples/dtls/client/main.go
co, err := coap.DialDTLS("udp", "localhost:5688", &dtls.Config{
PSK: func(hint []byte) ([]byte, error) {
fmt.Printf("Server's hint: %s \n", hint)
return []byte{0xAB, 0xC1, 0x23}, nil
},
PSKIdentityHint: []byte("Pion DTLS Server"),
CipherSuites: []dtls.CipherSuiteID{dtls.TLS_PSK_WITH_AES_128_CCM_8},
})
if err != nil {
log.Fatalf("Error dialing: %v", err)
}
for {
path := "/a"
if len(os.Args) > 1 {
path = os.Args[1]
}
resp, err := co.Get(path)
if err != nil {
log.Fatalf("Error sending request: %v", err)
}
log.Printf("Response payload: %v", resp.Payload())
time.Sleep(3 * time.Second)
}
If I restart the server process, the client never recovers and detects that the Get() fails. Get() blocks indefinitely and you're never able to determine that the connection fails and attempt to reestablish it via Dial(). I'm unsure if I'm missing something, but what's the best practice to follow in this scenario?
Another issue (related, but problem with the server instead of the client). When a server goes down and comes back, with a client being unaware and continues to send messages, the coap.ListenAndServeDTLS() on the server returns with failure "cannot serve tcp: cannot accept connections: dtls: The connection timed out during the handshake"
If you attempt to call coap.ListenAndServeDTLS() on the same port, after it returns, it fails binding the port: "cannot listen and serve: cannot create new dtls listener: listen udp :5688: bind: address already in use"
Looks like two issues.
- Doesn't seem like ListenAndServerDTLS() should return failure if one client happens to be in a different state and is sending messages (perhaps the handshaking process shouldn't be triggered at all here?)
- If/When ListenAndServerDTLS() does return, it doesn't seem as if it's properly cleaning up the port it's listening on.
Hi
Instead call resp, err := co.Get(path) you can use resp, err := co.GetWithContext(ctx, path) and in context you can define timeout.
Doesn't seem like ListenAndServerDTLS() should return failure if one client happens to be in a different state and is sending messages (perhaps the handshaking process shouldn't be triggered at all If/When ListenAndServerDTLS() does return, it doesn't seem as if it's properly cleaning up the port it's listening on.
I will look at it on next week.
@jkralik I looked more into this and it appears that the DTLS code doesn't read any of the packet headers before trying to establish a new connection/session when it receives a message it doesn't know anything about. So it starts doing the DTLS handshake logic even though it's receiving application data packets.
Listener starting readLoop on new session: https://github.com/pion/dtls/blob/0649d6e970ceb0913223f5696be74474788d0a9b/internal/udp/conn.go#L100
Listener fetching conn/session object: https://github.com/pion/dtls/blob/0649d6e970ceb0913223f5696be74474788d0a9b/internal/udp/conn.go#L118
Listener creating new conn/session without verifying incoming packet is in the correct state: https://github.com/pion/dtls/blob/0649d6e970ceb0913223f5696be74474788d0a9b/internal/udp/conn.go#L137
@jkralik So that explains the handshake timeout. I think the DTLS implementers would argue that this behavior is fine as UDP forces the application to deal with these scenarios as it's connectionless. But the ListenAndServerDTLS function should still not return based on one failed/faulty session. Still trying to track that down.
@jkralik This looks like the incorrect behavior in go-coap. https://github.com/go-ocf/go-coap/blob/fd84fc2e6f05d98a193b1fb598523c7521b779ce/server.go#L536
I don't think we would want to exit this loop on one session failing. I've also noticed something similiar with the TCP listener.
https://github.com/go-ocf/go-coap/blob/fd84fc2e6f05d98a193b1fb598523c7521b779ce/server.go#L605
I think the more correct behavior would to either log a message and continue the loop or change the API to provide an error callback when these situations happen? What do you think? I can provide a PR with any of these types of changes.
Thanks!
@hjames9 It's expected that this error occurs only when listen socket was closed. When you shutdown server you get error - this I want to to change too -> it must returns without error message.
Maybe for some errors we can just log them via callback function.
Pls create PR and I will review it. I'm so glad that you are contributing. Thx!
@jkralik Most of the issues I had were resolved with these last 2 PRs. The issue of the listening port not being closed once the ListenAndServe function returns still remains, but I believe that's how it works similarly to how net/http, so it might be ok. What do you think?