webrtc icon indicating copy to clipboard operation
webrtc copied to clipboard

As a client, the handling of DTLS exceptions leads to a failure in connecting to the server

Open minlp opened this issue 1 year ago • 4 comments

Your environment.

  • Version: pion/webrtc/v3 v3.2.21、pion/dtls/v2 v2.2.7

What happened?

When connecting to the WebRTC media server, there are occasional issues with streaming failure. Upon analyzing the captured packets, it appears that the failure is due to DTLS handshake failure.

cap Based on the highlighted section in the image, the packets for "server hello," "certificate," "server key exchange," "certificate request," and "server hello done" are split into two packets (packet sequence numbers 10766 and 10830). In a normal network scenario, packet 10830 should be received before packet 10766. However, due to network latency, packet 10766 arrived first, causing the client to handle it abnormally and return an internal error. This abnormal handling resulted in an overall disruption in the process.

go_code1 Based on the analysis of the captured packets, when packet sequence number 10766 is received and there is no "server hello" and "hello verify request" in the current cache, it results in an internal error being returned. In this case, can the fullPullMap method be modified as follows?
func (h *handshakeCache) fullPullMap(startSeq int, cipherSuite CipherSuite, rules ...handshakeCachePullRule) (int, map[handshake.Type]handshake.Message, bool) {
   ...
out := make(map[handshake.Type]handshake.Message)
	seq := startSeq
        // change the meaning of the second return parameter to whether the result is obtained from the cache
	ok := false
	for _, r := range rules {
		t := r.typ
		i := ci[t]
		if i == nil {
			continue
		}
		var keyExchangeAlgorithm CipherSuiteKeyExchangeAlgorithm
		if cipherSuite != nil {
			keyExchangeAlgorithm = cipherSuite.KeyExchangeAlgorithm()
		}
		rawHandshake := &handshake.Handshake{
			KeyExchangeAlgorithm: keyExchangeAlgorithm,
		}
		if err := rawHandshake.Unmarshal(i.data); err != nil {
			return startSeq, nil, false
		}
		if uint16(seq) != rawHandshake.Header.MessageSequence {
			// There is a gap. Some messages are not arrived.
			return startSeq, nil, false
		}
		seq++
		ok = true
		out[t] = rawHandshake.Message
	}
	return seq, out, ok
}

What did you expect?

In this situation, can successfully establish a connection with the server.

minlp avatar Dec 01 '23 08:12 minlp

Hi @minlp - I haven't tested the revised code but it looks plausible. Can you confirm it fixes the issue in your case? If so, can you kindly open a PR on pion/dtls so we can review in more detail and run the CI tests?

adriancable avatar Dec 02 '23 04:12 adriancable

Hi @minlp - I haven't tested the revised code but it looks plausible. Can you confirm it fixes the issue in your case? If so, can you kindly open a PR on pion/dtls so we can review in more detail and run the CI tests?

It can solve my problem, as I haven't encountered any DTLS handshake failures after making local modifications. I will open a pull request on pion/dtls.

minlp avatar Dec 05 '23 12:12 minlp

Hi @minlp - I haven't tested the revised code but it looks plausible. Can you confirm it fixes the issue in your case? If so, can you kindly open a PR on pion/dtls so we can review in more detail and run the CI tests?

but I can't push my branch to pion/dtls, can you give me the relevant permissions? @adriancable push_error

minlp avatar Dec 05 '23 12:12 minlp

Hi @minlp - you don’t/can’t push to this repo. You fork pion/dtls, push changes to your fork, then create a PR on pion/dtls repo based on the fork. Then me (or one of the other pion maintainers) will take it from there.

adriancable avatar Dec 05 '23 16:12 adriancable

This was fixed with the merging of https://github.com/pion/dtls/pull/600

Sean-Der avatar Mar 30 '24 03:03 Sean-Der