Add metric for server side TLS handshake failures.
The problem
There is no clear metric for failed TLS handshake attempts. At the moment, TLS handshake failures are counted under the styx.exception.io_netty_handler_codec_DecoderException metric, which is a catch-all for all sorts of decoder problems and is therefore not specific enough to identify TLS handshake issues.
As a devops engineer, I want to know how many failed TLS handshake attempts there have been, so that I get a clear indication of configuration problems and can take remedial action sooner.
Acceptance criteria
- Add a metric, perhaps in connections scope, such as connections.tls.handshake.failures. Increment this metric on every failed TLS handshake.
- Add the new metric in the styx user manual: https://github.com/HotelsDotCom/styx/blob/master/docs/user-guide/metrics.md
- Add the new metric in the styx user manual under "Troubleshooting TLS connectivity issues". https://github.com/HotelsDotCom/styx/blob/master/docs/user-guide/configure-tls.md
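A rough sketch of the first criterion, to make the idea concrete. It assumes the handshake failure arrives as a javax.net.ssl.SSLHandshakeException somewhere in the cause chain of the decoder exception we see today. The class name TlsHandshakeFailureMetric and the map-based counter are hypothetical stand-ins, not the real styx metric registry or handler code; only the metric name comes from this issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import javax.net.ssl.SSLHandshakeException;

// Hypothetical helper: classifies an exception and counts TLS handshake failures.
public class TlsHandshakeFailureMetric {
    // Metric name proposed in this issue.
    static final String METRIC_NAME = "connections.tls.handshake.failures";

    // Stand-in for the real metric registry (assumption, for illustration only).
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Walks the cause chain looking for an SSLHandshakeException.
    // The depth limit guards against cyclic cause chains.
    static boolean isTlsHandshakeFailure(Throwable error) {
        int depth = 0;
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof SSLHandshakeException) {
                return true;
            }
        }
        return false;
    }

    // Increments the counter only for handshake failures; returns true if counted.
    boolean record(Throwable error) {
        if (!isTlsHandshakeFailure(error)) {
            return false;
        }
        counters.computeIfAbsent(METRIC_NAME, name -> new LongAdder()).increment();
        return true;
    }

    // Current value of the handshake failure counter.
    long count() {
        LongAdder counter = counters.get(METRIC_NAME);
        return counter == null ? 0 : counter.sum();
    }

    public static void main(String[] args) {
        TlsHandshakeFailureMetric metric = new TlsHandshakeFailureMetric();

        // Simulates a decoder exception that wraps a handshake failure.
        Throwable wrapped = new RuntimeException(
                "decoder failure", new SSLHandshakeException("no cipher suites in common"));
        metric.record(wrapped);

        // An unrelated decoder problem must not be counted.
        metric.record(new RuntimeException("malformed HTTP request"));

        System.out.println(METRIC_NAME + " = " + metric.count());
    }
}
```

In the real change the check would sit wherever the server pipeline currently reports the DecoderException, so the existing catch-all metric keeps working and the new counter only adds the TLS-specific signal.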
There is an it("Logs an SSL handshake exception") test in TlsErrorSpec.scala to give you some ideas, @VivianLopes.
Closing all issues over 3 years old. New issues can be created if problems are still occurring.