helm-charts
helm-charts copied to clipboard
[BUG][opensearch] tls connection (handshake) randomly stops working
Describe the bug After starting up a 3 node cluster, opensearch randomly fails during tls handshake (after a few days) until the pods are restartet. See log below:
[2022-07-18T07:11:09,550][ERROR][o.o.s.s.h.n.SecuritySSLNettyHttpServerTransport] [opensearch-cluster-0] Exception during establishing a SSL connection: javax.net.ssl.SSLHandshakeException: Invalid Alert message: no sufficient data javax.net.ssl.SSLHandshakeException: Invalid Alert message: no sufficient data at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:?] at sun.security.ssl.Alert.createSSLException(Alert.java:117) ~[?:?] at sun.security.ssl.TransportContext.fatal(TransportContext.java:358) ~[?:?] at sun.security.ssl.TransportContext.fatal(TransportContext.java:314) ~[?:?] at sun.security.ssl.TransportContext.fatal(TransportContext.java:305) ~[?:?] at sun.security.ssl.Alert$AlertMessage.
(Alert.java:196) ~[?:?] at sun.security.ssl.Alert$AlertConsumer.consume(Alert.java:236) ~[?:?] at sun.security.ssl.TransportContext.dispatch(TransportContext.java:204) ~[?:?] at sun.security.ssl.SSLTransport.decode(SSLTransport.java:172) ~[?:?] at sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:736) ~[?:?] at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:691) ~[?:?] at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:506) ~[?:?] at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:482) ~[?:?] at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:679) ~[?:?] at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:295) ~[netty-handler-4.1.73.Final.jar:4.1.73.Final] at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1341) ~[netty-handler-4.1.73.Final.jar:4.1.73.Final] at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1234) ~[netty-handler-4.1.73.Final.jar:4.1.73.Final] at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1283) ~[netty-handler-4.1.73.Final.jar:4.1.73.Final] at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510) ~[netty-codec-4.1.73.Final.jar:4.1.73.Final] at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449) ~[netty-codec-4.1.73.Final.jar:4.1.73.Final] at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279) ~[netty-codec-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:623) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:586) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [netty-common-4.1.73.Final.jar:4.1.73.Final] at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.73.Final.jar:4.1.73.Final] at java.lang.Thread.run(Thread.java:833) [?:?] [2022-07-18T07:11:09,552][WARN ][o.o.h.AbstractHttpServerTransport] [opensearch-cluster-0] caught exception while handling client http traffic, closing connection Netty4HttpChannel{localAddress=/10.42.5.24:9200, remoteAddress=/10.42.5.20:36392} io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: Invalid Alert message: no sufficient data at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:480) ~[netty-codec-4.1.73.Final.jar:4.1.73.Final] at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279) ~[netty-codec-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:623) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:586) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) [netty-transport-4.1.73.Final.jar:4.1.73.Final] at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [netty-common-4.1.73.Final.jar:4.1.73.Final] at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.73.Final.jar:4.1.73.Final] at java.lang.Thread.run(Thread.java:833) [?:?] Caused by: javax.net.ssl.SSLHandshakeException: Invalid Alert message: no sufficient data at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:?] at sun.security.ssl.Alert.createSSLException(Alert.java:117) ~[?:?] at sun.security.ssl.TransportContext.fatal(TransportContext.java:358) ~[?:?] at sun.security.ssl.TransportContext.fatal(TransportContext.java:314) ~[?:?] at sun.security.ssl.TransportContext.fatal(TransportContext.java:305) ~[?:?] at sun.security.ssl.Alert$AlertMessage. (Alert.java:196) ~[?:?] at sun.security.ssl.Alert$AlertConsumer.consume(Alert.java:236) ~[?:?] at sun.security.ssl.TransportContext.dispatch(TransportContext.java:204) ~[?:?] at sun.security.ssl.SSLTransport.decode(SSLTransport.java:172) ~[?:?] at sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:736) ~[?:?] at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:691) ~[?:?] at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:506) ~[?:?] at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:482) ~[?:?] at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:679) ~[?:?] at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:295) ~[netty-handler-4.1.73.Final.jar:4.1.73.Final] at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1341) ~[netty-handler-4.1.73.Final.jar:4.1.73.Final] at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1234) ~[netty-handler-4.1.73.Final.jar:4.1.73.Final] at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1283) ~[netty-handler-4.1.73.Final.jar:4.1.73.Final] at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510) ~[netty-codec-4.1.73.Final.jar:4.1.73.Final] at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449) ~[netty-codec-4.1.73.Final.jar:4.1.73.Final] ... 16 more
To Reproduce Steps to reproduce the behavior:
- Start up multi node opensearch cluster
- try to connect with opensearch-dashboards -> leads to http 500 Server Error, or see cluster status degrading to yellow and finally red
- See error in opensearch log
- restart pods to resolve the issue
Expected behavior Tls handshake should either always work or never.
Chart Name opensearch 2.3.0 (also 2.1.0)
Host/Environment (please complete the following information):
- Helm Version: version.BuildInfo{Version:"v3.6.3", GitCommit:"d506314abfb5d21419df8c7e7e68012379db2354", GitTreeState:"clean", GoVersion:"go1.16.5"}
- Kubernetes Version: Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.9+rke2r1", GitCommit:"6df4433e288edc9c40c2e344eb336f63fad45cd2", GitTreeState:"clean", BuildDate:"2022-04-20T17:24:51Z", GoVersion:"go1.16.15b7", Compiler:"gc", Platform:"linux/amd64"}
@opensearch-project/security Can someone please take a look at this issue and provide an update?