OpenSearch
Node drops out with OutOfMemoryError for reasons other than the VM having run out of memory
Describe the bug: A node can drop out of the cluster on an OutOfMemoryError that is not caused by the VM itself running out of memory, for example:
java.lang.OutOfMemoryError: UTF16 String size is 1089861360, should be less than 1073741823
This is coming from the implementation limits of StringUTF16.
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:53) ~[?:?]
at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:763) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:952) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.transport.TcpTransport.handleResponse(TcpTransport.java:977) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:216) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.transport.TcpTransport$1.doRun(TcpTransport.java:985) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1104) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleResponse(SearchTransportService.java:454) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.ActionListenerResponseHandler.handleResponse(ActionListenerResponseHandler.java:54) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.SearchActionListener.onResponse(SearchActionListener.java:29) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.SearchActionListener.onResponse(SearchActionListener.java:45) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.FetchSearchPhase$2.innerOnResponse(FetchSearchPhase.java:163) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.FetchSearchPhase$2.innerOnResponse(FetchSearchPhase.java:166) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.CountedCollector.onResult(CountedCollector.java:64) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.CountedCollector.countDown(CountedCollector.java:53) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.FetchSearchPhase.lambda$innerRun$2(FetchSearchPhase.java:104) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.FetchSearchPhase.moveToNextPhase(FetchSearchPhase.java:206) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:153) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executePhase(AbstractSearchAsyncAction.java:160) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.ExpandSearchPhase.run(ExpandSearchPhase.java:120) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:153) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executePhase(AbstractSearchAsyncAction.java:160) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.FetchSearchPhase$3.run(FetchSearchPhase.java:213) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onResponse(AbstractSearchAsyncAction.java:50) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onResponse(AbstractSearchAsyncAction.java:311) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:81) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:85) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.rest.action.RestActionListener.onResponse(RestActionListener.java:47) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.rest.action.RestResponseListener.processResponse(RestResponseListener.java:37) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.rest.RestController$ResourceHandlingHttpChannel.sendResponse(RestController.java:497) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.common.text.Text.toString(Text.java:94) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.common.text.Text.string(Text.java:89) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.elasticsearch.common.bytes.BytesReference.utf8ToString(BytesReference.java:98) ~[elasticsearch-6.8.0.jar:6.8.0]
at org.apache.lucene.util.BytesRef.utf8ToString(BytesRef.java:138) ~[lucene-core-7.7.0.jar:7.7.0-SNAPSHOT bb8faecf0a738cf5294e398973014b0090e9dc51 - akjain - 2020-02-14 16:08:48]
at java.lang.String.<init>(String.java:276) ~[?:?]
at java.lang.String.<init>(String.java:3222) ~[?:?]
at java.lang.StringUTF16.toBytes(StringUTF16.java:151) ~[?:?]
at java.lang.StringUTF16.newBytesFor(StringUTF16.java:49) ~[?:?]
Caused by: java.lang.OutOfMemoryError: UTF16 String size is 1089861360, should be less than 1073741823
at java.lang.Thread.run(Thread.java:834) [?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) [netty-common-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) [netty-codec-4.1.32.Final.jar:4.1.32.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441) [netty-codec-4.1.32.Final.jar:4.1.32.Final]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502) [netty-codec-4.1.32.Final.jar:4.1.32.Final]
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1247) [netty-handler-4.1.32.Final.jar:4.1.32.Final]
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1203) [netty-handler-4.1.32.Final.jar:4.1.32.Final]
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1436) [netty-handler-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) [netty-handler-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297) [netty-codec-4.1.32.Final.jar:4.1.32.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323) [netty-codec-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:364) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.notifyHandlerException(AbstractChannelHandlerContext.java:856) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.exceptionCaught(Netty4MessageChannelHandler.java:66) [transport-netty4-client-6.8.0.jar:6.8.0]
java.lang.Exception: java.lang.OutOfMemoryError: UTF16 String size is 1089861360, should be less than 1073741823
https://bugs.openjdk.java.net/browse/JDK-8230744
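The numbers in the error message above can be checked with a little arithmetic. The sketch below assumes that the limit of 1073741823 printed by the JVM corresponds to Integer.MAX_VALUE >> 1, which is what the value works out to. The class and method names are hypothetical and exist only for illustration.

```java
public class Utf16LimitSketch {
    // Assumption: the cap reported in the error message equals Integer.MAX_VALUE >> 1.
    // A UTF-16 string stores two bytes per char in a byte[], and a Java array
    // cannot be indexed beyond Integer.MAX_VALUE.
    static final int MAX_UTF16_CHARS = Integer.MAX_VALUE >> 1;

    // Hypothetical helper: follows the wording of the error message
    // ("should be less than"), so it is strict at the boundary.
    static boolean fitsInUtf16String(long charCount) {
        return charCount >= 0 && charCount < MAX_UTF16_CHARS;
    }

    public static void main(String[] args) {
        long reported = 1_089_861_360L; // size taken from the error message

        System.out.println(MAX_UTF16_CHARS);             // prints 1073741823
        System.out.println(fitsInUtf16String(reported)); // prints false
        System.out.println(reported - MAX_UTF16_CHARS);  // prints 16119537
    }
}
```

The response in this report exceeded the cap by roughly 16 million chars, so the failure is a hard size limit on a single String rather than heap exhaustion.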
Expected behavior: Nodes should not drop off the cluster because of a single request/response that doesn't affect the VM directly.
@Bukhtawar: Is there a way to reproduce this issue? That would help to debug, root-cause, and finally verify the fix.
I think the JDK issue mentioned in the description is incorrect. https://bugs.openjdk.org/browse/JDK-8190429
Compact strings are enabled by default since JDK 9. I haven't tried it, but it seems that behavior can be disabled by specifying -XX:-CompactStrings in jvm.options.
Since the issue (the OOM and the node drop) occurs because the response grows larger than a String can hold, is there an option to fail such search requests gracefully when they produce extra-large responses, and so avoid system instability (i.e. node drops)?
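One way to picture the graceful failure asked about here is a size check before the bytes are decoded into a String. This is only a minimal sketch under stated assumptions, not how OpenSearch handles it: the class GuardedUtf8Decode, the method decodeOrReject, and the choice of IllegalArgumentException are all hypothetical. It relies on the fact that UTF-8 never needs fewer bytes than the number of UTF-16 chars it decodes to, so the byte length is a safe upper bound on the resulting string length.

```java
import java.nio.charset.StandardCharsets;

public class GuardedUtf8Decode {
    // Assumption: same cap as the one quoted in the error message (Integer.MAX_VALUE >> 1).
    static final int MAX_UTF16_CHARS = Integer.MAX_VALUE >> 1;

    // Hypothetical helper: rejects oversized input with an ordinary exception
    // instead of letting the String constructor throw OutOfMemoryError.
    // The check is conservative: multi-byte UTF-8 input may be rejected even
    // though its decoded form would have fit.
    static String decodeOrReject(byte[] utf8, int offset, int length, int maxChars) {
        if (length > maxChars) {
            throw new IllegalArgumentException(
                "UTF-8 payload of " + length + " bytes may exceed the limit of " + maxChars + " chars");
        }
        return new String(utf8, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);

        // Within the limit: decodes normally.
        System.out.println(decodeOrReject(payload, 0, payload.length, MAX_UTF16_CHARS));

        // A tiny limit stands in for an oversized response in this demo.
        try {
            decodeOrReject(payload, 0, payload.length, 4);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A caller could catch the exception and fail only the offending request, which is the behavior the question is asking for. Where such a check would belong in the actual response path is left open here.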
There are a bunch of ideas and some projects along those lines. @Bukhtawar, do you know a specific one that would address this exact issue?