[BUG] Get field mapping API returns 404 error in a mixed cluster with multiple versions
Describe the bug
While investigating this flaky test, https://github.com/opensearch-project/OpenSearch/issues/2440, I found that it is caused by a real bug. The bug can be triggered in a mixed cluster running multiple versions (1.3.16 and 2.11.0). The deserialization method of GetFieldMappingsResponse does not correctly handle the stream from an older-version node. When the get field mapping test runs against a mixed cluster and the request is handled by a 2.x node that has to deserialize the response from a 1.x node, the deserialization method throws Expected single type but received [0]. The 2.x node then returns a 404 error because this line is hit. If the request is handled by a 1.x node instead, it works fine.
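To make the failure mode concrete, here is a minimal, self-contained sketch of the mismatch. It is not the OpenSearch code: the class FieldMappingsWireSketch, its methods, and the wire layout (index count, index name, type count, type name, fields) are all hypothetical and only model the assumption that a 1.x node can send a type count of 0 when no field matches. A strict reader fails with the same message as in the log, while a reader that tolerates zero types returns an empty mapping for the index.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the version mismatch; NOT the real OpenSearch classes or wire format.
class FieldMappingsWireSketch {

    // Models an older node: it writes a per-index type count, which is 0 when no field matched.
    static byte[] writeLegacy(String index, Map<String, String> fields) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(1);          // number of indices
            out.writeUTF(index);      // index name
            if (fields.isEmpty()) {
                out.writeInt(0);      // zero types: nothing matched
            } else {
                out.writeInt(1);      // exactly one type
                out.writeUTF("_doc"); // type name (assumed for illustration)
                out.writeInt(fields.size());
                for (Map.Entry<String, String> e : fields.entrySet()) {
                    out.writeUTF(e.getKey());
                    out.writeUTF(e.getValue());
                }
            }
        }
        return bytes.toByteArray();
    }

    // Models the newer node's reader. With tolerateZeroTypes=false it rejects a type count of 0.
    static Map<String, Map<String, String>> read(byte[] payload, boolean tolerateZeroTypes) throws IOException {
        Map<String, Map<String, String>> result = new HashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload))) {
            int indexCount = in.readInt();
            for (int i = 0; i < indexCount; i++) {
                String index = in.readUTF();
                int typeCount = in.readInt();
                if (typeCount == 0 && tolerateZeroTypes) {
                    result.put(index, new HashMap<>()); // no matching field: empty mapping
                    continue;
                }
                if (typeCount != 1) {
                    throw new IllegalStateException("Expected single type but received [" + typeCount + "]");
                }
                in.readUTF(); // discard the type name
                int fieldCount = in.readInt();
                Map<String, String> fields = new HashMap<>();
                for (int j = 0; j < fieldCount; j++) {
                    String name = in.readUTF();
                    String mapping = in.readUTF();
                    fields.put(name, mapping);
                }
                result.put(index, fields);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = writeLegacy("test", new HashMap<>());
        try {
            read(payload, false);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // Expected single type but received [0]
        }
        System.out.println(read(payload, true)); // {test={}}
    }
}
```

In this model the tolerant reader yields an empty mapping for the index, which matches the {"test":{"mappings":{}}} body that the 1.x nodes return for the same request.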
Related component
Search
To Reproduce
- Set up a 3-node cluster with mixed versions: two nodes on 1.3.16 and one node on 2.11.0
- Create an index with one shard and zero replicas
- Make sure the index's shard is not allocated on the 2.11.0 node, so that the following API request is forwarded to one of the 1.x nodes
- Call the get field mapping API against the 2.11.0 node
curl localhost:9400/test/_mapping/field/non-existen -i
- The API returns a 404 error with the response
{}
and the log on the 2.11.0 node shows the following error stack trace, which is not expected.
[2024-05-10T17:15:05,577][WARN ][o.o.t.InboundHandler ] [node-2.11] Failed to deserialize response from [127.0.0.1/127.0.0.1:9300]
org.opensearch.transport.TransportSerializationException: Failed to deserialize response from handler [org.opensearch.transport.TransportService$ContextRestoreResponseHandler/org.opensearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction$2@5a0a0884]
at org.opensearch.transport.InboundHandler.handleResponse(InboundHandler.java:393) [opensearch-2.11.0.jar:2.11.0]
at org.opensearch.transport.InboundHandler.messageReceived(InboundHandler.java:168) [opensearch-2.11.0.jar:2.11.0]
at org.opensearch.transport.InboundHandler.inboundMessage(InboundHandler.java:123) [opensearch-2.11.0.jar:2.11.0]
at org.opensearch.transport.TcpTransport.inboundMessage(TcpTransport.java:770) [opensearch-2.11.0.jar:2.11.0]
at org.opensearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:175) [opensearch-2.11.0.jar:2.11.0]
at org.opensearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:150) [opensearch-2.11.0.jar:2.11.0]
at org.opensearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:115) [opensearch-2.11.0.jar:2.11.0]
at org.opensearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:95) [transport-netty4-client-2.11.0.jar:2.11.0]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:280) [netty-handler-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) [netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.100.Final.jar:4.1.100.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.100.Final.jar:4.1.100.Final]
at java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: java.lang.IllegalStateException: Expected single type but received [0]
at org.opensearch.action.admin.indices.mapping.get.GetFieldMappingsResponse.<init>(GetFieldMappingsResponse.java:118) ~[opensearch-2.11.0.jar:2.11.0]
at org.opensearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction$2.read(TransportSingleShardAction.java:288) ~[opensearch-2.11.0.jar:2.11.0]
at org.opensearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction$2.read(TransportSingleShardAction.java:284) ~[opensearch-2.11.0.jar:2.11.0]
at org.opensearch.transport.TransportService$ContextRestoreResponseHandler.read(TransportService.java:1507) ~[opensearch-2.11.0.jar:2.11.0]
at org.opensearch.transport.TransportService$ContextRestoreResponseHandler.read(TransportService.java:1494) ~[opensearch-2.11.0.jar:2.11.0]
at org.opensearch.transport.InboundHandler.handleResponse(InboundHandler.java:389) ~[opensearch-2.11.0.jar:2.11.0]
... 26 more
- Call the get field mapping API against one of the 1.x nodes: everything works, and it returns 200 with the response
{"test":{"mappings":{}}}
Expected behavior
The get field mapping API should always return 200, regardless of which node in a mixed cluster receives the request.
Additional Details
No response