[BUG] Cross Cluster Search to same node fails auth

Open dforste opened this issue 2 months ago • 8 comments

What is the bug? Cross Cluster Search (CCS) to the same node fails authentication with "No user found for indices:data/read/search" (see the stack trace below).

How can one reproduce the bug? Steps to reproduce the behavior:

  1. Create a cluster.
  2. Add a remote cluster that uses the same node as its seed.
  3. Search remote cluster.
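
For reference, the steps above boil down to two requests. The following is a minimal Python sketch, not the reporter's actual setup: it assumes the remote cluster alias `local` and the node address `172.19.0.4:9300` that appear in the stack trace further down, and the helper names are hypothetical. It only builds the request body and path; sending them is left to whatever HTTP client is in use.

```python
import json
from urllib.parse import quote


def remote_cluster_settings(alias, seed):
    """Body for PUT _cluster/settings that registers `seed` as a remote cluster.

    `alias` and `seed` are illustrative values, not taken from a real config.
    """
    return {"persistent": {f"cluster.remote.{alias}.seeds": [seed]}}


def ccs_search_path(alias, index):
    """URL path for a cross-cluster search, with ':' percent-encoded."""
    return "/" + quote(f"{alias}:{index}", safe="*") + "/_search"


if __name__ == "__main__":
    # Step 2: point the remote cluster at a node of the same cluster.
    print(json.dumps(remote_cluster_settings("local", "172.19.0.4:9300")))
    # Step 3: search through the remote alias (or '*' for all remotes).
    print(ccs_search_path("local", "logs"))
    print(ccs_search_path("*", "logs"))  # -> /*%3Alogs/_search
```

The last path matches the one logged in the warning line of the stack trace.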

What is the expected behavior? Expect credentials to be passed and permissions to be honored.

What is your host/environment?

  • OS: Redhat/Docker Official image.
  • Version 3.3.0
  • Plugins

Do you have any additional context? https://forum.opensearch.org/t/ccs-query-to-same-cluster-is-not-working/17995/4 It looks like the only solution is to skip auth for CCS, which doesn't seem right.

Here is the stacktrace I observed:

[2025-12-09T19:24:02,331][WARN ][r.suppressed             ] [opensearch-1] path: /*%3Alogs/_search, params: {ignore_unavailable=true, preference=1765306631469, index=*:notifydis, timeout=30000ms, track_total_hits=true}
org.opensearch.transport.RemoteTransportException: [error while communicating with remote cluster [local]]
Caused by: org.opensearch.transport.RemoteTransportException: [opensearch-1][172.19.0.4:9300][indices:data/read/search]
Caused by: org.opensearch.OpenSearchSecurityException: No user found for indices:data/read/search
	at org.opensearch.security.filter.SecurityFilter.apply0(SecurityFilter.java:373) ~[?:?]
	at org.opensearch.security.filter.SecurityFilter.apply(SecurityFilter.java:175) ~[?:?]
	at org.opensearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:218) ~[opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.performanceanalyzer.action.PerformanceAnalyzerActionFilter.apply(PerformanceAnalyzerActionFilter.java:81) ~[?:?]
	at org.opensearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:218) ~[opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.action.support.TransportAction.execute(TransportAction.java:190) ~[opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.action.support.HandledTransportAction$TransportHandler.messageReceived(HandledTransportAction.java:133) ~[opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.action.support.HandledTransportAction$TransportHandler.messageReceived(HandledTransportAction.java:129) ~[opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.security.ssl.transport.SecuritySSLRequestHandler.messageReceivedDecorate(SecuritySSLRequestHandler.java:209) ~[?:?]
	at org.opensearch.security.transport.SecurityRequestHandler.messageReceivedDecorate(SecurityRequestHandler.java:353) ~[?:?]
	at org.opensearch.security.ssl.transport.SecuritySSLRequestHandler.messageReceived(SecuritySSLRequestHandler.java:157) ~[?:?]
	at org.opensearch.security.OpenSearchSecurityPlugin$6$1.messageReceived(OpenSearchSecurityPlugin.java:939) ~[?:?]
	at org.opensearch.indexmanagement.rollup.interceptor.RollupInterceptor$interceptHandler$1.messageReceived(RollupInterceptor.kt:120) ~[?:?]
	at org.opensearch.performanceanalyzer.transport.PerformanceAnalyzerTransportRequestHandler.messageReceived(PerformanceAnalyzerTransportRequestHandler.java:44) ~[?:?]
	at org.opensearch.performanceanalyzer.transport.RTFPerformanceAnalyzerTransportRequestHandler.messageReceived(RTFPerformanceAnalyzerTransportRequestHandler.java:63) ~[?:?]
	at org.opensearch.wlm.WorkloadManagementTransportInterceptor$RequestHandler.messageReceived(WorkloadManagementTransportInterceptor.java:63) ~[opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:108) ~[opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.transport.NativeMessageHandler.handleRequest(NativeMessageHandler.java:297) [opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.transport.NativeMessageHandler.handleMessage(NativeMessageHandler.java:169) [opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.transport.NativeMessageHandler.messageReceived(NativeMessageHandler.java:149) [opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.transport.InboundHandler.messageReceivedFromPipeline(InboundHandler.java:152) [opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.transport.InboundHandler.inboundMessage(InboundHandler.java:144) [opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.transport.TcpTransport.inboundMessage(TcpTransport.java:804) [opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.transport.InboundBytesHandler.forwardFragments(InboundBytesHandler.java:137) [opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.transport.InboundBytesHandler.doHandleBytes(InboundBytesHandler.java:77) [opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:124) [opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:113) [opensearch-3.3.2.jar:3.3.2]
	at org.opensearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:95) [transport-netty4-client-3.3.2.jar:3.3.2]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:280) [netty-handler-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) ~[netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:107) ~[netty-codec-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1519) [netty-handler-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1377) [netty-handler-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1428) [netty-handler-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530) [netty-codec-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469) [netty-codec-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) [netty-codec-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1357) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:796) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:697) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:660) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) [netty-transport-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998) [netty-common-4.1.125.Final.jar:4.1.125.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.125.Final.jar:4.1.125.Final]
	at java.lang.Thread.run(Thread.java:1447) [?:?]

dforste avatar Dec 09 '25 19:12 dforste

Just a curious question: What's the motivation to do a CCS to the local cluster?

nibix avatar Dec 10 '25 22:12 nibix

@nibix We have several clusters/datacenters, and it is easier to add each cluster as a remote cluster and set up all the index patterns like *:logs-xyz-app than it is to have people log into each cluster.

dforste avatar Dec 10 '25 22:12 dforste

Do you think this would be resolved / prevented if we used dedicated coordination nodes?

dforste avatar Dec 10 '25 22:12 dforste

Thank you for the swift response!

I do not think that coordinator nodes would substantially change the situation; but that is just a very quick intuitive assessment.

I think I understand your use case; still, just for completeness' sake, one should point out that CCS to the local cluster will always add some overhead to request handling: more network round trips, more serialization/deserialization, etc.

For fixing this I guess we have to look into how remote cluster requests can be distinguished from local cluster requests on the transport level. Based on that, one can then adapt the request decoration.

nibix avatar Dec 10 '25 22:12 nibix

I did some testing and I agree that coordinating-only nodes won't help.

CCS doesn't seem to function on nodes with roles: []. They seem to need the data role to enable CCS. It is odd because the config will be set everywhere but only nodes with data role respond to GET _remote/info or searches.

In my current three-node cluster, every third request seems to hit this exception.

dforste avatar Dec 11 '25 04:12 dforste

[Triage] Thank you for filing this issue, @dforste. If my understanding is correct, this only happens when using the <cluster_name>:<index_name> syntax? When searching with simply the index name (without the cluster-name prefix), does auth work as expected?

cwperks avatar Dec 15 '25 16:12 cwperks

@cwperks Yes, it works as expected when the search runs against the cluster itself without CCS. The only problem is with CCS connecting back to the same node. If it connects to a different node, results are returned and auth is correct. If it connects to the same node, you get the error.
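
The two observations in this thread (a three-node cluster where every third request fails, and failures only when the CCS connection lands on the same node) fit together if the remote connection cycles evenly over the nodes. Here is a minimal Python sketch of that arithmetic. Strict round-robin selection is an assumption made for illustration, not confirmed behavior, and the node names `opensearch-2` and `opensearch-3` are hypothetical (only `opensearch-1` appears in the log).

```python
from itertools import cycle


def count_self_connections(nodes, coordinator, requests):
    """Count requests whose remote connection lands back on the coordinator.

    Assumes the remote connection picks target nodes in strict round-robin
    order, which is an illustrative simplification.
    """
    targets = cycle(nodes)
    hits = 0
    for _ in range(requests):
        if next(targets) == coordinator:
            hits += 1
    return hits


if __name__ == "__main__":
    nodes = ["opensearch-1", "opensearch-2", "opensearch-3"]
    failed = count_self_connections(nodes, "opensearch-1", 9)
    print(f"{failed} of 9 requests connect back to the coordinating node")
```

Under this assumption one request in three connects back to the node that received it, which matches the reported failure rate.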

dforste avatar Dec 17 '25 02:12 dforste

I was able to work around this by creating another cluster without data (only Dashboards) to coordinate CCS. This prevents the connection back to the same node.

dforste avatar Dec 17 '25 02:12 dforste