opensearch-k8s-operator

Not able to start cluster in Docker Desktop. Pods are not coming up

Open skhilar opened this issue 3 years ago • 6 comments

[2022-06-26T06:58:30,286][INFO ][o.o.c.c.JoinHelper ] [my-cluster-masters-0] failed to join {my-cluster-bootstrap-0}{XjYfnMojQVyS1nfzu8eq6Q}{UmdFhhnpT82d5jpg6KRWNA}{my-cluster-bootstrap-0}{10.1.2.125:9300}{m}{shard_indexing_pressure_enabled=true} with JoinRequest{sourceNode={my-cluster-masters-0}{AvA7jGY2QNGTLqPOpdxOPg}{EKD489jDRHSgnHSSPf8Cuw}{my-cluster-masters-0}{10.1.2.126:9300}{dm}{shard_indexing_pressure_enabled=true}, minimumTerm=1, optionalJoin=Optional.empty}
org.opensearch.transport.RemoteTransportException: [my-cluster-bootstrap-0][10.1.2.125:9300][internal:cluster/coordination/join]
Caused by: java.lang.IllegalStateException: failure when sending a validation request to node
    at org.opensearch.cluster.coordination.Coordinator$2.onFailure(Coordinator.java:626) ~[opensearch-2.0.1.jar:2.0.1]
    at org.opensearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:74) ~[opensearch-2.0.1.jar:2.0.1]
    at org.opensearch.security.transport.SecurityInterceptor$RestoringTransportResponseHandler.handleException(SecurityInterceptor.java:318) ~[?:?]
    at org.opensearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1370) ~[opensearch-2.0.1.jar:2.0.1]
    at org.opensearch.transport.InboundHandler.lambda$handleException$3(InboundHandler.java:420) ~[opensearch-2.0.1.jar:2.0.1]
    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:739) ~[opensearch-2.0.1.jar:2.0.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
    at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: org.opensearch.transport.RemoteTransportException: [my-cluster-masters-0][10.1.2.126:9300][internal:cluster/coordination/join/validate]
Caused by: org.opensearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid I1fR5DlPTi-vCu1jClFcmg than local cluster uuid xoWTSY6dSyyTsDh8zfPbcA, rejecting
    at org.opensearch.cluster.coordination.JoinHelper.lambda$new$4(JoinHelper.java:208) ~[opensearch-2.0.1.jar:2.0.1]
    at org.opensearch.security.ssl.transport.SecuritySSLRequestHandler.messageReceivedDecorate(SecuritySSLRequestHandler.java:193) ~[?:?]
    at org.opensearch.security.transport.SecurityRequestHandler.messageReceivedDecorate(SecurityRequestHandler.java:283) ~[?:?]
    at org.opensearch.security.ssl.transport.SecuritySSLRequestHandler.messageReceived(SecuritySSLRequestHandler.java:153) ~[?:?]
    at org.opensearch.security.OpenSearchSecurityPlugin$7$1.messageReceived(OpenSearchSecurityPlugin.java:651) ~[?:?]
    at org.opensearch.indexmanagement.rollup.interceptor.RollupInterceptor$interceptHandler$1.messageReceived(RollupInterceptor.kt:118) ~[?:?]
    at org.opensearch.performanceanalyzer.transport.PerformanceAnalyzerTransportRequestHandler.messageReceived(PerformanceAnalyzerTransportRequestHandler.java:43) ~[?:?]
    at org.opensearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:103) ~[opensearch-2.0.1.jar:2.0.1]
    at org.opensearch.transport.InboundHandler$RequestHandler.doRun(InboundHandler.java:453) ~[opensearch-2.0.1.jar:2.0.1]
    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:798) ~[opensearch-2.0.1.jar:2.0.1]
    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-2.0.1.jar:2.0.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
    at java.lang.Thread.run(Thread.java:833) ~[?:?]

skhilar avatar Jun 26 '22 07:06 skhilar

Hey @skhilar, can you please share your YAML file?

idanl21 avatar Jun 26 '22 13:06 idanl21

apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: my-cluster
  namespace: default
spec:
  general:
    version: 2.0.1
    httpPort: 9200
    vendor: opensearch
    serviceName: my-cluster
  dashboards:
    version: 2.0.1
    enable: true
    replicas: 1
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "1Gi"
        cpu: "500m"
  confMgmt:
    smartScaler: true
  nodePools:
    - component: masters
      replicas: 1
      diskSize: "2Gi"
      NodeSelector:
      resources:
        requests:
          memory: "1Gi"
          cpu: "500m"
        limits:
          memory: "2Gi"
          cpu: "500m"
      roles:
        - "master"
        - "data"
        - "ingest"
      persistence:
        pvc:
          storageClass: hostpath
          accessModes:
            - ReadWriteOnce

skhilar avatar Jun 26 '22 14:06 skhilar

Hi @skhilar. I see that you are using version 2.0.1 of OpenSearch, which the operator does not support yet. Could you please check whether it works with version 1.3.2?

One side note: could you please use the markdown code blocks feature when posting logs or YAML, to preserve formatting?

swoehrl-mw avatar Jun 29 '22 12:06 swoehrl-mw

Hey @skhilar, let us know if you need any help

idanl21 avatar Jul 03 '22 13:07 idanl21

It does not work for me. I can see that if a pod takes longer to start, it is killed because of the probe configuration (readiness and liveness probes). We should probably have a way to configure these settings in the operator.

skhilar avatar Jul 04 '22 09:07 skhilar

Hi @skhilar. You are trying to launch a single-node OpenSearch cluster (replicas: 1), which the operator does not support yet.
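
For illustration, here is a minimal sketch of how the posted manifest might be adjusted to follow both suggestions in this thread: the version is set to 1.3.2 and the node pool has more than one replica. The value `replicas: 3` is an assumption (the thread only states that a single node is not supported), and the remaining fields are copied from the manifest posted above rather than verified against the operator. Three pods with these resource requests may also need more memory than a default Docker Desktop setup provides.

```yaml
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: my-cluster
  namespace: default
spec:
  general:
    version: 1.3.2            # version suggested earlier in this thread
    httpPort: 9200
    vendor: opensearch
    serviceName: my-cluster
  dashboards:
    version: 1.3.2            # assumption: kept in step with the cluster version
    enable: true
    replicas: 1
  nodePools:
    - component: masters
      replicas: 3             # assumption: any multi-node count; 1 is not supported
      diskSize: "2Gi"
      resources:
        requests:
          memory: "1Gi"
          cpu: "500m"
        limits:
          memory: "2Gi"
          cpu: "500m"
      roles:
        - "master"
        - "data"
        - "ingest"
      persistence:
        pvc:
          storageClass: hostpath
          accessModes:
            - ReadWriteOnce
```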

swoehrl-mw avatar Jul 11 '22 14:07 swoehrl-mw

Closing this as no longer relevant. Single-node clusters are not supported by the operator.

swoehrl-mw avatar Nov 24 '22 14:11 swoehrl-mw