hazelcast-kubernetes

java.lang.IllegalStateException: Unknown protocol: HTT

Open andrey-gava opened this issue 5 years ago • 9 comments

We have a Java Spring app with embedded Hazelcast, running in Kubernetes. For member lookup we use DNS Lookup Discovery mode. All instances of the Hazelcast cluster start fine, and the members find each other. But every time I scale the deployment or update the Docker image, we see a lot of warnings like this in the logs:

2020-06-09 13:38:06.578 WARN 1 --- [.IO.thread-in-2] com.hazelcast.nio.tcp.TcpIpConnection : [10.47.0.6]:5701 [abc-2-app] [3.12.7] Connection[id=303, /10.47.0.6:5736->/10.42.0.9:5701, qualifier=null, endpoint=[10.42.0.9]:5701, alive=false, type=NONE] closed. Reason: Exception in Connection[id=303, /10.47.0.6:5736->/10.42.0.9:5701, qualifier=null, endpoint=[10.42.0.9]:5701, alive=true, type=NONE], thread=hz._hzInstance_1_abc-2-app.IO.thread-in-2
java.lang.IllegalStateException: Unknown protocol: HTT
    at com.hazelcast.nio.tcp.UnifiedProtocolDecoder.onRead(UnifiedProtocolDecoder.java:107)
    at com.hazelcast.internal.networking.nio.NioInboundPipeline.process(NioInboundPipeline.java:135)
    at com.hazelcast.internal.networking.nio.NioThread.processSelectionKey(NioThread.java:369)
    at com.hazelcast.internal.networking.nio.NioThread.processSelectionKeys(NioThread.java:354)
    at com.hazelcast.internal.networking.nio.NioThread.selectLoop(NioThread.java:280)
    at com.hazelcast.internal.networking.nio.NioThread.run(NioThread.java:235)

    join.getKubernetesConfig().setEnabled(true)
            .setProperty("service-dns", hazelcastProperties.getKubernetes().getServiceDns());

  hazelcast:
    clusterName: abc-app
    multicast:
      enabled: false
    kubernetes:
      enabled: true
      service-dns: abc-hazelcast.abc-2.svc.cluster.local

apiVersion: v1
kind: Service
metadata:
  name: abc-hazelcast
  namespace: abc-2
spec:
  ports:
  - name: hazelcast
    port: 5701
    targetPort: 5701
  selector:
    app: abc-2
    env: dev
    type: backend
  clusterIP: None

        <dependency>
            <groupId>com.hazelcast</groupId>
            <artifactId>spring-data-hazelcast</artifactId>
            <version>2.2.5</version>
        </dependency>
        <dependency>
            <groupId>com.hazelcast</groupId>
            <artifactId>hazelcast-kubernetes</artifactId>
            <version>1.5.3</version>
        </dependency>

andrey-gava, Jun 09 '20 11:06

Hi @andrey-gava, how do you scale? Using kubectl scale or the helm upgrade command? What version of Hazelcast are you trying to upgrade to?

mesutcelik, Jun 09 '20 11:06

Hi @andrey-gava, how do you scale? Using kubectl scale or the helm upgrade command?

Manually from the shell: kubectl scale deployments.apps -n abc-2 abc-v2-dev-local --replicas=2

From Jenkins like this: kubectl set image -n abc-2 deployment/$JOB_NAME JOB_NAME=nexus.company.com/$JOB_NAME:$BUILD_NUMBER

What version of hazelcast are you trying to upgrade to?

I don't want to upgrade. I'm just trying to understand why this warning appears, because it produces over 110 messages a second every time I scale, and it clutters my technical logs when debugging.

andrey-gava, Jun 09 '20 11:06

This issue is probably related to Hazelcast itself, not the Kubernetes plugin, because we see it in some other scenarios, such as https://github.com/hazelcast/hazelcast/issues/15446

I was not able to reproduce it with just Hazelcast. I tried:

  • Start Hazelcast cluster (3 members) with DNS Lookup discovery
  • Scale down to 2 members
  • Scale up to 3 members

No such logs.

@andrey-gava Would you be able to provide the steps to reproduce, so we could have a closer look into that?

leszko, Jun 15 '20 09:06

@andrey-gava Would you be able to provide the steps to reproduce, so we could have a closer look into that?

This issue is probably related to Hazelcast itself

But you are not 100% sure about it, am I right?

From the Kubernetes side, there are no specific settings in the deployment or the service.

  • K8s v1.16.10
  • Network addon: docker.io/weaveworks/weave-kube:2.6.0
  • Java Docker image: bellsoft/liberica-openjdk-alpine:11.0.7-10

But we have other deployments in this k8s cluster that use the same Hazelcast version and the same Kubernetes plugin, and they don't have this issue. Maybe it really is somewhere in the application code.

As for the project, it's a commercial product, so I can't share it.

andrey-gava, Jun 18 '20 11:06

Posting in case it helps someone.
Look for the log message similar to the following:

2021-01-18 13:07:57.902  INFO 7 --- [           main] c.h.s.d.integration.DiscoveryService     : [10.233.70.147]:5701 [dev] [4.1.1] Kubernetes Discovery properties: { service-dns: null, service-dns-timeout: 5, service-name: null, service-port: 0, service-label: null, service-label-value: true, namespace: api, pod-label: null, pod-label-value: null, resolve-not-ready-addresses: true, use-node-name-as-external-address: false, kubernetes-api-retries: 3, kubernetes-master: https://kubernetes.default.svc}

Make sure that the service-name is not null or it will look for members among all services under the namespace -- in my case, this namespace was api.
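If you want to check this automatically rather than by eye, a minimal sketch along these lines might do. The class and method names are made up for illustration; the only assumption about Hazelcast is that the log line has the shape quoted above, with "key: value" pairs inside braces.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DiscoveryPropertiesCheck {

    /** Extracts the "key: value" pairs found between the first '{' and the last '}' of a log line. */
    public static Map<String, String> parseProperties(String logLine) {
        Map<String, String> props = new LinkedHashMap<>();
        int start = logLine.indexOf('{');
        int end = logLine.lastIndexOf('}');
        if (start < 0 || end <= start) {
            return props; // no property block in this line
        }
        for (String pair : logLine.substring(start + 1, end).split(",")) {
            // Split on the first ':' only, so values such as https://... stay intact.
            int sep = pair.indexOf(':');
            if (sep > 0) {
                props.put(pair.substring(0, sep).trim(), pair.substring(sep + 1).trim());
            }
        }
        return props;
    }

    /** True when the key is missing or was logged as the literal text "null". */
    public static boolean isUnset(Map<String, String> props, String key) {
        String value = props.get(key);
        return value == null || "null".equals(value);
    }

    public static void main(String[] args) {
        String line = "Kubernetes Discovery properties: { service-dns: null, service-name: null, "
                + "namespace: api, kubernetes-master: https://kubernetes.default.svc}";
        Map<String, String> props = parseProperties(line);
        for (String key : new String[] {"service-dns", "service-name"}) {
            System.out.println(key + " unset: " + isUnset(props, key));
        }
    }
}
```

If both service-dns and service-name come back unset, the properties most likely never reached the Kubernetes config, which is what happened in my case.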

My issue had to do with setting the property in the wrong place. I was calling setProperty() directly on the config instance itself instead of on the object returned by getKubernetesConfig(). This is what working code looks like:

    config
        .getNetworkConfig()
        .getJoin()
        .getKubernetesConfig()
        .setEnabled(true)
        .setProperty("namespace", NAMESPACE)
        .setProperty("service-name", SERVICE_NAME);

tlynema-bravolt, Jan 18 '21 14:01

@leszko Could this problem be related to the situation where Hazelcast is embedded in the application and its container exposes two ports, one for HTTP (Tomcat) and a second for Hazelcast? Maybe setting service-port would make it go away. But I only see this setting described for the API discovery mode, not for DNS.

andrey-gava, Mar 22 '21 06:03

@leszko Could this problem be related to the situation where Hazelcast is embedded in the application and its container exposes two ports, one for HTTP (Tomcat) and a second for Hazelcast? Maybe setting service-port would make it go away. But I only see this setting described for the API discovery mode, not for DNS.

To be honest I don't think it's related. Could you add the exact minimum steps to reproduce this issue?

leszko, Mar 22 '21 08:03

The problem was in the linkerd service mesh. #307

Add this to the deployment:

  template:
    metadata:
      annotations:
        linkerd.io/inject: enabled
        config.linkerd.io/skip-outbound-ports: "5701"
        config.linkerd.io/skip-inbound-ports: "5701"
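
One plausible reading of why this helps (an assumption, not something confirmed in this thread): the decoder in the stack trace reports the first three bytes it read, and "HTT" is how an HTTP response such as "HTTP/1.1 ..." begins, so something on the path was answering in HTTP on port 5701 instead of a Hazelcast member. The two skip annotations keep the linkerd proxy out of the traffic on that port. A minimal Java sketch of that kind of check follows; the class name and the set of accepted headers are illustrative and are not Hazelcast's actual code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Set;

public class ProtocolHeaderSketch {

    // Assumed set of accepted three-byte headers, for illustration only.
    // The real list depends on the Hazelcast version and enabled endpoints.
    static final Set<String> KNOWN_HEADERS = Set.of("HZC", "CB2");

    /** Returns the first three bytes of the incoming data as ASCII text. */
    public static String firstThreeBytes(byte[] data) {
        if (data.length < 3) {
            throw new IllegalArgumentException("need at least 3 bytes");
        }
        return new String(data, 0, 3, StandardCharsets.US_ASCII);
    }

    /** Accepts data that starts with a known header, rejects everything else. */
    public static String classify(byte[] data) {
        String header = firstThreeBytes(data);
        if (!KNOWN_HEADERS.contains(header)) {
            throw new IllegalStateException("Unknown protocol: " + header);
        }
        return header;
    }

    public static void main(String[] args) {
        String[] samples = {"HZC", "HTTP/1.1 400 Bad Request", "OPTIONS / HTTP/1.1"};
        for (String sample : samples) {
            try {
                String header = classify(sample.getBytes(StandardCharsets.US_ASCII));
                System.out.println("accepted: " + header);
            } catch (IllegalStateException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        }
    }
}
```

Under the same reading, a message ending in "OPT" would point at an HTTP OPTIONS request reaching the member port.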

Issue can be closed.

andrey-gava, Nov 29 '21 10:11

Hey folks,

I am facing the same issue. We have a Java Spring app with embedded Hazelcast, running in Kubernetes. For member lookup we use DNS Lookup Discovery mode. Member creation looks good, but Hazelcast tries to connect to some random IP and gives the warnings below.

java.lang.IllegalStateException: Unknown protocol: OPT
java.lang.IllegalStateException: TLS handshake header detected, but plain protocol header was expected.
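
To see which IPs the service DNS name actually returns, a small lookup run from inside a pod can help. This is only a diagnostic sketch: the class name is made up, and it is not necessarily how the plugin resolves members internally. For a headless service (clusterIP: None) the name should resolve to the IPs of the pods behind it, so any address that is not a Hazelcast pod is worth a closer look.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class ServiceDnsLookup {

    /** Resolves a host name and returns every address it maps to, as text. */
    public static List<String> resolve(String host) throws UnknownHostException {
        List<String> addresses = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(host)) {
            addresses.add(address.getHostAddress());
        }
        return addresses;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Pass your own service-dns value as the first argument;
        // "localhost" is only a placeholder so the sketch runs anywhere.
        String host = args.length > 0 ? args[0] : "localhost";
        for (String address : resolve(host)) {
            System.out.println(host + " -> " + address);
        }
    }
}
```

Compare the output with the pod IPs listed by kubectl get pods -o wide for the same namespace.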

AshishWat, Dec 13 '23 06:12