zipkin to elasticsearch error

Open vjvel opened this issue 2 years ago • 8 comments

We deployed Zipkin with Elasticsearch in k8s and it was working fine. Now we are pointing Zipkin at an Elasticsearch instance in AWS, with credentials, and it is not working. We checked connectivity from Zipkin to Elasticsearch using curl, and that works, but the application gives the error below.

2022-03-10 12:51:31.229  WARN [/] 1 --- [orker-epoll-2-2] c.l.a.c.l.LoggingClient                  : [creqId=852846f2, sreqId=5dd28847][http://UNKNOWN/#GET] Request: {startTime=2022-03-10T12:51:21.228Z(1646916681228066), length=0B, duration=10000ms(10000784449ns), cause=com.linecorp.armeria.client.UnprocessedRequestException: com.linecorp.armeria.client.endpoint.EmptyEndpointGroupException, scheme=none+http, name=get-node, headers=[]}
2022-03-10 12:51:31.230  WARN [/] 1 --- [orker-epoll-2-2] c.l.a.c.l.LoggingClient                  : [creqId=852846f2, sreqId=5dd28847][http://UNKNOWN/#GET] Response: {startTime=2022-03-10T12:51:31.229Z(1646916691229031), length=0B, duration=0ns, totalDuration=10000ms(10000966403ns), cause=com.linecorp.armeria.client.UnprocessedRequestException: com.linecorp.armeria.client.endpoint.EmptyEndpointGroupException, headers=[]}

com.linecorp.armeria.client.UnprocessedRequestException: com.linecorp.armeria.client.endpoint.EmptyEndpointGroupException
        at com.linecorp.armeria.client.UnprocessedRequestException.of(UnprocessedRequestException.java:45) ~[armeria-1.13.4.jar:?]
        at com.linecorp.armeria.client.HttpClientDelegate.execute(HttpClientDelegate.java:73) ~[armeria-1.13.4.jar:?]
        at com.linecorp.armeria.client.HttpClientDelegate.execute(HttpClientDelegate.java:47) ~[armeria-1.13.4.jar:?]
        at com.linecorp.armeria.client.metric.AbstractMetricCollectingClient.execute(AbstractMetricCollectingClient.java:61) ~[armeria-1.13.4.jar:?]
        at com.linecorp.armeria.client.encoding.DecodingClient.executeAndDecodeResponse(DecodingClient.java:160) ~[armeria-1.13.4.jar:?]
        at com.linecorp.armeria.client.encoding.DecodingClient.execute(DecodingClient.java:119) ~[armeria-1.13.4.jar:?]
        at com.linecorp.armeria.client.encoding.DecodingClient.execute(DecodingClient.java:49) ~[armeria-1.13.4.jar:?]
        at com.linecorp.armeria.client.logging.AbstractLoggingClient.execute(AbstractLoggingClient.java:125) ~[armeria-1.13.4.jar:?]
        at zipkin2.server.internal.elasticsearch.BasicAuthInterceptor.execute(BasicAuthInterceptor.java:45) ~[classes/:?]
        at zipkin2.server.internal.elasticsearch.BasicAuthInterceptor.execute(BasicAuthInterceptor.java:30) ~[classes/:?]
        at com.linecorp.armeria.internal.client.ClientUtil.pushAndExecute(ClientUtil.java:153) ~[armeria-1.13.4.jar:?]
        at com.linecorp.armeria.internal.client.ClientUtil.initContextAndExecuteWithFallback(ClientUtil.java:107) ~[armeria-1.13.4.jar:?]
        at com.linecorp.armeria.internal.client.ClientUtil.lambda$initContextAndExecuteWithFallback$0(ClientUtil.java:81) ~[armeria-1.13.4.jar:?]
        at java.util.concurrent.CompletableFuture.uniHandle(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture.postComplete(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source) ~[?:?]
        at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:547) ~[armeria-1.13.4.jar:?]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [netty-common-4.1.70.Final.jar:4.1.70.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469) [netty-common-4.1.70.Final.jar:4.1.70.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384) [netty-transport-classes-epoll-4.1.70.Final.jar:4.1.70.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [netty-common-4.1.70.Final.jar:4.1.70.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.70.Final.jar:4.1.70.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.70.Final.jar:4.1.70.Final]
        at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: com.linecorp.armeria.client.endpoint.EmptyEndpointGroupException
        at com.linecorp.armeria.client.endpoint.EmptyEndpointGroupException.get(EmptyEndpointGroupException.java:37) ~[armeria-1.13.4.jar:?]
        ... 24 more

2022-03-10 12:51:31.233  WARN [/] 1 --- [orker-epoll-2-2] z.s.i.BodyIsExceptionMessage             : Unexpected error handling request.

java.util.concurrent.RejectedExecutionException: EmptyEndpointGroupException
        at zipkin2.elasticsearch.internal.client.HttpCall.lambda$sendRequest$3(HttpCall.java:227) ~[zipkin-storage-elasticsearch-2.23.16.jar:?]
        at java.util.concurrent.CompletableFuture.uniExceptionally(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture.postComplete(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source) ~[?:?]
        at com.linecorp.armeria.common.util.UnmodifiableFuture.doCompleteExceptionally(UnmodifiableFuture.java:139) ~[armeria-1.13.4.jar:?]
        at com.linecorp.armeria.common.util.UnmodifiableFuture.lambda$wrap$0(UnmodifiableFuture.java:98) ~[armeria-1.13.4.jar:?]
        at java.util.concurrent.CompletableFuture.uniHandle(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture.postComplete(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source) ~[?:?]
        at com.linecorp.armeria.common.stream.DeferredStreamMessage.lambda$delegate$0(DeferredStreamMessage.java:132) ~[armeria-1.13.4.jar:?]
        at java.util.concurrent.CompletableFuture.uniHandle(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture.uniHandleStage(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture.handle(Unknown Source) ~[?:?]
        at com.linecorp.armeria.common.stream.DeferredStreamMessage.delegate(DeferredStreamMessage.java:128) ~[armeria-1.13.4.jar:?]
        at com.linecorp.armeria.common.DeferredHttpResponse.delegate(DeferredHttpResponse.java:47) ~[armeria-1.13.4.jar:?]
        at com.linecorp.armeria.common.DeferredHttpResponse.lambda$delegateWhenComplete$0(DeferredHttpResponse.java:58) ~[armeria-1.13.4.jar:?]
        at java.util.concurrent.CompletableFuture.uniHandle(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture.postComplete(Unknown Source) ~[?:?]
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source) ~[?:?]
        at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:547) ~[armeria-1.13.4.jar:?]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [netty-common-4.1.70.Final.jar:4.1.70.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469) [netty-common-4.1.70.Final.jar:4.1.70.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384) [netty-transport-classes-epoll-4.1.70.Final.jar:4.1.70.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [netty-common-4.1.70.Final.jar:4.1.70.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.70.Final.jar:4.1.70.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.70.Final.jar:4.1.70.Final]
        at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: com.linecorp.armeria.client.endpoint.EmptyEndpointGroupException
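
The curl check mentioned at the top of this report can be reproduced with credentials from a script. The following is a minimal sketch, assuming the cluster uses HTTP basic authentication (the stack trace passes through `BasicAuthInterceptor`, which suggests it does). The function name `build_health_request`, the URL, and the credentials are hypothetical placeholders, not values from this thread.

```python
import base64
import urllib.request


def build_health_request(base_url, username, password):
    """Build a GET request for /_cluster/health with a basic-auth header.

    base_url, username and password are placeholders (assumptions);
    substitute the values your Zipkin deployment is configured with.
    """
    # Basic auth is "username:password", base64-encoded.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    url = base_url.rstrip("/") + "/_cluster/health"
    return urllib.request.Request(url, headers={"Authorization": "Basic " + token})
```

To send it, pass the result to `urllib.request.urlopen(request, timeout=10)` and read the response body. A success here only shows that the host is reachable and the credentials are accepted from wherever the script runs; it does not show that Zipkin's own HTTP client can find the endpoint.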

vjvel avatar Mar 10 '22 12:03 vjvel

I have had the same issue on k8s for the past two months (the first deployment in March 2022 was fine). All LoggingClient logs for the ES API show an UNKNOWN host, like "http://UNKNOWN/#GET". I changed the logger level to debug, and the DNS resolver is OK. I once thought it was an ES problem, but other ES clients are working. While debugging I found something confusing: when the Zipkin server has started and failed to connect to ES, restarting one pod of the ES cluster makes the Zipkin server work again, and LoggingClient then logs the ES_HOSTS value instead of UNKNOWN. But if I restart the Zipkin server, the issue comes back.

SingKS8 avatar Jul 11 '22 13:07 SingKS8

We are experiencing the same issue. Anyone have any idea why this is happening?

John-Athan avatar Jul 12 '22 11:07 John-Athan

I think this might be an Alpine base image network issue on Kubernetes. I repackaged the Zipkin image on the eclipse-temurin Ubuntu base image, and it works. I find that more and more JVM libraries running on Alpine base images have network issues since Kubernetes 1.24-1.25 (1.24 at the time of my last comment), for example components such as Spring config server and Spring config client, and libraries such as Spring RestTemplate and JGit. In my team, since Kubernetes was upgraded, almost all projects have needed to be repackaged from an Alpine base to a non-Alpine one. So, would the openzipkin team consider providing a non-Alpine base image as an option?
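
One way to compare base images is to run the same name-resolution check inside each of them. The sketch below is an illustration, not a diagnosis: `resolve_host` is a hypothetical helper, the default port 9200 is an assumption, and it only exercises the operating system's resolver. Zipkin's HTTP client (Armeria) may resolve names through a different code path, so a successful lookup here does not rule out the `EmptyEndpointGroupException` above, which is raised when the client has no endpoints to choose from.

```python
import socket


def resolve_host(host, port=9200, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses the resolver gives for host.

    Returns an empty list when the lookup fails. The resolver argument
    exists only so the lookup can be swapped out (for example in tests).
    """
    try:
        infos = resolver(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Running `resolve_host("<your ES hostname>")` in a container built from each base image and comparing the results gives a first hint as to whether the two images see the same addresses.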

SingKS8 avatar Oct 12 '22 02:10 SingKS8

Could you share your non-Alpine image build? Thanks.

jacklu2016 avatar Nov 10 '22 06:11 jacklu2016

I've been having the same issue. If I delete the K8s deployment to change, for example, heap memory, and apply it again, Zipkin can no longer search on ES. It only works again if I rename the index using the env var ES_INDEX in the Zipkin deployment. It doesn't seem to be a network issue but an app issue, where Zipkin loses some index reference when we redeploy.

But I would appreciate it if you shared your image with us, @jacklu2016. Thanks.

jeff-lemos avatar Dec 01 '22 13:12 jeff-lemos

To add to this: if my previous index was zipkin-test and for some reason I delete the Zipkin deployment to change something, or change something on ES that affects the index, I have to rename the index, and the new name has to be something different like new-test or new-zipkin-test, not zipkin-test2 or zipkin-another-test. I don't know why, but if the name is too similar, it keeps failing.

jeff-lemos avatar Dec 01 '22 13:12 jeff-lemos

Same issue, although I am not running this on Kubernetes. The instance can successfully call the /_cluster/health?pretty endpoint using curl, but when I launch Zipkin with ES_HOSTS=..... specified and connect to the Zipkin UI, I am met with the error above. Querying /_cat/shards/ on the Elasticsearch instance returns no names related to Zipkin.
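
The `/_cat/shards` check described above can be scripted by filtering the plain-text response on its first column, which holds the index name. This is a small sketch under assumptions: `indices_with_prefix` is a hypothetical helper, and the default prefix `zipkin` is assumed to match the index name Zipkin is configured with (set it to your ES_INDEX value if you changed it).

```python
def indices_with_prefix(cat_shards_text, prefix="zipkin"):
    """Return the sorted, unique index names in /_cat/shards output
    that start with the given prefix.

    cat_shards_text is the plain-text response body, one shard per line,
    with the index name in the first whitespace-separated column.
    """
    names = set()
    for line in cat_shards_text.splitlines():
        columns = line.split()
        if columns and columns[0].startswith(prefix):
            names.add(columns[0])
    return sorted(names)
```

An empty result matches what is reported here: no Zipkin-related shards exist, which is consistent with Zipkin never having written to this cluster.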

DaneLyttinen avatar Mar 08 '23 04:03 DaneLyttinen

Upgrade the JDK version, then run Zipkin.

tangmingxiang avatar Jun 06 '23 14:06 tangmingxiang

Zipkin was renovated at the end of last year, including updates to everything, JRE and Alpine included. Closing this out, but if you hit this with a current version, please ping back. FYI, zipkin-helm has also been much renovated, for those using k8s.

codefromthecrypt avatar Apr 16 '24 07:04 codefromthecrypt