UI Search stops working

Open jeff-lemos opened this issue 2 years ago • 4 comments

Describe the Bug

I'm running Zipkin in a Kubernetes deployment and storing traces in Elasticsearch. It mostly works, but it sometimes misbehaves, for example:

  • Today, running a few basic searches to test query performance seemed to make the application stop working. I did nothing other than search, and it stopped working.
  • If I redeploy my deployment, it loses the "link" between Zipkin and Elasticsearch, so I have to change the ES_INDEX variable to make it work again, but that creates a new index from scratch.

Response error and UI error

image image

The query

curl -X GET "http://zipkin.url/zipkin/api/v2/traces?serviceName=SERVICE_NAME&spanName=%2Fv6%2Fblock-account%2F%3Cstring%3Auser_ting%3E%2Fconfirm-block&limit=10" -H "accept: application/json"
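To tell a slow query from a hung one, it can help to repeat the same request with an explicit client-side timeout and record how long it takes. The following is a minimal sketch using only the Python standard library; the function names and the 30-second timeout are illustrative assumptions, and `http://zipkin.url/zipkin` is the placeholder host from the query above.

```python
import time
import urllib.error
import urllib.parse
import urllib.request


def build_trace_query_url(base_url, service_name, span_name, limit=10):
    """Build a /api/v2/traces URL with percent-encoded query parameters."""
    params = urllib.parse.urlencode(
        {"serviceName": service_name, "spanName": span_name, "limit": limit}
    )
    return f"{base_url.rstrip('/')}/api/v2/traces?{params}"


def timed_query(url, timeout_seconds=30.0):
    """Run the query once; return (HTTP status or None, elapsed seconds)."""
    request = urllib.request.Request(url, headers={"accept": "application/json"})
    start = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
            response.read()
            status = response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status (for example 500).
        status = error.code
    except OSError:
        # No usable response: connection failure or client-side timeout.
        status = None
    return status, time.monotonic() - start


if __name__ == "__main__":
    url = build_trace_query_url(
        "http://zipkin.url/zipkin",  # placeholder host taken from the issue
        "SERVICE_NAME",
        "/v6/block-account/<string:user_ting>/confirm-block",
    )
    print(url)
    print(timed_query(url))
```

If the elapsed time is consistently close to a fixed value before the error comes back, that suggests a configured timeout is being hit rather than the process being dead.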

The log error

It's the same in every case and situation where it stops:

2022-12-05 22:19:17.942  WARN [/] 1 --- [orker-epoll-2-3] z.s.i.BodyIsExceptionMessage             : Unexpected error handling request.

com.linecorp.armeria.server.RequestTimeoutException: null
        at com.linecorp.armeria.server.RequestTimeoutException.get(RequestTimeoutException.java:36) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:467) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$setTimeoutNanosFromNow0$13(CancellationScheduler.java:293) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
        at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:391) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at java.lang.Thread.run(Unknown Source) [?:?]

Steps to Reproduce

Deploy Zipkin through Kubernetes, get it working, and then redeploy only the Zipkin part. It won't be able to search the same index; you'll have to set the ES_INDEX variable to a completely different index name to get it working again.

Expected Behaviour

Sometimes we have to redeploy the Zipkin part, for example to increase heap memory. When I do, Zipkin stops working. I should be able to delete the deployment and recreate it, and everything should work properly unless I change the ES_INDEX variable on purpose.
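One way to check whether a redeployed Zipkin still points at the existing data is to list the daily index names it would be expected to query for a given ES_INDEX prefix and compare them with what Kibana shows. The sketch below assumes the `<prefix>-span-YYYY-MM-DD` naming used with recent Elasticsearch versions; the helper name and both separator defaults are assumptions for illustration, so adjust them if the indices in your cluster look different.

```python
from datetime import date, timedelta


def daily_span_indices(index_prefix, end_day, days,
                       type_separator="-", date_separator="-"):
    """List expected daily span index names, newest first.

    The naming scheme is an assumption; confirm it against the
    index names actually present in the cluster.
    """
    names = []
    for offset in range(days):
        day = end_day - timedelta(days=offset)
        stamp = day.strftime(f"%Y{date_separator}%m{date_separator}%d")
        names.append(f"{index_prefix}{type_separator}span-{stamp}")
    return names


if __name__ == "__main__":
    # "zipkin" stands in for whatever ES_INDEX is set to in the deployment.
    for name in daily_span_indices("zipkin", date(2022, 12, 6), 3):
        print(name)
```

If the names produced for the old ES_INDEX value match indices that exist and hold data, the prefix itself is probably not what breaks after a redeploy.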

jeff-lemos avatar Dec 05 '22 22:12 jeff-lemos

Complete error log

  • Index size: 380GB
  • 5 shards, 5 replicas (1 per primary)

2022-12-06 20:27:02.117  WARN [6290136d5c370272/6290136d5c370272] 1 --- [orker-epoll-2-4] z.s.i.BodyIsExceptionMessage             : Unexpected error handling request.

com.linecorp.armeria.client.ResponseTimeoutException: null
        at com.linecorp.armeria.client.ResponseTimeoutException.get(ResponseTimeoutException.java:38) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:469) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$init0$2(CancellationScheduler.java:123) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
        at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:394) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at java.lang.Thread.run(Unknown Source) [?:?]

2022-12-06 20:27:02.695  WARN [/] 1 --- [orker-epoll-2-2] z.s.i.BodyIsExceptionMessage             : Unexpected error handling request.

com.linecorp.armeria.client.ResponseTimeoutException: null
        at com.linecorp.armeria.client.ResponseTimeoutException.get(ResponseTimeoutException.java:38) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:469) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$init0$2(CancellationScheduler.java:123) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
        at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:394) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at java.lang.Thread.run(Unknown Source) [?:?]

2022-12-06 20:27:12.939  WARN [/] 1 --- [orker-epoll-2-1] z.s.i.BodyIsExceptionMessage             : Unexpected error handling request.

com.linecorp.armeria.client.ResponseTimeoutException: null
        at com.linecorp.armeria.client.ResponseTimeoutException.get(ResponseTimeoutException.java:38) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:469) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$init0$2(CancellationScheduler.java:123) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
        at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:394) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at java.lang.Thread.run(Unknown Source) [?:?]
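The logs in this thread contain two different Armeria exceptions: `com.linecorp.armeria.server.RequestTimeoutException` in the first report and `com.linecorp.armeria.client.ResponseTimeoutException` in the complete log. The first is raised on the server side when handling an incoming request exceeds its timeout; the second is raised on the client side when a backend does not respond in time, which here is most likely the Elasticsearch call since that is the configured storage. A rough sketch for counting each type in a saved log follows; the regular expression and the wording of the interpretations are assumptions for illustration, not part of Zipkin.

```python
import re
from collections import Counter

# Matches the first line of a stack trace, e.g.
# "com.linecorp.armeria.server.RequestTimeoutException: null"
EXCEPTION_LINE = re.compile(r"^(?P<cls>[\w.$]+(?:Exception|Error))(?::|$)")

# Interpretation of the two exception types seen in this issue.
MEANING = {
    "com.linecorp.armeria.server.RequestTimeoutException":
        "server side: handling the incoming API request took too long",
    "com.linecorp.armeria.client.ResponseTimeoutException":
        "client side: a backend (likely Elasticsearch) did not respond in time",
}


def count_exceptions(log_text):
    """Count stack-trace header lines by exception class name."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EXCEPTION_LINE.match(line.strip())
        if match:
            counts[match.group("cls")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_timeouts.py zipkin.log
    with open(sys.argv[1], encoding="utf-8") as handle:
        for cls, n in count_exceptions(handle.read()).most_common():
            print(n, cls, "->", MEANING.get(cls, "not classified"))
```

A log dominated by the client-side exception points toward slow Elasticsearch responses rather than a problem inside the Zipkin UI or API layer.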


jeff-lemos avatar Dec 06 '22 20:12 jeff-lemos

It can't even find the service names. Connectivity is fine: I've removed all network policies, and I can find the data in the index through Kibana.

image

jeff-lemos avatar Dec 06 '22 20:12 jeff-lemos

I am having the same issue. Are there any updates on this?

taragurung avatar Jun 04 '24 17:06 taragurung