zipkin
UI Search stops working
Describe the Bug
I'm running Zipkin in a Kubernetes deployment and storing traces in Elasticsearch. It mostly works, but sometimes it misbehaves, for example:
- Today, running a few basic searches to test query performance seemed to make the application stop working. I did nothing else, only searches, but it stopped working.
- If I redeploy my deployment, it loses the "link" between Zipkin and Elasticsearch, so I have to change the ES_INDEX variable to make it work again, but that creates a new index from scratch.
Response error and UI error
The query
curl -X GET "http://zipkin.url/zipkin/api/v2/traces?serviceName=SERVICE_NAME&spanName=%2Fv6%2Fblock-account%2F%3Cstring%3Auser_ting%3E%2Fconfirm-block&limit=10" -H "accept: application/json"
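To make the percent-encoded `spanName` in that request easier to read, here is a minimal sketch that rebuilds the same URL with Python's standard library and decodes it back. The helper name `build_traces_url` is hypothetical; the host, path, and parameter values are copied from the query above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Host and path copied from the curl command in this report.
BASE_URL = "http://zipkin.url/zipkin/api/v2/traces"


def build_traces_url(service_name: str, span_name: str, limit: int = 10) -> str:
    """Hypothetical helper: build the trace search URL with encoded parameters."""
    params = {"serviceName": service_name, "spanName": span_name, "limit": limit}
    # urlencode percent-encodes reserved characters such as "/", "<", ":" and ">".
    return f"{BASE_URL}?{urlencode(params)}"


url = build_traces_url(
    "SERVICE_NAME", "/v6/block-account/<string:user_ting>/confirm-block"
)
decoded = parse_qs(urlsplit(url).query)

print(url)
print(decoded["spanName"][0])
```

The decoded span name is the route template `/v6/block-account/<string:user_ting>/confirm-block`, so the search filters on both a service name and one specific span name.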
The log error
It's the same in every case when it stops working:
2022-12-05 22:19:17.942 WARN [/] 1 --- [orker-epoll-2-3] z.s.i.BodyIsExceptionMessage : Unexpected error handling request.
com.linecorp.armeria.server.RequestTimeoutException: null
at com.linecorp.armeria.server.RequestTimeoutException.get(RequestTimeoutException.java:36) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:467) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$setTimeoutNanosFromNow0$13(CancellationScheduler.java:293) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:391) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at java.lang.Thread.run(Unknown Source) [?:?]
Steps to Reproduce
Deploy Zipkin through Kubernetes, get it working, and then redeploy only the Zipkin part. It won't be able to search the same index; you'll have to set the ES_INDEX variable to a completely different index name to get it working again.
Expected Behaviour
Sometimes we have to redeploy the Zipkin part, for example to increase heap memory. If I do, Zipkin stops working. I should be able to delete the deployment, recreate it, and have everything work properly, unless I change the ES_INDEX variable on purpose.
Complete error log
- Index size: 380GB
- 5 shards, 5 replicas (1 per primary)
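
For a sense of scale, the figures above can be turned into a rough per-shard size. This is only a sketch: the function name `per_shard_gb` is hypothetical, and the figures do not say whether 380GB includes the replica copies, so both readings are computed.

```python
def per_shard_gb(
    index_size_gb: float,
    primaries: int,
    replicas_per_primary: int,
    size_includes_replicas: bool,
) -> float:
    """Hypothetical helper: average size of one shard copy, in GB."""
    if size_includes_replicas:
        # Every primary has its own replica copies, and all are counted in the total.
        copies = primaries * (1 + replicas_per_primary)
    else:
        # The total is assumed to cover primary shards only.
        copies = primaries
    return index_size_gb / copies


# Figures from this report: 380GB, 5 primary shards, 1 replica per primary.
primaries_only = per_shard_gb(380, 5, 1, size_includes_replicas=False)
with_replicas = per_shard_gb(380, 5, 1, size_includes_replicas=True)

print(f"If 380GB is primaries only: {primaries_only} GB per shard")
print(f"If 380GB includes replicas: {with_replicas} GB per shard copy")
```

Under either assumption, each search has to touch shards in the tens of gigabytes, which may be relevant to the timeouts logged below.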
2022-12-06 20:27:02.117 WARN [6290136d5c370272/6290136d5c370272] 1 --- [orker-epoll-2-4] z.s.i.BodyIsExceptionMessage : Unexpected error handling request.
com.linecorp.armeria.client.ResponseTimeoutException: null
at com.linecorp.armeria.client.ResponseTimeoutException.get(ResponseTimeoutException.java:38) ~[armeria-1.17.2.jar:?]
com.linecorp.armeria.client.ResponseTimeoutException: null
at com.linecorp.armeria.client.ResponseTimeoutException.get(ResponseTimeoutException.java:38) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:469) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$init0$2(CancellationScheduler.java:123) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:394) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at java.lang.Thread.run(Unknown Source) [?:?]
at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:469) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$init0$2(CancellationScheduler.java:123) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:394) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at java.lang.Thread.run(Unknown Source) [?:?]
2022-12-06 20:27:02.695 WARN [/] 1 --- [orker-epoll-2-2] z.s.i.BodyIsExceptionMessage : Unexpected error handling request.
com.linecorp.armeria.client.ResponseTimeoutException: null
at com.linecorp.armeria.client.ResponseTimeoutException.get(ResponseTimeoutException.java:38) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:469) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$init0$2(CancellationScheduler.java:123) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:394) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at java.lang.Thread.run(Unknown Source) [?:?]
2022-12-06 20:27:12.939 WARN [/] 1 --- [orker-epoll-2-1] z.s.i.BodyIsExceptionMessage : Unexpected error handling request.
com.linecorp.armeria.client.ResponseTimeoutException: null
at com.linecorp.armeria.client.ResponseTimeoutException.get(ResponseTimeoutException.java:38) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:469) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$init0$2(CancellationScheduler.java:123) ~[armeria-1.17.2.jar:?]
at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:394) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
at java.lang.Thread.run(Unknown Source) [?:?]```
It can't even list the service names. Connectivity is not the problem: I've removed all network policies, and I can query the index through Kibana.
I am having the same issue. Are there any updates on this?