onyx-dashboard
Dashboard enters crash loop
Hello,
I'm running an Onyx cluster in Kubernetes, along with an onyx-dashboard that runs as a Deployment using the official Docker image. Occasionally, when performing deployments, the dashboard enters a crash loop (see the log below) and I'm forced to delete the pod. Things then correct themselves until the next occurrence, which happens several times per day.
I don't understand why this happens: my cluster is running, and there are always peers, even when no job is currently scheduled or running. Any help on the subject would be much appreciated.
18-05-18 08:57:15 onyx-dashboard-7cdf967649-pcqbx WARN [onyx.log.zookeeper:243] - Log parameters have yet to be written to ZooKeeper by a peer. Backing off 500ms and trying again...
18-05-18 08:57:16 onyx-dashboard-7cdf967649-pcqbx WARN [onyx.log.zookeeper:242] -
java.lang.Thread.run Thread.java: 748
java.util.concurrent.ThreadPoolExecutor$Worker.run ThreadPoolExecutor.java: 624
java.util.concurrent.ThreadPoolExecutor.runWorker ThreadPoolExecutor.java: 1149
...
clojure.core.async/thread-call/fn async.clj: 434
onyx.log.zookeeper/fn/fn zookeeper.clj: 319
onyx.log.zookeeper/find-log-parameters zookeeper.clj: 239
onyx.log.zookeeper/find-log-parameters/fn zookeeper.clj: 240
...
onyx.log.zookeeper/fn zookeeper.clj: 722
onyx.log.zookeeper/fn zookeeper.clj: 724
onyx.monitoring.measurements/measure-latency measurements.clj: 11
onyx.log.zookeeper/fn/fn zookeeper.clj: 725
onyx.log.zookeeper/clean-up-broken-connections zookeeper.clj: 101
onyx.log.zookeeper/fn/fn/fn zookeeper.clj: 727
onyx.log.zookeeper/read-log-parameters zookeeper.clj: 720
onyx.log.curator/data curator.clj: 128
org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath GetDataBuilderImpl.java: 138
org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath GetDataBuilderImpl.java: 142
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath GetDataBuilderImpl.java: 279
org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground GetDataBuilderImpl.java: 287
org.apache.curator.RetryLoop.callWithRetry RetryLoop.java: 107
org.apache.curator.framework.imps.GetDataBuilderImpl$4.call GetDataBuilderImpl.java: 291
org.apache.curator.framework.imps.GetDataBuilderImpl$4.call GetDataBuilderImpl.java: 302
org.apache.zookeeper.ZooKeeper.getData ZooKeeper.java: 1212
org.apache.zookeeper.KeeperException.create KeeperException.java: 51
org.apache.zookeeper.KeeperException.create KeeperException.java: 111
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /onyx/diligo-dev/log-parameters/log-parameters
code: -101
path: "/onyx/diligo-dev/log-parameters/log-parameters"
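As far as I can tell from the log, the dashboard polls for the log-parameters node and backs off 500ms each time ZooKeeper reports NoNode. A minimal Python sketch of that polling pattern, purely for illustration: the names `read_with_backoff`, `read_node` and `NoNodeError` are hypothetical and this is not Onyx's actual code (which is Clojure).

```python
import time


class NoNodeError(Exception):
    """Hypothetical stand-in for ZooKeeper's NoNode error (code -101)."""


def read_with_backoff(read_node, path, backoff_s=0.5, max_attempts=None,
                      sleep=time.sleep):
    """Call read_node(path) until it succeeds, sleeping backoff_s between tries.

    read_node is any callable that raises NoNodeError while the node is
    missing. With max_attempts=None the loop retries forever; otherwise
    the last NoNodeError is re-raised once the limit is reached.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return read_node(path)
        except NoNodeError:
            if max_attempts is not None and attempt >= max_attempts:
                raise
            # Node not written yet: back off and try again.
            sleep(backoff_s)
```

With `max_attempts` set, a reader like this gives up instead of waiting forever, which is one way such a loop can end in an error if no peer writes the node in time. Whether that matches what the dashboard does here is an assumption on my part.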
@neuromantik33 Which Docker image are you using:
onyxplatform/onyx-dashboard:latest (outdated) or onyx/onyx-dashboard:latest?
The problem disappeared after migrating to the latest version, 0.13.0.1. I have some additional problems, but I will file a new issue for those.