JanusGraph image stops responding after query timeout
I'm using the default Docker image janusgraph/janusgraph:latest (Berkeley and Lucene) and connecting with the Gremlin Console.
When a query on the JanusGraph server exceeds its 'evaluationTimeout', the server stops responding.
Server error:
java.util.concurrent.TimeoutException: Evaluation exceeded the configured 'evaluationTimeout' threshold of 30000 ms or evaluation was otherwise cancelled directly for request [g.V()]
at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$1(GremlinExecutor.java:316)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.lang.Thread.run(Thread.java:748)
1318978 [pool-6-thread-1] WARN org.janusgraph.diskstorage.log.kcvs.KCVSLog - Could not read messages for timestamp [2020-05-24T10:12:30.449Z] (this read will be retried)
org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:56)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:158)
at org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller.run(KCVSLog.java:725)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.janusgraph.diskstorage.PermanentBackendException: Could not start BerkeleyJE transaction
at org.janusgraph.diskstorage.berkeleyje.BerkeleyJEStoreManager.beginTransaction(BerkeleyJEStoreManager.java:163)
at org.janusgraph.diskstorage.berkeleyje.BerkeleyJEStoreManager.beginTransaction(BerkeleyJEStoreManager.java:47)
at org.janusgraph.diskstorage.keycolumnvalue.keyvalue.OrderedKeyValueStoreManagerAdapter.beginTransaction(OrderedKeyValueStoreManagerAdapter.java:68)
at org.janusgraph.diskstorage.log.kcvs.KCVSLog.openTx(KCVSLog.java:319)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:145)
at org.janusgraph.diskstorage.util.BackendOperation$1.call(BackendOperation.java:161)
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:68)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
... 9 more
Caused by: com.sleepycat.je.ThreadInterruptedException: (JE 18.3.12) Environment must be closed, caused by: com.sleepycat.je.ThreadInterruptedException: Environment invalid because of previous exception: (JE 18.3.12) /var/lib/janusgraph/data java.lang.InterruptedException THREAD_INTERRUPTED: InterruptedException may cause incorrect internal state, unable to continue. Environment is invalid and must be closed.
at com.sleepycat.je.ThreadInterruptedException.wrapSelf(ThreadInterruptedException.java:105)
at com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1835)
at com.sleepycat.je.dbi.EnvironmentImpl.checkOpen(EnvironmentImpl.java:1844)
at com.sleepycat.je.Environment.checkOpen(Environment.java:2697)
at com.sleepycat.je.Environment.beginTransactionInternal(Environment.java:1409)
at com.sleepycat.je.Environment.beginTransaction(Environment.java:1383)
at org.janusgraph.diskstorage.berkeleyje.BerkeleyJEStoreManager.beginTransaction(BerkeleyJEStoreManager.java:146)
... 16 more
Caused by: com.sleepycat.je.ThreadInterruptedException: Environment invalid because of previous exception: (JE 18.3.12) /var/lib/janusgraph/data java.lang.InterruptedException THREAD_INTERRUPTED: InterruptedException may cause incorrect internal state, unable to continue. Environment is invalid and must be closed.
at com.sleepycat.je.latch.LatchImpl.acquireExclusive(LatchImpl.java:67)
at com.sleepycat.je.tree.IN.latch(IN.java:547)
at com.sleepycat.je.dbi.CursorImpl.latchBIN(CursorImpl.java:402)
at com.sleepycat.je.dbi.CursorImpl.cloneCursor(CursorImpl.java:230)
at com.sleepycat.je.Cursor.beginMoveCursor(Cursor.java:5252)
at com.sleepycat.je.Cursor.beginMoveCursor(Cursor.java:5259)
at com.sleepycat.je.Cursor.retrieveNextNoDups(Cursor.java:3550)
at com.sleepycat.je.Cursor.retrieveNext(Cursor.java:3312)
at com.sleepycat.je.Cursor.getInternal(Cursor.java:1313)
at com.sleepycat.je.Cursor.get(Cursor.java:1244)
at com.sleepycat.je.Cursor.getNext(Cursor.java:1512)
After the query is sent to the server and the timeout is exceeded, other queries that worked before get the same response:
Evaluation exceeded the configured 'evaluationTimeout' threshold of 30000 ms or evaluation was otherwise cancelled directly for request [g.V().limit(4).valueMap()]: null - try increasing the timeout with the :remote command
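The hint at the end of that message refers to the Gremlin Console's `:remote` command. A minimal sketch of what that looks like, assuming the stock `conf/remote.yaml` that ships with the console (your path may differ) and an illustrative value of 120000 ms. Note that this only changes how long the console waits; the 30000 ms threshold in the message is the server's own `evaluationTimeout` setting in the Gremlin Server YAML, which has to be raised separately. Neither change recovers a server that is already in the failed state shown above; at best they make the timeout less likely to be hit.

```groovy
// Gremlin Console session (sketch, not a fix for the underlying bug).
// conf/remote.yaml is an assumption: use whatever remote config you already connect with.
:remote connect tinkerpop.server conf/remote.yaml

// Console-side wait time in milliseconds; 120000 is an illustrative value.
:remote config timeout 120000

// Send subsequent lines to the server.
:remote console

// Prefer bounded traversals over a full scan such as g.V(),
// so the evaluation is less likely to reach the server-side threshold.
g.V().limit(4).valueMap()
```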
I've found the same behaviour in 0.5.3 when submitting scripts both from the console and from a connection. Once the server hits a timeout, it stops answering and reports a timeout for every subsequent request.
I can confirm that it happens with Berkeley + ES, versions 0.5.2 and 0.5.3 (when using Cassandra + ES in those versions, this doesn't happen).
We've encountered the same issue using the full release version 0.5.3 with the Cassandra + ES backend, connecting through a JavaScript Driver, a Python Driver, and a Gremlin.sh Groovy console.
I faced the same issue on 0.6 (latest) + Cassandra + ES.
I have no idea why. Is there any update on how to overcome it? I had to remove all my data and restart the engine to get it working again. Does that mean the data is corrupted?
@mohamad-haddad-tribo The exception above clearly comes from Berkeley. I'm sure you didn't get a Berkeley exception in a Cassandra setup.
I'm also running into the same issue with Cassandra during data ingestion using concurrent inserts.
I am using JanusGraph 0.6.0 and I can confirm this is still an issue with BerkeleyDB. Once this error occurs, the server cannot recover from it. (P.S.: I know 0.6.1 has been released, but I was encountering issues with it, so I'm sticking with 0.6.0.)
Same for us on the in-memory backend :(
The same issue with the Cassandra backend in 1.0.0