convex
Calling persistPeerData in Server close causes out of memory error
On slower peers, calling the Server method persistPeerData during shutdown causes errors, because the storage close method is also being called by the shutdown hook. This leaves the etch database unreadable.
^CException in thread "Thread-1" java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.jar.Manifest$FastInputStream.
It is a bit strange that the error occurs in the logging code. I will adjust the logging to see if we can get better messages.
The new log is as follows:
^C09:10:58.585 [Thread-1] WARN convex.peer.Server - Failed to persist peer state when closing server: Java heap space
java.lang.OutOfMemoryError: Java heap space
at convex.core.data.ACell.createEncoding(ACell.java:172)
at convex.core.data.ACell.getEncoding(ACell.java:119)
at convex.core.data.ACell.getHash(ACell.java:69)
at convex.core.data.RefDirect.getHash(RefDirect.java:76)
at convex.core.data.Ref.encode(Ref.java:591)
at convex.core.data.VectorTree.encodeRaw(VectorTree.java:168)
at convex.core.data.VectorTree.encode(VectorTree.java:159)
at convex.core.data.ACell.createEncoding(ACell.java:177)
at convex.core.data.ACell.getEncoding(ACell.java:119)
at convex.core.data.ACell.getHash(ACell.java:69)
at convex.core.data.RefDirect.getHash(RefDirect.java:76)
at convex.core.data.Ref.encode(Ref.java:591)
at convex.core.data.VectorTree.encodeRaw(VectorTree.java:168)
at convex.core.data.VectorTree.encode(VectorTree.java:159)
at convex.core.data.ACell.createEncoding(ACell.java:177)
at convex.core.data.ACell.getEncoding(ACell.java:119)
at convex.core.data.ACell.getHash(ACell.java:69)
at convex.core.data.RefDirect.getHash(RefDirect.java:76)
at convex.core.data.Ref.encode(Ref.java:591)
at convex.core.data.VectorTree.encodeRaw(VectorTree.java:168)
at convex.core.data.VectorTree.encode(VectorTree.java:159)
at convex.core.data.Ref.encode(Ref.java:588)
at convex.core.data.VectorLeaf.encodeRaw(VectorLeaf.java:287)
at convex.core.data.VectorLeaf.encode(VectorLeaf.java:271)
at convex.core.State.encodeRaw(State.java:185)
at convex.core.State.encode(State.java:180)
at convex.core.BlockResult.encodeRaw(BlockResult.java:160)
at convex.core.BlockResult.encode(BlockResult.java:155)
at convex.core.data.ACell.createEncoding(ACell.java:177)
at convex.core.data.ACell.getEncoding(ACell.java:119)
at convex.core.data.ACell.getEncodingLength(ACell.java:246)
at convex.core.data.ACell.isEmbedded(ACell.java:280)
Exception in thread "Update Loop on port: 8181" java.lang.OutOfMemoryError: Java heap space
09:11:07.228 [NIO Server selector loop on port: 8181] INFO convex.net.NIOServer - Selector loop ended on port: 0
Interesting. I shall investigate!
How does the data length cause this problem?
When the peer crashes, etch.close is not called, so the dataLength value stored in the etch file header is never updated. The header is then out of step with the file contents, and the etch db is corrupted on the next open.
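To make the failure mode concrete, here is a minimal sketch of a file whose header records the data length only on a clean close. This is not Etch's actual file layout: the 8-byte header, `StaleHeaderDemo`, `writeSession` and `headerMatchesFile` are all hypothetical names made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

final class StaleHeaderDemo {
    // Hypothetical layout: one long at offset 0 holding the data length.
    static final int HEADER_SIZE = 8;

    /** Appends payload after the header; rewrites the header only on a clean close. */
    static void writeSession(File f, byte[] payload, boolean closeCleanly) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            if (raf.length() < HEADER_SIZE) {
                raf.seek(0);
                raf.writeLong(0L); // fresh file: header says "no data yet"
            }
            raf.seek(raf.length());
            raf.write(payload);
            if (closeCleanly) {
                // The step that is skipped when the process dies before close.
                raf.seek(0);
                raf.writeLong(raf.length() - HEADER_SIZE);
            }
        }
    }

    /** Returns true if the recorded data length agrees with the actual file size. */
    static boolean headerMatchesFile(File f) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            raf.seek(0);
            long recorded = raf.readLong();
            return recorded == raf.length() - HEADER_SIZE;
        }
    }

    /** Runs one clean session and one simulated crash; returns {afterClean, afterCrash}. */
    static boolean[] demo() {
        try {
            File f = File.createTempFile("stale-header", ".bin");
            try {
                writeSession(f, new byte[] {1, 2, 3}, true);
                boolean afterClean = headerMatchesFile(f);
                writeSession(f, new byte[] {4, 5}, false); // simulated crash: header not rewritten
                boolean afterCrash = headerMatchesFile(f);
                return new boolean[] {afterClean, afterCrash};
            } finally {
                f.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("consistent after clean close: " + r[0]);
        System.out.println("consistent after simulated crash: " + r[1]);
    }
}
```

A check like `headerMatchesFile` on open would at least detect the stale header; whether the real store can recover from it is a separate question.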
Sounds like we have two underlying problems:
- Etch.close() not getting called when it should be
- Something causing too much memory usage when encoding
Would be good to focus in on those, independent of the Etch corruption problem. I will look at this more.
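For the first of those two problems, one possible shape is a single, idempotent close path that always closes the store, even if persisting fails. This is a sketch under assumptions, not the actual Convex code: `DemoStore` and `DemoServer` are hypothetical stand-ins for the Etch-backed store and the Server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

final class OrderedShutdown {
    /** Hypothetical stand-in for the Etch-backed store. */
    static final class DemoStore implements AutoCloseable {
        private final AtomicBoolean closed = new AtomicBoolean(false);
        final List<String> events; // records what happened, for the demo only

        DemoStore(List<String> events) { this.events = events; }

        void write(String what) {
            if (closed.get()) throw new IllegalStateException("store already closed");
            events.add("write:" + what);
        }

        @Override
        public void close() {
            // compareAndSet makes a second close a no-op.
            if (closed.compareAndSet(false, true)) events.add("close");
        }
    }

    /** Hypothetical stand-in for the Server; owns the shutdown ordering. */
    static final class DemoServer implements AutoCloseable {
        private final DemoStore store;
        private final AtomicBoolean closed = new AtomicBoolean(false);

        DemoServer(DemoStore store) { this.store = store; }

        @Override
        public void close() {
            if (!closed.compareAndSet(false, true)) return; // already closing or closed
            try {
                store.write("peer-data"); // persist first, while the store is still open
            } catch (Throwable t) {
                // Throwable so that an OutOfMemoryError cannot skip the close below.
                store.events.add("persist-failed");
            } finally {
                store.close(); // always reached, so the header can be finalised
            }
        }
    }

    /** Calls close from several paths, as an explicit close plus a shutdown hook might. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        DemoStore store = new DemoStore(events);
        DemoServer server = new DemoServer(store);
        server.close(); // explicit close
        server.close(); // e.g. shutdown hook firing afterwards
        store.close();  // e.g. a separate storage hook
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [write:peer-data, close]
    }
}
```

With this shape, a single hook such as `Runtime.getRuntime().addShutdownHook(new Thread(server::close))` would be enough, and the persist and close steps could no longer race each other. It does nothing about the memory usage during encoding, which is the second problem.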
I think this is fixed for now.