ServerPeer processing MISSING_DATA: convex.api.Convex acquire method fails on first call to a peer.
I found that when I call the acquire method to get the current state, the call always fails the first time it is made to a peer. This is because the acquire method used by the client immediately starts flooding the peer with requests for the missing data items.
The peer does not handle the burst of duplicate requests well and starts to send bad data. Eventually the client either crashes, or keeps requesting data without accepting any of the data items it receives.
For now I have worked around this by increasing the timeout in the acquire method, which stops the client from re-requesting the same item too frequently, as sketched below.
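To illustrate what the longer timeout effectively gives me, here is a minimal sketch of a per-item request throttle. The names (`MissingDataThrottle`, `shouldRequest`, the 10 s timeout) are made up for illustration and are not the actual Convex client code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch: a missing hash is only re-requested once the previous
 * request for the same hash has been outstanding longer than REQUEST_TIMEOUT_MS.
 */
public class MissingDataThrottle {
	private static final long REQUEST_TIMEOUT_MS = 10_000; // assumed timeout

	// Timestamp of the last outstanding request for each missing hash (hex string)
	private final Map<String, Long> lastRequested = new ConcurrentHashMap<>();

	/** Returns true if we should (re-)request this hash now. */
	public boolean shouldRequest(String hashHex) {
		long now = System.currentTimeMillis();
		Long previous = lastRequested.get(hashHex);
		if (previous != null && (now - previous) < REQUEST_TIMEOUT_MS) {
			return false; // a request for this item is still in flight
		}
		lastRequested.put(hashHex, now);
		return true;
	}

	/** Call when data for a hash arrives, so it can be requested again later if needed. */
	public void markReceived(String hashHex) {
		lastRequested.remove(hashHex);
	}
}
```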
Here is a cleaned-up log of the requests sent and the data received by the client:
State hash: 0x1535910a6079a3890fcf0c59cd3706a113920cfddc977d2deb81857c34e68c55
```
INFO: Request missing: 0x1535910a6079a3890fcf0c59cd3706a113920cfddc977d2deb81857c34e68c55
INFO: Request missing: 0x1535910a6079a3890fcf0c59cd3706a113920cfddc977d2deb81857c34e68c55
INFO: Request missing: 0x1535910a6079a3890fcf0c59cd3706a113920cfddc977d2deb81857c34e68c55
INFO: Request missing: 0x1535910a6079a3890fcf0c59cd3706a113920cfddc977d2deb81857c34e68c55
INFO: Recieved DATA for hash 0x1535910a6079a3890fcf0c59cd3706a113920cfddc977d2deb81857c34e68c55
INFO: Still missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Request missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Still missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Request missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Still missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Recieved DATA for hash 0x1535910a6079a3890fcf0c59cd3706a113920cfddc977d2deb81857c34e68c55
INFO: Request missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Still missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Request missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Recieved DATA for hash 0x1535910a6079a3890fcf0c59cd3706a113920cfddc977d2deb81857c34e68c55
INFO: Still missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Request missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Still missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Request missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Recieved DATA for hash 0x1535910a6079a3890fcf0c59cd3706a113920cfddc977d2deb81857c34e68c55
INFO: Still missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Request missing: 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Recieved DATA for hash 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
INFO: Still missing: 0x53cdca3e590f79c1e40599d632e577a669781c99eac494f0047fd12704377201
INFO: Request missing: 0x9bbdc2efd5b93eef6b379820a5888e80f91401b18020459862b3270a0c781ba4
INFO: Request missing: 0x53cdca3e590f79c1e40599d632e577a669781c99eac494f0047fd12704377201
INFO: Request missing: 0x146c91e8944f8a602240991eaacb4bffa32508219e926080eb2b71860988d580
INFO: Request missing: 0x485356bdb1137f848093d4bc3c8ed55035d449bbf0d6750b1b9bcb5aad303727
INFO: Still missing: 0x53cdca3e590f79c1e40599d632e577a669781c99eac494f0047fd12704377201
INFO: Request missing: 0x9bbdc2efd5b93eef6b379820a5888e80f91401b18020459862b3270a0c781ba4
INFO: Request missing: 0x53cdca3e590f79c1e40599d632e577a669781c99eac494f0047fd12704377201
INFO: Request missing: 0x146c91e8944f8a602240991eaacb4bffa32508219e926080eb2b71860988d580
INFO: Request missing: 0x485356bdb1137f848093d4bc3c8ed55035d449bbf0d6750b1b9bcb5aad303727
INFO: Recieved DATA for hash 0xfcb8fe6979a842428c0f616f9b9059cdec44f1a700f64058f0535409182dcb7c
```
There should be backpressure to prevent the client from flooding the peer with too many simultaneous requests; one possible shape for it is sketched below. This probably also requires the timeout to be long enough for the full download to complete.
A few missing-data exceptions are still expected while the download is in progress.
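Purely as an illustration, and not a proposal for the real acquire implementation, the client could cap the number of in-flight missing-data requests with a semaphore. `sendMissingDataRequest` and the limit of 8 are hypothetical placeholders:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;

/**
 * Illustrative backpressure sketch: at most MAX_IN_FLIGHT missing-data
 * requests are outstanding at any time. sendMissingDataRequest stands in for
 * whatever actually sends the MISSING_DATA request and completes when the
 * data (or a timeout) comes back.
 */
public class AcquireBackpressure {
	private static final int MAX_IN_FLIGHT = 8; // assumed limit
	private final Semaphore permits = new Semaphore(MAX_IN_FLIGHT);

	public void requestMissing(List<String> missingHashes) throws InterruptedException {
		for (String hash : missingHashes) {
			permits.acquire(); // blocks when too many requests are already in flight
			sendMissingDataRequest(hash).whenComplete((data, ex) -> permits.release());
		}
	}

	// Placeholder: in reality this would send a MISSING_DATA request to the peer.
	private CompletableFuture<Object> sendMissingDataRequest(String hash) {
		return CompletableFuture.completedFuture(null);
	}
}
```

Blocking on the permit naturally slows the client down to whatever rate the peer can actually serve.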
Technically, it may be better for the Peer to acquire a Belief from the source Peer, in which case it can still reconstruct the state from previous blocks. Needs a bit more investigation.
I'm more concerned that a peer can get flooded with too many requests and then start to affect data exchange with other peers (see the rate-limiting sketch below).
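On the peer side, the kind of protection I have in mind is a small per-client budget of requests per time window, so one noisy client cannot starve the others. Again, this is just a sketch with invented names, not the existing ServerPeer code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative per-connection rate limit: each client gets a fixed number of
 * requests per window; anything beyond that is dropped or deferred.
 */
public class PerClientRateLimiter {
	private static final int MAX_REQUESTS = 50;  // assumed bucket size
	private static final long WINDOW_MS = 1_000; // assumed refill window

	private static final class Bucket {
		long windowStart = System.currentTimeMillis();
		int used = 0;
	}

	private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();

	/** Returns true if a request from this client should be processed now. */
	public synchronized boolean allow(String clientId) {
		Bucket b = buckets.computeIfAbsent(clientId, k -> new Bucket());
		long now = System.currentTimeMillis();
		if (now - b.windowStart >= WINDOW_MS) {
			b.windowStart = now;
			b.used = 0;
		}
		if (b.used >= MAX_REQUESTS) return false; // drop or defer the request
		b.used++;
		return true;
	}
}
```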
BTW, the tests I run on my desktop sometimes get randomly stuck around this part of the test run:
```
[INFO] Running convex.gui.GUITest
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.005 s - in convex.core.data.BlobsTest
[INFO] Running convex.actors.OracleTest
[INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.011 s - in convex.core.lang.ReaderTest
[INFO] Running convex.actors.TorusTest
[INFO] Running convex.actors.ActorsTest
```
~~After changing the timeout it no longer hangs.~~ It's now occurring again, so the timeout is probably not the answer to the hang-ups in testing.
I might have broken something on the GUI side in recent commits. I'm trying to refactor things to make launching independent Peers easier, so that a Peer can sync with another Peer.
To be rechecked, but this is probably fixed in more recent updates.