
Account status on Wavelet node is not working

Open asim-sha opened this issue 6 years ago • 11 comments

It looks like the account status (balance, stake, reward, etc.) is not correctly shown on the node, in both the status HTTP API and the status command on the command line. It shows 0 for everything when that's not the case, and Lens shows the correct values.

{ public_key: "405c26514497ea02badd3ac5e6e9f9ada353dfa7604ddc418901acdda562ad08", balance: 0, gas_balance: 0, stake: 0, reward: 0, nonce: 0, is_contract: false }
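The status output above maps onto a small set of account fields. As a point of reference, here is a minimal Go sketch of decoding such a payload; the JSON field names are taken from the output above, but the exact HTTP route the node serves this on is an assumption, so check your node's API documentation:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AccountStatus mirrors the fields shown in the status output above.
type AccountStatus struct {
	PublicKey  string `json:"public_key"`
	Balance    uint64 `json:"balance"`
	GasBalance uint64 `json:"gas_balance"`
	Stake      uint64 `json:"stake"`
	Reward     uint64 `json:"reward"`
	Nonce      uint64 `json:"nonce"`
	IsContract bool   `json:"is_contract"`
}

// parseAccount decodes a JSON account-status payload.
func parseAccount(raw []byte) (AccountStatus, error) {
	var a AccountStatus
	err := json.Unmarshal(raw, &a)
	return a, err
}

func main() {
	// Sample payload matching the output above (keys quoted as JSON).
	raw := []byte(`{"public_key":"405c26514497ea02badd3ac5e6e9f9ada353dfa7604ddc418901acdda562ad08","balance":0,"gas_balance":0,"stake":0,"reward":0,"nonce":0,"is_contract":false}`)
	a, err := parseAccount(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("balance=%d stake=%d reward=%d\n", a.Balance, a.Stake, a.Reward) // prints: balance=0 stake=0 reward=0
}
```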

asim-sha avatar Oct 06 '19 18:10 asim-sha

This issue usually comes up when a node that you are personally running is not properly syncing with the testnet. Are there any errors, or peers repeatedly connecting/disconnecting to/from your node?

iwasaki-kenta avatar Oct 10 '19 06:10 iwasaki-kenta

I will check it again tonight. But issue #268 is probably unrelated because that was happening on lens.perlin.net and not my local node.

asim-sha avatar Oct 10 '19 06:10 asim-sha

I've had a look again and the node does not seem to be in sync. Although it's been connected for over an hour, it has not synced. I do see this error when starting the node:

12:08PM ERR error while checking out of sync with 144.91.73.85:3000 error: "rpc error: code = Unavailable desc = transport is closing" event: sync

And I also see peers repeatedly connecting/disconnecting as you mentioned.

asim-sha avatar Oct 13 '19 11:10 asim-sha

Gotcha. The problem then is that your router is blocking nodes in the testnet from connecting to your node. After port-forwarding the port you assigned to your node, you should be able to properly sync and interact with the testnet.

iwasaki-kenta avatar Oct 13 '19 11:10 iwasaki-kenta

Opening the port has helped get the account information. But it looks like the node goes out of sync again very quickly, and the latest account status returns stale values. It eventually does pick up the changes, but it takes quite a while (15-30 mins). Is this normal?

Also I see a bunch of errors but not sure if they are related:

failed to unmarshal the received message proto: wrong wireType = 2 for field OutOfSync" event: apply 
1:28PM WRN Snowball liveness fault count: 5 last_id: true name: syncer new_id: false
1:28PM WRN Snowball liveness fault count: 0 last_id: false name: syncer new_id: true
1:28PM WRN Snowball liveness fault count: 1 last_id: true name: syncer new_id: false
1:28PM WRN Snowball liveness fault count: 0 last_id: false name: syncer new_id: true
1:28PM ERR error while checking out of sync with 176.223.142.119:3000 error: "rpc error: code = Internal desc = grpc: failed to unmarshal the received message proto: wrong wireType = 2 for field OutOfSync" event: apply
1:28PM WRN Snowball liveness fault count: 5 last_id: true name: syncer new_id: false
1:28PM WRN Snowball liveness fault count: 0 last_id: false name: syncer new_id: true
1:28PM WRN Snowball liveness fault count: 0 last_id: true name: syncer new_id: false
1:28PM WRN Snowball liveness fault count: 0 last_id: false name: syncer new_id: true
1:28PM WRN Snowball liveness fault count: 6 last_id: true name: syncer new_id: false

asim-sha avatar Oct 13 '19 12:10 asim-sha

I can also confirm that #268 is an independent issue, as it's happening regardless of where the transaction was initiated. It's probably a more serious issue than this one, as transactions should not be disappearing; it seems like a very critical bug.

I've added more information to #268

asim-sha avatar Oct 13 '19 13:10 asim-sha

The "Snowball liveness fault" messages are normal warning-level events that can occur.

The "error while checking out of sync" error likely means you are connecting to a peer running an older version of Wavelet and can safely be ignored as long as it's not too frequent.

It's not normal for the node to become out of sync, but if it does it should detect that it is out of sync with its peers and sync to the latest round and get up-to-date quickly. It will emit messages about this. Do you see any messages similar to:

Noticed that we are out of sync; downloading latest state Snapshot from our peer(s)

?

rkeene avatar Oct 24 '19 19:10 rkeene

I have just tried again (using the himitsu branch). My transactions were not reflected in the account status on the local node (transactions were stuck in the "received" state, but on testnet.perlin.net the tx was successfully "applied" and the correct account state was shown).

I also did get the message you have mentioned above:

6:31PM INF Noticed that we are out of sync; downloading latest state Snapshot from our peer(s). current_round: 2726033 event: apply
6:31PM DBG Discovered the round which the majority of our peers are currently in. event: apply latest_round: 2726309 latest_round_root: bada114bb7801cae79fc1cdd45eefad3a7ca729dd5caf58087605f995b549630
6:31PM DBG Starting up workers to downloaded all chunks of data needed to sync to the latest round... event: apply num_chunks: 37 num_workers: 16
6:31PM DBG Downloaded whatever chunks were available to sync to the latest round, and shutted down all workers. Checking validity of chunks... event: apply num_chunks: 37 num_workers: 16
6:31PM INF All chunks have been successfully verified and re-assembled into a diff. Applying diff... event: apply num_chunks: 37 target_round: 2726309
6:31PM ERR Failed to apply re-assembled diff to our ledger state. Restarting sync... error: "invalid difference: avl: could not find node a85018ca372e6414ee90fabfccf3ea2d" event: apply target_round: 2726309
6:31PM DBG Discovered the round which the majority of our peers are currently in. event: apply latest_round: 2726309 latest_round_root: bada114bb7801cae79fc1cdd45eefad3a7ca729dd5caf58087605f995b549630
6:31PM DBG Starting up workers to downloaded all chunks of data needed to sync to the latest round... event: apply num_chunks: 37 num_workers: 16
6:31PM DBG Downloaded whatever chunks were available to sync to the latest round, and shutted down all workers. Checking validity of chunks... event: apply num_chunks: 37 num_workers: 16
6:31PM INF All chunks have been successfully verified and re-assembled into a diff. Applying diff... event: apply num_chunks: 37 target_round: 2726309
6:31PM ERR Failed to apply re-assembled diff to our ledger state. Restarting sync... error: "invalid difference: avl: could not find node a85018ca372e6414ee90fabfccf3ea2d" event: apply target_round: 2726309

The node never resynced; I am assuming it's because of the avl error above.

asim-sha avatar Oct 26 '19 17:10 asim-sha

One more thing to note was that even when the node is in sync and there is an active stake, it says "...not taking part in consensus".

asim-sha avatar Oct 26 '19 17:10 asim-sha

The Himitsu branch is an experiment and will not talk to the Testnet meaningfully. To talk to the Testnet you should ideally use v0.2.0.

Can you try again with v0.2.0 and see if you get any of those same errors?

rkeene avatar Oct 28 '19 13:10 rkeene

I don't get those errors, but the result is the same. Transactions are applied straight away on the testnet, but on the local node they remain in the "received" state. They eventually do get "applied", but it takes quite a while (approx. 5-6 mins). Here are the logs:

10:24PM INF Finalized consensus round, and initialized a new round. event: round_end new_difficulty: 8 new_merkle_root: 4a86351024e410e84a35b7b10405eea5 new_root: 1ac00b3869f1e2f47f12961e6059020b366023359bb5e04e572c1fb587148fc4 new_round: 2731205 num_applied_tx: 67 num_ignored_tx: 0 num_rejected_tx: 0 old_difficulty: 8 old_merkle_root: 784bb63d44b674d43df6a85038a808b3 old_root: ec1881146981570065f5841be99dcb42bcc1e77b5a71e9839ea52b9029a98d05 old_round: 2731204 round_depth: 67

10:24PM INF Success! Your stake placement transaction ID: ddd2eaecc8a11fb4becbaa6380056d31dc1aa6abd53f3578579de2941b4d9aad
10:29PM INF Responded to sync chunk request. event: provide_chunk requested_hash: 81e47a19e6b29b0a65b9591762ce5143ed30d0261e5d24a3201752506b20f15c
10:31PM INF Finalized consensus round, and initialized a new round. event: round_end new_difficulty: 8 new_merkle_root: 51e003ee099251fc8110a2eecea1fb54 new_root: 402af597f6019cf43b3e129b73e1674eced21b0cccd589f2953833099f89ddab new_round: 2731206 num_applied_tx: 645 num_ignored_tx: 0 num_rejected_tx: 0 old_difficulty: 8 old_merkle_root: 4a86351024e410e84a35b7b10405eea5 old_root: 1ac00b3869f1e2f47f12961e6059020b366023359bb5e04e572c1fb587148fc4 old_round: 2731205 round_depth: 645

asim-sha avatar Oct 28 '19 22:10 asim-sha