witnet-rust icon indicating copy to clipboard operation
witnet-rust copied to clipboard

feat(nodestats): add counters to monitor failed data requests

Open drcpu-github opened this issue 3 years ago • 3 comments

This is a first push towards adding some extra counters that can be used to monitor why your node fails to solve data requests. Currently this info is printed to the standard output, but that's not too useful for the average node operator. This was discussed here at the moment of the Witnet node monitor demo on Discord.

This PR will fail the tests because it modifies NodeStats and thus the ChainState. However, I don't quite know how to create migrations for the ChainState, but I still wanted to get this out for comments. Any hints?

drcpu-github avatar Jun 08 '22 18:06 drcpu-github

Yes, migrations are hard because we are using bincode. The ChainState struct has support for adding new fields, however these fields must always be at the end of the list of fields, old fields cannot be removed or modified, and it requires a bit of code to set it up properly. See PR #2159 for an example with unspent_outputs_pool_old_migration_db.

Since the NodeStats struct is pretty small and it's likely to change (with more fields and useful stats), I think it may be worth it to change its serialization format to something that allows adding or removing fields easily. I made a branch with a proof of concept: https://github.com/tmpolaczyk/witnet-rust/tree/node-stats-json which uses JSON to store the NodeStats field, so it's JSON inside bincode. If you rebase this pull request it should just work, but not sure yet if that will be the best approach. Probably the final version will be using something else instead of JSON, or storing the NodeStats in a separate key in the database (to avoid the bincode step).

As per the content of this PR, I think it's a great addition, the more useful stats the better.

tmpolaczyk avatar Jun 09 '22 16:06 tmpolaczyk

Yes, migrations are hard because we are using bincode. The ChainState struct has support for adding new fields, however these fields must always be at the end of the list of fields, old fields cannot be removed or modified, and it requires a bit of code to set it up properly. See PR #2159 for an example with unspent_outputs_pool_old_migration_db.

Yeah, that's a pretty significant downside for something that's still in flux.

Since the NodeStats struct is pretty small and it's likely to change (with more fields and useful stats), I think it may be worth it to change its serialization format to something that allows adding or removing fields easily. I made a branch with a proof of concept: https://github.com/tmpolaczyk/witnet-rust/tree/node-stats-json which uses JSON to store the NodeStats field, so it's JSON inside bincode.

That seems sensible. Definitely more future-proof than the current implementation.

If you rebase this pull request it should just work, but not sure yet if that will be the best approach. Probably the final version will be using something else instead of JSON, or storing the NodeStats in a separate key in the database (to avoid the bincode step).

I think a separate database key is definitely the nicest solution (does that mean we can just modify it as we want, I'm not sure?). Don't quite see the advantage of having it in ChainState anyway. I can have a look, but I'm not sure if I understand that part of the node code well enough.

drcpu-github avatar Jun 09 '22 18:06 drcpu-github

So having it as part of the ChainState was easier to implement, and it ensures that the stats are always consistent with the chain state. This means that when there is a rollback, the stats also roll back (but we could implement it in a way that the stats do not roll back, or only some of them do, if desired).

Having it as a separate key allows us to use something different than bincode without needing the JSON-inside-bincode hack, but I'm not sure yet if that is a strong enough argument. Also if stored as a separate key it would be a bit complex to handle rollbacks.

Currently there is a work in progress scripting branch that is also blocked by chain state migrations, so we could include this pull request in the assessment on how to allow making changes to the ChainState more easily.

tmpolaczyk avatar Jun 13 '22 13:06 tmpolaczyk