Optional disk indexing for withdrawals, proposals, exits, etc
Description
Currently there are several queries that are impossible to answer without custom indexing of the database, like:
- How many withdrawals has validator N received?
- How many block proposals has validator N had?
- In which slot did validator N have their voluntary exit included on chain?
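To make the shape of such an index concrete, the three queries above all reduce to per-validator lookups. The following is only a minimal in-memory sketch: `ValidatorRecord`, `ValidatorIndex` and their methods are hypothetical names for illustration, not Lighthouse types, and a real implementation would persist the data on disk.

```rust
use std::collections::HashMap;

/// Hypothetical per-validator record (not a Lighthouse type).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ValidatorRecord {
    /// Slots in which a withdrawal for this validator was included.
    pub withdrawal_slots: Vec<u64>,
    /// Slots in which this validator proposed a block.
    pub proposal_slots: Vec<u64>,
    /// Slot of the block that included the voluntary exit, if any.
    pub exit_slot: Option<u64>,
}

/// Hypothetical in-memory index keyed by validator index.
#[derive(Debug, Default)]
pub struct ValidatorIndex {
    records: HashMap<u64, ValidatorRecord>,
}

impl ValidatorIndex {
    pub fn record_withdrawal(&mut self, validator: u64, slot: u64) {
        self.records.entry(validator).or_default().withdrawal_slots.push(slot);
    }

    pub fn record_proposal(&mut self, validator: u64, slot: u64) {
        self.records.entry(validator).or_default().proposal_slots.push(slot);
    }

    pub fn record_exit(&mut self, validator: u64, slot: u64) {
        let record = self.records.entry(validator).or_default();
        // Keep only the first inclusion slot that was recorded.
        if record.exit_slot.is_none() {
            record.exit_slot = Some(slot);
        }
    }

    /// "How many withdrawals has validator N received?"
    pub fn withdrawal_count(&self, validator: u64) -> usize {
        self.records.get(&validator).map_or(0, |r| r.withdrawal_slots.len())
    }

    /// "How many block proposals has validator N had?"
    pub fn proposal_count(&self, validator: u64) -> usize {
        self.records.get(&validator).map_or(0, |r| r.proposal_slots.len())
    }

    /// "In which slot did validator N have their voluntary exit included?"
    pub fn exit_slot(&self, validator: u64) -> Option<u64> {
        self.records.get(&validator).and_then(|r| r.exit_slot)
    }
}
```

Storing slots rather than bare counters costs more space, but it lets the same record answer "when" as well as "how many".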
There are a few ways we could improve this:
- Add this functionality to beacon.watch and promote running beacon.watch for these sorts of use-cases.
- Index this data by default when running an archive node. With tree-states this shouldn't consume too much space.
- Optionally index this data when certain flags are provided, e.g. --index=withdrawals,exits.
- Implement a "plug-in" system that allows users to define their own "tables" derived from the block/state data. This could be quite cool, and all the above queries could be implemented in terms of this plug-in system. The hard part would be making the plugins dynamically load-able (we can't load Rust code at runtime). We might have to use a scripting language or lightweight VM (WASM? eBPF?). It may not be worth the complexity.
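For the flag-based option, the value of a flag like --index=withdrawals,exits could be parsed into a set of enabled indexes. This is a sketch under assumptions: the flag does not exist in Lighthouse today, and `IndexKind` and `parse_index_kinds` are hypothetical names chosen for illustration.

```rust
use std::collections::BTreeSet;
use std::str::FromStr;

/// Hypothetical set of optional indexes, taken from the issue title.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum IndexKind {
    Withdrawals,
    Proposals,
    Exits,
}

impl FromStr for IndexKind {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            "withdrawals" => Ok(IndexKind::Withdrawals),
            "proposals" => Ok(IndexKind::Proposals),
            "exits" => Ok(IndexKind::Exits),
            other => Err(format!("unknown index kind: {}", other)),
        }
    }
}

/// Parse the comma-separated value of the (hypothetical) --index flag.
/// Empty segments are ignored; any unknown name is an error.
pub fn parse_index_kinds(value: &str) -> Result<BTreeSet<IndexKind>, String> {
    value
        .split(',')
        .filter(|part| !part.trim().is_empty())
        .map(|part| part.parse::<IndexKind>())
        .collect()
}
```

Rejecting unknown names up front, rather than silently skipping them, avoids a node running for weeks without an index the operator believed was enabled.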
It's also possible we could index data just for a few user-specified indices, which would be more space-efficient but also more complicated.
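Restricting the index to a few validators could look roughly like the filter below. The names `ValidatorFilter` and `select_events` are hypothetical, and events are simplified to `(validator_index, slot)` pairs for the sake of the sketch.

```rust
use std::collections::HashSet;

/// Hypothetical filter deciding which validators get indexed.
pub enum ValidatorFilter {
    /// Index every validator.
    All,
    /// Index only the user-specified validator indices.
    Only(HashSet<u64>),
}

impl ValidatorFilter {
    pub fn should_index(&self, validator: u64) -> bool {
        match self {
            ValidatorFilter::All => true,
            ValidatorFilter::Only(set) => set.contains(&validator),
        }
    }
}

/// Keep only the `(validator_index, slot)` events selected by the filter.
pub fn select_events(filter: &ValidatorFilter, events: &[(u64, u64)]) -> Vec<(u64, u64)> {
    events
        .iter()
        .copied()
        .filter(|(validator, _)| filter.should_index(*validator))
        .collect()
}
```

One likely source of the extra complexity is that changing the set of validators later would require re-scanning historical blocks to backfill the newly added ones.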
Version
Lighthouse v5.1.0
I believe the statement "we can't load Rust code at runtime" is false. It certainly isn't trivial, but I believe that using FFI or channels or ... it's feasible. Please don't use a "scripting language or VM"!
@winksaville You're right, we could do plugins in Rust (I found https://adventures.michaelfbryan.com/posts/plugins-in-rust/). I think it may be simpler to avoid that complexity for now though, and just implement a few indexes that users are likely to want. If we do the implementation with a view to allow more plugins in future, that could be good
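One way to keep the door open for plugins while shipping only built-in indexes is to put every index behind a common trait that is compiled in statically. This is only a sketch: `BlockSummary`, `BlockIndexer`, `ProposalCounter`, `WithdrawalCounter` and `run_indexers` are hypothetical names, and `BlockSummary` is a heavily simplified stand-in for real block data.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of a block (not a Lighthouse type).
pub struct BlockSummary {
    pub slot: u64,
    pub proposer_index: u64,
    /// Validator indices that received a withdrawal in this block.
    pub withdrawal_validators: Vec<u64>,
}

/// Hypothetical interface shared by all indexes, built-in or future plugin.
pub trait BlockIndexer {
    fn name(&self) -> &'static str;
    fn on_block(&mut self, block: &BlockSummary);
}

/// Counts block proposals per validator.
#[derive(Default)]
pub struct ProposalCounter {
    pub counts: HashMap<u64, u64>,
}

impl BlockIndexer for ProposalCounter {
    fn name(&self) -> &'static str {
        "proposals"
    }

    fn on_block(&mut self, block: &BlockSummary) {
        *self.counts.entry(block.proposer_index).or_insert(0) += 1;
    }
}

/// Counts withdrawals per validator.
#[derive(Default)]
pub struct WithdrawalCounter {
    pub counts: HashMap<u64, u64>,
}

impl BlockIndexer for WithdrawalCounter {
    fn name(&self) -> &'static str {
        "withdrawals"
    }

    fn on_block(&mut self, block: &BlockSummary) {
        for validator in &block.withdrawal_validators {
            *self.counts.entry(*validator).or_insert(0) += 1;
        }
    }
}

/// Feed every block to every registered indexer, in order.
pub fn run_indexers(indexers: &mut [&mut dyn BlockIndexer], blocks: &[BlockSummary]) {
    for block in blocks {
        for indexer in indexers.iter_mut() {
            indexer.on_block(block);
        }
    }
}
```

If dynamic loading is added later, a loaded plugin would only need to provide something that satisfies the same trait, so the built-in indexes double as a test of the interface.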
@michaelsproul good plan, use the initial implementations as design/implementation exercises to find the commonality and let the system evolve as knowledge is gained.
Ask Peter how happy he is about maintaining geth scripting capabilities
If this is implemented in a standardized way across all beacon nodes you can't imagine how happy this would make all of us data consumers!
It would make it possible to run your own node and get some really important data which unfortunately right now is retrievable only by indexers.
Agree, tho maybe after electra? 😅
https://github.com/ethereum/EIPs/commit/810c347a48052ab36a53a6aa684737ce386f6093