ethereumjs-monorepo
ethereumjs-monorepo copied to clipboard
ETH/LES Protocols - Simplify all the things
I've been complaining about this for long enough so it's time to put down the issue I see with our network stack and see if we can agree on a best way forward.
Our current Devp2p Protocol stack is a complete mess with layers of inheritance and bound functions and classes that obscure what's actually happening and completely breaks intellisense and the ability to jump to type/function definition. As a starting point, what I'm proposing is that we move all of the ETH/LES protocol handlers from the client to devp2p. The biggest issue here is something like below:
The message handlers and encoding/decoding for each protocol (ETH/LES/SNAP) is handled inside the client in a Protocol method that extends BoundProtocol which is connected to the underlying devp2p send/receive functionality via bound methods. Because the top level message handlers inside the client's ETH protocol class (which extends a BoundProtocol) which is has a devp2p RLPx network class embedded inside it with the RLPx send method then bound back up to the BoundProtocol, there is no straight-forward way to go step through the traces of the call stack for debugging purposes without much agony trying to track the message flow.
So, what I'm envisioning is something like completely doing away with the Protocol/BoundProtocolparent classes and then moving all of the logic for ETH/LES protocol message encoding/decoding/sending into devp2p. I'm not sure I can put pseudocode down at the moment to describe it but I'd love to have the client explicitly instantiate an ETH Protocol, an LES Protocol, and a SNAP protocol (or whichever subset it is running) where all the Protocol classes are housed in devp2p. So, when the client needs to announce new transactions, it just calls devp2p.eth.newPooledTxHashes and passes in the list of tx hashes (or Eth68 equivalent) to send and then devp2p takes care of serializing and sending the message.
The main things I want to get rid of are instances where we do are binding essentially unknown functions to timers which then makes it nigh impossible to figure out what's going on later on (basically this sort of thing in BoundProtocol)
return new Promise((resolve, reject) => {
resolver.timeout = setTimeout(() => {
resolver.timeout = null
this.resolvers.delete(message.response!)
reject(new Error(`Request timed out after ${this.timeout}ms`))
}, this.timeout)
resolver.resolve = resolve
resolver.reject = reject
})
Ideally this gets replaced with explicitly names message handlers in a new one size fits all EthProtocol class inside of devp2p.
return new Promise(resolve, reject) => {
pooledMessages = setTimeout(() => this.newPooledTxnHashes(resolve, ...otherParams), this.timeout)
reject(new Error('request timed out)'))
})
Any thoughts on this? I'm sure I'm missing some of the complexity of why things were done the way they currently are but I'm game to take a deep dive on this and try and reason it out.
Will give this more detailed insights later, but another thing we should take care of is that we now only support sequentially sending requests (so devp2p msgs with a response, such as GetBlockBodies / GetPooledTransations). We should parallelize this and ensure any request is sent immediately, then if we get a response match this to the request, if we dont get any response within a timeout revert the request Promise.
See -> #2940
Great great initiative!!! 🙏 🎉 I think I can already go along with 80% of the analysis + proposed solutions, and the remaining 20% are just things where I am just not deep enough into it and wouldn't dare an opinion yet.
Yes, this protocol binding - so for now in the client - was introduced by Vinay who wrote the initial client code and I think is just an example for "too much abstraction". So I think the intention was very good, so to prepare that this - theoretically - would work with every kind of protocol and could be - whoosh, whoosh - exchanged on the fly. I think reality and practical experience has very very much proven though over the years though that there are "just not so much protocols to bind" 😆 and even if there are 1-2 other protocols coming in (in a very long-term timeframe) it will likely still be a lot easier to handle on a one-by-one case or by inheritance or whatever than just to have this "plug everything in" solution. This has very much proven that the downsides are just too strong to keep and it would bring dramatical structural improvements to simplify here.
One thing: in case this is breaking timing would be a bit unlucky though right now. I wonder if this is for sure breaking though or if we find a way to introduce in a non-breaking way.
So:
-
Everything we do/change in the client itself is not problematic anyhow. So there we can shift things around, remove wrapper classes,.... Maybe that's e.g. already enough to eliminate this binding procedure?
-
For moving stronger responsibility/extend the realm of duties for the
devp2pclass and also do the object serialization/deserialization one might think that it might also be possible to add this as just an additional outer communication layer to the library which then talks to the - unchanged - current package API? 🤔 Just some thought and not sure if realistic.
I generally think that this whole task can likely be separated into 3-5 independent sub-tasks (like the un-binding stuff) and it's likely worth to give this some thought where the lines are there.
And an updated thought while writing this down: I think actually that devp2p is one of the (few) libraries where it is not problematic to do an in-between breaking release. We only have a direct dependency to the client library and we do not create any in-between update necessities if we update this particular library. And overall usage of the library is also low, so also not problematic from this front.
Post-submit changes, please read on site.
I've done a fair amount of experimental coding around this and it's unfortunately going to be trickier than I originally thought because of just how tightly intertwined each instance of the protocol is to the peer it's connected to. I have a vision of how it "should" work, but haven't figured out how to dive into it without getting lost in all the connection points. My end goal is going ot be something like a singleon instance of each protocol instantiated by the EthereumService (ETH/LES/SNAP as needed) and then whenever the client needs to send a protocol message, it looks something like:
res = await service.ethProtocol.sendGetBlockBodies(startBlock: number, endBlock: number) => {
return service.getRandomPeer.rlpx.request('getBlockBodies', startBlock, endBlock)
}