nimbus-eth2 Avoid double decompression of gossip messages

Gossip messages are decompressed twice in the current flow:

During message id generation: https://github.com/status-im/nimbus-eth2/blob/74f2350a2c85b60230e2d181ef34fe1831a5ba1a/beacon_chain/networking/eth2_network.nim#L1846

Once during validation: https://github.com/status-im/nimbus-eth2/blob/74f2350a2c85b60230e2d181ef34fe1831a5ba1a/beacon_chain/networking/eth2_network.nim#L1952

Once would be enough, really - messages that fail to decompress are rejected - they really don't need a message ID because they end up rejected anyway. Rejecting and not storing their id in the seen cache also avoids wasting memory on junk.

cc @Menduist @dryajov

Nov 01 '21 12:11 arnetheduck

Ok, we can avoid storing rejected messages, but not sure how we can prevent double compression in libp2p?

We need to compute the message id to dedup before validation, unless you want to validate 8* more messages. Or I missed something?

Nov 02 '21 08:11 Menduist

avoid storing rejected

yeah, the validation would have to return a ValidationResult that would affect scoring as well

Or I missed something?

well, it would need some new API to do this efficiently I guess - I believe rust-libp2p allows this optimization somehow, for example

Nov 02 '21 09:11 arnetheduck

Ok, so msgIdProvider could fail (we can use an exception, or a ValidationResult), and in that case we skip immediately And if the regular validation failed, we also don't store

Lgty?

Nov 02 '21 09:11 Menduist

And if the regular validation failed, we also don't store

we should still store if regular validation fails: we want to avoid validating failed messages repeatedly - the message id however, we can't avoid doing repeatedly regardless of failure or not

Nov 02 '21 09:11 arnetheduck

This is possible to do with observers which allow modifying the received or sent message. Observers are chained, so it's possible to modify the message in different ways.

Observers for incoming messages are called before validation happens, so it's possible to decompress the message there, observers for outgoing (sent or forwarded) messages are called on the raw message, before encoded to protobuf, so it's also possible to compress the message there if desired.

Observers are defined as:

  PubSubObserver* = ref object
    onRecv*: proc(peer: PubSubPeer; msgs: var RPCMsg) {.gcsafe, raises: [Defect].}
    onSend*: proc(peer: PubSubPeer; msgs: var RPCMsg) {.gcsafe, raises: [Defect].}

and can be added/removed with:

proc addObserver*(p: PubSub; observer: PubSubObserver)
proc removeObserver*(p: PubSub; observer: PubSubObserver)

Nov 02 '21 18:11 dryajov

This is possible

this doesn't seem to cover scoring though - data that can't be decompressed means a reject -> score loss. Also, we don't want to compress the data again for rebroadcast - then it's better to decompress twice.

Nov 03 '21 06:11 arnetheduck

Invalid message are now rejected early, https://github.com/status-im/nimbus-eth2/pull/3438

Mar 30 '22 11:03 Menduist

Invalid message are now rejected early

Valid messages still go through two snappy decompressions

Mar 30 '22 11:03 arnetheduck

Oh ok, didn't understand that you also wanted to avoid that. So libp2p should allow you to return custom data from the messageId provider, and forward it back to the validator?

Mar 30 '22 11:03 Menduist

forward it back to the validator?

something like that, yeah - it's important that we don't do "double" processing - at the same time, we have to be mindful that copying the data back to libp2p and then to the handler might be worse than decompressing twice, if done inefficiently, so doing this properly would require some benchmarks that we don't presently have.

Mar 30 '22 12:03 arnetheduck

nimbus-eth2 nimbus-eth2 copied to clipboard

Avoid double decompression of gossip messages

nimbus-eth2
nimbus-eth2 copied to clipboard