Delayed RPC Send Using Tokens
Description
Currently we use the leaky-bucket algorithm to generate tokens for RPC requests. We use this to enforce rate limiting. If a peer runs out of tokens we send a rate limit error on the RPC, with a time at which they can start requesting again.
The current problem with this is that the rate limit errors are not spec'd and not really used. In practice we just downscore peers that hit our rate limits, and the errors are somewhat meaningless.
The specification is shifting more toward the behaviour where, if a peer hits our rate limit, we hold the stream open but slowly send responses as the rate limit tokens are regenerated. This effectively slows the response down to match our limits. The requesting peer may decide to drop the request if the responses are too slow.
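To make the token mechanics concrete, here is a minimal token-bucket sketch. The names and structure are hypothetical (this is not Lighthouse's actual `RPCRateLimiter`); the key idea is that a failed check reports how long until enough tokens will have regenerated, which is exactly the delay we would apply to the response instead of returning an error.

```rust
use std::time::{Duration, Instant};

/// Illustrative token bucket; names are hypothetical, not Lighthouse's rate limiter API.
struct TokenBucket {
    capacity: u64,       // maximum tokens the bucket can hold
    tokens: f64,         // tokens currently available
    refill_per_sec: f64, // regeneration rate
    last_refill: Instant,
}

impl TokenBucket {
    fn refill(&mut self) {
        let elapsed = self.last_refill.elapsed().as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity as f64);
        self.last_refill = Instant::now();
    }

    /// Try to spend `cost` tokens. On failure, return how long until enough
    /// tokens will have regenerated, so the caller can delay the response
    /// instead of sending a rate limit error.
    fn try_consume(&mut self, cost: u64) -> Result<(), Duration> {
        self.refill();
        if self.tokens >= cost as f64 {
            self.tokens -= cost as f64;
            Ok(())
        } else {
            let missing = cost as f64 - self.tokens;
            Err(Duration::from_secs_f64(missing / self.refill_per_sec))
        }
    }
}
```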
Implementation Details and Rough Guide
The rate limit logic is found here: https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/rpc/rate_limiter.rs
I don't think we need to change any code in here, but if you are interested in seeing how the tokens get regenerated, this is where the logic is.
When an RPC request comes in, we check that it is within our rate limits here: https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/rpc/mod.rs#L301
If it hits our limit we simply return an error: https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/rpc/mod.rs#L327
There are probably a few things to note here:
- In the error case, if the request is too large (i.e. outside of the spec), we probably still want to error.
- If the request would make us wait longer than the spec TIMEOUT (currently 10 seconds), then we should just error and close the stream rather than wait the 10 seconds and reach the timeout.
- For batch requests, like blocks_by_range, it could be that a first request uses most of the tokens and gets processed straight away, while a second request doesn't have enough tokens to be fulfilled immediately but could be fulfilled within the 10-second mark. In that case we probably want to send the entire request to the block processor to read from the db in a batch, then trickle the responses back to the peer as tokens become available (see the sketch after this list).
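As a rough illustration of the decision points above (all names here are hypothetical, and the 10-second value is the spec timeout mentioned above):

```rust
use std::time::Duration;

// Assumed value of the spec TIMEOUT referenced above.
const RESPONSE_TIMEOUT: Duration = Duration::from_secs(10);

/// Hypothetical outcome for an incoming request; not actual Lighthouse types.
enum InboundDecision {
    /// Request violates the spec (e.g. too large): error and close the stream.
    Error,
    /// Tokens are available: serve it now.
    ProcessNow,
    /// Serve it, but trickle the responses out as tokens regenerate.
    ProcessAndTrickle { first_send_in: Duration },
}

/// `wait_for_tokens` is `None` when the rate limiter has enough tokens now,
/// otherwise the time until enough tokens will have regenerated.
fn decide(
    request_cost: u64,
    max_spec_cost: u64,
    wait_for_tokens: Option<Duration>,
) -> InboundDecision {
    if request_cost > max_spec_cost {
        // Outside of the spec: still an error, as before.
        return InboundDecision::Error;
    }
    match wait_for_tokens {
        None => InboundDecision::ProcessNow,
        // Waiting longer than the timeout would just hit the timeout anyway.
        Some(wait) if wait > RESPONSE_TIMEOUT => InboundDecision::Error,
        Some(wait) => InboundDecision::ProcessAndTrickle { first_send_in: wait },
    }
}
```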
I think it might be easier to implement the cases where we process the request then trickle the responses.
In order to trickle the responses, I think (off the top of my head) there are maybe two or three ways to do it. There may be more, but these are some suggestions to get started.
- Once we have processed the request and send the response to the handler, i.e. via https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/rpc/mod.rs#L169, we can check the rate limiter for tokens and only trickle the responses inside this function in the behaviour (rpc/mod.rs). This has the advantage that we don't need to share the state of the RPC rate limiter, because the RPC behaviour (rpc/mod.rs) already knows about the current rate limits. The downside is that we would have to queue the responses in the behaviour and trickle them out as tokens become available (a minimal sketch of this approach follows this list).
- We could do everything inside the handler. We can probably Arc<> or otherwise share the rate limiter and give the handler access to it. We would then have to wait until the tokens are regenerated and send the messages on the stream. The benefits here are that the handler naturally has queues and runs in parallel per peer. The downside is that we have to share the state of the rate limiter between each per-peer handler.
- Maybe we can do a hybrid, where we send some timing info about when the next tokens will be generated within the send_request function, and ultimately change the HandlerMessage to accommodate this. Then we don't have to share state; the handler manages the sends. I think this might be a bit trickier though.
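To make the first option a bit more concrete, here is a minimal sketch of a behaviour-side queue. The types and names are hypothetical; in Lighthouse this would sit next to the rate limiter in rpc/mod.rs, use the real peer and response types, and be drained from the behaviour's poll loop.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

// Hypothetical placeholders for the real peer and response types.
type PeerId = u64;
type Response = Vec<u8>;

/// A response held back because the peer is over its rate limit.
struct QueuedResponse {
    peer: PeerId,
    response: Response,
    ready_at: Instant, // when enough tokens will have regenerated
}

/// Behaviour-side queue (option one): responses are produced immediately by the
/// processor, but only handed to the handler once their tokens are available.
#[derive(Default)]
struct DelayedSendQueue {
    queued: VecDeque<QueuedResponse>,
}

impl DelayedSendQueue {
    fn push(&mut self, peer: PeerId, response: Response, wait: Duration) {
        self.queued.push_back(QueuedResponse {
            peer,
            response,
            ready_at: Instant::now() + wait,
        });
    }

    /// Called from the behaviour's poll loop: drain everything whose tokens
    /// have regenerated and forward it to the per-peer handler.
    fn drain_ready(&mut self, mut send: impl FnMut(PeerId, Response)) {
        let now = Instant::now();
        while let Some(front) = self.queued.front() {
            if front.ready_at > now {
                break;
            }
            let item = self.queued.pop_front().expect("front exists");
            send(item.peer, item.response);
        }
    }
}
```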
Feel free to hit me up for more info/direction if needed :).
I'm working on this. Note to self: spec discussion related to this issue: https://github.com/ethereum/consensus-specs/pull/3767
@AgeManning The flowcharts below show how the RPC (responder side) should behave, based on my understanding. Could you please point out if there is any misunderstanding?
As-is
flowchart TD
Start([START]) --> A[Receive request]
A --> B{Rate limit reached or too large?}
B -->|Yes| C[Send error response]
C --> End([END])
B -->|No| E[Process request]
E --> F[Send response]
F --> End
To-be
- Regarding "Are there more than two concurrent requests with the same protocol?" in the diagram below, this is taken from the PR in consensus-specs: https://github.com/ethereum/consensus-specs/pull/3767/files (a small sketch of this check follows the diagram).
  - "The requester MUST NOT make more than two concurrent requests with the same ID."
- If my understanding shown in the diagram below is correct, I think we would make the rate limiter behave similarly to the self-limiter, which trickles the requests that a peer sends.
flowchart TD
Start2([START]) --> AA[Receive request]
AA --> COND1{Are there more than two concurrent requests <br> with the same protocol?}
COND1 --> |Yes| CC[Send error response]
CC --> End2([END])
COND1 --> |No| COND2{Request is too large?}
COND2 --> |Yes| CC
COND2 --> |No| DD[Process request]
DD --> EE{Rate limit reached?}
EE --> |Yes| FF[Wait until tokens are regenerated]
FF --> EE
EE --> |No| GG[Send response]
GG --> End2
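As referenced above, here is a minimal sketch of the concurrent-request check at the top of this diagram, assuming hypothetical peer/protocol types; it is the responder-side mirror of the requester rule quoted from the spec PR.

```rust
use std::collections::HashMap;

// Hypothetical identifiers; Lighthouse has its own peer and protocol types.
type PeerId = u64;
type Protocol = &'static str;

/// Tracks active inbound requests so we can reject a third concurrent request
/// on the same protocol from the same peer.
#[derive(Default)]
struct ConcurrentRequestTracker {
    active: HashMap<(PeerId, Protocol), usize>,
}

impl ConcurrentRequestTracker {
    const MAX_CONCURRENT: usize = 2;

    /// Returns false if accepting this request would exceed the limit,
    /// in which case we send an error response and end the stream.
    fn try_start(&mut self, peer: PeerId, protocol: Protocol) -> bool {
        let count = self.active.entry((peer, protocol)).or_insert(0);
        if *count >= Self::MAX_CONCURRENT {
            return false;
        }
        *count += 1;
        true
    }

    /// Call when a request finishes (response fully sent or stream closed).
    fn finish(&mut self, peer: PeerId, protocol: Protocol) {
        let remove = match self.active.get_mut(&(peer, protocol)) {
            Some(count) => {
                *count = count.saturating_sub(1);
                *count == 0
            }
            None => false,
        };
        if remove {
            self.active.remove(&(peer, protocol));
        }
    }
}
```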
Hey @ackintosh - Yep, this looks right to me!
Completed in #5923 🎉