builder-specs

Clarify errors on registrations

Open ralexstokes opened this issue 3 years ago • 3 comments

The validator registration endpoint currently takes a single registration with the understanding that relay muxers will send the registration to each relay they are aware of.

The spec for this endpoint returns a binary success/error response, which precludes the case where some relays succeed while others fail when processing a registration.

I think we want to erase this level of detail from the API, in which case we don't need to change anything. For example, a caller of this endpoint who receives an error should assume the registration has completely failed and that they need to try again (e.g. the beacon node queues a retry for later). If they get a success, they should assume the responsibility for promulgating their registration is now with the muxer software (and the implication is that internally, the muxer may retry against transient relay failures, etc.).

If this is not the intended semantics, then we should extend the response so that multiple errors can be returned.

ralexstokes, May 10 '22 22:05

Absolutely. Thinking about this from the perspective of mev-boost:

  • a registerValidator call is received (containing a single registration or an array of individual registrations)
  • mev-boost sends this call to all connected relays
  • mev-boost returns 200 OK if at least one relay received it successfully (previous assumption)
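
The three steps above can be sketched roughly as follows. This is a minimal illustration, not mev-boost's actual code: the `RelaySender` type and the `register_validator` function are hypothetical names, relays are modelled as plain callables instead of HTTP clients, and the 502 returned when every relay fails is an assumed choice, since the thread only specifies the success case.

```python
from typing import Callable, Iterable

# Hypothetical type: a callable that delivers the payload to one relay and
# returns the HTTP status code it received (it may raise on transport errors).
RelaySender = Callable[[bytes], int]


def register_validator(payload: bytes, relays: Iterable[RelaySender]) -> int:
    """Forward the payload to every relay; report 200 if any relay accepted it."""
    any_ok = False
    for send in relays:
        try:
            status = send(payload)
        except Exception:
            # A transport failure counts as a failed relay, not a global failure.
            continue
        if status == 200:
            any_ok = True
    # 502 is an assumed error code for "no relay accepted the registration".
    return 200 if any_ok else 502
```

Note that every relay is still contacted even after the first success, since the point is to deliver the registration to all of them.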

Additionally possible (although perhaps better avoided):

  • mev-boost updates its internal list of latest registrations per validator
  • mev-boost periodically resends the latest registrations to all the relays (in order to update new relays / builders coming online)
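
A rough sketch of that optional behaviour, under stated assumptions: `RegistrationCache` is a hypothetical helper, registrations are modelled as flat dicts with `pubkey` and `timestamp` keys purely for illustration, and relays are callables that raise on failure. Scheduling the periodic call (a timer, once per epoch, etc.) is left out.

```python
from typing import Callable, Dict, List


class RegistrationCache:
    """Keeps only the latest registration seen for each validator pubkey."""

    def __init__(self) -> None:
        self._latest: Dict[str, dict] = {}

    def update(self, registration: dict) -> None:
        # Assumed shape: {"pubkey": ..., "timestamp": ..., ...}
        pubkey = registration["pubkey"]
        current = self._latest.get(pubkey)
        if current is None or registration["timestamp"] >= current["timestamp"]:
            self._latest[pubkey] = registration

    def resend_all(self, relays: List[Callable[[List[dict]], None]]) -> int:
        """Send the cached batch to every relay; return how many accepted it."""
        batch = list(self._latest.values())
        delivered = 0
        for send in relays:
            try:
                send(batch)
                delivered += 1
            except Exception:
                # A failing relay is simply retried on the next periodic resend.
                pass
        return delivered
```

Because the cache is keyed by validator, a relay that comes online later receives one current registration per validator on the next resend, not a replay of every historical call.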

Rather than returning 200 OK if at least one relay has received it successfully, mev-boost could return an array where each entry contains relayUri, statuscode, and error.
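
One way such a per-relay response could look, as a sketch only: the field names `relayUri`, `statuscode`, and `error` come from the comment above, while `RelayResult`, `register_with_details`, and the callable-based relay model are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class RelayResult:
    relayUri: str
    statuscode: Optional[int]  # None when no HTTP response was received
    error: Optional[str]       # None on success


def register_with_details(
    payload: bytes, relays: Dict[str, Callable[[bytes], int]]
) -> List[dict]:
    """Send the payload to each relay and collect one result entry per relay."""
    results = []
    for uri, send in relays.items():
        try:
            status = send(payload)
        except Exception as exc:
            results.append(RelayResult(uri, None, str(exc)))
            continue
        error = None if status == 200 else f"relay returned {status}"
        results.append(RelayResult(uri, status, error))
    return [asdict(r) for r in results]
```

The returned list can be serialized directly as a JSON array, leaving it to the caller to decide whether a partial failure warrants a retry.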

metachris, May 12 '22 14:05

I think if we assume the boost software will periodically submit the registrations upstream, then the error model for the registerValidator endpoint is simply: 200 OK if this direct hop in the pipeline successfully got a copy of the data, and an error otherwise.

If we use this error discipline, then each hop (validator to mux to relay to builder and back) can return an error according to its local context, and there doesn't seem to be any situation where it implies a "global" error that the validator needs to be aware of.
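
The per-hop discipline can be illustrated with a toy model. Everything here is hypothetical: the `Hop` class, its `available` flag, and the 503 code for a hop that cannot take the data are assumptions used only to show that a hop's response reflects its own state, not the state of anything upstream.

```python
class Hop:
    """One stage in the pipeline (e.g. mux or relay) with a local error model."""

    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream
        self.available = True
        self.pending = []  # copies this hop holds and has not yet handed upstream

    def accept(self, payload):
        # The response depends only on whether *this* hop got a copy.
        if not self.available:
            return 503  # assumed error code for a hop that cannot take the data
        self.pending.append(payload)
        return 200

    def forward_pending(self):
        # Called periodically; anything the next hop rejects is kept for a retry.
        if self.upstream is None:
            return
        self.pending = [p for p in self.pending if self.upstream.accept(p) != 200]
```

For example, with `relay = Hop("relay")` and `mux = Hop("mux", upstream=relay)`, `mux.accept(...)` still returns 200 while the relay is unavailable; the registration stays in `mux.pending` until a later `forward_pending()` call succeeds, and the validator never sees the relay's failure.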

Otherwise, we put the responsibility of managing resubmissions on the validator which we already said was the responsibility of the "sidecar" software (and I want to keep as much logic out of the validator as possible)

ralexstokes, May 20 '22 22:05

@realbigsean mentioned they implemented resending once per epoch into the Lighthouse validator client.

I'm not sure whether it's best to move this functionality to mev-boost, or to keep it in the validator and have mev-boost simply proxy it 🤔

metachris, Jun 15 '22 07:06