getStateValidators does not dedupe validators
Describe the bug
Not sure if this should be considered a bug but I think it is better that we dedupe the result of getStateValidators if the same validator index/pubkey is provided multiple times.
Current behavior
Returns the same validator multiple times if index or pubkey is provided multiple times.
See implementation https://github.com/ChainSafe/lodestar/blob/3e254d588ce5169a1fd08e2b6ea6b7f509695fbc/packages/beacon-node/src/api/impl/beacon/state/index.ts#L58
Expected behavior
Should dedupe the result and only return each unique validator once.
Steps to Reproduce
Start lodestar in dev mode
./lodestar dev --genesisValidators 8 --startValidators 0..7
send http request to getStateValidators API
curl "http://localhost:9596/eth/v1/beacon/states/head/validators?id=1,1" | jq
be surprised that the validator with index 1 is returned twice
{
"data": [
{
"index": "1",
"balance": "32000000000",
"status": "active_ongoing",
"validator": {
"pubkey": "0xb89bebc699769726a318c8e9971bd3171297c61aea4a6578a7a4f94b547dcba5bac16a89108b6b6a1fe3695d1a874a0b",
"withdrawal_credentials": "0x00ec7ef7780c9d151597924036262dd28dc60e1228f4da6fecf9d402cb3f3594",
"effective_balance": "32000000000",
"slashed": false,
"activation_eligibility_epoch": "0",
"activation_epoch": "0",
"exit_epoch": "18446744073709551615",
"withdrawable_epoch": "18446744073709551615"
}
},
{
"index": "1",
"balance": "32000000000",
"status": "active_ongoing",
"validator": {
"pubkey": "0xb89bebc699769726a318c8e9971bd3171297c61aea4a6578a7a4f94b547dcba5bac16a89108b6b6a1fe3695d1a874a0b",
"withdrawal_credentials": "0x00ec7ef7780c9d151597924036262dd28dc60e1228f4da6fecf9d402cb3f3594",
"effective_balance": "32000000000",
"slashed": false,
"activation_eligibility_epoch": "0",
"activation_epoch": "0",
"exit_epoch": "18446744073709551615",
"withdrawable_epoch": "18446744073709551615"
}
}
],
"execution_optimistic": false
}
I mean, you requested the validator at index 1 twice, so it's not wrong tor return the validator with index 1 twice. What does the spec says about this?
@dapplion There is no clear guidance on this in the Beacon API spec itself and neither does the OpenAPI spec details on how to handle duplicate keys in an array query string parameter.
However, the in the Beacon API spec it is at least enforced to have uniqueItems which would fail the example query of the issue during schema validation. But then the question would still be what do we do if the same validator is queries with pubkey and index, the schema validation would not fail in that case.
Regarding, stricter schema validation, I think we can not enforce this at this point as it likely would just cause issues without a major benefit in most cases.
I think it is fine how our API currently behaves, and considering breaking things if we change the behavior we might just wanna keep it as is. The most important thing is that our APIs behave the same across the board to allow users to make assumptions and potentially save some time.
This issue is mostly intended to at least quickly discuss this topic
If the spec says uniqueItems, then let's obey by it. uniqueItems seems to mean to return 400 if the request contains duplicate items
@dapplion the spec also says maxItems 30 but nobody (not a single CL client) enforces that as far as I have seen. That's a bit of a pity in my opinion as stricter APIs are usually a good thing to ensure consistency and predictability, even across different implemenations.
Unfortunately, I think it is too late to fully enforce the OpenAPI spec requirements as downstream tooling relies on this lax API validation. For example, rocketpool sends a request like this eth/v1/beacon/states/head/validators?id=<1000 validator pubkeys, comma-separated> and it works with all CL clients.
Ok, then let's de-duplicate
@nflaig is this still useful?
yeah, I think we should dedupe the response although this is low prio
I can do it by adding a uniqBy or a similar function to utils, and using it before returning the validator set. Or, I can check the existence of a possible duplicate and avoid pushing it to the response array. If it sounds good, let me know to open a PR.
What do you think about just enforcing it earlier, ie. at schema validation? As per spec the items in query should be unique, see here. It is likely a oversight if someone requests the same index twice, so I am thinking we should just throw a 400 error with a proper error message.
Same is applicable to POST variant of this api POST /eth/v1/beacon/states/{state_id}/validators where we would need to enforce the same on the array passed in the body.
Same for status, the beacon-api has several apis with uniqueItems: true on arrays, it might be good to be stricter in all those cases.
Our schemas are defined here https://github.com/ChainSafe/lodestar/blob/99033e3bd9e7a9e73de99f2a6b3e299f59c76440/packages/api/src/utils/schema.ts#L42
@nflaig Yeah, it seems a better solution. I can work on it nonetheless.
This may need a bit of conversation. Should I leave comments here, or open a thread on Discord?
Should I leave comments here, or open a thread on Discord?
Either works, but usually Discord works better for fast feedback cycles / discussions