lodestar icon indicating copy to clipboard operation
lodestar copied to clipboard

getStateValidators does not dedupe validators

Open nflaig opened this issue 2 years ago • 7 comments

Describe the bug

Not sure if this should be considered a bug but I think it is better that we dedupe the result of getStateValidators if the same validator index/pubkey is provided multiple times.

Current behavior

Returns the same validator multiple times if index or pubkey is provided multiple times.

See implementation https://github.com/ChainSafe/lodestar/blob/3e254d588ce5169a1fd08e2b6ea6b7f509695fbc/packages/beacon-node/src/api/impl/beacon/state/index.ts#L58

Expected behavior

Should dedupe the result and only return each unique validator once.

Steps to Reproduce

Start lodestar in dev mode

./lodestar dev  --genesisValidators 8 --startValidators 0..7

send http request to getStateValidators API

curl "http://localhost:9596/eth/v1/beacon/states/head/validators?id=1,1" | jq

be surprised that the validator with index 1 is returned twice

{
  "data": [
    {
      "index": "1",
      "balance": "32000000000",
      "status": "active_ongoing",
      "validator": {
        "pubkey": "0xb89bebc699769726a318c8e9971bd3171297c61aea4a6578a7a4f94b547dcba5bac16a89108b6b6a1fe3695d1a874a0b",
        "withdrawal_credentials": "0x00ec7ef7780c9d151597924036262dd28dc60e1228f4da6fecf9d402cb3f3594",
        "effective_balance": "32000000000",
        "slashed": false,
        "activation_eligibility_epoch": "0",
        "activation_epoch": "0",
        "exit_epoch": "18446744073709551615",
        "withdrawable_epoch": "18446744073709551615"
      }
    },
    {
      "index": "1",
      "balance": "32000000000",
      "status": "active_ongoing",
      "validator": {
        "pubkey": "0xb89bebc699769726a318c8e9971bd3171297c61aea4a6578a7a4f94b547dcba5bac16a89108b6b6a1fe3695d1a874a0b",
        "withdrawal_credentials": "0x00ec7ef7780c9d151597924036262dd28dc60e1228f4da6fecf9d402cb3f3594",
        "effective_balance": "32000000000",
        "slashed": false,
        "activation_eligibility_epoch": "0",
        "activation_epoch": "0",
        "exit_epoch": "18446744073709551615",
        "withdrawable_epoch": "18446744073709551615"
      }
    }
  ],
  "execution_optimistic": false
}

nflaig avatar Mar 16 '23 16:03 nflaig

I mean, you requested the validator at index 1 twice, so it's not wrong tor return the validator with index 1 twice. What does the spec says about this?

dapplion avatar Mar 19 '23 07:03 dapplion

@dapplion There is no clear guidance on this in the Beacon API spec itself and neither does the OpenAPI spec details on how to handle duplicate keys in an array query string parameter.

However, the in the Beacon API spec it is at least enforced to have uniqueItems which would fail the example query of the issue during schema validation. But then the question would still be what do we do if the same validator is queries with pubkey and index, the schema validation would not fail in that case.

Regarding, stricter schema validation, I think we can not enforce this at this point as it likely would just cause issues without a major benefit in most cases.

I think it is fine how our API currently behaves, and considering breaking things if we change the behavior we might just wanna keep it as is. The most important thing is that our APIs behave the same across the board to allow users to make assumptions and potentially save some time.

This issue is mostly intended to at least quickly discuss this topic

nflaig avatar Mar 19 '23 07:03 nflaig

If the spec says uniqueItems, then let's obey by it. uniqueItems seems to mean to return 400 if the request contains duplicate items

dapplion avatar Mar 19 '23 08:03 dapplion

@dapplion the spec also says maxItems 30 but nobody (not a single CL client) enforces that as far as I have seen. That's a bit of a pity in my opinion as stricter APIs are usually a good thing to ensure consistency and predictability, even across different implemenations.

Unfortunately, I think it is too late to fully enforce the OpenAPI spec requirements as downstream tooling relies on this lax API validation. For example, rocketpool sends a request like this eth/v1/beacon/states/head/validators?id=<1000 validator pubkeys, comma-separated> and it works with all CL clients.

nflaig avatar Mar 19 '23 09:03 nflaig

Ok, then let's de-duplicate

dapplion avatar Mar 19 '23 23:03 dapplion

@nflaig is this still useful?

wemeetagain avatar Oct 10 '24 21:10 wemeetagain

yeah, I think we should dedupe the response although this is low prio

nflaig avatar Oct 10 '24 21:10 nflaig

I can do it by adding a uniqBy or a similar function to utils, and using it before returning the validator set. Or, I can check the existence of a possible duplicate and avoid pushing it to the response array. If it sounds good, let me know to open a PR.

mkermani144 avatar May 24 '25 08:05 mkermani144

What do you think about just enforcing it earlier, ie. at schema validation? As per spec the items in query should be unique, see here. It is likely a oversight if someone requests the same index twice, so I am thinking we should just throw a 400 error with a proper error message.

Same is applicable to POST variant of this api POST /eth/v1/beacon/states/{state_id}/validators where we would need to enforce the same on the array passed in the body.

Same for status, the beacon-api has several apis with uniqueItems: true on arrays, it might be good to be stricter in all those cases.

Our schemas are defined here https://github.com/ChainSafe/lodestar/blob/99033e3bd9e7a9e73de99f2a6b3e299f59c76440/packages/api/src/utils/schema.ts#L42

nflaig avatar May 24 '25 08:05 nflaig

@nflaig Yeah, it seems a better solution. I can work on it nonetheless.
This may need a bit of conversation. Should I leave comments here, or open a thread on Discord?

mkermani144 avatar May 24 '25 09:05 mkermani144

Should I leave comments here, or open a thread on Discord?

Either works, but usually Discord works better for fast feedback cycles / discussions

nflaig avatar May 24 '25 17:05 nflaig