SSV operators using Prysm's getBlock V3Proposal miss block proposals

Open hwhe opened this issue 11 months ago • 0 comments

Describe the bug

My test scenario is as follows: 4 SSV operators share the same beacon node (running the Prysm client). After I upgraded Prysm from v4.2.1 to v5.0.0, the rate of missed block proposals increased.

The v5.0.0 Prysm logs are as follows.

17:33:11" level=info msg="Begin building block" prefix="rpc/validator" sinceSlotStartTime=163.167957ms 
17:33:11" level=info msg="Begin building block" prefix="rpc/validator" sinceSlotStartTime=169.776038ms 
17:33:11" level=info msg="Begin building block" prefix="rpc/validator" sinceSlotStartTime=176.72173ms s
17:33:11" level=info msg="Begin building block" prefix="rpc/validator" sinceSlotStartTime=177.049293ms 
17:33:12" level=info msg="Received header with bid" blockHash=0x9cd85236ac929293a18916b97a7302be53390d0
17:33:12" level=info msg="Received header with bid" blockHash=0x10d176fa7b9f165bc26ea4fa1d1bac22c9e73c9
17:33:12" level=info msg="Received header with bid" blockHash=0x10d176fa7b9f165bc26ea4fa1d1bac22c9e73c9
17:33:12" level=info msg="Received header with bid" blockHash=0x10d176fa7b9f165bc26ea4fa1d1bac22c9e73c9
17:33:12" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=1.33401147
17:33:13" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=2.83911335
17:33:14" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=3.55759571
17:33:15" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=4.34404219

The logs show that Prysm receives the four requests at almost the same time, and that it also receives the bids from the remote builder at almost the same time. However, the times at which the four blocks finish building vary greatly: going by the sinceSlotStartTime values, from roughly 1.3 s to roughly 4.3 s into the slot.

The v4.2.1 Prysm logs are as follows.

13:57:24" level=info msg="Received header with bid" blockHash=0xc9bda138581b4de79d0a1efad16390f1575b2e7c4
13:57:24" level=info msg="Received header with bid" blockHash=0xc9bda138581b4de79d0a1efad16390f1575b2e7c4
13:57:24" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=1.127391152s
13:57:24" level=info msg="Received header with bid" blockHash=0xc9bda138581b4de79d0a1efad16390f1575b2e7c4
13:57:24" level=info msg="Received header with bid" blockHash=0x0a024f80f22cd3c394a728c28aca62f8ae1a9e78c
13:57:24" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=1.168264301s
13:57:24" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=1.209053896s
13:57:24" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=1.250095787s

Here the four concurrent requests all complete within roughly 120 ms of each other, at a very consistent pace.
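To compare the two versions without eyeballing the logs, the spread between the earliest and latest "Finished building block" entries can be computed directly. The following is a minimal sketch, not part of Prysm: `finishSpread` is a hypothetical helper, and it assumes each log line carries a complete `sinceSlotStartTime` value in Go duration syntax (for example `1.127391152s`), as in the v4.2.1 excerpt. Truncated values like those in the v5.0.0 excerpt would be reported as parse errors.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"time"
)

// sinceSlotStart matches the sinceSlotStartTime field of a log line.
var sinceSlotStart = regexp.MustCompile(`sinceSlotStartTime=(\S+)`)

// finishSpread (hypothetical helper) returns the gap between the earliest and
// the latest "Finished building block" entries found in lines.
func finishSpread(lines []string) (time.Duration, error) {
	var lo, hi time.Duration
	seen := false
	for _, line := range lines {
		if !strings.Contains(line, `msg="Finished building block"`) {
			continue // ignore "Begin building block", bid lines, etc.
		}
		m := sinceSlotStart.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		d, err := time.ParseDuration(m[1])
		if err != nil {
			return 0, fmt.Errorf("parse %q: %w", m[1], err)
		}
		if !seen || d < lo {
			lo = d
		}
		if !seen || d > hi {
			hi = d
		}
		seen = true
	}
	if !seen {
		return 0, fmt.Errorf("no finished-block entries found")
	}
	return hi - lo, nil
}

func main() {
	// The four "Finished building block" lines from the v4.2.1 excerpt.
	lines := []string{
		`13:57:24" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=1.127391152s`,
		`13:57:24" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=1.168264301s`,
		`13:57:24" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=1.209053896s`,
		`13:57:24" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=1.250095787s`,
	}
	spread, err := finishSpread(lines)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("spread between first and last finished block:", spread)
}
```

For the v4.2.1 lines the spread comes out at about 123 ms. The v5.0.0 excerpt, read by eye, spans about 3 s.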

The SSV block proposal flow is as follows:

// executeDuty steps:
// 1) sign a partial randao sig and wait for 2f+1 partial sigs from peers
// 2) reconstruct randao and send GetBeaconBlock to BN
// 3) start consensus on duty + block data
// 4) Once consensus decides, sign partial block and broadcast
// 5) collect 2f+1 partial sigs, reconstruct and broadcast valid block sig to the BN

SSV requires at least three nodes to complete the getBlock request before the consensus operation can complete. Because step 2 is slow, consensus is slow. As a result, the block proposal duty fails.

So my guess is that some change in the new Prysm version does not handle concurrent getBlock requests well, causing them to be executed sequentially and therefore to take much longer.

Please help me look into this. Thanks.

Has this worked before in a previous version?

v4.2.1 is good

🔬 Minimal Reproduction

No response

Error

No response

Platform(s)

No response

What version of Prysm are you running? (Which release)

v5.0.0

Anything else relevant (validator index / public key)?

No response

hwhe, Mar 09 '24 15:03