Update libp2p to ESM version
## Motivation

We want to get the latest libp2p, gossipsub, etc. updates.
## Description

TODO:
- [ ] more type errors
- [x] https://github.com/multiformats/js-multiaddr/pull/247 - not needed
- [x] https://github.com/libp2p/js-libp2p-interfaces/issues/227
- [x] merge conflicts
- [ ] configure libp2p to not break
Closes #4140
Closes https://github.com/ChainSafe/lodestar/issues/4043
@wemeetagain: is there an estimate on when this will be completed? I'm asking because I am hoping Lodestar can use the "canary" version of js-libp2p to provide feedback on the resource protection mechanisms that have recently been introduced.
Performance Report
✔️ no performance regression detected
Full benchmark results
Benchmark suite | Current: 270628128add88c9eeb74850e086a16e60561269 | Previous: 1beeb2bc71ea78e88ea89636d3db2682bb00c0e1 | Ratio |
---|---|---|---|
getPubkeys - index2pubkey - req 1000 vs - 250000 vc | 3.0490 ms/op | 2.2950 ms/op | 1.33 |
getPubkeys - validatorsArr - req 1000 vs - 250000 vc | 96.803 us/op | 74.553 us/op | 1.30 |
BLS verify - blst-native | 2.4159 ms/op | 1.8533 ms/op | 1.30 |
BLS verifyMultipleSignatures 3 - blst-native | 4.8576 ms/op | 3.7984 ms/op | 1.28 |
BLS verifyMultipleSignatures 8 - blst-native | 10.583 ms/op | 8.1662 ms/op | 1.30 |
BLS verifyMultipleSignatures 32 - blst-native | 39.519 ms/op | 29.656 ms/op | 1.33 |
BLS aggregatePubkeys 32 - blst-native | 52.465 us/op | 39.155 us/op | 1.34 |
BLS aggregatePubkeys 128 - blst-native | 202.31 us/op | 152.87 us/op | 1.32 |
getAttestationsForBlock | 126.50 ms/op | 85.728 ms/op | 1.48 |
isKnown best case - 1 super set check | 584.00 ns/op | 421.00 ns/op | 1.39 |
isKnown normal case - 2 super set checks | 555.00 ns/op | 413.00 ns/op | 1.34 |
isKnown worse case - 16 super set checks | 553.00 ns/op | 410.00 ns/op | 1.35 |
CheckpointStateCache - add get delete | 12.007 us/op | 8.7520 us/op | 1.37 |
validate gossip signedAggregateAndProof - struct | 5.8516 ms/op | 4.2644 ms/op | 1.37 |
validate gossip attestation - struct | 2.6393 ms/op | 2.0377 ms/op | 1.30 |
pickEth1Vote - no votes | 2.7208 ms/op | 2.1636 ms/op | 1.26 |
pickEth1Vote - max votes | 25.624 ms/op | 19.963 ms/op | 1.28 |
pickEth1Vote - Eth1Data hashTreeRoot value x2048 | 15.346 ms/op | 11.365 ms/op | 1.35 |
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 | 26.404 ms/op | 20.873 ms/op | 1.26 |
pickEth1Vote - Eth1Data fastSerialize value x2048 | 1.9266 ms/op | 1.4984 ms/op | 1.29 |
pickEth1Vote - Eth1Data fastSerialize tree x2048 | 16.742 ms/op | 13.271 ms/op | 1.26 |
bytes32 toHexString | 1.3290 us/op | 1.0160 us/op | 1.31 |
bytes32 Buffer.toString(hex) | 797.00 ns/op | 740.00 ns/op | 1.08 |
bytes32 Buffer.toString(hex) from Uint8Array | 1.2040 us/op | 1.0110 us/op | 1.19 |
bytes32 Buffer.toString(hex) + 0x | 833.00 ns/op | 737.00 ns/op | 1.13 |
Object access 1 prop | 0.49200 ns/op | 0.36700 ns/op | 1.34 |
Map access 1 prop | 0.36200 ns/op | 0.29700 ns/op | 1.22 |
Object get x1000 | 17.814 ns/op | 18.194 ns/op | 0.98 |
Map get x1000 | 1.0130 ns/op | 1.0350 ns/op | 0.98 |
Object set x1000 | 127.57 ns/op | 117.68 ns/op | 1.08 |
Map set x1000 | 80.423 ns/op | 68.539 ns/op | 1.17 |
Return object 10000 times | 0.40130 ns/op | 0.37780 ns/op | 1.06 |
Throw Error 10000 times | 8.3322 us/op | 5.9567 us/op | 1.40 |
enrSubnets - fastDeserialize 64 bits | 3.3560 us/op | 2.6100 us/op | 1.29 |
enrSubnets - ssz BitVector 64 bits | 808.00 ns/op | 748.00 ns/op | 1.08 |
enrSubnets - fastDeserialize 4 bits | 435.00 ns/op | 381.00 ns/op | 1.14 |
enrSubnets - ssz BitVector 4 bits | 833.00 ns/op | 764.00 ns/op | 1.09 |
prioritizePeers score -10:0 att 32-0.1 sync 2-0 | 116.00 us/op | 95.978 us/op | 1.21 |
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 | 156.63 us/op | 124.82 us/op | 1.25 |
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 | 291.37 us/op | 213.17 us/op | 1.37 |
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 | 668.00 us/op | 467.01 us/op | 1.43 |
prioritizePeers score 0:0 att 64-1 sync 4-1 | 608.12 us/op | 464.00 us/op | 1.31 |
RateTracker 1000000 limit, 1 obj count per request | 229.73 ns/op | 198.88 ns/op | 1.16 |
RateTracker 1000000 limit, 2 obj count per request | 161.02 ns/op | 153.91 ns/op | 1.05 |
RateTracker 1000000 limit, 4 obj count per request | 140.08 ns/op | 129.29 ns/op | 1.08 |
RateTracker 1000000 limit, 8 obj count per request | 128.65 ns/op | 117.24 ns/op | 1.10 |
RateTracker with prune | 5.4210 us/op | 4.3870 us/op | 1.24 |
array of 16000 items push then shift | 5.7012 us/op | 3.1716 us/op | 1.80 |
LinkedList of 16000 items push then shift | 23.408 ns/op | 18.411 ns/op | 1.27 |
array of 16000 items push then pop | 287.55 ns/op | 240.67 ns/op | 1.19 |
LinkedList of 16000 items push then pop | 19.055 ns/op | 17.484 ns/op | 1.09 |
array of 24000 items push then shift | 8.4241 us/op | 4.5562 us/op | 1.85 |
LinkedList of 24000 items push then shift | 22.135 ns/op | 22.420 ns/op | 0.99 |
array of 24000 items push then pop | 261.41 ns/op | 212.00 ns/op | 1.23 |
LinkedList of 24000 items push then pop | 21.404 ns/op | 19.634 ns/op | 1.09 |
intersect bitArray bitLen 8 | 15.382 ns/op | 11.762 ns/op | 1.31 |
intersect array and set length 8 | 231.81 ns/op | 161.62 ns/op | 1.43 |
intersect bitArray bitLen 128 | 84.892 ns/op | 61.951 ns/op | 1.37 |
intersect array and set length 128 | 2.8094 us/op | 2.2117 us/op | 1.27 |
Buffer.concat 32 items | 2.7360 ns/op | 1.9180 ns/op | 1.43 |
pass gossip attestations to forkchoice per slot | 7.5348 ms/op | 4.0512 ms/op | 1.86 |
computeDeltas | 6.0913 ms/op | 6.3272 ms/op | 0.96 |
computeProposerBoostScoreFromBalances | 915.75 us/op | 907.80 us/op | 1.01 |
altair processAttestation - 250000 vs - 7PWei normalcase | 5.6117 ms/op | 3.9559 ms/op | 1.42 |
altair processAttestation - 250000 vs - 7PWei worstcase | 8.1975 ms/op | 5.7794 ms/op | 1.42 |
altair processAttestation - setStatus - 1/6 committees join | 288.96 us/op | 216.77 us/op | 1.33 |
altair processAttestation - setStatus - 1/3 committees join | 543.17 us/op | 416.59 us/op | 1.30 |
altair processAttestation - setStatus - 1/2 committees join | 849.38 us/op | 583.20 us/op | 1.46 |
altair processAttestation - setStatus - 2/3 committees join | 1.0761 ms/op | 747.93 us/op | 1.44 |
altair processAttestation - setStatus - 4/5 committees join | 1.4851 ms/op | 1.0349 ms/op | 1.44 |
altair processAttestation - setStatus - 100% committees join | 1.8126 ms/op | 1.2305 ms/op | 1.47 |
altair processBlock - 250000 vs - 7PWei normalcase | 33.919 ms/op | 29.260 ms/op | 1.16 |
altair processBlock - 250000 vs - 7PWei normalcase hashState | 47.411 ms/op | 40.503 ms/op | 1.17 |
altair processBlock - 250000 vs - 7PWei worstcase | 120.89 ms/op | 83.742 ms/op | 1.44 |
altair processBlock - 250000 vs - 7PWei worstcase hashState | 125.70 ms/op | 108.82 ms/op | 1.16 |
phase0 processBlock - 250000 vs - 7PWei normalcase | 5.1877 ms/op | 4.3816 ms/op | 1.18 |
phase0 processBlock - 250000 vs - 7PWei worstcase | 60.410 ms/op | 49.045 ms/op | 1.23 |
altair processEth1Data - 250000 vs - 7PWei normalcase | 1.1887 ms/op | 814.30 us/op | 1.46 |
Tree 40 250000 create | 1.0853 s/op | 830.11 ms/op | 1.31 |
Tree 40 250000 get(125000) | 341.21 ns/op | 288.04 ns/op | 1.18 |
Tree 40 250000 set(125000) | 2.8838 us/op | 2.9523 us/op | 0.98 |
Tree 40 250000 toArray() | 38.926 ms/op | 33.907 ms/op | 1.15 |
Tree 40 250000 iterate all - toArray() + loop | 39.657 ms/op | 34.791 ms/op | 1.14 |
Tree 40 250000 iterate all - get(i) | 145.47 ms/op | 114.74 ms/op | 1.27 |
MutableVector 250000 create | 16.893 ms/op | 18.215 ms/op | 0.93 |
MutableVector 250000 get(125000) | 14.166 ns/op | 14.802 ns/op | 0.96 |
MutableVector 250000 set(125000) | 843.06 ns/op | 717.62 ns/op | 1.17 |
MutableVector 250000 toArray() | 7.4086 ms/op | 7.8917 ms/op | 0.94 |
MutableVector 250000 iterate all - toArray() + loop | 7.9347 ms/op | 7.9987 ms/op | 0.99 |
MutableVector 250000 iterate all - get(i) | 3.5385 ms/op | 3.4406 ms/op | 1.03 |
Array 250000 create | 7.0889 ms/op | 7.3742 ms/op | 0.96 |
Array 250000 clone - spread | 3.8526 ms/op | 4.2339 ms/op | 0.91 |
Array 250000 get(125000) | 1.6610 ns/op | 1.7150 ns/op | 0.97 |
Array 250000 set(125000) | 1.7860 ns/op | 1.7020 ns/op | 1.05 |
Array 250000 iterate all - loop | 153.20 us/op | 167.87 us/op | 0.91 |
effectiveBalanceIncrements clone Uint8Array 300000 | 356.97 us/op | 85.549 us/op | 4.17 |
effectiveBalanceIncrements clone MutableVector 300000 | 803.00 ns/op | 630.00 ns/op | 1.27 |
effectiveBalanceIncrements rw all Uint8Array 300000 | 314.02 us/op | 252.73 us/op | 1.24 |
effectiveBalanceIncrements rw all MutableVector 300000 | 227.69 ms/op | 169.24 ms/op | 1.35 |
phase0 afterProcessEpoch - 250000 vs - 7PWei | 224.87 ms/op | 201.31 ms/op | 1.12 |
phase0 beforeProcessEpoch - 250000 vs - 7PWei | 84.716 ms/op | 80.299 ms/op | 1.06 |
altair processEpoch - mainnet_e81889 | 677.45 ms/op | 602.13 ms/op | 1.13 |
mainnet_e81889 - altair beforeProcessEpoch | 191.61 ms/op | 156.50 ms/op | 1.22 |
mainnet_e81889 - altair processJustificationAndFinalization | 74.695 us/op | 28.365 us/op | 2.63 |
mainnet_e81889 - altair processInactivityUpdates | 11.347 ms/op | 11.819 ms/op | 0.96 |
mainnet_e81889 - altair processRewardsAndPenalties | 105.32 ms/op | 96.361 ms/op | 1.09 |
mainnet_e81889 - altair processRegistryUpdates | 15.393 us/op | 4.0670 us/op | 3.78 |
mainnet_e81889 - altair processSlashings | 4.6940 us/op | 972.00 ns/op | 4.83 |
mainnet_e81889 - altair processEth1DataReset | 4.1230 us/op | 898.00 ns/op | 4.59 |
mainnet_e81889 - altair processEffectiveBalanceUpdates | 2.6356 ms/op | 2.4184 ms/op | 1.09 |
mainnet_e81889 - altair processSlashingsReset | 25.014 us/op | 9.2450 us/op | 2.71 |
mainnet_e81889 - altair processRandaoMixesReset | 25.251 us/op | 6.9030 us/op | 3.66 |
mainnet_e81889 - altair processHistoricalRootsUpdate | 3.9270 us/op | 1.1710 us/op | 3.35 |
mainnet_e81889 - altair processParticipationFlagUpdates | 14.944 us/op | 4.5380 us/op | 3.29 |
mainnet_e81889 - altair processSyncCommitteeUpdates | 3.6410 us/op | 764.00 ns/op | 4.77 |
mainnet_e81889 - altair afterProcessEpoch | 221.67 ms/op | 198.28 ms/op | 1.12 |
phase0 processEpoch - mainnet_e58758 | 616.17 ms/op | 545.99 ms/op | 1.13 |
mainnet_e58758 - phase0 beforeProcessEpoch | 327.02 ms/op | 245.62 ms/op | 1.33 |
mainnet_e58758 - phase0 processJustificationAndFinalization | 67.292 us/op | 24.110 us/op | 2.79 |
mainnet_e58758 - phase0 processRewardsAndPenalties | 91.132 ms/op | 132.24 ms/op | 0.69 |
mainnet_e58758 - phase0 processRegistryUpdates | 34.264 us/op | 12.119 us/op | 2.83 |
mainnet_e58758 - phase0 processSlashings | 3.9410 us/op | 982.00 ns/op | 4.01 |
mainnet_e58758 - phase0 processEth1DataReset | 4.0030 us/op | 1.0220 us/op | 3.92 |
mainnet_e58758 - phase0 processEffectiveBalanceUpdates | 2.0074 ms/op | 2.3950 ms/op | 0.84 |
mainnet_e58758 - phase0 processSlashingsReset | 17.223 us/op | 4.6280 us/op | 3.72 |
mainnet_e58758 - phase0 processRandaoMixesReset | 27.886 us/op | 5.0820 us/op | 5.49 |
mainnet_e58758 - phase0 processHistoricalRootsUpdate | 4.7070 us/op | 888.00 ns/op | 5.30 |
mainnet_e58758 - phase0 processParticipationRecordUpdates | 25.906 us/op | 5.0020 us/op | 5.18 |
mainnet_e58758 - phase0 afterProcessEpoch | 168.22 ms/op | 173.60 ms/op | 0.97 |
phase0 processEffectiveBalanceUpdates - 250000 normalcase | 2.4557 ms/op | 2.6266 ms/op | 0.93 |
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 | 2.7366 ms/op | 3.4871 ms/op | 0.78 |
altair processInactivityUpdates - 250000 normalcase | 67.569 ms/op | 41.647 ms/op | 1.62 |
altair processInactivityUpdates - 250000 worstcase | 55.950 ms/op | 51.814 ms/op | 1.08 |
phase0 processRegistryUpdates - 250000 normalcase | 27.994 us/op | 8.4310 us/op | 3.32 |
phase0 processRegistryUpdates - 250000 badcase_full_deposits | 540.09 us/op | 455.39 us/op | 1.19 |
phase0 processRegistryUpdates - 250000 worstcase 0.5 | 279.11 ms/op | 218.11 ms/op | 1.28 |
altair processRewardsAndPenalties - 250000 normalcase | 111.94 ms/op | 132.83 ms/op | 0.84 |
altair processRewardsAndPenalties - 250000 worstcase | 166.70 ms/op | 86.961 ms/op | 1.92 |
phase0 getAttestationDeltas - 250000 normalcase | 15.251 ms/op | 13.380 ms/op | 1.14 |
phase0 getAttestationDeltas - 250000 worstcase | 14.338 ms/op | 13.583 ms/op | 1.06 |
phase0 processSlashings - 250000 worstcase | 6.7549 ms/op | 5.4716 ms/op | 1.23 |
altair processSyncCommitteeUpdates - 250000 | 344.39 ms/op | 279.04 ms/op | 1.23 |
BeaconState.hashTreeRoot - No change | 633.00 ns/op | 479.00 ns/op | 1.32 |
BeaconState.hashTreeRoot - 1 full validator | 71.372 us/op | 64.361 us/op | 1.11 |
BeaconState.hashTreeRoot - 32 full validator | 829.38 us/op | 678.28 us/op | 1.22 |
BeaconState.hashTreeRoot - 512 full validator | 8.8300 ms/op | 6.8405 ms/op | 1.29 |
BeaconState.hashTreeRoot - 1 validator.effectiveBalance | 102.53 us/op | 87.513 us/op | 1.17 |
BeaconState.hashTreeRoot - 32 validator.effectiveBalance | 1.6028 ms/op | 1.1719 ms/op | 1.37 |
BeaconState.hashTreeRoot - 512 validator.effectiveBalance | 19.592 ms/op | 15.904 ms/op | 1.23 |
BeaconState.hashTreeRoot - 1 balances | 84.721 us/op | 62.226 us/op | 1.36 |
BeaconState.hashTreeRoot - 32 balances | 714.47 us/op | 615.06 us/op | 1.16 |
BeaconState.hashTreeRoot - 512 balances | 7.5394 ms/op | 5.8641 ms/op | 1.29 |
BeaconState.hashTreeRoot - 250000 balances | 124.75 ms/op | 94.954 ms/op | 1.31 |
aggregationBits - 2048 els - zipIndexesInBitList | 35.332 us/op | 36.236 us/op | 0.98 |
regular array get 100000 times | 62.570 us/op | 67.419 us/op | 0.93 |
wrappedArray get 100000 times | 61.818 us/op | 67.445 us/op | 0.92 |
arrayWithProxy get 100000 times | 37.099 ms/op | 28.922 ms/op | 1.28 |
ssz.Root.equals | 602.00 ns/op | 533.00 ns/op | 1.13 |
byteArrayEquals | 599.00 ns/op | 507.00 ns/op | 1.18 |
shuffle list - 16384 els | 13.165 ms/op | 11.080 ms/op | 1.19 |
shuffle list - 250000 els | 185.06 ms/op | 166.95 ms/op | 1.11 |
processSlot - 1 slots | 17.990 us/op | 14.083 us/op | 1.28 |
processSlot - 32 slots | 2.4390 ms/op | 1.7584 ms/op | 1.39 |
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei | 473.10 us/op | 399.59 us/op | 1.18 |
getCommitteeAssignments - req 1 vs - 250000 vc | 5.5776 ms/op | 5.3835 ms/op | 1.04 |
getCommitteeAssignments - req 100 vs - 250000 vc | 8.2227 ms/op | 7.3234 ms/op | 1.12 |
getCommitteeAssignments - req 1000 vs - 250000 vc | 9.1009 ms/op | 7.7859 ms/op | 1.17 |
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei | 10.710 ns/op | 10.840 ns/op | 0.99 |
state getBlockRootAtSlot - 250000 vs - 7PWei | 1.5647 us/op | 1.0767 us/op | 1.45 |
computeProposers - vc 250000 | 23.492 ms/op | 17.763 ms/op | 1.32 |
computeEpochShuffling - vc 250000 | 180.65 ms/op | 169.58 ms/op | 1.07 |
getNextSyncCommittee - vc 250000 | 357.28 ms/op | 271.46 ms/op | 1.32 |
by benchmarkbot/action
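For context on why ratios up to ~1.9 above still pass, the bot's verdict is ratio-based; a minimal sketch of such a check, assuming a 2x threshold (the bot's actual threshold is not stated here):

```typescript
// Flag a benchmark as regressed when the current time per op exceeds the
// previous time by more than `threshold`. The 2x default is an assumption
// for illustration, not the configured value.
function isRegression(currentPerOp: number, previousPerOp: number, threshold = 2): boolean {
  return currentPerOp / previousPerOp > threshold;
}
```

Under that assumption, even the worst ratio in the table (1.86 for "pass gossip attestations to forkchoice per slot", 7.5348 ms/op vs 4.0512 ms/op) stays under the threshold.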
This is ready for initial review. The last remaining items to fix are related to updating stream limits per protocol: running this locally produces many "too many streams" errors per protocol.
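Per-protocol stream limits are set where a handler is registered; a hedged sketch of raising them (option names follow js-libp2p's registrar handler options; the protocol string and limit values here are illustrative, not Lodestar's actual configuration):

```typescript
import type {Libp2p} from "libp2p";

// Illustrative only: cap concurrent streams per protocol above the registrar
// defaults so req/resp traffic does not trip "too many streams" errors.
async function registerPingHandler(libp2p: Libp2p): Promise<void> {
  await libp2p.handle(
    "/eth2/beacon_chain/req/ping/1/ssz_snappy",
    ({stream}) => {
      // decode the request, write the response, then close the stream
    },
    // assumed option names; tune limits per protocol
    {maxInboundStreams: 256, maxOutboundStreams: 256}
  );
}
```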
Seeing gossip logs like:
```
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm9KWBLBBNMg7fvQNjYEkja8XgRLaT59AdvuYj3WH8DwTS as there is no open stream to it available +0ms
libp2p:gossipsub:score unexpected delivery: message from 16Uiu2HAmE3P9zMXa33mafrvKemkPfMLLwFc9wpVwMFCQErfx6FPa was first seen 2505 ago and has delivery status NaN +0ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm2zjUvbtntJfrpXz2ehpv9BhHRqjgziYWbbi3nKoGsEoh as there is no open stream to it available +0ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm9KWBLBBNMg7fvQNjYEkja8XgRLaT59AdvuYj3WH8DwTS as there is no open stream to it available +0ms
libp2p:gossipsub:score unexpected delivery: message from 16Uiu2HAmE3P9zMXa33mafrvKemkPfMLLwFc9wpVwMFCQErfx6FPa was first seen 2525 ago and has delivery status NaN +1ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm2zjUvbtntJfrpXz2ehpv9BhHRqjgziYWbbi3nKoGsEoh as there is no open stream to it available +1ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm9KWBLBBNMg7fvQNjYEkja8XgRLaT59AdvuYj3WH8DwTS as there is no open stream to it available +0ms
libp2p:gossipsub:score unexpected delivery: message from 16Uiu2HAmE3P9zMXa33mafrvKemkPfMLLwFc9wpVwMFCQErfx6FPa was first seen 2671 ago and has delivery status NaN +1ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm2zjUvbtntJfrpXz2ehpv9BhHRqjgziYWbbi3nKoGsEoh as there is no open stream to it available +1ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm9KWBLBBNMg7fvQNjYEkja8XgRLaT59AdvuYj3WH8DwTS as there is no open stream to it available +0ms
libp2p:gossipsub:score unexpected delivery: message from 16Uiu2HAmE3P9zMXa33mafrvKemkPfMLLwFc9wpVwMFCQErfx6FPa was first seen 3083 ago and has delivery status NaN +1ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm2zjUvbtntJfrpXz2ehpv9BhHRqjgziYWbbi3nKoGsEoh as there is no open stream to it available +1ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm9KWBLBBNMg7fvQNjYEkja8XgRLaT59AdvuYj3WH8DwTS as there is no open stream to it available +0ms
libp2p:gossipsub:score unexpected delivery: message from 16Uiu2HAmE3P9zMXa33mafrvKemkPfMLLwFc9wpVwMFCQErfx6FPa was first seen 3152 ago and has delivery status NaN +0ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm2zjUvbtntJfrpXz2ehpv9BhHRqjgziYWbbi3nKoGsEoh as there is no open stream to it available +0ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm9KWBLBBNMg7fvQNjYEkja8XgRLaT59AdvuYj3WH8DwTS as there is no open stream to it available +0ms
libp2p:gossipsub:score unexpected delivery: message from 16Uiu2HAmE3P9zMXa33mafrvKemkPfMLLwFc9wpVwMFCQErfx6FPa was first seen 3212 ago and has delivery status NaN +0ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm2zjUvbtntJfrpXz2ehpv9BhHRqjgziYWbbi3nKoGsEoh as there is no open stream to it available +1ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm9KWBLBBNMg7fvQNjYEkja8XgRLaT59AdvuYj3WH8DwTS as there is no open stream to it available +0ms
libp2p:gossipsub:score unexpected delivery: message from 16Uiu2HAmE3P9zMXa33mafrvKemkPfMLLwFc9wpVwMFCQErfx6FPa was first seen 3225 ago and has delivery status NaN +1ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm2zjUvbtntJfrpXz2ehpv9BhHRqjgziYWbbi3nKoGsEoh as there is no open stream to it available +0ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm9KWBLBBNMg7fvQNjYEkja8XgRLaT59AdvuYj3WH8DwTS as there is no open stream to it available +0ms
libp2p:gossipsub:score unexpected delivery: message from 16Uiu2HAmE3P9zMXa33mafrvKemkPfMLLwFc9wpVwMFCQErfx6FPa was first seen 3254 ago and has delivery status NaN +1ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm2zjUvbtntJfrpXz2ehpv9BhHRqjgziYWbbi3nKoGsEoh as there is no open stream to it available +1ms
libp2p:gossipsub Cannot send RPC to 16Uiu2HAm9KWBLBBNMg7fvQNjYEkja8XgRLaT59AdvuYj3WH8DwTS as there is no open stream to it available +0ms
```
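These repeated lines are easier to reason about in aggregate; a small hypothetical helper that tallies the two failure modes shown above:

```typescript
// Count the two gossipsub failure modes seen in the debug output above:
// RPCs dropped for lack of an open stream, and late/unexpected deliveries.
function tallyGossipErrors(lines: string[]): {noOpenStream: number; unexpectedDelivery: number} {
  let noOpenStream = 0;
  let unexpectedDelivery = 0;
  for (const line of lines) {
    if (line.includes("no open stream to it available")) noOpenStream++;
    if (line.includes("unexpected delivery")) unexpectedDelivery++;
  }
  return {noOpenStream, unexpectedDelivery};
}
```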
Gossipsub appears to be working after https://github.com/ChainSafe/js-libp2p-gossipsub/pull/298.
The last remaining issue is req/resp behavior. Comparing the logs of this PR against current stable:

This PR:
```
cat logs | grep method=ping
Jul-09 19:11:54.736[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Nimbus, peer=16...Cqkngc, requestId=9097, body=0
Jul-09 19:11:54.738[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Nimbus, peer=16...Cqkngc, requestId=9097
Jul-09 19:11:55.322[NETWORK] verbose: Req done method=ping, encoding=ssz_snappy, client=Nimbus, peer=16...Cqkngc, requestId=9097, numResponse=1
Jul-09 19:11:58.431[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...x9EBmM, requestId=9098
Jul-09 19:11:58.448[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...oFAqDy, requestId=9100
Jul-09 19:11:58.485[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...eoCHbq, requestId=9102
Jul-09 19:11:58.503[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...Yux7ye, requestId=9104
Jul-09 19:11:58.556[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...x9EBmM, requestId=9098, body=0
Jul-09 19:11:58.558[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...x9EBmM, requestId=9098
Jul-09 19:11:58.561[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Unknown, peer=16...x9EBmM, requestId=9098 code=REQUEST_ERROR_EMPTY_RESPONSE
Jul-09 19:11:58.586[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...oFAqDy, requestId=9100, body=0
Jul-09 19:11:58.587[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...oFAqDy, requestId=9100
Jul-09 19:11:58.590[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Unknown, peer=16...oFAqDy, requestId=9100 code=REQUEST_ERROR_EMPTY_RESPONSE
Jul-09 19:11:58.610[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...kreBey, requestId=9106
Jul-09 19:11:58.623[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...eoCHbq, requestId=9102, body=0
Jul-09 19:11:58.625[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...eoCHbq, requestId=9102
Jul-09 19:11:58.627[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Unknown, peer=16...eoCHbq, requestId=9102 code=REQUEST_ERROR_EMPTY_RESPONSE
Jul-09 19:11:58.668[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...Yux7ye, requestId=9104, body=0
Jul-09 19:11:58.671[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...Yux7ye, requestId=9104
Jul-09 19:11:58.674[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Unknown, peer=16...Yux7ye, requestId=9104 code=REQUEST_ERROR_EMPTY_RESPONSE
Jul-09 19:11:58.736[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...3C524j, requestId=9108
Jul-09 19:11:58.857[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...pAkwoA, requestId=9110
Jul-09 19:11:58.870[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...TUjGiF, requestId=9112
Jul-09 19:11:58.925[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...8GBh6W, requestId=9114
Jul-09 19:11:58.933[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...3C524j, requestId=9108, body=0
Jul-09 19:11:58.937[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...3C524j, requestId=9108
Jul-09 19:11:58.939[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Unknown, peer=16...3C524j, requestId=9108 code=REQUEST_ERROR_EMPTY_RESPONSE
Jul-09 19:11:58.947[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...kreBey, requestId=9106, body=0
Jul-09 19:11:58.950[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...kreBey, requestId=9106
Jul-09 19:11:58.959[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Unknown, peer=16...kreBey, requestId=9106 code=REQUEST_ERROR_EMPTY_RESPONSE
Jul-09 19:11:59.030[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...BFTf6V, requestId=9116
Jul-09 19:11:59.072[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...8GBh6W, requestId=9114, body=0
Jul-09 19:11:59.073[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...8GBh6W, requestId=9114
Jul-09 19:11:59.081[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...PV9yXS, requestId=9118
Jul-09 19:11:59.160[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...FQNgj9, requestId=9120
Jul-09 19:11:59.174[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...pAkwoA, requestId=9110, body=0
Jul-09 19:11:59.177[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...pAkwoA, requestId=9110
Jul-09 19:11:59.181[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Unknown, peer=16...pAkwoA, requestId=9110 code=REQUEST_ERROR_EMPTY_RESPONSE
Jul-09 19:11:59.189[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...TUjGiF, requestId=9112, body=0
Jul-09 19:11:59.191[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...TUjGiF, requestId=9112
Jul-09 19:11:59.193[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Unknown, peer=16...TUjGiF, requestId=9112 code=REQUEST_ERROR_EMPTY_RESPONSE
Jul-09 19:11:59.249[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Unknown, peer=16...vQ6bQN, requestId=9122
Jul-09 19:11:59.348[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...BFTf6V, requestId=9116, body=0
Jul-09 19:11:59.349[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...BFTf6V, requestId=9116
Jul-09 19:11:59.356[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Unknown, peer=16...BFTf6V, requestId=9116 code=REQUEST_ERROR_EMPTY_RESPONSE
Jul-09 19:11:59.362[NETWORK] verbose: Req done method=ping, encoding=ssz_snappy, client=Unknown, peer=16...8GBh6W, requestId=9114, numResponse=1
Jul-09 19:11:59.398[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...vQ6bQN, requestId=9122, body=0
Jul-09 19:11:59.400[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...vQ6bQN, requestId=9122
Jul-09 19:11:59.416[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...PV9yXS, requestId=9118, body=0
Jul-09 19:11:59.418[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...PV9yXS, requestId=9118
Jul-09 19:11:59.422[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Unknown, peer=16...PV9yXS, requestId=9118 code=REQUEST_ERROR_EMPTY_RESPONSE
Jul-09 19:11:59.525[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Unknown, peer=16...FQNgj9, requestId=9120, body=0
Jul-09 19:11:59.534[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Unknown, peer=16...FQNgj9, requestId=9120
Jul-09 19:11:59.539[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Unknown, peer=16...FQNgj9, requestId=9120 code=REQUEST_ERROR_EMPTY_RESPONSE
Jul-09 19:11:59.834[NETWORK] verbose: Req done method=ping, encoding=ssz_snappy, client=Unknown, peer=16...vQ6bQN, requestId=9122, numResponse=1
Jul-09 19:12:03.096[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Prysm, peer=16...AmMsUF, requestId=9126
Jul-09 19:12:03.242[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Prysm, peer=16...AmMsUF, requestId=9126, body=0
Jul-09 19:12:03.242[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Prysm, peer=16...AmMsUF, requestId=9126
Jul-09 19:12:03.385[NETWORK] verbose: Req done method=ping, encoding=ssz_snappy, client=Prysm, peer=16...AmMsUF, requestId=9126, numResponse=1
Jul-09 19:12:07.576[NETWORK] debug: Resp received request method=ping, client=Prysm, peer=16...nRwVeq, requestId=4994, body=1
Jul-09 19:12:07.578[NETWORK] verbose: Resp done method=ping, client=Prysm, peer=16...nRwVeq, requestId=4994
Jul-09 19:12:09.412[NETWORK] debug: Resp received request method=ping, client=Lighthouse, peer=16...HQRasZ, requestId=4995, body=1858
Jul-09 19:12:09.413[NETWORK] verbose: Resp done method=ping, client=Lighthouse, peer=16...HQRasZ, requestId=4995
Jul-09 19:12:13.119[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Nimbus, peer=16...poaCnX, requestId=9127
Jul-09 19:12:13.779[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Nimbus, peer=16...poaCnX, requestId=9127, body=0
Jul-09 19:12:13.782[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Nimbus, peer=16...poaCnX, requestId=9127
Jul-09 19:12:15.208[NETWORK] verbose: Req done method=ping, encoding=ssz_snappy, client=Nimbus, peer=16...poaCnX, requestId=9127, numResponse=1
Jul-09 19:12:17.387[NETWORK] debug: Resp received request method=ping, client=Prysm, peer=16...X1CM45, requestId=4997, body=1
Jul-09 19:12:17.540[NETWORK] verbose: Resp done method=ping, client=Prysm, peer=16...X1CM45, requestId=4997
Jul-09 19:12:18.188[NETWORK] debug: Resp received request method=ping, client=Prysm, peer=16...AmMsUF, requestId=4999, body=16
Jul-09 19:12:18.189[NETWORK] verbose: Resp done method=ping, client=Prysm, peer=16...AmMsUF, requestId=4999
Jul-09 19:12:20.391[NETWORK] debug: Resp received request method=ping, client=Unknown, peer=16...vQ6bQN, requestId=5000, body=535
Jul-09 19:12:20.394[NETWORK] verbose: Resp done method=ping, client=Unknown, peer=16...vQ6bQN, requestId=5000
```
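To compare the two runs beyond eyeballing, a hypothetical helper that counts completed requests versus errors by code over the grepped lines:

```typescript
// Summarize "Req done" vs "Req error" outcomes from grepped NETWORK log lines.
function summarizeReqResp(lines: string[]): {done: number; errors: Record<string, number>} {
  let done = 0;
  const errors: Record<string, number> = {};
  for (const line of lines) {
    if (line.includes("Req done")) done++;
    const m = line.match(/Req error .*code=(\S+)/);
    if (m) errors[m[1]] = (errors[m[1]] ?? 0) + 1;
  }
  return {done, errors};
}
```

Run over the excerpt above, this would show REQUEST_ERROR_EMPTY_RESPONSE dominating on this branch, while the stable excerpt below has no such errors.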
Current stable:
```
cat logs | grep method=ping
Jul-07 23:59:26.186[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...kZKAAL, requestId=617659, body=1
Jul-07 23:59:26.187[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...kZKAAL, requestId=617659
Jul-07 23:59:28.256[NETWORK] debug: Resp received request method=ping, client=Lighthouse, peer=16...tXTAHY, requestId=617660, body=1865
Jul-07 23:59:28.257[NETWORK] verbose: Resp done method=ping, client=Lighthouse, peer=16...tXTAHY, requestId=617660
Jul-07 23:59:29.667[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Nimbus, peer=16...b81hCx, requestId=223310
Jul-07 23:59:29.668[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Prysm, peer=16...TNFJH1, requestId=223311
Jul-07 23:59:29.763[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Nimbus, peer=16...b81hCx, requestId=223310, body=7
Jul-07 23:59:29.764[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Nimbus, peer=16...b81hCx, requestId=223310
Jul-07 23:59:29.826[NETWORK] verbose: Req done method=ping, encoding=ssz_snappy, client=Nimbus, peer=16...b81hCx, requestId=223310, numResponse=1
Jul-07 23:59:29.886[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Prysm, peer=16...TNFJH1, requestId=223311, body=7
Jul-07 23:59:29.887[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Prysm, peer=16...TNFJH1, requestId=223311
Jul-07 23:59:29.920[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...tRMb5C, requestId=617662, body=9
Jul-07 23:59:29.920[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...tRMb5C, requestId=617662
Jul-07 23:59:30.105[NETWORK] verbose: Req done method=ping, encoding=ssz_snappy, client=Prysm, peer=16...TNFJH1, requestId=223311, numResponse=1
Jul-07 23:59:30.233[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...Y4Sk7n, requestId=617663, body=6
Jul-07 23:59:30.234[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...Y4Sk7n, requestId=617663
Jul-07 23:59:30.788[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...RHxnGh, requestId=617664, body=1
Jul-07 23:59:30.789[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...RHxnGh, requestId=617664
Jul-07 23:59:31.574[NETWORK] debug: Resp received request method=ping, client=Prysm, peer=16...W4FzZB, requestId=617666, body=8
Jul-07 23:59:31.575[NETWORK] verbose: Resp done method=ping, client=Prysm, peer=16...W4FzZB, requestId=617666
Jul-07 23:59:31.843[NETWORK] debug: Resp received request method=ping, client=Prysm, peer=16...xxC1bz, requestId=617668, body=22
Jul-07 23:59:31.844[NETWORK] verbose: Resp done method=ping, client=Prysm, peer=16...xxC1bz, requestId=617668
Jul-07 23:59:35.007[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...ta9ahW, requestId=617669, body=45
Jul-07 23:59:35.009[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...ta9ahW, requestId=617669
Jul-07 23:59:36.229[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...kZKAAL, requestId=617670, body=1
Jul-07 23:59:36.233[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...kZKAAL, requestId=617670
Jul-07 23:59:39.668[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Prysm, peer=16...T5F4ms, requestId=223313
Jul-07 23:59:39.867[NETWORK] debug: Req sending request method=ping, encoding=ssz_snappy, client=Prysm, peer=16...T5F4ms, requestId=223313, body=7
Jul-07 23:59:39.868[NETWORK] debug: Req request sent method=ping, encoding=ssz_snappy, client=Prysm, peer=16...T5F4ms, requestId=223313
Jul-07 23:59:39.922[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...tRMb5C, requestId=617671, body=9
Jul-07 23:59:39.923[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...tRMb5C, requestId=617671
Jul-07 23:59:40.109[NETWORK] verbose: Req done method=ping, encoding=ssz_snappy, client=Prysm, peer=16...T5F4ms, requestId=223313, numResponse=1
Jul-07 23:59:40.231[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...Y4Sk7n, requestId=617672, body=6
Jul-07 23:59:40.232[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...Y4Sk7n, requestId=617672
Jul-07 23:59:41.673[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...RHxnGh, requestId=617673, body=1
Jul-07 23:59:41.678[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...RHxnGh, requestId=617673
Jul-07 23:59:44.960[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...ta9ahW, requestId=617674, body=45
Jul-07 23:59:44.961[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...ta9ahW, requestId=617674
```
Jul-07 23:59:46.185[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...kZKAAL, requestId=617675, body=1
Jul-07 23:59:46.186[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...kZKAAL, requestId=617675
Jul-07 23:59:48.252[NETWORK] debug: Resp received request method=ping, client=Lighthouse, peer=16...tXTAHY, requestId=617676, body=1865
Jul-07 23:59:48.253[NETWORK] verbose: Resp done method=ping, client=Lighthouse, peer=16...tXTAHY, requestId=617676
Jul-07 23:59:49.964[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...tRMb5C, requestId=617678, body=9
Jul-07 23:59:49.966[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...tRMb5C, requestId=617678
Jul-07 23:59:50.235[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...Y4Sk7n, requestId=617679, body=6
Jul-07 23:59:50.236[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...Y4Sk7n, requestId=617679
Jul-07 23:59:50.691[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...RHxnGh, requestId=617680, body=1
Jul-07 23:59:50.692[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...RHxnGh, requestId=617680
Jul-07 23:59:55.016[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...ta9ahW, requestId=617682, body=45
Jul-07 23:59:55.018[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...ta9ahW, requestId=617682
Jul-07 23:59:55.595[NETWORK] debug: Resp received request method=ping, client=Prysm, peer=16...KnuCCp, requestId=617684, body=191
Jul-07 23:59:55.595[NETWORK] verbose: Resp done method=ping, client=Prysm, peer=16...KnuCCp, requestId=617684
Jul-07 23:59:56.230[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...kZKAAL, requestId=617685, body=1
Jul-07 23:59:56.231[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...kZKAAL, requestId=617685
Jul-07 23:59:59.670[NETWORK] debug: Req dialing peer method=ping, encoding=ssz_snappy, client=Prysm, peer=16...sGmKgx, requestId=223317
Jul-07 23:59:59.920[NETWORK] debug: Resp received request method=ping, client=Teku, peer=16...tRMb5C, requestId=617686, body=9
Jul-07 23:59:59.921[NETWORK] verbose: Resp done method=ping, client=Teku, peer=16...tRMb5C, requestId=617686
I'm seeing a memory leak; not sure if it's related to this PR.
Also, the peer count seems off: the number of connected peers differs from the number of peers gossipsub is tracking.
Some more metrics
Searching for 'Too many' in the logs shows a large number of failed inbound streams. It turns out that most, if not all, of the errors look like this:
2022-07-12T11:09:11.012Z libp2p:upgrader outbound: incoming stream opened on /eth2/beacon_chain/req/goodbye/1/ssz_snappy
2022-07-12T11:09:11.013Z libp2p:upgrader:error Error: Too many incoming protocol streams
at file:///home/cayman/Code/lodestar/node_modules/libp2p/src/upgrader.ts:345:31
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
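The limit that produces this error is enforced per registered protocol in js-libp2p and can be raised when the handler is registered. A minimal sketch, assuming an already-created `libp2p` instance and js-libp2p's `handle` options; the option names (`maxInboundStreams`/`maxOutboundStreams`) and the values below are illustrative, not what Lodestar actually passes:

```typescript
// Sketch only: assumes `libp2p` was created with createLibp2p().
// The limit values are assumptions, not Lodestar's actual config.
await libp2p.handle(
  "/eth2/beacon_chain/req/goodbye/1/ssz_snappy",
  ({stream}) => {
    // handle the incoming goodbye request on `stream`
  },
  {
    // raise from the default to avoid "Too many incoming protocol streams"
    maxInboundStreams: 256,
    maxOutboundStreams: 256,
  }
);
```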
Looking at logs/metrics on contabo-8
Consistently low peer count (~20ish). Choppy metrics (metrics update timeout?)
This one is weird because we had already connected to this peer. Maybe it's nothing (the peer just turned off their node). Only seen with Prysm.
Jul-16 00:00:54.651[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Prysm, peer=16...6YeGaz, requestId=55618 code=REQUEST_ERROR_DIAL_ERROR, error=All promises were rejected
AggregateError: All promises were rejected
Error: REQUEST_ERROR_DIAL_ERROR
at file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:100:15
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:86:20)
at ReqResp.sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:156:22)
at ReqResp.ping (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:102:12)
at PeerManager.requestPing (file:///usr/src/lodestar/packages/beacon-node/src/network/peers/peerManager.ts:348:25)
Jul-16 00:00:54.652[NETWORK] verbose: Req error method=status, encoding=ssz_snappy, client=Prysm, peer=16...6YeGaz, requestId=55622 code=REQUEST_ERROR_DIAL_ERROR, error=All promises were rejected
AggregateError: All promises were rejected
Error: REQUEST_ERROR_DIAL_ERROR
at file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:100:15
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:86:20)
at ReqResp.sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:156:22)
at ReqResp.status (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:94:12)
at PeerManager.requestStatus (file:///usr/src/lodestar/packages/beacon-node/src/network/peers/peerManager.ts:360:27)
at async Promise.all (index 0)
at PeerManager.requestStatusMany (file:///usr/src/lodestar/packages/beacon-node/src/network/peers/peerManager.ts:369:7)
Several stream resets like this from Prysm:
Jul-16 00:01:05.864[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Prysm, peer=16...okdd6X, requestId=55631 stream reset
Error: stream reset
at Object.reset (file:///usr/src/lodestar/node_modules/@libp2p/mplex/src/stream.ts:140:27)
at MplexStreamMuxer._handleIncoming (file:///usr/src/lodestar/node_modules/@libp2p/mplex/src/mplex.ts:298:16)
at file:///usr/src/lodestar/node_modules/@libp2p/mplex/src/mplex.ts:191:20
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at MplexStreamMuxer.sink (file:///usr/src/lodestar/node_modules/@libp2p/mplex/src/mplex.ts:185:9)
Still a few 'Too many outgoing protocol streams' errors on status requests to a peer:
Jul-16 00:01:06.735[NETWORK] verbose: Req error method=status, encoding=ssz_snappy, client=Prysm, peer=16...Dkr5sw, requestId=55634 code=REQUEST_ERROR_DIAL_ERROR, error=Too many outgoing protocol streams
Error: Too many outgoing protocol streams
at ConnectionImpl.newStream [as _newStream] (file:///usr/src/lodestar/node_modules/libp2p/src/upgrader.ts:392:27)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at ConnectionImpl.newStream (file:///usr/src/lodestar/node_modules/@libp2p/connection/src/index.ts:121:20)
at EventTarget.dialProtocol (file:///usr/src/lodestar/node_modules/libp2p/src/libp2p.ts:421:12)
at file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:89:22
at withTimeout (file:///usr/src/lodestar/packages/utils/src/timeout.ts:19:12)
at sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:86:20)
at ReqResp.sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:156:22)
at ReqResp.status (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:94:12)
Error: REQUEST_ERROR_DIAL_ERROR
at file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:100:15
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:86:20)
at ReqResp.sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:156:22)
at ReqResp.status (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:94:12)
at PeerManager.requestStatus (file:///usr/src/lodestar/packages/beacon-node/src/network/peers/peerManager.ts:360:27)
at async Promise.all (index 1)
at PeerManager.requestStatusMany (file:///usr/src/lodestar/packages/beacon-node/src/network/peers/peerManager.ts:369:7)
Sometimes we're not getting ping or status responses at all:
Jul-16 00:01:39.887[NETWORK] verbose: Req error method=ping, encoding=ssz_snappy, client=Unknown, peer=16...5DzgaP, requestId=55655 code=REQUEST_ERROR_EMPTY_RESPONSE
Error: REQUEST_ERROR_EMPTY_RESPONSE
at file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/collectResponses.ts:20:13
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at runNextTicks (node:internal/process/task_queues:65:3)
at listOnTimeout (node:internal/timers:526:9)
at processTimers (node:internal/timers:500:7)
at withTimeout (file:///usr/src/lodestar/packages/utils/src/timeout.ts:19:12)
at sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:133:25)
at ReqResp.sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:156:22)
at ReqResp.ping (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:102:12)
Jul-16 00:01:39.888[NETWORK] verbose: Req error method=status, encoding=ssz_snappy, client=Unknown, peer=16...5DzgaP, requestId=55656 code=REQUEST_ERROR_EMPTY_RESPONSE
Error: REQUEST_ERROR_EMPTY_RESPONSE
at file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/collectResponses.ts:20:13
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at runNextTicks (node:internal/process/task_queues:65:3)
at listOnTimeout (node:internal/timers:526:9)
at processTimers (node:internal/timers:500:7)
at withTimeout (file:///usr/src/lodestar/packages/utils/src/timeout.ts:19:12)
at sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:133:25)
at ReqResp.sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:156:22)
at ReqResp.status (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:94:12)
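The `withTimeout` frames in these traces (`packages/utils/src/timeout.ts`) point at the request timing out while collecting responses. A hedged sketch of such a helper, assuming a simple `Promise.race` design; the real Lodestar implementation also wires an `AbortSignal`, which this minimal version omits:

```typescript
// Minimal sketch of a withTimeout helper: races the wrapped promise
// against a timer and rejects with Error("Timeout") if the timer wins.
async function withTimeout<T>(fn: () => Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("Timeout")), ms);
  });
  try {
    // Whichever settles first wins; the loser is ignored.
    return await Promise.race([fn(), timeout]);
  } finally {
    clearTimeout(timer); // always cancel the timer to free the handle
  }
}
```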
A similar thing happens with a goodbye:
Jul-16 00:05:57.605[NETWORK] verbose: Req error method=goodbye, encoding=ssz_snappy, client=Teku, peer=16...EEC8aU, requestId=55798 code=REQUEST_ERROR_EMPTY_RESPONSE
Error: REQUEST_ERROR_EMPTY_RESPONSE
at file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/collectResponses.ts:20:13
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at withTimeout (file:///usr/src/lodestar/packages/utils/src/timeout.ts:19:12)
at sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:133:25)
at ReqResp.sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:156:22)
at ReqResp.goodbye (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:98:5)
at PeerManager.goodbyeAndDisconnect (file:///usr/src/lodestar/packages/beacon-node/src/network/peers/peerManager.ts:604:7)
Jul-16 00:05:57.605[NETWORK] verbose: Failed to send goodbye peer=16...EEC8aU method=goodbye, encoding=ssz_snappy, peer=16...EEC8aU, code=REQUEST_ERROR_EMPTY_RESPONSE
Error: REQUEST_ERROR_EMPTY_RESPONSE
at sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:171:13)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at ReqResp.sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:156:22)
at ReqResp.goodbye (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:98:5)
at PeerManager.goodbyeAndDisconnect (file:///usr/src/lodestar/packages/beacon-node/src/network/peers/peerManager.ts:604:7)
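Note that a failed goodbye is logged at verbose and is not fatal: the peer still gets disconnected. A hedged sketch of that best-effort pattern, with illustrative names rather than `PeerManager`'s actual signatures:

```typescript
// Best-effort goodbye: the goodbye request may fail (as in the log
// above), but the peer must be disconnected regardless, so the error
// is logged and swallowed. Names here are illustrative.
async function goodbyeAndDisconnect(
  sendGoodbye: () => Promise<void>,
  disconnect: () => Promise<void>,
  log: (msg: string) => void
): Promise<void> {
  try {
    await sendGoodbye();
  } catch (e) {
    // A failed goodbye must not prevent the disconnect
    log(`Failed to send goodbye: ${(e as Error).message}`);
  }
  await disconnect();
}
```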
And this one is strange: we've already made a connection, then fail while trying to update metadata:
Jul-16 00:08:29.373[NETWORK] verbose: Req error method=metadata, encoding=ssz_snappy, client=Teku, peer=16...ugsGLm, requestId=55953 code=REQUEST_ERROR_DIAL_ERROR, error=stream ended before 1 bytes became available
Error: stream ended before 1 bytes became available
at file:///usr/src/lodestar/node_modules/it-reader/src/index.ts:44:9
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at runNextTicks (node:internal/process/task_queues:65:3)
at listOnTimeout (node:internal/timers:526:9)
at processTimers (node:internal/timers:500:7)
at Object.next (file:///usr/src/lodestar/node_modules/@libp2p/multistream-select/src/multistream.ts:47:23)
at abortable (file:///usr/src/lodestar/node_modules/abortable-iterator/src/index.ts:42:18)
at decoder (file:///usr/src/lodestar/node_modules/it-length-prefixed/src/decode.ts:107:22)
at first (/usr/src/lodestar/node_modules/it-first/index.js:11:20)
Error: REQUEST_ERROR_DIAL_ERROR
at file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:100:15
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at runNextTicks (node:internal/process/task_queues:65:3)
at listOnTimeout (node:internal/timers:526:9)
at processTimers (node:internal/timers:500:7)
at sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/request/index.ts:86:20)
at ReqResp.sendRequest (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:156:22)
at ReqResp.metadata (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/reqResp.ts:108:12)
at PeerManager.requestMetadata (file:///usr/src/lodestar/packages/beacon-node/src/network/peers/peerManager.ts:340:29)
A few of these errors, only with Lighthouse:
Jul-16 00:09:41.549[NETWORK] verbose: Resp error method=ping, client=Lighthouse, peer=16...MLTEPx, requestId=42983 Timeout
Error: Timeout
at timeoutPromise (file:///usr/src/lodestar/packages/utils/src/timeout.ts:15:11)
at runMicrotasks (<anonymous>)
at runNextTicks (node:internal/process/task_queues:61:5)
at processTimers (node:internal/timers:497:9)
at withTimeout (file:///usr/src/lodestar/packages/utils/src/timeout.ts:19:12)
at requestHandlerSource (file:///usr/src/lodestar/packages/beacon-node/src/network/reqresp/response/index.ts:58:29)
at file:///usr/src/lodestar/node_modules/it-handshake/src/index.ts:29:5
at abortable (file:///usr/src/lodestar/node_modules/abortable-iterator/src/index.ts:42:18)
at Object.sink (file:///usr/src/lodestar/node_modules/@libp2p/mplex/src/stream.ts:164:26)
Here's a snippet of a CPU profile of a running node:

| Self | Total | Function | Library |
|---|---|---|---|
| 35.32 % | 35.32 % | `concat` concat.js:1 | uint8arrays |
| 35.32 % | 35.32 % | `slice` index.js:121 | uint8arraylist |
| 9.75 % | 9.75 % | `ReadHandlers` decode.js:11 | it-length-prefixed |
| 8.49 % | 8.49 % | `decoder` decode.js:59 | it-length-prefixed |
| 8.39 % | 8.39 % | (anonymous) index.js:28 | it-handshake |
| 4.29 % | 4.29 % | `write` decode.js:8 | mplex |
| 4.29 % | 4.29 % | `bytesDecode` bytes.js:13 | protons-runtime |
35% of run time is dedicated to concatenating uint8arrays :(
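An illustrative sketch (not the actual it-length-prefixed or uint8arraylist code) of why concatenation dominates the profile: naively concatenating on every incoming chunk re-copies everything received so far, so accumulating n chunks costs O(n²) byte copies, whereas keeping a chunk list and concatenating once at the end copies each byte exactly once. The `copies` counters below just tally bytes written.

```typescript
// Quadratic: every chunk triggers a full re-copy of the accumulated buffer.
function concatEveryChunk(chunks: Uint8Array[]): {out: Uint8Array; copies: number} {
  let out = new Uint8Array(0);
  let copies = 0;
  for (const chunk of chunks) {
    const next = new Uint8Array(out.length + chunk.length);
    next.set(out); // re-copies everything received so far
    next.set(chunk, out.length);
    copies += next.length;
    out = next;
  }
  return {out, copies};
}

// Linear: collect chunks, allocate once, copy each byte exactly once.
function concatOnce(chunks: Uint8Array[]): {out: Uint8Array; copies: number} {
  const total = chunks.reduce((n, c) => n + c.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return {out, copies: total};
}

const chunks = Array.from({length: 100}, () => new Uint8Array(1024));
console.log(concatEveryChunk(chunks).copies); // 5171200 bytes copied
console.log(concatOnce(chunks).copies); // 102400 bytes copied
```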
Current status:
- Testing against the latest version of libp2p
- Local sanity checks passing (run the node, wait for peers)
- Deployed this branch to the feat1 group :crossed_fingers:
- Will fix unit tests and merge conflicts now
Enabled libp2p metrics and attached a new dashboard exposing the metrics
Thanks for sharing @wemeetagain. No expectation to respond, as I don't want to create extra work, but if there's a place you can point to, I'd be interested in knowing what the remaining work is and whether anything is needed from js-libp2p at this point.
I don't think there's anything remaining on js-libp2p's side that we need.
Performance is now mostly on par with, or better than, our unstable and stable branches. We redeployed 67da674 across a group in our fleet (roughly 24 hrs ago) and are monitoring metrics to iron out the last remaining issues.
The big thing we're working through:
- Metrics show external memory (e.g. ArrayBuffer/Buffer memory, native-code memory, etc.) growing like a memory leak, but occasionally and inconsistently dropping back to normal levels. We're still figuring out the root cause. (I'm hoping https://github.com/ChainSafe/js-libp2p-gossipsub/pull/325 will fix this, but we'll see.)
cc @achingbrain we finally did it buddy
Some different gossipsub metrics I notice after 6h-7h (in a 12h timeframe):
- Heartbeat is way faster
- Score functions are called less
- We send out far fewer RPC control messages
- We receive more RPC PRUNE messages ===> need to investigate!
- More gossip attestation messages are dropped ==> something to watch
- Besides, epoch transition is faster ==> is this a coincidence?
- SubmitPoolAttestations request/response time is higher ===> this is a concern, as we already have issue #4600 after 1.1.0