rolling-shutter
Fix flaky `p2p.TestStartNetworkNodeIntegration` test
`TestStartNetworkNodeIntegration` sometimes fails with a slice-bounds-out-of-range panic in the gossipsub router:
=== FAIL: p2p TestStartNetworkNodeIntegration (unknown)
INF [ p2p.go:146] dropping message, not subscribed to topic topic=testTopic1
INF [ p2p.go:169] created libp2p host address=/ip4/127.0.0.1/tcp/2001/p2p/12D3KooWFhf64KBUDXZozUmX5yyzexGLqcfgSdUX37GiWEkh9LjW
INF [ p2p.go:169] created libp2p host address=/ip4/127.0.0.1/tcp/2000/p2p/12D3KooWFG7sWvyzsovbUbqPySMAR7UoohJZSMwwX6oMkwXHkNam
INF [ p2p.go:169] created libp2p host address=/ip4/127.0.0.1/tcp/2002/p2p/12D3KooWMr6tcH2mFL3GgGUmF2296GiYRabTK49Pk8B6n8vqwN6W
INF [ p2p.go:169] created libp2p host address=/ip4/127.0.0.1/tcp/2003/p2p/12D3KooWDPtr6wENaiD2cxJ9wnDDX9Be79oj5JST2bdu4r6Vw3bf
ERR [ bootstrap.go:43] couldn't connect to boostrap node error="failed to find peers: failed to find any peer in table" peer="{12D3KooWMr6tcH2mFL3GgGUmF2296GiYRabTK49Pk8B6n8vqwN6W: [/ip4/127.0.0.1/tcp/2002]}"
ERR [ bootstrap.go:43] couldn't connect to boostrap node error="failed to find peers: failed to find any peer in table" peer="{12D3KooWDPtr6wENaiD2cxJ9wnDDX9Be79oj5JST2bdu4r6Vw3bf: [/ip4/127.0.0.1/tcp/2003]}"
DBG [ bootstrap.go:88] called retriable function error="could not connect to any bootstrap node" count=1 duration=27.093565 funcName=]
ERR [ bootstrap.go:43] couldn't connect to boostrap node error="failed to find peers: failed to find any peer in table" peer="{12D3KooWDPtr6wENaiD2cxJ9wnDDX9Be79oj5JST2bdu4r6Vw3bf: [/ip4/127.0.0.1/tcp/2003]}"
DBG [ bootstrap.go:77] called retriable function error="could not connect to any bootstrap node" count=1 duration=22.686801 funcName=]
ERR [ bootstrap.go:43] couldn't connect to boostrap node error="failed to find peers: failed to find any peer in table" peer="{12D3KooWDPtr6wENaiD2cxJ9wnDDX9Be79oj5JST2bdu4r6Vw3bf: [/ip4/127.0.0.1/tcp/2003]}"
panic: runtime error: slice bounds out of range [4:1]
goroutine 17365 [running]:
github.com/libp2p/go-libp2p-pubsub.(*GossipSubRouter).heartbeat(0xc0031e4960)
/Users/ezdac/.asdf/installs/golang/1.20.1/packages/pkg/mod/github.com/libp2p/[email protected]/gossipsub.go:1441 +0x2c0a
github.com/libp2p/go-libp2p-pubsub.(*PubSub).processLoop(0xc004dc7440, {0x101b645b8, 0xc003ac96d0})
/Users/ezdac/.asdf/installs/golang/1.20.1/packages/pkg/mod/github.com/libp2p/[email protected]/pubsub.go:651 +0x113b
created by github.com/libp2p/go-libp2p-pubsub.NewPubSub
/Users/ezdac/.asdf/installs/golang/1.20.1/packages/pkg/mod/github.com/libp2p/[email protected]/pubsub.go:334 +0x1bce
The slice in question holds the connected mesh peers, ordered by their scores:
// We keep the first D_score peers by score and the remaining up to D randomly
// under the constraint that we keep D_out peers in the mesh (if we have that many)
shufflePeers(plst[gs.params.Dscore:])
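The bounds `[4:1]` in the panic are what Go reports when a one-element slice is resliced from index 4. This would fit `plst` holding a single peer while `Dscore` is 4, which I believe is the gossipsub default, though I have not checked it against the test's parameters. A minimal sketch of the failure mode and a possible guard, where `unsafeTail` and `safeTail` are hypothetical helpers for illustration and not part of go-libp2p-pubsub:

```go
package main

import "fmt"

// unsafeTail mimics the expression plst[gs.params.Dscore:] from the heartbeat
// and converts the resulting runtime panic into an error for inspection.
// Hypothetical helper, not a go-libp2p-pubsub function.
func unsafeTail(plst []string, dscore int) (tail []string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	return plst[dscore:], nil
}

// safeTail returns the peers after the first dscore entries, or nil when
// there are not enough peers to slice. Also a hypothetical helper.
func safeTail(plst []string, dscore int) []string {
	if dscore < 0 || dscore > len(plst) {
		return nil
	}
	return plst[dscore:]
}

func main() {
	peers := []string{"peerA"} // a single connected peer (assumed scenario)
	dscore := 4                // assumed value of Dscore

	_, err := unsafeTail(peers, dscore)
	fmt.Println("unguarded:", err)

	fmt.Println("guarded tail length:", len(safeTail(peers, dscore)))
}
```

If this reading is right, the panic would point to mesh parameters that are inconsistent with each other (for example a lowered `D`/`Dhi` with `Dscore` left unchanged) rather than to the failed bootstrap connections themselves, but that still needs to be confirmed.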
This could be related to the bootstrap nodes, since the log shows connection failures.
Investigate whether this is:
- a bug in the integration test setup
- a gossipsub bug
- a misconfiguration of the p2p network unrelated to the test, or a problem with peer bootstrapping
469d405262dda02d6cc4896ce803c71279 prevents the panic. I'm not 100% sure this is a proper fix, so feel free to have a look at it.
This seems to be fixed.