gocql
Re-create connection picker on shard count change
We have seen the following panic:
panic: scylla: 10.127.248.9:9042 invalid number of shards
goroutine 43250910 [running]:
github.com/gocql/gocql.(*scyllaConnPicker).Put(0xc009dde630, 0xc06f5dd520)
/go/pkg/mod/github.com/kiwicom/[email protected]/scylla.go:340 +0x42e
github.com/gocql/gocql.(*hostConnPool).connect(0xc072c16980)
/go/pkg/mod/github.com/kiwicom/[email protected]/connectionpool.go:539 +0x2f0
github.com/gocql/gocql.(*hostConnPool).fill(0xc072c16980)
/go/pkg/mod/github.com/kiwicom/[email protected]/connectionpool.go:390 +0x17c
github.com/gocql/gocql.(*policyConnPool).addHost(0xc000b574a0, 0xc0acaa0d00)
/go/pkg/mod/github.com/kiwicom/[email protected]/connectionpool.go:238 +0x10f
github.com/gocql/gocql.(*Session).startPoolFill(0xc000713000, 0xc0acaa0d00)
/go/pkg/mod/github.com/kiwicom/[email protected]/events.go:277 +0x2d
github.com/gocql/gocql.(*Session).addNewNode(0xc000713000, {0xc07b470c80, 0x4, 0x4}, 0xc000a91ea8)
/go/pkg/mod/github.com/kiwicom/[email protected]/events.go:202 +0xe7
github.com/gocql/gocql.(*Session).handleNewNode(0xc000713000, {0xc07b470c80, 0xc0ac04c350, 0xc}, 0x6)
/go/pkg/mod/github.com/kiwicom/[email protected]/events.go:224 +0x99
github.com/gocql/gocql.(*Session).handleNodeEvent(0x100000000000000, {0xc0b47de000, 0x2, 0xc0abc41b38})
/go/pkg/mod/github.com/kiwicom/[email protected]/events.go:169 +0x1b3
created by github.com/gocql/gocql.(*eventDebouncer).flush
/go/pkg/mod/github.com/kiwicom/[email protected]/events.go:67 +0xb5
This happened when we replaced a server node with a new one that had a different CPU core count but the same IP address as the old node.
I haven't tried compiling or running this code yet. I'd like to start discussion about possible solutions.
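One possible direction (a minimal sketch only — the types and fields below are simplified stand-ins, not the actual `scyllaConnPicker` internals): instead of panicking when a new connection reports a shard count that differs from the picker's, the pool could detect the mismatch, drop the stale per-shard state, and re-create it for the new count, letting the pool refill naturally.

```go
package main

import "fmt"

// conn is a simplified stand-in for a gocql connection that has
// learned the node's shard count during the handshake.
type conn struct {
	nrShards int
}

// shardPicker is a simplified stand-in for scyllaConnPicker: it keeps
// one connection slot per shard.
type shardPicker struct {
	nrShards int
	conns    []*conn
}

func newShardPicker(nrShards int) *shardPicker {
	return &shardPicker{nrShards: nrShards, conns: make([]*conn, nrShards)}
}

// Put adds a connection. Instead of panicking on a shard count change
// (e.g. the node at the same IP was replaced by hardware with a
// different CPU core count), it resets the picker for the new count.
// Existing connections are simply discarded here; a real
// implementation would also close them.
func (p *shardPicker) Put(c *conn) {
	if c.nrShards != p.nrShards {
		p.nrShards = c.nrShards
		p.conns = make([]*conn, c.nrShards)
	}
	// Place the connection in a free slot (the real picker assigns
	// connections to the specific shard they landed on).
	for i, slot := range p.conns {
		if slot == nil {
			p.conns[i] = c
			break
		}
	}
}

func main() {
	p := newShardPicker(8)
	p.Put(&conn{nrShards: 8})
	// Node replaced: same IP, new machine with 4 cores/shards.
	p.Put(&conn{nrShards: 4})
	fmt.Println(p.nrShards, len(p.conns)) // picker re-created, no panic
}
```

This trades the panic for a transient drop of pooled connections to that host, which seems acceptable given the node was physically replaced anyway.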
I'm not sure if that would solve the problem.
@mmatczuk Why? Could you please elaborate on what issues you see with this code?
Do you have some alternatives in mind?
@martin-sucha Shouldn't this work by having the driver refresh the node's metadata? To me it looks like the metadata refresh is not working. (I filed a similar bug about down or joining nodes that don't get properly marked in the client's topology, so the driver tries to connect to them while they are down.)