aerospike-kubernetes-operator

Panic during AerospikeCluster upgrade

Open noam-ma-ma opened this issue 1 year ago • 5 comments

Issue Title:

AerospikeOperator panic: runtime error: invalid memory address or nil pointer dereference during server upgrade from 6.2 to 6.4.

Issue Description:

During the upgrade of my Aerospike cluster from 6.2 to 6.4, the Aerospike operator panicked 5 times:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x48 pc=0x187c1a0]
goroutine 632 [running]:
github.com/aerospike/aerospike-client-go/v6.(*Connection).IsConnected(0x1ed34fb?)
    /go/pkg/mod/github.com/aerospike/aerospike-client-go/[email protected]/connection.go:338
github.com/aerospike/aerospike-management-lib/info.(*AsInfo).doInfo(0xc0000ac230, {0xc000c3a090, 0x1, 0x1})
    /go/pkg/mod/github.com/aerospike/[email protected]/info/as_parser.go:232 +0x109
github.com/aerospike/aerospike-management-lib/info.(*AsInfo).RequestInfo(0x23c2060?, {0xc000c3a090, 0x1, 0x1})
    /go/pkg/mod/github.com/aerospike/[email protected]/info/as_parser.go:194 +0x93
github.com/aerospike/aerospike-management-lib/deployment.(*cluster).infoCmd(0xc0010fef68?, {0xc000ff32d0?, 0xc000c7bd58?}, {0x1e700c9, 0xa})
    /go/pkg/mod/github.com/aerospike/[email protected]/deployment/cluster.go:760 +0xd7
github.com/aerospike/aerospike-management-lib/deployment.(*cluster).infoOnHosts.func1({0xc000ff32d0, 0x10}, 0xc00015fe90?)
    /go/pkg/mod/github.com/aerospike/[email protected]/deployment/cluster.go:787 +0x85
created by github.com/aerospike/aerospike-management-lib/deployment.(*cluster).infoOnHosts
    /go/pkg/mod/github.com/aerospike/[email protected]/deployment/cluster.go:784 +0x2d2
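
For readers unfamiliar with this class of failure: the top frame of the trace is a method call on a pointer receiver, and the fault address is a small offset from zero, which is consistent with a field being read through a nil pointer. The following is only a minimal sketch of that failure mode and of a nil guard. The `conn` type and the `isConnectedSafe` and `tryUnsafe` helpers are hypothetical and are not taken from aerospike-client-go or aerospike-management-lib; the actual fix in the operator may look quite different.

```go
package main

import "fmt"

// conn is a hypothetical stand-in for a client connection.
// It is NOT the aerospike-client-go Connection type.
type conn struct {
	open bool
}

// isConnected reads a field through the receiver, so calling it on a
// nil *conn triggers a nil pointer dereference panic.
func (c *conn) isConnected() bool {
	return c.open
}

// isConnectedSafe treats a nil connection as "not connected" instead of
// dereferencing it.
func isConnectedSafe(c *conn) bool {
	return c != nil && c.isConnected()
}

// tryUnsafe calls isConnected without a guard and converts any panic
// into an error so the caller can observe it without crashing.
func tryUnsafe(c *conn) (ok bool, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return c.isConnected(), nil
}

func main() {
	var c *conn // nil, e.g. a connection that was never established

	_, err := tryUnsafe(c)
	fmt.Println("unguarded call on nil:", err)
	fmt.Println("guarded call on nil:", isConnectedSafe(c))
	fmt.Println("guarded call on open conn:", isConnectedSafe(&conn{open: true}))
}
```

Without a `recover` somewhere in the goroutine that hits the nil pointer, a panic like this terminates the whole Go process, which would match the operator pod restarting each time.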

Expected Behavior:

No panic of the operator during upgrade

Steps to Reproduce (if applicable):

  1. Change the Aerospike server version in the AerospikeCluster CR from 6.2 to 6.4
  2. Apply the updated CR

Environment Details (if applicable):

  • Cloud: AWS
  • Aerospike Operator Version: 3.2.1
  • EKS Version: 1.25

noam-ma-ma, Feb 14 '24 08:02

Hi @noam-ma-ma, thanks for raising the issue. We are looking into it and will try to reproduce it.

abhishekdwivedi3060, Feb 14 '24 11:02

This issue seems intermittent. We'll fix this in the upcoming release.

abhishekdwivedi3060, Feb 15 '24 06:02

Hi @abhishekdwivedi3060. Do you have a schedule for when the fix will arrive? Our operator is stuck in a crash loop due to the nil pointer dereference.

mjoehl, Mar 18 '24 09:03

Hey @mjoehl,

We released AKO 3.2.2 three days ago, which includes the fix for this issue. You can upgrade your operator and give it a try.

tanmayja, Mar 18 '24 11:03

Hi @tanmayja, I saw the new release, but maybe I overlooked the note under "What's Changed" regarding this issue. I'm pretty sure it was not listed this morning. Anyway, I have installed the new version now and we will test the bugfix.

Thanks for the update.

mjoehl, Mar 18 '24 12:03