cluster-api-provider-aws Bad protocol name in `cniIngressRules` causes panic

/kind bug

What steps did you take and what happened:

I set protocol to uppercase "UDP" in cniIngressRules and it caused a panic.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSCluster
metadata:
  name: k3-test-23
spec:
  network:
    cni:
      cniIngressRules:
      - description: flannel
        fromPort: 8472
        protocol: UDP
        toPort: 8472

E1214 04:40:48.632444       1 runtime.go:79] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 504 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x237b460?, 0x424b830})
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:75 +0x99
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc001255f80?})
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:49 +0x75
panic({0x237b460, 0x424b830})
	/usr/local/go/src/runtime/panic.go:838 +0x207
sigs.k8s.io/cluster-api-provider-aws/v2/pkg/cloud/services/securitygroup.ingressRuleToSDKType(0xc000e9ad38)
	/workspace/pkg/cloud/services/securitygroup/securitygroups.go:683 +0x485
sigs.k8s.io/cluster-api-provider-aws/v2/pkg/cloud/services/securitygroup.(*Service).authorizeSecurityGroupIngressRules(0xc0014f2400, {0xc001e96180, 0x14}, {0xc0000005a0?, 0x4, 0x4})
	/workspace/pkg/cloud/services/securitygroup/securitygroups.go:398 +0x165
sigs.k8s.io/cluster-api-provider-aws/v2/pkg/cloud/services/securitygroup.(*Service).ReconcileSecurityGroups.func3()
	/workspace/pkg/cloud/services/securitygroup/securitygroups.go:176 +0x31
sigs.k8s.io/cluster-api-provider-aws/v2/pkg/cloud/services/wait.WaitForWithRetryable.func1()
	/workspace/pkg/cloud/services/wait/wait.go:58 +0x6d
k8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1({0x24326a0, 0x901})
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:220 +0x1b
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext({0x2cbb678?, 0xc000116008?}, 0x70?)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:233 +0x57
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection(0x253f620?)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:226 +0x39
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff({0x3b9aca00, 0x3ffb5c28f5c28f5c, 0x3fd999999999999a, 0xa, 0x0}, 0x40d987?)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:421 +0x5f
sigs.k8s.io/cluster-api-provider-aws/v2/pkg/cloud/services/wait.WaitForWithRetryable({0x3b9aca00, 0x3ffb5c28f5c28f5c, 0x3fd999999999999a, 0xa, 0x0}, 0xc000342af0, {0xc0018955c0, 0x1, 0x1})
	/workspace/pkg/cloud/services/wait/wait.go:54 +0x125
sigs.k8s.io/cluster-api-provider-aws/v2/pkg/cloud/services/securitygroup.(*Service).ReconcileSecurityGroups(0xc0014f2400)
	/workspace/pkg/cloud/services/securitygroup/securitygroups.go:175 +0x1365

What did you expect to happen:

to not panic :-)

Anything else you would like to add:

The protocol constants in the code are all lowercase, e.g.:

	// SecurityGroupProtocolUDP represents the UDP protocol in ingress rules.
	SecurityGroupProtocolUDP = SecurityGroupProtocol("udp")

I thus tried to change it to lowercase and it worked (it did not panic and it added the rule in the SG):

apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSCluster
metadata:
  name: k3-test-23
spec:
  network:
    cni:
      cniIngressRules:
      - description: flannel
        fromPort: 8472
        protocol: udp
        toPort: 8472
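
Since the lowercase value works and the uppercase one does not, the comparison is apparently case-sensitive. Below is a minimal sketch of how a protocol string could be normalized and validated before it is used. It is not the provider's actual code: `normalizeProtocol` is a hypothetical helper, the local `SecurityGroupProtocol` type only mirrors the constant quoted above, and the `tcp` constant is an assumption added for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// SecurityGroupProtocol is a local, illustrative copy of the string type
// quoted in the issue; it is not imported from the provider.
type SecurityGroupProtocol string

const (
	// SecurityGroupProtocolUDP matches the constant quoted in the issue.
	SecurityGroupProtocolUDP = SecurityGroupProtocol("udp")
	// SecurityGroupProtocolTCP is assumed here for illustration.
	SecurityGroupProtocolTCP = SecurityGroupProtocol("tcp")
)

// normalizeProtocol (hypothetical) lowercases the input and rejects
// anything that is not one of the constants known to this sketch.
func normalizeProtocol(raw string) (SecurityGroupProtocol, error) {
	p := SecurityGroupProtocol(strings.ToLower(strings.TrimSpace(raw)))
	switch p {
	case SecurityGroupProtocolTCP, SecurityGroupProtocolUDP:
		return p, nil
	default:
		return "", fmt.Errorf("invalid protocol %q", raw)
	}
}

func main() {
	for _, raw := range []string{"UDP", "udp", "sctp"} {
		p, err := normalizeProtocol(raw)
		fmt.Printf("%q -> %q, err=%v\n", raw, p, err)
	}
}
```

Returning an error for an unknown value, rather than silently passing it on, lets the caller report the bad field instead of crashing later.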

Environment:

Cluster-api-provider-aws version:
Kubernetes version: (use kubectl version):
OS (e.g. from /etc/os-release):

Dec 14 '22 04:12 mkmik

/triage accepted

@mkmik Hello. Which version are you running? Did you see this error log line:

		scope.Error(fmt.Errorf("invalid protocol '%s'", i.Protocol), "invalid protocol for security group", "protocol", i.Protocol)

Dec 14 '22 08:12 Skarlso

@Skarlso sorry I forgot to mention: v2.0.2

Dec 14 '22 08:12 mkmik

Ok, so, did you see that log line? :)

Dec 14 '22 08:12 Skarlso

Did you see this error log line:

no; I only saw the panic. Also, the panic says /workspace/pkg/cloud/services/securitygroup/securitygroups.go:683, but that doesn't seem like a reasonable line of code for v2.0.2

Dec 14 '22 08:12 mkmik

image: registry.k8s.io/cluster-api-aws/cluster-api-aws-controller:v2.0.2

Dec 14 '22 08:12 mkmik

Did you see this error log line:

no; I only saw the panic. Also, the panic says /workspace/pkg/cloud/services/securitygroup/securitygroups.go:683, but that doesn't seem like a reasonable line of code for v2.0.2

Yep, it's pretty interesting.

Dec 14 '22 08:12 Skarlso

Ah, it's this line:

	res.UserIdGroupPairs = append(res.UserIdGroupPairs, userIDGroupPair)

I didn't check the 2.0.2 tag. :)
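
One plausible reading of the trace, not confirmed in this thread, is that the result pointer was never assigned because no protocol case matched, so the later append to one of its fields dereferenced nil. The sketch below reproduces that shape with hypothetical stand-ins (`permission`, `toPermission`, `addGroupPair`); none of these names come from the provider or the AWS SDK.

```go
package main

import (
	"errors"
	"fmt"
)

// permission is a hypothetical stand-in for the SDK permission type.
type permission struct {
	Protocol   string
	GroupPairs []string
}

var errUnknownProtocol = errors.New("unknown protocol")

// toPermission assigns a result only for protocols it recognizes.
// For any other value, including "UDP" in uppercase, res stays nil.
func toPermission(protocol string) *permission {
	var res *permission
	switch protocol {
	case "tcp", "udp":
		res = &permission{Protocol: protocol}
	}
	return res
}

// addGroupPair checks for nil before touching a field. Without this
// check, res.GroupPairs on a nil pointer panics at runtime.
func addGroupPair(res *permission, pair string) error {
	if res == nil {
		return errUnknownProtocol
	}
	res.GroupPairs = append(res.GroupPairs, pair)
	return nil
}

func main() {
	fmt.Println(addGroupPair(toPermission("udp"), "sg-example"))
	fmt.Println(addGroupPair(toPermission("UDP"), "sg-example"))
}
```

With the guard in place, the uppercase value produces an error the caller can log instead of a nil pointer dereference.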

Dec 14 '22 08:12 Skarlso

@mkmik Can you please try main?

Dec 14 '22 12:12 Skarlso

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

Confirm that this issue is still relevant with /triage accepted (org members only)
Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

Jan 19 '24 20:01 k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Apr 18 '24 20:04 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

May 18 '24 20:05 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen
Mark this issue as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Jun 17 '24 21:06 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Jun 17 '24 21:06 k8s-ci-robot
