NLB for apiserver port 8443 is unreachable
/kind bug
1. What kops version are you running? The command kops version will display this information.
It's actually a build from the master branch since I was testing a fix for another issue that I had previously encountered (that is now fixed).
It was built like so:
$ go version
go version go1.21.3 linux/amd64
$ export S3_BUCKET=hetest-kops
$ export VERSION=1.28.0-dev.1
$ make kops-install VERSION=$VERSION
$ make upload S3_BUCKET=s3://$S3_BUCKET VERSION=$VERSION
And results in
$ kops version
Client version: 1.28.0-dev.1 (git-v1.29.0-alpha.1-139-gab5b8a873a)
2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.
$ kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.2
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
I created a cluster from a manifest in the way we usually create clusters, this time using this custom kops build to deploy a Kubernetes 1.28 cluster.
5. What happened after the commands executed?
I couldn't talk to the API server externally.
6. What did you expect to happen?
Be able to use the cluster.
7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.
I'll just provide the part that probably matters.
We use a custom SSL certificate for the API server with a name of api.$CLUSTER_NAME.
It is provided in the cluster spec like so:
spec:
  api:
    loadBalancer:
      class: Network
      sslCertificate: arn:aws:acm:$AWS_REGION:$AWS_ACCOUNT:certificate/REDACTED
      sslPolicy: ELBSecurityPolicy-TLS13-1-3-2021-06
      type: Internal
When using a custom SSL certificate, rules are created in the NLB for port 8443 in addition to 443, and kops export kubecfg --admin --name $CLUSTER created entries in .kube/config referring to port 8443.
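To illustrate, this is roughly how the exported kubeconfig endpoint can be checked; the hostname and port shown are illustrative, based on the behaviour described above:

# Export an admin kubeconfig, then inspect the server URL it wrote.
$ kops export kubecfg --admin --name $CLUSTER_NAME
$ kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
https://api.$CLUSTER_NAME:8443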
8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.
9. Anything else we need to know?
The NLB for the new cluster has a security group attached, which is a relatively new feature from AWS.
Examining it, it had rules for port 443 but none for port 8443.
Editing the security group and duplicating the 443 rules for port 8443 fixes my problem.
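For reference, the manual workaround can be approximated with the AWS CLI along these lines; the security group ID and source CIDR below are placeholders, and the real rules should mirror whatever the existing 443 rules allow:

# Repeat for each source range present on the existing 443 rules.
$ aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 8443 \
    --cidr 10.0.0.0/8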
The original PR which breaks this: https://github.com/kubernetes/kops/pull/15993
We should add 8443 rules to the NLB security group as well (if needed).
This should probably be very similar to https://github.com/kubernetes/kops/pull/16006.
@hakman Can I take this up?
Should the addition of the 8443 port to the security group rule be conditional, or should we do it by default along with 443?
@karanrn Sure, it is conditional, same condition as for adding the port to the NLB.
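As a rough way to observe that condition on a live cluster, the NLB listeners can be listed; with a custom sslCertificate configured, both ports should appear (the ARN variable and output below are illustrative):

# $NLB_ARN is a placeholder for the API server NLB's ARN.
$ aws elbv2 describe-listeners --load-balancer-arn $NLB_ARN --query 'Listeners[].Port'
[
    443,
    8443
]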
@jim-barber-he I have a fix, but I wanted to confirm with you that client-cert authentication does not work with a custom-certificate on port 443 (but does work on port 8443). So I assume you're using one of the auth systems like dex (?)
I am using the AWS IAM Authenticator as set up by kops via:
spec:
  aws:
    backendMode: CRD
    clusterID: $CLUSTER_NAME
    identityMappings:
    - arn: arn:aws:iam::$AWS_ACCOUNT_ID:role/$ROLE_ADMIN
      groups:
      - system:masters
      username: admin:{{`{{SessionNameRaw}}`}}
    - ...
Plus CRDs for other IAM to k8s mappings.
We did have dex in place a number of years ago when we first came up with the cluster specs.
@jim-barber-he Thank you for the explanation! Any way to test the https://github.com/kubernetes/kops/pull/16405 cherry-pick we did yesterday for kOps 1.28?
Yeah I can give it a go by rolling out a test cluster, but probably won't be until next week.
FYI: kOps 1.28 wasn't having an issue for me; it was when I built from the master branch to test something else that I hit the problem, so I reported this before it ended up breaking kOps 1.29 for me.
So to test I assume I'd need to build from master again?
Ah, cool. Not sure if it's so easy to test the master branch build (there may be some nodeup changes), but please give it a try. We will also release an official beta.1 this week.
I've rolled out a k8s 1.28.8 cluster this morning with a kops binary built from the v1.29.0-beta.1 git tag.
The cluster has come up properly and is accessible.
The security group associated with the API server NLB has rules in it for port 8443.
So it looks like this issue is fixed.
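For anyone wanting to double-check the same thing, the rules can be inspected with the AWS CLI along these lines; the group ID is a placeholder for the security group attached to the API server NLB:

# Lists only the rules whose port is 8443.
$ aws ec2 describe-security-group-rules \
    --filters Name=group-id,Values=sg-XXXX \
    --query 'SecurityGroupRules[?ToPort==`8443`]'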
Perfect. Thank you for confirming @jim-barber-he! /close
@hakman: Closing this issue.