agones-ping-udp-service brings up incompatible load balancer in EKS
What happened:
When installing Agones in an EKS cluster, the agones-ping-udp-service fails to create with the following error:
Warning CreatingLoadBalancerFailed 1s (x4 over 37s) service-controller Error creating load balancer (will retry): failed to ensure load balancer for service agones-system/agones-ping-udp-service: Only TCP LoadBalancer is supported for AWS ELB
What you expected to happen: The service should have created successfully.
How to reproduce it (as minimally and precisely as possible): Build an EKS cluster, follow installation instructions for Agones.
Anything else we need to know?:
Yes, we already know the fix! The service needs the following annotation:
annotations:
  service.beta.kubernetes.io/aws-load-balancer-type: nlb
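For context, here is a minimal sketch of where that annotation would sit in the Service manifest. The port, target port, and selector are copied from the `kubectl describe` output further down in this thread; the rest of the manifest in the real Agones install file may differ, so treat this as an illustration rather than the shipped definition.

```yaml
# Sketch only: values other than the annotation are taken from the describe
# output in this thread and may not match the Agones install file exactly.
apiVersion: v1
kind: Service
metadata:
  name: agones-ping-udp-service
  namespace: agones-system
  annotations:
    # Request a Network Load Balancer instead of a Classic ELB,
    # which rejects UDP listeners.
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  selector:
    agones.dev/role: ping
  ports:
    - name: udp
      port: 50000
      targetPort: 8080
      protocol: UDP
```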
Environment:
- Agones version: 1.0.0
- Kubernetes version (use kubectl version): 1.12
- Cloud provider or hardware configuration: AWS/EKS
- Install method (yaml/helm): YAML, but I checked the Helm chart and it would have the same issue
- Troubleshooting guide log(s): n/a
- Others:
I figure we could set this annotation by default. I don't think it's going to affect other cloud platforms' load balancers?
I can check if it would work. /assign alekser
I tried the fix you suggested and it did not work on Kubernetes 1.12. https://github.com/kubernetes/kubernetes/issues/79523 I think we should wait until 1.17 for bug triage, as per the comments there.
kubectl describe service agones-ping-udp-service --namespace agones-system
output:
Name:         agones-ping-udp-service
Namespace:    agones-system
Labels:       app=agones
              chart=agones-1.1.0
              component=ping
              heritage=Tiller
              release=agones-manual
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"service.beta.kubernetes.io/aws-load-balancer-type":"nlb"},"labels":{"app":...
              service.beta.kubernetes.io/aws-load-balancer-type: nlb
API Version:  v1
Kind:         Service
Metadata:
  Creation Timestamp:  2019-11-01T11:49:47Z
  Resource Version:    4196
  Self Link:           /api/v1/namespaces/agones-system/services/agones-ping-udp-service
  UID:                 b44a207f-fc9d-11e9-b526-0ad256200a94
Spec:
  Cluster IP:               10.100.25.80
  External Traffic Policy:  Cluster
  Ports:
    Name:         udp
    Node Port:    30518
    Port:         50000
    Protocol:     UDP
    Target Port:  8080
  Selector:
    agones.dev/role:  ping
  Session Affinity:   None
  Type:               LoadBalancer
Status:
  Load Balancer:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal EnsuringLoadBalancer 51s (x2 over 82s) service-controller Ensuring load balancer
Warning CreatingLoadBalancerFailed 51s (x2 over 82s) service-controller Error creating load balancer (will retry): failed to ensure load balancer for service agones-system/agones-ping-udp-service: Only TCP LoadBalancer is supported for AWS ELB
@aLekSer based on your output, it looks like the service is still trying to build without the annotation. You need to completely destroy the service and rebuild it for that to work. When I tested this it was in fact on 1.12.
You're getting the same error message that indicates that the load balancer was built sans annotation: "Only TCP LoadBalancer is supported for AWS ELB." There's a brief AWS doc on the annotation here.
I will recheck soon whether this is still relevant.
Tested this on the most recent Terraform EKS config with udp_expose = "true" in examples/terraform-submodules/eks/module.tf:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal EnsuringLoadBalancer 4m31s (x7 over 9m46s) service-controller Ensuring load balancer
Warning CreatingLoadBalancerFailed 4m31s (x7 over 9m46s) service-controller Error creating load balancer (will retry): failed to ensure load balancer for service agones-system/agones-ping-udp-service: Only TCP LoadBalancer is supported for AWS ELB
BY-IT00060:eks alexander.apalikov$ kubectl get services --namespace agones-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
agones-allocator LoadBalancer 172.20.90.48 abf6453fe41e311eaa72b02706d23ee8-806979722.us-west-2.elb.amazonaws.com 443:31237/TCP 39m
agones-controller-service ClusterIP 172.20.120.21 <none> 443/TCP,8080/TCP 39m
agones-ping-http-service LoadBalancer 172.20.172.167 abf5f999b41e311eaa72b02706d23ee8-1635672743.us-west-2.elb.amazonaws.com 80:30456/TCP 39m
agones-ping-udp-service LoadBalancer 172.20.196.38 <pending> 50000:30098/UDP 10m
I will try adding the annotations to helm.tf. It seems that the Kubernetes code itself does not currently allow this, so we should wait for this PR to be merged: https://github.com/kubernetes/kubernetes/pull/87549/files
It seems that the fix can only land in version 1.19, as per this comment: https://github.com/kubernetes/kubernetes/pull/87549#issuecomment-595440964
1.19 is a long way away (especially on EKS). Is there anything we can do in the meantime? Is there a documentation change we can make to explain to EKS users how to fix this manually?
Yes, I am also looking into a documentation update.
The upstream fix for this has landed in k8s master, so will be in 1.19, which should be available in EKS sometime in 2021.
I didn't notice until recently that AWS EKS has backported the NLB UDP support from 1.19 as far back as 1.15, per Platform versions:
- 1.15.11.eks.4 or later
- 1.16.13.eks.3 or later
- 1.17.9.eks.2 or later
1.18 doesn't mention it, but the above backports (August 12th) predate 1.18 on EKS (October 13th), so it should have been included there too. My team expects to deploy Agones on an EKS 1.18 platform sometime next month, so I'll try and remember to check that; we disable the UDP-ping service by default, so I'll ask the person doing the deployment to try leaving it on with an NLB (or ideally NLB-IP) annotation.
EKS support for 1.14 ends on December 8th, so I believe this issue is in a good state to close, with the caveat that 1.14 doesn't support it, and never will. However, I have not tested the fix myself yet.
Is this solved? I think it might be, but I'm not sure.
Can someone confirm, so we can close this issue?
I think you still need to set the annotation service.beta.kubernetes.io/aws-load-balancer-type: nlb for the agones-ping Service in the installation sources if you want it to work out of the box. We use kustomize and have an overlay for it.
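The overlay itself is not shown in this thread, so the following is only a sketch of what such a kustomization might look like. The file name `install.yaml` (a local copy of the Agones install manifest) is an assumption; the Service name and namespace come from the error message above.

```yaml
# kustomization.yaml (sketch; install.yaml is a hypothetical local copy
# of the Agones install manifest)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - install.yaml
patches:
  # Strategic merge patch: merges the annotation into the existing Service
  # without touching its other fields.
  - patch: |-
      apiVersion: v1
      kind: Service
      metadata:
        name: agones-ping-udp-service
        namespace: agones-system
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-type: nlb
```

A strategic merge patch is used here rather than a JSON patch so that any annotations already present on the Service are kept.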
I think that would be a user-issue though? I wouldn't expect the Agones source to contain annotations for any/every kind of platform-specific environment.
And deploying things to AWS, I had to use such annotations everywhere (since I wanted NLB everywhere) so this wasn't a shock or a challenge for me.
The error could be nicer: "Use NLB mode" would be better than "ELB can't do this", but that's a k8s/AWS problem, not an Agones problem.
Particularly because for AWS, we now have both nlb and nlb-ip types, and the latter is better but requires a (not in-the-box) AWS Load Balancer Controller (due to the migration of cloud-specific controllers out of Kubernetes core), so including either annotation is going to be wrong for a significant group of AWS users, and nonsensical for non-AWS users.
I would have to check this, but I think AWS is introducing different load balancer annotations that add 'external' to the list of possible types, in order to hand off explicitly to the aforementioned AWS Load Balancer Controller.
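To make the options concrete, here is a hedged sketch of the annotation variants being discussed. Which of them is honored depends on the Kubernetes version and on whether the AWS Load Balancer Controller is installed (and at which version), so check this against the controller documentation for your cluster before relying on it.

```yaml
# Sketch of the alternatives; enable exactly one option.
metadata:
  annotations:
    # Option A: in-tree cloud provider, NLB with instance (NodePort) targets.
    service.beta.kubernetes.io/aws-load-balancer-type: nlb

    # Option B: NLB with IP targets; needs the AWS Load Balancer Controller.
    # service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip

    # Option C: newer style that hands the Service to the AWS Load Balancer
    # Controller explicitly and selects the target type separately.
    # service.beta.kubernetes.io/aws-load-balancer-type: external
    # service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
```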
If there are AWS-specific instructions around, it's worth mentioning the options there, I guess, with appropriate links.
If there are docs to be added to: https://agones.dev/site/docs/installation/creating-cluster/eks/ - that is always appreciated!
@markmandel Yes, the following annotations need to be added if we have to replace Classic Load Balancers with Network Load Balancers. NLB supports UDP s
I am currently testing the annotations on Agones Helm chart and the following works on EKS 1.19 and Agones 1.15. It creates two NLBs, one for HTTP and other one for UDP ping.
http:
  expose: true
  response: ok
  port: 80
  serviceType: LoadBalancer
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: "false"
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-access-log-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-name: ${s3_bucket}
udp:
  expose: true
  rateLimit: 20
  port: 50000
  serviceType: LoadBalancer
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: "false"
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-access-log-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-name: ${s3_bucket}
Can someone who clearly knows EKS way better than I do file a PR for the docs? That would be aces.
Sounds like you've got it worked out @vara-bonthu ?
service.beta.kubernetes.io/aws-load-balancer-type does bring up a valid NLB with a target group, but the nodes in the target group fail health checks, as NLBs do not support UDP health checks and would fail TCP health checks on the traffic port. LoadBalancer services w/ mixed UDP and TCP ports are not currently supported, but I was able to get this working by doing some manual setup of the Agones ping service and setting the following annotations on the UDP ping service:
"service.beta.kubernetes.io/aws-load-balancer-type" = "nlb"
"service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol" = "HTTP"
"service.beta.kubernetes.io/aws-load-balancer-healthcheck-path" = "/"
"service.beta.kubernetes.io/aws-load-balancer-healthcheck-port" = {NodePortForHTTPPingService}
"service.beta.kubernetes.io/aws-load-balancer-target-node-labels" = "agones.dev/agones-system=true"
This will be cleaner in the future when MixedProtocolLBService is supported in EKS.
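For anyone editing the YAML directly, the same annotations could be written on the Service roughly as follows. This is a sketch: the health check port "30456" is a placeholder and must be replaced with the NodePort actually assigned to agones-ping-http-service in your cluster, which is only known after that Service is created.

```yaml
# Sketch: annotations from the comment above, in Service manifest form.
metadata:
  name: agones-ping-udp-service
  namespace: agones-system
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    # Health-check over HTTP, since NLBs cannot health-check a UDP port.
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "HTTP"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: "/"
    # Placeholder value: use the NodePort of agones-ping-http-service.
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "30456"
    # Only register nodes carrying the Agones system label as targets.
    service.beta.kubernetes.io/aws-load-balancer-target-node-labels: "agones.dev/agones-system=true"
```

Annotation values must be strings, so the port is quoted.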
Hello, I installed Agones on EKS last week, and agones-ping-udp-service still says "pending". I followed the instructions on the Agones EKS site to the letter, using kubectl apply with the .yaml file since I don't like using Helm. Can someone update the documentation or post the steps here, please?
agones-ping-udp-service LoadBalancer 10.xx.xx.xx <pending>
My understanding is that right now you need to use helm to set the additional annotations to configure the NLB to use UDP. You don't have to use helm to do the install; you can use it to render a customized yaml file that contains the additional annotations (this is described on the Install Agones using YAML page). Alternatively, you could edit your yaml file by hand to add the annotations shown above.
@roberthbailey Thank you for the reply and the pointers. We never use Helm, so we would prefer that the right config go into the official documentation for the kustomize/YAML install. [update] From what I understand, I can create the .yaml/kustomize file from the Helm output, so I downloaded the chart locally with:
helm pull --untar https://agones.dev/chart/stable/agones-1.25.0.tgz
So now I guess I can run the other part. What options should I add to the next helm command to make the NLB use UDP?
@markmandel I'm seeing this issue in OpenShift-on-AWS too (probably similar underlying parts as EKS) and can open a PR. However, I just found https://github.com/googleforgames/agones/commit/3c38876382d505244af68489211d8916dce07596 which might be better for me to use than just changing the Helm charts and YAML install files. What do you think?
I consider using pkg/cloudproduct since the NLB Health Checks must be aware of the Service nodePort and this isn't available at install time using Helm/YAML.
After typing this, I realize it might be overkill if MixedProtocolLBService is going to be supported in EKS (and other k8s integrations for AWS). Maybe just updating the YAML is best so the UDP ping service at least starts.
Yeah, also that's totally a separate thing that affects more Pod creation for GameServers (although, very interesting! 😄 )
Question though: Is this a documentation issue?
If we added some kind of AWS installation docs that said what agones.ping.udp.annotations and agones.ping.http.annotations should be on installation, would that solve the issue?
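As a sketch of what such docs might contain, a Helm values file setting those two keys could look like the following. The file name `values-eks.yaml` is hypothetical, and only the NLB type annotation is shown; the health check annotations discussed earlier in this thread would be added under `udp.annotations` in the same way.

```yaml
# values-eks.yaml (hypothetical file name; sketch only)
agones:
  ping:
    http:
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    udp:
      annotations:
        # Classic ELB rejects UDP, so the UDP ping Service needs an NLB.
        service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
```

A file like this can be passed with Helm's `-f` flag, either to install directly or to `helm template` to render a customized YAML file for a non-Helm install.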
This issue is marked as Stale due to inactivity for more than 30 days. To avoid being marked as 'stale' please add 'awaiting-maintainer' label or add a comment. Thank you for your contributions.
This issue is marked as obsolete due to inactivity for last 60 days. To avoid issue getting closed in next 30 days, please add a comment or add 'awaiting-maintainer' label. Thank you for your contributions
@author, We are closing this as there was no activity in this issue for last 90 days. Please reopen if you’d like to discuss anything further.