thundernetes
Draft - update networking between nodeagent daemonset and gameserver pods
thought a lot about #108 and this is what i have so far:
- adding a network policy for the nodeagent pods lets you control their ingress/egress traffic. there isn't a lot to it: it only lets you select peers by pod labels, namespace labels, or CIDR blocks. the following network policy only allows traffic from any pod with the `OwningOperator: thundernetes` label in any namespace, and from pods in the `monitoring` namespace. this would still allow traffic from pods on other nodes, but that is addressed below
- adding a ~headless~ service in front of the nodeagent pods allows the gameservers to reach the heartbeat endpoint via an FQDN rather than the node ip address. the service also has `internalTrafficPolicy: Local`, meaning gameservers will always be routed to the nodeagent pod on the same node. i think this is a more decoupled strategy than using the node ip.
  edit: don't think it needs to be a headless service actually. we don't want DNS records returned for each node behind the service, we just want the service to route to the correct node. more info about that here.
- using DNS to get the public IP address of the current node could (in theory) solve the problem of getting the node public IP in the gameserver pods, although i have not figured out how to do that. doing an `nslookup` of the node name resolves to the internal IP.
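
To make the first two points concrete, here is a rough sketch of what such a NetworkPolicy and Service could look like. It is not the manifest from this PR: the resource names, the `thundernetes-system` namespace, the `name: thundernetes-nodeagent` pod label, and port `56001` are all assumptions for illustration. Only the `OwningOperator: thundernetes` label, the `monitoring` namespace, and `internalTrafficPolicy: Local` come from the description above.

```yaml
# Sketch only. Names, namespace, pod label, and port are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: nodeagent-ingress            # hypothetical name
  namespace: thundernetes-system     # assumed namespace of the nodeagent daemonset
spec:
  podSelector:
    matchLabels:
      name: thundernetes-nodeagent   # assumed label on the nodeagent pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Pods labeled OwningOperator=thundernetes in any namespace.
        # The empty namespaceSelector matches all namespaces; combining it
        # with podSelector in one entry means "both must match".
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              OwningOperator: thundernetes
        # Any pod in the monitoring namespace. The kubernetes.io/metadata.name
        # label is set automatically on namespaces in recent Kubernetes versions.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
---
apiVersion: v1
kind: Service
metadata:
  name: nodeagent                    # hypothetical name
  namespace: thundernetes-system
spec:
  selector:
    name: thundernetes-nodeagent     # must match the nodeagent pod label
  # Route in-cluster traffic only to endpoints on the caller's own node.
  internalTrafficPolicy: Local
  ports:
    - name: http
      port: 56001                    # assumed nodeagent port
      targetPort: 56001
```

Under these assumptions, and with the default `cluster.local` cluster domain, a gameserver pod would reach its local nodeagent at `nodeagent.thundernetes-system.svc.cluster.local:56001`. Note that with `internalTrafficPolicy: Local`, traffic is dropped if there is no ready nodeagent pod on the caller's node, rather than falling back to another node.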
let me know what you think. I see that the setup for the game server pod creation is opinionated in that it uses the init container to create a json file for the gsdk, so I wasn't sure at what point during initialization it would be best to inject the service endpoint variable into the environment.
thanks @vachillo, this seems to be trickier than what I originally thought. In this case, we'd need to have a single service per Node, correct? That seems a bit hard to maintain unfortunately.
Yeah, the GSDK that is running on the GameServer Pod expects to find a JSON file with the necessary configuration. If this fails it will not start.
Regarding the Public IP, I was trying to find ways to do that since Kubernetes does not expose the External IP in the Downward API. I've collected a simple hack here: https://playfab.github.io/thundernetes/howtos/publicipaddress.html
Not sure what is the best approach for this, let's leave this open for further investigation and discussion. Thank you!
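
For context on the Public IP point, a minimal sketch of what the Downward API does expose to a pod: the node name (`spec.nodeName`) and the host IP (`status.hostIP`). As far as I know, `status.hostIP` is normally the node's internal address on cloud providers, which is why it does not help with the public IP. The pod, container, image, and environment variable names here are hypothetical.

```yaml
# Sketch only. Pod, container, image, and variable names are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: downward-api-demo
spec:
  containers:
    - name: demo
      image: busybox:1.36
      # Print the injected values once, then idle so the logs can be read.
      command: ["sh", "-c", "echo node=$NODE_NAME hostIP=$NODE_HOST_IP && sleep 3600"]
      env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName    # name of the node the pod is scheduled on
        - name: NODE_HOST_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP    # node's primary IP, typically the internal one
```

There is no equivalent `fieldPath` for the node's ExternalIP, so that value has to come from somewhere else, such as the Kubernetes API or the workaround linked above.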
> thanks @vachillo, this seems to be trickier than what I originally thought. In this case, we'd need to have a single service per Node, correct? That seems a bit hard to maintain unfortunately.

we actually wouldn't need a different service per node, that's what `internalTrafficPolicy: Local` does. whatever node the pod is calling from, the service would route it only to a pod in its backend pool on that node. read more about it here, i think it's a relatively new feature.

> Yeah, the GSDK that is running on the GameServer Pod expects to find a JSON file with the necessary configuration. If this fails it will not start.

is there a mechanism to provide your own init container to initialize that json file? I haven't tried it, but my assumption is that the default one would overwrite any changes if another one was provided in the podTemplateSpec.

> Regarding the Public IP, I was trying to find ways to do that since Kubernetes does not expose the External IP in the Downward API. I've collected a simple hack here: https://playfab.github.io/thundernetes/howtos/publicipaddress.html
> Not sure what is the best approach for this, let's leave this open for further investigation and discussion. Thank you!

however that public IP gets retrieved, maybe it can be exposed as an endpoint on the nodeagent daemonset pod? we know that the pods asking for those IPs will always be on the same node. an endpoint to periodically get metadata about the node could be useful. maybe that complicates it more than necessary, but just a thought.
> we actually wouldn't need a different service per node, that's what `internalTrafficPolicy: Local` does. whatever node the pod is calling from, the service would route it to a pod in its backend pool only on that node. read more about it here, i think its a relatively new feature.

Gotcha, wasn't aware of `internalTrafficPolicy`, will take a look.

> > Yeah, the GSDK that is running on the GameServer Pod expects to find a JSON file with the necessary configuration. If this fails it will not start.
>
> is there a mechanism to provide your own init container to initialize that json file? I haven't tried it, but my assumption is that the default one would overwrite any changes if another one was provided in the podTemplateSpec.

We actually append the Thundernetes initContainer to the existing ones provided by the user -> https://github.com/PlayFab/thundernetes/blob/c0107046dc63facff96c0f9e9e0ce64160ade0d9/pkg/operator/controllers/controller_utils.go#L230

> however that public IP gets retrieved, maybe it can be exposed as an endpoint on that nodeagent daemonset pod? we know that the pods asking for those IPs will always be on the same node. an endpoint to periodically get metadata about the node could be useful. maybe that complicates it more than necessary, but just a thought.

That's a great suggestion, we could expose an extra endpoint on the NodeAgent with Node details, e.g. anything not available on the Downward API. This would solve #136.
Given that this addition would need more testing, unfortunately I don't think we have time to include it in 0.6. Sorry :( Additionally, YAML changes (Service etc.) would need to be included in this folder https://github.com/PlayFab/thundernetes/tree/c0107046dc63facff96c0f9e9e0ce64160ade0d9/pkg/operator/config since everything in `installfiles` gets overwritten during testing/new release creation.
Appreciate the discussion!