data-safe-haven
Add load balancing in front of Azure Container Instances
:white_check_mark: Checklist
- [x] I have searched open and closed issues for duplicates.
- [x] This is a request for a new feature in the Data Safe Haven or an upgrade to an existing feature.
- [x] The feature is still missing in the latest version.
- [x] I have read through the documentation.
- [x] This isn't an open-ended question (open a discussion if it is).
:strawberry: Suggested change
Azure Container Instances need to be deployed to a subnet and are given an IP address from that subnet. This is usually (but not always) the first available IP address. When they are rebooted, they often change IP address, and other services that refer to them are not updated.
See here for more details.
:steam_locomotive: How could this be done?
If we add appropriate load balancing in front of all ACI services, we can ensure that they're reachable regardless of their exact IP address.
Placing an Azure Load Balancer in front of container instances in a networked container group is not currently supported. See here for more information.
Consider an `az` CLI sidecar like this or this.
Here is one that's already been written, using an Azure service principal to make the necessary changes.
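To make the sidecar idea concrete, here is a minimal sketch of the logic it might run. It assumes the containers are addressed through an A record in an Azure Private DNS zone, which this thread does not confirm. The resource group, zone and record names are hypothetical placeholders, and the function only builds `az network private-dns record-set a` commands rather than executing them.

```python
import socket


def local_ipv4() -> str:
    """Return the IPv4 address this container uses for outbound traffic."""
    # Connecting a UDP socket sends no packets; it only selects a local address.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect(("10.255.255.255", 1))
        return sock.getsockname()[0]


def plan_dns_update(current_ip, recorded_ips, resource_group, zone, record):
    """Build the az CLI commands that leave the A record holding only current_ip.

    Assumption: the record lives in an Azure Private DNS zone. All names passed
    in are supplied by the caller; nothing here is taken from the project.
    """
    base = ["az", "network", "private-dns", "record-set", "a"]
    target = [
        "--resource-group", resource_group,
        "--zone-name", zone,
        "--record-set-name", record,
    ]
    commands = []
    # Add the new address first so the record set is never left empty.
    if current_ip not in recorded_ips:
        commands.append(base + ["add-record"] + target + ["--ipv4-address", current_ip])
    # Then drop every address that no longer belongs to this container.
    for stale in recorded_ips:
        if stale != current_ip:
            commands.append(base + ["remove-record"] + target + ["--ipv4-address", stale])
    return commands
```

A sidecar could call `plan_dns_update(local_ipv4(), recorded_ips, ...)` at startup and pass each resulting command to `subprocess.run`, after logging in with whatever identity it has been granted. Keeping the planning step separate from execution makes it easy to test without touching Azure.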
This is concerning and surprising.
I think we should look at alternatives, as it feels like container instances are perhaps not supposed to be used in this way.
Switching to Kubernetes might save us a lot of headaches.
I looked at Kubernetes when I first started the rewrite. You really need to buy into it from the beginning and structure the whole project around it. Sticking with Azure, it's possible that e.g. Azure Container Apps might be a good fit here. Equally, Azure might start supporting Load Balancing soon (doesn't feel like it should be a large amount of effort for them).
The load balancing issues seem to have been open for a while and it looks like the feature was removed from the roadmap.
I've been using Kubernetes for a personal project. It seems really nice once you get used to it. It handles routing and entrypoints in a very simple way: ingress controllers let you set up your web server, reverse proxy and load balancing all within the Kubernetes specifications.
I could see establishing a cluster then moving services over one at a time.
Looking forward, the supported way to deploy AWX is in Kubernetes too.
If we can't fix this, we should document it as a known issue and give a workaround.
We did encounter this issue. We successfully worked around it by manually updating the IP address in DNS.
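For anyone applying the manual workaround, a small check like the one below might help spot when a record has gone stale. It is only a sketch: the hostname and IP address are whatever you supply, and it assumes the name is resolvable from the machine running the check.

```python
import socket


def record_is_stale(hostname, actual_ip, resolve=socket.gethostbyname):
    """Return True when DNS no longer points at the container's current IP."""
    try:
        return resolve(hostname) != actual_ip
    except OSError:
        # A name that fails to resolve also needs attention.
        return True
```

The `resolve` parameter exists so the check can be exercised with a fake resolver. In normal use you would pass only the hostname and the IP address reported for the container group.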