azure-aci
The spec.template.spec.terminationGracePeriodSeconds: 3600 setting has no effect
My container runs a Windows console application in an Azure Kubernetes Service instance. I subscribe via SetConsoleCtrlHandler, catch the CTRL_SHUTDOWN_EVENT (6), and call Thread.Sleep(TimeSpan.FromSeconds(3600)); so that the process is not killed immediately. The container does indeed receive the CTRL_SHUTDOWN_EVENT and, on a separate thread, logs one message per second to show how long it kept waiting.
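For reference, a minimal sketch of the kind of handler described above (the class and member names are illustrative, not the issue author's actual code), assuming a plain .NET console app:

using System;
using System.Runtime.InteropServices;
using System.Threading;

class Program
{
    // P/Invoke registration of a console control handler (kernel32).
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool SetConsoleCtrlHandler(ConsoleCtrlDelegate handler, bool add);

    private delegate bool ConsoleCtrlDelegate(uint ctrlType);

    private const uint CTRL_SHUTDOWN_EVENT = 6;

    // Keep a reference to the delegate so the GC cannot collect it while the
    // native side still holds the callback pointer.
    private static readonly ConsoleCtrlDelegate Handler = OnConsoleCtrl;

    private static bool OnConsoleCtrl(uint ctrlType)
    {
        if (ctrlType == CTRL_SHUTDOWN_EVENT)
        {
            // Log one message per second on a separate thread to show how long
            // the container stays alive after the shutdown event arrives.
            new Thread(() =>
            {
                var started = DateTime.UtcNow;
                while (true)
                {
                    Console.WriteLine($"Still alive {(int)(DateTime.UtcNow - started).TotalSeconds}s after CTRL_SHUTDOWN_EVENT");
                    Thread.Sleep(TimeSpan.FromSeconds(1));
                }
            }) { IsBackground = true }.Start();

            // Block the handler to delay process termination; Windows waits up to
            // the timeouts configured via the registry settings shown next before
            // force-killing the process.
            Thread.Sleep(TimeSpan.FromSeconds(3600));
            return true; // event handled
        }
        return false;
    }

    static void Main()
    {
        SetConsoleCtrlHandler(Handler, true);
        Console.WriteLine("Running; waiting for the shutdown event...");
        Thread.Sleep(Timeout.Infinite);
    }
}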
I'm adding the required registry settings in the Dockerfile:
USER ContainerAdministrator
RUN reg add hklm\system\currentcontrolset\services\cexecsvc /v ProcessShutdownTimeoutSeconds /t REG_DWORD /d 3600 && \
reg add hklm\system\currentcontrolset\control /v WaitToKillServiceTimeout /t REG_SZ /d 3600000 /f
ADD publish/ /
I verified this by running the container on my computer; 'docker stop -t <seconds>' achieves the delayed shutdown.
The relevant fragment of the deployment .yaml file:
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aks-aci-boldiq-external-solver-runner
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: aks-aci-boldiq-external-solver-runner
    spec:
      terminationGracePeriodSeconds: 3600
      containers:
      - image: ...
        imagePullPolicy: Always
        name: boldiq-external-solver-runner
        resources:
          requests:
            memory: 8G
            cpu: 1
      imagePullSecrets:
      - name: docker-registry-secret-official
      nodeName: virtual-kubelet-aci-connector-windows-windows-westus
After deployment I ran the 'kubectl get pod aks-aci-boldiq-external-solver-runner-69bf9cd949-njzz2 -o yaml' command and verified that the setting below is present in the output:
terminationGracePeriodSeconds: 3600
If I do 'kubectl delete pod', the container stays alive only for the default 30 seconds instead of the 1 hour I want. Could the problem be in the Virtual Kubelet, or could this behavior be caused by AKS, please?
@eugen-nw, so this actually isn't supported at all today, on two levels:
1- The virtual-kubelet package itself, up to v1.1 (the version currently used by the Azure virtual kubelet), didn't honor that setting. It simply calls a delete pod on the provider and always uses 30 seconds. This was updated in v1.2, so we should be able to take advantage of it in the future. The other thing is that our provider doesn't send any updates about the pod after the delete call, so the pod actually gets deleted immediately even though Kubernetes shows the 30 seconds. This latter point is going to be fixed shortly; I'm currently working on an update for that.
2- This is the main problem: ACI doesn't provide a way to configure how termination should be handled, or what grace period to use if one is specified. The delete operation is synchronous too, so the actual resource is removed regardless of the pod cleanup that gets triggered on ACI's backend. We're aware of the limitations on ACI, but until these are supported, the fixes mentioned in (1) won't make a difference.
@macolso the async deletion is coming with the new API, but I remember you/Deep mentioning termination handling. Can you please elaborate on whether it is planned for the next semester?
Thanks very much for having looked into this! When will this issue be fixed, please? Our major customer is not pleased that some of their long-running computations get killed midway through and need to be restarted on a different container.
@ibabou your point 2 above implies that even if we used Linux containers running on virtual-node-aci-linux we'd run into the exact same problem. I assume that virtual-node-aci-linux is the equivalent Linux ACI connector. Are both of these statements correct, please?
@eugen-nw if you mean the grace period and the wait on container termination on ACI's side, yeah, that's currently not supported for either Linux or Windows.
Thanks very much, that's what I was asking about. That's very bad behavior on ACI's side. Do they plan to fix it?
So our team owns both the ACI service and the AKS-VK integration, but I don't have an ETA for that feature. I'll let @dkkapur @macolso elaborate more.
@eugen-nw indeed :( we're looking into fixing this in the coming months on ACI's side. Hope to have an update for you in terms of a concrete timeline shortly.
@dkkapur: THANKS VERY MUCH for planning to address this problem soon! This is a major issue for our largest customer.
We scale our processing on demand, based on workload sent to the containers through a Service Bus queue. There are two distinct types of processing: 1) under 2 minutes (the majority) and 2) over 40 minutes (occurs now and then). Whenever the AKS HPA scales down, it kills the containers that it spun up during scale-up. If any of the long-running operations happens to land on one of those scale-up containers, it gets aborted, and currently we have no way of avoiding that. We've designed the solution so that the processing restarts on another container, but our customer is definitely not happy that the 40-minute processing can end up running for much longer on occasion.
Ya - I've been working on enabling graceful termination / lifecycle hooks for ACI. If you want to talk more about your use case, I'd love to set up some time - shoot me an email: [email protected]
Bumping into the same issue with the auto scaler.
4 months passed, are there any known workarounds? Or ETA for the fix?
@dkkapur @macolso @ibabou Sorry for bumping this again; it hurts us quite a lot here. Any news on this front?
Probably customer focus is no longer trendy these days? I’ll check out the AWS offerings and will report back.
Hi @AlexeyRaga, unfortunately there's no concrete ETA we can share at this point. We're happy to hop on a call and talk a bit about the product roadmap though - email shared above ^^
This is a big drawback: pods scheduled on a virtual node do not support Pod Lifecycle Hooks or terminationGracePeriodSeconds. This functionality is needed to keep pods from being terminated mid-work during scale-in (see the sketch after this comment).
Is there any timeline for implementing this? @macolso
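To make the ask concrete, here is a sketch of the kind of spec being referred to (the container name is reused from the deployment earlier in the thread; the preStop command is purely illustrative). On regular VM-backed node pools the kubelet honors both settings; for pods scheduled on the ACI virtual node neither currently takes effect, which is exactly this issue:

spec:
  terminationGracePeriodSeconds: 3600     # honored by the kubelet on VM-backed nodes
  containers:
  - name: boldiq-external-solver-runner   # name reused from the deployment above
    image: ...
    lifecycle:
      preStop:
        exec:
          # illustrative drain step: gives in-flight work time to finish
          # before the container is stopped
          command: ["powershell", "-Command", "Start-Sleep -Seconds 3600"]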
Does terminationGracePeriodSeconds work for AWS EKS pods on Fargate? Fargate nodes also look like a kind of virtual node.
Any progress on this at all yet? It's over 2 years since the last update.
Hey @Andycharalambous, we will start working on it soon; no ETA yet.