nomad
On migrate block add option to keep old allocation until new healthy one available
Proposal
Add a new parameter to the migrate block that keeps the old job allocations running until the new ones are healthy.
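As a rough sketch of what this could look like in a jobspec: the fragment below uses the existing `migrate` parameters (`max_parallel`, `health_check`, `min_healthy_time`, `healthy_deadline`) and adds one extra setting. The name `keep_until_healthy` is purely an assumption for illustration. It does not exist in Nomad, and the real name and semantics would be decided by the maintainers.

```hcl
job "example" {
  group "web" {
    # A single allocation: the case this issue is about.
    count = 1

    migrate {
      # Existing migrate parameters.
      max_parallel     = 1
      health_check     = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"

      # HYPOTHETICAL parameter, not part of Nomad today.
      # Intended meaning: during a drain, keep the old allocation
      # running until its replacement is reported healthy.
      keep_until_healthy = true
    }

    # Task definitions omitted for brevity.
  }
}
```

With a setting like this, a drain would briefly run two allocations of the group (old and new) before stopping the old one.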
Use-cases
When draining nodes, allocations should be kept until their replacements are healthy. This already works if there are multiple allocations on the node, but not if there is only one allocation.
Attempted Solutions
Nothing found.
Hi @cberescu and thanks for raising this issue. This makes sense as something to achieve; however, I think this will require some investigation into the reconciliation process in order to understand what is possible here and what needs to change. I'll mark this for roadmapping.
I'll add my use case to this as well.
We have a less critical environment where we allow our services in Nomad to auto-scale from 1 instance up to 4 based on simple metrics like CPU and memory usage. A lot of the services sit around 1 instance even when they are getting constant, routine traffic.
When a migration is kicked off, a service that is being actively hit with requests every few seconds can see many seconds of downtime while its single instance migrates to another node.
Auto-scaling would be much more helpful for us in this environment if migrations ensured instance counts remained within the scaling min and max thresholds. Don't let the service instance count drop to 0 healthy allocations when we could burst to 2 allocations temporarily, so that the original, single healthy allocation isn't disrupted until the new one on the other node is ready.
So for us, a new setting in the migrate block to allow this would be great if Nomad isn't tweaked to just allow this behavior by default.
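For reference, a minimal sketch of the kind of group described above, assuming the job uses the jobspec `scaling` block with the 1 to 4 range mentioned. The group name is made up and the autoscaler policy is left out; the comments describe the behavior being requested, not what Nomad does today.

```hcl
group "api" {
  # Usually sits at the minimum, even under routine traffic.
  count = 1

  scaling {
    enabled = true
    min     = 1
    max     = 4
    # Autoscaler policy (CPU and memory checks) omitted.
  }

  migrate {
    # With count = 1 and max_parallel = 1, a drain stops the only
    # allocation before a replacement is healthy.
    # Requested behavior: treat a temporary second allocation as
    # allowed (2 is within min..max) and stop the old one only
    # after the new one is healthy.
    max_parallel = 1
  }
}
```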