aimmo Aim for zero-downtime in aimmo deployments

Open mrniket opened this issue 5 years ago • 2 comments

What is the problem that you wish to solve? Please describe.

Currently, to deploy Kurono we have to delete all the running games and then restart the game-creator so that it brings the games and workers back up. This causes downtime during deployment and also loses the game_states and avatar states, which are currently generated when a game pod is created.

Describe the solution you'd like

Use Kubernetes Deployments/StatefulSets to boot up a pod with the new version of Kurono while the old one is still running. Once the new pod is ready, move the state over to it and then delete the old pod.

On shutdown, we persist the game_states and avatar_states. Clients should be notified that they have to disconnect and reconnect to the new pod once it is ready.
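A minimal sketch of the persist-on-shutdown step, assuming the game pod holds its state in plain dictionaries and relies on the SIGTERM that Kubernetes sends to a pod before terminating it. The function names (`persist_state`, `load_state`, `install_shutdown_handler`) and the JSON file layout are hypothetical and not taken from the aimmo codebase.

```python
import json
import os
import signal
import tempfile


def persist_state(path, game_state, avatar_states):
    """Write both state dicts to `path` atomically (temp file, then rename)."""
    payload = {"game_state": game_state, "avatar_states": avatar_states}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        json.dump(payload, tmp)
    # os.replace is atomic when source and target are on the same filesystem,
    # so a reader never sees a half-written file.
    os.replace(tmp_path, path)


def load_state(path):
    """Read back what persist_state wrote; a new pod could call this on startup."""
    with open(path) as f:
        payload = json.load(f)
    return payload["game_state"], payload["avatar_states"]


def install_shutdown_handler(path, get_game_state, get_avatar_states):
    """Persist state when the process receives SIGTERM.

    Must be called from the main thread, as required by signal.signal.
    """
    def handler(signum, frame):
        persist_state(path, get_game_state(), get_avatar_states())

    signal.signal(signal.SIGTERM, handler)
    return handler
```

Note that JSON turns non-string dictionary keys into strings, so avatar IDs used as keys would need converting back on load. Notifying clients to reconnect is a separate step that this sketch does not cover.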

Describe alternatives you've considered

Keeping the state of the game on a shared volume rather than in memory, so that the game pod becomes a stateless processor. This sounds good, but I'm not sure what I/O overhead it would add to each turn.
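One way to get a rough number for that per-turn overhead is to time repeated writes of a representative state to the volume. The sketch below assumes the state is JSON-serialisable and written in full every turn; `time_state_writes` and the state shape are hypothetical, and a temporary directory stands in for the shared volume mount.

```python
import json
import os
import tempfile
import time


def time_state_writes(state, directory, turns=100):
    """Return the mean seconds per turn spent writing `state` into `directory`."""
    path = os.path.join(directory, "game_state.json")
    start = time.perf_counter()
    for _ in range(turns):
        with open(path, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # force the write through to the volume
    return (time.perf_counter() - start) / turns


if __name__ == "__main__":
    # Hypothetical state shape: 50 avatars with a position each.
    state = {"avatars": {str(i): {"x": i, "y": i} for i in range(50)}}
    with tempfile.TemporaryDirectory() as d:  # stand-in for the shared volume
        mean = time_state_writes(state, d)
        print(f"mean write time per turn: {mean:.6f}s")
```

Pointing `directory` at the real volume mount would give a figure to compare against the turn duration; results on local disk will not reflect network-backed volumes.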

Additional context

This is important if we want to be able to deploy regularly without causing disruption to running games (potentially stopping classes from using Kurono if we deploy mid-session).

With any solution, we should look at:

Whether switching to a new version works
Record any potential downtime that occurs
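Recording downtime during a version switch could be done with a small poller that runs for the length of the deployment and notes when the game stops and starts responding. This is a sketch under assumptions: `record_downtime` is a hypothetical helper, and `probe` is any callable returning True while the game is reachable (for example, a wrapper around a health-check request, which is not shown here).

```python
import time


def record_downtime(probe, duration, interval=0.5,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll `probe` for `duration` seconds and return (start, end) outage windows.

    Times are values of `clock`; an outage still open when polling stops is
    closed at the final clock reading.
    """
    outages = []
    outage_start = None
    begin = clock()
    now = begin
    while now - begin < duration:
        if probe():
            if outage_start is not None:
                # Service is back: close the current outage window.
                outages.append((outage_start, now))
                outage_start = None
        elif outage_start is None:
            # First failed probe: open a new outage window.
            outage_start = now
        sleep(interval)
        now = clock()
    if outage_start is not None:
        outages.append((outage_start, now))
    return outages


def total_downtime(outages):
    """Sum the lengths of all recorded outage windows."""
    return sum(end - start for start, end in outages)
```

The measured downtime is only accurate to within `interval`, so a shorter interval gives a tighter bound at the cost of more probe traffic.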

Feb 26 '20 15:02 mrniket

Not aiming to do this right now, will revisit later

Sep 02 '20 14:09 mrniket

Using Agones and its update strategy looks like a good way to do this and to reduce the amount of code we have to maintain.

Nov 12 '20 13:11 mrniket

Not relevant.

Nov 21 '23 12:11 faucomte97
