Allow fleet autoscaler buffersize to be 0

Open roberthbailey opened this issue 3 years ago • 27 comments

Is your feature request related to a problem? Please describe.

Allow the fleet autoscaler to have a buffersize of 0. During development, it is a cost savings to not be running idle game servers and the startup latency of spinning up a new one is not an issue.

Describe the solution you'd like

Change the validation here to allow the buffer size to be 0.
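
Purely as an illustration (the package and function names below are hypothetical, not the actual Agones source), the kind of check the request asks to relax is a minimum-value validation along these lines:

package fleetautoscalers // illustrative name only, not the real package layout

import "errors"

// validateBufferSize sketches the requested change: a buffer size of 0 becomes
// legal, where previously the minimum allowed value was 1.
func validateBufferSize(bufferSize int32) error {
	if bufferSize < 0 {
		return errors.New("bufferSize must be 0 or greater")
	}
	return nil
}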

Describe alternatives you've considered

Leave it the way it is now.

Additional context

n/a

roberthbailey avatar Sep 02 '20 16:09 roberthbailey

Question on this:

Do you want a buffersize of 0? Or do you want a min replicas of 0, with a buffersize of 0?

Which probably are different problems :smile:

Basically, are we talking about scaling to zero at this point?

The TL;DR either way is - I don't think it's worth the extra complexity to autoscale to 0, since there are many, many complex edge cases, and you are going to have at least 1 node running in your cluster regardless - so having just 1 game server in there is not going to cost you anything extra infrastructure wise.

But would love to hear more details!

markmandel avatar Sep 02 '20 17:09 markmandel

I have a use case around this :smile:

TL;DR: we want a fleet per engineer who regularly tests their branch against a server build.

I would like to be able to spin up an arbitrary number of fleets for internal testing purposes; game devs want to be able to spin up servers for their own branch and not affect other people.

So this would be the latter: I want replicas to be able to be 0, and therefore probably only want to start buffering once at least one instance is up.

For us this currently means that I am running a fleet per engineer who regularly tests their branch, so we do not need to buy or set up hardware. It also means there are a fair few extra game servers just sat in Ready; whilst this isn't too much of an issue, it often means spilling over onto another box, which means we pay extra (startup life is tough).

domgreen avatar Nov 06 '20 18:11 domgreen

The original use case I heard was also for development. I'm wondering if there is a different way to tackle this other than changing the fleet autoscaler....

You can currently create an arbitrary number of fleets set to 0 replicas without an autoscaler. So the question becomes what changes the size from 0 -> 1 and back to 0. The first answer seems to be to try and make the fleetautoscaler do it, but it's also something you could drive from your CI system.

Would something like this work:

  1. Trigger build, create image, push to registry
  2. Update fleet spec with new build, set replicas to 1
  3. Create cron job (in CI or in k8s) to set replicas back to 0 after N hours (would probably want to check a hash or create time here)

Depending on what the devs are doing (maybe they need a variable number of game servers) step 2 could be to insert a fleet autoscaler and step 3 could be to remove it and set the replicas back to zero.

This would mean that as long as the developer is actively pushing changes they would have a game server to test. And if they aren't then it gets reaped automatically.
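
For step 3, a minimal sketch of the "set replicas back to 0" job, assuming the Agones versioned Go clientset and a hypothetical namespace/fleet name (a kubectl call from CI would work just as well):

package main

import (
	"context"
	"flag"
	"log"

	"agones.dev/agones/pkg/client/clientset/versioned"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	kubeconfig := flag.String("kubeconfig", "", "path to a kubeconfig file")
	flag.Parse()

	cfg, err := clientcmd.BuildConfigFromFlags("", *kubeconfig)
	if err != nil {
		log.Fatal(err)
	}
	agones, err := versioned.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	ctx := context.Background()
	// "dev" and "branch-fleet" are placeholders for the per-developer fleet.
	fleet, err := agones.AgonesV1().Fleets("dev").Get(ctx, "branch-fleet", metav1.GetOptions{})
	if err != nil {
		log.Fatal(err)
	}
	fleet.Spec.Replicas = 0 // reap the idle dev game servers
	if _, err := agones.AgonesV1().Fleets("dev").Update(ctx, fleet, metav1.UpdateOptions{}); err != nil {
		log.Fatal(err)
	}
	log.Println("fleet scaled to 0")
}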

roberthbailey avatar Nov 06 '20 22:11 roberthbailey

:point_up: I like this idea.

This goes back to my original question:

Do you want a buffersize of 0? Or do you want a min replicas of 0, with a buffersize of 0?

And if you want min replicas of 0 -- what tells the system, "Hey, I'd like a Ready GameServer now, so I can do an allocation shortly" ?

I think @roberthbailey 's strategy above is a good one. Maybe even tie it into your dev matchmaker somehow?

markmandel avatar Nov 06 '20 23:11 markmandel

Given that this has been stagnant for a long time, I'm going to close it as "won't implement" (at least for now). We can always re-open to continue the discussion if there is anything more to add later.

roberthbailey avatar Jun 23 '21 20:06 roberthbailey

Hi!

Could we consider re-opening this discussion?

Allow the fleet autoscaler to have a buffersize of 0. During development, it is a cost savings to not be running idle game servers and the startup latency of spinning up a new one is not an issue.

This is exactly my use case. I'm setting up Agones for my open-source pet project, and I would definitely like to cut infrastructure costs while the game is in active development, and a player may appear maybe once a month.

Thank you.

mvlabat avatar Nov 10 '21 21:11 mvlabat

I'm happy to re-open, but I don't know if this is something we will be able to prioritize soon.

roberthbailey avatar Nov 10 '21 22:11 roberthbailey

I'll repeat my original question:

Do you want a buffersize of 0? Or do you want a min replicas of 0, with a buffersize of 0?

Which then leads into the questions that followed. Without answers to those questions, I'm not sure what more we can do here to automate this.

One solution several people have done is use the webhook autoscaler and coordinate that with your dev matchmaker to size up Fleets as needed based on the needs of the development system, since your system actually knows if you need new GameServers to scale up from 0 and Agones has no idea.
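
As a rough sketch of that approach: a webhook FleetAutoscaler backend is just an HTTP endpoint. The struct fields below mirror the FleetAutoscaleReview request/response JSON described in the Agones webhook autoscaler documentation, while desiredReplicas() stands in for whatever demand signal your dev matchmaker exposes:

package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// Minimal mirror of the FleetAutoscaleReview JSON exchanged with the
// fleet autoscaler webhook policy.
type fleetStatus struct {
	Replicas          int32 `json:"replicas"`
	ReadyReplicas     int32 `json:"readyReplicas"`
	AllocatedReplicas int32 `json:"allocatedReplicas"`
}

type autoscaleRequest struct {
	UID       string      `json:"uid"`
	Name      string      `json:"name"`
	Namespace string      `json:"namespace"`
	Status    fleetStatus `json:"status"`
}

type autoscaleResponse struct {
	UID      string `json:"uid"`
	Scale    bool   `json:"scale"`
	Replicas int32  `json:"replicas"`
}

type review struct {
	Request  *autoscaleRequest  `json:"request"`
	Response *autoscaleResponse `json:"response"`
}

// desiredReplicas is a stand-in for "ask the dev matchmaker how many game
// servers are needed right now" - 0 when nobody is testing.
func desiredReplicas(status fleetStatus) int32 {
	return status.AllocatedReplicas // placeholder demand signal
}

func handle(w http.ResponseWriter, r *http.Request) {
	var rev review
	if err := json.NewDecoder(r.Body).Decode(&rev); err != nil || rev.Request == nil {
		http.Error(w, "bad FleetAutoscaleReview", http.StatusBadRequest)
		return
	}
	want := desiredReplicas(rev.Request.Status)
	rev.Response = &autoscaleResponse{
		UID:      rev.Request.UID,
		Scale:    want != rev.Request.Status.Replicas, // only scale when the size changes
		Replicas: want,
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(rev)
}

func main() {
	http.HandleFunc("/scale", handle)
	log.Fatal(http.ListenAndServe(":8000", nil))
}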

markmandel avatar Nov 10 '21 22:11 markmandel

I believe I want both min replicas and buffersize set to 0.

One solution several people have done is use the webhook autoscaler and coordinate that with your dev matchmaker to size up Fleets as needed based on the needs of the development system, since your system actually knows if you need new GameServers to scale up from 0.

That's an interesting idea. I'll dive into its documentation deeper, maybe it's indeed something I could use as a solution.

and Agones has no idea

Won't creating new game server allocations give Agones the idea to scale up the fleet? I was thinking about coding my matchmaker service so that it would ask the Kubernetes API to create new allocations.

mvlabat avatar Nov 10 '21 22:11 mvlabat

Won't creating new game server allocations give Agones the idea to scale up the fleet?

But we can't guarantee that a game server will spin up before the allocation request times out (I've lost track if it's 30s or a minute) - which is locked from the K8s API.

markmandel avatar Nov 10 '21 22:11 markmandel

Would be nice to have this:

apiVersion: "autoscaling.agones.dev/v1"
kind: FleetAutoscaler
metadata:
  name: fleet-autoscaler
spec:
  policy:
    buffer:
      bufferSize: 0
      minReplicas: 0
      maxReplicas: 2

I have many fleets; most of them sit idle and are only tested from time to time.

dzmitry-lahoda avatar Nov 15 '21 15:11 dzmitry-lahoda

@dzmitry-lahoda what would that do exactly? leave the Fleet at 0? And then what happens on allocation? I'm assuming nothing.

At which point, I'm wondering what is the point of the autoscaler at all? 🤔

markmandel avatar Nov 15 '21 20:11 markmandel

I see that a fleet needs to run at least 1 hot server. Our main game servers do run at least one, but testing and debug game servers are launched from time to time, not often, so I'm not sure these need 1 hot server. Why use a fleet? To reuse the same matchmaking and DevOps flows for these game servers. If an allocation is requested, no game server is Ready, and the limit is not reached, one game server could be launched. It's fine for the first allocation request to time out; the system will ask again in a loop. If the game server did not become Allocated but is stuck in Ready, shut it down after a timeout, assuming 2x the allocation timeout. I would not like to operate the fleet via the API, and developing a DevOps flow that scales to one and back depending on some activity seems complicated.
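
A sketch of the "ask again in a loop" part of that idea from the matchmaker's side, using the GameServerAllocation API from the Agones Go clientset (the namespace, omitted selector, and retry policy here are made up for illustration):

package main

import (
	"context"
	"log"
	"time"

	allocationv1 "agones.dev/agones/pkg/apis/allocation/v1"
	"agones.dev/agones/pkg/client/clientset/versioned"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	agones, err := versioned.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	ctx := context.Background()

	// Retry a few times while the (scaled-from-zero) game server boots; an
	// attempt that finds nothing comes back UnAllocated rather than as an error.
	for attempt := 0; attempt < 5; attempt++ {
		// Selectors targeting the per-branch fleet are omitted for brevity;
		// an empty spec matches any Ready GameServer in the namespace.
		gsa := &allocationv1.GameServerAllocation{}
		result, err := agones.AllocationV1().GameServerAllocations("dev").Create(ctx, gsa, metav1.CreateOptions{})
		if err == nil && result.Status.State == allocationv1.GameServerAllocationAllocated {
			log.Printf("allocated %s at %s", result.Status.GameServerName, result.Status.Address)
			return
		}
		time.Sleep(10 * time.Second)
	}
	log.Fatal("no game server became Allocated in time")
}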

dzmitry-lahoda avatar Nov 15 '21 20:11 dzmitry-lahoda

So ultimately, you are asking for scale to zero with Fleet auto scaling, which I'm not against, but is a fair bit of a nightmare to handle all the edge cases. I also don't think we can do a "scale to zero, but only for development". As soon as it exists, it needs to work for production and development at all times.

What (I think I understood) from the above probably works for you and your game, but may not work for everyone, so it requires a pretty thorough design, with consideration for all the race conditions that can occur. This is also noting that this system is not (mostly) imperative. It's a declarative, self-healing system with a set of decoupled systems working in concert - so it's a little bit trickier than saying "on allocation, just spin up a game server". Who has that responsibility? Is the allocator service now changing replicas in the Fleet? (which is otherwise the autoscaler's responsibility). What happens when the autoscaler collides with the allocation creating a new GameServer and removes it? Maybe we should change the min buffer on the autoscaler? But then, what tells it to scale back down? Uuurg, it gets very messy very quickly.

developing dev ops flow which depending on some activity scales to one and back seems complicated.

I am amused by this 😁 yes, this is complicated, that's why we've never really done it.

But I'd love to hear if people have detailed designs in mind that cover both scale up and scale down across all the integrated components 👍🏻

markmandel avatar Nov 15 '21 22:11 markmandel

my naive attempt

code design

The spec state of the fleet becomes a sum type:

enum FleetSize {
    YamlSpec(buffer, replicas),
    LiftedSpec(YamlSpec, lease_timeout)
}

fleet is 1

  • GSA request coming
  • all works as before

fleet is 0

  • GSA request is coming
  • Fleet spec is swapped into LiftedSpec
  • The Agones allocator allocates as if the spec's buffer value were lifted to 1

Ready

  • If the GS did not change state, A (the Agones allocator) swaps the spec back to the basic YamlSpec.
  • Deallocates

Allocated

  • A behaves as if the YamlSpec was changed from N to N - 1, and deallocates by the rules of that
  • So it does not deallocate until it is in the Allocated state
  • If a shutdown happened but the lease timeout has not passed, it is allocated again

fleet is 1, but YAML spec says 0

  • follow the rules of N to N - 1

what if the spec was changed to 0, and an allocation came along to change the LiftedSpec to 1

  • it is fine to deallocate and allocate a new one; not sure how this differs from N to N - 1 followed almost immediately by N - 1 to N?

fleet is 1+ always

  • the algorithm is disabled, so if bugs exist, they will be isolated to near-zero fleets, which are already debug fleets

concerns

  • implementing this externally via DevOps is more brittle and complex than doing it from within
  • a simple sum type with a lease may really help to handle this

What happens when the autoscaler collides with the allocation creating a new GameServer and removes it? Maybe we should change the min buffer on the autoscaler?

Would a collision be the same as if I changed the YAML manually to +1, then -1, and then +1?

But then, what tells it to scale back down?

The LiftedSpec lease timeout. The lease timeout should be at least 2x the allocation timeout. I would prefer that it can be set dynamically via the K8s API of A.

production

enum FleetSize {
    YamlSpec(buffer, replicas),
    LiftedZeroSpec(YamlSpec, lease_timeout),
    LiftedProductionSpec(YamlSpec, LeasedAllocatorFunction)
}

So in production I can then provide a lease to grow the buffer from 13 to 42 if the LeasedAllocatorFunction tells it to do so. The default LeasedAllocatorFunction would be ignorant of allocation requests, with the option to swap in an allocator function by name - for example, some simple linear interpolation over the last 3 minutes of allocations (the time needed to spawn a new VM and Docker container).
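
Purely to illustrate that last idea (all names here are hypothetical, not an Agones API): such an allocator function could extrapolate the recent allocation trend over the VM boot time to pick a buffer:

package main

import (
	"fmt"
	"time"
)

// estimateBuffer extrapolates a per-minute allocation trend forward by
// leadTime (roughly the time to boot a VM and pull the image) and uses the
// result as the buffer size.
func estimateBuffer(allocationsPerMinute []int, leadTime time.Duration) int {
	n := len(allocationsPerMinute)
	if n < 2 {
		return 1
	}
	// Average allocations gained per minute across the observed window.
	slope := float64(allocationsPerMinute[n-1]-allocationsPerMinute[0]) / float64(n-1)
	predicted := float64(allocationsPerMinute[n-1]) + slope*leadTime.Minutes()
	if predicted < 1 {
		return 0
	}
	return int(predicted + 0.5) // round to the nearest whole game server
}

func main() {
	// Three per-minute samples and a 2 minute VM lead time.
	fmt.Println(estimateBuffer([]int{13, 20, 28}, 2*time.Minute))
}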

I'm not sure what the hook in Agones is for buffer customization other than changing the spec via YAML, but again - it would be nice to have some algorithm built in; it may be similar tech to that used for near zero.

dzmitry-lahoda avatar Nov 16 '21 09:11 dzmitry-lahoda

For the use case of development, we thought about doing something like this, but we actually settled on an internal command (in our case it's in game, but you could do a Slack command/something else all the same) that creates a one-off GameServer instance that is programmatically applied to Kubernetes. This enabled us to run the full Agones workflow without having to create and manage a fleet and fleetautoscaler for those special instances.

theminecoder avatar Nov 19 '21 10:11 theminecoder

Yeah, we can encode it like: if the fleet exists, create from the fleet; if it does not exist, create some named GS. But who controls how many of such GSs are around? So we will build some limit on such servers. We also have cleanup per fleet (just kill long-running servers), which is another piece of logic. So it's kind of imitating a fleet with arbitrary Docker images. But we somehow need to define the Docker version; we have GitOps for fleets, so we will need some GitOps for these too, or extend the GS client to do so (pass a specific Docker image). It is possible, but it raises a different set of questions around a fleet-like feature which can have 0 instances, run any Docker image ad hoc, and has some kind of limit.

dzmitry-lahoda avatar Nov 19 '21 14:11 dzmitry-lahoda

I would either have users who are responsible (have access to the cluster and can allocate as they please, then clean up) and others who go only via the formalism of the fleet setup.

dzmitry-lahoda avatar Nov 19 '21 14:11 dzmitry-lahoda

But we can't guarantee that a game server will spin up before the allocation request times out (I've lost track if it's 30s or a minute) - which is locked from the K8s API.

@markmandel oh, I didn't know about that. Is there a way to track allocations? I noticed that they always spawn a gameserver in case there wasn't an available one, but the response doesn't always include a server name (I believe it happens if a game server didn't exist and it's a newly created one). So I believe it makes it impossible to reliably correlate allocations to game servers and track their status in this case.

Also, what makes a game allocation time out? My current understanding is that it will time out if a game server doesn't get promoted to Allocated, and one of the cases when that might happen is that a node suitable for the game server's pod doesn't boot up in time. Please correct me if I'm wrong.

And the last question I have: if an allocation request times out, will a game server get deleted as well? Or will it remain in Starting state until it finds a node?

If a game server allocation always responded with a game server name and game servers outlived timed out allocations, I believe the problem you've mentioned could be mitigated by letting clients wait for game servers to finally boot up and re-sending game allocations if needed.

mvlabat avatar Nov 19 '21 18:11 mvlabat

enum FleetSize { 
    YamlSpec(buffer, replicas),
    LiftedSpec(YamlSpec, least_timeout)
}

So I'm struggling with this for a variety of reasons:

  1. Go doesn't have sum types, so this doesn't really translate to the language the controllers are all written in.
  2. We shouldn't be switching out a declarative spec at runtime. This seems extremely counter to how Kubernetes works.

The more I think about this too, I don't think there should be blocking operations in an allocation. We already have a limited set of retries, but allocation tends to be a hot path, and I'm not comfortable putting blockers in its way.

To make things even MORE complicated: an Allocation is not tied to a Fleet. It has a set of preferential selectors, which can easily be cross Fleet, based on arbitrary labels, or used with singular GameServers -- so we can't even tie a Fleet spec/replicas/autoscalers to an Allocation either.

Again, I come back to webhooks. Only you know how to scale to 0 based on your matchmaking criteria (especially if you are scaling your nodes to 0, and need to account for node scale up time), will likely want to do it well before allocation happens, and you are the one that knows which Fleets to scale and when.

I don't think Agones can do that for you.

markmandel avatar Nov 22 '21 22:11 markmandel

Go doesn't have sum types, so this doesn't really translate to the language the controllers are all written in.

A sum type can be a concept, encoded as:

struct StateUnion {
    is_state_a : bool
    is_state_b : bool
    state_a : *SomeStuff
    state_b : *OtherStuff
}

Unsafe, but it can be coded as a pattern. A typical sum type in Go is the return of an error and a result value; usually when there is an error, there is no result.

We shouldn't be switching out a declarative spec at runtime. This seems extremely counter to how Kubernetes works.

That can be an internal detail: each time the API is requested, it responds with the original spec. I am more concerned with the internal state and handling it safely everywhere.

The more I think about this too, I don't think there should be blocking operations in an allocation. We already have a limited set of retries, but allocation tends to be a hot path, and I'm not comfortable putting blockers in its way

Sure, there should not be. But could there be an extra speculative heuristic to over-allocate?

To make things even MORE complicated: an Allocation is not tied to a Fleet. It has a set of preferential selectors, which can easily be cross Fleet, based on arbitrary labels, or used with singular GameServers -- so we can't even tie a Fleet spec/replicas/autoscalers to an Allocation either.

So an allocation would warm up several fleets from zero? That may be fine if that was what was requested, with selectors covering several fleets?

Only you know how to scale to 0 based on your matchmaking criteria (especially if you are scaling your nodes to 0, and need to account for node scale up time), will likely want to do it well before allocation happens, and you are the one that knows which Fleets to scale and when

I can do that, but I'm not sure what the MM should know. For example:

I have a VM with 4 GSs Allocated. There is room for a 5th, which is Ready, with a buffer of one. So right after that 5th is allocated, a new VM should warm up. Imagine that the buffer of VMs is zero. So the allocation request will fail in a loop (rate-limited client requests, with Open Match between the user and the allocator) until the VM boots up. By induction this may happen with any buffer size.

Non-blocking allocation with a non-infinite buffer of everything will fail and time out regardless of fleet size. We need to account for node scale-up time.

But yeah, it is now clearer which options can be used to imitate a zero-size fleet.

Another option: a Fleet with a CPU limit of near zero. In this case the fleet will never be able to run an instance. When an AR comes via the MM (before the allocate call to Agones), call K8s to increase the CPU limit, and after a timeout reduce it back. Store the timeout, time, and limits in Fleet annotations, so these can always be set right. But this requires write access to K8s :( So that would need to be isolated with RBAC and even a namespace.

dzmitry-lahoda avatar Nov 23 '21 07:11 dzmitry-lahoda

My 2p ... I think this isn't an Agones problem and will likely add more complexity and confusion for the majority of users ... it is more of a workflow/testing issue that could be different for every studio.

Much like @theminecoder, our studio has got around this issue using a workflow that easily spins fleets/game servers up and down via CI/CD; this achieves the same goal and keeps it out of the Agones codebase.

Even running UE4 game servers does not add much cost or complexity. There is some logic that means (the majority of) dev servers have all maps loaded rather than requiring server travel to different containers, but this is adequate for them and can easily be altered if server travel needs to be tested.

domgreen avatar Nov 23 '21 08:11 domgreen

@domgreen so everybody solves the same problem and builds a workaround. Why shouldn't it make it into Agones?

Workarounds create an additional attack vector. What if I wanted to allow running test GSs on the live environment? Having some other job which changes fleet settings or allocates wild servers does not look like something that doesn't require special care.

dzmitry-lahoda avatar Nov 23 '21 10:11 dzmitry-lahoda

I wanted to share my results of building a webhook fleet autoscaler, as @markmandel has suggested in one of the previous posts. The implementation turned out to be fairly simple.

In my setup, I have a matchmaker service, which clients connect to via WebSocket when they want to find a server to play. In order to track servers, my matchmaker subscribes to a kubernetes namespace and watches GameServer resources. My logic of determining how many servers are needed reads as follows:

// players waiting in the matchmaker plus players already on servers
let active_players = websocket_subscribers + servers.iter().map(|s| s.player_count).sum();
// round up to the number of servers needed to hold them
let desired_replicas = active_players.unstable_div_ceil(PLAYER_CAPACITY);

I use the assumption that PLAYER_CAPACITY is static and is equal for every GameServer, but this can easily be changed if your case requires more complex logic.

So, basically, the only complexity introduced over the initially discussed variant (where the matchmaker just calls the Agones/Kubernetes API to create an allocation) is spinning up an HTTP listener that responds with the desired replica count, instead of making the API calls yourself.

With all the mentioned problems unsolved (allocation retries, blocking, etc.), the webhook indeed sounds like a more reliable solution, and it allows for more flexibility.

If anyone's interested in a ready recipe for a matchmaker service written in Rust, I can share my example:

Disclaimer: my DevOps knowledge is quite limited, so read the config with caution if you decide to take inspiration from it. I can't promise the setup is effective and secure enough. :)

Another important note is that my current setup assumes only 1 replica for the matchmaker service. Fixing this limitation would require a more complex solution that would support sharing the state between replicas. Otherwise, different replicas will respond with different numbers of active WebSocket subscribers, and that can affect the desired fleet replica count.

I hope this helps.

mvlabat avatar Nov 25 '21 08:11 mvlabat

In order to track servers, my matchmaker subscribes to a kubernetes namespace and watches GameServer resources. My logic of determining how many servers are needed reads as follows:

Looking at the FleetStatus - the number of GameServers, the total player count, and the capacity are available. Does it not come through in the JSON?

https://agones.dev/site/docs/reference/agones_crd_api_reference/#agones.dev/v1.FleetStatus

markmandel avatar Nov 26 '21 00:11 markmandel

@markmandel I didn't check it tbh, but I don't have any reason to believe it doesn't work. My use-case requires watching game servers anyway, as I want to list their names, IP addresses and player count.

mvlabat avatar Nov 26 '21 06:11 mvlabat

Adding another small use case to this: it could be a minor convenience when manually deploying servers. E.g. you have a playtest one morning and deploy a Fleet + Fleet Autoscaler for it. You know that you'll be testing again tomorrow with the same build, but you don't want to leave the servers up for a day (to save money, or to block access). Instead of tearing down the fleet you just scale to 0, then scale back up the next day - you don't have to keep the Fleet/Fleet Autoscaler configuration handy for the second day, which is particularly useful if it's done by a different person. The answer to

what tells the system, "Hey, I'd like a Ready GameServer now, so I can do an allocation shortly" ?

in this case is a manual operator.

pgilfillan avatar Dec 08 '21 07:12 pgilfillan

This issue is marked as Stale due to inactivity for more than 30 days. To avoid being marked as 'stale' please add 'awaiting-maintainer' label or add a comment. Thank you for your contributions

github-actions[bot] avatar Aug 01 '23 10:08 github-actions[bot]

This issue is marked as obsolete due to inactivity for last 60 days. To avoid issue getting closed in next 30 days, please add a comment or add 'awaiting-maintainer' label. Thank you for your contributions

github-actions[bot] avatar Sep 01 '23 02:09 github-actions[bot]

Bumping to unstale

theminecoder avatar Sep 01 '23 08:09 theminecoder