
Scale to zero

Open simonw opened this issue 3 years ago • 64 comments

I have a serious side-project habit - I often have dozens of side projects on the go at once.

As such, I really appreciate scale-to-zero services like Google Cloud Run and Vercel, where if my project isn't getting any traffic at all it costs me nothing (or just a few cents a month in storage costs) - then it spins up a server when a request comes in, with a cold-start delay of a few seconds before it starts serving traffic.

I would love it if App Runner could do this! It looks like at the moment you have to pay for a minimum of one running instance.

simonw avatar May 19 '21 01:05 simonw

Scale to zero is really important for small projects that don't need compute running 24/7, and especially for contractor work. It also matters for microservices that don't need to run all the time, and for side projects where someone wants a full container and is willing to deal with cold starts (like Lambda, but without being constrained to API Gateway or Lambda tooling in the container).

Scale to zero is the only thing that prevents me from using it.

danthegoodman1 avatar May 19 '21 15:05 danthegoodman1

This could also work very well for light batch jobs if it could scale to zero.

danthegoodman1 avatar May 19 '21 15:05 danthegoodman1

As far as I can tell it scales down to just $0.007 per GB-hour when the application is idle, with no vCPU cost. Or there is a PauseService API call to eliminate that cost too. If you had a batch job, you could call ResumeService at the start and PauseService at the end?

timanderson avatar May 19 '21 15:05 timanderson
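The pause/resume pattern suggested above could be scripted. A minimal sketch using boto3's App Runner client (PauseService, ResumeService, and DescribeService are real API operations; the service ARN and job callable are placeholders, and the client is passed in rather than constructed so the flow is easy to test):

```python
import time

def run_batch_job(apprunner, service_arn, job):
    """Resume a paused App Runner service, run the job, then pause it again."""
    apprunner.resume_service(ServiceArn=service_arn)
    try:
        # Poll until the service reports RUNNING before sending it work.
        while apprunner.describe_service(ServiceArn=service_arn)["Service"]["Status"] != "RUNNING":
            time.sleep(5)
        job()
    finally:
        # Pause again so only memory/storage charges accrue while idle.
        apprunner.pause_service(ServiceArn=service_arn)

# usage (requires boto3 and AWS credentials; ARN is a placeholder):
# import boto3
# run_batch_job(boto3.client("apprunner"), "arn:aws:apprunner:...", my_job)
```

As the follow-up comment notes, this is workable for batch but awkward for an API, where you would need the platform to do the resume on your behalf when a request arrives.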

Yeah, but that's still about $5/month more than Cloud Run, and with such a heavily managed service I wouldn't want to automate pausing and resuming myself (that would be fine for batch, but very difficult for an API).

danthegoodman1 avatar May 19 '21 15:05 danthegoodman1

What should happen when a request arrives while instances are at zero? Currently (if the service is paused) the root URL returns an HTTP 404. Should the endpoint hold the request for a while to give an instance a chance to spawn and respond?

Munawwar avatar May 19 '21 16:05 Munawwar

What should happen when a request arrives while instances are at zero? Currently (if the service is paused) the root URL returns an HTTP 404. Should the endpoint hold the request for a while to give an instance a chance to spawn and respond?

Yeah, basically a cold start similar to Lambda. That's how Cloud Run does it.

danthegoodman1 avatar May 19 '21 16:05 danthegoodman1

I'll also add that it would be really good for many users not to be forced to scale to zero the way Cloud Run does. For example, a "minimum containers" field, similar to Lambda provisioned concurrency, because there are use cases where you never want a cold start.

danthegoodman1 avatar May 19 '21 16:05 danthegoodman1

@danthegoodman1 you can configure the minimum "provisioned" containers which stay active (paying for memory only, not CPU), except you can only set that to >= 1.

It would be nice if you could leave it at 1 to avoid cold starts, or set it to 0 to optimize for cost if you're OK with some latency on the first request after the service has scaled down.

mwarkentin avatar May 19 '21 17:05 mwarkentin
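The "minimum provisioned containers" knob described above lives in App Runner's auto scaling configuration. A hedged sketch of building the parameters for the `create_auto_scaling_configuration` API via boto3 (the configuration name and defaults are illustrative); the `MinSize >= 1` check mirrors exactly the limitation this thread is asking to lift:

```python
def min_instances_params(min_size=1, max_size=25, max_concurrency=100):
    """Build parameters for App Runner's create_auto_scaling_configuration."""
    if min_size < 1:
        # A MinSize of 0 is what scale to zero would require;
        # the API rejects it today, which is what this issue is about.
        raise ValueError("App Runner currently requires MinSize >= 1")
    return {
        "AutoScalingConfigurationName": "low-traffic",  # illustrative name
        "MinSize": min_size,          # provisioned instances (memory billed)
        "MaxSize": max_size,          # upper bound on active instances
        "MaxConcurrency": max_concurrency,  # requests per instance before scaling
    }

# usage (requires boto3 and AWS credentials):
# import boto3
# boto3.client("apprunner").create_auto_scaling_configuration(**min_instances_params())
```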

@danthegoodman1 You aren't forced to scale to zero in Cloud Run and can set minimum instances that even charge/run at the "idle rate":

https://cloud.google.com/run/docs/configuring/min-instances

nelsonjchen avatar May 19 '21 17:05 nelsonjchen

@danthegoodman1 You aren't forced to scale to zero in Cloud Run and can set minimum instances that even charge/run at the "idle rate":

https://cloud.google.com/run/docs/configuring/min-instances

This used to be available only with Cloud Run for Anthos; I didn't realize they'd updated it. Thanks.

danthegoodman1 avatar May 19 '21 18:05 danthegoodman1

@danthegoodman1 you can configure the minimum "provisioned" containers which stay active (paying for memory only, not CPU), except you can only set that to >= 1.

It would be nice if you could leave it at 1 to avoid cold starts, or set it to 0 to optimize for cost if you're OK with some latency on the first request after the service has scaled down.

Yep just wanted to make sure we still kept that feature just in case!

danthegoodman1 avatar May 19 '21 18:05 danthegoodman1

I would use scale to zero for dev stacks. Also, if I want to show a new version of my app to a customer, they can look at it whenever they want.

flibustenet avatar May 19 '21 20:05 flibustenet

Apart from smaller projects, scale to zero would be super useful for development workflows. Imagine many developers deploying code branches for testing. Right now, they would have to consciously deprovision the service when they are not working on it.

With scale to zero, there would be no costs when they aren’t working (= not sending requests to their personal deployment). And cold start latency isn’t relevant in this scenario.

486 avatar May 19 '21 23:05 486

I'm surprised this hasn't been mentioned before, but a turnoff of GCP is that they don't have an "Amazon Aurora Serverless" equivalent to go along with Cloud Run.

Scale to Zero App Runner + Amazon Aurora Serverless would be a dream.

nelsonjchen avatar May 20 '21 00:05 nelsonjchen

I'm surprised this hasn't been mentioned before but a turnoff of GCP is that they don't have an "Amazon Aurora Serverless" equivalent to go along with Cloud Run.

Scale to Zero AppRunner + Amazon Aurora Serverless would be a dream.

300 IQ right there, that's an awesome idea. DynamoDB would work too, but Aurora Serverless (v2 Postgres, please) would be a real differentiator.

danthegoodman1 avatar May 20 '21 00:05 danthegoodman1

Seconding that 300 IQ statement. We need that! GCP Cloud Run is ahead. Don't make us use their product.

tomaszdudek7 avatar May 20 '21 08:05 tomaszdudek7

This is also amazing for Slack bots btw. I'd move our team Slack bot here in an instant if we could scale to 0.

danthegoodman1 avatar May 20 '21 15:05 danthegoodman1

@nelsonjchen @danthegoodman1 Why is it not yet possible to use App Runner + Aurora PostgreSQL Serverless?

I am trying to deploy an API app on App Runner, and the API is supposed to connect to the Aurora PostgreSQL Serverless endpoint, but I can't get it to connect (locally or on App Runner). Does this mean it doesn't work?

stephanoparaskeva avatar Jun 06 '21 11:06 stephanoparaskeva

@stephanoparaskeva make sure you're in the same VPC and have the proper security groups and route tables. In your case, since you're trying to test locally, it sounds like you need to enable public access (although please don't do that in production).

danthegoodman1 avatar Jun 06 '21 11:06 danthegoodman1

@stephanoparaskeva Aurora Serverless can only be accessed from within a VPC via a private IP. App Runner does not support VPC integration yet; however, it is on the roadmap: https://github.com/aws/apprunner-roadmap/issues/1

pavelsource avatar Jun 06 '21 11:06 pavelsource

@nelsonjchen @danthegoodman1 Why is it not yet possible to use App Runner + Aurora PostgreSQL Serverless?

I am trying to deploy an API app on App Runner, and the API is supposed to connect to the Aurora PostgreSQL Serverless endpoint, but I can't get it to connect (locally or on App Runner). Does this mean it doesn't work?

I don't think I said it wasn't possible. You likely have connectivity issues unrelated to this scale-to-zero issue, like @danthegoodman1 said.

nelsonjchen avatar Jun 06 '21 11:06 nelsonjchen

@stephanoparaskeva Aurora Serverless can only work within a VPC with a private IP. App Runner does not support VPC integration yet; however, it is on the roadmap: #1

Ah ok so I should use a public DB for the time being.

  • Once VPC is released -- if both App Runner and Aurora are in the same VPC, should it just connect via endpoint + user + password?

  • Also how does one connect to Aurora using a locally running version of their API (is this possible)?

Thanks for the swift response!

stephanoparaskeva avatar Jun 06 '21 11:06 stephanoparaskeva

@stephanoparaskeva Aurora Serverless can only work within a VPC with a private IP. App Runner does not support VPC integration yet; however, it is on the roadmap: #1

Ah ok so I should use a public DB for the time being.

  • Once VPC is released -- if both App Runner and Aurora are in the same VPC, should it just connect via endpoint + user + password?

  • Also how does one connect to Aurora using a locally running version of their API (is this possible)?

Thanks for the swift response!

Could you take this conversation elsewhere? This issue is about scale to zero.

nelsonjchen avatar Jun 06 '21 11:06 nelsonjchen

Just for clarification: you can use Aurora Serverless v1 without a VPC by using its Data API :)

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/data-api.html

toricls avatar Jun 06 '21 11:06 toricls
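The Data API works over HTTPS, which is why it sidesteps the VPC requirement discussed above. A minimal sketch of querying Aurora Serverless v1 through boto3's `rds-data` client (`execute_statement` is the real operation; the cluster ARN, secret ARN, and database name are placeholders, and the client is injected for testability):

```python
def query(rds_data, sql, cluster_arn, secret_arn, database):
    """Run one SQL statement via the RDS Data API and return the raw records."""
    resp = rds_data.execute_statement(
        resourceArn=cluster_arn,  # Aurora Serverless v1 cluster ARN
        secretArn=secret_arn,     # Secrets Manager ARN holding DB credentials
        database=database,
        sql=sql,
    )
    return resp.get("records", [])

# usage (requires boto3, AWS credentials, and the Data API enabled on the cluster):
# import boto3
# rows = query(boto3.client("rds-data"), "SELECT 1", CLUSTER_ARN, SECRET_ARN, "mydb")
```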

I think there's some relevant discussion in https://github.com/aws/containers-roadmap/issues/1017 on scale-to-zero for Fargate, which is probably applicable here too. I made a more detailed post re: Fargate, but summing up quickly: a few seconds (or maybe more) of cold-start latency would be OK for me, and I can send a ping to the service to mitigate that before hitting it full-force.

JonMarbach avatar Jun 17 '21 00:06 JonMarbach
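The "ping before hitting it full-force" idea above can be sketched with nothing but the standard library: poll the service URL until anything answers, absorbing the cold start before real traffic arrives. This is a generic warm-up helper, not an App Runner API; the timeouts are illustrative.

```python
import time
import urllib.error
import urllib.request

def warm_up(url, timeout=60.0, interval=2.0):
    """Ping `url` until it answers; True on any response, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True  # got a response; an instance is serving
        except urllib.error.HTTPError:
            return True      # an error status still means something answered
        except (urllib.error.URLError, OSError):
            time.sleep(interval)  # connection failed; still cold, retry
    return False
```

Calling `warm_up("https://myservice.example/")` before a batch of requests gives a scaled-down service time to spin up, at the cost of one throwaway request.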

Just adding my 2 cents. Lots of back-office and data applications could not care less about latency, and being able to scale to zero would be amazing for many use cases. This could be a non-default, configurable feature, and the UI could say explicitly how scaling memory to zero affects latency.

CarlosDomingues avatar Jan 14 '22 18:01 CarlosDomingues

I currently use Heroku for this use case; it scales to zero on free accounts.

I haven't tried it yet, but since Heroku runs on AWS it could have low latency to Aurora Serverless databases.

iBobik avatar Jan 31 '22 19:01 iBobik

I think the "what about the cold start" argument is beside the point: you should expect cold starts in any service that provides autoscaling; that's just the way programs work (especially ones running in a virtual machine).

Our company's use case: we want our applications to be serverless, but we don't really want to switch our development model to Lambda's. We want to build traditional microservices that can handle more than one request at a time, but we also want our dev and staging environments to cost less when they're not being used.

siviae avatar Feb 08 '22 08:02 siviae

If App Runner doesn't scale to zero, why does mine show 0 instances when there is no traffic? My minimum is also set to 1.

phishy avatar Mar 22 '22 20:03 phishy

@phishy App Runner can scale active instances to zero. However, the minimum is 1 provisioned (inactive) instance, and you are charged for the memory of inactive instances.

jvisker avatar Mar 30 '22 19:03 jvisker