
[REQUEST] Investigate decompression times in AWS Lambda

Open aleehedl opened this issue 1 year ago • 11 comments

What would you like to have implemented?

Clarification on the decompression times

Why would it be useful?

The README has a table showing various inflation times. However, when this inflation runs in AWS Lambda it is significantly slower. I'm guessing the benchmark was run on a local machine? For me, with a 512 MB Lambda the inflation can take up to 10 seconds. Increasing the memory to 2 GB can cut the time in half, but I wasn't able to get any lower than that. Is this to be expected?

If I wanted to avoid the decompression step altogether should I consider another approach completely? Such as running Lambda with a custom Docker image with Chrome included?

aleehedl avatar Mar 27 '23 18:03 aleehedl

I'm not sure. That whole table is from alixaxel's repo, so I don't know where the timings come from. Personally, I get around 5 seconds during testing, and around 4 seconds if I disable the swiftshader extract, which seems to track with what you are saying. (I run 3 GB; maybe I can drop it down to 2 GB.)

Regardless, that slowdown should only be on cold starts. I'm not sure whether the slowdown is from lambda grabbing the code to run it during a cold start, or whether it's this package extracting the brotli files. I suppose it's probably both.

To solve the problem of avoiding decompression altogether, there are a few options.

  1. As you suggest, a custom lambda image with a browser included. This will also affect cold start times.
  2. Keep the lambda container warm using Provisioned ~~Capacity~~ Concurrency, thus only affecting the absolute first call to the function.

There are probably others, such as allowing for an uncompressed chromium binary, but that's 150 MB, which might be slower than the brotli file.

Sparticuz avatar Mar 27 '23 18:03 Sparticuz

I just looked at my production cloudwatch logs and I'm getting ~5-6 seconds for a cold start, and around 250ms for a warm start... I may look into provisioned concurrency myself!
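
For anyone who wants to collect comparable numbers, here is a minimal sketch of one way to tag invocations as cold or warm and time the launch. It is not taken from the production setup mentioned above. `timedLaunch` and the `launch` callback are hypothetical names, and `launch` stands in for whatever starts the browser in your handler.

```javascript
const { performance } = require("node:perf_hooks");

// Module scope survives between invocations of a warm Lambda container,
// so this flag is true only for the first invocation after a cold start.
let isColdStart = true;

/**
 * Runs `launch` (hypothetical: whatever starts the browser) and reports
 * how long it took and whether this was the first call in this container.
 */
async function timedLaunch(launch) {
  const cold = isColdStart;
  isColdStart = false;

  const start = performance.now();
  const result = await launch();
  const ms = performance.now() - start;

  // One structured log line per invocation is easy to filter in CloudWatch.
  console.log(JSON.stringify({ cold, launchMs: Math.round(ms) }));
  return { result, cold, ms };
}
```

Filtering the resulting log lines on `cold` would separate the two cases without relying on Lambda's own init duration report.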

Sparticuz avatar Mar 27 '23 19:03 Sparticuz

Another option may be to mount a VFS, which seems to not incur that much of a hit on starts. Then have the chromium binary already there and extracted. LMK if you do this and I'll add instructions
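
A minimal sketch of that idea, assuming the file system is already mounted and holds an extracted binary. The mount path, the `CHROMIUM_PATH` variable, and `resolveExecutablePath` are hypothetical names chosen for illustration, and nobody in this thread has confirmed the approach yet.

```javascript
const fs = require("node:fs");

// Hypothetical location of the pre-extracted binary on the mounted file system.
const PREEXTRACTED_PATH = process.env.CHROMIUM_PATH || "/mnt/chromium/chromium";

/**
 * Returns `candidate` if it is an executable regular file. Otherwise it
 * awaits `fallback()`, e.g. the usual decompress-to-/tmp step.
 */
async function resolveExecutablePath(candidate, fallback) {
  try {
    fs.accessSync(candidate, fs.constants.X_OK);
    if (fs.statSync(candidate).isFile()) {
      return candidate;
    }
  } catch {
    // Not mounted, missing, or not executable: fall through to the fallback.
  }
  return fallback();
}
```

The fallback keeps the function working if the mount is unavailable; in a handler it could be a function wrapping this package's own executable path lookup.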

Sparticuz avatar Mar 27 '23 19:03 Sparticuz

Thanks for the response! In my use case the provisioned concurrency approach doesn't sound like the best choice. As a PoC I tried building a custom docker image with unpacked Chromium (from this package) residing next to the actual function code (that is, /var/task/tmp/chromium as executablePath). I got it working but the cold start time was in the same ballpark as previously 😄 Around 5 seconds with 2 GB memory.

The file system approach sounds interesting so I will try it next. I have no previous experience with EFS so we'll see how it goes.

aleehedl avatar Mar 28 '23 05:03 aleehedl

Same here, it takes around 3-4 seconds to decompress the brotli files and launch the browser, compared to 80 ms when warm (and thus already decompressed). That's with a 4 GB Lambda and without counting the cold start (so not counting the layer being pulled). I would argue that, when used with a layer, it's probably better to have the uncompressed version in the layer. The cold start will be slightly longer (bigger layer to pull), but I bet that will be unnoticeable compared to the speedup from not having to decompress.

EHadoux avatar Apr 28 '23 18:04 EHadoux

The problem we might run into eventually is layer size. When I get some time, I might look into building an already decompressed layer and see if it's feasible

Sparticuz avatar Apr 28 '23 18:04 Sparticuz

Has anyone tried the VFS solution?

maximang avatar May 30 '23 12:05 maximang

> The problem we might run into eventually is layer size. When I get some time, I might look into building an already decompressed layer and see if it's feasible

+1

pranavburnwal avatar Oct 04 '23 05:10 pranavburnwal

@pranavburnwal Do you have any findings on this topic?

owen800q avatar Nov 29 '23 08:11 owen800q

> @pranavburnwal Do you have any findings on this topic?

Nope, I didn't investigate much after that. I solved this via https://github.com/Sparticuz/chromium/blob/master/examples/remote-min-binary/index.js#L9, pulling the binary from S3 on a cold start and saving it to /tmp for later use.
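
The download-once-then-reuse pattern can be sketched roughly as follows. This is an illustration of the general idea, not the code from the linked example: `ensureCached` and the `download` callback are hypothetical names, and `download` stands in for whatever writes the S3 object to the given path.

```javascript
const fsp = require("node:fs/promises");

// One in-flight download per target, so concurrent callers share it.
const pending = new Map();

/**
 * Resolves to `target` once that file exists. If it is missing, calls the
 * caller-supplied `download(tmpPath)` (hypothetical: e.g. an S3 object
 * streamed to disk) and then renames the finished file into place.
 */
async function ensureCached(target, download) {
  try {
    await fsp.access(target);
    return target; // Warm container: an earlier invocation already fetched it.
  } catch {
    // Missing: fall through and download.
  }

  if (!pending.has(target)) {
    // Write to a temporary name first so a partial file is never used.
    const tmp = `${target}.${process.pid}.part`;
    const job = Promise.resolve()
      .then(() => download(tmp))
      .then(() => fsp.rename(tmp, target))
      .then(() => target)
      .finally(() => pending.delete(target));
    pending.set(target, job);
  }
  return pending.get(target);
}
```

Because /tmp persists while a container stays warm, only the first invocation after a cold start would pay for the download.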

pranavburnwal avatar Nov 29 '23 11:11 pranavburnwal

Just FYI, a Lambda layer backed by S3 storage works well for me, size not an issue.

Sleepful avatar Jul 23 '24 21:07 Sleepful