
Rerun the benchmark?

perpil opened this issue 3 months ago · 2 comments

Is it possible to rerun the benchmark with the latest release? Since January 2024, the llrt binaries have grown by 2+ MB. I've been doing some cold start benchmarking to compare LLRT (standard SDK) with Node 22. My test, using arm64 in us-east-2, instantiates an STS client and invokes get-caller-identity (a sketch of the handler appears after the table below). It isn't quite the same as a ddb put, but it's similar enough that I'd expect the http and λ times to be close to yours on cold starts. In the table below, http means the time recorded from the client and λ is the time from the invocation logs, like yours; overhead means http − λ. I'm noticing that the cold start overhead is about 50 ms higher for LLRT than for the Node 22 runtime. I also notice that my p50 for http is ~40 ms higher than your benchmark, while λ is only ~17 ms higher. I suspect the increase in binary size is the main contributor, but if you can rerun the benchmarks, that will help narrow down what is causing the deltas.

All times are in ms.

| Test | Memory (MB) | Samples | p0 (http) | p50 (http) | p99 (http) | p50 (overhead) | p0 (λ) | p50 (λ) | p99 (λ) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LLRT v.0.7.0-beta | 128 | 140 | 186 | 275 | 357 | 193 | 61 | 81 | 128 |
| Node 22 AwsLite | 512 | 140 | 344 | 409 | 492 | 144 | 208 | 260 | 326 |
| Node 22 v3 SDK w/optimizations | 512 | 140 | 369 | 447 | 543 | 144 | 248 | 300 | 363 |
| Node 22 v3 SDK minified | 512 | 140 | 460 | 547 | 640 | 151 | 320 | 393 | 464 |
| Node 22 v3 SDK from disk | 512 | 140 | 557 | 681 | 788 | 145 | 444 | 535 | 633 |
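
For reference, here is a minimal sketch of the kind of handler this test uses (my reconstruction, not the actual benchmark code; it assumes the standard AWS SDK v3 STS client and an ESM bundle so the call can run at top level):

```js
// Hypothetical reconstruction of the test described above: the STS call runs
// at top level, i.e. during the init phase, before the first invocation.
import { STSClient, GetCallerIdentityCommand } from "@aws-sdk/client-sts";

const sts = new STSClient({});
const identity = await sts.send(new GetCallerIdentityCommand({}));

export const handler = async () => ({
  statusCode: 200,
  body: identity.Account,
});
```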

perpil · Sep 25 '25, 20:09

Hi @perpil. Sure thing, we should rerun the benchmark. You're also right that the overall increase in cold start time is due to binary size, via disk read performance (and decompression) on Lambda.

I also think you should benchmark with similar memory sizes. In your current test, Node has 4x more memory, which also means 4x more CPU (and CPU time), which makes the comparison very biased.

Also, get-caller-identity for STS is quite slow, which means a lot of the duration (for both Node and LLRT) is spent waiting, and that affects the test. For example, say there is 100 ms of latency for the call, and LLRT finishes in 125 ms while Node takes 200 ms. On paper this makes LLRT 60% faster (200/125), but once you subtract the shared 100 ms of waiting, it's actually 4x faster (100/25).
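
To make the arithmetic concrete, here is the same hypothetical example as a few lines of JavaScript (the numbers are the illustrative ones from the paragraph above, not measurements):

```js
const latency = 100;          // ms both runtimes spend waiting on the network
const llrt = 125, node = 200; // hypothetical end-to-end durations in ms

console.log(node / llrt);                         // 1.6 -> "60% faster" on paper
console.log((node - latency) / (llrt - latency)); // 4   -> 4x faster in actual compute
```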

Without a closer look at the benchmark bundle files on Node, I also suspect that the overhead comes from Node shipping pure JS code vs. LLRT shipping the executable plus JS code.

That being said, a lot of the recent size increase in LLRT comes from the WebCrypto APIs, which are implemented in pure Rust. We're working on building variants of LLRT that use statically linked or runtime-loaded OpenSSL instead. This will significantly shrink the binary size and only introduce a cold start penalty if you actually use the WebCrypto APIs, which most use cases don't. Additionally, that increased cold start time is offset by the higher performance of OpenSSL and libcrypto vs. pure Rust crypto implementations. The downside of this approach is that it requires OpenSSL/libcrypto to exist (which it does in Lambda). That's why different versions of LLRT will use different crypto providers; once this feature is completed, the Lambda version will dynamically load OpenSSL symbols at runtime, on demand.
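
For context, the kind of call that would pay that on-demand load under the proposed design is any use of the standard WebCrypto API, for example:

```js
// Standard WebCrypto usage; under the design described above, only code
// paths like this would trigger loading OpenSSL symbols at runtime.
const data = new TextEncoder().encode("hello");
const digest = await crypto.subtle.digest("SHA-256", data);
console.log(Buffer.from(digest).toString("hex"));
```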

richarddavison · Sep 30 '25, 09:09

Thanks @richarddavison, all good points.

> I also think you should benchmark with similar memory sizes. In your current test, Node has 4x more memory, which also means 4x more CPU (and CPU time), which makes the comparison very biased.

I'm mainly trying to compare v.0.7.0-beta to the current llrt benchmark in the readme, which uses 128 MB. I'm making the call to STS in init (vs. in the handler, like your benchmark does), so CPU should be the 1 full vCPU regardless of memory size. Agreed that I should use the same amount of memory for llrt and Node if I wanted an apples-to-apples comparison.

> Also, get-caller-identity for STS is quite slow, which means a lot of the duration (for both Node and LLRT) is spent waiting, and that affects the test. For example, say there is 100 ms of latency for the call, and LLRT finishes in 125 ms while Node takes 200 ms. On paper this makes LLRT 60% faster (200/125), but once you subtract the shared 100 ms of waiting, it's actually 4x faster (100/25).

I don't think get-caller-identity is a particularly heavy operation, but this is a valid point. As I understand it, it shouldn't be much slower than the auth latency of any authenticated AWS call. The maxday website is down at the moment, so I can't compare p50 times between my benchmark and raw llrt, but what I recall is that the API's E2E latency is on the order of 10-15 ms including SSL.
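
One way to sanity-check that number would be a sketch like the following (assuming the same SDK v3 client as in my handler above; performance.now() is available in both runtimes):

```js
import { STSClient, GetCallerIdentityCommand } from "@aws-sdk/client-sts";

const sts = new STSClient({});
await sts.send(new GetCallerIdentityCommand({})); // warm up TLS + credentials

const start = performance.now();
await sts.send(new GetCallerIdentityCommand({}));
console.log(`warm GetCallerIdentity: ${(performance.now() - start).toFixed(1)} ms`);
```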

> Without a closer look at the benchmark bundle files on Node, I also suspect that the overhead comes from Node shipping pure JS code vs. LLRT shipping the executable plus JS code.

I suspect the same. I think I'm seeing a 4.5 MB package size for llrt vs. 77 KB for bundled Node.

perpil · Sep 30 '25, 16:09