Improve uuid performance
Each transaction is currently assigned a uuid v4. We use the uuid module, which internally uses crypto.randomBytes(16) to generate the uuid.
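For reference, the current approach boils down to something like this minimal sketch (variable names are illustrative, not the agent's actual call site):

```js
// Current approach (simplified): uuid v4 from the uuid module,
// which calls crypto.randomBytes(16) under the hood
const uuidv4 = require('uuid/v4')

const transactionId = uuidv4() // e.g. '3a017fc5-4f50-4db9-b0ce-4547ba0a1bfd'
```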
This is currently one of our biggest performance drains when creating new transactions, so I ran some benchmarks to see if there were any faster alternatives.
Here are the best (🔥) synchronous performers from the benchmark, compared to our current method (🎯):
| Method | Leaky | Format | Re-use | Cache | Ops/sec | RME |
|---|---|---|---|---|---|---|
| uuid/v1 | 💦 | uuid | | n/a | 1,074,518 | ±1.17% |
| uuid/v1 | 💦 | hex | | n/a | 404,325 | ±1.63% |
| uuid/v1 | 💦 | hex | ✅ | n/a | 545,523 | ±1.14% |
| uuid/v4 🎯 | | uuid | | n/a | 233,282 | ±2.16% |
| uuid/v4 | | hex | | n/a | 207,050 | ±4.65% |
| uuid/v4 | | hex | ✅ | n/a | 243,580 | ±3.50% |
| fast-uuid | 💦 | uuid | | n/a | 630,719 | ±0.83% |
| uuid-random 🔥 | | uuid | | 512 | 1,214,445 | ±0.42% |
| uuid-random 🔥 | | hex | | 512 | 1,159,229 | ±0.80% |
| sodium-uuid | | hex | | n/a | 214,087 | ±0.51% |
| sodium-uuid | | hex | ✅ | n/a | 220,409 | ±0.54% |
| hyperid 🔥 | 💦 | custom | | n/a | 8,843,573 | ±1.52% |
| crypto.randomBytes | | hex | | n/a | 243,894 | ±3.27% |
| crypto.randomBytes | | hex | | 256 | 932,165 | ±1.11% |
| crypto.randomBytes | | hex | ✅ | 256 | 897,471 | ±0.95% |
| crypto.randomBytes 🔥 | | hex | | 4096 | 1,221,396 | ±0.64% |
| crypto.randomBytes 🔥 | | hex | ✅ | 4096 | 1,140,619 | ±0.42% |
| crypto.randomBytes 🔥 | | hex | | 65536 | 1,199,742 | ±4.04% |
| crypto.randomBytes 🔥 | | hex | ✅ | 65536 | 1,163,009 | ±0.63% |
| crypto.randomFillSync | | hex | ✅ | n/a | 309,164 | ±0.52% |
| crypto.randomFillSync | | hex | | 256 | 1,053,831 | ±0.77% |
| crypto.randomFillSync | | hex | ✅ | 256 | 1,012,593 | ±0.52% |
| crypto.randomFillSync 🔥 | | hex | | 4096 | 1,318,476 | ±0.69% |
| crypto.randomFillSync 🔥 | | hex | ✅ | 4096 | 1,253,235 | ±0.39% |
| crypto.randomFillSync 🔥 | | hex | | 65536 | 1,294,847 | ±1.67% |
| crypto.randomFillSync 🔥 | | hex | ✅ | 65536 | 1,295,193 | ±1.05% |
| /dev/random | | hex | ✅ | n/a | 228,396 | ±0.34% |
| /dev/random | | hex | ✅ | 256 | 446,458 | ±0.62% |
| /dev/random | | hex | ✅ | 4096 | 482,517 | ±0.89% |
| /dev/random | | hex | ✅ | 65536 | 540,828 | ±0.75% |
| /dev/urandom | 💦 | hex | ✅ | n/a | 227,162 | ±0.72% |
| /dev/urandom | 💦 | hex | ✅ | 256 | 447,749 | ±0.46% |
| /dev/urandom | 💦 | hex | ✅ | 4096 | 490,723 | ±0.49% |
| /dev/urandom | 💦 | hex | ✅ | 65536 | 542,039 | ±0.87% |
The full results and the benchmark source code are available at https://github.com/watson/uuid-benchmark.
- Method: Name of the npm module, Node.js core function, or OS-based random generator used
- Leaky: A 💦 indicates that the method used to generate the 128 bit number leaks metadata (e.g. a sequence number)
- Format:
  - `uuid` - ID is a UUID formatted string representing a 128 bit number, e.g. 3a017fc5-4f50-4db9-b0ce-4547ba0a1bfd
  - `hex` - ID is a hex formatted string representing a 128 bit number, e.g. 3a017fc54f504db9b0ce4547ba0a1bfd
  - `custom` - ID is not a UUID or 128 bit number, but is still globally unique
- Re-use: Indicates an output buffer was re-used between each test to potentially reduce the number of objects that needed to be created
- Cache:
  - `n/a` - Only the number of bytes required to generate a 128 bit number was read into memory
  - `<Number>` - Number of bytes read into memory the first time a 128 bit number was requested. Subsequent runs use the leftover bytes in the cache until it has been depleted, at which point another chunk of bytes is read into memory
- Ops/sec: Number of UUIDs generated per second
- RME: The relative margin of error (expressed as a percentage of the mean)
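To make the cached crypto.randomFillSync rows concrete, here is a minimal sketch of that approach, assuming hex-formatted ids and a 4096 byte cache (the helper name and cache size are just examples, not the benchmark's exact code):

```js
const crypto = require('crypto')

// Pre-allocated cache of random bytes (4096 was among the fastest sizes above)
const CACHE_SIZE = 4096
const cache = Buffer.alloc(CACHE_SIZE)
let offset = CACHE_SIZE // start "empty" so the first call triggers a refill

// Return a 128 bit id as a 32-character hex string
function fastHexId () {
  if (offset + 16 > CACHE_SIZE) {
    crypto.randomFillSync(cache) // refill the whole cache in one call
    offset = 0
  }
  const id = cache.toString('hex', offset, offset + 16)
  offset += 16
  return id
}
```

Each call then only pays for a small buffer-to-hex conversion; the cost of asking the OS for randomness is amortized over CACHE_SIZE / 16 ids.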
Gotcha
The crypto.randomFillSync function was added in Node.js v7.10.0. For older versions crypto.randomBytes can be used instead.
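A small feature check along these lines should cover both cases (untested sketch, not what the agent currently does):

```js
const crypto = require('crypto')

// Use randomFillSync where available (Node.js >= 7.10.0); otherwise fall
// back to randomBytes and copy the result into the caller's buffer
const fillRandom = typeof crypto.randomFillSync === 'function'
  ? crypto.randomFillSync
  : function (buf) {
      crypto.randomBytes(buf.length).copy(buf)
      return buf
    }
```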
~~I'm not 100% certain that uuid-random is actually secure. The judgement is purely based on what's written in its readme.~~ I read the source code, and it looks fine.
Discussion topics
- Are we ok with leaking some types of metadata to gain speed? E.g. a sequential number
- Are there any privacy/safety issues in using a cache to speed up generating multiple random numbers?
Just updated the table with a few more benchmarks
Since commit 51da76bb6f3ca425 the code no longer uses uuid/v4 as far as I can see. In the current tip of master (f695e142d9444) there appears to instead be just a single call to crypto.randomBytes(16).toString('hex'), which sadly has the same performance profile as uuid/v4. So I suppose this issue is still relevant.
@watson is there a branch with an alternate UUID generator that one can try? We are evaluating Elastic APM and are seeing about a 50-70% increase in 95th percentile response times due to APM. If we turn off transaction spans this drops to 12-20%, but that is still high. We were expecting a 1-3% impact, not something this high!
@saurabhnanda, we saw the same thing when we started using this module. I think the most important knob here is transactionSampleRate. Try setting it to 0.1, e.g. via the env variable ELASTIC_APM_TRANSACTION_SAMPLE_RATE.
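If you'd rather not touch the environment, the same setting can be passed to the agent's start() call; the 0.1 is just the example value from above:

```js
// Equivalent to ELASTIC_APM_TRANSACTION_SAMPLE_RATE=0.1
require('elastic-apm-node').start({
  transactionSampleRate: 0.1
})
```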
@tsondergaard unfortunately reducing the sample rate is not something that we'd like to do. We also plan to use Elastic APM for logging all req/res in our production environment, so anything less than 100% (for transactionSampleRate) will force us to use yet another component for logging.
@saurabhnanda, all req/res will be recorded even with a reduced transactionSampleRate. If you set transactionSampleRate to 0.1 you will still have information about all HTTP requests and responses, including response codes; you will just have full traces for only 10% of those requests. My initial reaction to the performance of this module was exactly like yours, but I found out that a reduced sampling rate is not such a big deal because you still get all req/res recorded.
@tsondergaard thanks for the tip. Even the docs don't seem to explicitly mention this subtlety about transactionSampleRate.
What config are you using in production for the Node.js agent?
@saurabhnanda, the project I work on that uses the elastic-apm-node module is not in production yet, but the load test simulations run with a sampling rate of 0.1 to keep the overhead low enough that it does not significantly skew the results.
@tsondergaard what about captureSpanStackTraces? Do you have that enabled?
@saurabhnanda, yes, so far!
I don't think UUID generation is the cause of the problem you're describing, and there's no branch available where the UUID engine has been changed.
Have you checked out the Performance Tuning page in the docs?
This issue is still relevant and I'm seeing a 2-2.5x decrease in performance due to crypto.randomFillSync() being called on every transaction, regardless of sampling rate. I have opentelemetryBridgeEnabled set to true because I have a custom OTEL instrumentation.
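A minimal agent configuration matching that description would look something like this (sketch only; the real setup naturally has more options):

```js
// Agent started with the OpenTelemetry bridge enabled, as described above
require('elastic-apm-node').start({
  opentelemetryBridgeEnabled: true
})
```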