apm-agent-nodejs icon indicating copy to clipboard operation
apm-agent-nodejs copied to clipboard

Improve uuid performance

Open watson opened this issue 7 years ago • 12 comments

Each transaction is currently assigned a uuid v4. We use the uuid module which internally uses crypto.randomBytes(16) to generate the uuid.

This is currently one of our biggest performance drains when creating new transactions, so I ran some benchmarks to see if there were any faster alternatives.

Here's the best(🔥) synchronous performers of the benchmark compared our current method(🎯):

Method Leaky Format Re-use Cache Ops/sec RME
uuid/v1 💦 uuid n/a 1,074,518 ±1.17%
uuid/v1 💦 hex n/a 404,325 ±1.63%
uuid/v1 💦 hex n/a 545,523 ±1.14%
uuid/v4🎯 uuid n/a 233,282 ±2.16%
uuid/v4 hex n/a 207,050 ±4.65%
uuid/v4 hex n/a 243,580 ±3.50%
fast-uuid 💦 uuid n/a 630,719 ±0.83%
uuid-random🔥 uuid 512 1,214,445 ±0.42%
uuid-random🔥 hex 512 1,159,229 ±0.80%
sodium-uuid hex n/a 214,087 ±0.51%
sodium-uuid hex n/a 220,409 ±0.54%
hyperid🔥 💦 custom n/a 8,843,573 ±1.52%
crypto.randomBytes hex n/a 243,894 ±3.27%
crypto.randomBytes hex 256 932,165 ±1.11%
crypto.randomBytes hex 256 897,471 ±0.95%
crypto.randomBytes🔥 hex 4096 1,221,396 ±0.64%
crypto.randomBytes🔥 hex 4096 1,140,619 ±0.42%
crypto.randomBytes🔥 hex 65536 1,199,742 ±4.04%
crypto.randomBytes🔥 hex 65536 1,163,009 ±0.63%
crypto.randomFillSync hex n/a 309,164 ±0.52%
crypto.randomFillSync hex 256 1,053,831 ±0.77%
crypto.randomFillSync hex 256 1,012,593 ±0.52%
crypto.randomFillSync🔥 hex 4096 1,318,476 ±0.69%
crypto.randomFillSync🔥 hex 4096 1,253,235 ±0.39%
crypto.randomFillSync🔥 hex 65536 1,294,847 ±1.67%
crypto.randomFillSync🔥 hex 65536 1,295,193 ±1.05%
/dev/random hex n/a 228,396 ±0.34%
/dev/random hex 256 446,458 ±0.62%
/dev/random hex 4096 482,517 ±0.89%
/dev/random hex 65536 540,828 ±0.75%
/dev/urandom 💦 hex n/a 227,162 ±0.72%
/dev/urandom 💦 hex 256 447,749 ±0.46%
/dev/urandom 💦 hex 4096 490,723 ±0.49%
/dev/urandom 💦 hex 65536 542,039 ±0.87%

View all the results here and the source code for the benchmark here: https://github.com/watson/uuid-benchmark

  • Method: Name of npm module, Node.js core function, or OS based random generator used
  • Leaky: Indicates that the method used to generate the 128 bit number doesn't leak metadata
  • Format:
    • uuid - ID is a UUID formatted string representing a 128 bit number, e.g. 3a017fc5-4f50-4db9-b0ce-4547ba0a1bfd
    • hex - ID is a hex formatted string representing a 128 bit number, e.g. 3a017fc54f504db9b0ce4547ba0a1bfd
    • custom - ID is not a UUID or 128 bit number, but is still globally unique
  • Re-use: Indicates an output buffer was re-used between each test to potentially reduce the number of objects that needed to be created
  • Cache:
    • n/a - Only the amount of bytes required to generate a 128 bit number was read into memory
    • <Number> - Number of bytes read into memory the first time a 128 bit number was requested. Subsequent runs would use the leftover bytes in the cache until it had been depleted, at which time another chunk of bytes would be read into memory
  • Ops/sec: Number of UUID's generated per second
  • RME: The relative margin of error (expressed as a percentage of the mean)

Gotcha

The crypto.randomFillSync function was added in Node.js v7.10.0. For older versions crypto.randomBytes can be used instead.

~~I'm not 100% certain that uuid-random is actually secure. The judgement is purely based on what's written in its readme.~~ I read the source code, and it looks fine.

Discussion topics

  • Are we ok with leaking some types of metadata to gain speed? E.g. a sequential number
  • Are there any privacy/safety issues in using a cache to speed up generating multiple random numbers?

watson avatar Aug 12 '18 12:08 watson

Just updated the table with a few more benchmarks

watson avatar Aug 12 '18 15:08 watson

Since commit 51da76bb6f3ca425 the code no longer uses uuid/v4 as far as I can see. In the current tip of master (f695e142d9444) there appears to instead be just a single call to crypto.randomBytes(16).toString('hex'), which sadly has the same performance profile as uuid/v4. So I suppose this issue is still relevant.

tsondergaard avatar Aug 06 '20 10:08 tsondergaard

@watson is there a branch with an alternate UUID generator that one can try? We are evaluating Elastic APM and are seeing about 50-70% increase in 95 percentile response times due to APM. If we turn off transaction spans this is 12-20%, but it is still high. We were expecting a 1-3% impact, not this high!

saurabhnanda avatar Feb 02 '21 07:02 saurabhnanda

@saurabhnanda, we saw the same thing when we started using this module. I think the most important knob here is transactionSampleRate. Try setting it e.g via env variable ELASTIC_APM_TRANSACTION_SAMPLE_RATE to 0.1.

tsondergaard avatar Feb 02 '21 08:02 tsondergaard

@tsondergaard unfortunately reducing the sample rate is not something that we'd like to do. We also plan to use Elastic APM for logging all req/res in our production environment, so anything less than 100% (for transactionSampleRate) will force us to use yet another component for logging.

saurabhnanda avatar Feb 02 '21 08:02 saurabhnanda

@saurabhnanda, all req/res will be recorded even with a reduced transactionSampleRate. If you set transactionSampleRate to 0.1 you will still have information about all http requests and responses including response codes, you will just only have full traces of 10% of these requests. My initial reaction to the performance of this module was exactly like yours, but I found out that a reduced sampling rate is not such a big deal because you still get all req/res recorded.

tsondergaard avatar Feb 02 '21 08:02 tsondergaard

@tsondergaard thanks for the tip. Even the docs don't seem to explicitly mention this subtlety about transactionSampleRate.

What config are using in production for the nodejs agent?

saurabhnanda avatar Feb 02 '21 09:02 saurabhnanda

@saurabhnanda, the project that I work on that uses the elastic-apm-node module is not in production yet, but the load test simultations run with a sampling rate of 0.1 to keep the overhead so low that it does not significantly skew the results.

tsondergaard avatar Feb 02 '21 10:02 tsondergaard

@tsondergaard what about captureSpanStackTraces? Do you have that enabled?

saurabhnanda avatar Feb 02 '21 11:02 saurabhnanda

@saurabhnanda, yes, so far!

tsondergaard avatar Feb 02 '21 16:02 tsondergaard

I don't think UUID generation is the cause of the problem you're describing and there's no branch available where the UUID engine have been changed.

Have you checked out the Performance Tuning page in the docs?

watson avatar Feb 22 '21 10:02 watson

This issue is still relevant and I'm seeing 2-2.5x decrease in performance due to crypto.randomFillSync() being called on every transaction, regardless of sampling rate. I have opentelemetryBridgeEnabled as true because I have a custom OTEL instrumentation.

ezioda004 avatar Nov 14 '24 15:11 ezioda004