apm-agent-nodejs
apm-agent-nodejs copied to clipboard
feat: 'elasticsearchCaptureBodyUrls' config option
This adds a new config option, per the 'elasticsearch_capture_body_urls' cross-agent spec, to control for which ES client requests the request body is captured. By default it is just the body of search-related APIs. The use case for configuring this is to capture for all requests for reply in Kibana performance work. This option is not expected to be used often.
Closes: #2019 Refs: https://github.com/elastic/apm/issues/665 Refs: https://github.com/elastic/apm/pull/670
checklist
- [x] finish implementation (docs, some tests)
- [x] CHANGELOG entry
- [x] TAV tests: https://apm-ci.elastic.co/job/apm-agent-nodejs/job/apm-agent-nodejs-mbp/job/PR-2873/7/
- [ ] perhaps wait for the spec to be merged
:green_heart: Build Succeeded
the below badges are clickable and redirect to their specific view in the CI or DOCS
![]()
![]()
![]()
![]()
![]()
Expand to view the summary
Build stats
-
Start Time: 2022-08-12T23:15:21.852+0000
-
Duration: 34 min 28 sec
Test stats :test_tube:
| Test | Results |
|---|---|
| Failed | 0 |
| Passed | 260042 |
| Skipped | 0 |
| Total | 260042 |
:robot: GitHub comments
To re-run your PR in the CI, just comment with:
-
/test: Re-trigger the build. -
run module tests for <modules>: Run TAV tests for one or more modules, where<modules>can be either a comma separated list of modules (e.g. memcached,redis) or the string literal ALL to test all modules -
run benchmark tests: Run the benchmark test only. -
runelasticsearch-ci/docs: Re-trigger the docs validation. (use unformatted text in the comment!)
run module tests for elasticsearch,@elastic/elasticsearch
some premature optimization perf notes
I digressed slightly to sanity check that the change in ES URL path checking here (from a custom pathIsAQuery regexp, to a WildcardMatcher-created array of regexps) will not be a performance problem.
tl;dr: The WildcardMatcher-based comparison is much slower. However, it is on the order of 1 microsecond per path check on my MacBook Pro (2.4 GHz 8-Core Intel Core i9). Given that ES queries are remote calls and on the order of milliseconds, it would generally be premature optimization to bother getting fancy to improve performance of this.
details
Here is my benchmark script: https://gist.github.com/trentm/ba426e5a08317b18d40e897aeb536c0c
And a run:
% node bench-shouldCaptureBody.js
WilcardMatcher array:
[/^.*\/_search$/i, /^.*\/_search\/template$/i, /^.*\/_msearch$/i, /^.*\/_msearch\/template$/i, /^.*\/_async_search$/i, /^.*\/_count$/i, /^.*\/_sql$/i, /^.*\/_eql\/search$/i]
Current pathIsAQuery RegExp: /\/(_search|_msearch|_count|_async_search|_sql|_eql)(\/|$)/
Naive WildcardMatcher in a single RegExp:
/^(.*\/_search|.*\/_search\/template|.*\/_msearch|.*\/_msearch\/template|.*\/_async_search|.*\/_count|.*\/_sql|.*\/_eql\/search)$/i
customEquivReIgnoreCase:
/\/(_search|_search\/template|_msearch|_msearch\/template|_async_search|_count|_sql|_eql\/search)$/i
customEquivRe:
/\/(_search|_search\/template|_msearch|_msearch\/template|_async_search|_count|_sql|_eql\/search)$/
WildcardMatcher array x 719,208 ops/sec ±0.99% (90 runs sampled)
Current pathIsAQuery RegExp x 15,700,534 ops/sec ±0.97% (90 runs sampled)
Naive WildcardMatcher in a single RegExp x 856,094 ops/sec ±1.10% (88 runs sampled)
customEquivReIgnoreCase x 17,477,871 ops/sec ±1.37% (88 runs sampled)
customEquivRe x 17,942,769 ops/sec ±0.91% (88 runs sampled)
Fastest is customEquivRe
- The WildcardMatcher-based array of regexps that this PR changes to is ~20x slower than the current
pathIsAQueryRegExp. - A naive potential optimization might be to make a WildcardMatcher array config into a single regexp --
^(pattern1|pattern2|...)$. This turns out to be not much faster, in this case. - The major perf difference is that the (a) the WildcardMatcher necessarily always has a
^anchor and (b) every pattern is leading with*to avoid that anchor. A possible optimization of WildcardMatcher usage would be to drop the^anchor and the leading*if and only if every pattern had that leading*. An equivalent optimization could be done for the '$' anchor and a trailing*on every pattern. - There is a small perf win with dropping the case-insensitive flag. Arguably the default
elasticsearch_capture_body_urlsvalues could/should be case-sensitive. I don't believe the Elasticsearch REST API would acceptPOST /my-index/_SeArCh.
However, this is all fast enough, so I'm not proposing doing any WildcardMatcher perf improvements right now.
@astorm the spec for this will get merged tomorrow, so feel free to review this when you get a chance. Thanks.
:green_heart: Build Succeeded
the below badges are clickable and redirect to their specific view in the CI or DOCS
![]()
![]()
![]()
![]()
![]()
Expand to view the summary
Build stats
-
Start Time: 2022-08-18T23:49:18.790+0000
-
Duration: 21 min 41 sec
Test stats :test_tube:
| Test | Results |
|---|---|
| Failed | 0 |
| Passed | 260042 |
| Skipped | 0 |
| Total | 260042 |
:robot: GitHub comments
To re-run your PR in the CI, just comment with:
-
/test: Re-trigger the build. -
run module tests for <modules>: Run TAV tests for one or more modules, where<modules>can be either a comma separated list of modules (e.g. memcached,redis) or the string literal ALL to test all modules -
run benchmark tests: Run the benchmark test only. -
runelasticsearch-ci/docs: Re-trigger the docs validation. (use unformatted text in the comment!)
[email protected] is release with this now