apm-agent-nodejs
apm-agent-nodejs copied to clipboard
invalid values for time/duration config vars gives surprising and unwanted behaviour
There are a few config vars that support a time/duration as a string of the form <number><unit>
, e.g. "10s" to mean 10 seconds. The supported units are "m" for minutes, "ms" for milliseconds, "s" for seconds. If the config value is a number or a string without units, it defaults to seconds.
Currently that parsing/conversion is done by normalizeTime
in lib/config.js:
function normalizeTime (opts) {
for (const key of TIME_OPTS) {
if (key in opts) opts[key] = toSeconds(String(opts[key]))
}
}
and the relevant config vars are:
var TIME_OPTS = [
'abortedErrorThreshold',
'apiRequestTime',
'metricsInterval',
'serverTimeout',
'spanFramesMinDuration'
]
Currently toSeconds()
above will return null
if the string value is invalid (e.g. "10ss", "10h", "fuzzywuzzy"). There is no log warning. The results can be surprising: If this null
is used in math (often that is the case as the seconds value is converted to milliseconds for usage in setTimeout
and the like) or comparisons, JavaScript will treat it as zero, which sometimes is a sentinel value that means something unexpected. Specifically:
-
apiRequestTime
: Results in a value of0
being used to the http-client -> a-1
sentinel value to StreamChopper, which results in time-based closing of requests to APM server not happening which -- for low traffic apps -- results inERROR (elastic-apm-node): APM Server transport error: APM Server response timeout (30000ms)
. which results in the stream chopper not -
metricsInterval
: Results in null -> 0 -> turning off metrics collection. -
serverTimeout
: Results in no socket timeout being set on the communication with APM server which could rarely result in hung communication with APM server. -
spanFramesMinDuration
: Results in possibly unexpected additional CPU usage as the agent collections stack traces for all spans. -
abortedErrorThreshold
: Not yet investigated. Comment indicate this is an old undocumented(?) feature that "will be removed".
[Sergey Kleyman]
AFAIK most of the agents fallback on default value when the given value is invalid.
If that's not in the spec yet, could you add it?
If that's not in the spec yet, could you add it?
Will do, on Colton's PR (https://github.com/elastic/apm/pull/454/files#r656467947) or separately.
This thread is a life-saver. Spent good of amount figuring out the weird error. Thanks a ton @trentm