apm-agent-nodejs
apm-agent-nodejs copied to clipboard
invalid values for time/duration config vars gives surprising and unwanted behaviour
There are a few config vars that support a time/duration as a string of the form <number><unit>, e.g. "10s" to mean 10 seconds. The supported units are "m" for minutes, "ms" for milliseconds, "s" for seconds. If the config value is a number or a string without units, it defaults to seconds.
Currently that parsing/conversion is done by normalizeTime in lib/config.js:
function normalizeTime (opts) {
for (const key of TIME_OPTS) {
if (key in opts) opts[key] = toSeconds(String(opts[key]))
}
}
and the relevant config vars are:
var TIME_OPTS = [
'abortedErrorThreshold',
'apiRequestTime',
'metricsInterval',
'serverTimeout',
'spanFramesMinDuration'
]
Currently toSeconds() above will return null if the string value is invalid (e.g. "10ss", "10h", "fuzzywuzzy"). There is no log warning. The results can be surprising: If this null is used in math (often that is the case as the seconds value is converted to milliseconds for usage in setTimeout and the like) or comparisons, JavaScript will treat it as zero, which sometimes is a sentinel value that means something unexpected. Specifically:
apiRequestTime: Results in a value of0being used to the http-client -> a-1sentinel value to StreamChopper, which results in time-based closing of requests to APM server not happening which -- for low traffic apps -- results inERROR (elastic-apm-node): APM Server transport error: APM Server response timeout (30000ms). which results in the stream chopper notmetricsInterval: Results in null -> 0 -> turning off metrics collection.serverTimeout: Results in no socket timeout being set on the communication with APM server which could rarely result in hung communication with APM server.spanFramesMinDuration: Results in possibly unexpected additional CPU usage as the agent collections stack traces for all spans.abortedErrorThreshold: Not yet investigated. Comment indicate this is an old undocumented(?) feature that "will be removed".
[Sergey Kleyman]
AFAIK most of the agents fallback on default value when the given value is invalid.
If that's not in the spec yet, could you add it?
If that's not in the spec yet, could you add it?
Will do, on Colton's PR (https://github.com/elastic/apm/pull/454/files#r656467947) or separately.
This thread is a life-saver. Spent good of amount figuring out the weird error. Thanks a ton @trentm