apm-agent-nodejs
apm-agent-nodejs copied to clipboard
`transaction.context.request.url.*` is wrong for an incoming HTTP request with a pathname starting with a double-slash
transaction.context.request.url.*
is wrong for an incoming HTTP request with a pathname starting with a double-slash. E.g.:
GET //foo/bar HTTP/1.1
Host: example.com
repro
Apply this patch:
diff --git a/examples/trace-http.js b/examples/trace-http.js
index f9fbb524..31e5840b 100755
--- a/examples/trace-http.js
+++ b/examples/trace-http.js
@@ -56,7 +56,12 @@ server.listen(3000, function () {
//
// Note that this there is no current "transaction" here, so this HTTP
// request is not captured by APM. See "trace-http-request.js" for more.
- const clientReq = http.request('http://localhost:3000/', function (clientRes) {
+ // const clientReq = http.request('http://localhost:3000/', function (clientRes) {
+ const clientReq = http.request({
+ host: 'localhost',
+ port: 3000,
+ path: '//user@foo'
+ }, function (clientRes) {
console.log('client response: %s %s', clientRes.statusCode, clientRes.headers)
const chunks = []
clientRes.on('data', function (chunk) {
diff --git a/lib/parsers.js b/lib/parsers.js
index 377ba50d..82ed1800 100644
--- a/lib/parsers.js
+++ b/lib/parsers.js
@@ -27,6 +27,7 @@ function getContextFromRequest (req, conf, type) {
url: getUrlFromRequest(req),
headers: undefined
}
+ console.log('XXX context.url', context.url)
if (req.socket && req.socket.remoteAddress) {
context.socket = {
remote_address: req.socket.remoteAddress
Then run: node examples/trace-http.js
.
The generated transaction includes:
{
"transaction": {
"name": "GET //user@foo",
...
"request": {
"http_version": "1.1",
"method": "GET",
"url": {
"raw": "//user@foo",
"protocol": "http:",
"hostname": "foo",
"port": "3000",
"full": "http://foo:3000"
},
"headers": {
"host": "localhost:3000",
"connection": "close"
},
"socket": {
"remote_address": "::ffff:127.0.0.1"
}
},
details
The recent https://github.com/elastic/apm-agent-nodejs/pull/3133 fixed a similar issue in the handling of transaction.name
for this case. The issue there was that the request path, e.g. //foo/bar
, was used as a full URL in URL parsing (via new URL(...)
or similar). However, this req.url
is the "request-target" in the HTTP/1.1 request line and not a full URL.
The transaction.context.request.url.*
value is calculated using the original-url
module here: https://github.com/elastic/apm-agent-nodejs/blob/34cc2a77820ae5b52f5ffab36f33fbaece8ccaf6/lib/parsers.js#L27
Root cause seems to be the same as it was in #3133
In this situation getUrlFromRequest
is an exported method from a package named origina-url. We can see its import here.
The package original-url uses the legacy API url.parse which is not recommended in the documentation as it says
url.parse()
uses a lenient, non-standard algorithm for parsing URL strings. It is prone to security issues such as host name spoofing and incorrect handling of usernames and passwords. Do not use with untrusted input. CVEs are not issued forurl.parse()
vulnerabilities. Use the WHATWG URL API instead.
The package is using it to create the base URL object for the result. You can see it at the top of the index.js file. Here's in an extract of it
const parseUrl = require('url').parse // Here it's requiring the legacy function
const parseForwarded = require('forwarded-parse')
const net = require('net')
module.exports = function (req) {
const raw = req.originalUrl || req.url
const url = parseUrl(raw || '') // <-- Here we parse the url string
const secure = req.secure || (req.connection && req.connection.encrypted)
const result = { raw: raw }
let host
What would be ideal is to have the dependency updated. But seeing the commit history it seems unmaintained. Maybe @watson may help us on this or let us update the package.
@david-luna: @\watson will very likely be receptive to a PR on original-url for this if you wanted to work on one. @\watson used to be the primary developer of this elastic-apm-node
package and, I think, original-url
was initial developed for this package.
@david-luna yes, please if you don't mind to open a PR against original-url
, then I'll take a look ❤️