opentelemetry-js
opentelemetry-js copied to clipboard
instrumentation-http uses the wrong method to parse URLs for outgoing requests
What happened?
instrumentation-http intercepts all requests made with the http or https node modules, but if the first argument is a string, it uses url.parse() to convert the string url into the RequestOptions object, instead of new URL() as documented by nodejs (in recent versions).
This causes a discrepancy between instrumented and non-instrumented requests, in particular with regards to requests that have unescaped non-ASCII Unicode characters in the path or query
Steps to Reproduce
Have a script that makes a request to a URL containing a non-ASCII Unicode character (not percent encoded).
Expected Result
The request succeeds, as if the character was % encoded. (This is what happens if the request is not instrumented)
Actual Result
TypeError: Request path contains unescaped characters
OpenTelemetry Setup Code
I'm not using OpenTelemetry directly. OpenTelemetry is setup by the Sentry node js SDK.
package.json
@opentelemetry/instrumentation-http: 0.53.0
Relevant log output
No response
@gcampax, here’s a sample script that demonstrates how to use OpenTelemetry to instrument an HTTP request to a URL with non-ASCII characters:
// javascript
const { HttpInstrumentation } = require("@opentelemetry/instrumentation-http");
const { NodeSDK } = require("@opentelemetry/sdk-node");
const { ConsoleSpanExporter } = require("@opentelemetry/sdk-trace-node");
// Initialize OpenTelemetry
const sdk = new NodeSDK({
traceExporter: new ConsoleSpanExporter(),
instrumentations: [HttpInstrumentation],
});
// Start the SDK
sdk.start().then(() => {
// Define a URL with non-ASCII characters
const url = 'http://example.com/привет'; // Replace with your desired URL
// Make an HTTP request
fetch(url)
.then(response => response.text())
.then(body => {
console.log('Response received:', body);
})
.catch(err => {
console.error('Error occurred:', err);
})
.finally(() => {
// Shutdown the SDK
sdk.shutdown().then(() => {
console.log('SDK shut down');
});
});
});
Explanation:
- Dependencies: The script imports necessary modules from OpenTelemetry and node-fetch for making HTTP requests.
- SDK Initialization: The OpenTelemetry SDK is initialized with an HTTP instrumentation that allows tracing of HTTP requests.
- Non-ASCII URL: A URL is defined that includes non-ASCII characters (in this case, "привет").
- HTTP Request: The script makes a fetch request to the specified URL and logs the response or any errors.
- Shutdown: Finally, it shuts down the OpenTelemetry SDK after the request completes.
Step 3: Run the Script
You can run this script using Node.js:
bash node your-script-name.js
Make sure to replace your-script-name.js with the actual name of your script file. This will execute the HTTP request to the URL with non-ASCII characters and log the response.
Hi thanks for creating this reproducer, did you run it? Do you see the issue?
No. try and run the code snippet in your editor and let me know your reply or rather share your code where you have this issues so I can pinpoint what you might be doing wrong.
Your script is buggy, but this one works:
// javascript
const { HttpInstrumentation } = require("@opentelemetry/instrumentation-http");
const { NodeSDK } = require("@opentelemetry/sdk-node");
const { ConsoleSpanExporter } = require("@opentelemetry/sdk-trace-node");
// Initialize OpenTelemetry
const sdk = new NodeSDK({
traceExporter: new ConsoleSpanExporter(),
instrumentations: [new HttpInstrumentation],
});
// Start the SDK
sdk.start();
// Define a URL with non-ASCII characters
const url = 'https://example.com/привет'; // Replace with your desired URL
// Make an HTTP request
const https = require('https');
https.get(url)
Run it like this and you get
/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:414
throw error;
^
TypeError [ERR_UNESCAPED_CHARACTERS]: Request path contains unescaped characters
at new ClientRequest (node:_http_client:184:13)
at request (node:https:381:10)
at /home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:410:94
at safeExecuteInTheMiddle (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation/build/src/utils.js:28:18)
at /home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:410:78
at AsyncLocalStorage.run (node:internal/async_local_storage/async_hooks:91:14)
at AsyncLocalStorageContextManager.with (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/context-async-hooks/build/src/AsyncLocalStorageContextManager.js:33:40)
at ContextAPI.with (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/api/build/src/api/context.js:60:46)
at outgoingRequest (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:401:38)
at httpsOutgoingRequest (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:153:93) {
code: 'ERR_UNESCAPED_CHARACTERS'
}
Node.js v22.9.0
Comment out all the OpenTelemetry init code / module loading and the request succeeds.
As I mentioned in the original comment, there is a clear bug in the OpenTelemetry code, where the wrong URL parsing function is used (a bug that could have security implications, given the documentation for url.parse()). Unless you disagree with that, I think this should be enough details to triage and resolve the issue.
Thanks for the detailed report and the reproducer @gcampax - I'll try to find someone to work on this ASAP.