opentelemetry-js icon indicating copy to clipboard operation
opentelemetry-js copied to clipboard

instrumentation-http uses the wrong method to parse URLs for outgoing requests

Open gcampax opened this issue 1 year ago • 5 comments
trafficstars

What happened?

instrumentation-http intercepts all requests made with the http or https node modules, but if the first argument is a string, it uses url.parse() to convert the string url into the RequestOptions object, instead of new URL() as documented by nodejs (in recent versions). This causes a discrepancy between instrumented and non-instrumented requests, in particular with regards to requests that have unescaped non-ASCII Unicode characters in the path or query

Steps to Reproduce

Have a script that makes a request to a URL containing a non-ASCII Unicode character (not percent encoded).

Expected Result

The request succeeds, as if the character was % encoded. (This is what happens if the request is not instrumented)

Actual Result

TypeError: Request path contains unescaped characters

OpenTelemetry Setup Code

I'm not using OpenTelemetry directly. OpenTelemetry is setup by the Sentry node js SDK.

package.json

@opentelemetry/instrumentation-http: 0.53.0

Relevant log output

No response

gcampax avatar Oct 09 '24 22:10 gcampax

@gcampax, here’s a sample script that demonstrates how to use OpenTelemetry to instrument an HTTP request to a URL with non-ASCII characters:

// javascript
const { HttpInstrumentation } = require("@opentelemetry/instrumentation-http");
const { NodeSDK } = require("@opentelemetry/sdk-node");
const { ConsoleSpanExporter } = require("@opentelemetry/sdk-trace-node");

// Initialize OpenTelemetry
const sdk = new NodeSDK({
    traceExporter: new ConsoleSpanExporter(),
    instrumentations: [HttpInstrumentation],
});

// Start the SDK
sdk.start().then(() => {
    // Define a URL with non-ASCII characters
    const url = 'http://example.com/привет'; // Replace with your desired URL

    // Make an HTTP request
    fetch(url)
        .then(response => response.text())
        .then(body => {
            console.log('Response received:', body);
        })
        .catch(err => {
            console.error('Error occurred:', err);
        })
        .finally(() => {
            // Shutdown the SDK
            sdk.shutdown().then(() => {
                console.log('SDK shut down');
            });
        });
});

Explanation:

  1. Dependencies: The script imports necessary modules from OpenTelemetry and node-fetch for making HTTP requests.
  2. SDK Initialization: The OpenTelemetry SDK is initialized with an HTTP instrumentation that allows tracing of HTTP requests.
  3. Non-ASCII URL: A URL is defined that includes non-ASCII characters (in this case, "привет").
  4. HTTP Request: The script makes a fetch request to the specified URL and logs the response or any errors.
  5. Shutdown: Finally, it shuts down the OpenTelemetry SDK after the request completes.

Step 3: Run the Script

You can run this script using Node.js:

bash node your-script-name.js

Make sure to replace your-script-name.js with the actual name of your script file. This will execute the HTTP request to the URL with non-ASCII characters and log the response.

olabodeIdowu avatar Oct 15 '24 23:10 olabodeIdowu

Hi thanks for creating this reproducer, did you run it? Do you see the issue?

gcampax avatar Oct 16 '24 00:10 gcampax

No. try and run the code snippet in your editor and let me know your reply or rather share your code where you have this issues so I can pinpoint what you might be doing wrong.

olabodeIdowu avatar Oct 16 '24 00:10 olabodeIdowu

Your script is buggy, but this one works:

// javascript
const { HttpInstrumentation } = require("@opentelemetry/instrumentation-http");
const { NodeSDK } = require("@opentelemetry/sdk-node");
const { ConsoleSpanExporter } = require("@opentelemetry/sdk-trace-node");

// Initialize OpenTelemetry
const sdk = new NodeSDK({
    traceExporter: new ConsoleSpanExporter(),
    instrumentations: [new HttpInstrumentation],
});

// Start the SDK
sdk.start();

// Define a URL with non-ASCII characters
const url = 'https://example.com/привет'; // Replace with your desired URL

// Make an HTTP request
const https = require('https');
https.get(url)

Run it like this and you get

/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:414
                        throw error;
                        ^

TypeError [ERR_UNESCAPED_CHARACTERS]: Request path contains unescaped characters
    at new ClientRequest (node:_http_client:184:13)
    at request (node:https:381:10)
    at /home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:410:94
    at safeExecuteInTheMiddle (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation/build/src/utils.js:28:18)
    at /home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:410:78
    at AsyncLocalStorage.run (node:internal/async_local_storage/async_hooks:91:14)
    at AsyncLocalStorageContextManager.with (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/context-async-hooks/build/src/AsyncLocalStorageContextManager.js:33:40)
    at ContextAPI.with (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/api/build/src/api/context.js:60:46)
    at outgoingRequest (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:401:38)
    at httpsOutgoingRequest (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:153:93) {
  code: 'ERR_UNESCAPED_CHARACTERS'
}

Node.js v22.9.0

Comment out all the OpenTelemetry init code / module loading and the request succeeds.

As I mentioned in the original comment, there is a clear bug in the OpenTelemetry code, where the wrong URL parsing function is used (a bug that could have security implications, given the documentation for url.parse()). Unless you disagree with that, I think this should be enough details to triage and resolve the issue.

gcampax avatar Oct 16 '24 01:10 gcampax

Thanks for the detailed report and the reproducer @gcampax - I'll try to find someone to work on this ASAP.

pichlermarc avatar Oct 16 '24 15:10 pichlermarc