sentry-javascript
sentry-javascript copied to clipboard
import hook incompatible with `tsx`
Is there an existing issue for this?
- [X] I have checked for existing issues https://github.com/getsentry/sentry-javascript/issues
- [X] I have reviewed the documentation https://docs.sentry.io/
- [X] I am using the latest SDK release https://github.com/getsentry/sentry-javascript/releases
How do you use Sentry?
Sentry Saas (sentry.io)
Which SDK are you using?
@sentry/node
SDK Version
8.0.0
Framework Version
No response
Link to Sentry event
No response
SDK Setup
Sentry.init({
dsn: 'https://[email protected]/0',
});
Steps to Reproduce
I've created a minimal reproduction here: https://github.com/nwalters512/sentry-v8-tsx-error-repro
- Clone the repository
- Install dependencies with
yarn - Run
yarn tsx src/index.ts - Observe that the process fails with the error
TypeError [ERR_INVALID_URL_SCHEME]: The URL must be of scheme file
Several observations that are hopefully helpful to y'all:
-
This only occurs when
"allowJs": trueis set intsconfig.json. As Sentry isn't concerned with this, that made me think this problem might actually be intsx. However... -
When adding instrumentation directly with the
@opentelemetry/*packages, everything works fine. This is reproducible by making the following change tosrc/index.ts:-import './instrument-sentry.js'; +import './instrument-opentelemetry.js';Given this, I'm strongly inclined to believe that this is an issue with the way in which Sentry is using OpenTelemetry.
-
This only breaks for core Node modules like
util,fs, etc. Importing other modules works fine. For instance, the following change tosrc/index.tsmakes it work without erroring:-await import('util'); +await import('zod'); -
This only breaks for dynamic imports. For instance, the following change to
src/index.tsmakes it work without erroring:-await import('util'); +import 'util';
Expected Result
I would expect the process to complete successfully and log util imported!.
Actual Result
The process errors out while importing util and does not print util imported!.
The full stack trace is:
TypeError [ERR_INVALID_URL_SCHEME]: The URL must be of scheme file
at new NodeError (node:internal/errors:406:5)
at fileURLToPath (node:internal/url:1393:11)
at finalizeResolution (node:internal/modules/esm/resolve:234:42)
at moduleResolve (node:internal/modules/esm/resolve:845:10)
at defaultResolve (node:internal/modules/esm/resolve:1043:11)
at nextResolve (node:internal/modules/esm/hooks:833:28)
at y (file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/tsx/dist/esm/index.mjs?1715684968701:2:2079)
at j (file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/tsx/dist/esm/index.mjs?1715684968701:2:3198)
at nextResolve (node:internal/modules/esm/hooks:833:28)
at resolve (/Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/@opentelemetry/instrumentation/node_modules/import-in-the-middle/hook.js:238:23) {
code: 'ERR_INVALID_URL_SCHEME'
}
And here is the bottom of the call stack.
Maybe the loader hook added by import-in-the-middle is somehow clashing with the loader hooks added by tsx?
Looking at the stack trace, although import-in-the-middle is at the bottom, further up we see frames from tsx:
at y (file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/tsx/dist/esm/index.mjs?1715684968701:2:2079)
at j (file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/tsx/dist/esm/index.mjs?1715684968701:2:3198)
I guess it's not handling the params the iitm adds to the url?
So I've opened an issue there: https://github.com/privatenumber/tsx/issues/554
@timfish thanks for investigating this; your small reproduction that used just import-in-the-middle was very helpful. I'm relatively confident that the issue here is in fact with import-in-the-middle and the fact that it appends query parameters to the URL it returns from its resolvers. That is, consider this line:
https://github.com/DataDog/import-in-the-middle/blob/00b01fff1f5b69dd25e307593ec54d1d8abb4844/hook.js#L259
Changing that to just url: url.url makes your reproduction complete without error (though I have no idea what effect that has on import-in-the-middle actually working).
Specifically, it looks like the issue is that this URL with ?iitm=true ends up as the parentURL in a subsequent resolution, which you can see by adding console.log(specifier, context) immediately after https://github.com/privatenumber/tsx/blob/e2afc60bbcc299e09a5bf0e0c8909b6766b633a2/src/esm/hook/resolve.ts#L39:
file:///Users/nathan/git/tsx/node_modules/.pnpm/[email protected]/node_modules/import-in-the-middle/lib/register.ts {
conditions: [ 'node', 'import', 'node-addons' ],
importAttributes: {},
parentURL: 'node:util?iitm=true'
}
Tracing the resolving of the otel hooking, it looks like ?iitm=true is used so that iitm knows that the resolving is actually coming from its hook rather than the original location.
Adding the query to the parent URL should be fine as this is still a valid URL.
For example, the following code runs fine under both node and tsx:
import { register } from "node:module";
register(
new URL(
`data:application/javascript;base64,${Buffer.from(
`
export async function resolve(specifier, context, nextResolve) {
if (context.parentURL) {
context.parentURL += "?some=query";
}
console.log("resolve", specifier, context);
return nextResolve(specifier, context);
}
`
).toString("base64")}`
),
import.meta.url
);
await import("node:util");
and outputs:
resolve node:util {
conditions: [ 'node', 'import', 'node-addons' ],
importAssertions: {},
parentURL: 'file:///Users/tim/Documents/repro/test.ts?some=query'
}
node:util?iitm=true is also a valid URL so I'm not fully convinced that's causing the issue yet!
Changing that to just url: url.url makes your reproduction complete without error (though I have no idea what effect that has on import-in-the-middle actually working).
Ah, I just re-read this and it makes sense that this is the cause.
I think you make have just realized this, but I'll finish this comment just so we're both on the same page. Your example is subtly different from what's happening in the failure case. In your example, parentURL is indeed still a valid file:// URL. However, in the failure case, it is not. This was easy to prove to myself:
const { fileURLToPath } = require('url');
fileURLToPath('node:util?foo=bar');
The above errors out exactly the same as the failure case with import-in-the-middle:
Uncaught TypeError [ERR_INVALID_URL_SCHEME]: The URL must be of scheme file
at fileURLToPath (node:internal/url:1463:11) {
code: 'ERR_INVALID_URL_SCHEME'
}
You're right, but fileURLToPath('node:util') also throws with the same error!
I think the issue isn't the query string, it's more that import-in-the-middle results in the parentURL being the node built-in which I guess this should never normally be possible because it's not a valid URL.
For example, when the minimal reproduction is run through plain old Node:
import { register } from "node:module";
register("import-in-the-middle/hook.mjs", import.meta.url);
await import("node:util");
The iitm resolve function with this added:
console.log('resolve', [specifier, context.parentURL])
Results in:
resolve [
'node:util',
'file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/test.js'
]
resolve [
'file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/import-in-the-middle/lib/register.js',
'node:util?iitm=true'
]
resolve [
'node:util?iitm=true',
'node:util?iitm=true'
]
All great observations! I managed to make a very isolated reproduction: https://github.com/nwalters512/register-hook-playground
It uses neither tsx nor import-in-the-middle; it copies the bare minimum amount of behavior from import-in-the-middle to be able to reproduce:
node:internal/modules/run_main:129
triggerUncaughtException(
^
TypeError [ERR_INVALID_URL_SCHEME]: The URL must be of scheme file
at fileURLToPath (node:internal/url:1463:11)
at finalizeResolution (node:internal/modules/esm/resolve:266:42)
at moduleResolve (node:internal/modules/esm/resolve:933:10)
at defaultResolve (node:internal/modules/esm/resolve:1157:11)
at nextResolve (node:internal/modules/esm/hooks:866:28)
at resolve (file:///Users/nathan/git/register-hook-playground/hook2.mjs:4:21)
at nextResolve (node:internal/modules/esm/hooks:866:28)
at Hooks.resolve (node:internal/modules/esm/hooks:304:30)
at MessagePort.handleMessage (node:internal/modules/esm/worker:196:24)
at [nodejs.internal.kHybridDispatch] (node:internal/event_target:825:20) {
code: 'ERR_INVALID_URL_SCHEME'
}
It turns on that the query params were a red herring of sorts. The presence of it actually triggers another behavior in import-in-the-middle that produces synthetic source code for the module being loaded (in this case, node:util?iitm=true):
https://github.com/DataDog/import-in-the-middle/blob/00b01fff1f5b69dd25e307593ec54d1d8abb4844/hook.js#L266
This puts us in a situation that Node would normally never expect to encounter: parentURL refers to a builtin (node:import). You can reproduce this even more simply with the following hook code (in my repro as hook-simpler.mjs):
export async function resolve(specifier, context, parentResolve) {
specifier = 'file:///dost-not-exist.mjs';
context.parentURL = 'node:util';
const url = await parentResolve(specifier, context, parentResolve);
}
This is obviously very contrived, but it reproduces the problem very simply: Node's parentResolve chokes when specifier is a a file:// URL and context.parentURL is not a file:// URL. This wasn't reproducible in your above example because Node seems to short-circuit if specifier is a core module, it never even tries to use context.parentURL in that case.
I'm feeling pretty confident that this is a bug in import-in-the-middle, as that's the one that ends up generating synthetic code for node:util that ends up trying to import some file:// thing. Do you want me to move this issue there?
What I don't yet fully understand is why iitm works with Node as it's using the same parentURLs.
For example these are the logs from resolve in import-in-the-middle calling down to the default resolver:
defaultResolve [
'node:util',
{
conditions: [ 'node', 'import', 'node-addons' ],
importAttributes: {},
parentURL: 'file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/test.js'
}
]
result [Object: null prototype] { url: 'node:util' }
defaultResolve [
'file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/import-in-the-middle/lib/register.js',
{
conditions: [ 'node', 'import', 'node-addons' ],
importAttributes: {},
parentURL: 'node:util?iitm=true'
}
]
result [Object: null prototype] {
url: 'file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/import-in-the-middle/lib/register.js',
format: 'commonjs'
}
defaultResolve [
'node:util?iitm=true',
{
conditions: [ 'node', 'import', 'node-addons' ],
importAttributes: {},
parentURL: 'node:util?iitm=true'
}
]
result [Object: null prototype] { url: 'node:util' }
It appears to handle the node: URLs without error!
From the stack trace, we can see that when tsx is involved, it's called after iitm, ie. iitm > tsx > default.
However, my console logs from tsx resolve aren't being outputted before the exception so it's really hard to see if/how the parameters are being modified before they hit the default resolver.
My best guess is that they're not getting flushed from the loader hooks thread and there's no obvious way to debug it otherwise.
I think I have an answer: this only happens when the file that we're trying to resolve with node:util as its parent does not exist! You can make the following change to hook.mjs in https://github.com/nwalters512/register-hook-playground to demonstrate this:
- const iitmURL = new URL('lib/register.mjs', import.meta.url).toString();
+ const iitmURL = new URL('register.mjs', import.meta.url).toString();
Note that the updated version points to a file that actually exists; when that happens, there's no error!
So... maybe this is a bug in Node? If nothing else, it's unexpected behavior. I think he expected behavior in the does-not-exist case is that we'd get a ERR_MODULE_NOT_FOUND, not ERR_INVALID_URL_SCHEME. This can be demonstrated with the following hook:
export async function resolve(specifier, context, parentResolve) {
await parentResolve('file:///does-not-exist.mjs', context, parentResolve);
}
Which errors with the following:
Error [ERR_MODULE_NOT_FOUND]: Cannot find module '/does-not-exist.mjs' imported from /Users/nathan/git/register-hook-playground/index.mjs
at finalizeResolution (node:internal/modules/esm/resolve:265:11)
at moduleResolve (node:internal/modules/esm/resolve:933:10)
at defaultResolve (node:internal/modules/esm/resolve:1157:11)
at nextResolve (node:internal/modules/esm/hooks:866:28)
at resolve (file:///Users/nathan/git/register-hook-playground/hook.mjs:2:9)
at nextResolve (node:internal/modules/esm/hooks:866:28)
at Hooks.resolve (node:internal/modules/esm/hooks:304:30)
at MessagePort.handleMessage (node:internal/modules/esm/worker:196:24)
at [nodejs.internal.kHybridDispatch] (node:internal/event_target:825:20)
at MessagePort.<anonymous> (node:internal/per_context/messageport:23:28) {
code: 'ERR_MODULE_NOT_FOUND',
url: 'file:///does-not-exist.mjs'
}
Aside: tsx actually relies on ERR_MODULE_NOT_FOUND being thrown so that it can tell when files don't exist.
So to summarize, the combination of tsx and import-in-the-middle means that we end in a situation with the following properties:
- We're trying to resolve a
file:///...URL - The file referenced by the
file:///URL does not exist - The
context.parentURLis anode:(core) module
This is a perfect storm that produces unexpected behavior in Node's own resolution process.
Ah got it, the error is being thrown from here. It's throwing while it's trying to create the error message: https://github.com/nodejs/node/blob/87b87a8f6017545e2c9d6048d7b081f73c8a1072/lib/internal/modules/esm/resolve.js#L265
Node tries to check if the parentURL is valid here, but it doesn't guard against node:* URLs and later just throws it into fileURLToPath to create the error message.
If we report this to Node they might want to throw on all node: parentURLs which would break import-in-the-middle 🤔
I'll open an issue there first and explain the detail...
Excellent find! Sounds good; I'll check out the issue once you make it.
Link to the node issues: https://github.com/nodejs/node/issues/52987
I opened a PR to fix this in import-in-the-middle: https://github.com/DataDog/import-in-the-middle/pull/76
Maybe it would be good idea for Sentry to fork iitm and launch it as their own package. DDs shipping cadedance has not been particularly frequent, and this is going to be a major bottleneck for ESM adoption.
Maybe it would be good idea for Sentry to fork iitm and launch it as their own package
The plan is for it to move to the Node org: https://github.com/nodejs/admin/issues/858
Maybe it would be good idea for Sentry to fork iitm and launch it as their own package
The plan is for it to move to the Node org: nodejs/admin#858
Node org is equally slow to make releases.
If this is going to be a bottleneck to Sentry v8 adoption, you want to have more control over it.
It is small enough package to make little difference even if there are multiple versions of it maintained, i.e. a copy can be incubated under Sentry and if Node.js wants to merge upstream, let them do it.
Maybe it would be good idea for Sentry to fork iitm and launch it as their own package
The plan is for it to move to the Node org: nodejs/admin#858
Node org is equally slow to make releases.
If this is going to be a bottleneck to Sentry v8 adoption, you want to have more control over it.
It is small enough package to make little difference even if there are multiple versions of it maintained, i.e. a copy can be incubated under Sentry and if Node.js wants to merge upstream, let them do it.
Normally, I'd agree, it would be nice if we could fork it. But sadly, in this case we can't do this because import-in-the-middle is a dependency of all the other opentelemetry instrumentation, which would not use our fork :( So we need to make fixes upstream there and live with the slower cadence 😬
I might have found a temporary workaround...
@nwalters512 Does this work as a temporary fix to get tsx working for your use case?
import { register } from "node:module";
import * as Sentry from "@sentry/node";
// register a loader hook before import-in-the-middle that
// ensures parentURL is a valid file URL before it hits Node
register(
new URL(
`data:application/javascript,
export async function resolve(specifier, context, nextResolve) {
if(context.parentURL.startsWith('node:')) { context.parentURL = 'file:///'; }
return nextResolve(specifier, context);
}`
),
import.meta.url
);
Sentry.init({ dsn: '__DSN__' });
await import("node:util");
This doesn't result in any errors in this basic example.
@timfish what ultimately fixed things for me was actually https://github.com/getsentry/sentry-javascript/pull/12043; we don't use performance instrumentation, so the error went away when Sentry stopped trying to unnecessarily instrument modules. This is of course not a general solution, and yours might very well be a reasonable bandaid for anyone who is using both tsx and Sentry tracing while we wait for import-in-the-middle maintainers to do their thing!
There are multiple outstanding PRs for import-in-the-middle (including the one from @nwalters512 🙏) that combined hopefully fix a wide range of issues.
Until they make it to a release, I've combined these into a patch that can be used with patch-package:
https://github.com/getsentry/sentry-javascript/issues/12242#issuecomment-2133993135
If anyone can confirm that this fixes their issues that would be great!
Does the fact that this PR is closed means that now Sentry is compatible with tsx?
This issue is still open https://github.com/getsentry/sentry-javascript/issues/12357, but this specific issue is fixed.
Just needs to be released - that will come out tomorrow most likely.
Hello, we've just released v8.8.0, which should hopefully resolve this ESM problem. Let us know if you updated and are still running into problems!
This issue might be fixed, but unfortunately tsx always adds it's loaders before others passed via --import so it's still not compatible:
https://github.com/getsentry/sentry-javascript/issues/12357
https://github.com/privatenumber/tsx/issues/571
import-in-the-middle parses the loaded code to find the imports and with the current order of the loaders, import-in-the-middle is passed TypeScript code which it cannot parse.