emscripten icon indicating copy to clipboard operation
emscripten copied to clipboard

Cross Origin pthread worker

Open msqr1 opened this issue 1 year ago • 36 comments

I love the new inline worker change, but now we can't never use a 3rd-party CDN to serve those files again, because Worker construction requires the file to be in the same origin. Previously, we can just host the very small worker file on our server, while serving the heavy main file through a CDN, but now we can't anymore. Is there any way we could address this problem?

msqr1 avatar May 14 '24 05:05 msqr1

Can you explain why this doesn't work? Why cant you serve the main JS file from the CDN?

We do have setting called mainScriptUrlOrBlob which controls where the main JS file finds itself. I'm curious why this would be needed though since I would expect that loading JS file from the same location from which the main thread already loaded it would make the most sense and result in the best caching behaviour. In fact, shouldn't the browser already have the main JS file cached, resulting in no extra JS fetches per-thread?

sbc100 avatar May 14 '24 15:05 sbc100

I know it is cached, but it is still from cross origin. The thing that matter is the origin, not cached or not.

msqr1 avatar May 14 '24 16:05 msqr1

Sorry I didn't read the title there.

I'm not too familiar with cross-origin limitations. If you, or anyone else, has any ideas on how to make this work better that would be most welcome.

sbc100 avatar May 14 '24 16:05 sbc100

So, to be clear, the security policy disassllows new Worker with a cross origin script but then allows the new worker to call importScripts from other origins?

sbc100 avatar May 14 '24 16:05 sbc100

Trying to understand the issue, the only mention I see about same-origin restrictions in workers looks like it related to subworkers: https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers. Are you using subworkers? Or is this some kind of additional non-default policy that you have setup?

sbc100 avatar May 14 '24 16:05 sbc100

Hmm ... I wonder if we could do something line new Worker(blobUrl("importScripts(...)" to work around this?

Maybe you could even try this out by adding something like this to your pre-js file:

Module['mainScriptUrlOrBlob'] = `importScripts(${document.currentScript})`;

?

sbc100 avatar May 14 '24 16:05 sbc100

I tried this locally and the following --pre-js seemed to work:

if (!ENVIRONMENT_IS_PTHREAD) {
  Module['mainScriptUrlOrBlob'] = new Blob([`importScripts('${document.currentScript.src}');`])
}

sbc100 avatar May 14 '24 16:05 sbc100

Did you try to serve that file through a cdn?

msqr1 avatar May 14 '24 17:05 msqr1

Nope, I'm just going by the fact that importScripts from your CDN worked for you in the past. So this is way to effectively revert to the prior behaviour.

sbc100 avatar May 14 '24 18:05 sbc100

Yes, importscript in the worker seems to work fine. We should integrate this fix because I'm sure many people will use a cdn

msqr1 avatar May 14 '24 19:05 msqr1

I'm confused by https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers though. It seems to suggest that only sub-workers have this requirement by default? So by default, using a CDN like this should work?

sbc100 avatar May 14 '24 20:05 sbc100

Can you confirm that the --pre-js fix works for you your use case?

sbc100 avatar May 14 '24 20:05 sbc100

I know it's weird, but see this: https://stackoverflow.com/questions/58098143/why-are-cross-origin-workers-blocked-and-why-is-the-workaround-ok

Basically the default mode for worker URL fetching is same-origin, so a cross origin URL will fail. Importscript default is just a normal fetch, so it works

msqr1 avatar May 14 '24 20:05 msqr1

So it sounds like maybe MDN is just wrong on this a not up-to-date with the spec, since it only mentions the same-origin requirement for sub-workers.

sbc100 avatar May 14 '24 20:05 sbc100

Spec: https://html.spec.whatwg.org/multipage/webappapis.html#fetch-a-classic-worker-script

sbc100 avatar May 14 '24 20:05 sbc100

Yeah it does, probably should report this to them.

msqr1 avatar May 14 '24 20:05 msqr1

Can you confirm that the --pre-js fix works for you your use case?

It doesn't, I'm using modularize, so in the module callback document.currentScript is null. I used the _scriptName global variable that keeps a reference to document.currentScript.src instead. It works. Normally, without modularize, it is available, but with that on, it won't.

msqr1 avatar May 14 '24 21:05 msqr1

Can you share the full fix you came up with here?

sbc100 avatar May 14 '24 21:05 sbc100

I took yours and replace document.currentScript.src with _scriptName for modularize mode, not sure about non-modularization

if (!ENVIRONMENT_IS_PTHREAD) {
  Module['mainScriptUrlOrBlob'] = new Blob([`importScripts('${_scriptName}');`])
}

msqr1 avatar May 14 '24 21:05 msqr1

Does that work? Looking at the code it seems that _scriptName is set only after the --pre-js code is included.

sbc100 avatar May 14 '24 21:05 sbc100

(We could potentially move the _scriptName assignment earlier).

sbc100 avatar May 14 '24 21:05 sbc100

Wait, I thought the _scriptName is declared before the iife is returned to be used in the function?

msqr1 avatar May 14 '24 22:05 msqr1

Wait, I thought the _scriptName is declared before the iife is returned to be used in the function?

I think MODULARIZE mode differs in this respect and injects _scriptName very early on.

sbc100 avatar May 14 '24 22:05 sbc100

PR?

msqr1 avatar May 14 '24 22:05 msqr1

I'd like to better understand the problem, and decide if we should do this default or make it a new option. I also wonder if there is any downside to doing in the case when things are hosted on the origin. For example, does it make debugging hard / more opaque when the URL passed to new Workers is a blob that then called importScripts vs just a URL?

For now, is it OK if you use your workaround?

sbc100 avatar May 14 '24 23:05 sbc100

I'm happy with my fix, but I think this should be the default if we decide to go with it. People wont expect this bug when they switch to a new version that keeps them from using a CDN. Also: image We can move capturing it to before the pre.js if possible

msqr1 avatar May 14 '24 23:05 msqr1

To be fair MDN kinda said it here: https://developer.mozilla.org/en-US/docs/Web/API/Worker/Worker that the URL should be same origin as the host

One more thing, if we decide to go with this, don't forget to set Blob's MIME type to text/javascript, otherwise it won't work on Firefox: https://developer.mozilla.org/en-US/docs/Web/API/Worker/Worker#:~:text=Strict%20MIME%20type%20checks%20for%20worker%20scripts

msqr1 avatar May 15 '24 00:05 msqr1

@msqr1 I'm curious, are there some CDNs (e.g couldflare and fastly type things) that don't require you to do cross-origin stuff? i.e. they take over your whole domain, so to speak? I could be misunderstanding but I wonder how popular those two different types of solution are?

sbc100 avatar May 15 '24 00:05 sbc100

There is no such CDN service that I know of. But I think Website hosting service does exactly what you say, they have servers everywhere that basically acts as CDN. Those servers serve your main page, your assets and everything else you have. You can basically treat it as your own domain while still utilizing their "CDN" to serve files fast. I don't think it would make sense for a CDN to take over your domain, because what CDN does is it helps you minimize distance by utilizing their numerous servers available around the globe (because you only have one, a small, or a few server).

msqr1 avatar May 15 '24 00:05 msqr1

I also wonder if there is any downside to doing in the case when things are hosted on the origin.

@sbc100, CSP needs to be set for worker-src or default-src to blob for the workaround to spawn the worker. But at this point, why don't we just spawn the whole worker content straight from the blob already? Why importScript if we're using blob anyway?

In my opinion, if we choose to do this, I would make a new setting like CROSS_ORIGIN_PTHREAD that uses the new approach of spawning from the blob.

msqr1 avatar May 23 '24 02:05 msqr1