proposal-binary-ast

[Discussion] Out-of-band signal for requesting Binary AST

Open RReverser opened this issue 7 years ago • 26 comments

I recently asked on the chat about a planned way to request Binary AST from the server and got the following answer:

> @Yoric: We plan to have a mechanism, but we haven't attempted to design it yet. The vague consensus for the moment was to use something like <script src="..." binsrc="...">, which seems like the cheap way to keep it backwards-compatible.

While this is a relatively simple solution, I have a concern about limitations it imposes.

In particular, in an ideal world I think it would be reasonable to support a use case where, e.g., a shared CDN with lots of JavaScript libraries could simply create Binary AST variants of all its assets and return them instead of regular JavaScript when it knows that 1) the browser supports it and 2) the change would be mostly invisible to the consumer (that is, the JS was indeed requested via <script>, import(...), or other means purely for execution, and not with XMLHttpRequest or fetch).

To support use cases like that, the signal for Binary AST support should come not from the HTML level (as it's much harder to get the HTML updated on all the websites where the script is inserted), but rather from the network level.

One way to do this would be to add a binast or similar marker to the Accept-Encoding list for script requests in supporting browsers, which would tell the server that the Binary AST version can safely be returned with Content-Encoding: binast in the response.

Using encoding headers for this goal feels quite natural, as it's mostly an encoding format for JavaScript, although one might argue that because it's not lossless in terms of debugging information, it doesn't belong to Accept-Encoding/Content-Encoding headers - in that case, I'm open to any proposals.
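
A minimal sketch of how a server or CDN might act on such a signal, assuming the hypothetical `binast` coding token from this comment (it is not a registered content coding) and a hypothetical `has_binast_variant` flag saying whether a precompiled variant exists for the asset:

```python
from typing import Dict


def parse_accept_encoding(header: str) -> Dict[str, float]:
    """Parse an Accept-Encoding header into {coding: q-value}."""
    codings = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a coding without an explicit q-value defaults to 1
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        codings[name.strip().lower()] = q
    return codings


def choose_script_headers(accept_encoding: str, has_binast_variant: bool) -> Dict[str, str]:
    """Return response headers for a script request.

    The Binary AST variant is chosen only if the client advertised the
    (hypothetical) "binast" coding with a non-zero q-value.
    """
    headers = {"Vary": "Accept-Encoding"}  # the response depends on this request header
    codings = parse_accept_encoding(accept_encoding)
    if has_binast_variant and codings.get("binast", 0.0) > 0:
        headers["Content-Encoding"] = "binast"
    return headers
```

With this sketch, a request carrying `Accept-Encoding: gzip, br, binast` would get the Binary AST variant, while a client that never mentions `binast` keeps receiving regular JavaScript.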

RReverser avatar Mar 10 '18 21:03 RReverser

Actually, another option could be to use the Accept/Content-Type pair, but 1) ~~it might look weird to have different content-type to be loaded for <script type="text/javascript" src="..."> - maybe not a big deal though?~~ probably not a problem, since the server can already return any supported JS MIME type regardless of what is specified on the tag and 2) currently the script's Accept header simply sends */*, so this would need to work like WebP support, where the new variant is explicitly requested first.

RReverser avatar Mar 11 '18 10:03 RReverser

Thanks for starting this conversation. It would indeed be great to specify this in such a way that proxies and CDNs can speed up webpages transparently.

So far, our experiments with putting compression inside BinAST don't look useful, so I suspect that we're going to end up using an out-of-the-box compression mechanism. So what about either

- Accept-Encoding: binjs+gz (or + anything else); or
- Content-Type: application/binjs and Accept-Encoding: gz.

Yoric avatar Mar 12 '18 14:03 Yoric

Having thought about it a bit more, I'm starting to think that the MIME type approach is indeed a bit easier to implement and makes more sense than the encoding one I originally proposed.

So, to be more precise about:

> - Content-Type: application/binjs and Accept-Encoding: gz.

The client will also have to prepend another MIME type to Accept, so the request will look like:

...
Accept: application/binjs, */*
Accept-Encoding: gzip, deflate, br
...

and server will respond with something like:

...
Content-Type: application/binjs
Content-Encoding: br
...

Does that look right?
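
A rough sketch of the server side of this exchange, assuming the placeholder type `application/binjs` from this thread and a hypothetical `has_binast_variant` flag. Binary AST is served only when the client lists the type explicitly, so a bare `*/*` (what script requests send today) keeps getting plain JavaScript:

```python
from typing import Dict, List, Tuple

BINAST_TYPE = "application/binjs"  # placeholder MIME type from this thread


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media-range, q) pairs, in header order."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # default q-value
        for param in parts[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def negotiate_script(accept: str, has_binast_variant: bool) -> Dict[str, str]:
    """Pick the Content-Type for a script response."""
    explicit = any(r == BINAST_TYPE and q > 0 for r, q in parse_accept(accept))
    if explicit and has_binast_variant:
        content_type = BINAST_TYPE
    else:
        content_type = "text/javascript"
    # Vary tells shared caches that the body depends on the Accept header.
    return {"Content-Type": content_type, "Vary": "Accept"}
```

Compression stays orthogonal here: whichever variant is picked can still be wrapped in `Content-Encoding: br` or `gzip` according to the usual Accept-Encoding negotiation.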

RReverser avatar Mar 12 '18 15:03 RReverser

That looks fine to me.

Yoric avatar Mar 12 '18 15:03 Yoric

Cool, looks good to me too.

RReverser avatar Mar 12 '18 16:03 RReverser

@Yoric Do we need to commit to this in the proposal text somehow?

RReverser avatar Mar 12 '18 16:03 RReverser

I believe that TC39 doesn't care about mime types or content encoding (@syg, can you confirm?), so this should probably go to some other proposal.

For the sake of experimenting, I have filed a Firefox bug on the topic. I'll try and find someone to work on it (possibly me) once we have a working multipart tokenizer in Firefox.

Yoric avatar Mar 12 '18 17:03 Yoric

@Yoric Thanks!

RReverser avatar Mar 12 '18 17:03 RReverser

To clarify what I meant: even if TC39 doesn't care about these details, it would still be nice to have all the information about Binary AST (including delivery) in one place, so that it is easy to find and refer to.

RReverser avatar Mar 12 '18 17:03 RReverser

@RReverser Maybe in an Examples section? You'll have to discuss this with @syg, he's the Master of the Spec.

Yoric avatar Mar 12 '18 20:03 Yoric

FWIW, if this ends up being a thing this will need to be defined in the HTML Standard. Having a monkey patch of sorts of the algorithms there would be good so tests and such can be written against that. You might also want to file an upstream issue at https://github.com/whatwg/html/issues/new to make it clear you're extending the algorithms defined there.

annevk avatar Mar 26 '18 17:03 annevk

I also think we might want to restrict these to be CORS-loaded, as with module scripts. That's another thing that'll need to be defined here.

annevk avatar Mar 26 '18 17:03 annevk

@annevk

> to make it clear you're extending the algorithms defined there

Given that this most likely won't be allowed inline in script tags, what algorithms do you think we'll need to change in the HTML spec? As far as I can see, this proposal could get away with no changes to the actual normative sections, since it should behave exactly like any other external script element with text/javascript and such, just as gzip and brotli don't have any special handling in the HTML spec (AFAIK).

> I also think we might want to restrict these to be CORS-loaded, as with module scripts.

Won't this break existing pages relying on scripts in the "transparent optimisation" scenario described above?

RReverser avatar Mar 26 '18 17:03 RReverser

@RReverser for classic scripts HTML requires all responses, regardless of their Content-Type header, to be parsed per the JavaScript specification. This would change that, no?

As for requiring CORS, we sorta decided to do that for new types of resources. Given what we now know about attacks on opaque responses that seems like a good thing. It seems bad to me to allow new types of resources to be loaded without CORS, thereby continuing to support known bad patterns.

annevk avatar Mar 26 '18 18:03 annevk

@annevk

> This would change that, no?

I suppose that's true, yeah, although first we would need to have ECMAScript spec changes landed to have something to link to.

> As for requiring CORS, we sorta decided to do that for new types of resources.

I agree in general, but what concerns me in this case is that it's not really a new type of resource (as in regular definition of "resource type"), rather a special encoding of existing ones that should work transparently for actual websites.

So I'm not saying we shouldn't do it per se, but I do wonder if there are scenarios where it would break websites that rely on third-party scripts from a host that decides to opt in to Binary AST. Or, if it's mostly third-party, this shouldn't cause any new issues?

RReverser avatar Mar 26 '18 18:03 RReverser

> first we would need to have ECMAScript spec changes landed

The HTML Standard links directly to some ECMAScript proposals, such as BigInt and import(), that are expected to make it.

If the third-party makes the choice that would be problematic, yes. However, we could make it so that the Accept header is not modified for such requests (that might be a good idea regardless, as modifying the Accept header is itself a slight same-origin policy violation imo).
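
One possible reading of that suggestion, sketched as a client-side policy: advertise Binary AST only for same-origin or CORS-enabled script requests, and leave Accept as `*/*` otherwise. The function names, the `cors_enabled` flag (standing in for a `crossorigin` attribute on the script element), and the `application/binjs` type are assumptions for illustration, not anything a browser implements:

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}
BINAST_TYPE = "application/binjs"  # placeholder MIME type from this thread


def origin_of(url: str):
    """Return the (scheme, host, port) triple of a URL, filling in default ports."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname, port)


def script_accept_header(page_url: str, script_url: str, cors_enabled: bool) -> str:
    """Decide which Accept header a script request should carry.

    Cross-origin requests made without CORS keep the unmodified "*/*", so a
    third-party host never sees the Binary AST signal for them.
    """
    same_origin = origin_of(page_url) == origin_of(script_url)
    if same_origin or cors_enabled:
        return BINAST_TYPE + ", */*"
    return "*/*"
```

Under this policy, existing pages that load third-party scripts without CORS would see no change in behavior even if the host opts in to Binary AST.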

annevk avatar Mar 27 '18 06:03 annevk

Minimal change: I'd like to rename the MIME type to application/javascript+binast. I believe that this is clearer.

Yoric avatar Apr 18 '18 14:04 Yoric

I don't have any preferences regarding the mime type, so sounds good to me.

RReverser avatar Apr 18 '18 14:04 RReverser

I'd recommend using - instead of +, since binast is specific to JavaScript, if I'm not mistaken, and not a generally applicable suffix.

annevk avatar Apr 18 '18 14:04 annevk

Well, I have the not-so-well-hidden idea of applying this to json soon, and then to experiment with css and html, so I believe that the + makes sense. What do you think, @annevk?

Yoric avatar Apr 18 '18 15:04 Yoric

Yeah, then it certainly would.

annevk avatar Apr 18 '18 15:04 annevk

> Well, I have the not-so-well-hidden idea of applying this to json soon

How does the result differ from CBOR?

hsivonen avatar Apr 20 '18 06:04 hsivonen

I wasn't aware of CBOR. This seems to be pretty much equivalent.

Yoric avatar Apr 20 '18 06:04 Yoric

On the question of + versus - in the MIME type: the existence of application/foo+bar implies an underlying standalone MIME type of application/bar. So, if you think it makes sense to have an application/binast, then the plus might make sense.

I suspect that's probably not what you want. But once you make a decision, I'm happy to help you figure out how to get this all registered. The MIME type registration would be an IETF item.

Tagging @linuxwolf for his awareness.

adamroach avatar Jun 15 '18 20:06 adamroach

@adamroach Would application/bar+foo also imply a MIME type of just application/bar?

j-f1 avatar Jun 15 '18 20:06 j-f1

@j-f1 no, e.g., there's image/svg+xml, but not image/svg (though note there's also no image/xml so @adamroach's rule doesn't work entirely either).
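
To make the + versus - distinction concrete, here is a small sketch that splits a media type into its top-level type, base subtype, and structured syntax suffix, treating whatever follows the last "+" in the subtype as the suffix (the RFC 6838 convention). The function name is made up for illustration:

```python
from typing import Optional, Tuple


def split_media_type(media_type: str) -> Tuple[str, str, Optional[str]]:
    """Split "type/subtype+suffix" into (type, base subtype, suffix or None)."""
    # Drop any parameters such as "; charset=utf-8" and normalise case.
    essence = media_type.split(";", 1)[0].strip().lower()
    top, sep, subtype = essence.partition("/")
    if not sep or not top or not subtype:
        raise ValueError("not a media type: %r" % media_type)
    base, plus, suffix = subtype.rpartition("+")
    if not plus:
        # No "+" at all: the whole subtype is the base and there is no suffix.
        return top, subtype, None
    return top, base, suffix
```

With this split, `application/javascript+binast` yields the suffix `binast` (a format other types could share), while `application/javascript-binast` is just one opaque subtype with no suffix, and `image/svg+xml` yields `svg` with suffix `xml`.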

annevk avatar Jun 16 '18 13:06 annevk