Redirected range requests and preflights.

Open mikewest opened this issue 10 years ago • 17 comments

Chrome has some funky behavior around HTMLMediaElement + redirected range requests.

https://codereview.chromium.org/1220963004 denied responses to range requests if their origin is distinct from the origin response for the initial request.

https://codereview.chromium.org/1356353003 relaxes that restriction to accept responses to range requests if they're CORS-same-origin with the origin response from the initial request. It also treats "range" as a simple header for the purposes of preflights if the request is CORS enabled (e.g. <video crossorigin ...>).

It would be nice to spec this out in a sane way. :)

mikewest avatar Oct 27 '15 07:10 mikewest

+@tyoshino

mikewest avatar Oct 27 '15 07:10 mikewest

cc @mnot - I'm a bit confused on the context - is this saying that 2 different uris with 206 responses should be stitched together just because they both had the same original uri before redirection? (and if they pass cors). That seems odd - they're different resources.

mcmanus avatar Oct 27 '15 16:10 mcmanus

This scheme is already in use widely by CDNs. Chrome's HTMLMediaElement is stitching fragments served for different URLs together (see the opening comment of https://code.google.com/p/chromium/issues/detail?id=532569 by strobe). Chrome's resource loader in general doesn't.

Given the situation, it seems we could document requirements for such an approach to make sure it's secure. It doesn't necessarily require all "fetching" on the web platform to do the stitching.

tyoshino avatar Oct 29 '15 07:10 tyoshino

Doing it generically would indeed be very broken.

To do this for a specific application (e.g., HTMLMediaElement), you need a really explicit assertion that not only are the two resources equivalent, but also that the two specific representations are exactly the same -- e.g., ETag sharing. Even then, this is not something happening in HTTP -- it has to be built on top.

See: http://httpwg.github.io/specs/rfc7233.html#combining.byte.ranges and http://httpwg.github.io/specs/rfc7234.html#combining.responses
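As a rough illustration of the "ETag sharing" condition, here is a minimal sketch of a check that two partial responses carry the same strong validator before their bytes are stitched together. The function names and the lowercase-keyed header dictionaries are assumptions made for the example; they are not taken from any browser or from the RFCs.

```python
from typing import Mapping, Optional


def strong_etag(etag: Optional[str]) -> Optional[str]:
    """Return the ETag if it is a strong validator, otherwise None."""
    # Weak validators are prefixed with W/ and cannot prove byte equality.
    if not etag or etag.startswith("W/"):
        return None
    return etag


def can_combine(first: Mapping[str, str], second: Mapping[str, str]) -> bool:
    """True only if both partial responses share the same strong ETag.

    `first` and `second` are hypothetical header mappings with lowercase keys.
    """
    a = strong_etag(first.get("etag"))
    b = strong_etag(second.get("etag"))
    return a is not None and a == b
```

A missing or weak ETag on either side makes the check fail, which is the conservative choice: without a shared strong validator there is no assertion that the two representations are byte-for-byte the same.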

mnot avatar Oct 29 '15 08:10 mnot

Are we doing this @rocallahan?

annevk avatar Oct 30 '15 05:10 annevk

When our media resource loader takes over an HTTP load, it uses the final post-all-redirects URI as its canonical URI for the resource. All subsequent range requests start with that URI; if further redirects occur, they are honoured. The principal(s) associated with the media data are gathered from all final-URIs. If these are different origins that's generally OK: we'll still play the media, though (since at least one of those origins must not be same-origin with the page) certain APIs will be affected (e.g. after drawing a video frame to a canvas, the canvas will be tainted).

I'm not familiar with the CDN setup described in https://code.google.com/p/chromium/issues/detail?id=532569, but I assume the CDN has a canonical URI which redirects quasi-randomly to one of many mirror URIs, and the mirror URIs never do any more redirects. If so, then by using the final URI from the first load for every subsequent range request we're avoiding any issues.
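The model described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `MediaLoad` class, its method names, and the simplified origin tuple are all hypothetical, and the sketch ignores CORS entirely.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Simplified (scheme, host, port) origin; default ports are filled in."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS.get(parts.scheme))


class MediaLoad:
    """Hypothetical bookkeeping for one media resource."""

    def __init__(self, page_url: str, redirect_chain: list):
        # redirect_chain lists the URLs visited by the initial load;
        # its last entry is the final post-all-redirects URL.
        self.page_origin = origin(page_url)
        self.canonical_url = redirect_chain[-1]
        self.origins = {origin(self.canonical_url)}

    def next_request_url(self) -> str:
        # Every subsequent range request starts at the canonical URL.
        return self.canonical_url

    def record_range_response(self, redirect_chain: list) -> None:
        # Further redirects are honoured; the final URL's origin is collected.
        self.origins.add(origin(redirect_chain[-1]))

    def taints_canvas(self) -> bool:
        # Any collected origin that differs from the page's origin taints.
        return any(o != self.page_origin for o in self.origins)
```

With a canonical CDN URL that redirects once to a mirror, `next_request_url()` returns the mirror URL for all later ranges, so the quasi-random redirect is only taken on the first load.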

rocallahan avatar Oct 30 '15 05:10 rocallahan

Okay, so it sounds like the HTML standard would need to do this for media elements. @foolip, have you looked into doing this? It would perhaps also require some overrides then to make sure Fetch does not do anything bad upstream.

annevk avatar Nov 03 '15 14:11 annevk

I haven't given this any thought in the spec, no. What I do know is that media elements integrate with the network layer in a rather unique way. That seems to be true of all implementations, and certainly was in Presto.

The problem of knowing that the resource is the same when requesting a second range isn't unique to redirects. Even when the same server responds, you in principle need some sanity checks. I doubt that these are interoperable today, and I doubt even more that doing the strict checks that would actually make sense (ETag) would really be web compatible.

foolip avatar Nov 04 '15 10:11 foolip

Just to make sure, the proposal by @rocallahan is that once the UA receives any body bytes back from the server, it stops following further redirects?

tyoshino avatar Nov 04 '15 12:11 tyoshino

Seems the model doesn't work for some CDNs. See this post by strobe@ from YouTube https://code.google.com/p/chromium/issues/detail?id=532569#c33

tyoshino avatar Nov 09 '15 18:11 tyoshino

> Just to make sure, the proposal by @rocallahan is that once the UA receives any body bytes back from the server, it stops following further redirects?

Sorry, I thought I was pretty clear and I'm not sure how to make it clearer:

> All subsequent range requests start with that URI; if further redirects occur, they are honoured.

...

> Seems the model doesn't work for some CDNs. See this post by strobe@ from YouTube https://code.google.com/p/chromium/issues/detail?id=532569#c33

That seems to be based on a misunderstanding of what I said.

rocallahan avatar Nov 09 '15 21:11 rocallahan

I wanted to make sure I'm understanding what you said in the second paragraph correctly. It was my mistake that I referred to the paragraph by "proposal".

Thanks for replying to the crbug thread.

tyoshino avatar Nov 09 '15 21:11 tyoshino

https://jewel-chair.glitch.me/same-origin.html

  • Contains an <audio> that points to /audio-redirect-second-part.
  • If the request has a Range that starts at an offset other than 0, the server redirects to /audio-normal.

Chrome: Observes the redirect. Subsequent requests go to /audio-normal.
Firefox: Observes the redirect. Subsequent requests go to /audio-redirect-second-part.
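For reference, the server rule in this first demo could look something like the sketch below. This is a guess at the logic, not the demo's actual source; `redirect_target` is a hypothetical helper, and it only understands a single `bytes=start-` or `bytes=start-end` range.

```python
import re
from typing import Optional


def redirect_target(path: str, range_header: Optional[str]) -> Optional[str]:
    """Return the redirect location, or None to serve the request directly."""
    if path != "/audio-redirect-second-part":
        return None
    # Only a single "start-" or "start-end" byte range is recognised here;
    # suffix ranges and multi-range requests are served without a redirect.
    match = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header or "")
    if match and int(match.group(1)) != 0:
        return "/audio-normal"
    return None
```

A request with `Range: bytes=0-` is served directly, while one with `Range: bytes=1000-` is redirected to /audio-normal, which is the point where the two browsers start to differ.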

https://jewel-chair.glitch.me/same-origin-immediate-redirect.html

  • Contains an <audio> that points to /audio-redirect-first-part.
  • Always redirects to /audio-normal.

Chrome: Observes the redirect. Subsequent requests go to /audio-normal.
Firefox: Observes the redirect. Subsequent requests go to /audio-normal.

I'm looking to spec the correct behaviour here, and I'd like to do the same for other range requests like downloads.

Initially, the Firefox behaviour seems inconsistent. But, if a browser were to request multiple ranges in parallel, Chrome's behaviour could be racy.

I'm not familiar with the CDN pattern @tyoshino mentioned. Are there any further details? Do these CDNs tend to redirect for the initial range, or do they perform multiple redirects for different parts of the media resource?

jakearchibald avatar Mar 06 '18 11:03 jakearchibald

Range is already allowed to be set by media elements due to https://fetch.spec.whatwg.org/#unsafe-request-flag. Not necessarily great as it allows poking holes in the same-origin policy (see also #568), but that is how it is.

annevk avatar Jan 19 '21 13:01 annevk

@horo-t @mikewest it seems Chrome has the strictest handling of media element range requests thanks to your efforts:

  • For the initial request, redirects are followed, resulting in an opaque response at a "final URL".
  • Subsequent requests are made directly to the "final URL". If this "final URL" redirects:
    • If the redirect crosses the origin boundary, error. (Firefox has this too, though it does not make subsequent requests directly to the "final URL".)
    • If the redirect does not end up at the "final URL" (e.g., directly or via another redirect), error. (Firefox does not have this.)
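To make the two error conditions concrete, here is a minimal sketch of the check for a subsequent range request. It is a reading of the bullets above, not Chrome's code; `check_subsequent_request`, the result strings, and the simplified origin tuple are assumptions made for the example.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Simplified (scheme, host, port) origin; default ports are filled in."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS.get(parts.scheme))


def check_subsequent_request(final_url: str, redirect_chain: list) -> str:
    """Classify a subsequent range request.

    redirect_chain lists every URL visited, starting with final_url itself.
    """
    # Rule 1: no hop may leave the origin of the "final URL".
    for url in redirect_chain[1:]:
        if origin(url) != origin(final_url):
            return "error: cross-origin redirect"
    # Rule 2: the chain must end back at the "final URL".
    if redirect_chain[-1] != final_url:
        return "error: did not return to final URL"
    return "ok"
```

Under this reading, a chain such as final URL, then a same-origin validation URL, then the final URL again is accepted, while any chain that ends elsewhere or leaves the origin is rejected.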

Given that rather weird behavior it seems we might be able to outlaw redirects for subsequent requests completely. This would also help https://github.com/annevk/orb, though it does not matter much. Is there a reason they are allowed? And if not, are you interested in simplifying that logic?

cc @padenot @anforowicz

annevk avatar Jan 20 '21 15:01 annevk

If we can get away with dropping redirects entirely, I'd be happy too. @jakearchibald might have more context on how we landed on the current behavior?

mikewest avatar Jan 22 '21 09:01 mikewest

IIRC, the subsequent redirects are sometimes used to reauthenticate the resource. For example, you watch a video for some time and then walk away for a couple of hours. Upon clicking play again, the provider may need to reauthenticate your session (for content license requirements), which may redirect through some validation before going back to the final redirected URL.

dalecurtis avatar Jan 25 '21 17:01 dalecurtis