html icon indicating copy to clipboard operation
html copied to clipboard

Standardize YouTube embed rewriting?

Open RByers opened this issue 9 years ago • 37 comments

Apparently chromium, Gecko and WebKit all have code for rewriting YouTube flash embeds to use HTML instead (so they work when flash is disabled, such as on mobile):

If this is now effectively required for web compat, perhaps we should work to align our implementations and specify it in HTML just to make this a little more predictable / sane? For automated testing purposes, perhaps we should include in the definition an additional hostname like youtube.web-platform.test?

@mounirlamouri can you share any data on how much of the web benefits from this? I imagine there may be other cases where we want to ask whether this is worth doing. So we may be setting a precedent here. I assume we'd only ever consider something like this when it's has truly massive positive impact on the user experience. I think it's worth having some discussion on what magnitude of usage qualifies (eg. as a fraction of page views, and/or a fraction of HTTP Archive pages).

RByers avatar Feb 23 '17 15:02 RByers

According to our data, this is affecting somewhere between 1-3% of our users on desktop and less than that on mobile (probably half). It's not huge because it's mostly for old websites, which is also the reason why mobile is less affected.

Is there any other data you would like to see?

mounirlamouri avatar Feb 24 '17 11:02 mounirlamouri

From https://nerdydata.com/ ("We index approximately 160 million websites") searching in "Deep Web":

src="//www.youtube.com/v/
Showing 1 - 50 (of approx. 22,090 found)

src="http://www.youtube.com/v/
Showing 1 - 50 (of approx. 281,530 found) 

src="https://www.youtube.com/v/
Showing 1 - 50 (of approx. 151,810 found) 

data="//www.youtube.com/v/
Showing 1 - 50 (of approx. 3,290 found) 

data="http://www.youtube.com/v/
Showing 1 - 50 (of approx. 28,200 found) 

data="https://www.youtube.com/v/
Showing 1 - 50 (of approx. 1,175 found) 

Adding these together and assuming the worst case (1 match is 1 site), that is ~0.3%.

zcorpan avatar Feb 24 '17 12:02 zcorpan

For comparison, we have <image>:

   <dt>A start tag whose tag name is "image"</dt>
   <dd>
    <!-- As of 2005-12, studies showed that around 0.2% of pages used the <image> element. -->
    <p><span>Parse error</span>. Change the token's tag name to "img" and reprocess it. (Don't
    ask.)</p>
   </dd>

zcorpan avatar Feb 24 '17 13:02 zcorpan

Thanks guys! As expected, this is pretty compelling data that this is important enough to justify some sort of fix. Here's some HTTP Archive data about the popularity of different embed origins/types (of the top 500k sites). So YouTube is present on ~0.27% of top sites, with histats.com behind that at about a tenth of that at 0.029%.

Rank	Origin	Type	Page count
1	http://www.youtube.com	application/x-shockwave-flash	1087	 
2	https://www.youtube.com	application/x-shockwave-flash	234	 
3	http://s10.histats.com	application/x-shockwave-flash	143	 
4	http://player.youku.com	application/x-shockwave-flash	102	 
5	http://www.xatech.com	application/x-shockwave-flash	98	 
6	http://	application/x-shockwave-flash	71	 
7	http://www.clocklink.com	application/x-shockwave-flash	69	 
8	http://blog.naver.com	application/x-shockwave-flash	66	 
9	http://www.dailymotion.com	application/x-shockwave-flash	61	 
10	http://img.i2i.jp	application/x-shockwave-flash	49	 
11	http://www.	application/x-shockwave-flash	49	 
12	http://www.smartadserver.com	application/x-shockwave-flash	48	 
13	http://home.uchat.co.kr	application/x-shockwave-flash	45	 
14	http://content.pop6.com	application/x-shockwave-flash	43	 
15	http://embed.scribblelive.com	application/x-shockwave-flash	41	
16	http://www.ustream.tv	application/x-shockwave-flash	32	 
17	http://chat.53kf.com	application/x-shockwave-flash	31	 
18	http://www.flickr.com	application/x-shockwave-flash	28	 
19	http://static-hw.xvideos.com	application/x-shockwave-flash	27	 
20	http://static.video.qq.com	application/x-shockwave-flash	27	 
21	http://pic.img80.com	application/x-shockwave-flash	26	 
22	http://vimeo.com	application/x-shockwave-flash	26	 
23	http://s3.rotaban.ru	application/x-shockwave-flash	24	 
24	https://embed-ssl.wistia.com	application/x-shockwave-flash	23	 
25	http://s4.rotaban.ru	application/x-shockwave-flash	23	 
26	http://h2.flashvortex.com	application/x-shockwave-flash	21	 
27	http://www.youtube-nocookie.com	application/x-shockwave-flash	19	 
28	http://c.brightcove.com	application/x-shockwave-flash	19	 
29	http://ws.amazon.com	application/x-shockwave-flash	19	 
30	http://localtimes.info	application/x-shockwave-flash	19	 

Query:

SELECT 
  REGEXP_EXTRACT(body, r'<embed[^>]*src=[\'"]?(http[s]?://[^/ "\']*)') AS embedsrc,
  REGEXP_EXTRACT(body, r'<embed[^>]*type=[\'"]?([^ "\']*)') AS type,
  COUNT(DISTINCT bodies.page) as count
  FROM [httparchive:har.2017_01_15_chrome_requests_bodies] 
  AS bodies  
JOIN (
  SELECT 
    page, url,
    JSON_EXTRACT(payload, '$._contentType') AS contentType
  FROM [httparchive:har.2017_01_15_chrome_requests]
) AS requests ON requests.url=bodies.url AND requests.page=bodies.page
WHERE REGEXP_MATCH(body, r'<embed[^>]*src=[\'"]?http[s]?://')
GROUP BY embedsrc, type
ORDER BY count DESC
LIMIT 5000

RByers avatar Feb 24 '17 19:02 RByers

I would be surprised this was required for compatibility.

rniwa avatar Feb 25 '17 05:02 rniwa

Our firefox telemetry had it at somewhere around 1% of users also when we implemented embed rewriting. It's now down to about 0.5% on our release branch.

https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2017-01-25&keys=none!none!none&max_channel_version=release%252F51&measure=YOUTUBE_REWRITABLE_EMBED_SEEN&min_channel_version=null&processType=*&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2017-01-18&table=0&trim=1&use_submission_date=0

qdot avatar Feb 27 '17 21:02 qdot

I would be surprised this was required for compatibility.

If there's no Flash plugin on the device or platform, this is how you get the embedded videos to keep working, right?

miketaylr avatar Feb 27 '17 21:02 miketaylr

Couldn't most of the "rewriting" be done by YouTube instead of in the browser engine, though? That is, if the only hack in the browser engine is type="application/x-shockwave-flash" -> type="text/html" for youtube.com/v/.

(And maybe eventually, do such type rewriting for any URL...?)

zcorpan avatar Feb 27 '17 22:02 zcorpan

Yeah, we've been discussing a bit a new generic web platform feature that could enable YouTube (and everyone else in a similar position) to solve this problem themselves (basically upgrading a flash embed to an iframe). It's not clear to me what the exact requirements would be for YouTube to be able to use such a feature. If people here feel that is worth pursuing, I can see if we can find someone at Google to drive this.

RByers avatar Mar 02 '17 16:03 RByers

Update: @mounirlamouri has pointed out that flash embeds are now completely deprecated by youtube. We've still got per-browser solutions to rewriting. Should we revisit this?

qdot avatar Sep 08 '17 20:09 qdot

Is there a counter somewhere for

<embed src="https://www.youtube.com/…"
       type="application/x-shockwave-flash"
…

karlcow avatar Oct 30 '23 17:10 karlcow

Hi @karlcow, use counters for embeds that are eligible for rewriting were added to Firefox a few months ago. See bug 1847295 for more info :slightly_smiling_face:

https://mozilla.github.io/usecounters/#channel=nightly&group=YOUTUBEFLASHEMBED

gregorypappas avatar Nov 14 '23 23:11 gregorypappas

Can Chromium also implement a use counter?

zcorpan avatar Nov 20 '23 16:11 zcorpan

Can Chromium also implement a use counter?

I am embarrassed to see that I didn't add a UseCounter for this years ago. Absolutely, we can not only add a counter, but also make an informed opinion on our ability to remove this logic from chromium. Bug 1503916 assigned to myself, will plan to land the counter today.

Also note that Chromium's impl isn't just about YouTube anymore, but also dailymotion.com and vimeo.com. I assume we'd want to tackle all three at once, right? I'm adding a UseCounter for all such embed rewrites, but if it turns out to be too large to break, then we could always dig in and look at each independently (though perhaps if we're going to spec one, we should just spec them all?).

RByers avatar Nov 20 '23 19:11 RByers

OK, great. As far as I can tell, Gecko only rewrites for Youtube (here). I'll check if we've received web compat issues for dailymotion/vimeo embeds.

zcorpan avatar Nov 21 '23 12:11 zcorpan

The counter has landed but probably too late to request a merge into Chrome 120, so it won't hit stable until Chrome 121 late January. Regardless data from early release channels should start showing up here soon. It's wildly unreliable but I have initial data from Chrome Canary on Windows - on the order of 0.001% of page loads, inline with the Firefox data. I'd guess that means it's too common still for us to just remove, but it's not impossible we could do so with more work. In particular, now that I have a rough idea of magnitude and that this is indeed a candidate for potential removal, I can opt this counter into anonymized URL collection - with any luck we could find >99% of the usage is one or two sites suggesting removal is potentially easy.

If we can't immediately remove it would you want to spec it, or maybe since it's been unspecced for so long we just wait another 6 months or something to get a sense for the trend line?

RByers avatar Nov 24 '23 14:11 RByers

Individual use counters for dailymotion/vimeo/youtube would be nice. My feeling, based on absolutely nothing, is that unshipping vimeo/dailymotion embed rewriting right now is viable. This would bring all the implementations closer together, allowing for easier specification if necessary.

I personally would like to see this hack completely removed from the web, so +1 for waiting another 6 months or so to get a sense for the trend line.

gregorypappas avatar Nov 24 '23 17:11 gregorypappas

Regardless of the Usecounter values, I'm personally uncomfortable with Google Chrome having a workaround just for YouTube - even if it is, by far, the largest use. I'd prefer to either remove them all or keep them all in Chromium.

RByers avatar Nov 24 '23 19:11 RByers

Let's see if it's possible to remove all of them, or simplify (maybe treat Flash type as text/html and let the affected sites serve HTML or redirect?).

zcorpan avatar Nov 28 '23 15:11 zcorpan

https://chromestatus.com/metrics/feature/timeline/popularity/4740 now shows about 0.002% for December and January. No hits in "Adoption of the feature on top sites".

@RByers

In particular, now that I have a rough idea of magnitude and that this is indeed a candidate for potential removal, I can opt this counter into anonymized URL collection - with any luck we could find >99% of the usage is one or two sites suggesting removal is potentially easy.

Nice, what's the status of this?

zcorpan avatar Jan 16 '24 10:01 zcorpan

I just checked and there's not enough data from Chrome 121 beta to identify any popular origins hitting this regularly enough (need >50 users per day for our privacy policy). Chrome 121 hits stable in early Feb, so hopefully we'll get some hits around mid Feb.

RByers avatar Jan 16 '24 15:01 RByers

@RByers mid-Feb has passed. :) Is there enough data now for anonymized URL collection?

zcorpan avatar Feb 28 '24 10:02 zcorpan

Yep, thanks for the ping. We do indeed have good data now, and I've done an analysis and shared some examples in the chromium issue. The usage appears to have quite a long tail (top 10 origins account only for 2% of the usage we can measure), and there are lots of examples like blogs and forum posts which are unlikely to get fixed but where the user experience would be significantly diminished by breaking the embeds. As a result, I don't believe we can reasonably remove this behaviour from Chromium, sorry! I also don't see any evidence of downward trend in the few months of UseCounter data we now have. So I don't expect this to just resolve with time over the next few years.

So my recommendation is that we standardize this rewrite. Thoughts?

RByers avatar Mar 09 '24 19:03 RByers

Thanks!

I agree. @gregorypappas and I also looked at a few in the "top sites" of https://chromestatus.com/metrics/feature/timeline/popularity/4740 , it looks like there are various education and government sites affected also.

WebKit tried to remove rewriting in https://bugs.webkit.org/show_bug.cgi?id=224453 but brought it back in https://bugs.webkit.org/show_bug.cgi?id=237182 cc @achristensen07 @cdumez

Since Gecko and WebKit seem to be fine with only YouTube rewrite, not Vimeo or Dailymotion, that seems like the minimum to standardize to address the web compat problem. Is that a problem for Chromium even if you can point to the spec?

zcorpan avatar Mar 12 '24 10:03 zcorpan

Quick GitHub search to get a sense of how common vimeo/dailymotion embeds are compared to youtube.

  • https://github.com/search?q=youtube.com%2Fv%2F&type=code 82.7k
  • https://github.com/search?q=youtube-nocookie.com%2Fv%2F&type=code 1.8k
  • https://github.com/search?q=vimeo.com%2Fmoogaloop.swf&type=code 7.7k
  • https://github.com/search?q=dailymotion.com%2Fswf%2F&type=code 1.7k

zcorpan avatar Mar 12 '24 10:03 zcorpan

Oh nice stats. From my perspective it doesn't make the system really any more complex to have a list of rewrite rules rather than a single one, and there's clearly non-trivial usage of Vimeo too. But if there's no other implementation interested in adding the Vimeo or Dailymotion support, then yeah WHATWG has to just standardize the only thing with multiple-implementor interest.

Chromium will keep the others as non-standard extensions. I just want to go on record saying I don't think it's a great look for the HTML spec to special case a Google property alone, and certainly not something Google would endorse or encourage 😄 . There was a general web compat problem with embedded video when the web evolved to not depend on flash for video, and Chromium is happy to mitigate that compat problem for any video provider with non-trivial usage of their legacy flash embed.

RByers avatar Mar 14 '24 19:03 RByers

Well now you're being reasonable, @RByers! 😄

@gregorypappas thoughts about supporting vimeo and dailymotion rewrites? @achristensen07 @cdumez same for WebKit?

zcorpan avatar Mar 15 '24 08:03 zcorpan

thoughts about supporting vimeo and dailymotion

It wouldn't be hard, and doing so would mildly increase interoperability, so we might as well just do it. As noted in earlier comments, I still oppose this mechanism in principal, and would hate to see these hacks remain in implementations forever. So even when standardized, I think we should always be eyeing the removal of this stuff if usage decreases enough, even if it takes 20 years :)

gregorypappas avatar Mar 16 '24 20:03 gregorypappas

Does/will everyone at least log some kind of console error similar to synchronous XMLHttpRequest?

annevk avatar Mar 17 '24 15:03 annevk

Does/will everyone at least log some kind of console error similar to synchronous XMLHttpRequest?

Sorry I missed this. I'll do you one better, in Chromium I'll add a deprecation report (which has an associated console error) so that sites watching for usage of deprecated functionality in Chrome in the wild will get a hit on this. In practice we've found that this (and console errors) don't tend to actually drive a meaningful reduction in usage, but I still think it's the right thing to be doing. With luck there will be a point some years in the future where we can say the web doesn't need this anymore (we're just long past the point of hoping that'll come soon IMHO).

RByers avatar Sep 10 '24 15:09 RByers