
[View Transitions] Extend render-blocking to support Document

Open khushalsagar opened this issue 1 year ago • 83 comments

render-blocking allows the author to specify which resources are important for a good first paint. This allows the browser to defer a Document's first paint (new load only) until a timeout is reached.

It currently supports a limited set of resources that allow the blocking attribute. Two common use-cases which are missing are:

  • Images. Documents frequently have a hero image. While a network delay here could be too slow, fetching them from the disk cache could be worth the delay.
  • Document itself. Authors have no way of saying whether the "above the fold" content has been parsed and fetched.

It would be good to extend this concept to cover the above use-cases. Likely with a generic promise based API like:

document.renderBlockUntil(loadBlockingResources);

The downside of a promise based API is that the browser has no visibility into which resources are render-blocking, to prioritize fetching them. But it provides a lot of flexibility, and would work well if other APIs can already be used to prioritize network work.

This API would also be an important building block for View Transitions.

@noamr @xiaochengh @yoavweiss @nickcoury

khushalsagar avatar May 25 '23 13:05 khushalsagar

I was just prototyping a version of this yesterday with a pair of style tags:

<style>
.render-blocked {
  display: none;
}
</style>
<!-- Above the fold content -->
<style>
.render-blocked {
  display: block;
}
</style>

There are several uses I see for this:

  1. Avoid flashes of partially rendered content.
  2. Lower the difficulty of getting frame-perfect transitions especially with more complicated DOM manipulations and async code paths. (VT API can also fulfill this if appropriate)
  3. Possible performance gains from reducing layout thrashing.

JS API

I don't have a strong sense of an ergonomic JS API yet, but the promise one seems reasonable and would be flexible. The other option would be an explicit start() and stop() pair, though the advantage of the promise approach is that multiple pieces of code could independently render-block, and one wouldn't accidentally unblock the other.
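
To make the "independent blockers" concern concrete, here is a minimal sketch. It assumes a hypothetical page-level registry (createRenderBlockRegistry, block and isBlocked are made-up names, not part of any browser API) where each caller gets its own release handle, so one caller cannot unblock rendering on behalf of another:

```javascript
// Hypothetical sketch: not a browser API. Tracks independent render blockers
// so that one caller cannot accidentally unblock rendering for another.
function createRenderBlockRegistry(onUnblocked) {
  const active = new Set();

  return {
    // Register a blocker; the returned function releases only this blocker.
    block(label) {
      const token = { label };
      active.add(token);
      let released = false;
      return function release() {
        if (released) return; // releasing twice is a no-op
        released = true;
        active.delete(token);
        if (active.size === 0) onUnblocked();
      };
    },
    isBlocked() {
      return active.size > 0;
    },
  };
}

// Usage: two services block independently; rendering resumes only after both release.
let unblockCount = 0;
const registry = createRenderBlockRegistry(() => { unblockCount += 1; });
const releaseTransition = registry.block("transition-service");
const releaseRenderer = registry.block("render-service");
releaseTransition();
const blockedAfterFirstRelease = registry.isBlocked(); // renderer is still blocking
releaseRenderer();
const blockedAfterSecondRelease = registry.isBlocked(); // nothing left to wait for
```

A promise-per-caller API would give the same isolation implicitly, since each caller only controls the promise it passed in.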

A timeout could be a convenient parameter. It could be manually built into the promise API, though that's subject to a setTimeout not being blocked on the main thread.
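
A rough sketch of building the timeout into the promise by hand, assuming the document.renderBlockUntil shape floated earlier in this thread (which is a proposal, not a shipped API). withTimeout is a hypothetical helper name:

```javascript
// Hypothetical helper: settles when `ready` settles or after `ms`, whichever is first.
// Caveat from the comment above: the timer callback still needs the main thread,
// so a long task can delay the timeout past `ms`.
function withTimeout(ready, ms) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve("timeout"), ms);
  });
  const settled = Promise.resolve(ready).then(
    () => "ready",
    () => "failed" // a rejected promise should not block rendering forever
  );
  // Clear the timer once the race is decided so it does not linger.
  return Promise.race([settled, timeout]).finally(() => clearTimeout(timer));
}

// Intended usage with the proposed (hypothetical) API:
// document.renderBlockUntil(withTimeout(loadBlockingResources, 500));
```

A built-in timeout parameter would avoid the main-thread dependency, since the browser could enforce it outside of script.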

On point 2 above, we frequently have cases where we remove old DOM, run some cleanup code, render new DOM, hydrate controllers that may further change DOM on instantiation, then finally have the final state ready. Given a lot of code is built to be async and non-blocking by releasing control back to the main thread, or uses requestAnimationFrame for certain use cases, we sometimes see 1 frame flashes of intermediate state before the final render. This is jarring to users and takes tedious coordination of client and framework code to prevent. This would bypass all of that and allow better structured code to focus on behaviors rather than rendering quirks.

One example we hit is fetching content from an async cache. Our transition service is separate from our fetching and rendering services. The transition code is responsible for cleaning up the previous content and removing old DOM, but the fetching + rendering holds the implementation details for getting and rendering new content. If the fetch service gets the content from a local async cache, the timing is such that we can show a blank page briefly before the renderer gets the content. We had to add complexity to coordinate this, rather than the transition service blocking rendering, removing old code, sending the request to fetch and render, then unblocking once the renderer is finished.

One consideration that comes to mind from some VT API use case is if animations or videos are allowed to continue to run during render blocking. In the previous example, the transition service might play a partial fade out animation on the old content to show the user the page is changing, not knowing if the fetch service will make a network request or hit local cache. Though I'm not sure if the complexity required for this is worth it.

HTML API

I can see the role of an HTML API, especially if this is used on an initial page render where JS will have less knowledge of the streaming page as it is parsed than the browser will. It might look something like this.

<meta rb-id="unblock" rb-timeout="500">
<!-- Above the fold content -->
<div id="unblock"></div>

I'm less certain this would be useful after the initial streaming load, as usually HTML inserted later on can be batched instead of streamed.

nickcoury avatar May 26 '23 03:05 nickcoury

It sounds like the use-case for a JS API is primarily for same-document navigations where removing the old DOM and populating the new DOM can't be an atomic operation. Does the HTML API suffice when loading a new Document? We're erring towards a declarative solution since it works better with browser optimizations like preload scanning.

One consideration that comes to mind from some VT API use case is if animations or videos are allowed to continue to run during render blocking.

This is fairly hard. Render blocking works by pausing frame production completely (as if the tab was backgrounded) so no animations or video playback can run. As an internal detail, we can keep accelerated animations running while you're mutating the DOM. But it becomes very difficult to reason about if the DOM mutation removes some element and its associated animation keeps running.

If you're ok with the page being static then you can kinda get what you want by hacking the VT API just for the render-blocking part. Something like:

document.documentElement.style.viewTransitionName = "none";
document.startViewTransition(asyncDOMUpdate);

We had to add complexity to coordinate this, rather than the transition service blocking rendering, removing old code, sending the request to fetch and render, then unblocking once the renderer is finished.

The transition service can use the code above to block rendering in the sequence you mentioned. This seems ok if the async part is hitting a local cache. If the fetch needs to hit the network, you probably still want to do that before suppressing rendering to keep the current page animating and interactive while waiting on the network.
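
Putting that ordering together, a minimal sketch under some assumptions: fetchContent and renderContent are hypothetical hooks into the app's own services, and doc is passed in so the function can fall back where startViewTransition is unavailable. The root's name is restored only after the transition finishes, so the root is not captured as part of the new state:

```javascript
// Sketch only: `fetchContent` and `renderContent` are hypothetical app hooks.
async function navigateWithRenderBlock(doc, fetchContent, renderContent) {
  // 1. Do the potentially slow work first, while the old page stays
  //    animating and interactive.
  const content = await fetchContent();

  // 2. Fall back to a plain update where the API is unavailable.
  if (typeof doc.startViewTransition !== "function") {
    await renderContent(content);
    return "updated-without-blocking";
  }

  // 3. Opt the root out of capture, then use the transition only to
  //    suppress rendering while the DOM update runs.
  const root = doc.documentElement;
  const previousName = root.style.viewTransitionName;
  root.style.viewTransitionName = "none";
  try {
    const transition = doc.startViewTransition(() => renderContent(content));
    await transition.finished;
  } finally {
    root.style.viewTransitionName = previousName;
  }
  return "updated-while-blocked";
}
```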

khushalsagar avatar Jun 02 '23 21:06 khushalsagar

Yes, that's the primary use case for the JS API for us, and the VT API would likely work well enough as long as it's still performant on a window of a frame or two.

nickcoury avatar Jun 06 '23 02:06 nickcoury

As spec'd, startViewTransition will do the following:

  1. Render one frame and snapshot it. If no element has a view-transition-name, snapshotting has no cost. Chrome's current implementation might have a one frame delay here (we wait for this frame to progress until a point before realizing no element needs a snapshot) which we can optimize.
  2. Dispatch the callback passed to that function. Now it's up to the author how long this takes.
  3. When the callback is done (or timeout happens), start rendering again.

There shouldn't be any performance issues even if the callback takes one or two frames' worth of latency. Let us know if you're not seeing that in practice.

khushalsagar avatar Jun 06 '23 20:06 khushalsagar

I wonder if for the view transitions case this can be a CSS property rather than an HTML property, e.g. putting view-transition-capture: stall or so on elements that are not yet "settled", giving a hint to the UA that the new state is not ready to be captured. This means, though, that we have to keep calculating style even if we end up not performing layout/paint.

noamr avatar Jun 14 '23 06:06 noamr

An explainer clarifying the use-case and a concrete proposal is now available here.

khushalsagar avatar Aug 09 '23 21:08 khushalsagar

One question on the proposal is if requiring JavaScript to unblock rendering will be a performance bottleneck relative to a more declarative solution? My understanding is that the context switch to JS land can be expensive especially on initial page load when a lot of things are going on. Or is that just as performant as an HTML or CSS directive to unblock?

nickcoury avatar Aug 10 '23 04:08 nickcoury

For the partial rendering case, this proposal expects authors to:

  1. know the layout (true most of the time)
  2. know the "settled" DOM content (not true in many dynamic/interactive contexts)
  3. make a judgement call about a "worst-case scenario" - the most content that will generally be rendered above the fold, given the known layout and current device/software landscape, and common user preference settings. (this is really hard, see also, the frequency of above-the-fold <img loading=lazy>)

That's a lot to ask, and I worry that poor usage of this API (because of the difficulty and in many cases impossibility of using it well) will lead to worse experiences for users (very slow LCPs).

eeeps avatar Aug 10 '23 16:08 eeeps

One question on the proposal is if requiring JavaScript to unblock rendering will be a performance bottleneck relative to a more declarative solution? My understanding is that the context switch to JS land can be expensive especially on initial page load when a lot of things are going on. Or is that just as performant as an HTML or CSS directive to unblock?

I don't think the script switch should be a problem. If an author is explicitly unblocking rendering before full Document parsing, that's a strong hint that this is a good spot for the parser to yield for rendering. So a context switch away from the parser will likely be done anyway.

That said, a declarative solution like a new tag: <unblock render> is also doable. The script way is preferred just for consistency, since that's how authors are supposed to preemptively unblock rendering on other resources (script, stylesheets). @mfreed7 @chrishtr in case they have an opinion on whether a declarative tag is worth any perf implications.

That's a lot to ask, and I worry that poor usage of this API

The targeted use-case for partial rendering is View Transitions. The author knows exactly which DOM nodes need to animate (will have a view-transition-name) for each transition. So they can unblock rendering when the last required node for a transition has been parsed.

khushalsagar avatar Aug 10 '23 16:08 khushalsagar

For the partial rendering case, this proposal expects authors to:

  1. know the layout (true most of the time)

  2. know the "settled" DOM content (not true in many dynamic/interactive contexts)

  3. make a judgement call about a "worst-case scenario" - the most content that will generally be rendered above the fold, given the known layout and current device/software landscape, and common user preference settings. (this is really hard, see also, the frequency of above-the-fold <img loading=lazy>)

That's a lot to ask, and I worry that poor usage of this API (because of the difficulty and in many cases impossibility of using it well) will lead to worse experiences for users (very slow LCPs).

People can do this today eg by starting with body at display:none and changing to display:block once enough content had been parsed.

In addition, LCP is not the only relevant metric here - there is also CLS.

Blocking render until enough is parsed to avoid jumpiness is a legit trade off that developers should be able to make use of. Sometimes the way to avoid the footguns is with education rather than by limiting the choices.

noamr avatar Aug 10 '23 16:08 noamr

People can do this today eg by starting with body at display:none and changing to display:block once enough content had been parsed.

Adding some anecdotal data for this, Chrome has a feature called paint holding which will defer displaying a web page until it has FCP content. We've heard from authors using display: none or opacity: 0 on html/body element on top of that to effectively get the behaviour proposed here.

The author knows exactly which DOM nodes need to animate (will have a view-transition-name) for each transition.

Adding a concrete use-case from a real world site. Say you have a header or footer which is position: fixed so it's guaranteed to be displayed in the viewport irrespective of the device. And you want it to slide in from the viewport edge (or stay at the same spot if it was in the old Document as well). The only way to get the transition right is for the browser to parse those elements before first render. Otherwise incorrect animations will be set up as described here.

khushalsagar avatar Aug 10 '23 17:08 khushalsagar

I agree that the potential for misuse shouldn't be a blocker on this proposal especially since it's opt-in to solve a specific problem. While web users are used to stream rendering, there's an argument that it's a large barrier to more modern and "native, app-like" experiences on the web because we don't have sufficient control of rendering when needed. The web community already recognizes the problematic nature of flashes of partial content (flashes of unstyled content, CLS, etc) and initial page loading need not be an exception for visually seamless transitions.

At the same time I also agree the magnitude of impact (multiple second delay for first paint) is also larger than average.

Are there safeguards we can put in place to help balance the concerns here?

  • Default timeout after which the page is shown, which can be overridden by the developer e.g. `
  • Console warning when render blocking is used and a problematic timeout is hit. May be covered by LCP already, but a dedicated warning could be more explicit and impactful.

nickcoury avatar Aug 14 '23 17:08 nickcoury

Default timeout after which the page is shown

+1. That's already how the render-blocking concept is spec'd.

Console warning when render blocking is used

That sounds reasonable, we can emit a warning if the timeout is hit and which resources were timed out. @xiaochengh FYI.

khushalsagar avatar Aug 14 '23 18:08 khushalsagar

Default timeout after which the page is shown

+1. That's already how the render-blocking concept is spec'd.

Console warning when render blocking is used

That sounds reasonable, we can emit a warning if the timeout is hit and which resources were timed out. @xiaochengh FYI.

Side note - the console is already cluttered with such warnings, but we can find a different solution in devtools UI to help with this.

noamr avatar Aug 15 '23 05:08 noamr

Couple of Qs as I continue to wrap my head around this:

  1. What is Chrome's existing blocking=render timeout, and would this use the same timeout?
  2. If the timeout is reached, would it be better to do a possibly-janky View Transition, or (somehow?) cancel the Transition?

eeeps avatar Aug 15 '23 15:08 eeeps

Couple of Qs as I continue to wrap my head around this:

  1. Chrome's existing blocking=render timeout seems to be 30 seconds. Would this use the same timeout?

For regular (non-transition) blocking, I don't see why not.

  2. If the timeout is reached, would it be better to do a possibly-janky View Transition, or (somehow?) cancel the Transition?

I think this would require experimentation, and might not need to be the same across browsers. In the same-document case the transition times out a lot faster than 30s.

noamr avatar Aug 15 '23 15:08 noamr

@noamr Note that I think I got the test for the existing implementation wrong, a second (more accurate?) test indicated 500ms, which is... very different. Updated my comment accordingly.

eeeps avatar Aug 15 '23 15:08 eeeps

@noamr Note that I think I got the test for the existing implementation wrong, a second (more accurate?) test indicated 500ms, which is... very different. Updated my comment accordingly.

My answer is the same... A lot of this is heuristic and differs between browsers. This proposal (together with the existing render blocking styles & scripts) allows developers to set an intention that guides the UA to a different trade-off between first-paint time and layout stability.

noamr avatar Aug 15 '23 15:08 noamr

What is Chrome's existing blocking=render timeout, and would this use the same timeout?

@xiaochengh @mfreed7 on that. I'm not sure if there is any reason to use a different timeout for different types of resources. The spec leaves this up to the UA so we can if needed.

If the timeout is reached, would it be better to do a possibly-janky View Transition, or (somehow?) cancel the Transition?

FWIW, this came up on the TAG review as well. Here's the tracking issue: https://github.com/w3c/csswg-drafts/issues/9155. We'll likely add spec text to recommend that UAs should consider aborting the transition if the navigation has crossed some time threshold (up to the UA what exactly that is).
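
Until such spec text exists, an author could approximate the same policy in script. A minimal sketch, assuming the caller already holds a ViewTransition object and its own timestamps. The 4000 ms default is only a placeholder, since the actual threshold would be up to the UA, and skipIfNavigationTooSlow is a hypothetical helper name:

```javascript
// Hypothetical author-side policy; the UA-level threshold is not specified.
function skipIfNavigationTooSlow(transition, navigationStartMs, nowMs, thresholdMs = 4000) {
  const elapsedMs = nowMs - navigationStartMs;
  if (elapsedMs > thresholdMs) {
    // Skips the animation; the new state is still shown.
    transition.skipTransition();
    return true;
  }
  return false;
}
```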

khushalsagar avatar Aug 15 '23 15:08 khushalsagar

What is Chrome's existing blocking=render timeout, and would this use the same timeout?

We don't have any explicit timeout currently. We just use the default network timeout -- when the resource loading fails, we unblock rendering.

But anyway, it's free to change.

xiaochengh avatar Aug 15 '23 17:08 xiaochengh

Another update to the explainer based on some offline feedback. Latest version is here.

The interesting open questions are:

  1. There's 2 syntax options now: Blocking attribute vs Set of Element IDs declared in the head.

    I'm more inclined towards the blocking attribute. It's much simpler to explain and implement, and flexible to use. The only motivation to take the Set of Element IDs would be if it encourages developers to limit blocking to what's strictly needed. But I'm not convinced that it will, and the feature will become unnecessarily complex. It also seems more error prone, since minor bugs would fall back to full blocking.

  2. Limiting document render-blocking to when there's a transition, described here. That'll be a good addition for optimal use of render-blocking even for other resource types (scripts and stylesheets). And if we want to be conservative, document render-blocking can start off by being limited to when there is a transition.

khushalsagar avatar Aug 28 '23 16:08 khushalsagar

  1. The only motivation to take the Set of Element IDs would be if it encourages developers to limit blocking to what's strictly needed.

To put this differently, the blocking attribute will block rendering on parsing of the whole document by default, with non-obvious script required to make it only partially block. However, a set of element IDs would make this choice obvious (wait for a set of elements, or explicitly * meaning everything) at the cost of being less elegant and more error prone to specify.
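
A small model of how the "set of element IDs" option might behave, purely as a sketch: createIdBlocker and its methods are hypothetical names, and the parser callbacks stand in for whatever the browser would do internally.

```javascript
// Hypothetical model of the "set of element IDs" option; not a real API.
function createIdBlocker(blockingIds) {
  const waitForAll = blockingIds.includes("*");
  const pending = new Set(blockingIds.filter((id) => id !== "*"));
  let parsingFinished = false;

  return {
    onElementParsed(id) {
      pending.delete(id);
    },
    onParsingFinished() {
      parsingFinished = true;
    },
    // Rendering stays blocked until every listed ID has been parsed,
    // or until the whole document has been parsed.
    isBlocking() {
      if (parsingFinished) return false;
      return waitForAll || pending.size > 0;
    },
  };
}

// Usage: block only on the header and hero elements.
const blocker = createIdBlocker(["header", "hero"]);
blocker.onElementParsed("header");
const stillBlocking = blocker.isBlocking(); // "hero" has not been parsed yet
blocker.onElementParsed("hero");
const doneBlocking = blocker.isBlocking(); // all listed elements were seen
```

Note the failure mode mentioned above: a misspelled ID never matches, so the model keeps blocking until parsing finishes (or, in a real implementation, until a timeout).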

vmpstr avatar Aug 28 '23 17:08 vmpstr

I don't like adding an easy mechanism to disable incremental rendering. Incremental rendering is done because it is generally better for users.

Would it be possible for the browser to figure out how long to block painting and transitioning in the new page based on the view transition names in the old doc? Once the end tag of all view transition name elements have been parsed, it no longer blocks. (Other things might still block, and timeouts can still unblock.)

zcorpan avatar Aug 30 '23 19:08 zcorpan

Would it be possible for the browser to figure out how long to block painting and transitioning in the new page based on the view transition names in the old doc?

Unfortunately no. The state of view-transition-names on the old Doc does not provide any information about what the names will be on the new Document. See the types of transitions described here. A name that exists on the old page may have a counter part on the new page (an image which morphs from one spot to another) or not (a header/footer which is leaving the screen). Similarly, there may be names in the new Document which didn't exist on the old Document (a header/footer which is entering the screen).

Once the end tag of all view transition name elements have been parsed, it no longer blocks.

What you described is conceptually close to the Meta Tag with Element Ids option. This is where the author gives the browser an explicit list of elements to block in the head, or really any point while the browser allows adding render blocking elements. This list can be tied to VT, like a list of view-transition-names, see addendum to this option here.

I'm not opposed to this option but it's a less elegant API, tougher to implement and I don't think that alleviates your concern. In either case, the developer will need to give us a list of elements to block on and will be able to (intentionally or not) block rendering until full parsing.

If the goal is to limit this blocking to only when there is a transition, then I err towards the option described here. It helps authors limit what's blocking to when there is a transition for other resource types too.

khushalsagar avatar Aug 30 '23 19:08 khushalsagar

Unfortunately no. The state of view-transition-names on the old Doc does not provide any information about what the names will be on the new Document. See the types of transitions described here. A name that exists on the old page may have a counter part on the new page (an image which morphs from one spot to another) or not (a header/footer which is leaving the screen). Similarly, there may be names in the new Document which didn't exist on the old Document (a header/footer which is entering the screen).

The old document names don't guarantee this, but wouldn't the view-transition-names of the new document give us that info? I'm assuming we have that by the time we would've initially unblocked rendering (regardless of view transitions).

I'm sure I'm missing something..

yoavweiss avatar Aug 31 '23 06:08 yoavweiss

Consider this WPT where the author adds the following style:

#last {
   view-transition-name: foo;
}

If the browser yields after the first jankMany script block (after parsing line 26), it will do a render before the node with id "last" is parsed. When the style cascade is done, no node with id "last" will be found so we don't see "foo" in the list of view-transition-names on this Document on first render.

So how can the browser know which view-transition-names are supposed to be found in the new document without parsing until the end? Or without an API for the author to provide this list at a point when rendering is guaranteed to be blocked (which is until body is added to the DOM today).

khushalsagar avatar Aug 31 '23 14:08 khushalsagar

Let's say that as part of style calculation we detected view-transition-names and their selectors. Then we could have tested these selectors against the currently parsed DOM and see that there's still a pending view-transition-name before "last" is parsed. (It might be the case that it's not worth the complexity, but let's play this out)

Then we could have a developer directive that tells us the type for each view-transition: exit, entry or morph. For exit view-transitions, there's no need to wait for anything on the new page. For entry view-transitions on the new page as well as morphs, we need to wait for a node that matches their selectors (can there be more than one?).

The collected view-transition-name rules would then indicate that the relevant nodes were not yet parsed, and hence we need to keep on parsing before rendering. But once we found a match, we could initiate rendering for the new page.

While it's possible that the developer made a mistake and we'd hold parsing till the end of the document (or a reasonable timeout), that seems better than telling developers to block incremental parsing by default. (and jump through hoops to unblock it)
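
To play this out in code, here is a toy model of the idea. It is heavily simplified and entirely hypothetical: each rule is assumed to target a single element ID (real selectors can be arbitrary), and the `type` field stands in for the developer directive suggested above.

```javascript
// Hypothetical: "exit" names need nothing on the new page; "entry" and
// "morph" names need a matching element to be parsed before first render.
function pendingTransitionNames(rules, parsedIds) {
  const parsed = new Set(parsedIds);
  return rules
    .filter((rule) => rule.type !== "exit")
    .filter((rule) => !parsed.has(rule.id))
    .map((rule) => rule.name);
}

const rules = [
  { name: "old-banner", id: "banner", type: "exit" },
  { name: "hero", id: "hero-image", type: "morph" },
  { name: "foo", id: "last", type: "entry" },
];

// Before "#last" is parsed, "foo" is still pending, so rendering would stay blocked.
const beforeLast = pendingTransitionNames(rules, ["hero-image"]);
// Once "#last" is parsed, nothing is pending and rendering could start.
const afterLast = pendingTransitionNames(rules, ["hero-image", "last"]);
```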

yoavweiss avatar Aug 31 '23 15:08 yoavweiss

Let's say that as part of style calculation we detected view-transition-names and their selectors. Then we could have tested these selectors against the currently parsed DOM and see that there's still a pending view-transition-name before "last" is parsed. (It might be the case that it's not worth the complexity, but let's play this out)

Then we could have a developer directive that tells us the type for each view-transition: exit, entry or morph.

For exit view-transitions, there's no need to wait for anything on the new page.

For entry view-transitions on the new page as well as morphs, we need to wait for a node that matches their selectors (can there be more than one?).

The collected view-transition-name rules would then indicate that the relevant nodes were not yet parsed, and hence we need to keep on parsing before rendering. But once we found a match, we could initiate rendering for the new page.

While it's possible that the developer made a mistake and we'd hold parsing till the end of the document (or a reasonable timeout), that seems better than telling developers to block incremental parsing by default. (and jump through hoops to unblock it)

To do all this we have to continuously run full style calculations for every task that changes the DOM before the first render. Not sure the end result would buy us something in terms of performance.

noamr avatar Aug 31 '23 15:08 noamr

To do all this we have to continuously run full style calculations for every task that changes the DOM before the first render. Not sure the end result would buy us something in terms of performance.

We could do that with full-style calcs. But I suspect we could also optimize a lot of that away, especially with simple selectors, which I suspect could be the majority of cases.

Like I said above, this might not be worth the complexity. But I think it could be interesting to think through this.

yoavweiss avatar Aug 31 '23 16:08 yoavweiss

To do all this we have to continuously run full style calculations for every task that changes the DOM before the first render. Not sure the end result would buy us something in terms of performance.

We could do that with full-style calcs. But I suspect we could also optimize a lot of that away, especially with simple selectors, which I suspect could be the majority of cases.

Like I said above, this might not be worth the complexity. But I think it could be interesting to think through this.

Note also that by default documents have the "root" name on the HTML element, which would create the same effect as document render blocking. There could be cases where you have an animation on the root but you don't need it to be fully parsed. This could be really confusing.

Perhaps a complex solution should come after we see that the simple one is insufficient in practice and which patterns emerge?

I also want to mention that the context of view transitions is a same-origin multiple-document app (MPA). For such scenarios developers can orchestrate their loading a lot better to prevent long loads between same-app pages. A thought - Perhaps instead of adding complexity we should limit this to same-origin navigations (or go with blocking=transition etc)

noamr avatar Aug 31 '23 16:08 noamr