engine [DisplayList] Pre-Compute saveLayer bounds in DLBuilder

Previously the DisplayListBuilder would only pass along bounds for a saveLayer when they were supplied by the caller that was building the DisplayList. This would require Impeller to use post-processing of the EntityPass lists to compute them on its own.

DisplayList can now compute those bounds as it builds the DisplayList to save dispatch clients from having to do so on their own. It will also provide an indicator in the case when the caller supplied bounds that ended up being too small to capture all of the content, causing clipping by the layer render target.

Feb 24 '24 01:02 flar

Some preliminary data from the DLBuilder benchmark shows a big impact when the bounds are not provided. Note that almost all operations are within 1-2% of the original speed, but the 4 "WithSaveLayer" benchmarks are much slower. Examining the code shows that the SaveLayer operations being tested there are not supplying bounds.

So, the minimal performance impact demonstrates that the bounds computation only comes into play for a SaveLayer without bounds, but it does have an impact when the bounds need to be provided. One complication is that while the DisplayListBuilder is already computing the bounds of the DL operations, it is doing so in the "output space" of the DisplayList. When computing the bounds for a given SaveLayer, though, it needs to compute them in the transform space of the SaveLayer which may not match the transform of the entire DL.

Performance impact of pre-computing SaveLayer bounds

Performance impact of pre-computing saveLayer bounds in DLBuilder

At some point since that solution was prototyped, it became clear that the DL needed to also vet the bounds supplied by the caller so that Impeller could trust them - which means the overhead is now for all SaveLayer calls, even if the caller supplied some bounds. Now the bounds delivered to a DlOpReceiver can be fully trusted to be reasonably tight and will cover the contents unless the "clipped" flag is set.

Here is an expanded benchmark run on the previous and latest version of the fix with a new benchmark added to measure the impact on each rendering op inside the SaveLayer (the GlobalSaveLayer variants of the benchmarks).

Performance impact of previous solution and final solution on pre-computing SaveLayer bounds

DisplayList impact of computing bounds for SaveLayer

The latest solution has slightly more overhead for each SaveLayer/Restore pair, but much less overhead on a per-rendering-op basis for the contents of those SL/Restore calls.

Feb 24 '24 01:02 flar

In looking through the Impeller code to see what happens when all SaveLayer calls now come through with a bounds, I've found the following 2 cases where it seems to matter:

https://github.com/flutter/engine/blob/632f9d742dd7bafb561100ce485a707566a6c3d4/impeller/aiks/paint_pass_delegate.cc#L75 The opacity peephole code looks like it won't even try to collapse passes if they have an active bounds limit.

https://github.com/flutter/engine/blob/632f9d742dd7bafb561100ce485a707566a6c3d4/impeller/entity/entity_pass.cc#L221 Entity passes will still go through their children looking for content bounds. They used to treat the saveLayer bounds limit as a clip, but did not trust it for purposes of GetSubpassCoverage() so the new behavior will just cause more work to be done here (an intersection operation).

Feb 28 '24 07:02 flar

I think we'd need to distinguish the case of explicit bounds being set vs the DL computing bounds. For 1. We can't apply the peephole if there is a user-specified bounds limit since I believe it can change the expected rendering. Same for 2, we need to know that the DL computing the bounds limit so we can trust it.

Feb 28 '24 16:02 jonahwilliams

I think we'd need to distinguish the case of explicit bounds being set vs the DL computing bounds.

The change does provide that information, but we'd need to record it somewhere.

For 1. We can't apply the peephole if there is a user-specified bounds limit since I believe it can change the expected rendering.

DL has similar logic in it when it dispatches to Skia. I think the intent is that saveLayer doesn't specifically clip and so eliminating it would lose some behavior. Although, saveLayer does implicitly clip as it generates the temporary render target for the layer. The layer could be replaced with a clip and the final outcome could be a net savings, but the behavior would be affected in very minor ways (mainly that if we are in a rotated coordinate system you can tell whether the layer was allocated to the rotated bounds or whether a clip was directly enforced).

DL could provide a couple of other hints as well:

Did the supplied bounds encompass the computed bounds? Unfortunately this would require the extra overhead of computing the bounds even in cases where the caller supplied them. But, if this condition is true, then the opacity optimization could be applied even though the caller supplied the layer bounds.
The DL already provides an opacity compatibility hint in the SaveLayerOptions, but I don't think it takes the bounds into account as mentioned above. Eventually we want to expand the role of that hint, though, so the need for Impeller to compute that condition will go away in time.
In either case, the suspicion of the bounds could be handled by a clip operation in the opacity peephole code.

Same for 2, we need to know that the DL computing the bounds limit so we can trust it.

Trust it for what? If it is too large then that is on the caller for having a bad estimate. If it is too small then Impeller clips to it anyway so the bounds accumulation was wasted. Do we have any evidence that a caller supplying overly large bounds for their saveLayer calls is a problem that requires Impeller to do its own backup computations to optimize?

Feb 28 '24 17:02 flar

This is on my plate, so far so good :)

I do wonder if we can/should make the save layer bounds size/promise required so that we could delete all of the bounds computation code in aiks, though that probably should be a part of a different patch.

Mar 09 '24 00:03 jonahwilliams

This is on my plate, so far so good :)

I do wonder if we can/should make the save layer bounds size/promise required so that we could delete all of the bounds computation code in aiks, though that probably should be a part of a different patch.

The only reason I left it in was that unit tests still call SaveLayer without "promised" bounds - and in fact there is a unit test that wants to make sure that the resulting entities are properly computed. I could get rid of the tests, but do we really want to put this burden on internal code that uses aiks?

Mar 09 '24 01:03 flar

I'm all for flipping precomputed bounds on for Flutter now, but I don't think we should get into the business of actively degrading the Aiks interface by removing functionality or tests yet. At least not until DisplayList is in a state where we can get AAOS switched over to it.

That being said, I suspect DL isn't that far off from this. Maybe we can discuss what would be required for this in detail sometime next week? Basically we'd just need to be able to build the DisplayList/builders with no Skia dependencies.

So for stuff like text we'd need a DlTextBlob with Impeller and Skia implementations like we have for DlImage. And for stuff like DlImage, we can remove the Impeller/Skia-specific vtable methods and rely on contextual downcasting to the Skia/Impeller-specific DlImage classes instead. Contextual downcasting of the generic HAL types is how we avoid needing to compile all of Impeller's graphics API backends against every Flutter platform target (which would be impossible, of course).

Mar 09 '24 05:03 bdero

We don't need to remove it now, but we do need to plan to remove it as long term having two quasi public APIs for Impeller isn't great. Lets discuss it during the weekly?

Mar 11 '24 17:03 jonahwilliams

Waiting for the pre-cursor PRs to land so I can rebase this on top of their changes.

Mar 11 '24 22:03 flar