[Impeller] absorb the last blur pass into an entity instead of baking it in a render pass
fixes https://github.com/flutter/flutter/issues/150720
Pre-launch Checklist
- [ ] I read the Contributor Guide and followed the process outlined there for submitting PRs.
- [ ] I read the Tree Hygiene wiki page, which explains my responsibilities.
- [ ] I read and followed the Flutter Style Guide and the C++, Objective-C, Java style guides.
- [ ] I listed at least one issue that this PR fixes in the description above.
- [ ] I added new tests to check the change I am making or feature I am adding, or the PR is test-exempt. See testing the engine for instructions on writing and running engine tests.
- [ ] I updated/added relevant documentation (doc comments with ///).
- [ ] I signed the CLA.
- [ ] All existing and new tests are passing.
If you need help, consider asking for advice on the #hackers-new channel on Discord.
I have it mostly implemented, with a bug in the blur step size that I still need to fix. I think the performance will be representative, though; here's where we are (a 25% increase in fps):
In the current state a non-uniform blur in just the x direction looks shimmery in GaussianBlurAnimatedBackdrop. I checked against ToT and it's the case there too.
It looks like we need to scale the pixel size by 0.5 to get the correct step size when rendering inline. In the backdrop filter case that factor seems accessible via the effect_transform, but I'm not seeing an easy way to derive it in the image filter case. We might have to pipe the DPI scalar down to this point, or find a way to do the blur pass that is more amenable to the math we have available.
Is the intention of this to basically render the second directional blur pass directly into the parent layer RenderPass, thereby rendering it at the full size of the blur?
Idea being that the extra volume of fragment computations is hopefully cheaper than the cost of rendering to a smaller texture first?
Nope, the idea is just to eliminate a render pass which carries a lot of overhead. This is registering about 20% faster in the backdrop filter blur benchmark.
Oh, is the render pass that's being eliminated not what the second blur kernel pass writes to?
The thing that the second blur pass was rendering to was typically the same thing the downsample pass wrote to, so it's not eliminated.
The following patch seems to render things mostly correctly. Here is the render versus the golden before the change. I have no idea where this 2.0 factor is coming from, though.
--- a/impeller/entity/contents/filters/gaussian_blur_filter_contents.cc
+++ b/impeller/entity/contents/filters/gaussian_blur_filter_contents.cc
@@ -751,11 +751,9 @@ std::optional<Entity> GaussianBlurFilterContents::RenderFilter(
pass2_out.value().GetRenderTargetTexture();
blur_output_entity.SetBlendMode(entity.GetBlendMode());
blur_output_entity.SetTransform(blur_output_entity_transform);
- constexpr Vector2 orthographic_size(2.0f, 2.0f);
- Vector2 inline_pixel_size = pass1_pixel_size / orthographic_size;
blur_output_entity.SetContents(Contents::MakeAnonymous(
/*render_proc=*/
- [sampler_desc, texture, inline_pixel_size, blur_info,
+ [sampler_desc, texture, pass1_pixel_size, blur_info,
downsample_pass_args](const ContentContext& renderer,
const Entity& entity, RenderPass& pass) {
Quad blur_uvs = {Point(0, 0), Point(1, 0), Point(0, 1), Point(1, 1)};
@@ -763,13 +761,16 @@ std::optional<Entity> GaussianBlurFilterContents::RenderFilter(
Point(0, 0), Point(texture->GetSize().width, 0),
Point(0, texture->GetSize().height),
Point(texture->GetSize().width, texture->GetSize().height)};
+ Scalar fudge_scalar = 2.0f;
return Render1DBlur(
renderer, pass, blur_vertices, blur_uvs, sampler_desc, texture,
BlurParameters{
- .blur_uv_offset = Point(inline_pixel_size.x, 0.0),
- .blur_sigma = blur_info.scaled_sigma.x *
+ .blur_uv_offset =
+ Point(pass1_pixel_size.x / fudge_scalar, 0.0),
+ .blur_sigma = fudge_scalar * blur_info.scaled_sigma.x *
downsample_pass_args.effective_scalar.x,
.blur_radius =
+ fudge_scalar *
ScaleBlurRadius(blur_info.blur_radius.x,
downsample_pass_args.effective_scalar.x),
.step_size = 1,
I chatted with @bdero. The fudge factor is the downscale amount; I wasn't thinking. The inline blur pass will now have to be scaled to match the destination size we are rendering to, which makes me a bit more dubious that this will be beneficial except in cases of small blur sigmas.
That fudge scalar isn't directly the downsample scalar. I tried that and it doesn't work. It makes sense that we have to blur more when drawing directly into the parent pass, since it will be at a higher resolution. I ran the golden tests, and 2.0 renders things correctly. Notice how the blur radius stays mostly the same here, even though the downscale factor is not 1/2:
However doing the horizontal blur pass in the destination breaks the work we did to make rotated uniform blurs work:
Here are the measurements before and after with the 2.0 fudge factor that seems to render things correctly:
| before | after |
|---|---|
Given that we have to scale the kernel and we'd only be able to use this for unrotated blurs, the window of opportunity where this could help is shrinking.