Environment Map Filtering GPU pipeline
Objective
This PR implements a robust GPU-based pipeline for dynamically generating environment maps in Bevy. It builds upon PR #19037, allowing these changes to be evaluated independently from the atmosphere implementation.
While existing offline tools can process environment maps, generate mip levels, and calculate specular lighting with importance sampling, they're limited to static file-based workflows. This PR introduces a real-time GPU pipeline that dynamically generates complete environment maps from a single cubemap texture on each frame.
Closes #9380
Solution
Implemented a Single Pass Downsampling (SPD) pipeline that processes textures without pre-existing mip levels or pre-filtered lighting data.
Single Pass downsampling (SPD) pipeline:
- Takes the 512x512 cubemap as input and creates 9 mip levels.
- Each level is progressively downsampled.
- Despite the name, it is actually split into two passes due to architectural limitations.
- Largely based on Jasmine's code from GitHub.
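As a quick check on the mip count above, here is a minimal sketch of how the number of generated levels follows from the face size. The function names are hypothetical and are not taken from this PR's code; it only assumes square, power-of-two cubemap faces.

```rust
/// Number of mip levels generated below the base level for a square,
/// power-of-two cubemap face, or `None` if the size is not a power of two.
/// (Hypothetical helper, not part of this PR.)
fn downsampled_mip_count(face_size: u32) -> Option<u32> {
    if face_size.is_power_of_two() {
        // log2 of a power of two equals its number of trailing zero bits.
        Some(face_size.trailing_zeros())
    } else {
        None
    }
}

/// Edge length of every generated mip, halving until 1x1.
/// Returns an empty list for sizes that are not a power of two.
fn mip_sizes(face_size: u32) -> Vec<u32> {
    let count = downsampled_mip_count(face_size).unwrap_or(0);
    (1..=count).map(|level| face_size >> level).collect()
}
```

For a 512x512 face this gives 9 generated levels, from 256x256 down to 1x1.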
Pre-filtering pipeline: composed of multiple Radiance Map (specular mips) generation passes, followed by the irradiance map pass (diffuse). The pre-filtering pipeline is largely based on these articles:
- https://placeholderart.wordpress.com/2015/07/28/implementation-notes-runtime-environment-map-filtering-for-image-based-lighting/
- https://bruop.github.io/ibl/
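For readers unfamiliar with the radiance passes, the following is a rough CPU-side sketch of the classic GGX importance-sampling step described in articles like the ones above. It is an illustration only: the linear mip-to-roughness mapping and both function names are assumptions, and the PR's shaders may well use a different scheme (for example VNDF sampling).

```rust
use std::f32::consts::PI;

/// Classic GGX NDF importance sampling: maps a 2D random point `xi` in
/// [0, 1)^2 to a half vector in tangent space (z is the surface normal).
/// Illustrative only; not necessarily the sampling used by this PR.
fn importance_sample_ggx(xi: [f32; 2], roughness: f32) -> [f32; 3] {
    let a = roughness * roughness;
    let phi = 2.0 * PI * xi[0];
    let cos_theta = ((1.0 - xi[1]) / (1.0 + (a * a - 1.0) * xi[1])).sqrt();
    // Clamp before the square root to guard against tiny negative values.
    let sin_theta = (1.0 - cos_theta * cos_theta).max(0.0).sqrt();
    [sin_theta * phi.cos(), sin_theta * phi.sin(), cos_theta]
}

/// Assumed linear mapping from mip level to roughness: mip 0 is a mirror,
/// the last mip is fully rough.
fn roughness_for_mip(mip: u32, mip_count: u32) -> f32 {
    if mip_count <= 1 {
        0.0
    } else {
        mip as f32 / (mip_count - 1) as f32
    }
}
```

Each radiance mip would then be filtered with the roughness returned for its level, while the irradiance pass integrates over the whole hemisphere instead.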
Interesting note: the fireflies are almost completely gone when using the forward tonemap and reverse tonemap trick with the white point set to 1.0. If the white point is set higher, the fireflies come back. This is only apparent for environment maps with "hot" spots (i.e. values close to the maximum).
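To illustrate why the white point matters, here is a small CPU-side sketch of the tonemap / reverse-tonemap averaging trick. The exact operator is an assumption (a max-component Reinhard-style curve scaled by the white point), and the function names are hypothetical; the shader in this PR may use a different formulation.

```rust
fn max3(c: [f32; 3]) -> f32 {
    c[0].max(c[1]).max(c[2])
}

/// Forward tonemap: compresses bright samples so that no channel reaches
/// `white_point`. Assumed operator, for illustration only.
fn tonemap(c: [f32; 3], white_point: f32) -> [f32; 3] {
    let scale = 1.0 / (1.0 + max3(c) / white_point);
    [c[0] * scale, c[1] * scale, c[2] * scale]
}

/// Exact inverse of `tonemap` for inputs whose max channel is below `white_point`.
fn inverse_tonemap(c: [f32; 3], white_point: f32) -> [f32; 3] {
    let scale = 1.0 / (1.0 - max3(c) / white_point);
    [c[0] * scale, c[1] * scale, c[2] * scale]
}

/// Averages samples in tonemapped space, then maps the mean back.
/// Expects a non-empty slice.
fn filtered_average(samples: &[[f32; 3]], white_point: f32) -> [f32; 3] {
    let mut sum = [0.0f32; 3];
    for &s in samples {
        let t = tonemap(s, white_point);
        for i in 0..3 {
            sum[i] += t[i];
        }
    }
    let n = samples.len() as f32;
    inverse_tonemap([sum[0] / n, sum[1] / n, sum[2] / n], white_point)
}
```

With three samples of 0.1 and one "hot" sample of 1000.0, a plain average is about 250, while this filter stays below 1.0 at a white point of 1.0. Raising the white point lets more of the hot sample through, which matches the observation that the fireflies come back.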
Previous work: #9414
Testing
The reflection_probes.rs example has been enhanced with:
- A fourth display option (toggled via spacebar)
- Adjustable roughness for the center sphere (using Up/Down keys)
As the first test case and use case, we load a KTX texture in the RGBA 16-bit floating point format. I obtained this texture from polyhaven.com and used the ./convert.sh goegap_road_2k.exr command to convert it to a cubemap.
I had to install 4 different command line tools to get this to work (not ideal): OpenEXR CLI, ImageMagick, OpenImageIO, and finally the KTX command line tool. I also lose half the precision in this process, going from a 32-bit input to a 16-bit output image.
convert.sh
#!/bin/bash
# create cubemap from equirectangular
file_name=$1
# remove the extension
file_name=$(basename "$file_name" .exr)
exrenvmap -c -li -w 512 -m -z none "$file_name.exr" "cubemap_%.exr"
echo "Created cubemap from equirectangular"
# fix the exr files with imagemagick
for file in cubemap_*.exr; do
  magick "$file" "$file"
  oiiotool "$file" --fixnan box3 -o "$file"
  echo "Processed $file"
done
rm "$file_name.ktx2" > /dev/null 2>&1
ktx create --format R16G16B16A16_SFLOAT \
--assign-tf linear \
--cubemap \
--zstd 3 \
"cubemap_+X.exr" "cubemap_-X.exr" "cubemap_+Y.exr" "cubemap_-Y.exr" "cubemap_-Z.exr" "cubemap_+Z.exr" \
"$file_name.ktx2"
echo "Created $file_name.ktx2"
ktx info "$file_name.ktx2" | grep "vkFormat"
Showcase
User facing API:
commands.spawn((
LightProbe,
FilteredEnvironmentMapLight {
environment_map: world.load_asset("environment_maps/goegap_road_2k.ktx2"),
..default()
},
Transform::from_scale(Vec3::splat(2.0)),
));
Computed Environment Maps
To use fully dynamic environment maps, create a new placeholder image handle with Image::new_fill and extract it to the render world. Then dispatch a compute shader that binds the image as a 2D array storage texture. Anything can be rendered into this custom dynamic environment map.
This is already demonstrated in PR #19037 with the atmosphere.rs example.
We can extend this idea further and run the entire PBR pipeline from the perspective of the light probe, and it is possible to have some form of global illumination or baked lighting information this way, especially if we make use of irradiance volumes for the realtime aspect. This method could very well be extended to bake indirect lighting in the scene. #13840 should make this possible!
Notes for reviewers
This PR includes a 7.2 MB KTX file for testing; this can of course be removed if it adds too much weight to the repo. We could also include this "magic script" I came up with somewhere in the codebase. Robswain@ was suggesting that we implement the cubemap generation in a compute pipeline as well, so we could use equirectangular env maps directly instead (future work).
Some thoughts on the user API:
- I think I'd like to stick to the old GenerateEnvironmentMapLight name rather than FilteredEnvironmentMapLight.
- Similarly, we should give users control over when to re-generate the envmaplight. Maybe a boolean that auto-resets on extract like we did for the TAA reset.
- Does it require using LightProbe? Can you use it for a global envmaplight attached to the camera?
Ran it through NSight.
- 0.17 * 6 = 1.02ms of vkCmdCopyBufferToImage calls 😬
- 0.05ms vkCmdCopyImage that I believe is from this PR (no labels 😢)
- 0.03ms for SPD in two passes
- 0.33ms for radiance map generation. Has good occupancy to start, but drops off towards the end. The workgroups for the smaller mips, I guess? Bottlenecked by texture reads of course.
- 0.14ms for irradiance map generation. Very low occupancy the whole time. Very low active threads per warp. I'm not really sure why.
Generally those buffer to image copies are way too slow, need to figure out what those are and remove them. Radiance pass looks good, and irradiance pass isn't that slow but has very suspiciously low active threads per warp.
Based on the shader profiler, the reason irradiance map is performing poorly is the giant switch statement absolutely kills perf as every thread gets masked out. Should 100% index into an array for that one.
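To make the suggestion concrete, here is a sketch of replacing a per-face switch with a table lookup, written in Rust rather than WGSL for illustration. What the switch in the PR's shader actually selects is not shown in this thread, so the cubemap face basis below (one common convention) and all names are assumptions.

```rust
/// Per-face basis: [forward, right (u axis), up (v axis)].
/// Face order +X, -X, +Y, -Y, +Z, -Z; one common cubemap convention,
/// assumed here for illustration.
const FACE_BASIS: [[[f32; 3]; 3]; 6] = [
    [[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, -1.0, 0.0]],
    [[-1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, -1.0, 0.0]],
    [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, -1.0]],
    [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, -1.0, 0.0]],
    [[0.0, 0.0, -1.0], [-1.0, 0.0, 0.0], [0.0, -1.0, 0.0]],
];

/// Direction for a texel on `face` with `u`, `v` in [-1, 1].
/// One indexed load replaces six divergent branches.
/// Panics if `face >= 6`.
fn face_direction(face: usize, u: f32, v: f32) -> [f32; 3] {
    let [forward, right, up] = FACE_BASIS[face];
    let mut d = [0.0f32; 3];
    for i in 0..3 {
        d[i] = forward[i] + u * right[i] + v * up[i];
    }
    let len = (d[0] * d[0] + d[1] * d[1] + d[2] * d[2]).sqrt();
    [d[0] / len, d[1] / len, d[2] / len]
}
```

Every thread executes the same instructions regardless of face, so no lanes need to be masked out.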
Oh, I see that you're also doing a separate dispatch per mip for the radiance pass. Does each level depend on the previous?
This fixes #19125
Ideally remove the old env map assets while on it to resolve https://github.com/bevyengine/bevy/issues/19693 :)
What's the difference between the radiance and irradiance maps here? I skimmed the linked blog posts and don't really see where irradiance comes in. Am sleepy though so that might be it.
Got most of the way through the code today and yesterday, will try to finish by tomorrow.
Your PR caused a change in the graphical output of an example or rendering test. This might be intentional, but it could also mean that something broke! You can review it at https://pixel-eagle.com/project/B04F67C0-C054-4A6F-92EC-F599FEC2FD1D?filter=PR-19076
If it's expected, please add the M-Deliberate-Rendering-Change label.
If this change seems unrelated to your PR, you can consider updating your PR to target the latest main branch, either by rebasing or merging main into it.
Notes, remaining feedback to address:
- [x] fix CI: more than 1 check is failing
- [x] find out/gather feedback whether it's better to have 1 less ktx file and reuse the skybox as the environment map, include a code comment as explanation as before
- [x] support all powers of two input textures and dynamically choose the mip chain. mip 9 - 12 are currently commented out. in theory we could support 8k cubemaps, although not sure about practical use/performance :D
- [x] GeneratedEnvironmentMapLight shouldn't require using LightProbe. Support the use case where the global envmaplight is attached to the camera
- [x] Elaborate on whitepoint calculation. Match the one used in TAA and make it user configurable
For the ktx asset, I think we should merge this PR without it. Once http asset source lands we'll be able to update the example to use the ktx downloaded from an external source.
Btw is there any limitations/requirements on input cubemap sizes beyond being a power of 2? I know SPD has a max texture size of like 4k/8k iirc.
Oh another thought, what size are the output cubemaps? We may want to make it configurable between 128/256/512.
I think this is ready for a final review. I think we should move the remaining feedback to a follow-up tracking issue (or several if needed), since it is out of scope for this PR and would add too many changes on top of the current mergeable, user-ready state. This is up for debate, so please mark any of these as in-scope if they are important for landing the PR.
Follow up:
- we should give users control over when to re-generate the environment map light. Maybe a boolean that auto-resets on extract like we did for the TAA reset.
- look into using compute enabled read-only BC6H compressed texture format
- check out the support flag for the wgpu native tier 1 write-only rg11b10ufloat packed texture format
- separate and modularize the generate.rs file, along with considering a separate folder for these resources.
- build a good user facing api for controlling downsampling resolution settings, changing output resolution, and changing the filtering passes sample counts
- conduct furnace test, look into fixing the energy preservation issue at the highest roughness (diffuse irradiance convolution), and compare against ground truth to bring down the error bars.
- discuss offline with Jasmine the usage of the STBN textures
- share the GGX VNDF functions between Solari and env map filtering
After a discussion this morning, we agreed to add the spatio temporal blue noise texture to Bevy PBR plugin. This is from Electronic Arts' fastnoise repo. https://github.com/electronicarts/fastnoise/
I combined the texture using the following command:
# use the raw URL; the /blob/ URL returns the HTML page rather than the archive
wget https://github.com/electronicarts/fastnoise/raw/main/noise.zip
unzip noise.zip && cd noise
ktx create \
--format R8G8B8_UNORM \
--assign-tf linear \
--layers 32 \
vector3/temporal/exp/vector3_uniform_box3x3_exp0101_product_*.png \
stbn.ktx2
I did a quick sanity check and decided that the best option is box3x3. This is the texture most path tracers and ReSTIR implementations ship with today. Even for the filtering, I read exactly one texel per pixel, so the aliasing that the Gaussian blur fights is not an issue. Box3x3 gives the cleanest high-frequency spectrum and fastest convergence. Current and future use cases:
- realtime environment map filtering GGX
- solari path tracing and ReSTIR GGX
- raymarching algorithm jittering for volumetrics, atmosphere
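As a sketch of the "one texel per pixel" lookup mentioned above, the noise array can be tiled across the screen and cycled over frames. The 128x128 tile size is an assumption about the source PNGs; the 32 layers match the --layers 32 flag in the command above. The function name is hypothetical.

```rust
/// Assumed edge length of one noise slice (not confirmed in this thread).
const STBN_TILE_SIZE: u32 = 128;
/// Number of temporal slices, matching `--layers 32` above.
const STBN_LAYERS: u32 = 32;

/// Texel coordinate (x, y, layer) to read for a given pixel and frame.
/// The noise tiles across the image and loops every `STBN_LAYERS` frames.
fn stbn_texel(pixel: (u32, u32), frame: u32) -> (u32, u32, u32) {
    (
        pixel.0 % STBN_TILE_SIZE,
        pixel.1 % STBN_TILE_SIZE,
        frame % STBN_LAYERS,
    )
}
```

Because each pixel reads exactly one texel with no interpolation, the spatial and temporal blue-noise properties of the texture are preserved as-is.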
Since this step is complete, no remaining actions needed for this PR. ready for a final review and merge.
CI is failing; please take a look!
Fixed the runtime issue on Linux with the Vulkan backend. I got a working version of the web build locally by switching the bindings around a bit. Web support was not in the initial scope of the PR, but I am now including it so that CI can pass on merge. I will create a separate branch and review the changes before merging into this branch, since everything here has already been reviewed and approved.
Got feedback from Jasmine that I should try to keep the original setup with the full bind group for the native build, and rely on feature flags - I will attempt to make these changes in the PR to this branch linked above without making a mess.
@alice-i-cecile this is ready for a merge again. The CI should pass for web and linux.
this PR broke WebGL2: https://github.com/bevyengine/bevy/issues/20276
this PR also added an unconditional 1.5MB embedded file to everyone enabling bevy_pbr.
I think it shouldn't have been merged without this being more discussed, and with a way to avoid it.
I'll revert it unless we have a credible plan to fix both issues (webgl2 and size)