Async pipeline compilation
Objective
- Pipeline compilation is slow and blocks the frame
- Closes https://github.com/bevyengine/bevy/issues/8224
Solution
- Compile pipelines in a Task on the AsyncComputeTaskPool
- This won't actually improve anything until wgpu 0.19, because wgpu currently takes internal locks that block the frame anyway
Changelog
- Render/compute pipeline compilation is now done asynchronously over multiple frames when the multi-threaded feature is enabled and on non-wasm platforms
- Added CachedPipelineState::Creating
- Added bevy_utils::futures::check_ready
- Added bevy_render/multi-threaded cargo feature
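For readers unfamiliar with the idea behind a helper like `check_ready`, here is a minimal std-only sketch of polling a future once without blocking. It is an illustration under assumptions, not the code from this PR: the signature and the `NoopWaker` type are chosen for the example and may differ from what `bevy_utils::futures` actually contains.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical waker that does nothing when woken. That is enough here
/// because the caller re-polls manually (for example, once per frame).
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls `future` exactly once. Returns `Some(output)` if it has finished,
/// or `None` if it is still pending. Never blocks the calling thread.
/// (Assumed signature for illustration only.)
pub fn check_ready<F: Future + Unpin>(future: &mut F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    match Pin::new(future).poll(&mut cx) {
        Poll::Ready(output) => Some(output),
        Poll::Pending => None,
    }
}
```

Called once per frame on a stored task, a helper of this shape lets the render code keep going while compilation is still in flight, and pick up the result on whichever frame it becomes available.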
Migration Guide
- Match on the new Creating variant for exhaustive matches of CachedPipelineState
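As a rough illustration of the migration, the sketch below uses a stand-in enum rather than the real `CachedPipelineState`. The `PipelineState` type, its payloads, and the `describe` function are hypothetical and simplified; only the presence of a `Creating` variant comes from this PR.

```rust
/// Hypothetical stand-in for `CachedPipelineState`. The payload types are
/// placeholders and do not match the real Bevy definitions.
#[derive(Debug)]
pub enum PipelineState {
    Queued,
    /// New state: compilation is in progress on a background task.
    Creating,
    Ok(u32),
    Err(String),
}

/// An exhaustive match must now include an arm for `Creating`.
pub fn describe(state: &PipelineState) -> &'static str {
    match state {
        PipelineState::Queued => "queued",
        PipelineState::Creating => "compiling",
        PipelineState::Ok(_) => "ready",
        PipelineState::Err(_) => "failed",
    }
}
```

Code that previously treated anything other than Ok and Err as "not started" will probably want to handle Creating the same way it handles Queued, that is, skip the pipeline for this frame and try again later.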
Shame we couldn't get this one out in time for the jam. Darn!
Wouldn't be possible, as we need wgpu 0.19 first for this to actually work.
I'll also probably have to go back and redo/change this once https://github.com/gfx-rs/wgpu/issues/3794 gets implemented, as this PR will only help on native (we don't have threads on WASM/WebGPU).
Sadly, this causes an error log message (but no crash) when the app first loads, because it renders nothing while waiting for pipelines to compile :grimacing:
We might need some logic to block that specific log message for the first few frames of the app, or something else hacky.
2023-12-01T19:50:50.742368Z ERROR present_frames: wgpu_core::present: No work has been submitted for this frame
I'd like to try and land this for 0.13. We can cut it if it's not ready, it's semi-low priority, but it's a nice improvement if we can land it.
The last thing remaining for this PR is fixing compiling without multi-threaded or on wasm.
Doesn't work for me on Linux+AMD 6800xt with radv drivers.
Does it just crash or do you mean that it still compiles synchronously?
It seems to compile synchronously, looking at Tracy, and then some things randomly don't render.
The new fix works for me.
As far as I can tell, this doesn't work on Mac in the deferred rendering example as of the latest commit https://github.com/bevyengine/bevy/pull/10812/commits/4aed836e344b68305da5995f383ccc530cc81851
AdapterInfo { name: "Apple M1 Pro", vendor: 0, device: 0, device_type: IntegratedGpu, driver: "", driver_info: "", backend: Metal }
SystemInfo { os: "MacOS 14.2.1 ", kernel: "23.2.0", cpu: "", core_count: "10", memory: "32.0 GiB" }
https://github.com/bevyengine/bevy/assets/2771466/72dda7db-5661-4688-b50e-a044d2bc11b9
Same example on main for comparison.
https://github.com/bevyengine/bevy/assets/2771466/22edfc20-fd85-46d6-8c7f-a6a7444394ea
For completeness, it seems Metal async support is blocked on https://github.com/gfx-rs/wgpu/issues/3794. Identified by @Elabajaba
I removed macOS support for now. Once wgpu gets create_pipeline_async() we can revisit this for WebGPU/macOS support.
I did some profiling, and we do still stutter when we hit new pipelines, but it's significantly less bad (~50ms instead of ~200+ms).
The issue is that when we first call process_pipeline with a Queued pipeline, naga_oil has to do a bunch of work (in the ShaderCache::get calls), which can take ~50ms on my system.
I moved the expensive naga_oil work into the task. There's still a chance of stutter if extract_shaders() is blocked waiting for a lock that a task is holding, but hopefully that should be pretty rare.
Is there any way for an app to request that pipelines be synchronously built: i.e. to turn this behavior off? I worry that just having objects silently not appear for a few frames is not the behavior that every app wants.
I'd like to leave that to a followup. We can put it in the RenderPlugin settings, but it's a bit of a pain.
It'd also be nice if apps could preload pipelines, but that can be done as a follow-up, as I'm not sure what the API for that would look like.
https://github.com/bevyengine/bevy/issues/10871
Works for me locally. I'm content with the level of review and testing for this, and no crimes have been committed in the code base. Merging now: I'd much rather have to revert just before release than find out it's broken for users after launch.