Async pipeline compilation

Open JMS55 opened this issue 2 years ago • 8 comments

Objective

  • Pipeline compilation is slow and blocks the frame
  • Closes https://github.com/bevyengine/bevy/issues/8224

Solution

  • Compile pipelines in a Task on the AsyncComputeTaskPool
  • This won't actually improve anything until wgpu 0.19, as wgpu currently has internal locks that will cause blocking anyway
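
The approach described above can be sketched with the standard library alone. This is not the PR's code: a thread and a channel stand in for a Task on the AsyncComputeTaskPool, and `Pipeline`, `PipelineState`, `compile`, `process`, and `run_until_ready` are hypothetical names chosen for illustration. The point is only that the per-frame call never waits for compilation to finish.

```rust
use std::sync::mpsc::{channel, Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a compiled GPU pipeline.
#[derive(Debug, PartialEq)]
struct Pipeline(String);

/// Simplified lifecycle, loosely modeled on the states named in this PR.
enum PipelineState {
    /// Waiting to be picked up; holds the (fake) shader source.
    Queued(String),
    /// Compilation is running in the background.
    Creating(Receiver<Pipeline>),
    /// Compilation finished.
    Ok(Pipeline),
}

/// Stand-in for the slow driver call (assumption: takes ~20 ms).
fn compile(source: String) -> Pipeline {
    thread::sleep(Duration::from_millis(20));
    Pipeline(format!("compiled:{source}"))
}

/// Called once per frame. It starts work or checks on it, but never blocks.
fn process(state: PipelineState) -> PipelineState {
    match state {
        PipelineState::Queued(source) => {
            let (tx, rx) = channel();
            // A plain thread plays the role of the task pool here.
            thread::spawn(move || {
                let _ = tx.send(compile(source));
            });
            PipelineState::Creating(rx)
        }
        PipelineState::Creating(rx) => match rx.try_recv() {
            Ok(pipeline) => PipelineState::Ok(pipeline),
            Err(TryRecvError::Empty) => PipelineState::Creating(rx),
            Err(TryRecvError::Disconnected) => panic!("compile thread exited early"),
        },
        done @ PipelineState::Ok(_) => done,
    }
}

/// Simulates frames until the pipeline is ready; returns the frame count.
fn run_until_ready(source: &str) -> (u32, Pipeline) {
    let mut state = PipelineState::Queued(source.to_string());
    let mut frames = 0;
    loop {
        frames += 1;
        match process(state) {
            PipelineState::Ok(pipeline) => return (frames, pipeline),
            pending => state = pending,
        }
        // Pretend to render a frame while compilation continues.
        thread::sleep(Duration::from_millis(1));
    }
}
```

Calling `run_until_ready("shader")` always takes at least two frames: the first one only starts the background work, and later frames poll for the result.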

Changelog

  • Render/compute pipeline compilation is now done asynchronously over multiple frames on non-wasm platforms when the multi-threaded feature is enabled
  • Added CachedPipelineState::Creating
  • Added bevy_utils::futures::check_ready
  • Added bevy_render/multi-threaded cargo feature
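
The thread does not show how `check_ready` is implemented. One common way to check a future for completion without blocking is to poll it once with a waker that does nothing, sketched below using only the standard library. `poll_once` and `Countdown` are hypothetical names for this example, not part of bevy_utils.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Builds a waker whose wake/clone/drop hooks do nothing.
fn noop_raw_waker() -> RawWaker {
    fn noop(_: *const ()) {}
    fn clone(_: *const ()) -> RawWaker {
        noop_raw_waker()
    }
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    RawWaker::new(std::ptr::null(), &VTABLE)
}

/// Polls the future exactly once. Returns `Some(output)` if it was ready,
/// `None` if it is still pending. Never blocks.
fn poll_once<F: Future + Unpin>(future: &mut F) -> Option<F::Output> {
    // SAFETY: the vtable functions never dereference the null data pointer.
    let waker = unsafe { Waker::from_raw(noop_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    match Pin::new(future).poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}

/// Toy future that reports `Pending` a fixed number of times, then finishes.
struct Countdown(u32);

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.0 == 0 {
            Poll::Ready("done")
        } else {
            self.0 -= 1;
            Poll::Pending
        }
    }
}
```

Because the waker is a no-op, nothing reschedules the future; the caller is expected to poll again on a later frame, which fits a render loop that revisits pending pipelines every frame anyway.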

Migration Guide

  • Match on the new Creating variant for exhaustive matches of CachedPipelineState
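
As an illustration of the migration step, here is a self-contained sketch using a mock enum. `MockPipelineState` is a hypothetical, simplified stand-in: only `Creating` (added by this PR) and `Queued` (mentioned later in the thread) come from the source, while the `Ok` and `Err` variants and all payload types are assumptions made to round out the example.

```rust
/// Simplified stand-in for `CachedPipelineState`; payloads are placeholders.
#[allow(dead_code)]
enum MockPipelineState {
    Queued,
    /// New variant: the pipeline is being compiled in the background.
    Creating,
    Ok(u32),
    Err(String),
}

/// An exhaustive match must now include an arm for `Creating`;
/// without it the compiler rejects the match as non-exhaustive.
fn describe(state: &MockPipelineState) -> String {
    match state {
        MockPipelineState::Queued => "queued".to_string(),
        MockPipelineState::Creating => "compiling in background".to_string(),
        MockPipelineState::Ok(id) => format!("ready (pipeline {id})"),
        MockPipelineState::Err(e) => format!("failed: {e}"),
    }
}
```

Code that previously treated "not ready" as a single case can usually handle `Creating` the same way it handles `Queued`.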

JMS55 avatar Nov 30 '23 04:11 JMS55

Shame we couldn't get this one out in time for the jam. Darn!

alice-i-cecile avatar Nov 30 '23 05:11 alice-i-cecile

> Shame we couldn't get this one out in time for the jam. Darn!

Wouldn't be possible, as we need wgpu 0.19 first for this to actually work.

I'll also probably have to go back and redo/change this once https://github.com/gfx-rs/wgpu/issues/3794 gets implemented, as this PR will only help on native (we don't have threads on WASM/WebGPU).

JMS55 avatar Nov 30 '23 15:11 JMS55

Sadly, this causes an error log message (but no crash) when the app first loads, because nothing is rendered while the pipelines are still compiling :grimacing:

We might need some logic to block that specific log message for the first few frames of the app, or something else hacky.

2023-12-01T19:50:50.742368Z ERROR present_frames: wgpu_core::present: No work has been submitted for this frame

JMS55 avatar Dec 01 '23 19:12 JMS55

I'd like to try to land this for 0.13. It's semi-low priority, so we can cut it if it's not ready, but it's a nice improvement if we can land it.

JMS55 avatar Jan 24 '24 18:01 JMS55

The last thing remaining for this PR is fixing compilation without the multi-threaded feature or on wasm.

JMS55 avatar Jan 27 '24 03:01 JMS55

Doesn't work for me on Linux with an AMD 6800 XT and the RADV drivers.

Elabajaba avatar Jan 28 '24 21:01 Elabajaba

Does it just crash or do you mean that it still compiles synchronously?

IceSentry avatar Jan 28 '24 23:01 IceSentry

> Does it just crash or do you mean that it still compiles synchronously?

Judging by Tracy, it seems to compile synchronously, and then things randomly fail to render.

Elabajaba avatar Jan 29 '24 00:01 Elabajaba

The new fix works for me.

Elabajaba avatar Jan 29 '24 01:01 Elabajaba

As far as I can tell, this doesn't work on Mac in the deferred rendering example as of the latest commit: https://github.com/bevyengine/bevy/pull/10812/commits/4aed836e344b68305da5995f383ccc530cc81851

AdapterInfo { name: "Apple M1 Pro", vendor: 0, device: 0, device_type: IntegratedGpu, driver: "", driver_info: "", backend: Metal }
SystemInfo { os: "MacOS 14.2.1 ", kernel: "23.2.0", cpu: "", core_count: "10", memory: "32.0 GiB" }

https://github.com/bevyengine/bevy/assets/2771466/72dda7db-5661-4688-b50e-a044d2bc11b9

tbillington avatar Jan 29 '24 01:01 tbillington

Same example on main for comparison.

https://github.com/bevyengine/bevy/assets/2771466/22edfc20-fd85-46d6-8c7f-a6a7444394ea

tbillington avatar Jan 29 '24 01:01 tbillington

For completeness, it seems Metal async support is blocked on https://github.com/gfx-rs/wgpu/issues/3794, as identified by @Elabajaba.

tbillington avatar Jan 29 '24 02:01 tbillington

I removed macOS support for now. Once wgpu gets create_pipeline_async() we can revisit this for WebGPU/macOS support.

JMS55 avatar Jan 29 '24 04:01 JMS55

I did some profiling, and we do still stutter when we hit new pipelines, but it's significantly less bad (~50ms instead of ~200+ms).

The issue is that when we first call process_pipeline with a Queued pipeline, naga_oil has to do a bunch of work (in the ShaderCache::get calls), which can take ~50ms on my system.

Elabajaba avatar Jan 29 '24 08:01 Elabajaba

I moved the expensive naga_oil work into the task. There's still a chance of stutter if extract_shaders() is blocked waiting for a lock that a task is holding, but that should hopefully be pretty rare.

JMS55 avatar Feb 01 '24 03:02 JMS55

> Is there any way for an app to request that pipelines be synchronously built: i.e. to turn this behavior off? I worry that just having objects silently not appear for a few frames is not the behavior that every app wants.

I'd like to leave that to a follow-up. We can put it in the RenderPlugin settings, but it's a bit of a pain.

It'd also be nice if apps could preload pipelines, but that can be done as a follow-up, as I'm not sure what the API for that would look like.

https://github.com/bevyengine/bevy/issues/10871

JMS55 avatar Feb 02 '24 18:02 JMS55

Works for me locally. I'm content with the level of review and testing for this, and no crimes have been committed in the code base. Merging now: I'd much rather have to revert just before release than find out it's broken for users after launch.

alice-i-cecile avatar Feb 05 '24 13:02 alice-i-cecile