Pipeline cache API and implementation for Vulkan
Connections
- Fixes #5293 (for Vulkan)
- Required for https://github.com/linebender/vello/issues/289
- See also https://github.com/linebender/vello/pull/459 which validated that this was required
Description
Adds a `PipelineCache` resource to wgpu's API, and wires it down to `vkCreatePipelineCache` on Vulkan.
This cache is added as a parameter to `ComputePipelineDescriptor` and `RenderPipelineDescriptor`.
The parameter is used as expected on Vulkan, and ignored on DirectX, Metal, WebGPU, and GLES.
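For illustration, this is roughly the shape of usage I have in mind, including persisting the cache contents between runs. The exact names here (`create_pipeline_cache`, `PipelineCacheDescriptor`, the `cache` field, `get_data`, and `base_descriptor`) are illustrative of the shape described above rather than quoted from the diff:

```rust
// Sketch only: names are illustrative, not quoted from the diff. `base_descriptor`
// stands in for an existing compute pipeline descriptor elsewhere in the application.
let disk_data: Option<Vec<u8>> = std::fs::read("pipeline_cache.bin").ok();

// Safety: the data must have been written by wgpu on the same adapter and not
// modified since; `fallback` requests an empty cache if the data is rejected.
let cache = unsafe {
    device.create_pipeline_cache(&wgpu::PipelineCacheDescriptor {
        label: Some("pipeline cache"),
        data: disk_data.as_deref(),
        fallback: true,
    })
};

// The descriptors gain an optional cache handle; `None` keeps today's behaviour.
let pipeline = device.create_compute_pipeline(&wgpu::ComputePipelineDescriptor {
    cache: Some(&cache),
    ..base_descriptor
});

// Once pipelines have been created, write the (possibly updated) contents back.
if let Some(data) = cache.get_data() {
    let _ = std::fs::write("pipeline_cache.bin", data);
}
```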
Testing
Testing has been performed on a Pixel 6, which validated that the cache works as expected between runs. The test lives in a branch of my fork of Vello. This only exercises the cache for Vulkan pipelines.
Checklist
- [x] Run `cargo fmt`.
- [x] Run `cargo clippy`. If applicable, add:
  - [x] `--target wasm32-unknown-unknown`
  - [x] `--target wasm32-unknown-emscripten` - gets the same errors on `main`, so I assume it's fine
- [x] Run `cargo xtask test` to run tests - I get the same failures on `main`, so I assume it's fine
- [x] Add change to `CHANGELOG.md`. See simple instructions inside file.
Status
This PR is now fully ready for review.
Future Work
I am planning on implementing additional validation inside wgpu that the cache is suitable, as discussed in https://medium.com/@zeuxcg/creating-a-robust-pipeline-cache-with-vulkan-961d09416cda. That is mainly about storing the driver version and file size inline in the file. This should be done before this feature is released, but could be in a different pull request. Please advise if you have any preference here.
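As a sketch of what that inline validation might look like (the field set and names below are my own assumptions, not necessarily what will ship), the cache file would carry a small header that is checked before the payload is handed to the driver:

```rust
// Hedged sketch of the kind of header the linked article suggests.
#[repr(C)]
struct CacheFileHeader {
    driver_version: u32, // taken from the adapter when the cache was written
    data_size: u64,      // length of the payload that follows; detects truncation
}

/// The payload is only handed to the driver if the header still matches the
/// adapter we are running on and the file was written out completely.
fn cache_is_usable(header: &CacheFileHeader, current_driver_version: u32, payload_len: u64) -> bool {
    header.driver_version == current_driver_version && header.data_size == payload_len
}
```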
This could also be extended to other platforms/APIs - see e.g. https://github.com/gfx-rs/gfx/issues/2877. I am not intending to do this work myself.
cc @expenses, as you had a previous investigation in https://github.com/gfx-rs/gfx/issues/3716
Thanks for the review @cwfitzgerald
In terms of validation, my intent was to follow up on that separately, because I wasn't sure I was doing the right thing. So I can definitely add that validation in this PR, since the rest of the shape is approximately right.
One thing I need guidance on is how to hash the data in `wgpu_core`. Are you aware of a preferred library for this? It feels like the sort of thing which is quite opinionated.
I just want some clarification about what you mean by:
> The user will pull the pipeline key from the adapter, hash it, and use it as a filename to store the data.
In my prototype integration, I wasn't hashing the filename. I find it unlikely that users would do so, either. Should we pre-hash the value from the cache key method?
> It shouldn't do any additional validation
This is also slightly troubling, as different APIs have different validation needs. For example, Vulkan probably needs to validate that the driver version is the same, whereas the concept of a driver version isn't exposed at the higher levels.
> For example, Vulkan probably needs to validate the driver version is the same, whereas the concept of a driver version isn't exposed at the higher levels
That's interesting... are there Vk drivers in the wild that accept incompatible pipeline caches from applications, or is this more preemptive (due to the quality issues with some drivers)?
According to the linked article, there have been cases in the past where a driver either:
- Doesn't validate the UUID (which is meant to detect this) at all
- Fails to update the UUID when the cache would actually be incompatible
> This is also slightly troubling, as different APIs have different validation needs. For example, Vulkan probably needs to validate the driver version is the same, whereas the concept of a driver version isn't exposed at the higher levels
The idea is that `wgpu-hal` decides on its pipeline key based on its validation needs, and `wgpu-core` is the one that does the comparison and rejection of the pipeline key. This way the driver version will be part of the `wgpu-hal` pipeline key.
> In my prototype integration, I wasn't hashing the filename. I find it unlikely that users would do so, either. Should we pre-hash the value from the cache key method?
If we make the cache key an opaque type, we can put `into_bytes` and `hash` methods on it.
The key factor which needs to be accounted for is:
- If the driver updates, the old cache you still have is useless.
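For reference, a minimal sketch of that opaque-key shape might look like the following. None of these names come from the PR, and a hasher intended for hash maps is probably not the right long-term choice if the result names a file on disk (see the hashing discussion further down):

```rust
// Sketch of the opaque-key idea; names are assumptions, not the PR's API.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Opaque so callers cannot depend on (or corrupt) the internal layout.
pub struct PipelineCacheKey(Vec<u8>);

impl PipelineCacheKey {
    /// Raw bytes, e.g. for embedding in a cache file header.
    pub fn into_bytes(self) -> Vec<u8> {
        self.0
    }

    /// A hash of the key, e.g. for deriving a file name. A real implementation
    /// would want a hash that is stable across builds and platforms.
    pub fn hash(&self) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.0.hash(&mut hasher);
        hasher.finish()
    }
}
```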
I'm planning on implementing that with effectively two keys: a "device" key (which identifies the device/backend combination) and a further validation key. The device key should relate to the "file name" being used, whereas the validation key is just used to bail from using the cache. That is potentially what you are describing, but I think it's good to clarify.
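To make that concrete, here is a hypothetical sketch of the two-key split; the names and the simple "validation key stored as a prefix of the file" layout are illustrative assumptions, not the PR's design:

```rust
// Hypothetical illustration of the two-key idea described above.
use std::path::Path;

struct CacheKeys {
    /// Identifies the device/backend combination; suitable for use as a file name.
    device_key: String,
    /// Changes whenever stored data becomes unusable (e.g. after a driver update).
    validation_key: Vec<u8>,
}

/// Returns the cached payload only if the validation key stored in the file
/// still matches; otherwise the caller rebuilds the cache from scratch.
fn load_cache(keys: &CacheKeys, dir: &Path) -> Option<Vec<u8>> {
    let bytes = std::fs::read(dir.join(&keys.device_key)).ok()?;
    if bytes.len() < keys.validation_key.len() {
        return None;
    }
    let (stored_key, payload) = bytes.split_at(keys.validation_key.len());
    (stored_key == keys.validation_key).then(|| payload.to_vec())
}
```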
This is now ready for follow-up review. I am now going to test this updated version in Vello, and I also need some guidance on computing the hash (it currently uses a hardcoded value). I'm unlikely to make further functional changes before another review.
Having done some more thinking on this, I have decided to exclude the hashing of the cache data from this PR. The main reasons for this are:
- I believe it is unlikely that a filesystem would corrupt the data in a way which keeps the same length but changes some of the content
- Any attempt at meddling with the file by other processes/applications is already declared out of scope by our safety comment
- We do not already have a dependency which would compute a suitable hash. We could likely produce a hash designed for a hashmap, but that has significantly different requirements from our hashing needs.
I also don't see the necessity of making the pipeline cache key an opaque set of bytes rather than a readable string. These files should be in a location which is rarely accessed by end users. Additionally, if using these as a file name, an application would have to sanitise the bytes to be reasonable for use as a file name. I suspect at least some applications would not do so, and would potentially lose some of the portability which wgpu provides. But I also think it's plausible that I've missed some important reason why this is an incorrect assessment.
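As a small illustration of that point, a readable key can be dropped straight into a path, whereas raw bytes would first need sanitising. The key string and file extension below are invented:

```rust
// Invented example of what a readable, file-name-safe key might look like.
let key = "vulkan-adreno-730-driver-512-615";
let path = cache_dir.join(format!("{key}.wgpu-cache"));
let previous_data = std::fs::read(&path).ok();
```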
Re-requesting review pending CI
Looks to be an issue with the `ash` update.
I'll fix this up and get it in.