PipelineSet
Figure out threshold for single-threaded PSO building
Right now I stupidly use a PPL parallel_for to build the PSOs in parallel. Maybe its task scheduler does some fanciness to group work together once it realizes the work items are small, but it would be good to also give it a decent guess of how many PSOs it should build on each thread.
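A rough sketch of what "a decent guess" could look like, written with plain std::thread so it doesn't depend on PPL. BuildInChunks, serialThreshold and minChunk are made-up names and the values still have to come from measurement; this is not what the code does today. With PPL itself, passing a concurrency::simple_partitioner with a chunk size to parallel_for is probably the closest equivalent, but I haven't tried it here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: calls build(i) once for every i in [0, count).
// At or below serialThreshold everything runs on the calling thread.
// Above it, the range is split into contiguous chunks so that each
// thread gets at least minChunk items (both values are guesses to tune).
inline void BuildInChunks(std::size_t count,
                          std::size_t serialThreshold,
                          std::size_t minChunk,
                          const std::function<void(std::size_t)>& build)
{
    assert(minChunk > 0);

    if (count <= serialThreshold) {
        for (std::size_t i = 0; i < count; ++i) build(i);
        return;
    }

    // hardware_concurrency() may return 0 when unknown, so clamp to 1.
    const std::size_t hw =
        std::max<std::size_t>(1, std::thread::hardware_concurrency());
    // Never use more chunks than there are minChunk-sized pieces of work.
    const std::size_t chunks =
        std::max<std::size_t>(1, std::min(hw, count / minChunk));
    const std::size_t chunkSize = (count + chunks - 1) / chunks;

    // Spawn workers for every chunk except the first one.
    std::vector<std::thread> workers;
    for (std::size_t begin = chunkSize; begin < count; begin += chunkSize) {
        const std::size_t end = std::min(count, begin + chunkSize);
        workers.emplace_back([begin, end, &build] {
            for (std::size_t i = begin; i < end; ++i) build(i);
        });
    }

    // The calling thread builds the first chunk itself, then waits.
    for (std::size_t i = 0; i < std::min(count, chunkSize); ++i) build(i);
    for (auto& w : workers) w.join();
}
```

The build callback would wrap whatever creates PSO number i; it has to be safe to call from several threads at once, and each index is only ever visited by one thread.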
The timings likely change a lot depending on whether the shaders are currently cached or not. When they're not cached, it probably pays off quickly to multi-thread (~10-100 ms compiles). This can be tested by turning off the shader cache in e.g. the NVIDIA control panel.
Probably need to come up with a workload that has 100s or 1000s of PSOs to be able to do useful measurements.
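A possible shape for the measurement, as a sketch only: time a serial and a parallel build for a few batch sizes and report the first size where parallel wins. TimeMs and FindCrossover are hypothetical helpers; the two callbacks are assumed to return measured milliseconds for building n PSOs, and should be run once with the driver shader cache on and once with it off.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Wall-clock duration of a single call to work(), in milliseconds.
inline double TimeMs(const std::function<void()>& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Walks batchSizes in the given order and returns the first n for which
// parallelMs(n) < serialMs(n). Returns 0 if parallel never wins.
// serialMs / parallelMs are assumed to build n PSOs and return the time;
// in a real run they would call TimeMs around the actual build.
inline std::size_t FindCrossover(
    const std::vector<std::size_t>& batchSizes,
    const std::function<double(std::size_t)>& serialMs,
    const std::function<double(std::size_t)>& parallelMs)
{
    assert(serialMs && parallelMs);
    for (std::size_t n : batchSizes) {
        if (parallelMs(n) < serialMs(n)) return n;
    }
    return 0;
}
```

A single sample per batch size will be noisy, so a real measurement would want several runs per size and something like the median.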
Also, the case of shaders being cached is affected by #2, since maybe the API can explicitly cache using a temporary and automatically managed ID3D12PipelineLibrary.