rust-gpu
`rustc_codegen_spirv` taking a long time processing my (large) shader
This issue comes from a Discord conversation: https://discordapp.com/channels/750717012564770887/750717499737243679/943879544006381638
Expected Behaviour
I want to compile my shader faster.
Example & Steps To Reproduce
- clone my repo https://github.com/hatoo/rene
- `cargo build --release -Z timings`
- See compile duration
I tested on af3cf2b74e8ef4f4bd98a3cd2cda64b9379d20d7.
Completed rene v0.1.0 build script (run) in 263.2s
When building from the console in VSCode with the rust-analyzer plugin, it takes longer.
Completed rene v0.1.0 build script (run) in 407.3s
System Info
- Rust: rustc 1.60.0-nightly (1bd4fdc94 2022-01-12)
- OS: Windows 11
- CPU: 3950X
- GPU: RTX2080ti
- SPIR-V: SPIRV-Tools v2021.4-dev v2021.3-86-g21e3f681
After I added more code to my shader, it takes much longer to compile. In this commit, it takes about 1000s.
Completed rene v0.1.0 build script (run) in 1009.6s
I then added one more commit on top of this in this branch. I waited for a few hours and the build did not finish.
I enabled `-Zself-profile`, built it, and found that rustc spends almost all of its time linking (`link_block_ordering_pass_and_mem2reg`).
Procedure
- Enable `-Zself-profile` on spirv-builder. This can be done with a change like https://github.com/hatoo/rust-gpu/commit/e34bda56b12cc86533e92ae57a754f6ae896e0b1
- Change `rene`'s `spirv-builder` dependency to use it.
- `cargo install --git https://github.com/rust-lang/measureme crox flamegraph summarize`
- `cargo build --release` on `rene`
- `crox .\rene_shader-xxxx.mm_profdata`, and `chrome_profiler.json` will be produced.
- Open the Chrome Developer Tools and click the Performance tab, then upload `chrome_profiler.json` by clicking the ⬆️ button.

I've been working on a mem2reg replacement recently, relying on SPIR-T qptrs (based on https://github.com/EmbarkStudios/spirt/pull/29 / https://github.com/EmbarkStudios/spirt/pull/41). There is no PR yet for the mem2reg replacement, but I really hope I can get it into Rust-GPU 0.9, behind the qptr opt-in.
$ rg mem2reg 2023-07-08-spirt-disagg-baseline
512:time: 0.011; rss: 187MB -> 187MB ( +0MB) link_block_ordering_pass_and_mem2reg-before-inlining
515:time: 126.704; rss: 189MB -> 158MB ( -31MB) link_block_ordering_pass_and_mem2reg-after-inlining
$ rg qptr::partition_and_propagate 2023-07-08-spirt-disagg-qptr-pnp
538:time: 0.570; rss: 220MB -> 221MB ( +1MB) qptr::partition_and_propagate
so that's a 222x speed-up (for the same version of rene-shader I used to demo the initial impact from SPIR-T, over a year ago)
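For anyone wanting to reproduce that ratio from their own logs, here is a minimal sketch of how the number falls out of the two `time:` lines quoted above. The helper names `parse_time_line` and `speedup` are hypothetical (they are not part of Rust-GPU or SPIR-T), and the parser assumes the exact line shape shown in the `rg` output: an optional `rg` line-number prefix, `time: <seconds>;`, and the pass name as the last token.

```rust
/// Parse one timing line, e.g.
/// `515:time: 126.704; rss: 189MB -> 158MB ( -31MB) some_pass_name`,
/// into `(seconds, pass_name)`. Returns `None` for any other line shape.
/// (Hypothetical helper; the line format is assumed from the log above.)
fn parse_time_line(line: &str) -> Option<(f64, &str)> {
    // Drop everything up to and including `time:` (this skips rg's `NNN:` prefix).
    let (_, rest) = line.split_once("time:")?;
    // The duration in seconds runs up to the first `;`.
    let (secs, _) = rest.split_once(';')?;
    let secs: f64 = secs.trim().parse().ok()?;
    // The pass name is the last whitespace-separated token on the line.
    let name = line.split_whitespace().last()?;
    Some((secs, name))
}

/// Ratio of the old pass duration to the new one.
fn speedup(before_secs: f64, after_secs: f64) -> f64 {
    before_secs / after_secs
}

fn main() {
    // The two lines are copied from the log excerpts above.
    let baseline = "515:time: 126.704; rss: 189MB -> 158MB ( -31MB) link_block_ordering_pass_and_mem2reg-after-inlining";
    let qptr = "538:time: 0.570; rss: 220MB -> 221MB ( +1MB) qptr::partition_and_propagate";

    let (before, before_name) = parse_time_line(baseline).expect("unexpected line format");
    let (after, after_name) = parse_time_line(qptr).expect("unexpected line format");

    println!("{before_name}: {before:.3}s");
    println!("{after_name}: {after:.3}s");
    println!("speed-up: {:.0}x", speedup(before, after));
}
```

With the two lines above, 126.704 / 0.570 comes out at roughly 222, which matches the figure quoted.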
(RUSTGPU_CODEGEN_ARGS="--spirt-passes=qptr --no-infer-storage-classes --no-legacy-mem2reg" is the opt-in, at least on my local branch, ~~I'm hoping we can at least simplify it to RUSTGPU_CODEGEN_ARGS=--qptr for 0.9~~)
EDIT: to be perfectly clear: those flags are not useful without the necessary combination of SPIR-T + Rust-GPU changes and I don't even have an up-to-date Rust-GPU branch anywhere comparable to that one.