rust-gpu
rust-gpu copied to clipboard
Optimize `spirv-builder` clean build wall time
We should look into improve our clean build times of when building spriv-builder
& rustc_codegen_spirv
as that is the main scenario in CI and also useful for users if our crates rebuild fast.
Right now it looks like we have a couple of quite unfortunate serial dependencies that significantly increases the wall time. Here is the current results with cargo build --release -p spirv-builder -Z timings
on my AMD 5950x (32 vCPU):
Potential optimizations
- [ ] Could
spirv-builder
be the one depending onspirv-tools
instead ofrustc_codegen_spirv
?- I.e. move the responsibility of running spirv-opt/spirv-val and final linking to the builder.
- Does that sound like it could be doable @eddyb ?
- [ ] Remove/replace
serde
, quite crazy it takes 20 seconds to compile- Think we only have two use cases, our
ModuleResult
output and then the internal decorations that is using more complex serde implementation (CustomDecoration
). - have no idea how
serde
+serde_json
is taking such a crazy long time to build here though, can't see any other crates enabling tons of types that serde derive macros are used on or similar -
ModuleResult
would be trivial to replace serde withnanoserde
, could work forCustomDecoration
also? - #860
- Think we only have two use cases, our
- [ ] Remove/replace
sanitize-filename
dependency that compiles and usesregex
crate for super trivial things - [ ] Optimize builds of
rspirv
itself somehow?- Unfortunate that it is so slow to build by itself but could be hard to fix and tons of types in it. No open issues on it in the
rspirv
repo as far as I could see
- Unfortunate that it is so slow to build by itself but could be hard to fix and tons of types in it. No open issues on it in the
- [x] ~Determine why
rustc_codegen_spirv
is not split in frontend/codegen sections in the profile~ report.- This would enable
spriv-builder
itself to start building earlier and before the codegen ofrustc_codegen_spirv
is done. - Update: This is not possible as it it needs to be built as a
dylib
- This would enable
If we do get rid of serde and manage to make sure rustc_codegen_spirv
doesn't have to wait for the spirv-tools-sys
build, we could get get a 15+ second wall time improvement here as rustc_codegen_spirv
could start as soon as frontend section of rspirv
has been built:
-
Did an experiment to switch to nanoserde
and it seems to work fine, there are some contraints and type/format output changes that one have to figure out (no PathBuf
support), and does get rid of all the long serde compilation, 20 sec -> 1.7 sec.
Wall time benefit was just ~1 second though, but would enable real gains if one also finds a way not have rustc_codegen_spirv
stall and wait for spirv-tools-sys
build.
Determine why rustc_codegen_spirv is not split in frontend/codegen sections in the profile report. This would enable spriv-builder itself to start building earlier and before the codegen of rustc_codegen_spirv is done.
Found the source of this, rustc_codegen_spirv
was set as a dylib
crate instead of standard lib
.
Fixing this, together with the above experimental nanoserde
switch does reduce build time from original 46.3s to 41.9s, one can see in the report that rustc_codegen_spirv
is now split in frontend and codegen sections which enables spirv-builder
to build earlier.
You could split rustc_codegen_spirv into an rlib containing the actual implementation and a dylib which only depends on the rlib. The rlib will participate in pipelined compilation and the dylib will wait for all dependencies to have fully compiled before doing the final link. It may also be possible to move just the code using spirv-tools to the dylib, allowing most of the code of cg_spirv to be compiled at the same time as spirv-tools-sys.
ah that is an interesting idea and approach, what do you think of the feasibility of that @eddyb ?
Ahh I always forget about how dylibs behave just like executables, and the current state of pipelining - it makes sense in the grand scheme of there being no good way (yet) to block rustc
invoking the linker, on the linker inputs actually being present, but it's basically a Cargo<->rustc
communicative deficiency AFAIK, not even something limited by platform tooling.
What @bjorn3 is suggesting is basically "emulate what should happen, with today's tools" and we should probably do that.
If @oisyn wants to tackle this, a first approximation could be as simple as:
- rename
crates/rustc_codegen_spirv
(e.g. add a suffix like-impl
), and the name in itsCargo.toml
, and take away anydylib
-specific settings - create a minimal
crates/rustc_codegen_spirv
-
Cargo.toml
pretty much identical to what's currently there today (modulo deps?) and with an extra dependency on the new (e.g.rustc_codegen_spirv-impl
) crate -
src/lib.rs
containing onlyextern crate rustc_codegen_spirv_impl;
- in theory that should be enough to allow any exported symbols from the-impl
crate to surface up to to the dylib and allow it to function
-
For @bjorn3's complete suggestion, I'm guessing this needs to go in the dylib: https://github.com/EmbarkStudios/rust-gpu/blob/105cbcc6184ac78fcb954a525895d49f39733159/crates/rustc_codegen_spirv/src/lib.rs#L515-L561
With that Box::new(SpirvCodegenBackend)
at the very end being changed to Box::new(SpirvCodegenBackend { spirv_tools: ... })
(i.e. rustc_codegen_spirv-impl
's definition of SpirvCodegenBackend
needs a new field that holds some kind of trait object, function pointers, etc. - some mechanism to dynamically get access to the functionality we need from spirv_tools
without it being statically accessed from rustc_codegen_spirv-impl
).