rust-gpu icon indicating copy to clipboard operation
rust-gpu copied to clipboard

Optimize `spirv-builder` clean build wall time

Open repi opened this issue 2 years ago • 5 comments

We should look into improve our clean build times of when building spriv-builder & rustc_codegen_spirv as that is the main scenario in CI and also useful for users if our crates rebuild fast.

Right now it looks like we have a couple of quite unfortunate serial dependencies that significantly increases the wall time. Here is the current results with cargo build --release -p spirv-builder -Z timings on my AMD 5950x (32 vCPU):

image

Potential optimizations

  • [ ] Could spirv-builder be the one depending on spirv-tools instead of rustc_codegen_spirv?
    • I.e. move the responsibility of running spirv-opt/spirv-val and final linking to the builder.
    • Does that sound like it could be doable @eddyb ?
  • [ ] Remove/replace serde, quite crazy it takes 20 seconds to compile
    • Think we only have two use cases, our ModuleResult output and then the internal decorations that is using more complex serde implementation (CustomDecoration).
    • have no idea how serde+serde_json is taking such a crazy long time to build here though, can't see any other crates enabling tons of types that serde derive macros are used on or similar
    • ModuleResult would be trivial to replace serde with nanoserde, could work for CustomDecoration also?
    • #860
  • [ ] Remove/replace sanitize-filename dependency that compiles and uses regex crate for super trivial things
  • [ ] Optimize builds of rspirv itself somehow?
    • Unfortunate that it is so slow to build by itself but could be hard to fix and tons of types in it. No open issues on it in the rspirv repo as far as I could see
  • [x] ~Determine why rustc_codegen_spirv is not split in frontend/codegen sections in the profile~ report.
    • This would enable spriv-builder itself to start building earlier and before the codegen of rustc_codegen_spirv is done.
    • Update: This is not possible as it it needs to be built as a dylib

If we do get rid of serde and manage to make sure rustc_codegen_spirv doesn't have to wait for the spirv-tools-sys build, we could get get a 15+ second wall time improvement here as rustc_codegen_spirv could start as soon as frontend section of rspirv has been built: - image

repi avatar Apr 10 '22 11:04 repi

Did an experiment to switch to nanoserde and it seems to work fine, there are some contraints and type/format output changes that one have to figure out (no PathBuf support), and does get rid of all the long serde compilation, 20 sec -> 1.7 sec.

Wall time benefit was just ~1 second though, but would enable real gains if one also finds a way not have rustc_codegen_spirv stall and wait for spirv-tools-sys build.

image

repi avatar Apr 10 '22 11:04 repi

Determine why rustc_codegen_spirv is not split in frontend/codegen sections in the profile report. This would enable spriv-builder itself to start building earlier and before the codegen of rustc_codegen_spirv is done.

Found the source of this, rustc_codegen_spirv was set as a dylib crate instead of standard lib.

Fixing this, together with the above experimental nanoserde switch does reduce build time from original 46.3s to 41.9s, one can see in the report that rustc_codegen_spirv is now split in frontend and codegen sections which enables spirv-builder to build earlier.

image

repi avatar Apr 10 '22 12:04 repi

You could split rustc_codegen_spirv into an rlib containing the actual implementation and a dylib which only depends on the rlib. The rlib will participate in pipelined compilation and the dylib will wait for all dependencies to have fully compiled before doing the final link. It may also be possible to move just the code using spirv-tools to the dylib, allowing most of the code of cg_spirv to be compiled at the same time as spirv-tools-sys.

bjorn3 avatar Aug 08 '22 14:08 bjorn3

ah that is an interesting idea and approach, what do you think of the feasibility of that @eddyb ?

repi avatar Aug 08 '22 20:08 repi

Ahh I always forget about how dylibs behave just like executables, and the current state of pipelining - it makes sense in the grand scheme of there being no good way (yet) to block rustc invoking the linker, on the linker inputs actually being present, but it's basically a Cargo<->rustc communicative deficiency AFAIK, not even something limited by platform tooling.

What @bjorn3 is suggesting is basically "emulate what should happen, with today's tools" and we should probably do that.

If @oisyn wants to tackle this, a first approximation could be as simple as:

  • rename crates/rustc_codegen_spirv (e.g. add a suffix like -impl), and the name in its Cargo.toml, and take away any dylib-specific settings
  • create a minimal crates/rustc_codegen_spirv
    • Cargo.toml pretty much identical to what's currently there today (modulo deps?) and with an extra dependency on the new (e.g. rustc_codegen_spirv-impl) crate
    • src/lib.rs containing only extern crate rustc_codegen_spirv_impl; - in theory that should be enough to allow any exported symbols from the -impl crate to surface up to to the dylib and allow it to function

For @bjorn3's complete suggestion, I'm guessing this needs to go in the dylib: https://github.com/EmbarkStudios/rust-gpu/blob/105cbcc6184ac78fcb954a525895d49f39733159/crates/rustc_codegen_spirv/src/lib.rs#L515-L561

With that Box::new(SpirvCodegenBackend) at the very end being changed to Box::new(SpirvCodegenBackend { spirv_tools: ... }) (i.e. rustc_codegen_spirv-impl's definition of SpirvCodegenBackend needs a new field that holds some kind of trait object, function pointers, etc. - some mechanism to dynamically get access to the functionality we need from spirv_tools without it being statically accessed from rustc_codegen_spirv-impl).

eddyb avatar Aug 08 '22 23:08 eddyb