async-graphql

Compile time performance

Open LukeMauldin opened this issue 2 years ago • 27 comments

Expected Behavior

Compile times for our Rust web application are getting progressively worse as its GraphQL usage grows. The current GraphQL schema is about 1700 lines, with around 100 mutation methods and 35 query methods. I performed some compile-time testing, and each GraphQL query method added increased overall compile time by about 1%. For reference, on an MBP M1 Pro the application binary that serves GraphQL via Axum already takes around 55 seconds to compile. I also reproduced the results on a dedicated GCP Debian 11 VM (compute instance c2-standard-4); on Linux the application took around 100 seconds to build.

I ran the cargo bloat tool, and 13 of the top 14 methods are GraphQL controllers of type: async_graphql::resolver_utils::container::ContainerType>::resolve_field::{{closure}}

I performed other testing, such as removing the ORM components of the application, but the one consistent factor contributing to compile times appeared to be GraphQL. I have major concerns about continuing to grow the application using the async-graphql crate, because over the next year the GraphQL schema is expected to grow to three or four times its current size.

Even rust-analyzer takes around 10-12 seconds on MBP M1 Pro to provide compile time syntax checking which makes development difficult.

Specifications

Rustc version: 1.69.0

async-graphql = { version = "5.0.6", features = [
    "uuid08",
    "chrono",
    "dataloader",
] }
async-graphql-axum = "5.0.6"
axum = "0.6.11"

LukeMauldin avatar May 09 '23 16:05 LukeMauldin

How can we understand which feature of async-graphql leads us to this result? How can we speed up? Are there any heavy macros? What features are you using?

Can I ask you to create a minimal repo to reproduce the problem?

It could be a good starting point from which to start improving.

frederikhors avatar May 09 '23 16:05 frederikhors

It's really slow. Consider splitting your crate into multiple sub-crates. 😄

sunli829 avatar May 09 '23 17:05 sunli829

Provided example repo: https://github.com/LukeMauldin/rust-graphql-perf

It is easy to try different numbers of graphql methods by commenting out lines in https://github.com/LukeMauldin/rust-graphql-perf/blob/main/src/controllers/mod.rs

Performance testing numbers on MBP M1 Pro:

Initial working example (50dc90d): 1.6 seconds, 21 MB debug binary

Simple event instance with query and mutation (611a856): 2.1 seconds, 22MB debug binary

Event instances 0-4 (5b90800): 3.5 seconds, 25MB debug binary, 143 line schema

Event instances 0-9 (5b90800): 5.5 seconds, 29MB debug binary, 263 line schema

The conclusion that I am drawing from the above test is that each graphql event instance that is added requires about 0.5 seconds of compile time. Our internal application has over 100 methods with a schema file around 1700 lines and so compile times grow quickly.
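The per-instance estimate can be checked against the four measurements with a few lines of arithmetic. The sketch below assumes that the initial example has 0 event instances, the simple example has 1, "instances 0-4" means 5, and "instances 0-9" means 10; those counts are an interpretation of the list above and are not stated explicitly.

```rust
/// (number of event instances, compile time in seconds) from the list above.
/// Assumption: 0, 1, 5 and 10 instances for the four measurements.
const MEASUREMENTS: [(u32, f64); 4] = [(0, 1.6), (1, 2.1), (5, 3.5), (10, 5.5)];

/// Average extra seconds per added instance between consecutive measurements.
fn marginal_costs(points: &[(u32, f64)]) -> Vec<f64> {
    points
        .windows(2)
        .map(|pair| (pair[1].1 - pair[0].1) / f64::from(pair[1].0 - pair[0].0))
        .collect()
}

fn main() {
    for cost in marginal_costs(&MEASUREMENTS) {
        println!("{cost:.2} s per added instance");
    }
}
```

Under those assumptions the marginal cost works out to roughly 0.35 to 0.5 seconds per instance, which is in line with the "about 0.5 seconds" estimate.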

Splitting into different crates is not a good option for a few reasons. First, that would require substantial application refactoring and make business logic less clear. Second, on OSX about 30% of the compile time is in the linker and on Linux about 50% of the compile time is in the linker. From my understanding of the Rust compile process, splitting into separate crates will not help the linker phase.

LukeMauldin avatar May 09 '23 20:05 LukeMauldin

I have the same issue. My DSL schema has grown to 6500 lines, with 135 mutations and about 65 queries. My application build time is 3 minutes from a cold start and about 40 seconds in incremental mode, which is about the same time rust-analyzer takes. The build time keeps growing as the schema grows.

Is there anything we can do to improve the situation without having to split up the modules? The types in the schema often reference each other and simply can't be separated. The compilation was done on a Ryzen 3900x.

negezor avatar May 15 '23 07:05 negezor

@negezor - In the example Git repo I linked above, I also have a branch that tries a different macro strategy with async-graphql, but in my testing it only improved compile time from 5.5s (main branch) to 5.2s (alternative/simpleobject), which is rather small. I also created a branch (alternative/juniper) that uses the Rust juniper crate. With juniper, the baseline compile time was 2.4s, and after adding nine instances it increased to 3.9s. In percentage terms, compile time increased by 62% when going from 1 instance to 9 instances with juniper. By contrast, compile time increased by 160% in the same scenario with async-graphql. For reference, a comparable example in Golang - https://github.com/LukeMauldin/golang-graphql-perf - took less than 1 second to compile.

You mentioned rust-analyzer slowness, and in my assessment that is almost as important as overall compile time, because it dramatically slows the code-writing feedback cycle: IntelliSense and errors are much delayed.
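The two percentages can be reproduced from the timings quoted in this thread. This is a small sketch that assumes the async-graphql baseline is the 2.1s single-instance measurement from the earlier list; the juniper figures are the 2.4s and 3.9s given above.

```rust
/// Relative growth from `before` to `after`, in percent.
fn percent_increase(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // juniper: 2.4s baseline, 3.9s after adding instances.
    let juniper = percent_increase(2.4, 3.9);
    // async-graphql: 2.1s baseline (assumed), 5.5s after adding instances.
    let async_graphql = percent_increase(2.1, 5.5);
    println!("juniper: +{juniper:.1}%");
    println!("async-graphql: +{async_graphql:.1}%");
}
```

With those inputs the increases come out at about 62.5% and 162%, matching the rounded 62% and 160% figures.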

LukeMauldin avatar May 15 '23 13:05 LukeMauldin

I agree 100% with you. Same situation here, and our code keeps growing. Even with separate crates, a cold start (for example after a cargo clean, or a complete rebuild) always takes a painfully long time. We should find a way to address this. We must.

frederikhors avatar May 15 '23 15:05 frederikhors

I agree. There was some discussion about it in #783, but the 'solutions' are just workarounds. I do not think splitting into separate crates should really be the solution here. I agree that rust-analyzer slowness is far more important than compile time. No other crate I've ever used causes such a significant slowdown.

I'm happy to assist, although I'm not very familiar with profiling compile time performance – does anyone have any suggestions?

oeed avatar May 16 '23 06:05 oeed

If compilation performance is really important to you, maybe you can try dynamic schema, which does not use procedural macros and can get better compilation performance. 🙂
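To illustrate the idea behind a dynamic schema, here is a std-only toy sketch. It is not async-graphql's API (the real feature lives in the crate's `dynamic` module); `DynamicSchema`, `field` and `resolve` are hypothetical names used only for illustration. The point is that each field becomes a runtime map entry holding a boxed closure, so adding a field does not add macro-generated code for the compiler to expand and type-check.

```rust
use std::collections::HashMap;

/// A resolver is an ordinary boxed closure stored at runtime.
type Resolver = Box<dyn Fn() -> String + Send + Sync>;

/// Hypothetical stand-in for a schema that is assembled at runtime.
#[derive(Default)]
struct DynamicSchema {
    fields: HashMap<String, Resolver>,
}

impl DynamicSchema {
    /// Register a field by name; no derive or attribute macro is involved.
    fn field(
        mut self,
        name: &str,
        resolver: impl Fn() -> String + Send + Sync + 'static,
    ) -> Self {
        self.fields.insert(name.to_string(), Box::new(resolver));
        self
    }

    /// Look the field up by name and run its resolver, if it exists.
    fn resolve(&self, name: &str) -> Option<String> {
        self.fields.get(name).map(|resolver| resolver())
    }
}

fn main() {
    let schema = DynamicSchema::default()
        .field("eventCount", || 3_i32.to_string())
        .field("version", || "1.0".to_string());
    println!("{:?}", schema.resolve("eventCount"));
    println!("{:?}", schema.resolve("missing"));
}
```

The trade-off in a design like this is that field names and types are checked at runtime rather than at compile time.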

sunli829 avatar May 25 '23 04:05 sunli829

I never used dynamic schema before, but what do you think about using it during development and switch to a NOT dynamic one during build? Is it feasible?

frederikhors avatar May 25 '23 10:05 frederikhors

I never used dynamic schema before, but what do you think about using it during development and switch to a NOT dynamic one during build? Is it feasible?

They are very different and difficult to switch. 😅

sunli829 avatar May 25 '23 13:05 sunli829

First, that would require substantial application refactoring and make business logic less clear. Second, on OSX about 30% of the compile time is in the linker and on Linux about 50% of the compile time is in the linker.

Looks like you are not using LLD linker, it's multithreaded unlike GNU ld and should solve this issue. There's also mold linker which is even faster.

Logarithmus avatar Jun 12 '23 12:06 Logarithmus

First, that would require substantial application refactoring and make business logic less clear. Second, on OSX about 30% of the compile time is in the linker and on Linux about 50% of the compile time is in the linker.

Looks like you are not using LLD linker, it's multithreaded unlike GNU ld and should solve this issue. There's also mold linker which is even faster.

Are you stating on Linux that I should update Rust to use a different linker to improve the performance? My primary development platform is OSX and the rust analyzer performance is very bad (15-20 seconds for code completion and error checking). What can be done to improve that?

LukeMauldin avatar Jun 12 '23 14:06 LukeMauldin

@LukeMauldin on GNU/Linux:

  1. Install lld via your package manager
  2. Write this into $CARGO_HOME/config.toml:
[build]
rustflags = [ "-C", "link-arg=-fuse-ld=lld", ]

Or try mold instead of lld; it should be even faster, but not by much.

On macOS the process should be the same.

There are discussions going for years and years in https://github.com/rust-lang/rust about making lld the default linker, but they are very conservative and thus you have to enable lld manually.

Logarithmus avatar Jun 12 '23 14:06 Logarithmus

Unfortunately this did not work for me. I got the errors below. Note this is on OSX 13.4 on an MBP M1.

Install steps:

  1. brew install llvm
  2. updated ~/.cargo/config.toml with text:
[build]
rustflags = [ "-C", "link-arg=-fuse-ld=/opt/homebrew/opt/llvm/bin/ld.lld", ]

Errors:
= note: ld.lld: error: unknown argument '-dynamic', did you mean '-Bdynamic'
ld.lld: error: unknown argument '-arch'
ld.lld: error: unknown argument '-platform_version'
ld.lld: error: unknown argument '-syslibroot'
ld.lld: error: unknown argument '-dead_strip'
ld.lld: error: unable to find library -lto_library
ld.lld: error: /Library/Developer/CommandLineTools/usr/lib/libLTO.dylib: unknown file type
ld.lld: error: cannot open arm64: No such file or directory
ld.lld: error: cannot open macos: No such file or directory
ld.lld: error: cannot open 13.0.0: No such file or directory
ld.lld: error: cannot open 13.3: No such file or directory
ld.lld: error: cannot open /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk: Is a directory
ld.lld: error: /var/folders/68/14vhhkf95679cyffpg8ztjfh0000gn/T/rustcA1Gpny/symbols.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/serde_derive-2b7ba707f1727866/build_script_build-2b7ba707f1727866.build_script_build.9cb824ed-cgu.0.rcgu.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/serde_derive-2b7ba707f1727866/build_script_build-2b7ba707f1727866.build_script_build.9cb824ed-cgu.1.rcgu.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/serde_derive-2b7ba707f1727866/build_script_build-2b7ba707f1727866.build_script_build.9cb824ed-cgu.10.rcgu.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/serde_derive-2b7ba707f1727866/build_script_build-2b7ba707f1727866.build_script_build.9cb824ed-cgu.11.rcgu.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/serde_derive-2b7ba707f1727866/build_script_build-2b7ba707f1727866.build_script_build.9cb824ed-cgu.12.rcgu.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/serde_derive-2b7ba707f1727866/build_script_build-2b7ba707f1727866.build_script_build.9cb824ed-cgu.13.rcgu.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/serde_derive-2b7ba707f1727866/build_script_build-2b7ba707f1727866.build_script_build.9cb824ed-cgu.14.rcgu.o: unknown file type
ld.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
clang: error: linker command failed with exit code 1 (use -v to see invocation)

LukeMauldin avatar Jun 12 '23 14:06 LukeMauldin

@LukeMauldin -fuse-ld= option doesn't accept the path to linker binary, try this instead:

[build]
rustflags = [ "-C", "link-arg=--ld-path=/opt/homebrew/opt/llvm/bin/ld.lld" ]

Logarithmus avatar Jun 12 '23 14:06 Logarithmus

Same problem. Updated file:
[build]
rustflags = [ "-C", "link-arg=--ld-path=/opt/homebrew/opt/llvm/bin/ld.lld" ]

Errors:
= note: ld.lld: error: unknown argument '-dynamic', did you mean '-Bdynamic'
ld.lld: error: unknown argument '-arch'
ld.lld: error: unknown argument '-platform_version'
ld.lld: error: unknown argument '-syslibroot'
ld.lld: error: unknown argument '-dead_strip'
ld.lld: error: unable to find library -lto_library
ld.lld: error: /Library/Developer/CommandLineTools/usr/lib/libLTO.dylib: unknown file type
ld.lld: error: cannot open arm64: No such file or directory
ld.lld: error: cannot open macos: No such file or directory
ld.lld: error: cannot open 13.0.0: No such file or directory
ld.lld: error: cannot open 13.3: No such file or directory
ld.lld: error: cannot open /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk: Is a directory
ld.lld: error: /var/folders/68/14vhhkf95679cyffpg8ztjfh0000gn/T/rustcYoMp2o/symbols.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/libc-9b3dc823f7a9bab5/build_script_build-9b3dc823f7a9bab5.build_script_build.2152a42e-cgu.0.rcgu.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/libc-9b3dc823f7a9bab5/build_script_build-9b3dc823f7a9bab5.build_script_build.2152a42e-cgu.1.rcgu.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/libc-9b3dc823f7a9bab5/build_script_build-9b3dc823f7a9bab5.build_script_build.2152a42e-cgu.10.rcgu.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/libc-9b3dc823f7a9bab5/build_script_build-9b3dc823f7a9bab5.build_script_build.2152a42e-cgu.11.rcgu.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/libc-9b3dc823f7a9bab5/build_script_build-9b3dc823f7a9bab5.build_script_build.2152a42e-cgu.12.rcgu.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/libc-9b3dc823f7a9bab5/build_script_build-9b3dc823f7a9bab5.build_script_build.2152a42e-cgu.13.rcgu.o: unknown file type
ld.lld: error: /Users/lukemauldin/code/github.com/LukeMauldin/rust-graphql-perf/target/debug/build/libc-9b3dc823f7a9bab5/build_script_build-9b3dc823f7a9bab5.build_script_build.2152a42e-cgu.14.rcgu.o: unknown file type
ld.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
clang: error: linker command failed with exit code 1 (use -v to see invocation)

LukeMauldin avatar Jun 12 '23 14:06 LukeMauldin

@LukeMauldin try replacing ld.lld with ld64.lld; see more info here: https://lld.llvm.org/MachO/index.html#using-lld

Basically for GNU/Linux you need ld.lld and for macOS you need ld64.lld

Logarithmus avatar Jun 12 '23 15:06 Logarithmus

I updated OSX to use ld64.lld and this time there were no errors. Unfortunately it did not improve my compile times. The difference was 5% at most.

LukeMauldin avatar Jun 12 '23 15:06 LukeMauldin

I updated OSX to use ld64.lld and this time there were no errors. Unfortunately it did not improve my compile times. The difference was 5% at most.

Yeah. Linker is not the problem.

@LukeMauldin I can suggest the amazing https://github.com/bjorn3/rustc_codegen_cranelift.

frederikhors avatar Jun 12 '23 16:06 frederikhors

Has anyone tested cranelift with the async-graphql crate? Even if cranelift works, I have been doing Rust development for the past 5 years and I have never run across a crate that needs an experimental development compiler to get decent compile times. What can be done to fix the crate?

LukeMauldin avatar Jun 12 '23 17:06 LukeMauldin

I'm using cranelift and it works amazingly! They soon will switch to stable rust.

frederikhors avatar Jun 12 '23 17:06 frederikhors

@LukeMauldin we are struggling with long compile times and graphQL as well. https://github.com/graphql-rust/juniper is slightly better but still slow... but give it a try

Logarithmus avatar Jun 12 '23 17:06 Logarithmus

I dug in a bit here and found some good data. Steps to gather this data:

  1. Checkout rust and build custom toolchain with debug = true set in config.toml.
  2. Checkout https://github.com/LukeMauldin/rust-graphql-perf
  3. export RUSTC_LOG=[typeck{key=rust-graphql-perf}]
  4. cargo +nightly check -p rust-graphql-perf

That spits out a lot of data. With some grepping and sorting I was able to get it down to the following output (filtered to only instance0 data):

├─  1ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:359 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#18}::into_gql), type_dependent_defs: UnordMap { inner: {} }, field
├─  1ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:376 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#19}::end_transition_mins), type_dependent_defs: UnordMap { inner: 
├─  2ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:335 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#15}::title), type_dependent_defs: UnordMap { inner: {} }, field_in
├─  2ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:339 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#16}::to_date), type_dependent_defs: UnordMap { inner: {} }, field_
├─  3ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:281 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#8}::clone), type_dependent_defs: UnordMap { inner: {} }, field_ind
├─  3ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:336 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#15}::description), type_dependent_defs: UnordMap { inner: {} }, fi
├─  3ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:338 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#16}::from_date), type_dependent_defs: UnordMap { inner: {} }, fiel
├─  3ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:341 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#16}::end_transition_mins), type_dependent_defs: UnordMap { inner: 
├─  3ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:342 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#16}::guest_min_count), type_dependent_defs: UnordMap { inner: {} }
├─  3ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:343 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#16}::guest_max_count), type_dependent_defs: UnordMap { inner: {} }
├─  4ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:266 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#6}::type_name), type_dependent_defs: UnordMap { inner: {} }, field
├─  4ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:304 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#12}::type_name), type_dependent_defs: UnordMap { inner: {} }, fiel
├─  4ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:425 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#22}::type_name), type_dependent_defs: UnordMap { inner: {} }, fiel
├─  7ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:325 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#13}::type_name), type_dependent_defs: UnordMap { inner: {} }, fiel
├─  8ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:331 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#13}::as_raw_value), type_dependent_defs: UnordMap { inner: {} }, f
├─ 11ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:21 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#0}::into_gql), type_dependent_defs: UnordMap { inner: {7: Ok((Assoc
├─ 45ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:283 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#9}::create0), type_dependent_defs: UnordMap { inner: {34: Ok((Asso
├─ 52ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:345 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#17}::clone), type_dependent_defs: UnordMap { inner: {} }, field_in
├─ 72ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:245 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#3}::event_instances0), type_dependent_defs: UnordMap { inner: {40:
├─ 74ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:428 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#22}::resolve), type_dependent_defs: UnordMap { inner: {8: Ok((Asso
├─ 75ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:269 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#6}::resolve), type_dependent_defs: UnordMap { inner: {8: Ok((Assoc
├─ 83ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:307 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#12}::resolve), type_dependent_defs: UnordMap { inner: {8: Ok((Asso
├─ 88ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:370 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#19}::to_date), type_dependent_defs: UnordMap { inner: {64: Ok((Ass
├─ 91ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:364 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#19}::description), type_dependent_defs: UnordMap { inner: {64: Ok(
├─ 91ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:382 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#19}::guest_max_count), type_dependent_defs: UnordMap { inner: {64:
├─ 92ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:373 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#19}::start_transition_mins), type_dependent_defs: UnordMap { inner
├─ 93ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:367 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#19}::from_date), type_dependent_defs: UnordMap { inner: {64: Ok((A
├─105ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:376 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#19}::end_transition_mins), type_dependent_defs: UnordMap { inner: 
├─112ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:329 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#13}::to_value), type_dependent_defs: UnordMap { inner: {160: Ok((A
├─166ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:417 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#21}::find_entity), type_dependent_defs: UnordMap { inner: {8: Ok((
├─167ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:296 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#11}::find_entity), type_dependent_defs: UnordMap { inner: {8: Ok((
├─168ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:258 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#5}::find_entity), type_dependent_defs: UnordMap { inner: {8: Ok((A
├─237ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:267 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#6}::create_type_info), type_dependent_defs: UnordMap { inner: {61:
├─293ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:305 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#12}::create_type_info), type_dependent_defs: UnordMap { inner: {61
├─412ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:328 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#13}::parse), type_dependent_defs: UnordMap { inner: {128: Ok((Asso
├─431ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:249 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#5}::resolve_field), type_dependent_defs: UnordMap { inner: {174: O
├─453ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:287 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#11}::resolve_field), type_dependent_defs: UnordMap { inner: {67: O
├─493ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:330 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#13}::federation_fields), type_dependent_defs: UnordMap { inner: {5
├─556ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:326 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#13}::create_type_info), type_dependent_defs: UnordMap { inner: {61

This data says that create_type_info is the most expensive thing to typecheck, federation_fields is the second most, etc.
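The grepping and sorting step can also be scripted. The following is a minimal sketch, not the tooling actually used for the output above: it assumes each line has the shape shown (a tree prefix, a duration in ms, then a `DefId(...)` whose last path segment is the method name), and the names `parse_line` and `total_by_method` are made up for illustration. It sums the durations per method name, which shows the cost per kind of generated method instead of per individual impl. The sample input is a handful of the lines above, truncated after the `DefId`.

```rust
use std::collections::HashMap;

/// Parse one log line into (duration in ms, method name).
fn parse_line(line: &str) -> Option<(u64, String)> {
    // Skip the tree-drawing prefix: the duration starts at the first digit.
    let start = line.find(|c: char| c.is_ascii_digit())?;
    let rest = &line[start..];
    let millis: u64 = rest[..rest.find("ms")?].parse().ok()?;
    // The method name is the last `::` segment inside `DefId(...)`.
    let def = &rest[rest.find("DefId(")?..];
    let name = def[..def.find(')')?].rsplit("::").next()?;
    Some((millis, name.to_string()))
}

/// Sum durations per method name, sorted from most to least expensive.
fn total_by_method(log: &str) -> Vec<(String, u64)> {
    let mut totals: HashMap<String, u64> = HashMap::new();
    for (millis, name) in log.lines().filter_map(parse_line) {
        *totals.entry(name).or_insert(0) += millis;
    }
    let mut sorted: Vec<(String, u64)> = totals.into_iter().collect();
    sorted.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    sorted
}

/// A few lines from the output above, truncated after the DefId.
const SAMPLE: &str = "\
├─  4ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:266 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#6}::type_name)
├─237ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:267 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#6}::create_type_info)
├─293ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:305 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#12}::create_type_info)
├─431ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:249 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#5}::resolve_field)
├─453ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:287 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#11}::resolve_field)
├─493ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:330 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#13}::federation_fields)
├─556ms DEBUG rustc_hir_typeck return=TypeckResults { hir_owner: DefId(0:326 ~ rust_graphql_perf[8d40]::controllers::instance0::{impl#13}::create_type_info)";

fn main() {
    for (name, millis) in total_by_method(SAMPLE) {
        println!("{millis:>5} ms  {name}");
    }
}
```

Grouped this way, the three create_type_info entries together account for more time than any single entry in the list.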

I'll keep digging and see what exactly is so expensive about those, but getting concrete data is a promising start.

hamiltop avatar Nov 16 '23 07:11 hamiltop

It's really slow. Consider splitting your crate into multiple sub-crates. 😄

Hi, I'm also concerned about compile time performance, especially as applications grow over time, and am therefore trying to split my code into sub-crates. While doing this I ran into one question. Below is a simplified version of the code, where I try to split the User and Tenant objects into separate crates. At the moment the tenant resolver in User's impl block returns the Tenant struct, and the users resolver of Tenant returns the User struct. Splitting these two structs into separate crates would create a cyclic dependency between the crates, which Rust does not allow. So my question is whether splitting this code into separate crates is even possible.

use async_graphql as gql; // alias assumed from the `gql::` paths below
use uuid::Uuid;

#[derive(gql::SimpleObject)]
#[graphql(complex)]
pub struct User {
    pub id: Uuid,
    pub name: String,
    pub tenant_id: Uuid,
}

#[gql::ComplexObject]
impl User {
    async fn tenant(&self, ctx: &gql::Context<'_>) -> Tenant {
        // get tenant by self.tenant_id
        todo!()
    }
}

#[derive(gql::SimpleObject)]
#[graphql(complex)]
pub struct Tenant {
    pub id: Uuid,
    pub name: String,
}

#[gql::ComplexObject]
impl Tenant {
    async fn users(&self, ctx: &gql::Context<'_>) -> Vec<User> {
        // get all users with tenant_id = self.id
        todo!()
    }
}
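One possible layout that avoids the cycle is sketched below, with modules standing in for crates. It is an assumption-laden illustration, not something tested with async-graphql in this thread: the plain data types sit together in a shared `model` crate, and the lookup logic moves into separate crates that depend only on `model`. The GraphQL types that refer to each other, together with their resolver impl blocks, would still have to live in a single top-level crate that calls these functions. The names `model`, `user_logic` and `tenant_logic` are hypothetical, and `u32` replaces `Uuid` to keep the sketch std-only.

```rust
/// Stand-in for a shared `model` crate holding the plain data types.
mod model {
    #[derive(Debug, Clone, PartialEq)]
    pub struct User {
        pub id: u32, // the original uses Uuid; u32 keeps the sketch std-only
        pub name: String,
        pub tenant_id: u32,
    }

    #[derive(Debug, Clone, PartialEq)]
    pub struct Tenant {
        pub id: u32,
        pub name: String,
    }
}

/// Stand-in for a hypothetical `user_logic` crate; depends only on `model`.
mod user_logic {
    use super::model::{Tenant, User};

    /// Find the tenant a user belongs to.
    pub fn tenant_of(user: &User, tenants: &[Tenant]) -> Option<Tenant> {
        tenants.iter().find(|t| t.id == user.tenant_id).cloned()
    }
}

/// Stand-in for a hypothetical `tenant_logic` crate; depends only on `model`.
mod tenant_logic {
    use super::model::{Tenant, User};

    /// Collect all users whose tenant_id matches the tenant.
    pub fn users_of(tenant: &Tenant, users: &[User]) -> Vec<User> {
        users.iter().filter(|u| u.tenant_id == tenant.id).cloned().collect()
    }
}

fn main() {
    let tenants = vec![model::Tenant { id: 1, name: "acme".to_string() }];
    let users = vec![
        model::User { id: 10, name: "ann".to_string(), tenant_id: 1 },
        model::User { id: 11, name: "bob".to_string(), tenant_id: 2 },
    ];
    println!("{:?}", user_logic::tenant_of(&users[0], &tenants));
    println!("{:?}", tenant_logic::users_of(&tenants[0], &users));
}
```

Whether a split like this reduces compile time is not measured anywhere in this thread; it only shows that the dependency graph can stay acyclic when the mutually referencing types share one crate.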

biwecka avatar Mar 02 '24 13:03 biwecka

I'm using cranelift and it works amazingly! They soon will switch to stable rust.

Indeed! The last couple of days I have been experimenting with various ways to improve the painfully slow 3m24s (or longer) re-compile times for release builds, and utilizing Cranelift as part of the equation has helped immensely.

Here's a table of recent compilation timings I recorded, with various configurations (table from this page, with permalink here):

[Image: table of compilation timings, 2024-05-06_10-14-06_node]

Note: Rows with "SW1" in the right-most cell are from my slower Windows desktop; those with "SW2" are from my recently-purchased (and faster) Linux laptop.

Arrow descriptions:

  • The rows with red arrows are the "status quo" as of a week ago, on my desktop; release re-compiles within Docker (as used for new prod deploys) took ~3.5 minutes to complete [col7], when changing a logging line in app-server's main.rs [col1]. (yes, quite painful for any sort of "trial based" debugging...)
  • The rows with yellow arrows are timings from a couple days ago, as seen on my new (and faster) Linux laptop, using the non-VM docker-engine [col2] instead of Docker Desktop (this avoids the overhead of a full VM layer, and is part of why I installed Linux as my OS on the laptop). No switch from LLVM to Cranelift, or enabling of incremental compilation yet.
  • The rows with green arrows (for desktop) or blue arrows (for new laptop) are the latest timings, using a hybrid "LLVM + Cranelift" compilation scheme, where the dependencies are compiled with LLVM with max optimization level [col3] (and thereafter cached), but then my app's own code is compiled with Cranelift [col3], for faster iteration (got the idea from here). In the lower rows, I also enable incremental compilation [col1] for further gains.

While I'm very happy with the speed gains above, there are a few things to note:

  • Cranelift output binaries are apparently significantly slower than those optimized using LLVM. I have not yet benchmarked the runtime performance, because setting up an organic/representative performance test is difficult for the type of frontend interface my website uses. However, two anecdotes I've found so far can be seen here and here [fig24]. (neither are quite the same scenario, but suggest an acceptable performance impact given the compile-time improvements)
  • I seem to recall it being "not recommended" to have incremental compilation enabled for production deploys, presumably because sometimes the incremental compilation can "get messed up", and produce buggy outputs even if the source code itself is fine. I know this has happened for me at least once in the past, for debug builds. So, if this ends up happening frequently, it may force me to turn it off again. (even if so though, that would leave me with 33s builds on my laptop, which is drastically better than the 3m24s+ builds I had a couple weeks ago)
  • While the compile times themselves are much improved, there is other overhead like uploading the docker image to a container registry -- so further optimizations are still worth seeking in those other areas. (I tried enabling LTO to get smaller images, but hit compile/link errors)
  • Longer term, it could make sense to use Cranelift for manual compiles/deploys (eg. when in a development cycle where you're forced to try/test changes in production) -- but then once the development session is done, kick off some cloud build that re-compiles and re-deploys soon afterward with max optimizations. If someone knows of an easy (and cheap) way to do this, please let me know.

Anyway, happy with the compile-time improvements -- and am just hoping now that they'll be able to "stick". (ie. not have severe side-effects that force me to revert them all!)

Venryx avatar May 06 '24 17:05 Venryx