rust Trivial code edit makes build times 10 times slower

I'm unable to reproduce the problem with a MVP, so I'll try to be more descriptive I can, sorry for that.

Preamble: I have this crate that holds all common code between like 10 applications. Those applications are simple services that takes data from a queue, does something and put the result in another queue, except for one that also exposes a good number of REST APIs using warp. This last one is the biggest application, and has already given me some headaches, for example to be run in debug you must specify RUST_MIN_STACK env var, otherwise warp route types will cause a stack overflow.

The incident: The other day I made an innocent code edit to that shared crate, my CI started rebuilding every project and, after 1 hour, it notified me that the REST API application build went timeout.

To show you the innocent code edit, I need to intruduce a bit of boilerplate:

use tokio::sync::OnceCell;

pub struct Model {
    currency: Family,
    details: Option<Details>,//json field
    packageable: OnceCell<AnotherModel>,
}

pub struct Details {
    reverse: Option<bool>,
}

pub struct AnotherModel {}

impl AnotherModel {
    pub fn counter_currency(&self, _currency: Family) -> Result<Family, VoidError> {
        unreachable!()
    }
}

#[derive(Copy, Clone)]
pub enum Family {}

pub trait ConnectionTrait {}

pub struct VoidError {}

impl Model {
    pub fn reverse(&self) -> bool {
        self.details.as_ref().and_then(|d| d.reverse).unwrap_or_default()
    }
    
    pub fn currency(&self) -> Family {
        self.currency
    }

    /// kind-of parent model
    pub async fn packageable<C>(&self, _conn: &C) -> Result<&AnotherModel, VoidError>
    where
        C: ConnectionTrait + std::fmt::Debug,
    {
        self.packageable
            .get_or_try_init(|| async {
                unreachable!()//get by database
            })
            .await
    }
}

This was the working code:

impl Model {
    pub async fn source_currency<C>(&self, conn: &C) -> Result<Family, VoidError>
    where
        C: ConnectionTrait + std::fmt::Debug,
    {
        if self.reverse() {
            self.packageable(conn).await?.counter_currency(self.currency())
        } else {
            Ok(self.currency())
        }
    }

    pub async fn destination_currency<C>(&self, conn: &C) -> Result<Family, VoidError>
    where
        C: ConnectionTrait + std::fmt::Debug,
    {
        if self.reverse() {
            Ok(self.currency())
        } else {
            self.packageable(conn).await?.counter_currency(self.currency())
        }
    }

    pub async fn counter_currency<C>(&self, conn: &C) -> Result<Family, VoidError>
    where
        C: ConnectionTrait + std::fmt::Debug,
    {
        if self.reverse() {
            self.source_currency(conn).await
        } else {
            self.destination_currency(conn).await
        }
    }
}

The edit consists in this:

impl Model {
    pub async fn source_currency<C>(&self, conn: &C) -> Result<Family, VoidError>
    where
        C: ConnectionTrait + std::fmt::Debug,
    {
        if self.reverse() {
            self.counter_currency(conn).await
        } else {
            Ok(self.currency())
        }
    }

    pub async fn destination_currency<C>(&self, conn: &C) -> Result<Family, VoidError>
    where
        C: ConnectionTrait + std::fmt::Debug,
    {
        if self.reverse() {
            Ok(self.currency())
        } else {
            self.counter_currency(conn).await
        }
    }

    pub async fn counter_currency<C>(&self, conn: &C) -> Result<Family, VoidError>
    where
        C: ConnectionTrait + std::fmt::Debug,
    {
        self.packageable(conn).await?.counter_currency(self.currency())
    }
}

As you can see, it's nothing trascendental, there is no recursion, etc...

Benchmarks: My machine is slightly faster than the CI one. old code: With opt-level = 3 it takes 5 minutes With opt-level = 2 it takes 5 minutes With opt-level = 1 it takes 5 minutes With opt-level = 0 it takes 4 minutes new code: With opt-level = 3 it takes 45 minutes With opt-level = 2 it takes 6 minutes With opt-level = 1 it takes 5 minutes With opt-level = 0 it takes 3 minutes

Last part of new code build consists in a single rustc process taking 100% of a single CPU and no-so-much RAM. Launching with CARGO_LOG=debug or even CARGO_LOG=trace doesn't help a lot, last messages before a really long silence are like

[2022-10-14T13:15:13Z DEBUG cargo_util::paths] invocation time for ".../target/release/.fingerprint/project-b64972d2f705716e" is 1665753313.923454678s

or

[2022-10-14T13:15:14Z TRACE cargo::core::compiler] found directive from rustc: `.../target/release/deps/project-b64972d2f705716e.d`

Oct 17 '22 13:10 nappa85

Could you use perf top (if you're on linux) or something similar to find out where rustc is spending all of its time? Alternatively, building rustc yourself and then using RUSTC_LOG=debug or trace could help diagnose where it gets stuck.

Oct 17 '22 14:10 Noratrieb

Could you use perf top (if you're on linux) or something similar to find out where rustc is spending all of its time? Alternatively, building rustc yourself and then using RUSTC_LOG=debug or trace could help diagnose where it gets stuck.

Sorry, never used perf top before, if there is something specific I need to do, please tell me.

Simply lanching perf top while the project is building with new code, I've been able to see several changes in the top red part, but the names where more or less always those, sometimes combined between them:

  41,34%  libLLVM-14-rust-1.64.0-stable.so     [.] llvm::ScalarEvolution::isLoopBackedgeGuardedByCond
  40,30%  libLLVM-14-rust-1.64.0-stable.so     [.] llvm::ScalarEvolution::isBasicBlockEntryGuardedByCond

  64,14%  libLLVM-14-rust-1.64.0-stable.so       [.] llvm::GVNPass::processBlock

  78,03%  libLLVM-14-rust-1.64.0-stable.so       [.] llvm::GVNPass::performPRE

  68,14%  libLLVM-14-rust-1.64.0-stable.so  [.] llvm::MachineFunction::getOrCreateLandingPadInfo

  85,31%  libLLVM-14-rust-1.64.0-stable.so         [.] llvm::VirtRegAuxInfo::weightCalcHelper

  87,18%  libLLVM-14-rust-1.64.0-stable.so         [.] (anonymous namespace)::MachineBlockPlacement::buildChain

  86,90%  libLLVM-14-rust-1.64.0-stable.so         [.] (anonymous namespace)::MachineBlockPlacement::selectBestCandidateBlock

Oct 17 '22 15:10 nappa85

If this function is where it spent its time hanging around, then that sounds like an LLVM bug. @rustbot label +A-llvm

Oct 17 '22 15:10 Noratrieb

I think it will be hard to do anything about this without some way to reproduce. Presumably what is happening is that LLVM blows up your code into a huge function (possibly as a result of inlining) and then transforms struggle to deal with it. But it's hard to say any more than that.

Nov 01 '22 09:11 nikic

@rustbot label +E-needs-mcve

Nov 01 '22 13:11 Rageking8

I think it will be hard to do anything about this without some way to reproduce. Presumably what is happening is that LLVM blows up your code into a huge function (possibly as a result of inlining) and then transforms struggle to deal with it. But it's hard to say any more than that.

Yes, I think it's like that, because reintroducing the same code edits now, after other, unrelated, code edits, doesn't trigger the same behavior

Nov 01 '22 13:11 nappa85

I know this issue has been closed, but I wanted to memorialize that I have experienced this problem under Rust 1.67 (rustc 1.67.1 (d5a82bbd2 2023-02-07)). I thought the problem was fixed when I upgraded to Rust 1.70 (rustc 1.70.0 (90c541806 2023-05-31), the first couple of builds seemed faster, but after those builds, the problem returned.

Jul 01 '23 01:07 c-lewis

rust rust copied to clipboard

Trivial code edit makes build times 10 times slower

rust
rust copied to clipboard