How to solve the "This analysis pass was not registered prior to being queried" error when dlopen-ing Enzyme for Rust std::autodiff

Open sgasho opened this issue 1 month ago • 10 comments

related pr: https://github.com/rust-lang/rust/pull/149271#issuecomment-3575989722

When using Enzyme via dlopen in rustc(std::autodiff), we get this assertion failure:

Assertion failed: (AnalysisPasses.count(PassT::ID()) &&
                    "This analysis pass was not registered prior to being queried")

I found that this is because DominatorTreeAnalysis and LoopAnalysis are not registered when TypeAnalyzer::TypeAnalyzer tries to access them.

I added code to register those passes in PreProcessCache::PreProcessCache (enzyme/Enzyme/FunctionUtils.cpp), as shown below, and the "This analysis pass was not registered prior to being queried" error disappeared.

[Image: screenshot of the registration code added to PreProcessCache::PreProcessCache]

However, the comment "Explicitly chose AA passes that are stateless and will not be invalidated" in that code suggests that this is not the right place for those registrations.
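For readers unfamiliar with the assertion: it is a registration check, meaning an analysis can only be queried from an analysis manager after it has been registered there. The following is a minimal stdlib-only sketch of that check under a simplified model. All `Toy*` names are hypothetical, this is not LLVM's actual implementation, and it throws instead of asserting so that the failure can be observed.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Toy stand-in for an analysis key: identity is the address of a static object.
struct ToyAnalysisKey {};

// Hypothetical analyses named after the two that were reported missing.
struct ToyDominatorTreeAnalysis {
  static ToyAnalysisKey Key;
  static ToyAnalysisKey *ID() { return &Key; }
  static std::string run() { return "domtree"; }
};
ToyAnalysisKey ToyDominatorTreeAnalysis::Key;

struct ToyLoopAnalysis {
  static ToyAnalysisKey Key;
  static ToyAnalysisKey *ID() { return &Key; }
  static std::string run() { return "loops"; }
};
ToyAnalysisKey ToyLoopAnalysis::Key;

// Simplified model of a function analysis manager (assumption, not LLVM code).
class ToyAnalysisManager {
  std::map<ToyAnalysisKey *, std::function<std::string()>> AnalysisPasses;

public:
  // Returns true if newly registered, false if the analysis was already there.
  template <typename PassT> bool registerPass() {
    return AnalysisPasses.emplace(PassT::ID(), &PassT::run).second;
  }

  // Querying an unregistered analysis is an error (an assertion in LLVM).
  template <typename PassT> std::string getResult() const {
    auto It = AnalysisPasses.find(PassT::ID());
    if (It == AnalysisPasses.end())
      throw std::logic_error(
          "This analysis pass was not registered prior to being queried");
    return It->second();
  }
};

// Usage: the query fails first, then succeeds once the analysis is registered.
inline bool queryFailsUntilRegistered() {
  ToyAnalysisManager FAM;
  bool Failed = false;
  try {
    FAM.getResult<ToyLoopAnalysis>();
  } catch (const std::logic_error &) {
    Failed = true;
  }
  bool Added = FAM.registerPass<ToyLoopAnalysis>();
  assert(Added && "first registration should succeed");
  return Failed && Added && FAM.getResult<ToyLoopAnalysis>() == "loops";
}
```

In this model, registering an analysis before the first query makes the lookup succeed, which matches the effect of the workaround above. It says nothing about where the real registration belongs.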

I'm looking for the best solution for this problem.

sgasho avatar Nov 30 '25 08:11 sgasho

Weird, can you make a reproducer outside of Rust (e.g. a simple C++ file that links against libEnzyme and reproduces the error)?

I've never seen a similar issue from this before so getting a MWE would be useful to understand what's going wrong when rust tries linking

wsmoses avatar Dec 01 '25 05:12 wsmoses

@wsmoses cc: @ZuseZ4

Is this C++ code appropriate for reproducing the error?

LLVM: 21.1.5

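#include "llvm/AsmParser/Parser.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Support/InitLLVM.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/raw_ostream.h"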

extern "C" void registerEnzymeAndPassPipeline(llvm::PassBuilder &PB, bool augment);

static auto IR = R"(
; ModuleID = 'm'
target triple = "arm64-apple-darwin"

declare double @__enzyme_autodiff(double (double)*, double)

define double @square(double %x) {
  %mul = fmul double %x, %x
  ret double %mul
}

define double @run(double %x) {
  %call = call double @__enzyme_autodiff(double (double)* @square, double %x)
  ret double %call
}
)";

int main(int argc, char **argv)
{
  llvm::InitLLVM X(argc, argv);
  llvm::LLVMContext Ctx;
  llvm::SMDiagnostic Err;
  auto Buffer = llvm::MemoryBuffer::getMemBuffer(IR, "m", false);
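  auto M = llvm::parseAssembly(*Buffer, Err, Ctx);
  if (!M)
  {
    // Report the parse failure instead of dereferencing a null module below.
    Err.print(argv[0], llvm::errs());
    return 1;
  }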

  llvm::PassBuilder PB;
  llvm::LoopAnalysisManager LAM;
  llvm::FunctionAnalysisManager FAM;
  llvm::CGSCCAnalysisManager CGAM;
  llvm::ModuleAnalysisManager MAM;

  PB.registerModuleAnalyses(MAM);
  PB.registerCGSCCAnalyses(CGAM);
  PB.registerFunctionAnalyses(FAM);
  PB.registerLoopAnalyses(LAM);
  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);

  registerEnzymeAndPassPipeline(PB, /*augment*/ false);

  llvm::ModulePassManager MPM;
  if (auto ErrStr = PB.parsePassPipeline(MPM, "enzyme"))
  {
    llvm::errs() << "failed to parse Enzyme pipeline: " << toString(std::move(ErrStr)) << "\n";
    return 1;
  }

  MPM.run(*M, MAM);
  return 0;
}

I got the "This analysis pass was not registered prior to being queried" error:

/Users/suganoshota/Desktop/enzyme-playground % ./build.sh
/Users/suganoshota/Desktop/enzyme-playground % ./mwe     
Assertion failed: (AnalysisPasses.count(PassT::ID()) && "This analysis pass was not registered prior to being queried"), function getResult, file PassManager.h, line 414.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: ./mwe
1.      Running pass "EnzymeNewPM" on module "m"
 #0 0x000000010e219f0c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/llvm/lib/libLLVM.dylib+0x165f0c)
 #1 0x000000010e217ba0 llvm::sys::RunSignalHandlers() (/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/llvm/lib/libLLVM.dylib+0x163ba0)
 #2 0x000000010e21aa04 SignalHandler(int, __siginfo*, void*) (/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/llvm/lib/libLLVM.dylib+0x166a04)
 #3 0x00000001860b3744 (/usr/lib/system/libsystem_platform.dylib+0x1804e3744)
 #4 0x00000001860a9888 (/usr/lib/system/libsystem_pthread.dylib+0x1804d9888)
 #5 0x0000000185fae850 (/usr/lib/system/libsystem_c.dylib+0x1803de850)
 #6 0x0000000185fada84 (/usr/lib/system/libsystem_c.dylib+0x1803dda84)
 #7 0x00000001033d9734 TypeAnalyzer::TypeAnalyzer(FnTypeInfo const&, TypeAnalysis&, unsigned char) (/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/enzyme/lib/libEnzyme-21.dylib+0x465734)
 #8 0x00000001034127fc TypeAnalysis::analyzeFunction(FnTypeInfo const&) (/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/enzyme/lib/libEnzyme-21.dylib+0x49e7fc)
 #9 0x000000010327ba5c (anonymous namespace)::EnzymeBase::HandleAutoDiff(llvm::Instruction*, unsigned int, llvm::Value*, llvm::Type*, llvm::SmallVectorImpl<llvm::Value*>&, std::__1::map<int, llvm::Type*, std::__1::less<int>, std::__1::allocator<std::__1::pair<int const, llvm::Type*>>> const&, std::__1::vector<DIFFE_TYPE, std::__1::allocator<DIFFE_TYPE>> const&, llvm::Function*, DerivativeMode, (anonymous namespace)::EnzymeBase::Options&, bool, llvm::SmallVectorImpl<llvm::CallInst*>&) (/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/enzyme/lib/libEnzyme-21.dylib+0x307a5c)
#10 0x00000001032760a0 (anonymous namespace)::EnzymeBase::HandleAutoDiffArguments(llvm::CallInst*, DerivativeMode, bool, llvm::SmallVectorImpl<llvm::CallInst*>&) (/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/enzyme/lib/libEnzyme-21.dylib+0x3020a0)
#11 0x00000001032716d8 (anonymous namespace)::EnzymeBase::lowerEnzymeCalls(llvm::Function&, std::__1::set<llvm::Function*, std::__1::less<llvm::Function*>, std::__1::allocator<llvm::Function*>>&) (/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/enzyme/lib/libEnzyme-21.dylib+0x2fd6d8)
#12 0x000000010326c974 (anonymous namespace)::EnzymeBase::run(llvm::Module&) (/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/enzyme/lib/libEnzyme-21.dylib+0x2f8974)
#13 0x000000010328a6bc llvm::detail::PassModel<llvm::Module, EnzymeNewPM, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/enzyme/lib/libEnzyme-21.dylib+0x3166bc)
#14 0x000000010e3e327c llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/llvm/lib/libLLVM.dylib+0x32f27c)
#15 0x00000001021cd94c main (/Users/suganoshota/Desktop/enzyme-playground/mwe+0x10000194c)
#16 0x0000000185ce1d54
zsh: abort      ./mwe

FYI: build.sh

#!/usr/bin/env bash
set -euo pipefail

LLVM_CONFIG=${LLVM_CONFIG:-/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/llvm/bin/llvm-config}
ENZYME_LIBDIR=${ENZYME_LIBDIR:-/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/enzyme/lib}
CXX=${CXX:-/usr/bin/clang++}
ARCH_TARGET=${ARCH_TARGET:-arm64-apple-macos12}

${CXX} -target ${ARCH_TARGET} main.cpp \
  $(${LLVM_CONFIG} --cxxflags --ldflags --system-libs --libs core native ipo passes analysis irreader) \
  -L${ENZYME_LIBDIR} \
  -Wl,-rpath,$(${LLVM_CONFIG} --libdir) -Wl,-rpath,${ENZYME_LIBDIR} \
  -lEnzyme-21 -o mwe

sgasho avatar Dec 06 '25 11:12 sgasho

For reference: in my PR (feat: dlopen Enzyme rust-lang/rust#149271) I mainly looked at LLVMRustOptimize (in compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp in the Rust codebase) and extracted its core logic into the C++ reproducer.

The Rust side sets enzyme_fn = registerEnzymeAndPassPipeline and passes it to LLVMRustOptimize as EnzymePtr:

https://github.com/sgasho/rust/blob/52f0d2ec4fb3db116aff7d35fb0ee61c70d03eda/compiler/rustc_codegen_llvm/src/back/write.rs#L729-L775

LLVMRustOptimize runs the PB.registerXXXXX calls, then EnzymePtr (= registerEnzymeAndPassPipeline), and finally MPM.run:

https://github.com/sgasho/rust/blob/52f0d2ec4fb3db116aff7d35fb0ee61c70d03eda/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp#L657-L661

https://github.com/sgasho/rust/blob/52f0d2ec4fb3db116aff7d35fb0ee61c70d03eda/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp#L900-L929
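The flow described above (the host receives an entry point as a raw function pointer, calls it on the pass builder, and only then parses the "enzyme" pipeline name) can be sketched with the standard library alone. This is a toy model under stated assumptions: every `Toy*`/`toy*` name is hypothetical, and the real PassBuilder and registerEnzymeAndPassPipeline do considerably more than this.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy pipeline: just the names of the passes that would run, in order.
using ToyPipeline = std::vector<std::string>;
using ToyParseCallback =
    std::function<bool(const std::string &, ToyPipeline &)>;

// Toy stand-in for a pass builder that plugins can extend with new pass names.
struct ToyPassBuilder {
  std::vector<ToyParseCallback> Callbacks;

  void registerPipelineParsingCallback(ToyParseCallback CB) {
    Callbacks.push_back(std::move(CB));
  }

  // Returns true if some registered callback recognised the name.
  bool parsePassPipeline(ToyPipeline &MPM, const std::string &Name) const {
    for (const auto &CB : Callbacks)
      if (CB(Name, MPM))
        return true;
    return false;
  }
};

// Shape of the entry point the host receives as a raw function pointer.
using ToyPluginEntry = void (*)(ToyPassBuilder &, bool);

// Hypothetical plugin side: teaches the builder the "enzyme" pipeline name.
void toyRegisterEnzyme(ToyPassBuilder &PB, bool /*augment*/) {
  PB.registerPipelineParsingCallback(
      [](const std::string &Name, ToyPipeline &MPM) {
        if (Name != "enzyme")
          return false;
        MPM.push_back("EnzymeNewPM");
        return true;
      });
}

// Host side, in the order described above: call the entry point, then parse.
bool toyOptimize(ToyPluginEntry EnzymePtr, ToyPipeline &MPM) {
  ToyPassBuilder PB;
  if (EnzymePtr)
    EnzymePtr(PB, /*augment*/ false);
  return PB.parsePassPipeline(MPM, "enzyme");
}

// Usage: with the entry point, "enzyme" parses and EnzymeNewPM is appended.
inline void toyUsage() {
  ToyPipeline MPM{"globaldce"};
  bool Parsed = toyOptimize(&toyRegisterEnzyme, MPM);
  assert(Parsed && MPM.back() == "EnzymeNewPM");
  (void)Parsed;
}
```

In the model, the pipeline name only parses if the entry point ran first, which is why the order of the calls matters to the host.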

sgasho avatar Dec 06 '25 14:12 sgasho

That's great work, thank you for debugging this.

ping @wsmoses seems like he found the issue. Any thoughts?

The main thing I find confusing is that I'd expect a full O3 pipeline to be registered here, since we run that before Enzyme so that Enzyme can reuse analysis results: https://github.com/sgasho/rust/blob/a591113c0a2b7755514c47bde211fdb92d1d7002/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp#L910

However, this one still seems similar enough: https://github.com/rust-lang/rust/pull/149271/files#diff-eac3d5ef63f39b7914ae9015f31b8d87353ea8fbcd5b02a7335dd86c557e7625

Can you print the registered passes on Enzyme before & after calling registerEnzymeAndPassPipeline on your branch?

These are what I get on main when using RUSTFLAGS=-Z=autodiff=Enable,PrintPasses:

cross-dso-cfi,openmp-opt,globaldce<vfe-linkage-unit-visibility>,inferattrs,function<eager-inv>(callsite-splitting),pgo-icall-prom,cgscc(function-attrs,argpromotion,function(sroa<modify-cfg>)),ipsccp,called-value-propagation,rpo-function-attrs,globalsplit,wholeprogramdevirt,coro-early,globalopt,function(mem2reg),constmerge,deadargelim,function<eager-inv>(instcombine<max-iterations=1;no-verify-fixpoint>,aggressive-instcombine),expand-variadics,cgscc(inline<only-mandatory>,inline),globalopt,openmp-opt,globaldce<vfe-linkage-unit-visibility>,cgscc(argpromotion,coro-split,coro-annotation-elide),function<eager-inv>(instcombine<max-iterations=1;no-verify-fixpoint>,constraint-elimination,jump-threading,sroa<modify-cfg>,tailcallelim),cgscc(function-attrs),require<globals-aa>,function(invalidate<aa>),cgscc(openmp-opt-cgscc),function<eager-inv>(loop-mssa(licm<allowspeculation>),gvn<>,memcpyopt,dse,move-auto-init,mldst-motion<no-split-footer-bb>,loop(indvars,loop-deletion,loop-unroll-full),loop-distribute,loop-vectorize<interleave-forced-only;vectorize-forced-only;>,infer-alignment,loop-unroll<O3>,transform-warning,sroa<preserve-cfg>,instcombine<max-iterations=1;no-verify-fixpoint>,simplifycfg<bonus-inst-threshold=1;forward-switch-cond;switch-range-to-icmp;switch-to-lookup;no-keep-loops;hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,sccp,instcombine<max-iterations=1;no-verify-fixpoint>,bdce,vector-combine,infer-alignment,instcombine<max-iterations=1;no-verify-fixpoint>,loop-mssa(licm<allowspeculation>),alignment-from-assumptions,jump-threading),lowertypetests,lowertypetests,function(loop-sink,div-rem-pairs,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;speculate-unpredictables>),elim-avail-extern,globaldce<vfe-linkage-unit-visibility>,rel-lookup-table-converter,cg-profile,coro-cleanup,function(annotation-remarks),EnzymeNewPM

ZuseZ4 avatar Dec 08 '25 12:12 ZuseZ4

Can you print the registered passes on Enzyme before & after calling registerEnzymeAndPassPipeline on your branch?

The outputs I got are not so similar to the above one, but anyway I'd like to share them.

Tell me if I'm doing something wrong. I ran ./x test --stage 1 tests/codegen-llvm/autodiff/scalar.rs to print the passes.

Before registerEnzymeAndPassPipeline(EnzymePtr)

annotation2metadata,forceattrs,inferattrs,coro-early,function<eager-inv>(ee-instrument<>,lower-expect,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;no-switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,sroa<modify-cfg>,early-cse<>,callsite-splitting),openmp-opt,ipsccp,called-value-propagation,globalopt,function<eager-inv>(mem2reg,instcombine<max-iterations=1;no-verify-fixpoint>,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>),always-inline,require<globals-aa>,function(invalidate<aa>),require<profile-summary>,cgscc(devirt<4>(inline,function-attrs<skip-non-recursive-function-attrs>,argpromotion,openmp-opt-cgscc,function<eager-inv;no-rerun>(sroa<modify-cfg>,early-cse<memssa>,speculative-execution<only-if-divergent-target>,jump-threading,correlated-propagation,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,instcombine<max-iterations=1;no-verify-fixpoint>,aggressive-instcombine,libcalls-shrinkwrap,tailcallelim,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,reassociate,constraint-elimination,loop-mssa(loop-instsimplify,loop-simplifycfg,licm<no-allowspeculation>,loop-rotate<header-duplication;prepare-for-lto>,licm<allowspeculation>,simple-loop-unswitch<nontrivial;trivial>),simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,instcombine<max-iterations=1;no-verify-fixpoint>,loop(loop-idiom,indvars,extra-simple-loop-unswitch-passes,loop-idiom-vectorize,loop-deletion,loop-unroll-full),sroa<modify-cfg>,vector-combine,mldst-motion<no-split-footer-bb>,gvn<>,sccp,bdce,instcombine<max-iterations=1;no-verify-fixpoint>,jump-threading,correlated-propagation,adce,memcpyopt,dse,move-auto-init,loop-mssa(licm<allowspeculation>),coro-elide,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,instcombine<max-iterations=1;no-verify-fixpoint>),function-attrs,function(require<should-not-run-function-passes>),coro-split,coro-annotation-elide)),deadargelim,coro-cleanup,globalopt,globaldce,rpo-function-attrs,recompute-globalsaa,function<eager-inv>(float2int,lower-constant-intrinsics,chr,loop(loop-rotate<header-duplication;prepare-for-lto>,loop-deletion),loop-distribute,inject-tli-mappings,loop-vectorize<interleave-forced-only;vectorize-forced-only;>,infer-alignment,loop-load-elim,instcombine<max-iterations=1;no-verify-fixpoint>,simplifycfg<bonus-inst-threshold=1;forward-switch-cond;switch-range-to-icmp;switch-to-lookup;no-keep-loops;hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,vector-combine,instcombine<max-iterations=1;no-verify-fixpoint>,loop-unroll<O3>,transform-warning,sroa<preserve-cfg>,infer-alignment,instcombine<max-iterations=1;no-verify-fixpoint>,loop-mssa(licm<allowspeculation>),alignment-from-assumptions,loop-sink,instsimplify,div-rem-pairs,tailcallelim,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;speculate-unpredictables>),globaldce,constmerge,function(annotation-remarks),canonicalize-aliases,name-anon-globals

After registerEnzymeAndPassPipeline

same as Before registerEnzymeAndPassPipeline

c.f. After PB.parsePassPipeline

Almost the same as before/after registerEnzymeAndPassPipeline, with EnzymeNewPM appended at the end.

annotation2metadata,forceattrs,inferattrs,coro-early,function<eager-inv>(ee-instrument<>,lower-expect,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;no-switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,sroa<modify-cfg>,early-cse<>,callsite-splitting),openmp-opt,ipsccp,called-value-propagation,globalopt,function<eager-inv>(mem2reg,instcombine<max-iterations=1;no-verify-fixpoint>,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>),always-inline,require<globals-aa>,function(invalidate<aa>),require<profile-summary>,cgscc(devirt<4>(inline,function-attrs<skip-non-recursive-function-attrs>,argpromotion,openmp-opt-cgscc,function<eager-inv;no-rerun>(sroa<modify-cfg>,early-cse<memssa>,speculative-execution<only-if-divergent-target>,jump-threading,correlated-propagation,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,instcombine<max-iterations=1;no-verify-fixpoint>,aggressive-instcombine,libcalls-shrinkwrap,tailcallelim,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,reassociate,constraint-elimination,loop-mssa(loop-instsimplify,loop-simplifycfg,licm<no-allowspeculation>,loop-rotate<header-duplication;prepare-for-lto>,licm<allowspeculation>,simple-loop-unswitch<nontrivial;trivial>),simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,instcombine<max-iterations=1;no-verify-fixpoint>,loop(loop-idiom,indvars,extra-simple-loop-unswitch-passes,loop-idiom-vectorize,loop-deletion,loop-unroll-full),sroa<modify-cfg>,vector-combine,mldst-motion<no-split-footer-bb>,gvn<>,sccp,bdce,instcombine<max-iterations=1;no-verify-fixpoint>,jump-threading,correlated-propagation,adce,memcpyopt,dse,move-auto-init,loop-mssa(licm<allowspeculation>),coro-elide,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,instcombine<max-iterations=1;no-verify-fixpoint>),function-attrs,function(require<should-not-run-function-passes>),coro-split,coro-annotation-elide)),deadargelim,coro-cleanup,globalopt,globaldce,rpo-function-attrs,recompute-globalsaa,function<eager-inv>(float2int,lower-constant-intrinsics,chr,loop(loop-rotate<header-duplication;prepare-for-lto>,loop-deletion),loop-distribute,inject-tli-mappings,loop-vectorize<interleave-forced-only;vectorize-forced-only;>,infer-alignment,loop-load-elim,instcombine<max-iterations=1;no-verify-fixpoint>,simplifycfg<bonus-inst-threshold=1;forward-switch-cond;switch-range-to-icmp;switch-to-lookup;no-keep-loops;hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,vector-combine,instcombine<max-iterations=1;no-verify-fixpoint>,loop-unroll<O3>,transform-warning,sroa<preserve-cfg>,infer-alignment,instcombine<max-iterations=1;no-verify-fixpoint>,loop-mssa(licm<allowspeculation>),alignment-from-assumptions,loop-sink,instsimplify,div-rem-pairs,tailcallelim,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;speculate-unpredictables>),globaldce,constmerge,function(annotation-remarks),canonicalize-aliases,name-anon-globals,EnzymeNewPM

sgasho avatar Dec 10 '25 14:12 sgasho

Can you try compiling a standalone project, that you created with cargo new/init and a dummy autodiff call in main.rs? tests are always a little special in how they're getting compiled. In that case you should see multiple pipelines being printed, e.g. one for each dependency, one for the lib.rs (lib) build, one for the main.rs, etc. I can also see your compilation pipeline being used in my standalone usage, just EnzymeNewPM is then being appended to another pipeline.

ZuseZ4 avatar Dec 10 '25 14:12 ZuseZ4

Note: what I've found so far.

  1. Tests pass when I register DominatorTreeAnalysis & LoopAnalysis with the FAM in PreProcessCache::PreProcessCache
  2. DominatorTreeAnalysis & LoopAnalysis do exist in PassRegistry.def
  3. The error occurs in TypeAnalyzer::TypeAnalyzer (Enzyme/TypeAnalysis/TypeAnalysis.cpp)
    • trace: https://github.com/rust-lang/rust/pull/149271#issuecomment-3576359019
  4. TypeAnalyzer::TypeAnalyzer calls getResult for TargetLibraryAnalysis, DominatorTreeAnalysis, PostDominatorTreeAnalysis, LoopAnalysis, and ScalarEvolutionAnalysis
    • This means the results of TargetLibraryAnalysis, PostDominatorTreeAnalysis, and ScalarEvolutionAnalysis can be retrieved
    • I have no idea what differs between these and (DominatorTreeAnalysis, LoopAnalysis)
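One way to narrow this down is to check which of the five analyses are registered just before TypeAnalyzer queries them, rather than waiting for the assertion. The sketch below models that check with plain strings. The function names are hypothetical. LLVM's AnalysisManager appears to offer `isPassRegistered<PassT>()` for the real check, but treat that as an assumption to verify against the LLVM version in use.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Analyses queried by TypeAnalyzer::TypeAnalyzer, per the list above.
inline const std::vector<std::string> &requiredAnalyses() {
  static const std::vector<std::string> Names = {
      "TargetLibraryAnalysis", "DominatorTreeAnalysis",
      "PostDominatorTreeAnalysis", "LoopAnalysis", "ScalarEvolutionAnalysis"};
  return Names;
}

// Returns the required analyses absent from Registered, in query order.
inline std::vector<std::string>
missingAnalyses(const std::set<std::string> &Registered) {
  std::vector<std::string> Missing;
  for (const std::string &Name : requiredAnalyses())
    if (Registered.count(Name) == 0)
      Missing.push_back(Name);
  return Missing;
}

// Usage: with all five registered, nothing is reported as missing.
inline void probeUsage() {
  std::set<std::string> All(requiredAnalyses().begin(),
                            requiredAnalyses().end());
  assert(missingAnalyses(All).empty());
}
```

Fed with the three analyses that could be queried, this model reports the two that triggered the assertion.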

sgasho avatar Dec 10 '25 14:12 sgasho

Thanks for the hints. I checked using the following steps. Is anything wrong with them?

  1. ./x build --stage 1 library/std
  2. cargo new at another location
  3. copied and pasted this to main.rs
    • https://rustc-dev-guide.rust-lang.org/autodiff/internals.html
    • modified the main function to be empty. I could not build with bar(....) inside main; does it take too long, or is there an infinite loop in the build process?
  4. RUSTFLAGS="-Zautodiff=Enable,PrintPasses" cargo +stage1 build
  5. Got passes before/after registerEnzymeAndPassPipeline
    • cross-dso-cfi,wholeprogramdevirt,lowertypetests,lowertypetests,coro-cond(coro-early,cgscc(coro-split),coro-cleanup,globaldce),function(annotation-remarks)
    • ↑ seems short so I think I'm missing something

sgasho avatar Dec 10 '25 15:12 sgasho

Step 1+2 are fine, and 3 works with the main branch, so presumably it broke as part of the dlopen experiments. Or does it also fail for you if you use main?

For 4., can you try to build in release mode? Your pass list is indeed very short, but I get the same for dbg mode.

ZuseZ4 avatar Dec 10 '25 15:12 ZuseZ4

Ok, thanks! I'll check steps 3 and 4 tomorrow.

sgasho avatar Dec 10 '25 15:12 sgasho

I found that some bad lock operations in the Rust implementation in my PR caused a deadlock and triggered the "This analysis pass was not registered prior to being queried" error.

I fixed them and the error no longer appears.

I still wonder why a deadlock triggered this error, but I think I can close this issue now.

Discussion on Zulip: #t-compiler/help > libload / dlopen Enzyme/autodiff @ 💬

sgasho avatar Dec 13 '25 11:12 sgasho