cmdstan icon indicating copy to clipboard operation
cmdstan copied to clipboard

cmdstan stops working with Intel MKL

Open wds15 opened this issue 4 years ago • 11 comments

Summary:

The attached model used to work with previous (pre 2.28.0) MKL installations, but it fails to compile now with 2.28.0.

Description:

With the MKL being used the latest cmdstan 2.28.0 stops compiling with the model below.

make/local:

MKL_CBWR=AVX
CC = gcc
CXX = g++
C_MKL=-I ${MKLROOT}/include -march=ivybridge -DEIGEN_USE_MKL_ALL -Wno-enum-compare
LD_MKL=-L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lm -ldl -Wl,-rpath=${MKLROOT}/lib/intel64,--no-as-needed
# which need to be passed to compiler/linker options
CXXFLAGS += $(C_MKL)
CFLAGS += $(C_MKL)
LDFLAGS += $(LD_MKL)
STAN_THREADS=true

blrm_exnex-combo3_data_R.txt

blrm-exnex-28_stan.txt

Error message during compilation (g++ 8.2.0):

stan/lib/stan_math/stan/math/prim/fun/lb_constrain.hpp:175:49:   required from ‘auto stan::math::lb_constrain(const std::vector<T1>&, const L&, stan::return_type_t<T1, T2>&) [with T = std::vector<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, std::allocator<Eigen::Matrix<stan::math::var_value<double>, -1, 1> > >; L = int; stan::require_not_std_vector_t<T_high>* <anonymous> = 0; stan::return_type_t<T1, T2> = stan::math::var_value<double>]’
stan/src/stan/io/deserializer.hpp:386:38:   required from ‘auto stan::io::deserializer<T>::read_constrain_lb(const LB&, LP&, Sizes ...) [with Ret = std::vector<std::vector<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, std::allocator<Eigen::Matrix<stan::math::var_value<double>, -1, 1> > >, std::allocator<std::vector<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, std::allocator<Eigen::Matrix<stan::math::var_value<double>, -1, 1> > > > >; bool Jacobian = false; LB = int; LP = stan::math::var_value<double>; Sizes = {int, int, int}; T = stan::math::var_value<double>]’
/home/weberse2/t/blrm-exnex-28.hpp:4732:24:   required from ‘stan::scalar_type_t<T2> blrm_exnex_28_model_namespace::blrm_exnex_28_model::log_prob_impl(VecR&, VecI&, std::ostream*) const [with bool propto__ = false; bool jacobian__ = false; VecR = Eigen::Matrix<stan::math::var_value<double>, -1, 1>; VecI = Eigen::Matrix<int, -1, 1>; stan::require_vector_like_t<VecR>* <anonymous> = 0; stan::require_vector_like_vt<std::is_integral, VecI>* <anonymous> = 0; stan::scalar_type_t<T2> = stan::math::var_value<double>; std::ostream = std::basic_ostream<char>]’
/home/weberse2/t/blrm-exnex-28.hpp:6142:77:   required from ‘T_ blrm_exnex_28_model_namespace::blrm_exnex_28_model::log_prob(Eigen::Matrix<T_job_param, -1, 1>&, std::ostream*) const [with bool propto__ = false; bool jacobian__ = false; T_ = stan::math::var_value<double>; std::ostream = std::basic_ostream<char>]’
stan/src/stan/model/model_base_crtp.hpp:96:77:   required from ‘stan::math::var stan::model::model_base_crtp<M>::log_prob(Eigen::Matrix<stan::math::var_value<double>, -1, 1>&, std::ostream*) const [with M = blrm_exnex_28_model_namespace::blrm_exnex_28_model; stan::math::var = stan::math::var_value<double>; std::ostream = std::basic_ostream<char>]’
stan/src/stan/model/model_base_crtp.hpp:93:20:   required from here
stan/lib/stan_math/lib/eigen_3.3.9/Eigen/src/Core/ArrayWrapper.h:80:48: error: passing ‘const NestedExpressionType’ {aka ‘const Eigen::CwiseUnaryView<Eigen::MatrixBase<Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >::val_Op, Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >’} as ‘this’ argument discards qualifiers [-fpermissive]
       return m_expression.coeffRef(rowId, colId);
                                                ^
In file included from stan/lib/stan_math/lib/eigen_3.3.9/Eigen/Core:439,
                 from stan/lib/stan_math/lib/eigen_3.3.9/Eigen/Dense:1,
                 from stan/lib/stan_math/stan/math/prim/fun/Eigen.hpp:22,
                 from stan/lib/stan_math/stan/math/rev.hpp:4,
                 from stan/lib/stan_math/stan/math.hpp:19,
                 from stan/src/stan/model/model_header.hpp:4,
                 from /home/weberse2/t/blrm-exnex-28.hpp:3:
stan/lib/stan_math/lib/eigen_3.3.9/Eigen/src/Core/DenseCoeffsBase.h:340:33: note:   in call to ‘Eigen::DenseCoeffsBase<Derived, 1>::Scalar& Eigen::DenseCoeffsBase<Derived, 1>::coeffRef(Eigen::Index, Eigen::Index) [with Derived = Eigen::CwiseUnaryView<Eigen::MatrixBase<Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >::val_Op, Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >; Eigen::DenseCoeffsBase<Derived, 1>::Scalar = double; Eigen::Index = long int]’
     EIGEN_STRONG_INLINE Scalar& coeffRef(Index row, Index col)
                                 ^~~~~~~~

This is with MKL version 2019.1.144

Reproducible Steps:

Current Output:

The current output. Knowing what is the current behavior is useful.

Expected Output:

It should just work as it did with previous versions. The last version we have tested this setup ok is cmdstan 2.21.

Additional Information:

NA

Current Version:

v2.28.0

wds15 avatar Oct 11 '21 11:10 wds15

Just to double check, it is working with 2.27.0? I am not sure from your post as you mention 2.21.

rok-cesnovar avatar Oct 11 '21 12:10 rok-cesnovar

No.. 2.27.0 also fails with the same error.

wds15 avatar Oct 11 '21 14:10 wds15

2.26.0 works OK.

wds15 avatar Oct 11 '21 14:10 wds15

@andrjohns maybe this easy for you: To me it looks as if Eigen integrates with the MKL using the plugin system. This is apparently interfering with the use of the plugin system as coded up for Stan-math. There are comments in stan/math/prim/fun/Eigen.hpp which talk about multiple plugins, but they don't really make sense to me. Maybe you can have a look or comment? Let me know if you need more info. As stated above: Things stopped working 2.27.0 while they were ok with 2.26.0...if that helps. Thanks!

Have a look at: https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/Assign_MKL.h

wds15 avatar Oct 12 '21 07:10 wds15

Sounds interesting, will take a look!

andrjohns avatar Oct 12 '21 10:10 andrjohns

Alright, I can replicate locally with the MKL when just running the unit tests for the MatrixBase plugins. Will sort out a fix asap

andrjohns avatar Oct 12 '21 10:10 andrjohns

@wds15 & @rok-cesnovar, it's an Eigen bug, not Stan. The macros that Eigen are using for generating the MKL code (in the header that Sebastion linked before) are only defined for const inputs on which a CWiseUnaryOp can be called.

So when calling them on a non-const input, it throws the error about discarding attributes. I'll open a bug with Eigen and see if we can patch locally in the interim

andrjohns avatar Oct 12 '21 11:10 andrjohns

Thank you very much for digging into this!

rok-cesnovar avatar Oct 12 '21 11:10 rok-cesnovar

I'd think, that it would be nice to have this in our 2.28.1 release as well... if this is easy for you to handle and you got the time to do it soon.

Thanks!

wds15 avatar Oct 12 '21 11:10 wds15

Just a heads up that this is going to be a larger refactor and I don't think should hold up 2.28

andrjohns avatar Oct 14 '21 05:10 andrjohns

Thanks for the heads up…take your time.

wds15 avatar Oct 14 '21 05:10 wds15