
Enabling different linking options for MKL downstream.

Open Narsil opened this issue 8 months ago • 0 comments

When using candle/mkl right now, it forces the use of static linking.

However, this is not optimal in some circumstances. For instance, we need to hotpatch the library to get a faster runtime on non-Intel CPUs: https://github.com/huggingface/text-embeddings-inference/blob/main/Dockerfile#L39-L40

To make that practical, it's easier to use dynamic linking and rely on clever ordering to force-patch the library at runtime.

I made the change non breaking.

  • feature mkl becomes _mkl and doesn't include the actual intel-mkl-src/mkl-static-lp64-iomp feature, so it's agnostic to the linking procedure.
  • Adding a new mkl feature which solely adds mkl-static-lp64-iomp to keep backward compatibility in examples and downstream crates
  • Propagate to candle-nn
  • Didn't propagate to candle-transformers (user can simply opt-out of mkl altogether since no code actually uses it)
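
For reference, the feature split described above could look roughly like the following in a crate's Cargo.toml. This is a sketch, not the actual diff: the feature names _mkl and mkl and the intel-mkl-src/mkl-static-lp64-iomp feature come from this description, while the exact dependency list behind _mkl and whether mkl also pulls in _mkl are assumptions.

```toml
# Sketch of the [features] table after the change (assumed layout).
[features]
# Linking-agnostic feature: enables the MKL code paths and the optional
# dependency, but selects no intel-mkl-src linking feature.
# Assumption: intel-mkl-src is declared as an optional dependency.
_mkl = ["dep:intel-mkl-src"]

# Backward-compatible feature: same as before for existing users,
# i.e. MKL with static linking.
# Assumption: mkl also turns on _mkl.
mkl = ["_mkl", "intel-mkl-src/mkl-static-lp64-iomp"]
```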

So now power users who want to custom-link mkl only need to enable the _mkl feature and manually enable intel-mkl-src/mkl-dynamic-lp64-iomp in their own crate/binary.
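
A downstream manifest using this could look something like the sketch below. The crate name candle-core and the "*" version requirements are placeholders and assumptions, not part of this change; pin them to whatever your project actually uses. The intel-mkl-src requirement has to resolve to the same version candle depends on, otherwise Cargo builds two copies and the feature does not unify.

```toml
# Downstream Cargo.toml (sketch; versions are placeholders).
[dependencies]
# Enable only the linking-agnostic feature, not the static-linking mkl one.
candle-core = { version = "*", features = ["_mkl"] }

# Pick the linking mode yourself, here dynamic linking.
intel-mkl-src = { version = "*", features = ["mkl-dynamic-lp64-iomp"] }
```

Since Cargo unifies features across the dependency graph, no other crate in the build should enable the static feature (for example through candle's mkl feature), or both linking features end up enabled on intel-mkl-src.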

I haven't modified the documentation since it's a power user feature anyway, but I can do it if deemed necessary.

Narsil, Mar 28 '25 11:03