mistral.rs icon indicating copy to clipboard operation
mistral.rs copied to clipboard

Move supports_attn_softmax logic to build.rs

Open EricLBuehler opened this issue 1 year ago • 5 comments

@sgrebnov this is the solution I mentioned in #935. What do you think?

EricLBuehler avatar Jan 23 '25 00:01 EricLBuehler

Code Metrics Report
  ===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 C Header                2           35           28            0            7
 Dockerfile              1           41           22           10            9
 JSON                   12          105          104            0            1
 Python                 69         2926         2534           77          315
 Shell                   1           58           22           18           18
 Plain Text              3         3723            0         2413         1310
 TOML                   18          627          556            2           69
 YAML                    2           21           19            2            0
-------------------------------------------------------------------------------
 Jupyter Notebooks       4            0            0            0            0
 |- Markdown             2           77           32           31           14
 |- Python               2          205          178            1           26
 (Total)                            282          210           32           40
-------------------------------------------------------------------------------
 Markdown               46         3802            0         2891          911
 |- BASH                 6          103          100            0            3
 |- JSON                 1           12           12            0            0
 |- Python               7          121          109            0           12
 |- Rust                15          512          433            0           79
 |- TOML                 2           75           63            0           12
 (Total)                           4625          717         2891         1017
-------------------------------------------------------------------------------
 Rust                  309        99703        89365         1933         8405
 |- Markdown           149         1690           25         1540          125
 (Total)                         101393        89390         3473         8530
===============================================================================
 Total                 467       111041        92650         7346        11045
===============================================================================
  

github-actions[bot] avatar Jan 23 '25 00:01 github-actions[bot]

@EricLBuehler - this looks great! I'm just trying to learn whether supports_attn_softmax is required to compile vs in runtime. My previous impression was that it is required in runtime as I was able to build the app with metal acceleration enabled using GH macos runner that does not support acceleration and run successfully on another device with metal acceleration supported.

sgrebnov avatar Jan 23 '25 01:01 sgrebnov

Hi @sgrebnov!

I'm just trying to learn whether supports_attn_softmax is required to compile vs in runtime.

The reason why we need the check is that the attn_softmax implementation for Metal uses intrinsic types only present in Metal 3.1+. So really, it only depends on the installed Metal framework version, which is why it should be fine to check this at build time (unless the Metal framework version changes during runtime?).

I don't love how this method modifies files in the project - do you perhaps have a better way to pass the information to attention.rs?

EricLBuehler avatar Jan 23 '25 01:01 EricLBuehler

At least in our use case, the metal framework version changes at runtime. Why is this? We want to deliver compiled versions of our metal binary to arbitrary metal users. Therefore at compile time, the metal version is based on our CI computers. At runtime, it will be whichever version the user has locally.

Jeadie avatar Jan 23 '25 02:01 Jeadie

@EricLBuehler - I'm learning about bfloat and it seems the app will fail if compiled with supports_attn_softmax but run on device with metal version < 310. As described by @Jeadie the scenario we would like to add support for is the following: the binaries are compiled once as part of CI/CD and then redistributed and run on different devices (different hardware/software version include Metal framework version).

As bf16 in Metal supported on all macOS 14+ devices, can we rely on OS version (in runtime) to detect where supports_attn_softmax should be enabled? Similar to metal-rs logic https://github.com/gfx-rs/metal-rs/blob/master/src/device.rs#L437?

sgrebnov avatar Jan 27 '25 18:01 sgrebnov