Move supports_attn_softmax logic to build.rs
@sgrebnov this is the solution I mentioned in #935. What do you think?
Code Metrics Report
=============================================================================== Language Files Lines Code Comments Blanks =============================================================================== C Header 2 35 28 0 7 Dockerfile 1 41 22 10 9 JSON 12 105 104 0 1 Python 69 2926 2534 77 315 Shell 1 58 22 18 18 Plain Text 3 3723 0 2413 1310 TOML 18 627 556 2 69 YAML 2 21 19 2 0 ------------------------------------------------------------------------------- Jupyter Notebooks 4 0 0 0 0 |- Markdown 2 77 32 31 14 |- Python 2 205 178 1 26 (Total) 282 210 32 40 ------------------------------------------------------------------------------- Markdown 46 3802 0 2891 911 |- BASH 6 103 100 0 3 |- JSON 1 12 12 0 0 |- Python 7 121 109 0 12 |- Rust 15 512 433 0 79 |- TOML 2 75 63 0 12 (Total) 4625 717 2891 1017 ------------------------------------------------------------------------------- Rust 309 99703 89365 1933 8405 |- Markdown 149 1690 25 1540 125 (Total) 101393 89390 3473 8530 =============================================================================== Total 467 111041 92650 7346 11045 ===============================================================================
@EricLBuehler - this looks great! I'm just trying to learn whether supports_attn_softmax is required to compile vs in runtime. My previous impression was that it is required in runtime as I was able to build the app with metal acceleration enabled using GH macos runner that does not support acceleration and run successfully on another device with metal acceleration supported.
Hi @sgrebnov!
I'm just trying to learn whether supports_attn_softmax is required to compile vs in runtime.
The reason why we need the check is that the attn_softmax implementation for Metal uses intrinsic types only present in Metal 3.1+. So really, it only depends on the installed Metal framework version, which is why it should be fine to check this at build time (unless the Metal framework version changes during runtime?).
I don't love how this method modifies files in the project - do you perhaps have a better way to pass the information to attention.rs?
At least in our use case, the metal framework version changes at runtime. Why is this? We want to deliver compiled versions of our metal binary to arbitrary metal users. Therefore at compile time, the metal version is based on our CI computers. At runtime, it will be whichever version the user has locally.
@EricLBuehler - I'm learning about bfloat and it seems the app will fail if compiled with supports_attn_softmax but run on device with metal version < 310. As described by @Jeadie the scenario we would like to add support for is the following: the binaries are compiled once as part of CI/CD and then redistributed and run on different devices (different hardware/software version include Metal framework version).
As bf16 in Metal supported on all macOS 14+ devices, can we rely on OS version (in runtime) to detect where supports_attn_softmax should be enabled? Similar to metal-rs logic https://github.com/gfx-rs/metal-rs/blob/master/src/device.rs#L437?