Improvements for running on Windows with Snapdragon X
- [x] I have read the contributing guidelines
- Self-reported review complexity:
- [x] Low
- [ ] Medium
- [ ] High
Improvements/Issues addressed:
- Add documentation on how to build for Windows, especially for ARM (with MSVC or clang)
- See llama.cpp/docs/build.md
- Fix building with MSVC for Snapdragon X / Windows: the build breaks because MSVC does not support any C inline assembly for ARM (the __asm__ directive). Fixes #8446
- Append "&& ! ((defined(_MSC_VER)) && ! defined(__clang__))" to the preprocessor condition before each __asm__. Clang on Windows masquerades as MSVC, hence the odd conditional: the compiler must not be an MSVC that is not clang.
ToDo in future fixes for Snapdragon X on Windows and not addressed by this PR:
- Work on enabling GPU acceleration for Snapdragon X, e.g. via getting Vulkan to run on Snapdragon X - see #8455
- Work on enabling NPU acceleration for Snapdragon X, e.g. via work on getting QNN to run, and also on Windows - see #6869 (currently only for Android)
P.S.: This is my first PR ever, written with the help of ChatGPT as a tutor. Please have mercy on me if I misunderstood things.
@AndreasKunar We just need to update the README using instructions from my original PR that enabled optimized builds for the X-Elite. https://github.com/ggerganov/llama.cpp/pull/7191
You can also find the overall procedure in the GitHub Actions Workflow. Here is the summary:
- Install Visual Studio 2022 (Community or another edition). Make sure to install complete ARM64 support (libraries, tools, etc)
- Install Chocolatey (choco) package manager -- https://chocolatey.org/install
- Use choco to install Ninja, CMake, LLVM
From PowerShell:
choco install ninja
choco install cmake --installargs 'ADD_CMAKE_TO_PATH=System'
choco install llvm
Once everything is installed, build with:
cmake --preset arm64-windows-llvm-release
cmake --build build-arm64-windows-llvm-release
- Use choco to install Ninja, CMake, LLVM
From PowerShell:
choco install ninja
choco install cmake --installargs 'ADD_CMAKE_TO_PATH=System'
While I understand that some might use Chocolatey, winget already ships with Windows, and winget install cmake also does the job. I'd rather have both options available than force Chocolatey.
choco install llvm
I'd rather also have the option to directly install LLVM via GitHub, without needing Chocolatey.
Once everything is installed build with
cmake --preset arm64-windows-llvm-release
cmake --build build-arm64-windows-llvm-release
I agree and should have included this! THANKS!
Good input, I will update the doc tomorrow (Jul 18) morning my time (it's late today, I'm tired and prone to errors, and I'd rather start fresh).
I'd rather also have the option to directly install LLVM via GitHub, without needing Chocolatey.
No objection to having options :) We should have a recommended way though, that is also used in the CI for consistency.
So far, in the CI I used choco https://github.com/ggerganov/llama.cpp/blob/master/.github/workflows/build.yml#L758
If winget is a better option (i.e. included in Windows by default, etc.), I don't mind updating the CI to use winget. We need to fix the CI anyway for the MSVC builds (see my comment in that discussion).
@max-krasnyansky - I have tried to clarify/simplify the instructions. The current VS2022 automatically installs cmake and ninja, so no need for separate installs (I also tested this on my PC). Clang can be either installed via download or choco.
@max-krasnyansky apologies, I found an even easier installation method and changed the documentation again. Everything (git, cmake, clang, MSVC and ninja) can be installed via the VS2022 installer, and I documented it accordingly. I also suggest disabling OpenMP (-D GGML_OPENMP=OFF) with clang for arm64, and documented this as well.
Looks good to me.
Re: OpenMP. @fmz and I are working on the next round of the Threadpool updates that we submitted earlier. It had to be rewritten a bit due to OpenMP getting merged and other changes. The new PR will enable Threadpool for Windows on ARM64 as well (using native Windows thread primitives). Also, technically OpenMP works with LLVM/Clang: libomp.dll and the headers are included in the LLVM distribution, but CMake fails to find them, which is a CMake issue. Anyway, we're going to introduce better threading via a new PR soon :)
@ggerganov can you please point me to someone who can review+approve this? max-krasnyansky, the author of the initial Snapdragon X / Windows optimizations, thinks it's OK. Thanks in advance.
Do I understand correctly that:
- Building with cmake --preset arm64-windows-llvm-release the new aarch64 assembly kernels will compile and work correctly?
- Building with cmake --preset arm64-windows-MSVC will also succeed, but the aarch64 assembly kernels will be skipped by the new conditionals?
Yes.
building with MSVC breaks for Snapdragon X / Windows because MSVC does not support any C in-line Assembly for ARM (__asm__ directive)
I also need someone to confirm this statement and that there is nothing better that we can do than disabling this code for certain environments (pinging @Dibakar)
Did you try using __asm instead of __asm__? https://learn.microsoft.com/en-us/cpp/assembler/inline/asm?view=msvc-170
In MSVC, inline asm of any form works only for 32-bit x86; it works neither for x64 nor for arm64. @max-krasnyansky verified this in his comments.
The conditional seems to be fine for now, I will confirm it shortly.
@ggerganov - sorry for my inexperience with pull requests, and for asking, but I don't want to do anything wrong. Is there something I still need to do for this, or is everything automatic once it is approved? Should I close the request?
Question: Why is __clang__ being disabled too? Won't that render the Clang (regular clang, not clang-cl) builds also unable to take advantage of MATMUL?
It's not disabled at all. But clang on Windows reports itself as both MSVC and clang. My conditional disables the asm only for "real" MSVC (MSVC and not clang).