Improvements for running on Windows with Snapdragon X

Open AndreasKunar opened this issue 1 year ago • 10 comments

Improvements/Issues addressed:

  1. Add documentation on how to build for Windows, especially for ARM (with MSVC or clang)
  • See - llama.cpp/docs/build.md
  2. Fix the build breaking with MSVC on Snapdragon X / Windows, because MSVC does not support any C inline assembly for ARM (the __asm__ directive) - fixes #8446
  • Add "&& ! ((defined(_MSC_VER)) && ! defined(__clang__))" to the preprocessor condition in front of each __asm__ block. Clang on Windows masquerades as MSVC, hence the strange conditional: the compiler must not be an MSVC that is not clang.
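A minimal sketch of how such a guard behaves, for illustration only. The macro `EXAMPLE_HAVE_AARCH64_INLINE_ASM` and the functions `example_add` and `example_add_self_check` are hypothetical names that do not exist in llama.cpp, and the `__aarch64__` check is an assumption about the surrounding condition; the PR adds the condition inline to the existing `#if` lines rather than through a helper macro.

```c
#include <assert.h>

/* Hypothetical feature macro: inline asm is used only on AArch64 and only
 * when the compiler is not "real" MSVC (MSVC that is not clang). */
#if defined(__aarch64__) && !(defined(_MSC_VER) && !defined(__clang__))
#define EXAMPLE_HAVE_AARCH64_INLINE_ASM 1
#else
#define EXAMPLE_HAVE_AARCH64_INLINE_ASM 0
#endif

/* Adds two ints: GCC-style inline asm where the guard allows it,
 * plain C everywhere else (including MSVC builds for ARM64). */
static int example_add(int a, int b) {
#if EXAMPLE_HAVE_AARCH64_INLINE_ASM
    int out;
    __asm__("add %w0, %w1, %w2" : "=r"(out) : "r"(a), "r"(b));
    return out;
#else
    return a + b;
#endif
}

/* Both code paths must produce the same result. */
static void example_add_self_check(void) {
    assert(example_add(2, 3) == 5);
}
```

With this shape, an MSVC build never sees the `__asm__` statement and falls through to the portable path, while clang on Windows still compiles the assembly.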

ToDo in future fixes for Snapdragon X on Windows and not addressed by this PR:

  • Work on enabling GPU acceleration for Snapdragon X, e.g. by getting Vulkan to run on Snapdragon X - see #8455

  • Work on enabling NPU acceleration for Snapdragon X, e.g. by getting QNN to run on Windows as well - see #6869 (currently only for Android)

P.S.: This is my first PR ever, written with the help of ChatGPT as a tutor. Please have mercy on me if I misunderstood things.

AndreasKunar avatar Jul 17 '24 07:07 AndreasKunar

@AndreasKunar We just need to update the README using instructions from my original PR that enabled optimized builds for the X-Elite. https://github.com/ggerganov/llama.cpp/pull/7191

You can also find the overall procedure in the GitHub Actions workflow. Here is the summary:

  • Install Visual Studio 2022 (Community or another edition). Make sure to install complete ARM64 support (libraries, tools, etc)
  • Install Chocolatey (choco) package manager -- https://chocolatey.org/install
  • Use choco to install Ninja, CMake, LLVM
From PowerShell:
        choco install ninja
        choco install cmake --installargs 'ADD_CMAKE_TO_PATH=System'
        choco install llvm

Once everything is installed, build with:

     cmake --preset arm64-windows-llvm-release
     cmake --build build-arm64-windows-llvm-release

max-krasnyansky avatar Jul 17 '24 18:07 max-krasnyansky

@AndreasKunar We just need to update the README using instructions from my original PR that enabled optimized builds for the X-Elite. #7191

You can also find the overall procedure in the GitHub Actions workflow. Here is the summary:

  • Install Visual Studio 2022 (Community or another edition). Make sure to install complete ARM64 support (libraries, tools, etc)
  • Install Chocolatey (choco) package manager -- https://chocolatey.org/install
  • Use choco to install Ninja, CMake, LLVM
From PowerShell:
        choco install ninja
        choco install cmake --installargs 'ADD_CMAKE_TO_PATH=System'

While I understand that some might use Chocolatey, winget already ships with Windows, and winget install cmake also does the job. I'd rather have both options available than force Chocolatey.

    choco install llvm

I'd also rather have the option to install LLVM directly via GitHub, without needing Chocolatey.


Once everything is installed, build with:

 cmake --preset arm64-windows-llvm-release
 cmake --build build-arm64-windows-llvm-release

I agree and should have included this! THANKS!

Good input, I will update the doc tomorrow (Jul 18) morning my time (it's late today, I'm tired and prone to errors, and I'd rather start fresh).

AndreasKunar avatar Jul 17 '24 20:07 AndreasKunar

I'd also rather have the option to install LLVM directly via GitHub, without needing Chocolatey.

No objection to having options :) We should have a recommended way though, that is also used in the CI for consistency.

So far, in the CI I used choco https://github.com/ggerganov/llama.cpp/blob/master/.github/workflows/build.yml#L758

If winget is the better option (i.e. included in Windows by default, etc.), I don't mind updating the CI to use winget. We need to fix the CI anyway for the MSVC builds (see my comment in that discussion).

max-krasnyansky avatar Jul 17 '24 20:07 max-krasnyansky

@max-krasnyansky - I have tried to clarify/simplify the instructions. The current VS2022 automatically installs cmake and ninja, so no need for separate installs (I also tested this on my PC). Clang can be either installed via download or choco.

AndreasKunar avatar Jul 18 '24 12:07 AndreasKunar

@max-krasnyansky apologies, I found an even easier installation method and changed the documentation again. Everything (git, cmake, clang, MSVC and ninja) can be installed via the VS2022 installer, and I documented it accordingly. I also suggest disabling OpenMP (-D GGML_OPENMP=OFF) with clang for arm64 and documented this as well.

AndreasKunar avatar Jul 18 '24 16:07 AndreasKunar

Looks good to me.

Re: OpenMP. @fmz and I are working on the next round of the Threadpool updates that we submitted earlier. It had to be rewritten a bit due to OpenMP getting merged and other changes. The new PR will enable Threadpool for Windows on ARM64 as well (using native Windows thread primitives). Also, technically OpenMP works with LLVM/Clang: libomp.dll and the headers are included in the LLVM distribution, but CMake fails to find them, which is a CMake issue. Anyway, we're going to introduce better threading via a new PR soon :)

max-krasnyansky avatar Jul 19 '24 18:07 max-krasnyansky

@ggerganov can you please point me to someone who can review and approve this? max-krasnyansky, the author of the initial Snapdragon X / Windows optimizations, thinks it's OK. Thanks in advance.

AndreasKunar avatar Jul 23 '24 07:07 AndreasKunar

Do I understand correctly that:

  • Building with cmake --preset arm64-windows-llvm-release the new aarch64 assembly kernels will compile and work correctly?
  • Building with cmake --preset arm64-windows-MSVC will also succeed, but the aarch64 assembly kernels will be skipped by the new conditionals?

yes.

building with MSVC breaks for Snapdragon X / Windows because MSVC does not support any C inline assembly for ARM (__asm__ directive)

I also need someone to confirm this statement and that there is nothing better that we can do than disabling this code for certain environments (pinging @Dibakar)

Did you try using __asm instead of __asm__?

https://learn.microsoft.com/en-us/cpp/assembler/inline/asm?view=msvc-170

In MSVC, inline asm of any form only works with 32-bit x86, not with x64 or arm64. @max-krasnyansky verified this in his comments.

AndreasKunar avatar Jul 23 '24 08:07 AndreasKunar

The conditional seems to be fine for now, I will confirm it shortly.

Dibakar avatar Jul 24 '24 06:07 Dibakar

@ggerganov - sorry for my inexperience with pull requests; I'm asking because I don't want to do anything wrong. Is there something I still need to do for this, or is everything automatic once it is approved? Should I close the request?

AndreasKunar avatar Jul 25 '24 12:07 AndreasKunar

Question: Why is __clang__ being disabled too? Won't that render the Clang (regular clang, not clang-cl) builds also unable to take advantage of MATMUL?

hmartinez82 avatar Aug 06 '24 11:08 hmartinez82

Question: Why is __clang__ being disabled too? Won't that render the Clang (regular clang, not clang-cl) builds also unable to take advantage of MATMUL?

It's not disabled at all. But clang on Windows says it's both MSVC and clang. My conditional disables the asm only for "real" MSVC (MSVC and not clang).

AndreasKunar avatar Aug 06 '24 11:08 AndreasKunar
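For readers following the exchange above, here is a rough sketch of how the predefined macros separate the cases. The helper names `example_compiler_family`, `example_asm_kernels_allowed` and `example_family_self_check` are hypothetical and not part of llama.cpp. It assumes, as stated in the thread, that clang targeting Windows may define `_MSC_VER` in addition to `__clang__`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: names the compiler family the way the PR's
 * conditional sees it. Clang is never classified as "msvc", even when
 * it also defines _MSC_VER for compatibility on Windows. */
static const char *example_compiler_family(void) {
#if defined(_MSC_VER) && !defined(__clang__)
    return "msvc";    /* "real" MSVC: asm kernels are skipped */
#elif defined(__clang__)
    return "clang";   /* regular clang and clang-cl alike */
#elif defined(__GNUC__)
    return "gcc";
#else
    return "unknown";
#endif
}

/* Returns 1 when the PR's condition would keep the asm kernels enabled. */
static int example_asm_kernels_allowed(void) {
#if !(defined(_MSC_VER) && !defined(__clang__))
    return 1;
#else
    return 0;
#endif
}

/* The two views must agree: asm is allowed exactly when the family
 * is not "msvc". */
static void example_family_self_check(void) {
    int is_real_msvc = strcmp(example_compiler_family(), "msvc") == 0;
    assert(example_asm_kernels_allowed() == !is_real_msvc);
}
```

Under these assumptions, both regular clang and clang-cl report "clang" and keep the assembly kernels, which matches the answer given to the question about `__clang__`.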