Allow building on Windows using `clang-cl` toolchain
It's not possible to build gemma.cpp with the standard MSVC front-end, as it doesn't support arrays larger than 0x7fffffff bytes (see Compiler Error C2148). However, this isn't a problem with the optional Visual Studio Clang/LLVM-based front-end.
This can be specified using the -T flag when running CMake:
$ cmake -B build -T ClangCL
$ cmake --build build --config Release
With the included CMakePresets.json (second commit) this can be simplified to just:
$ cmake --preset windows
$ cmake --build --preset windows
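For anyone curious about the C2148 limit mentioned above, here is a minimal sketch of the kind of declaration that trips it. `HypotheticalWeights` and `ExceedsMsvcArrayLimit` are made-up names for illustration and are not taken from gemma.cpp; the sketch also assumes a 64-bit target.

```cpp
#include <cstddef>
#include <cstdint>

// Largest total array size, in bytes, accepted by the MSVC front-end
// according to Compiler Error C2148.
constexpr uint64_t kMsvcMaxArrayBytes = 0x7fffffff;

// Hypothetical type (not from gemma.cpp): 2^30 floats, i.e. 4 GiB.
// Only the type is declared; no instance is ever created.
struct HypotheticalWeights {
  float data[1ull << 30];
};

// Returns true if an array of `bytes` bytes is over the C2148 limit.
constexpr bool ExceedsMsvcArrayLimit(uint64_t bytes) {
  return bytes > kMsvcMaxArrayBytes;
}

// Accepted by Clang (and clang-cl) on 64-bit targets; the MSVC front-end
// would be expected to reject the struct above before reaching this line.
static_assert(ExceedsMsvcArrayLimit(sizeof(HypotheticalWeights)),
              "expected the hypothetical array to exceed the MSVC limit");
```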
Windows doesn't provide pread/pwrite so this must be emulated using the ReadFile/WriteFile Win32 APIs.
NOMINMAX is defined to prevent the min/max macros from windows.h from conflicting with expressions like std::min. Generally libraries should avoid including windows.h in their public headers or define WIN32_LEAN_AND_MEAN before including the windows.h header, but this unfortunately isn't always the case.
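To illustrate the conflict, here is a small sketch. `ClampTokens` is a hypothetical function, not something in gemma.cpp. Without `NOMINMAX`, the `min` macro from windows.h would rewrite the `std::min` call below and the file would fail to compile.

```cpp
#ifdef _WIN32
// Must be defined before <windows.h> is included, directly or indirectly.
#ifndef NOMINMAX
#define NOMINMAX
#endif
#include <windows.h>
#endif

#include <algorithm>
#include <cstddef>

// Hypothetical helper: caps a requested token count at a limit.
inline size_t ClampTokens(size_t requested, size_t limit) {
  return std::min(requested, limit);
}
```

Where defining `NOMINMAX` isn't an option, writing the call as `(std::min)(requested, limit)` is another common workaround, since the parentheses stop the function-like macro from expanding.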
Tested with Visual Studio 2022 Build Tools:
$ .\build\Release\gemma.exe --tokenizer tokenizer.spm --compressed_weights 2b-it-sfp.sbs --model 2b-it
  __ _  ___ _ __ ___  _ __ ___   __ _   ___ _ __  _ __
 / _` |/ _ \ '_ ` _ \| '_ ` _ \ / _` | / __| '_ \| '_ \
| (_| |  __/ | | | | | | | | | | (_| || (__| |_) | |_) |
 \__, |\___|_| |_| |_|_| |_| |_|\__,_(_)___| .__/| .__/
  __/ |                                    | |   | |
 |___/                                     |_|   |_|
tokenizer : tokenizer.spm
compressed_weights : 2b-it-sfp.sbs
model : 2b-it
weights : [no path specified]
max_tokens : 3072
max_generated_tokens : 2048
*Usage*
Enter an instruction and press enter (%Q quits).
*Examples*
- Write an email to grandma thanking her for the cookies.
- What are some historical attractions to visit around Massachusetts?
- Compute the nth fibonacci number in javascript.
- Write a standup comedy bit about GPU programming.
> Write an email to grandma thanking her for the cookies.
[ Reading prompt ] ...................
Subject: Thank you for the delicious cookies!
Dear Grandma,
I just wanted to take a moment to express my sincere gratitude for the wonderful cookies you baked for me. They were absolutely delicious!
I especially loved the [mention a few of your favorite cookies]. They were so fresh and flavorful, and the perfect treat to brighten my day.
Your generosity and thoughtfulness mean the world to me. It's always so sweet to receive a homemade treat, and your cookies were a perfect reminder of all the love and care you put into them.
I can't wait to enjoy another batch soon! Please let me know when you're planning to bake some more.
Thank you again for everything, Grandma. You're the best!
Love,
[Your name]
Very nice, cool to see this building on Windows, thanks for sending the pull request! FYI we are currently working on automated Copybara sync so we can merge hopefully soon.
No worries. I know the fun of wrangling google3. 😁
Rebased onto the dev branch and added an additional commit that adds a CMakePresets.json file which can help simplify multiple CMake configurations.
Can I request an exe download for those of us who are not familiar with building on Windows?
Thanks for rebasing to dev, that's great! We are now ready to (attempt to) merge. I've resolved conflicts in README.
@jan-wassenberg @austinvhuang It looks like this Copybara sync (https://github.com/google/gemma.cpp/commit/c03b5da542ef19f65a4147a52ccac7c89334e7f3) reverted all the changes from this PR. Yay automation.
ouch sorry about that, will take a closer look a bit later this evening.
No worries! ~~I'll reapply the commits on the current dev branch and put up a new PR.~~
Sorry about this! We're still working out the kinks.
I've restored the changes in https://github.com/google/gemma.cpp/commit/84444c93a44f484442fda2523dde7e77dbd3a53c.
Sounds great! Thank you.
Hi @GrahamboJangles,
> Can I request an exe download for those of us who are not familiar with building on Windows?
PR #38 adds the Windows build to the GitHub Actions job. The gemma.exe binary can be found in the gemma-windows-latest-windows-Release build artefact.
Please be aware that Windows Defender currently detects this as malware—almost certainly a false positive (this is a good part of the reason for building it on GitHub's infrastructure and not just providing a binary I compiled myself). I'll see if there's anything I can do about this.
@dcoles - Thanks for the reply. I wasn't even aware GitHub Actions existed, thanks for that. It does look like it flags it as malware -- I can't even download it, though it usually lets me bypass the warning. Wget will not download a valid file either.
I think you can submit false positive reports to Windows Defender, VirusTotal, etc. https://www.microsoft.com/en-us/wdsi/filesubmission/ https://docs.virustotal.com/docs/false-positive-contacts
But I'm not sure if this has to be done on a per-file basis or not.
I did manage to build via Ubuntu WSL, is there a way I can manually build for Windows?
Hi @GrahamboJangles,
> I think you can submit false positive reports to Windows Defender, VirusTotal, etc. https://www.microsoft.com/en-us/wdsi/filesubmission/ https://docs.virustotal.com/docs/false-positive-contacts
I submitted the file to Microsoft, though haven't got a response back yet. Good news is that the latest build is no longer being detected as a threat.
If you're interested in building it yourself, you can find the instructions in the README on the dev branch. Hopefully it's no more complex than building on Ubuntu WSL. Let me know if you find any parts of the instructions unclear.
@dcoles
Even the latest build is flagged for me.
I'm trying to build it myself, but get the following error:
ryzen@GDESKTOP:~/gemma.cpp/build$ ls
7b-it-sfp.sbs CMakeCache.txt CMakeFiles CPackConfig.cmake CPackSourceConfig.cmake Makefile _deps archive.tar.gz bin cmake_install.cmake gemma lib tokenizer.spm
ryzen@GDESKTOP:~/gemma.cpp/build$ cmake --preset windows
CMake Error: Could not read presets from /home/ryzen/gemma.cpp/build: File not found
@GrahamboJangles You should run the CMake configure step (cmake --preset windows) from the top-level gemma.cpp directory, then run the build step (cmake --build --preset windows) in the same top-level directory. The build directory should also be empty and not have any data from previous build attempts.
Note: You can't build this in WSL. It should be run in a regular Windows cmd.exe or PowerShell prompt after installing the Visual Studio 2022 build dependencies (you may need to restart your PC once before continuing):
winget install --id Kitware.CMake
winget install --id Microsoft.VisualStudio.2022.BuildTools --force --override "--passive --wait --add Microsoft.VisualStudio.Workload.VCTools;installRecommended --add Microsoft.VisualStudio.Component.VC.Llvm.Clang --add Microsoft.VisualStudio.Component.VC.Llvm.ClangToolset"
cd gemma.cpp
cmake --preset windows
cmake --build --preset windows
The resulting binary should be build\Release\gemma.exe:
build\Release\gemma.exe --tokenizer .\tokenizer.spm --model 7b-it --compressed_weights .\7b-it-sfp.sbs
Dumb mechanical engineer here playing around with cool software stuff that's over my head, but I keep getting fatal error : 'sched.h' file not found when I run cmake --build --preset windows. I don't see that file anywhere in the GitHub repo.
Building Custom Rule C:/nemo_h/HW2/CMakeLists.txt
In file included from C:\nemo_h\HW2\run.cc:30:
C:\nemo_h\HW2/util/app.h(21,10): fatal error : 'sched.h' file not found [C:\nemo_h\HW2\build\gemma.vcxproj]
Hi @nfe213,
We're currently waging a bit of a war with Google's code sync tool, Copybara, which keeps undoing the changes from this PR. This is likely the reason that your build is currently failing.
The most recent working commit on the dev branch is 84444c93a44f484442fda2523dde7e77dbd3a53c. You can check that out by running:
git checkout dev
git reset --hard 84444c93a44f484442fda2523dde7e77dbd3a53c
Thanks, @dcoles! I'll check that out. Best of luck, soldier.
Edit: Got it! Thank you so much!!