llama.cpp
Updating build instructions to include BLAS support
This pull request adds clear instructions about how to build llama.cpp on every platform with and without BLAS support.
Looks good. Could we also include a brief notice explaining that BLAS is only used when the batch size is at least 32? That would make it clear that BLAS only benefits prompt processing, not generation.
For cuBLAS, it may also be useful to point out that the CUDA toolkit can be obtained from https://developer.nvidia.com/cuda-downloads
Sure. Once the CUDA toolkit is installed, can Windows users build without any problems? I don't have an Nvidia GPU, so I can't test it.
They also need Visual Studio Community, but yes, they should be able to build with the same cmake command line.
Great! Just one more thing: I suppose it is worth mentioning that on Macs, BLAS is already supported by default through the Accelerate framework.
Ok, I'll just add that info then.
> BLAS is only used when the batch size is at least 32
I don't think that's an issue any more. The default batch size is now 512, is it not?
The batch still has to be at least 32 tokens, so BLAS won't be used with smaller prompts.
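To illustrate the point, here is a rough sketch of how a prompt gets split into batches and which of them would reach the 32-token threshold. It is a simplification, not the actual check in the code: `blas_chunks` and `BLAS_MIN_BATCH` are hypothetical names made up for this example, and it assumes the threshold applies to the number of tokens in each evaluated batch.

```python
# Hypothetical illustration of the threshold discussed in this thread.
# Neither name below exists in llama.cpp; both are made up for the example.
BLAS_MIN_BATCH = 32  # minimum batch size quoted above


def blas_chunks(n_prompt_tokens, n_batch=512):
    """Split a prompt into batches of at most n_batch tokens.

    Returns a list of (batch_size, would_use_blas) pairs, assuming BLAS
    is used only when a batch holds at least BLAS_MIN_BATCH tokens.
    """
    chunks = []
    for start in range(0, n_prompt_tokens, n_batch):
        size = min(n_batch, n_prompt_tokens - start)
        chunks.append((size, size >= BLAS_MIN_BATCH))
    return chunks


if __name__ == "__main__":
    print(blas_chunks(20))    # a short prompt never reaches the threshold
    print(blas_chunks(520))   # a long prompt can leave a small final batch
    print(blas_chunks(1))     # generation evaluates one token at a time
```

Under these assumptions, a large default batch size does not help a short prompt, and generation (one token per step) never qualifies.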
BTW, @DaniAndTheWeb, yesterday I gave some messy instructions on how to build llama.cpp with OpenBLAS on Windows: https://github.com/ggerganov/llama.cpp/discussions/1153
@SlyEcho Producing the build instructions for Windows in a clear way seems to add quite a lot of space to the readme. If that is not a problem I can just add all the needed instructions.
@SlyEcho I'm finishing a revised version of the build instructions to include make on Windows. Is there any reason why you recommended the Fortran version of w64devkit?
It's only needed when you want to build OpenBLAS yourself, although even then it may not be necessary. Why would you want to build it yourself? Because then the library is smaller, as it is optimized only for your machine.
So is it OK to link the Fortran version for the normal build as well, or would it be better to link the vanilla version for that?
It doesn't really hurt anything. The reason I recommend w64devkit is that you just download and extract it and it's ready to use. No need to install anything, nothing to clean up, just delete it when you don't need it.
I think that it may be quite clear like this.
Please fix the failing EditorConfig check; there are some lines with trailing whitespace.