Updating build instructions to include BLAS support

Open daniandtheweb opened this issue 1 year ago • 15 comments

This pull request adds clear instructions on how to build llama.cpp on every platform, with and without BLAS support.

daniandtheweb avatar Apr 25 '23 22:04 daniandtheweb

Looks good. Could we also include a brief notice explaining that BLAS is only used when the batch size is at least 32? That would make clear that it only benefits prompt processing, not generation.

For cuBLAS, it may also be useful to point out that the CUDA toolkit can be obtained from https://developer.nvidia.com/cuda-downloads

slaren avatar Apr 25 '23 22:04 slaren

Sure. Once the CUDA toolkit is installed, can Windows users build without any problems? I don't have an Nvidia GPU so I can't test it.

daniandtheweb avatar Apr 25 '23 23:04 daniandtheweb

They also need Visual Studio Community, but yes, they should be able to build with the same cmake command line.

slaren avatar Apr 25 '23 23:04 slaren

Great! Just one more thing: I suppose it is worth mentioning that on Macs, BLAS is already supported by default through the Accelerate framework.

slaren avatar Apr 25 '23 23:04 slaren

Ok, I'll just add that info then.

daniandtheweb avatar Apr 25 '23 23:04 daniandtheweb

> BLAS is only used when the batch size is at least 32

I don't think that's an issue any more; the default batch size is now 512, is it not?

SlyEcho avatar Apr 26 '23 09:04 SlyEcho

The batch still has to be at least 32 tokens; BLAS won't be used with smaller prompts.

slaren avatar Apr 26 '23 09:04 slaren
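
To illustrate the point about the threshold, here is a minimal sketch of how a prompt splits into batches and which of those batches would reach the 32-token minimum mentioned above. It is a simplification for illustration only: the names `BLAS_MIN_BATCH`, `split_into_batches` and `blas_usage` are hypothetical and not part of llama.cpp, and the real check inside the library may look at other conditions as well.

```python
# Hypothetical constant: the 32-token minimum mentioned in this thread.
BLAS_MIN_BATCH = 32


def split_into_batches(n_prompt_tokens, n_batch=512):
    """Return the sizes of the chunks a prompt is processed in.

    n_batch=512 is the default batch size mentioned in the thread.
    """
    sizes = []
    remaining = n_prompt_tokens
    while remaining > 0:
        size = min(remaining, n_batch)
        sizes.append(size)
        remaining -= size
    return sizes


def blas_usage(n_prompt_tokens, n_batch=512):
    """Pair each chunk size with whether it reaches the assumed BLAS minimum."""
    return [
        (size, size >= BLAS_MIN_BATCH)
        for size in split_into_batches(n_prompt_tokens, n_batch)
    ]


if __name__ == "__main__":
    for n in (20, 600, 520):
        print(n, blas_usage(n))
```

Under these assumptions, a 20-token prompt never reaches the threshold even though the default batch size is 512, and token-by-token generation (one token per step) never does either.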

BTW, @DaniAndTheWeb, yesterday I gave some messy instructions on how to build llama.cpp with OpenBLAS on Windows: https://github.com/ggerganov/llama.cpp/discussions/1153

SlyEcho avatar Apr 26 '23 09:04 SlyEcho

@SlyEcho Writing clear build instructions for Windows seems to take up quite a lot of space in the README. If that is not a problem, I can just add all the needed instructions.

daniandtheweb avatar Apr 26 '23 12:04 daniandtheweb

@SlyEcho I'm finishing a revised version of the build instructions to include make on Windows. Is there any reason why you recommended the Fortran version of w64devkit?

daniandtheweb avatar Apr 26 '23 13:04 daniandtheweb

It's only needed when you want to build OpenBLAS yourself, although it may not be necessary even then. Why would you want to build it yourself? Because then the library is smaller, as it is optimized only for your machine.

SlyEcho avatar Apr 26 '23 13:04 SlyEcho

So is it OK to link the Fortran version for the normal build as well, or would it be better to link the vanilla version for that?

daniandtheweb avatar Apr 26 '23 13:04 daniandtheweb

It doesn't really hurt anything. The reason I recommend w64devkit is that you just download and extract it and it's ready to use. No need to install anything, nothing to clean up, just delete it when you don't need it.

SlyEcho avatar Apr 26 '23 13:04 SlyEcho

I think that it may be quite clear like this.

daniandtheweb avatar Apr 26 '23 13:04 daniandtheweb

Please fix the failing EditorConfig check; there are some lines with trailing whitespace.

slaren avatar Apr 26 '23 18:04 slaren
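
For anyone hitting the same check, a small script along these lines can locate and strip trailing whitespace before pushing. This is only a sketch, not the actual EditorConfig checker used by the repository; the function names are hypothetical, and it assumes LF line endings and treats only spaces and tabs as trailing whitespace.

```python
def find_trailing_whitespace(text):
    """Return 1-based line numbers of lines ending in a space or tab."""
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        if line != line.rstrip(" \t"):
            hits.append(lineno)
    return hits


def strip_trailing_whitespace(text):
    """Return the text with trailing spaces and tabs removed from every line."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


if __name__ == "__main__":
    sample = "make \ncmake ..\nmake -j\t\n"
    print(find_trailing_whitespace(sample))
    print(repr(strip_trailing_whitespace(sample)))
```

Running the finder on a file's contents (for example, the edited README) gives the line numbers to fix by hand, while the second function rewrites them in one pass.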