llama.cpp
Clearer windows instructions.. please?
Looking at the README, I just see too much incomplete information and too many missing steps... (the README assumes I know things I don't)
If anyone would be nice enough to ELI5 this for me, please... I've gotten as far as installing Visual Studio and cloning the repo into a folder... I got the 7B LLaMA file "consolidated.00.pth"... I have multiple versions of Python, so I don't know which one to use..
Then.. the instructions are just head-scratching to me.
OK, I built the code with CMake, and now in Visual Studio... I try to run it and it says access denied for ALL_BUILD
;_;
First, you need to have all the additional files in the LLaMA folder, like tokenizer.model and tokenizer_checklist.chk.
In that folder you should also have a 7B folder, and inside of that: checklist.chk, consolidated.00.pth, and params.json.
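If you want to double-check that layout before converting, here is a minimal sketch in Python. The file names come from the list above; the function name missing_files and the default D:\LLaMA path are just illustrative, so adjust them to your own setup.

```python
from pathlib import Path
import sys

# File names taken from the post above; adjust if your download differs.
ROOT_FILES = ["tokenizer.model", "tokenizer_checklist.chk"]
MODEL_FILES = ["checklist.chk", "consolidated.00.pth", "params.json"]


def missing_files(root, model="7B"):
    """Return the expected files that are not present under root."""
    root = Path(root)
    expected = [root / name for name in ROOT_FILES]
    expected += [root / model / name for name in MODEL_FILES]
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Hypothetical default location; pass your own folder as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else r"D:\LLaMA"
    missing = missing_files(root)
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected files found.")
```

Save it under any name you like and run it with python, passing your LLaMA folder as the argument. It only checks that the files exist, not that they downloaded correctly.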
A- You don't need to compile the repo yourself anymore; download the Windows release.
B- You still need to download the repo zip file in order to use the Python scripts!
Extract it to your SSD, then:
1- Install Python (I'm using 3.6.10) and tick the option to add it to the system PATH during installation.
2- Run python -m pip install torch numpy sentencepiece and wait until it finishes.
3- Go to your extracted folder (step B) in Explorer, clear the address bar, type cmd, and press Enter to open a command prompt in that directory.
4- Type:
python convert-pth-to-ggml.py D:\LLama\7B\ 1
(Note that you should change D:\LLama\7B\ to the location of the 7B folder on your PC, but leave the "1" intact.)
5- After it is done, you will have a file named ggml-model-f16.bin in your 7B model directory (the one you passed in step 4).
6- Now open the folder with the compiled release that you downloaded in step A.
7- Once again, type cmd in the address bar of that folder and press Enter.
8- Now type:
quantize D:\LLama\7B\ggml-model-f16.bin D:\LLama\7B\ggml-model-q4_0.bin 2
(Again, all the paths should be changed to your own data location!)
9- That's it. All you need to do now is open cmd in the compiled release folder (A) and run your prompts, like:
main -m D:\LLaMA\7B\ggml-model-q4_0.bin -p "What is love?"
main: prompt: ' What is love?'
main: number of tokens in prompt = 5
1 -> ''
1724 -> ' What'
338 -> ' is'
5360 -> ' love'
29973 -> '?'
sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.100000
What is love? Is it just a feeling or is it something more than that? Is there more to it than meets the eye?
I think it's something more. It certainly can be felt, but I don't believe it's as simple as that. Love isn't simply feelings; it has to do with actions and reactions as well as emotions. Love is not a feeling of comfort or warmth like you might feel around your best friend or someone you care about.
Love is something more than that.
When I think of love, I can't help but picture flowers blooming in springtime.
10- If you want to use a larger model like 13B, 30B, or 65B, the method is the same, BUT you have to run the quantize command more than once. For example, if step 4 gives you multiple files like ggml-model-f16.bin, ggml-model-f16.bin.1, ggml-model-f16.bin.2 ... ggml-model-f16.bin.7, you should run quantize on each of them one by one, like:
quantize D:\LLama\65B\ggml-model-f16.bin D:\LLama\65B\ggml-model-q4_0.bin 2
quantize D:\LLama\65B\ggml-model-f16.bin.1 D:\LLama\65B\ggml-model-q4_0.bin.1 2
quantize D:\LLama\65B\ggml-model-f16.bin.2 D:\LLama\65B\ggml-model-q4_0.bin.2 2
.
.
.
quantize D:\LLama\65B\ggml-model-f16.bin.7 D:\LLama\65B\ggml-model-q4_0.bin.7 2
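To avoid typing every one of those lines by hand, here is a rough Python sketch that builds one quantize command per part. It assumes the parts follow the naming shown above (ggml-model-f16.bin, ggml-model-f16.bin.1, ...) and copies the trailing 2 from the commands above. The helper names find_parts and quantize_commands are made up for this example.

```python
from pathlib import Path
import re
import sys

# Matches ggml-model-f16.bin and ggml-model-f16.bin.N (N = part number).
PART_RE = re.compile(r"^ggml-model-f16\.bin(?:\.(\d+))?$")


def find_parts(model_dir):
    """Return the f16 part files in model_dir, ordered by part number."""
    parts = []
    for path in Path(model_dir).iterdir():
        match = PART_RE.match(path.name)
        if match:
            # The first file has no numeric suffix; treat it as part 0.
            parts.append((int(match.group(1) or 0), path))
    return [path for _, path in sorted(parts)]


def quantize_commands(model_dir, quantize_exe="quantize"):
    """Build one argument list per part, mirroring the commands above."""
    commands = []
    for src in find_parts(model_dir):
        dst = src.with_name(src.name.replace("f16", "q4_0", 1))
        commands.append([quantize_exe, str(src), str(dst), "2"])
    return commands


if __name__ == "__main__":
    # Dry run: only prints the commands so you can review them first.
    if len(sys.argv) == 2 and Path(sys.argv[1]).is_dir():
        for cmd in quantize_commands(sys.argv[1]):
            print(" ".join(cmd))
```

Run it with your model folder as the argument and paste the printed lines into the cmd window from step 7. If you would rather have Python run them for you, each list can be passed to subprocess.run(cmd, check=True).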
I think there are other ways too, but this is how I do it on Windows! Correct me if there are better ways!
Alternatively, you can install Ubuntu via WSL2 on Windows and run it on Linux.
Python versioning is painful in general, and especially so in Windows.
That's exactly why I made this: https://github.com/anzz1/python-windows/ Just extract that to C:\python and click install.reg, and you'll have a working multi-version Python installation with none of the crap that comes with the official installer. (If you choose to use this, remove all your existing Python installations first to start fresh.)
The official installers can also completely mess everything up if you do things like installing Python 2 after installing Python 3, as some of the installer versions (IIRC Python 2.x 32-bit) do not understand the concept of multiversioning and just overwrite the shit out of everything.
The easiest option, though, is to uninstall all other versions of Python and install only Python 3. If you keep multiple versions on Windows, especially once package managers like pip (or worse, conda) and virtual environments (sigh..............) come into play, I guarantee you will be in a world of pain.
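If you are not sure which Python the python command in your cmd window actually starts, a small check like the one below can help. It is only a sketch: the package list is the one from the pip install step earlier in the thread, and the helper name missing_packages is made up.

```python
import importlib.util
import sys

# Packages from the "pip install" step earlier in this thread.
REQUIRED = ["torch", "numpy", "sentencepiece"]


def missing_packages(names=REQUIRED):
    """Return the names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Show which interpreter is running and what it is missing.
print("interpreter:", sys.executable)
print("version:", sys.version.split()[0])
print("missing packages:", ", ".join(missing_packages()) or "none")
```

If the interpreter path is not the installation you expected, or packages show up as missing even though pip reported success, then pip most likely installed them into a different Python than the one running the script.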
Thank you..! 😄 I'll try those later when my head stops hurting...
Why are instructions so cryptic in general...
Because we often spend a lot of time answering issues that don't use the issue template instead of spending time writing documentation.