alpaca-electron
7B 13B 30B Comparisons
I am testing a few models on my machine, M2 Mac.
At first, I tried 13B. It's slightly slow, but not bad, 5-7 words per second. The answers are pretty good actually. It's not yet ChatGPT, as I could not get a proper answer on Blender Python. But it's pretty good at general Q&A.
I thought 7B would be faster, but somehow the AI responses and answers are disappointing. I deleted it right away.
30B... is a bit too slow for this machine.
I wonder how we can refine a model to make it run faster and stay more precise on topic?
"how we can refine a model" - I think this depends on the model you're using, not the program. Can the author refine all models? Not sure.
"precise on topic" will only happen if parameters like "temp" and "top-p" are added as controls in this program (like in other UI text-generation tools). For example, if you run your models through console tools, you have the ability to control base parameters like these.
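To illustrate why those two parameters affect how "on topic" the output feels, here is a minimal sketch of temperature plus top-p (nucleus) sampling over a toy token-to-logit table. This is a generic illustration of the sampling idea, not code from alpaca-electron or any particular backend.

```python
import math
import random

def sample_top_p(logits, temperature=0.8, top_p=0.9):
    """Pick a token from {token: logit} using temperature + top-p sampling."""
    # Lower temperature sharpens the distribution -> more focused output.
    scaled = {tok: l / temperature for tok, l in logits.items()}
    # Softmax (subtract the max for numerical stability).
    m = max(scaled.values())
    exps = {tok: math.exp(l - m) for tok, l in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    # Keep the smallest set of top tokens whose cumulative probability
    # reaches top_p; everything else is discarded as "off topic" noise.
    kept, cum = [], 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept.append((tok, p))
        cum += p
        if cum >= top_p:
            break
    # Renormalise the kept set and draw one token from it.
    norm = sum(p for _, p in kept)
    r = random.random() * norm
    for tok, p in kept:
        r -= p
        if r <= 0:
            return tok
    return kept[-1][0]
```

With a low top_p, only the most likely tokens survive the cut, which is why tightening it (and lowering temperature) makes a model stick closer to the topic.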
Hmm... yes, I need to investigate the model more, but I am pretty happy with 13B. It seems pretty smart, just under ChatGPT.
However, this 7B model is definitely broken: https://huggingface.co/Pi3141/alpaca-lora-7B-ggml/blob/main/ggml-model-q4_1.bin https://huggingface.co/Pi3141/alpaca-lora-7B-ggml/resolve/main/ggml-model-q4_1.bin
Is there a recommendation on Llama or Alpaca model that's the most creative / better for coding?
On many occasions, if we ask questions with the same beginning of a sentence, it will repeat the answers without even thinking. It's a bug.
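That repetitive behavior is usually tamed with a repeat penalty, another sampling-time knob. A rough sketch of the idea (the function name and exact formula here are illustrative; llama.cpp-style backends expose a similar `--repeat_penalty` option):

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.1):
    """Down-weight tokens that already appeared in the recent context."""
    out = dict(logits)
    for tok in set(recent_tokens):
        if tok in out:
            l = out[tok]
            # Divide positive logits and multiply negative ones by the
            # penalty, so repeated tokens always become less likely.
            out[tok] = l / penalty if l > 0 else l * penalty
    return out
```

Applying this before sampling makes the model less likely to replay the same answer when a prompt starts with a familiar sentence.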
Are there any compatible models other than the basic 7b/13b/30b from here?
https://huggingface.co/Pi3141
I've got 30b running on my 5900x/64GB RAM desktop and it's actually pretty useable - maybe 2-3 words per second. I wasn't sure how well that would work.
Curious if there is anything else I can try out. I don't know that much about LLMs, but most don't seem to be in this ggml/.bin format. I searched GGML on HuggingFace but none of them (at least none that I'm interested in) seem to work. I assume they're the "old format" the model loader references.
EDIT: Actually I've found one that works: https://huggingface.co/verymuchawful/Alpacino-13b-ggml