llama works while alpaca does not (bad magic error)
If I try to install alpaca instead of llama, I get a bad magic error when running it. Using llama.cpp from https://github.com/ggerganov/llama.cpp/releases, the same alpaca model file works with no issue. I'm now trying to get llama.cpp working from the browser.
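For anyone debugging this: the "bad magic" message comes from the loader checking the first four bytes of the model file against the magic constants it knows. Here is a minimal diagnostic sketch (not part of dalai; the magic values below are taken from the llama.cpp sources, and newer formats may add more) to inspect which format a `.bin` file actually is:

```python
import struct
import sys

# Magic numbers from the llama.cpp sources, stored as a little-endian
# uint32 at the start of the file; later formats may add more.
MAGICS = {
    0x67676D6C: "ggml (unversioned, oldest)",
    0x67676D66: "ggmf (versioned)",
    0x67676A74: "ggjt (mmap-able)",
    0x46554747: "gguf (current llama.cpp format)",
}

def identify(path: str) -> str:
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
    # An unknown value here is exactly what the loader reports as "bad magic".
    return MAGICS.get(magic, f"unknown magic 0x{magic:08x}")

if __name__ == "__main__":
    print(identify(sys.argv[1]))
```

If the file identifies as a newer format than the binary dalai ships, the old binary will refuse it with this error.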
same for me
Apparently they now "Avoid unnecessary bit shuffling by packing the quants in a better way. Requires model re-quantization" (quoting the llama.cpp release notes). So that's the problem: the on-disk quantization format changed, and model files and binaries from different eras no longer match.
How do we re-quantize it?
Beats me. I don't think it's really desirable either, because you'd basically have to downgrade the encoding of every future llama or alpaca model release to keep it working with this codebase. I switched to a different llama codebase on GitHub that had been kept up to date, and got it to work.
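For reference, "re-quantizing" here means regenerating the quantized `.bin` from the original (or f16) weights using the current llama.cpp tools, so the output uses the new packing. A rough sketch of that workflow, with hypothetical paths (the converter script name and the final type argument vary between llama.cpp versions; older checkouts use `convert-pth-to-ggml.py` and a numeric type such as `2` for q4_0):

```python
import subprocess

def requantize(model_dir: str) -> None:
    # 1) Rebuild the f16 model file from the original weights with the
    #    *current* converter, so it matches the binary you will run.
    subprocess.run(["python3", "convert.py", model_dir], check=True)
    # 2) Quantize with the current quantize tool; the output then uses
    #    the new packing and passes the loader's magic/version check.
    subprocess.run(
        [
            "./quantize",
            f"{model_dir}/ggml-model-f16.bin",
            f"{model_dir}/ggml-model-q4_0.bin",
            "q4_0",
        ],
        check=True,
    )

# Example (run from the llama.cpp checkout, paths adjusted to yours):
requantize("models/7B")
```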
It seems like it's going to take ongoing effort for this API project to chase and maintain compatibility with this academic-style code. They probably don't have portability and consistency in mind when doing research.
Go to llama.cpp or koboldcpp... this project is too obsolete.
I also encountered the bad magic error, but after I switched the model to LLaMA it went away, so I guess it's a model file problem.
1200 ms per token for a 7B model? Wow... I get 400 ms per token with 70B / 65B models, but using llama.cpp.
For 7B models I get 14 ms/token.