
llama does work while alpaca does not (bad magic error)

suoko opened this issue 1 year ago · 8 comments

If I try to install alpaca instead of llama, I get a bad magic error when running it, whereas llama.cpp from the releases at https://github.com/ggerganov/llama.cpp/releases loads the same alpaca model file with no issue. I'm now trying to make llama.cpp work from the browser.

suoko avatar May 06 '23 04:05 suoko

Same for me.

theproj3ct avatar May 07 '23 09:05 theproj3ct

Apparently they now "Avoid unnecessary bit shuffling by packing the quants in a better way. Requires model re-quantization". So that's the problem.
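For context, a "bad magic" error means the loader rejected the file's header before reading any weights. The sketch below shows, under stated assumptions, how such a header can be inspected: the magic constants (`ggml`, `ggmf`, `ggjt`) and the GGJT v1 → v2 version bump that accompanied the bit-shuffling removal are taken from the llama.cpp history of that era, and the function name is mine, not part of any project's API.

```python
# Sketch: classify a ggml-era model file by its 4-byte magic and (where
# present) 4-byte version, both little-endian. A stale magic or version
# is what newer loaders report as "bad magic" / unsupported file.
import struct

MAGICS = {
    0x67676D6C: "ggml (unversioned, oldest)",
    0x67676D66: "ggmf (versioned)",
    0x67676A74: "ggjt (versioned, mmap-able)",
}

def inspect_header(path):
    """Return a short description of the model file's header."""
    with open(path, "rb") as f:
        magic, = struct.unpack("<I", f.read(4))
        if magic not in MAGICS:
            return "bad magic: 0x%08x" % magic
        kind = MAGICS[magic]
        if magic == 0x67676D6C:
            return kind  # oldest format carries no version field
        version, = struct.unpack("<I", f.read(4))
        return "%s, version %d" % (kind, version)
```

On this reading, the old alpaca files dalai shipped would show up as unversioned `ggml` or `ggjt` version 1, while a llama.cpp build after the packing change expects `ggjt` version 2, which is why the same file loads in one tool and fails in the other.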

zephyrprime avatar May 09 '23 04:05 zephyrprime

How do we re-quantize it?

thestumonkey avatar May 09 '23 08:05 thestumonkey

Beats me. I don't think it's really desirable either, because you would basically have to downgrade the encoding of every future llama or alpaca model to keep it working with this codebase. I switched to a different llama codebase I found on GitHub that is kept up to date and got it to work.

zephyrprime avatar May 09 '23 15:05 zephyrprime

It seems like it's going to take ongoing effort for this API project to chase and maintain compatibility with this academic-style code. Portability and consistency probably aren't priorities when doing research.

thorsteinssonh avatar May 13 '23 15:05 thorsteinssonh

Go to llama.cpp or koboldcpp... this project is too obsolete.

mirek190 avatar May 14 '23 17:05 mirek190

I also encountered the bad magic error, but after I changed the model to LLaMA there was no problem, so I guess it's a model problem. (screenshot attached)

junxian428 avatar Aug 10 '23 07:08 junxian428

1200 ms per token for a 7B model? Wow... I get 400 ms per token with 70B / 65B models, but using llama.cpp.

For 7B models I get 14 ms per token.

mirek190 avatar Aug 10 '23 08:08 mirek190