
llama does work while alpaca does not (bad magic error)

suoko opened this issue 1 year ago · 8 comments

If I try to install alpaca instead of llama, I get a bad magic error when running it, whereas llama.cpp from the releases at https://github.com/ggerganov/llama.cpp/releases loads the same alpaca model file with no issue. I'm now trying to make llama.cpp work from the browser.

suoko avatar May 06 '23 04:05 suoko

Same for me.

theproj3ct avatar May 07 '23 09:05 theproj3ct

Apparently they now "Avoid unnecessary bit shuffling by packing the quants in a better way. Requires model re-quantization". So that's the problem.
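For context, a "bad magic" error means the loader rejected the file's header before reading any weights. The sketch below shows, under stated assumptions, how such a header can be inspected: the magic constants (`ggml`, `ggmf`, `ggjt`) and the GGJT v1 → v2 version bump that accompanied the bit-shuffling removal are taken from the llama.cpp history of that era, and the function name is mine, not part of any project's API.

```python
# Sketch: classify a ggml-era model file by its 4-byte magic and (where
# present) 4-byte version, both little-endian. A stale magic or version
# is what newer loaders report as "bad magic" / unsupported file.
import struct

MAGICS = {
    0x67676D6C: "ggml (unversioned, oldest)",
    0x67676D66: "ggmf (versioned)",
    0x67676A74: "ggjt (versioned, mmap-able)",
}

def inspect_header(path):
    """Return a short description of the model file's header."""
    with open(path, "rb") as f:
        magic, = struct.unpack("<I", f.read(4))
        if magic not in MAGICS:
            return "bad magic: 0x%08x" % magic
        kind = MAGICS[magic]
        if magic == 0x67676D6C:
            return kind  # oldest format carries no version field
        version, = struct.unpack("<I", f.read(4))
        return "%s, version %d" % (kind, version)
```

On this reading, the old alpaca files dalai shipped would show up as unversioned `ggml` or `ggjt` version 1, while a llama.cpp build after the packing change expects `ggjt` version 2, which is why the same file loads in one tool and fails in the other.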

zephyrprime avatar May 09 '23 04:05 zephyrprime

How do we re-quantize it?

thestumonkey avatar May 09 '23 08:05 thestumonkey

Beats me. I don't think it's really desirable either, because you would basically have to downgrade the encoding of every future llama or alpaca model to keep it working with this codebase. I switched to a different llama codebase I found on GitHub that is kept up to date and got it to work.

zephyrprime avatar May 09 '23 15:05 zephyrprime

It seems like it's going to take ongoing effort for this API project to chase and maintain compatibility with this academic-style code. Portability and consistency probably aren't priorities when doing research.

thorsteinssonh avatar May 13 '23 15:05 thorsteinssonh

Go to llama.cpp or koboldcpp... this project is too obsolete.

mirek190 avatar May 14 '23 17:05 mirek190

I also encountered the bad magic error, but after I changed the model to LLaMA there was no problem, so I guess it's a model problem. (screenshot attached)

junxian428 avatar Aug 10 '23 07:08 junxian428

1200 ms per token for a 7B model? Wow... I get 400 ms per token with 70B / 65B models, but using llama.cpp.

For 7B models I get 14 ms per token.

mirek190 avatar Aug 10 '23 08:08 mirek190