
Whatever I try, no model loads

simteraplications opened this issue 1 year ago • 22 comments

I downloaded the models from the link provided on the version 1.05 release page, but whatever I try, it always says "couldn't load model." I use the ggml-model-q4_0.bin file, but nothing loads. I tried Windows and Mac. It doesn't give me a proper error message; it just says "couldn't load model."

simteraplications · Apr 22 '23 00:04

You need to download the q4_1 file, not q4_0.

ItsPi3141 · Apr 22 '23 03:04
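
For anyone unsure which quantization a downloaded file actually uses, here is a minimal Node.js/TypeScript sketch that reads the ggml header. It assumes the ggjt v1 layout shown in the llama.cpp log later in this thread (a 4-byte magic, a 4-byte version, then seven uint32 hyperparameters ending with ftype); the file path is a placeholder, and older "ggml"-magic files lack the version field, so their offsets differ.

// check-model.ts: print the ggml magic and ftype of a model file.
// Assumes the ggjt v1 header layout; ftype 2 = mostly Q4_0,
// ftype 3 = mostly Q4_1 (matching "ftype = 3 (mostly Q4_1)" in the log below).
import { openSync, readSync, closeSync } from "fs";

const path = process.argv[2] ?? "ggml-model-q4_1.bin"; // placeholder path

const buf = Buffer.alloc(36);
const fd = openSync(path, "r");
readSync(fd, buf, 0, 36, 0);
closeSync(fd);

const magic = buf.readUInt32LE(0);  // 0x67676a74 spells "ggjt"
const version = buf.readUInt32LE(4);
const ftype = buf.readUInt32LE(32); // 7th hparam, after n_vocab..n_rot

console.log(`magic=0x${magic.toString(16)}, version=${version}, ftype=${ftype}`);
if (ftype === 3) console.log("This looks like a q4_1 model.");
else if (ftype === 2) console.log("This looks like a q4_0 model.");
else console.log("Other or older format.");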

I used the following link: https://huggingface.co/Pi3141/alpaca-7b-native-enhanced/blob/main/ggml-model-q4_1.bin It doesn't work; it just says it can't load.

simteraplications · Apr 22 '23 14:04

I tried so many models and they either fail to load or never write anything at all. I used Kobold and the models work fine there, so I don't know what I'm doing wrong. I like this tool a lot, but it has never actually worked for me.

Where exactly did you get the models from?

ItsPi3141 · Apr 24 '23 14:04

From the link on the releases page: https://huggingface.co/Pi3141

And you're using q4_1, right?

ItsPi3141 · Apr 24 '23 18:04

I tried this one: https://huggingface.co/Pi3141/gpt4-x-alpaca-native-13B-ggml/blob/main/ggml-model-q4_1.bin

Can you try Alpaca native enhanced? https://huggingface.co/Pi3141/alpaca-7b-native-enhanced

ItsPi3141 · Apr 24 '23 18:04

Maybe you can share the terminal log if you are using Mac or Linux; that would make things clearer.

penghe2021 · Apr 24 '23 18:04

That one works. I guess it's just really slow? It also doesn't seem to take into account the other stuff running on my PC: it's running at 100%, my music now has little skips in the audio, and my PC is unstable.

I don't remember the Kobold UI being this extreme; I could multitask with other stuff. Kobold also shows the tokens being read in real time, which was really good feedback that it was doing something, but with Alpaca Electron I can't tell if the window is stuck or actually doing something. I really wish there were some text down here that said "Processing Characters: 1 of 5000"

or something like that. It would improve the usability by 200%.

This is just kind of annoying to look at, and it doesn't tell me anything; it just made me assume it was frozen.

I'll consider adding the characters-processed counter. Most of this is down to llama.cpp, though; I have no control over the CPU usage. I'm just making the frontend for it.

ItsPi3141 · Apr 24 '23 19:04
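
One knob a frontend can control is how many threads it asks llama.cpp to use, which determines how much of the CPU it saturates. Below is a hedged sketch, assuming the app spawns the llama.cpp chat binary and that the build in use supports the standard --threads flag; the binary and model paths are placeholders, not alpaca-electron's actual paths.

// spawn-llama.ts: leave a couple of cores free so the rest of the system stays responsive.
import { spawn } from "child_process";
import os from "os";

const chatBinaryPath = "./main";           // placeholder: path to the llama.cpp binary
const modelPath = "./ggml-model-q4_1.bin"; // placeholder: path to the model

// Use all cores minus two (but at least one) instead of saturating the CPU.
const threads = Math.max(1, os.cpus().length - 2);

const proc = spawn(chatBinaryPath, [
  "--model", modelPath,
  "--threads", String(threads),
  "--interactive",
]);

proc.stdout.on("data", (chunk: Buffer) => process.stdout.write(chunk));
proc.stderr.on("data", (chunk: Buffer) => process.stderr.write(chunk));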

I think you should add it, or you are going to get more people reporting the models as broken.

Actually, I can't. llama.cpp doesn't show how many tokens of the prompt have been processed.

What I'll do to cut down on reports that the model is broken is make it a rule that people cannot open an issue unless they have waited at least an hour for a response from the model, to make sure it's not just their computer.

Because if a model can't be loaded, the app will notify you. It only freezes in rare edge cases.

ItsPi3141 · Apr 24 '23 20:04
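
Even without a token count from llama.cpp, a frontend can at least distinguish "still processing the prompt" from "generating" by watching for the first byte of output, and show elapsed time in the meantime. A minimal sketch; setStatusText is a hypothetical UI hook, not part of alpaca-electron.

// feedback.ts: show elapsed time while the prompt is being processed,
// then a live character count once generation starts.
import { ChildProcess } from "child_process";

declare function setStatusText(text: string): void; // hypothetical renderer hook

function watchProgress(proc: ChildProcess): void {
  const started = Date.now();
  let generatedChars = 0;

  // Until the first token arrives, all we know is that the process is alive.
  const ticker = setInterval(() => {
    if (generatedChars === 0) {
      const secs = Math.round((Date.now() - started) / 1000);
      setStatusText(`Processing prompt... ${secs}s elapsed`);
    }
  }, 1000);

  proc.stdout?.on("data", (chunk: Buffer) => {
    generatedChars += chunk.length;
    setStatusText(`Generating... ${generatedChars} characters so far`);
  });

  proc.on("exit", () => clearInterval(ticker));
}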

I tried all these models and none of them work; everything just says "couldn't load model." How do I find the terminal logs? I am using the macOS arm64 build.

simteraplications · Apr 24 '23 20:04

Bruh, nobody is ever going to wait one hour. They will just find another tool.

Yeah, good luck to them finding a different tool that's faster than llama.cpp. If it takes that long for llama.cpp to run, their CPU spec is probably not good, so it would also make sense that they wouldn't have a GPU, or that the GPU wouldn't be powerful enough.

ItsPi3141 · Apr 24 '23 22:04

Where can I find the terminal logs on Mac?

simteraplications · Apr 24 '23 22:04

Where can I find the terminal logs on Mac?

Sorry, I haven't tested it on Mac. I just assumed that when you run the command in a terminal, it would display some info like this:


llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 2048
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 3 (mostly Q4_1)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =  59.11 KB
llama_model_load_internal: mem required  = 6612.57 MB (+ 1026.00 MB per state)

llama_init_from_file: kv self size  = 1024.00 MB

system_info: n_threads = 8 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | 
main: interactive mode on.
Reverse prompt: 'User:'
Reverse prompt: '### Instruction:

penghe2021 · Apr 24 '23 22:04

Sorry, I haven't tested it on Mac. I just assumed that when you run the command in a terminal, it would display some info like this: [log quoted above]

That's normal, it's loading the model. Give it some time.

ItsPi3141 · Apr 24 '23 23:04
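
Since the packaged app doesn't surface this output anywhere, one option is for the frontend to mirror the child process's stdout and stderr to a log file that Mac users can attach to bug reports. A minimal sketch with placeholder paths; a real Electron app would likely use app.getPath("logs") instead of the home directory.

// log-capture.ts: mirror the llama.cpp child process output to a log file.
import { spawn } from "child_process";
import { createWriteStream } from "fs";
import { join } from "path";
import os from "os";

// Placeholder location for illustration only.
const logFile = createWriteStream(join(os.homedir(), "alpaca-electron.log"), { flags: "a" });

const proc = spawn("./main", ["--model", "./ggml-model-q4_1.bin"]); // placeholder paths

proc.stdout.on("data", (chunk: Buffer) => logFile.write(chunk));
proc.stderr.on("data", (chunk: Buffer) => logFile.write(chunk)); // llama.cpp prints load info to stderr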

Hey, I had the same problem on Linux (Fedora Silverblue 38). I tried compiling it myself, and then it worked! I'm also guessing this is the same issue as: https://github.com/ItsPi3141/alpaca-electron/issues/24 https://github.com/ItsPi3141/alpaca-electron/issues/51

skidd-level-100 · Apr 29 '23 20:04

From the link on the releases page: https://huggingface.co/Pi3141

And you're using q4_1, right?

What's the difference between q4_1.bin, q4_2.bin, q4_3.bin, etc.?

tinfoil-hat-net · May 14 '23 10:05
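
For reference: these suffixes name different 4-bit ggml quantization schemes. As far as I know, q4_0 stores one scale factor per block of weights, q4_1 adds a per-block offset (slightly larger files, slightly better quality), and q4_2/q4_3 were short-lived variants that llama.cpp later removed. The ftype codes that llama.cpp prints (see "ftype = 3 (mostly Q4_1)" in the log above) mapped roughly as follows around this time; treat the exact values as an assumption, since the enum changed between versions.

// ftype codes as found in llama.h circa spring 2023 (assumed; later versions differ).
const ftypeNames: Record<number, string> = {
  0: "all F32",
  1: "mostly F16",
  2: "mostly Q4_0",
  3: "mostly Q4_1",
  4: "mostly Q4_1, some F16",
  5: "mostly Q4_2",
  6: "mostly Q4_3",
};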