LocalAI
docker-compose method could not load rwkv model, in spite of proper folder structure
Hello,
I tried the docker-compose method outlined in the README, and here is the output:
$ docker-compose up -d --build
/usr/lib/python3/dist-packages/paramiko/transport.py:236: CryptographyDeprecationWarning: Blowfish has been deprecated
  "class": algorithms.Blowfish,
Building api
[+] Building 1.6s (13/13) FINISHED
 => [internal] load .dockerignore                                          0.7s
 => => transferring context: 73B                                           0.6s
 => [internal] load build definition from Dockerfile.dev                   0.7s
 => => transferring dockerfile: 352B                                       0.6s
 => [internal] load metadata for docker.io/library/debian:11               0.5s
 => [internal] load metadata for docker.io/library/golang:1.20             0.5s
 => [builder 1/5] FROM docker.io/library/golang:1.20@sha256:eaf12671a7ac51fd23786109c19bd0150c8f894e2672024faac3d14ed4  0.0s
 => [internal] load build context                                          0.4s
 => => transferring context: 5.91kB                                        0.3s
 => [stage-1 1/2] FROM docker.io/library/debian:11@sha256:63d62ae233b588d6b426b7b072d79d1306bfd02a72bff1fc045b8511cc89  0.0s
 => CACHED [builder 2/5] WORKDIR /build                                    0.0s
 => CACHED [builder 3/5] RUN apt-get update && apt-get install -y cmake    0.0s
 => CACHED [builder 4/5] COPY . .                                          0.0s
 => CACHED [builder 5/5] RUN make build                                    0.0s
 => CACHED [stage-1 2/2] COPY --from=builder /build/local-ai /usr/bin/local-ai  0.0s
 => exporting to image                                                     0.0s
 => => exporting layers                                                    0.0s
 => => writing image sha256:3906afc65f2d467953e4aab83a55aaf75498b858a0ebf299bc32f8eed16b4328  0.0s
 => => naming to quay.io/go-skynet/local-ai:latest                         0.0s
Starting localai_api_1 ... done
But then the rwkv models (multiple ones, properly converted) do not load:
$ curl http://172.18.0.2:8080/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "rwkv.cpp-1.5b-11x.bin", "messages": [{"role": "user", "content": "Say this is a test!"}], "temperature": 0.7 }'
{"error":{"code":500,"message":"could not load model - all backends returned error: 5 errors occurred:\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* could not load model\n\n","type":""}}
Please like this issue if you have the same problem.
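To surface which backend actually failed behind that generic 500, tailing the container logs should help; a sketch, assuming the compose service is named api as in the repo's docker-compose.yaml:
$ docker-compose logs -f api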
Hi @bennmann,
Can you show the content of your models directory, and what the /models endpoint returns?
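Something like this should show both (endpoint path per the OpenAI-compatible API):
$ ls -al models/                         # host-side models directory
$ curl http://localhost:8080/v1/models   # what the API itself sees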
Mine built properly, but I'm facing the following error in the docker-compose logs:
api_1 | llama.cpp: loading model from /models/rwkv-1b5-v11
api_1 | error loading model: unknown (magic, version) combination: 67676d66, 00000064; is this really a GGML file?
api_1 | llama_init_from_file: failed to load model
api_1 | gptj_model_load: invalid model file '/models/rwkv-1b5-v11' (bad magic)
api_1 | gptj_bootstrap: failed to load model from '/models/rwkv-1b5-v11'
api_1 | gpt2_model_load: invalid model file '/models/rwkv-1b5-v11' (bad magic)
api_1 | gpt2_bootstrap: failed to load model from '/models/rwkv-1b5-v11'
api_1 | stablelm_model_load: invalid model file '/models/rwkv-1b5-v11' (bad magic)
api_1 | stablelm_bootstrap: failed to load model from '/models/rwkv-1b5-v11'
api_1 | SIGILL: illegal instruction
api_1 | PC=0x8ea6ed m=0 sigcode=2
api_1 | signal arrived during cgo execution
api_1 | instruction bytes: 0xc4 0xe3 0x7d 0x39 0x45 0xa8 0x1 0x48 0x8b 0x43 0xe8 0x48 0x8b 0xbd 0xd8 0x7c
The model above was converted using the following command:
python rwkv/convert_pytorch_to_ggml.py ~/Downloads/RWKV-4-Raven-1B5-v11-Eng99%-Other1%-20230425-ctx4096.pth ~/Downloads/rwkv-1b5-v11 float16
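A quick sanity check on the converted file is to dump its first two 32-bit header words (magic and version); a sketch using od. As far as I can tell, the 67676d66 / 00000064 pair in the log above is the rwkv.cpp magic ('ggmf') and file version 100, so the file itself looks valid; it's the llama backend that is mistakenly probing it first:
$ od -A x -t x4 -N 8 ~/Downloads/rwkv-1b5-v11   # expect 67676d66 00000064 for a good rwkv.cpp conversion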
The following is the output of gpt4all-chat's test_hw on my workstation:
gpt4all hardware test results:
AVX = 1
AVX2 = 0
FMA = 0
SSE3 = 1
your hardware supports the "bare_minimum" version of gpt4all.
Does RWKV require AVX2 to execute?
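For what it's worth, the faulting bytes from the SIGILL dump above can be decoded offline; a sketch with binutils (the disassembly in the comment is a manual decode, worth double-checking):
$ printf '\xc4\xe3\x7d\x39\x45\xa8\x01' > /tmp/insn.bin
$ objdump -D -b binary -m i386:x86-64 /tmp/insn.bin
# should show something like: vextracti128 $0x1,%ymm0,-0x58(%rbp)
If so, that is an AVX2 instruction, which would line up with the AVX2 = 0 result above.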
Can you check latest master? I've also added an example: https://github.com/go-skynet/LocalAI/tree/master/examples/rwkv
I did a git pull, and then similar issues arise:
/LocalAI$ curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "rwkv.cpp-1.5b-11x.bin",
"messages": [{"role": "user", "content": "Say this is a test!"}],
"temperature": 0.7
}'
{"error":{"code":500,"message":"could not load model - all backends returned error: 5 errors occurred:\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* could not load model\n\n","type":""}}
I do get a proper response when querying models:
{"object":"list","data":[{"id":"20B_tokenizer.json","object":"model"},{"id":"RWKV-14B-11x-Q5_1.bin","object":"model"},{"id":"RWKV-4-Raven-7B-v10-Eng99-20230418-ctx8192-cppfp16.bin","object":"model"},{"id":"rwkv.cpp-1.5b-11x.bin","object":"model"},{"id":"rwkv.tokenizer.json"," ```
The models DO begin to load into RAM (watching free -h in a terminal, I can see used space increase), but I still get error 500.
Thanks for any help on next steps.
Can you try the steps in https://github.com/go-skynet/LocalAI/tree/master/examples/rwkv ?
I ran out of disk space and am having a hard time getting my environment stable again... I will report back in some time.
OK, hope this helps someone else too: containers kept stacking up in Docker and ate my disk space away. I had to run:
~$ docker system prune -a -f
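For future reference, a narrower alternative that only clears Docker's build cache, without deleting every unused image:
~$ docker builder prune -f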
Then I was able to remove the larger models and begin another test with only the 1.5B-size model:
docker-compose up -d --build
....
....
=> [builder 5/5] RUN make build 53.3s
=> [stage-1 2/2] COPY --from=builder /build/local-ai /usr/bin/local-ai 0.3s
=> exporting to image 0.2s
=> => exporting layers 0.1s
=> => writing image sha256:bc5368903e72d4a9766857d0418a3057f605d3579c597606fc984c2d28f60afd 0.0s
=> => naming to quay.io/go-skynet/local-ai:latest 0.0s
Creating rwkv_api_1 ... done
But alas, I still get HTTP 500 (the models endpoint still works):
$ curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "rwkv.cpp-1.5b-11x.bin", "messages": [{"role": "user", "content": "Say this is a test!"}], "temperature": 0.7 }'
{"error":{"code":500,"message":"could not load model - all backends returned error: 5 errors occurred:\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* could not load model\n\n","type":""}}
Also, I haven't found the most graceful way to get LocalAI to stop; I just docker kill it to get RAM back:
$ docker kill rwkv_api_1
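A gentler way should be the standard docker-compose commands:
$ docker-compose stop   # stop the containers but keep them around
$ docker-compose down   # stop and remove the containers (and the network)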
I had this problem when I first ran it, and it turned out to be because it doesn't handle symlinks properly. Once I switched from a symlink to just copying the file into the models directory, it was fixed.
Today I tested this way, and confirmed all the folder structures were chown'd to my user.
I also tried the local make build install method and still get an RWKV HTTP 500.
Has anyone gotten past this HTTP 500 issue?
I have the same error: the model can't be loaded, on the latest main version with everything set up correctly (BUILD_TYPE=generic). I tried to run rwkv.cpp directly on the very same model that cannot be loaded with LocalAI, and it worked well... I've tried updating the commit version of go-rwkv.cpp in the Makefile, but that didn't work either.
Did you put the tokenizer file next to the rwkv model? There is an rwkv example over here: https://github.com/go-skynet/LocalAI/tree/master/examples/rwkv
Yes, the tokenizer is there, and it's the same issue.
Did you try to run the example step by step? Can you report the full output logs?
To understand what's going on, I would need to know:
- Version of LocalAI you are using
- What is the content of your model folder, and if you had configured the model with a YAML file, please post it as well
- Full output logs of the API running with --debug, with your steps
- Version of LocalAI I am using: latest, commit 850a690290ac32079efa1e5f779bdd082957d380 (I removed and re-cloned the project to reproduce the example step by step)
- Content of my model folder (the model is configured with the example's YAML file; a reconstructed sketch of it is at the end of this message):
debian@ai3:~/workspace/LocalAI/examples/rwkv$ ls -al models/
total 1205536
drwxr-xr-x 2 debian debian 4096 May 14 13:49 .
drwxr-xr-x 4 debian debian 4096 May 14 13:41 ..
-rw-r--r-- 1 debian debian 296 May 14 13:41 gpt-3.5-turbo.yaml
-rw-r--r-- 1 root root 1231971925 May 14 13:49 rwkv
-rw-r--r-- 1 debian debian 2467981 May 14 13:50 rwkv.tokenizer.json
-rw-r--r-- 1 debian debian 397 May 14 13:41 rwkv_chat.tmpl
-rw-r--r-- 1 debian debian 44 May 14 13:41 rwkv_completion.tmpl
md5sum of rwkv model : 71494609f13616d7fb8e9daa101cefd0 (used model in the example)
- Full output logs of the API running with --debug, with my steps:
Starting LocalAI using 4 threads, with models path: /models
┌───────────────────────────────────────────────────┐
│                   Fiber v2.45.0                   │
│               http://127.0.0.1:8080               │
│       (bound on host 0.0.0.0 and port 8080)       │
│                                                   │
│ Handlers ............ 17  Processes ........... 1 │
│ Prefork ....... Disabled  PID ................. 1 │
└───────────────────────────────────────────────────┘
1:56PM DBG Model: gpt-3.5-turbo (config: {OpenAIRequest:{Model:rwkv File: ResponseFormat: Language: Prompt:<nil> Instruction: Input:<nil> Stop:<nil> Messages:[] Stream:false Echo:false TopP:0.8 TopK:80 Temperature:0.9 Maxtokens:100 N:0 Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 Seed:0} Name:gpt-3.5-turbo StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:1024 F16:false Threads:14 Debug:false Roles:map[assistant:Alice: system:Alice: user:Bob:] Embeddings:false Backend:rwkv TemplateConfig:{Completion:rwkv_completion Chat:rwkv_chat Edit:} MirostatETA:0 MirostatTAU:0 Mirostat:0 PromptStrings:[] InputStrings:[] InputToken:[]})
1:57PM DBG Request received: {"model":"gpt-3.5-turbo","file":"","response_format":"","language":"","prompt":null,"instruction":"","input":null,"stop":null,"messages":[{"role":"user","content":"How are you?"}],"stream":false,"echo":false,"top_p":0.8,"top_k":80,"temperature":0.9,"max_tokens":0,"n":0,"batch":0,"f16":false,"ignore_eos":false,"repeat_penalty":0,"n_keep":0,"mirostat_eta":0,"mirostat_tau":0,"mirostat":0,"seed":0}
1:57PM DBG Parameter Config: &{OpenAIRequest:{Model:rwkv File: ResponseFormat: Language: Prompt:<nil> Instruction: Input:<nil> Stop:<nil> Messages:[] Stream:false Echo:false TopP:0.8 TopK:80 Temperature:0.9 Maxtokens:100 N:0 Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 Seed:0} Name:gpt-3.5-turbo StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:1024 F16:false Threads:14 Debug:true Roles:map[assistant:Alice: system:Alice: user:Bob:] Embeddings:false Backend:rwkv TemplateConfig:{Completion:rwkv_completion Chat:rwkv_chat Edit:} MirostatETA:0 MirostatTAU:0 Mirostat:0 PromptStrings:[] InputStrings:[] InputToken:[]}
1:57PM DBG Template found, input modified to: The following is a verbose detailed conversation between Bob and a woman, Alice. Alice is intelligent, friendly and likeable. Alice is likely to agree with Bob.
Bob: Hello Alice, how are you doing?
Alice: Hi Bob! Thanks, I'm fine. What about you?
Bob: I am very good! It's nice to see you. Would you mind me chatting with you for a while?
Alice: Not at all! I'm listening.
Bob: How are you?
Alice:
1:57PM DBG Loading model in memory from file: /models/rwkv
[172.20.0.1]:49328 500 - POST /v1/chat/completions
PS: looking at the ls -al of models, I thought this could be a permission issue (the rwkv file is owned by root), but after updating permissions I still have the exact same issue.
The only difference with the example is that I used BUILD_TYPE=generic.
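For reference, a sketch of what that gpt-3.5-turbo.yaml presumably contains, reconstructed from the DBG config dump above and the examples/rwkv sample (field names assumed, not copied from my actual file):
$ cat models/gpt-3.5-turbo.yaml
name: gpt-3.5-turbo
parameters:
  model: rwkv
  top_k: 80
  top_p: 0.8
  temperature: 0.9
  max_tokens: 100
context_size: 1024
threads: 14
backend: "rwkv"
roles:
  user: "Bob:"
  system: "Alice:"
  assistant: "Alice:"
template:
  completion: rwkv_completion
  chat: rwkv_chat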
I re-tried it locally, and indeed there seems to be a regression introduced in https://github.com/go-skynet/LocalAI/pull/234. A fix is on its way in https://github.com/go-skynet/LocalAI/pull/255, and I'll tag a patch release afterwards. Thanks for the detective work! This definitely needs more love in the CI to avoid regressions in the future. Please re-open or create other issues if you still have problems with rwkv.
Looks like some filename variables may not be gracefully accounting for the examples/rwkv folder structure now? This is closer, though (using the docker-compose method):
$ docker-compose up -d --build
....
....
#0 6.402 CMake Error: The source "/build/go-bert/bert.cpp/CMakeLists.txt" does not match the source "/media/username/LocalAI/go-bert/bert.cpp/CMakeLists.txt" used to generate cache. Re-run cmake with a different source directory.
#0 6.403 make[1]: *** [Makefile:150: bert.o] Error 1
#0 6.403 make[1]: Leaving directory '/build/go-bert'
#0 6.403 make: *** [Makefile:94: go-bert/libgobert.a] Error 2
------
Dockerfile.dev:9
--------------------
7 | RUN apt-get update && apt-get install -y cmake
8 | COPY . .
9 | >>> RUN make build
10 |
11 | FROM debian:$DEBIAN_VERSION
--------------------
ERROR: failed to solve: process "/bin/sh -c make build" did not complete successfully: exit code: 2
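The "does not match the source ... used to generate cache" error usually means CMake cache files from a host-side build got COPY'd into the image's build context. A cleanup sketch, assuming the host build artifacts can be discarded (the make clean target and the manual cache removal are interchangeable here):
$ make clean                              # drop host-side build artifacts, if the target exists
$ find . -name CMakeCache.txt -delete     # or remove the stale CMake caches directly
$ docker-compose build --no-cache && docker-compose up -d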