Missing --host parameter in deploy.sh for K8S type deployment using release docker image
Bug description
A missing parameter in deploy.sh can prevent the web frontend from reaching the API service on Kubernetes. I have found that adding the following to deploy.sh fixes the issue:
cd api && uvicorn main:app --host 0.0.0.0 --port 9124 --root-path /api/ &
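The `ECONNREFUSED ::1:9124` lines in the log below are consistent with uvicorn's default bind address: without `--host 0.0.0.0` it listens on the IPv4 loopback only, so a client that resolves `localhost` to the IPv6 loopback `::1` is refused. A minimal sketch of that behaviour with plain sockets (OS-assigned port, not the real 9124; assumes a Linux host):

```python
import socket

def can_connect(host, port):
    """Attempt a TCP connection; True on success, False on any OSError."""
    try:
        with socket.create_connection((host, port), timeout=1):
            return True
    except OSError:
        return False

# Bind to IPv4 loopback only, mimicking uvicorn's default host setting.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]

via_ipv4 = can_connect("127.0.0.1", port)  # accepted on IPv4 loopback
via_ipv6 = can_connect("::1", port)        # refused: nothing listens on ::1
print(via_ipv4, via_ipv6)
srv.close()
```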
Here is the error log if needed.
INFO: main initializing models
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:9124 (Press CTRL+C to quit)
INFO: main models are ready
11:26:38 AM [vite] http proxy error at /chats:
Error: connect ECONNREFUSED ::1:9124
at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1532:16)
11:26:38 AM [vite] http proxy error at /chat/420689cd-99de-477e-8ea0-b0ec82f51830:
Error: connect ECONNREFUSED ::1:9124
at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1532:16)
SyntaxError: Unexpected end of JSON input
at JSON.parse (<anonymous>)
at Proxy.eval (/node_modules/@sveltejs/kit/src/runtime/server/page/load_data.js:286:19)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async load (+layout.ts:12:17)
at async Module.load_data (/node_modules/@sveltejs/kit/src/runtime/server/page/load_data.js:162:17)
at async eval (/node_modules/@sveltejs/kit/src/runtime/server/page/index.js:169:13)
SyntaxError: Unexpected token 'I', "Internal S"... is not valid JSON
at JSON.parse (<anonymous>)
at Proxy.eval (/node_modules/@sveltejs/kit/src/runtime/server/page/load_data.js:286:19)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async load (+layout.ts:12:17)
at async Module.load_data (/node_modules/@sveltejs/kit/src/runtime/server/page/load_data.js:162:17)
at async Module.respond_with_error (/node_modules/@sveltejs/kit/src/runtime/server/page/respond_with_error.js:52:17)
at async resolve (/node_modules/@sveltejs/kit/src/runtime/server/respond.js:388:12)
at async Module.respond (/node_modules/@sveltejs/kit/src/runtime/server/respond.js:240:20)
at async file:///usr/src/app/web/node_modules/@sveltejs/kit/src/exports/vite/dev/index.js:505:22
11:27:03 AM [vite] http proxy error at /chats:
Error: connect ECONNREFUSED ::1:9124
at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1532:16)
11:27:03 AM [vite] http proxy error at /models:
Error: connect ECONNREFUSED ::1:9124
at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1532:16)
SyntaxError: Unexpected end of JSON input
at JSON.parse (<anonymous>)
at Proxy.eval (/node_modules/@sveltejs/kit/src/runtime/server/page/load_data.js:286:19)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async load (+layout.ts:12:17)
at async Module.load_data (/node_modules/@sveltejs/kit/src/runtime/server/page/load_data.js:162:17)
at async eval (/node_modules/@sveltejs/kit/src/runtime/server/page/index.js:169:13)
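The `SyntaxError: Unexpected end of JSON input` and `Unexpected token 'I', "Internal S"...` entries above look like secondary failures: when the proxy cannot reach the API, the SvelteKit `load()` receives an empty body or an "Internal Server Error" page instead of JSON, and parsing fails. A Python analogue of the same parse failures:

```python
import json

# Bodies the frontend can receive when the API is unreachable: an empty
# response, or a plain-text error page instead of the expected JSON.
results = []
for body in ("", "Internal Server Error"):
    try:
        json.loads(body)
        results.append(None)
    except json.JSONDecodeError as exc:
        results.append(exc.msg)   # both fail before any value is parsed
print(results)   # ['Expecting value', 'Expecting value']
```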
This should not have any impact on the deployment on Docker.
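The reason the change is safe for Docker while fixing K8S is the wildcard bind: an IPv4 socket bound to `0.0.0.0` accepts connections on every interface of the host or pod, loopback included. A minimal sketch (OS-assigned port; reachability verified here via loopback only):

```python
import socket

# Wildcard bind, as in the patched deploy.sh (uvicorn --host 0.0.0.0).
# The listener answers on every IPv4 interface; this demo checks loopback.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("0.0.0.0", 0))          # port 0: let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]

with socket.create_connection(("127.0.0.1", port), timeout=1):
    accepted = True               # connection accepted via loopback
print(accepted)
srv.close()
```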
Steps to reproduce
- Create the pod and service:
  kubectl run serge-dev --image=ghcr.io/nsarrazin/serge:release --port='8008' --port='9124' --expose=true
  service/serge-dev created
  pod/serge-dev created
- Open Firefox and browse to the service IP that exposes the web service.
- Observe an HTTP 500 error.
Environment Information
OS: Rocky Linux 8.7
Kubernetes Version: 1.25.6
Browser: Firefox 111.0.1
Screenshots
No response
Relevant log output
No response
Confirmations
- [X] I'm running the latest version of the main branch.
- [X] I checked existing issues to see if this has already been described.
I have also provided the fix (see https://github.com/nsarrazin/serge/pull/71); it shouldn't impact other deployments. Once this is merged, a new Docker image can be built and pushed to the repository, making it deployable on both Docker and K8S.
@FenarkSEC Can this be closed now that #71 was merged?
I'm currently testing the newly published image. I'll report back in a few minutes.
The newly pushed image returns an error in the chat (see below); building the image locally and using it fixes the original issue and works.
main: seed = 1679851058
llama_model_load: loading model from '/usr/src/app/weights/ggml-alpaca-7B-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx = 512
llama_model_load: n_embd = 4096
llama_model_load: n_mult = 256
llama_model_load: n_head = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot = 128
llama_model_load: f16 = 2
llama_model_load: n_ff = 11008
llama_model_load: n_parts = 1
https://github.com/nsarrazin/serge/commit/09471f2346177de3b0cc46dcc9943647ab302931 has fixed the issue with the newly built image; this can be closed.