
Running in server mode does not return any response

Open anilkvp opened this issue 7 months ago • 1 comments

I bring up the LLM in server mode with the command:

python run_inference_server.py -m <model> --host 0.0.0.0 --port 5000

Then I connect to the server using the API endpoint

http://localhost:5000/completion

with payload

{"prompt": "<prompt>"}

The server receives the request and starts generating tokens, but it keeps generating in what looks like an infinite loop and never returns a response.

anilkvp avatar May 20 '25 15:05 anilkvp

How about including headers in the request?

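` headers = {"Content-Type": "application/json"}

    try:
        print(payload)
        # timeout=20 makes the client give up instead of waiting forever
        response = requests.post(self.server_url, headers=headers, json=payload, timeout=20)
    except requests.RequestException as exc:
        print(exc)`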

This works for me.

g7199 avatar Jun 05 '25 06:06 g7199
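
For anyone who wants to try the suggestion above outside of a class, here is a minimal, self-contained sketch of the same request. It uses only the Python standard library rather than `requests`. The helper names `build_request` and `post_completion` are made up for illustration. The `n_predict` field is an assumption: it is the token-limit field in llama.cpp's `/completion` API, and it only has an effect here if this server follows that API.

```python
import json
import urllib.request


def build_request(url, prompt, n_predict=None):
    """Build a POST request with an explicit JSON Content-Type header."""
    payload = {"prompt": prompt}
    if n_predict is not None:
        # Assumption: the server accepts llama.cpp's "n_predict" field,
        # which caps the number of generated tokens.
        payload["n_predict"] = n_predict
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_completion(url, prompt, n_predict=None, timeout=20):
    """Send the request and return the parsed JSON response.

    The timeout makes the client raise an exception instead of
    blocking forever if the server never answers.
    """
    request = build_request(url, prompt, n_predict)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from the original report running, `post_completion("http://localhost:5000/completion", "<prompt>", n_predict=64)` should either return the decoded JSON body or raise once the 20 second timeout expires.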