BitNet
Running in server mode does not return any response
Bring up the LLM in server mode with the command
python run_inference_server.py -m <model> --host 0.0.0.0 --port 5000
When I connect to the server using the API endpoint
http://localhost:5000/completion
with payload
{"prompt": "<prompt>"}
The server receives the request and starts generating tokens, but it keeps generating in what looks like an infinite loop and never returns a response.
How about including headers?
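`headers = {"Content-Type": "application/json"}
try:
    print(payload)  # log the outgoing payload for debugging
    response = requests.post(self.server_url, headers=headers, json=payload, timeout=20)
except requests.RequestException as exc:
    print(f"request failed: {exc}")`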
This works for me.
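For anyone who wants to try this outside a larger client class, here is a minimal standalone sketch of the same idea: an explicit JSON `Content-Type` header plus a client-side timeout. It uses only the standard library (`urllib`) instead of `requests`, so it runs without extra packages. The names `SERVER_URL`, `build_request`, and `complete` are made up for illustration. The `n_predict` field is an assumption: llama.cpp-style `/completion` servers accept it to cap the number of generated tokens, but I have not confirmed that this server honors it, so drop it if the request is rejected.

```python
import json
import urllib.request

# Endpoint from the report above; adjust host and port to your setup.
SERVER_URL = "http://localhost:5000/completion"


def build_request(prompt, n_predict=64, url=SERVER_URL):
    """Build a POST request with an explicit JSON Content-Type header."""
    # "n_predict" is an assumption (llama.cpp-style token cap), not something
    # confirmed for this server. Remove it if the server rejects the field.
    payload = {"prompt": prompt, "n_predict": n_predict}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, timeout=20):
    """Send the request and return the decoded JSON response."""
    req = build_request(prompt)
    # The timeout makes blocking socket operations give up after `timeout`
    # seconds instead of waiting indefinitely on a server that never answers.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, calling `complete("Hello")` should either return the parsed JSON response or raise an exception once the timeout expires. Note that `requests.post(..., json=payload)` already sets `Content-Type: application/json` on its own, so if the headers alone do not change anything for you, bounding the generation length may be the part that matters.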