FastChat
Hardcoded host localhost and port 9090 for a rate monitor
In `gradio_web_server.py`, the host and port of a supposed rate-monitor daemon are hardcoded to `localhost:9090`, but no such daemon exists anywhere in the repository: https://github.com/lm-sys/FastChat/blob/a04072e35d0893e64169c8e3ea153312eb0fe9ac/fastchat/serve/gradio_web_server.py#L391
This, in turn, breaks the use of OpenAI-compatible endpoints registered via the `--register` option with a JSON file.
So, for example, this `openai_compatible_server.json` cannot be served:

```json
{
  "Llama 405": {
    "model_name": "llama3.1:405b",
    "api_type": "openai",
    "api_base": "http://localhost:11434/v1",
    "api_key": "",
    "anony_only": false,
    "recommended_config": {
      "temperature": 0.7,
      "top_p": 1.0
    }
  }
}
```
Every request to such a model fails with a connection-refused error, since nothing is listening on that port:
```
2024-09-19 21:31:47 | INFO | gradio_web_server | bot_response. ip: 127.0.0.1
2024-09-19 21:31:47 | INFO | gradio_web_server | monitor error: HTTPConnectionPool(host='localhost', port=9090): Max retries exceeded with url: /is_limit_reached?model=Llama%20405%20on%20WestAI&user_id=127.0.0.1 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x119061610>: Failed to establish a new connection: [Errno 61] Connection refused'))
```
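One possible fix would be to make the monitor endpoint configurable and fail open when it is unreachable, instead of hardcoding `localhost:9090`. Below is a minimal sketch of that idea; the `MONITOR_URL` environment variable is hypothetical (FastChat has no such setting today), and it uses the stdlib `urllib` rather than `requests` purely to stay self-contained:

```python
import json
import os
import urllib.error
import urllib.parse
import urllib.request


def is_limit_reached(model: str, user_id: str) -> bool:
    """Ask the rate monitor whether this user has hit the limit.

    Fails open: if no monitor is configured, or the daemon is not
    reachable, the request is allowed instead of erroring out.
    MONITOR_URL is a hypothetical setting, e.g. "http://localhost:9090".
    """
    monitor_url = os.environ.get("MONITOR_URL")
    if not monitor_url:
        # No monitor daemon configured -> no rate limiting.
        return False
    query = urllib.parse.urlencode({"model": model, "user_id": user_id})
    try:
        with urllib.request.urlopen(
            f"{monitor_url}/is_limit_reached?{query}", timeout=1
        ) as resp:
            return bool(json.load(resp).get("is_limit_reached", False))
    except (urllib.error.URLError, OSError, ValueError):
        # Monitor unreachable or returned garbage -> fail open.
        return False
```

With this shape, a deployment without the daemon behaves as if rate limiting were disabled, and the registered OpenAI-compatible endpoints keep working.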