
Hardcoded host localhost and port 9090 for a rate monitor

Open surak opened this issue 1 year ago • 2 comments

In gradio_web_server.py, the host and port of a supposed monitor daemon are hardcoded, but no such daemon exists: https://github.com/lm-sys/FastChat/blob/a04072e35d0893e64169c8e3ea153312eb0fe9ac/fastchat/serve/gradio_web_server.py#L391
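Judging from the request URL in the error log below, the hardcoded check builds a URL roughly like this (a reconstruction for illustration, not the actual FastChat code; the function name is hypothetical):

```python
from urllib.parse import quote


def monitor_url(model_name: str, ip: str) -> str:
    # Host localhost and port 9090 are hardcoded, matching the error log.
    return (
        "http://localhost:9090/is_limit_reached"
        f"?model={quote(model_name)}&user_id={quote(ip)}"
    )
```

With no daemon listening on localhost:9090, any GET against this URL fails.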

This, in turn, breaks the use of OpenAI-compatible endpoints registered from a JSON file via --register.

So, for example, this openai_compatible_server.json cannot be used:

{
  "Llama 405": {
    "model_name": "llama3.1:405b",
    "api_type": "openai",
    "api_base": "http://localhost:11434/v1",
    "api_key": "",
    "anony_only": false,
    "recommended_config": {
      "temperature": 0.7,
      "top_p": 1.0
    }
  }
}

Because the request will always fail with CONNECTION REFUSED, since no such monitor daemon is running:

2024-09-19 21:31:47 | INFO | gradio_web_server | bot_response. ip: 127.0.0.1
2024-09-19 21:31:47 | INFO | gradio_web_server | monitor error: HTTPConnectionPool(host='localhost', port=9090): Max retries exceeded with url: /is_limit_reached?model=Llama%20405%20on%20WestAI&user_id=127.0.0.1 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x119061610>: Failed to establish a new connection: [Errno 61] Connection refused'))
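A minimal sketch of a possible fix: read the monitor host and port from environment variables instead of hardcoding them, and treat an unreachable daemon as "not rate-limited" so serving continues. The variable names and the `is_limit_reached` response field are assumptions based on the URL in the log above, not FastChat's actual configuration:

```python
import json
import os
import urllib.error
import urllib.parse
import urllib.request


def is_limit_reached(model: str, user_id: str) -> bool:
    # Hypothetical env vars; defaults mirror the currently hardcoded values.
    host = os.environ.get("MONITOR_HOST", "localhost")
    port = os.environ.get("MONITOR_PORT", "9090")
    query = urllib.parse.urlencode({"model": model, "user_id": user_id})
    url = f"http://{host}:{port}/is_limit_reached?{query}"
    try:
        with urllib.request.urlopen(url, timeout=1) as resp:
            return bool(json.load(resp).get("is_limit_reached", False))
    except (urllib.error.URLError, OSError):
        # Monitor daemon unreachable: fail open instead of breaking the bot
        # response, so registered OpenAI endpoints still work without it.
        return False
```

With this guard, the CONNECTION REFUSED above would be swallowed and the registered endpoint would answer normally.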

surak avatar Sep 20 '24 08:09 surak