[Bug]: ollama_chat with yi-coder and proxy results in error: LLM Provider NOT provided
What happened?
Set up a new model like this:
- model_name: 'yi-coder'
  litellm_params:
    model: 'ollama_chat/yi-coder:latest'
    provider: ollama_chat
    api_base: 'https://your-ollama-server-url'
Trying to chat with this model via curl results in this error:
litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=yi-coder Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)`
Relevant log output
Are you a ML Ops Team?
No
What LiteLLM version are you on?
v1.53.7
Twitter / LinkedIn details
No response
provider: ollama_chat
there's no field called 'provider'
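For reference, LiteLLM infers the provider from the model prefix (ollama_chat/ here), not from a separate field, so a minimal sketch of the same entry without the unrecognized 'provider' key would look like this (the api_base stays a placeholder):

- model_name: 'yi-coder'
  litellm_params:
    model: 'ollama_chat/yi-coder:latest'
    api_base: 'https://your-ollama-server-url'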
Unable to repro. this works as expected. Please share the full config + request being made to proxy, for repro.
- Here is a minimal config.yaml file:
general_settings:
  master_key: sk_12345678
  litellm_salt_key: sk_87654321
  database_url: 'postgresql://litellm:[email protected]:5432/litellm'
litellm_settings:
  json_logs: true
  set_verbose: false
  cache: false
model_list:
  - model_name: 'yi-coder'
    litellm_params:
      model: 'ollama_chat/yi-coder:latest'
      api_base: 'http://localhost:11434'
- And a curl request:
curl --location 'http://127.0.0.1:4000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data-raw '{
  "stream": false,
  "model": "yi-coder",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful coding assistant"
    },
    {
      "role": "user",
      "content": "A hello world code in your preferred language... you choose!"
    }
  ],
  "num_retries": 0,
  "request_timeout": 3
}'
- And finally the output/error:
{
  "error": {
    "message": "litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=yi-coder\n Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)` Learn more: https://docs.litellm.ai/docs/providersNo fallback model group found for original model_group=yi-coder.",
    "type": null,
    "param": null,
    "code": "400"
  }
}
NOTE: This error also happens with other Ollama models like "granite3-dense:8b" and "granite-code:8b".
Please share your complete debug logs - got by running litellm --detailed_debug.
Hello. I found that disabling the "enable_pre_call_checks" option (setting it to false) makes the problem go away. Here are some relevant logs (a config sketch for the workaround follows after the logs):
{"message": "retrieve cooldown models: []", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.262376"}
{"message": "async cooldown deployments: []", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.262387"}
{"message": "cooldown_deployments: []", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.262395"}
{"message": "cooldown deployments: []", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.262403"}
{"message": "Starting Pre-call checks for deployments in model=yi-coder", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.262412"}
{"message": "token_counter messages received: [{'role': 'user', 'content': 'Tell me a joke'}]", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.262435"}
{"message": "Token Counter - using generic token counter, for model=", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.271864"}
{"message": "LiteLLM: Utils - Counting tokens for OpenAI model=gpt-3.5-turbo", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.271893"}
{"message": "An error occurs - OllamaError: Error getting model info for yi-coder:latest. Set Ollama API Base via `OLLAMA_API_BASE` environment variable. Error: [Errno 111] Connection refused", "level": "ERROR", "timestamp": "2024-12-08T19:48:46.272255", "stacktrace": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py\", line 69, in map_httpcore_exceptions\n yield\n File \"/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py\", line 233, in handle_request\n resp = self._pool.handle_request(req)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py\", line 256, in handle_request\n raise exc from None\n File \"/usr/local/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py\", line 236, in handle_request\n response = connection.handle_request(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/httpcore/_sync/connection.py\", line 101, in handle_request\n raise exc\n File \"/usr/local/lib/python3.11/site-packages/httpcore/_sync/connection.py\", line 78, in handle_request\n stream = self._connect(request)\n ^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/httpcore/_sync/connection.py\", line 124, in _connect\n stream = self._network_backend.connect_tcp(**kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/httpcore/_backends/sync.py\", line 207, in connect_tcp\n with map_exceptions(exc_map):\n File \"/usr/local/lib/python3.11/contextlib.py\", line 158, in __exit__\n self.gen.throw(typ, value, traceback)\n File \"/usr/local/lib/python3.11/site-packages/httpcore/_exceptions.py\", line 14, in map_exceptions\n raise to_exc(exc) from exc\nhttpcore.ConnectError: [Errno 111] Connection refused\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n File \"/usr/local/lib/python3.11/site-packages/litellm/llms/ollama.py\", line 218, in get_model_info\n response = litellm.module_level_client.post(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/litellm/llms/custom_httpx/http_handler.py\", line 474, in post\n raise e\n File \"/usr/local/lib/python3.11/site-packages/litellm/llms/custom_httpx/http_handler.py\", line 451, in post\n response = self.client.send(req, stream=stream)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/httpx/_client.py\", line 914, in send\n response = self._send_handling_auth(\n ^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/httpx/_client.py\", line 942, in _send_handling_auth\n response = self._send_handling_redirects(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/httpx/_client.py\", line 979, in _send_handling_redirects\n response = self._send_single_request(request)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/httpx/_client.py\", line 1015, in _send_single_request\n response = transport.handle_request(request)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py\", line 232, in handle_request\n with map_httpcore_exceptions():\n File \"/usr/local/lib/python3.11/contextlib.py\", line 158, in __exit__\n self.gen.throw(typ, value, traceback)\n File \"/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py\", line 86, in 
map_httpcore_exceptions\n raise mapped_exc(message) from exc\nhttpx.ConnectError: [Errno 111] Connection refused\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/usr/local/lib/python3.11/site-packages/litellm/router.py\", line 5017, in _pre_call_checks\n model_info = self.get_router_model_info(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/litellm/router.py\", line 4277, in get_router_model_info\n model_info = litellm.get_model_info(model=model_info_name)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/litellm/utils.py\", line 4699, in get_model_info\n raise e\n File \"/usr/local/lib/python3.11/site-packages/litellm/utils.py\", line 4542, in get_model_info\n return litellm.OllamaConfig().get_model_info(model)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/litellm/llms/ollama.py\", line 223, in get_model_info\n raise Exception(\nException: OllamaError: Error getting model info for yi-coder:latest. Set Ollama API Base via `OLLAMA_API_BASE` environment variable. Error: [Errno 111] Connection refused"}
{"message": "Logging Details LiteLLM-Failure Call: [<bound method Router.deployment_callback_on_failure of <litellm.router.Router object at 0x7fecc4ba5e90>>, <litellm.router_strategy.lowest_tpm_rpm_v2.LowestTPMLoggingHandler_v2 object at 0x7fecc5c0add0>, <litellm.proxy.hooks.parallel_request_limiter._PROXY_MaxParallelRequestsHandler object at 0x7fecc79ba7d0>, <litellm.proxy.hooks.max_budget_limiter._PROXY_MaxBudgetLimiter object at 0x7fecc778f410>, <litellm.proxy.hooks.cache_control_check._PROXY_CacheControlCheck object at 0x7fecc6d00990>, <litellm._service_logger.ServiceLogging object at 0x7fecc83e4e10>, <litellm.integrations.langfuse.langfuse_prompt_management.LangfusePromptManagement object at 0x7fecc35aec90>]", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.275080"}
{"message": "litellm.acompletion(model=None)\u001b[31m Exception litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=yi-coder\n Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)` Learn more: https://docs.litellm.ai/docs/providers\u001b[0m", "level": "INFO", "timestamp": "2024-12-08T19:48:46.275138"}
{"message": "Model= is not mapped in model cost map. Defaulting to None model_cost_information for standard_logging_payload", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.277551"}
{"message": "initial list of deployments: [{'model_name': 'yi-coder', 'litellm_params': {'rpm': 60, 'api_key': 'redacted', 'api_base': 'https://redacted.app', 'input_cost_per_token': 0.0, 'output_cost_per_token': 0.0, 'model': 'ollama_chat/yi-coder:latest'}, 'model_info': {'id': 'e7e7d78adb12465b57051b6c9bfeadaf1bcd6c363ffb10fd819f689c8e5ccaf6', 'db_model': False, 'input_cost_per_token': 0.0, 'output_cost_per_token': 0.0}}, {'model_name': 'yi-coder', 'litellm_params': {'rpm': 45, 'api_key': 'redacted', 'api_base': 'https://redacted.app', 'input_cost_per_token': 0.0, 'output_cost_per_token': 0.0, 'model': 'ollama_chat/yi-coder:latest'}, 'model_info': {'id': '6df83268c1489f86c949cc1a654598d9bb339eb838b19cb917000896e96a77fe', 'db_model': False, 'input_cost_per_token': 0.0, 'output_cost_per_token': 0.0}}]", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.277698"}
{"message": "retrieve cooldown models: []", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.278024"}
{"message": "TracebackTraceback (most recent call last):\n File \"/usr/local/lib/python3.11/site-packages/litellm/router.py\", line 2654, in async_function_with_fallbacks\n response = await self.async_function_with_retries(*args, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/litellm/router.py\", line 2902, in async_function_with_retries\n response = await self.make_call(original_function, *args, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/litellm/router.py\", line 3001, in make_call\n response = await response\n ^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/litellm/router.py\", line 968, in _acompletion\n raise e\n File \"/usr/local/lib/python3.11/site-packages/litellm/router.py\", line 856, in _acompletion\n deployment = await self.async_get_available_deployment(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/litellm/router.py\", line 5440, in async_get_available_deployment\n raise e\n File \"/usr/local/lib/python3.11/site-packages/litellm/router.py\", line 5313, in async_get_available_deployment\n healthy_deployments = self._pre_call_checks(\n ^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/litellm/router.py\", line 5094, in _pre_call_checks\n model, custom_llm_provider, _, _ = litellm.get_llm_provider(\n ^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/litellm/litellm_core_utils/get_llm_provider_logic.py\", line 316, in get_llm_provider\n raise e\n File \"/usr/local/lib/python3.11/site-packages/litellm/litellm_core_utils/get_llm_provider_logic.py\", line 293, in get_llm_provider\n raise litellm.exceptions.BadRequestError( # type: ignore\nlitellm.exceptions.BadRequestError: litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=yi-coder\n Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)` Learn more: https://docs.litellm.ai/docs/providers\n", "level": "DEBUG", "timestamp": "2024-12-08T19:48:46.278542"}
{"message": "Trying to fallback b/w models", "level": "INFO", "timestamp": "2024-12-08T19:48:46.278567"}
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.