[Bug]: custom model timeout does not work
What happened?
We have these settings set:
```yaml
litellm_settings:
  callbacks: callback.handler
  drop_params: true
  request_timeout: 120
```
We are observing timeouts for one of our deployments, so I have added a timeout just for it:
```yaml
- model_name: "o1-mini"
  litellm_params:
    model: "azure/o1-mini"
    # ... Removed configs here ...
    timeout: 180
```
But we are still observing timeouts at 120s rather than 180s, which is unexpected and unwelcome; we are trying to avoid raising the timeout for all deployments.
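For reference, the precedence the config above expects is that a deployment-level `timeout` in `litellm_params` overrides the global `litellm_settings.request_timeout`. A minimal sketch of that lookup (illustrative only; the names below are hypothetical, not litellm internals):

```python
# Illustrative sketch of the expected timeout precedence.
# GLOBAL_SETTINGS / DEPLOYMENTS are stand-ins, not litellm's real structures.
GLOBAL_SETTINGS = {"request_timeout": 120}

DEPLOYMENTS = {
    "o1-mini": {"timeout": 180},  # per-deployment override
    "gpt-4o": {},                 # no override -> falls back to the global value
}

def effective_timeout(model_name: str) -> int:
    """Return the deployment's own timeout if set, else the global default."""
    params = DEPLOYMENTS.get(model_name, {})
    return params.get("timeout", GLOBAL_SETTINGS["request_timeout"])

print(effective_timeout("o1-mini"))  # 180
print(effective_timeout("gpt-4o"))   # 120
```

Under this precedence, only `o1-mini` gets the raised 180s timeout; every other deployment keeps the global 120s.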
Relevant log output
No response
Twitter / LinkedIn details
https://www.linkedin.com/in/jeromeroussin/
@Jerome Roussin unable to repro the issue - it looks like it's sending the correct timeout value for me. Are you on the latest litellm version? With this config, I could see 180 being set as the timeout for my request:
```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/chatgpt-v-2
      api_base: https://openai-gpt-4-test-v-1.openai.azure.com/
      api_version: "2023-05-15"
      api_key: os.environ/AZURE_API_KEY # The `os.environ/` prefix tells litellm to read this from the env. See https://docs.litellm.ai/docs/simple_proxy#load-api-keys-from-vault
      timeout: 180

litellm_settings:
  drop_params: true
  request_timeout: 120
```
Addressed in a later bug report: https://github.com/BerriAI/litellm/issues/7001
The issue still exists in 1.57.3. The `timeout` value of our models is not being respected, and we're seeing timeouts at the default 6000s value.
A stacktrace of a 6000s timeout:
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 72, in map_httpcore_exceptions
    yield
  File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 377, in handle_async_request
    resp = await self._pool.handle_async_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpcore/_async/connection_pool.py", line 256, in handle_async_request
    raise exc from None
  File "/usr/local/lib/python3.12/site-packages/httpcore/_async/connection_pool.py", line 236, in handle_async_request
    response = await connection.handle_async_request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpcore/_async/connection.py", line 103, in handle_async_request
    return await self._connection.handle_async_request(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpcore/_async/http11.py", line 136, in handle_async_request
    raise exc
  File "/usr/local/lib/python3.12/site-packages/httpcore/_async/http11.py", line 106, in handle_async_request
    ) = await self._receive_response_headers(**kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpcore/_async/http11.py", line 177, in _receive_response_headers
    event = await self._receive_event(timeout=timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpcore/_async/http11.py", line 217, in _receive_event
    data = await self._network_stream.read(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpcore/_backends/anyio.py", line 32, in read
    with map_exceptions(exc_map):
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/usr/local/lib/python3.12/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ReadTimeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1576, in _request
    response = await self._client.send(
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 1674, in send
    response = await self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 1702, in _send_handling_auth
    response = await self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 1739, in _send_handling_redirects
    response = await self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 1776, in _send_single_request
    response = await transport.handle_async_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 376, in handle_async_request
    with map_httpcore_exceptions():
         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 89, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ReadTimeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/litellm/llms/azure/azure.py", line 589, in acompletion
    headers, response = await self.make_azure_openai_chat_completion_request(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/litellm/llms/azure/azure.py", line 317, in make_azure_openai_chat_completion_request
    raise e
  File "/usr/local/lib/python3.12/site-packages/litellm/llms/azure/azure.py", line 309, in make_azure_openai_chat_completion_request
    raw_response = await azure_client.chat.completions.with_raw_response.create(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/_legacy_response.py", line 373, in wrapped
    return cast(LegacyAPIResponse[R], await func(*args, **kwargs))
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/resources/chat/completions.py", line 1720, in create
    return await self._post(
           ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1843, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1537, in request
    return await self._request(
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1595, in _request
    raise APITimeoutError(request=request) from err
openai.APITimeoutError: Request timed out.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/litellm/main.py", line 447, in acompletion
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/litellm/llms/azure/azure.py", line 640, in acompletion
    raise AzureOpenAIError(status_code=500, message=str(e))
litellm.llms.azure.common_utils.AzureOpenAIError: Request timed out.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/litellm/utils.py", line 1085, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/litellm/main.py", line 466, in acompletion
    raise exception_type(
          ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2189, in exception_type
    raise e
  File "/usr/local/lib/python3.12/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 227, in exception_type
    raise Timeout(
litellm.exceptions.Timeout: litellm.Timeout: APITimeoutError - Request timed out.
error_str: Request timed out.
```
Can you share your updated config for repro? @jeromeroussin
@jeromeroussin If that's the stacktrace while calling o1, then that looks incorrect to me. o1 calls should be routed through azure/o1_handler.py, not azure.py:
https://github.com/BerriAI/litellm/blob/8576ca8ccb12865eb52d1d6f4679085b7c10167c/litellm/llms/azure/chat/o1_handler.py#L4
https://github.com/BerriAI/litellm/blob/8576ca8ccb12865eb52d1d6f4679085b7c10167c/litellm/main.py#L1148
Could this be an older version of litellm?
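The routing distinction described above (o1-family models going through a dedicated handler rather than the generic azure.py path) can be pictured as a simple dispatch table. This is a sketch only; the function names are hypothetical and not litellm's real internals:

```python
# Hypothetical sketch of per-model handler dispatch; not litellm's actual code.
def generic_azure_handler(model: str) -> str:
    return f"azure.py handled {model}"

def o1_handler(model: str) -> str:
    return f"o1_handler.py handled {model}"

def dispatch(model: str) -> str:
    """Route o1-family models to the dedicated handler, everything else to the generic one."""
    base_model = model.split("/")[-1]  # e.g. "azure/o1-mini" -> "o1-mini"
    if base_model.startswith("o1"):
        return o1_handler(model)
    return generic_azure_handler(model)

print(dispatch("azure/o1-mini"))  # routed via the o1 handler
print(dispatch("azure/gpt-4o"))   # routed via the generic azure handler
```

If the reported stacktrace really came from an o1 call, an azure.py frame in it would suggest an older litellm version without this dispatch.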
Unable to repro this error. Here's the debug log showing the timeout being sent from the Router (which stores deployment information) to the litellm translation layer.
Note: for me this is correctly displaying the 1.0 timeout.
```
17:17:11 - LiteLLM:DEBUG: utils.py:285 - Request to litellm:
17:17:11 - LiteLLM:DEBUG: utils.py:285 - litellm.acompletion(api_key='redacted', api_base='redacted', timeout=1.0,..)
```
with this config:
```yaml
model_list:
  - model_name: "o1-mini"
    litellm_params:
      model: "azure/o1-mini"
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE
      timeout: 1

litellm_settings:
  drop_params: true
  request_timeout: 120
```
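Besides reading the debug log, another way to confirm which timeout actually fires is to time the failure: with a 1s per-deployment timeout, the request should fail after roughly 1 second, not after the global `request_timeout`. Below is a pure-stdlib simulation of that check; `fake_upstream` is a stand-in for the real model call, not a litellm API:

```python
import asyncio
import time

async def fake_upstream(delay_s: float) -> str:
    """Stand-in for a slow model call; pretends the upstream takes delay_s seconds."""
    await asyncio.sleep(delay_s)
    return "ok"

async def call_with_timeout(timeout_s: float, upstream_delay_s: float):
    """Run the fake call under a timeout and report (status, elapsed seconds)."""
    start = time.monotonic()
    try:
        result = await asyncio.wait_for(fake_upstream(upstream_delay_s), timeout=timeout_s)
        return result, time.monotonic() - start
    except asyncio.TimeoutError:
        return "timeout", time.monotonic() - start

# A 1s timeout against a slow (5s) upstream should fail after ~1s, not ~5s.
status, elapsed = asyncio.run(call_with_timeout(1.0, 5.0))
print(status, round(elapsed, 1))  # timeout 1.0
```

Applied to the real proxy, the same idea holds: if the observed elapsed time at failure matches `request_timeout` (or the 6000s default) instead of the deployment's `timeout`, the per-deployment value is not being forwarded.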
It is a stacktrace for gpt-4o.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.