[Bug]: OpenAI o3 via OpenAI backend: Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead
Do you need to file an issue?
- [x] I have searched the existing issues and this bug is not already filed.
- [x] I believe this is a legitimate bug, not just a question or feature request.
Describe the bug
When using o3 on the OpenAI backend, I try to upload a file of around 1 MB and get the following error in the console:
openai.BadRequestError: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
Given that the thinking models on the OpenAI backend require max_completion_tokens instead of max_tokens, this error makes sense; however, I do not see a way to configure max_completion_tokens in the .env.
Steps to reproduce
Configure the OpenAI backend to use o3 or o4-mini and upload any larger file.
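The upload only acts as the trigger; the 400 can be reproduced directly against the Chat Completions API. A minimal sketch, assuming a valid OPENAI_API_KEY in the environment and access to the o3 model:

from openai import OpenAI, BadRequestError

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

try:
    # o3/o4-mini reject max_tokens with a 400 invalid_request_error
    client.chat.completions.create(
        model="o3",
        messages=[{"role": "user", "content": "Say hi"}],
        max_tokens=1000,
    )
except BadRequestError as e:
    print(e)  # "Unsupported parameter: 'max_tokens' ... Use 'max_completion_tokens' instead."

# The same request goes through when the limit is sent as max_completion_tokens
resp = client.chat.completions.create(
    model="o3",
    messages=[{"role": "user", "content": "Say hi"}],
    max_completion_tokens=1000,
)
print(resp.choices[0].message.content)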
Expected Behavior
If an OpenAI thinking model such as o3 or o4-mini is in use, the parameter max_completion_tokens should be sent to the backend instead of max_tokens.
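Roughly, the OpenAI binding would need to remap the parameter before the request is built. A minimal sketch of that mapping; the helper name and the prefix-based detection are assumptions, not LightRAG's actual code:

# Hypothetical helper: swap max_tokens for max_completion_tokens on reasoning models.
REASONING_MODEL_PREFIXES = ("o1", "o3", "o4-mini")  # assumption: detect by model-name prefix

def adapt_token_param(model: str, kwargs: dict) -> dict:
    if model.startswith(REASONING_MODEL_PREFIXES) and "max_tokens" in kwargs:
        kwargs["max_completion_tokens"] = kwargs.pop("max_tokens")
    return kwargs

# Example: the failing call from the log would become
# adapt_token_param("o3-2025-04-16", {"max_tokens": 1000}) -> {"max_completion_tokens": 1000}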
LightRAG Config Used
Paste your config here
Logs and screenshots
ERROR: OpenAI API Call Failed,
Model: o3-2025-04-16,
Params: {'max_tokens': 1000, 'temperature': 1.0}, Got: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
ERROR: limit_async: Error in decorated function: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
ERROR: Merging stage failed in document doc-1254d453f577c1661c94d287e7c5b5cf: Traceback (most recent call last):
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/lightrag.py", line 1033, in process_document
    await merge_nodes_and_edges(
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/operate.py", line 543, in merge_nodes_and_edges
    entity_data = await _merge_nodes_then_upsert(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/operate.py", line 292, in _merge_nodes_then_upsert
    description = await _handle_entity_relation_summary(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/operate.py", line 144, in _handle_entity_relation_summary
    summary = await use_llm_func_with_cache(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/utils.py", line 1614, in use_llm_func_with_cache
    res: str = await use_llm_func(input_text, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/utils.py", line 544, in wait_func
    return await future
           ^^^^^^^^^^^^
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/utils.py", line 328, in worker
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/api/lightrag_server.py", line 223, in openai_alike_model_complete
    return await openai_complete_if_cache(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 189, in async_wrapped
    return await copy(fn, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 111, in __call__
    do = await self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 153, in iter
    result = await action(retry_state)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/tenacity/_utils.py", line 99, in inner
    return call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/tenacity/__init__.py", line 400, in
Additional Information
- LightRAG Version: v1.3.6
- Operating System: macOS 15.4.1
- Python Version: 3.11.12
- Related Issues:
Thank you for reporting the issue. Do you have any recommendations for resolving it?
@danielaskdd Please also include this issue in relation to supporting 'reasoning models' like the o3. I'm using the lightrag-server setup:
File "C:\Apps\dVALi.venv\Lib\site-packages\openai_base_client.py", line 1562, in _request raise self._make_status_error_from_response(err.response) from None openai.BadRequestError: Error code: 400 - {'error': {'message': "Unsupported parameter: 'temperature' is not supported with this model.", 'type': 'invalid_request_error', 'param': 'temperature', 'code': 'unsupported_parameter'}}
Regarding "Unsupported parameter: 'temperature'": what's the model name?
I'm using Azure OpenAI o3-mini.
I was able to add support for this by modifying these files:
lightrag_server.py
async def azure_openai_model_complete(
    prompt,
    system_prompt=None,
    history_messages=None,
    keyword_extraction=False,
    **kwargs,
) -> str:
    keyword_extraction = kwargs.pop("keyword_extraction", None)
    if keyword_extraction:
        kwargs["response_format"] = GPTKeywordExtractionFormat
    if history_messages is None:
        history_messages = []
    # New: route parameters depending on whether the configured model is a reasoning model
    if args.reasoning_model:
        kwargs["reasoning_effort"] = args.reasoning_effort
        kwargs["max_completion_tokens"] = args.max_completion_tokens
    else:
        kwargs["temperature"] = args.temperature
    return await azure_openai_complete_if_cache(
        args.llm_model,
        prompt,
        system_prompt=system_prompt,
        history_messages=history_messages,
        base_url=args.llm_binding_host,
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),
        api_version=os.getenv("AZURE_OPENAI_API_VERSION", "2024-08-01-preview"),
        **kwargs,
    )
config.py
def parse_args() -> argparse.Namespace:
    ...
    ...
    # Inject Reasoning model configuration - ADX
    args.reasoning_model = get_env_value("IS_REASONING_MODEL", False, bool)
    args.reasoning_effort = get_env_value("REASONING_EFFORT", "low")
    args.max_completion_tokens = get_env_value("MAX_COMPLETION_TOKENS", 8192, int)
.env
REASONING_EFFORT=low
MAX_COMPLETION_TOKENS=8192
IS_REASONING_MODEL=True
With this change, max_tokens is no longer passed to the LLM.
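For anyone who wants the same behaviour without patching the installed package, a comparable result can be achieved by wrapping the completion function in a custom script and passing it as llm_model_func. This is only a sketch: the import path can differ between LightRAG versions, and the wrapper name is made up.

from lightrag.llm.openai import openai_complete_if_cache  # import path may differ by version

async def reasoning_model_complete(prompt, system_prompt=None, history_messages=None, **kwargs):
    # Reasoning models (o3, o4-mini, o3-mini) reject explicit temperature and max_tokens,
    # so drop temperature and resend the budget as max_completion_tokens.
    kwargs.pop("temperature", None)
    if "max_tokens" in kwargs:
        kwargs["max_completion_tokens"] = kwargs.pop("max_tokens")
    return await openai_complete_if_cache(
        "o3",  # assumption: the reasoning model to use
        prompt,
        system_prompt=system_prompt,
        history_messages=history_messages or [],
        **kwargs,
    )

When constructing LightRAG programmatically, this function can be passed as llm_model_func; the lightrag-server path still needs a change along the lines of the patch above, since the server builds its completion function from the .env settings.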