
[Bug]: OpenAI o3 via openAI backend: Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead

Open ViolentVotan opened this issue 7 months ago • 4 comments

Do you need to file an issue?

  • [x] I have searched the existing issues and this bug is not already filed.
  • [x] I believe this is a legitimate bug, not just a question or feature request.

Describe the bug

When using o3 on the OpenAI backend, I try to upload a file of around 1 MB and get the following error in the console:

openai.BadRequestError: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}

Given that the thinking models on the OpenAI backend require max_completion_tokens instead of max_tokens, this error makes sense; however, I do not see a way to configure max_completion_tokens in the .env.
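
For illustration, a minimal sketch of the kind of parameter rewrite that would avoid this error (this is not existing LightRAG code; the helper name and the model-prefix check are assumptions):

    # Hypothetical helper, not part of LightRAG: rename 'max_tokens' to
    # 'max_completion_tokens' when the target model is an OpenAI reasoning model.
    def adapt_openai_kwargs(model: str, kwargs: dict) -> dict:
        reasoning_prefixes = ("o1", "o3", "o4")  # assumption: a prefix check is sufficient
        if model.split("-")[0] in reasoning_prefixes and "max_tokens" in kwargs:
            kwargs["max_completion_tokens"] = kwargs.pop("max_tokens")
        return kwargs

    # {'max_tokens': 1000} becomes {'max_completion_tokens': 1000} for o3-2025-04-16;
    # the dict is left unchanged for e.g. gpt-4o-mini.
    print(adapt_openai_kwargs("o3-2025-04-16", {"max_tokens": 1000}))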

Steps to reproduce

Configure the OpenAI backend to use o3 or o4-mini and upload any larger file.

Expected Behavior

If using an OpenAI thinking model (o3 or o4-mini), the parameter max_completion_tokens should be sent to the backend instead of max_tokens.

LightRAG Config Used

Paste your config here

Logs and screenshots

ERROR: OpenAI API Call Failed, Model: o3-2025-04-16, Params: {'max_tokens': 1000, 'temperature': 1.0}, Got: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
ERROR: limit_async: Error in decorated function: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
ERROR: Merging stage failed in document doc-1254d453f577c1661c94d287e7c5b5cf: Traceback (most recent call last):
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/lightrag.py", line 1033, in process_document
    await merge_nodes_and_edges(
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/operate.py", line 543, in merge_nodes_and_edges
    entity_data = await _merge_nodes_then_upsert(
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/operate.py", line 292, in _merge_nodes_then_upsert
    description = await _handle_entity_relation_summary(
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/operate.py", line 144, in _handle_entity_relation_summary
    summary = await use_llm_func_with_cache(
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/utils.py", line 1614, in use_llm_func_with_cache
    res: str = await use_llm_func(input_text, **kwargs)
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/utils.py", line 544, in wait_func
    return await future
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/utils.py", line 328, in worker
    result = await func(*args, **kwargs)
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/api/lightrag_server.py", line 223, in openai_alike_model_complete
    return await openai_complete_if_cache(
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 189, in async_wrapped
    return await copy(fn, *args, **kwargs)
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 111, in __call__
    do = await self.iter(retry_state=retry_state)
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 153, in iter
    result = await action(retry_state)
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/tenacity/_utils.py", line 99, in inner
    return call(*args, **kwargs)
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/tenacity/__init__.py", line 400, in <lambda>
    self._add_action_func(lambda rs: rs.outcome.result())
  File "/opt/homebrew/Cellar/python@3.11/3.11.12/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
  File "/opt/homebrew/Cellar/python@3.11/3.11.12/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 114, in __call__
    result = await fn(*args, **kwargs)
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/lightrag/llm/openai.py", line 185, in openai_complete_if_cache
    response = await openai_async_client.chat.completions.create(
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 2028, in create
    return await self._post(
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/openai/_base_client.py", line 1742, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "/Users/votan/.local/pipx/venvs/lightrag-hku/lib/python3.11/site-packages/openai/_base_client.py", line 1549, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}

Additional Information

  • LightRAG Version: v1.3.6
  • Operating System: macOS 15.4.1
  • Python Version: 3.11.12
  • Related Issues:

ViolentVotan avatar May 06 '25 16:05 ViolentVotan

Thank you for reporting the issue. Do you have any recommendations for resolving it?

danielaskdd avatar May 09 '25 13:05 danielaskdd

@danielaskdd Please also consider this issue as part of supporting 'reasoning models' like o3. I'm using the lightrag-server setup:

File "C:\Apps\dVALi.venv\Lib\site-packages\openai_base_client.py", line 1562, in _request raise self._make_status_error_from_response(err.response) from None openai.BadRequestError: Error code: 400 - {'error': {'message': "Unsupported parameter: 'temperature' is not supported with this model.", 'type': 'invalid_request_error', 'param': 'temperature', 'code': 'unsupported_parameter'}}

BireleyX avatar May 16 '25 07:05 BireleyX

Unsupported parameter: 'temperature' - what's the model name?

danielaskdd avatar May 16 '25 08:05 danielaskdd

I'm using Azure OpenAI o3-mini.

I was able to add support for this by modifying these files:

lightrag_server.py

    async def azure_openai_model_complete(
        prompt,
        system_prompt=None,
        history_messages=None,
        keyword_extraction=False,
        **kwargs,
    ) -> str:
        keyword_extraction = kwargs.pop("keyword_extraction", None)
        if keyword_extraction:
            kwargs["response_format"] = GPTKeywordExtractionFormat
        if history_messages is None:
            history_messages = []
        if args.reasoning_model:
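            # Reasoning models reject 'max_tokens' and 'temperature';
            # send 'reasoning_effort' and 'max_completion_tokens' instead.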
            kwargs["reasoning_effort"] = args.reasoning_effort
            kwargs["max_completion_tokens"] = args.max_completion_tokens
        else:  
            kwargs["temperature"] = args.temperature
        return await azure_openai_complete_if_cache(
            args.llm_model,
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            base_url=args.llm_binding_host,
            api_key=os.getenv("AZURE_OPENAI_API_KEY"),
            api_version=os.getenv("AZURE_OPENAI_API_VERSION", "2024-08-01-preview"),
            **kwargs,
        )
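
One note on this patch: temperature is only set in the non-reasoning branch, so with IS_REASONING_MODEL=True the request omits it entirely, which also avoids the "Unsupported parameter: 'temperature'" error quoted above.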

config.py

def parse_args() -> argparse.Namespace:
...
...
    # Inject Reasoning model configuration - ADX
    args.reasoning_model = get_env_value("IS_REASONING_MODEL", False, bool)
    args.reasoning_effort = get_env_value("REASONING_EFFORT", "low")
    args.max_completion_tokens = get_env_value("MAX_COMPLETION_TOKENS", 8192, int)
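
These lines follow the same get_env_value(key, default, optional type) pattern used for the other settings in parse_args, so the only remaining step is declaring the variables in .env: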

.env

REASONING_EFFORT=low
MAX_COMPLETION_TOKENS=8192
IS_REASONING_MODEL=True
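
With these values, an o3/o3-mini call ends up sending reasoning_effort='low' and max_completion_tokens=8192 instead of max_tokens and temperature.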

BireleyX avatar May 16 '25 08:05 BireleyX

max_tokens is no longer passed to the LLM.

danielaskdd avatar Jul 31 '25 00:07 danielaskdd