
bug: OpenAI API Call exceeds maximum token context length for Model

Open · MR-Alex42 opened this issue 1 year ago · 1 comment

I'm running gptdeploy with GPT-3.5, so my maximum token context length is limited to 4097 tokens. The API call currently does not account for model-dependent context length limits. I suggest adding error handling for this so gptdeploy keeps running. Counting tokens in advance (see https://platform.openai.com/tokenizer) and splitting or shortening the prompt messages could be workarounds for this issue.
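For illustration, a pre-flight check along those lines might look like the sketch below, using the tiktoken library. The per-message framing overhead is approximate and varies by model version, so treat the count as an estimate:

```python
# Sketch of a pre-flight token count using tiktoken (pip install tiktoken).
import tiktoken

MAX_CONTEXT_TOKENS = 4097  # gpt-3.5-turbo limit from the error below

def estimate_tokens(messages, model="gpt-3.5-turbo"):
    """Approximate the token count of a chat-completion request."""
    encoding = tiktoken.encoding_for_model(model)
    num_tokens = 3  # the reply is primed with a few extra tokens
    for message in messages:
        num_tokens += 4  # rough per-message framing overhead
        for value in message.values():
            num_tokens += len(encoding.encode(value))
    return num_tokens

messages = [
    {"role": "system", "content": "You are a coding assistant."},
    {"role": "user", "content": "You must add the content for app.py."},
]
if estimate_tokens(messages) > MAX_CONTEXT_TOKENS:
    raise ValueError("Prompt too long; shorten or split the messages first.")
```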

```
Traceback (most recent call last):
  File "D:\Programme\gptdeploy\gptdeploy.py", line 5, in <module>
    main()
  File "D:\Programme\Python310\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "D:\Programme\Python310\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "D:\Programme\Python310\lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "D:\Programme\Python310\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "D:\Programme\Python310\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "D:\Programme\gptdeploy\src\cli.py", line 38, in wrapper
    return func(*args, **kwargs)
  File "D:\Programme\gptdeploy\src\cli.py", line 74, in generate
    generator.generate(path)
  File "D:\Programme\gptdeploy\src\options\generate\generator.py", line 289, in generate
    self.generate_microservice(microservice_path, microservice_name, packages, num_approach)
  File "D:\Programme\gptdeploy\src\options\generate\generator.py", line 102, in generate_microservice
    test_microservice_content = self.generate_and_persist_file(
  File "D:\Programme\gptdeploy\src\options\generate\generator.py", line 71, in generate_and_persist_file
    content_raw = conversation.chat(f'You must add the content for {file_name}.')
  File "D:\Programme\gptdeploy\src\apis\gpt.py", line 121, in chat
    response = self._chat([self.system_message] + self.messages)
  File "D:\Programme\Python310\lib\site-packages\langchain\chat_models\base.py", line 128, in __call__
    return self.generate(messages, stop=stop).generations[0].message
  File "D:\Programme\Python310\lib\site-packages\langchain\chat_models\openai.py", line 252, in _generate
    for stream_resp in self.completion_with_retry(
  File "D:\Programme\Python310\lib\site-packages\langchain\chat_models\openai.py", line 228, in completion_with_retry
    return _completion_with_retry(**kwargs)
  File "D:\Programme\Python310\lib\site-packages\tenacity\__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "D:\Programme\Python310\lib\site-packages\tenacity\__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "D:\Programme\Python310\lib\site-packages\tenacity\__init__.py", line 314, in iter
    return fut.result()
  File "D:\Programme\Python310\lib\concurrent\futures\_base.py", line 451, in result
    return self.__get_result()
  File "D:\Programme\Python310\lib\concurrent\futures\_base.py", line 403, in __get_result
    raise self._exception
  File "D:\Programme\Python310\lib\site-packages\tenacity\__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "D:\Programme\Python310\lib\site-packages\langchain\chat_models\openai.py", line 226, in _completion_with_retry
    return self.client.create(**kwargs)
  File "D:\Programme\Python310\lib\site-packages\openai\api_resources\chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "D:\Programme\Python310\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "D:\Programme\Python310\lib\site-packages\openai\api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "D:\Programme\Python310\lib\site-packages\openai\api_requestor.py", line 620, in _interpret_response
    self._interpret_response_line(
  File "D:\Programme\Python310\lib\site-packages\openai\api_requestor.py", line 683, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4118 tokens. Please reduce the length of the messages.
```
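And a minimal sketch of the error handling suggested above, using the legacy openai SDK that appears in the traceback. The retry loop and the drop-oldest-message policy are illustrative assumptions, not gptdeploy's actual behavior:

```python
# Illustrative only: catch the context-length error and drop the oldest
# non-system message before retrying.
import openai

def chat_with_truncation(messages, model="gpt-3.5-turbo", max_attempts=5):
    for _ in range(max_attempts):
        try:
            return openai.ChatCompletion.create(model=model, messages=messages)
        except openai.error.InvalidRequestError as e:
            if "maximum context length" not in str(e) or len(messages) <= 2:
                raise  # a different error, or nothing left to drop
            # messages[0] is assumed to be the system message; drop the
            # oldest user/assistant turn and retry.
            messages = [messages[0]] + messages[2:]
    raise RuntimeError("Conversation never fit into the context window.")
```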

MR-Alex42 · Apr 22 '23 15:04

Thanks for reporting. I was mostly using GPT-4. I wonder how we could solve this: even if we determine the token length up front, what should happen when it is too much? I guess we need to make the prompts shorter in general. I will think about something.
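One option would be to trim the conversation proactively before each call, something like the hypothetical helper below (the 3,500-token budget and the drop-oldest policy are assumptions, not a committed design):

```python
# Hypothetical helper: trim history from the front until the estimated
# count fits a budget, always keeping the system message. The budget
# should leave headroom for the completion itself.
import tiktoken

def trim_to_budget(messages, budget=3500, model="gpt-3.5-turbo"):
    encoding = tiktoken.encoding_for_model(model)

    def cost(msgs):
        # rough estimate: ~4 framing tokens per message plus its content
        return sum(4 + len(encoding.encode(m["content"])) for m in msgs)

    while len(messages) > 2 and cost(messages) > budget:
        messages = [messages[0]] + messages[2:]  # drop the oldest turn
    return messages
```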

florian-hoenicke · Apr 22 '23 16:04

This should become an issue less often thanks to https://github.com/jina-ai/gptdeploy/pull/70. Closing the ticket for now.

joschkabraun · May 02 '23 08:05