I'm running gptdeploy with GPT-3.5, so my maximum context length is limited to 4097 tokens.
The API call currently does not take model-dependent token context limits into account. I suggest adding error handling for this so that gptdeploy can continue to run. Counting the tokens in advance (see https://platform.openai.com/tokenizer), or splitting/shortening the prompt messages, could be workarounds for this issue.
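For illustration, a pre-flight check along those lines could look roughly like the sketch below. It uses tiktoken; the constants, helper names, and the reserved response budget are my own assumptions rather than existing gptdeploy code, and the messages are assumed to be plain role/content dicts.

```python
import tiktoken

MAX_CONTEXT = 4097       # gpt-3.5-turbo context window
RESPONSE_BUDGET = 1024   # tokens kept free for the model's reply

def count_message_tokens(messages, model='gpt-3.5-turbo'):
    """Approximate the prompt size following OpenAI's cookbook heuristic:
    roughly 4 tokens of overhead per message plus the encoded content."""
    encoding = tiktoken.encoding_for_model(model)
    num_tokens = 2  # priming tokens for the assistant reply
    for message in messages:
        num_tokens += 4
        num_tokens += len(encoding.encode(str(message.get('content', ''))))
    return num_tokens

def ensure_prompt_fits(messages):
    """Fail early (and catchably) instead of letting the API call blow up."""
    used = count_message_tokens(messages)
    if used + RESPONSE_BUDGET > MAX_CONTEXT:
        raise ValueError(
            f'Prompt uses {used} tokens, but only {MAX_CONTEXT - RESPONSE_BUDGET} '
            f'fit before the {MAX_CONTEXT}-token limit is reached.'
        )
```

gptdeploy could then catch that error, shorten the conversation, and retry instead of crashing with the traceback below.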
Traceback (most recent call last):
File "D:\Programme\gptdeploy\gptdeploy.py", line 5, in
main()
File "D:\Programme\Python310\lib\site-packages\click\core.py", line 1130, in call
return self.main(*args, **kwargs)
File "D:\Programme\Python310\lib\site-packages\click\core.py", line 1055, in main
rv = self.invoke(ctx)
File "D:\Programme\Python310\lib\site-packages\click\core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "D:\Programme\Python310\lib\site-packages\click\core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "D:\Programme\Python310\lib\site-packages\click\core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "D:\Programme\gptdeploy\src\cli.py", line 38, in wrapper
return func(*args, **kwargs)
File "D:\Programme\gptdeploy\src\cli.py", line 74, in generate
generator.generate(path)
File "D:\Programme\gptdeploy\src\options\generate\generator.py", line 289, in generate
self.generate_microservice(microservice_path, microservice_name, packages, num_approach)
File "D:\Programme\gptdeploy\src\options\generate\generator.py", line 102, in generate_microservice
test_microservice_content = self.generate_and_persist_file(
File "D:\Programme\gptdeploy\src\options\generate\generator.py", line 71, in generate_and_persist_file
content_raw = conversation.chat(f'You must add the content for {file_name}.')
File "D:\Programme\gptdeploy\src\apis\gpt.py", line 121, in chat
response = self._chat([self.system_message] + self.messages)
File "D:\Programme\Python310\lib\site-packages\langchain\chat_models\base.py", line 128, in call
return self.generate(messages, stop=stop).generations[0].message
File "D:\Programme\Python310\lib\site-packages\langchain\chat_models\openai.py", line 252, in generate
for stream_resp in self.completion_with_retry(
File "D:\Programme\Python310\lib\site-packages\langchain\chat_models\openai.py", line 228, in completion_with_retry
return completion_with_retry(**kwargs)
File "D:\Programme\Python310\lib\site-packages\tenacity_init.py", line 289, in wrapped_f
return self(f, *args, **kw)
File "D:\Programme\Python310\lib\site-packages\tenacity_init.py", line 379, in call
do = self.iter(retry_state=retry_state)
File "D:\Programme\Python310\lib\site-packages\tenacity_init.py", line 314, in iter
return fut.result()
File "D:\Programme\Python310\lib\concurrent\futures_base.py", line 451, in result
return self.__get_result()
File "D:\Programme\Python310\lib\concurrent\futures_base.py", line 403, in __get_result
raise self.exception
File "D:\Programme\Python310\lib\site-packages\tenacity_init.py", line 382, in call
result = fn(*args, **kwargs)
File "D:\Programme\Python310\lib\site-packages\langchain\chat_models\openai.py", line 226, in _completion_with_retry
return self.client.create(**kwargs)
File "D:\Programme\Python310\lib\site-packages\openai\api_resources\chat_completion.py", line 25, in create
return super().create(*args, **kwargs)
File "D:\Programme\Python310\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
File "D:\Programme\Python310\lib\site-packages\openai\api_requestor.py", line 226, in request
resp, got_stream = self._interpret_response(result, stream)
File "D:\Programme\Python310\lib\site-packages\openai\api_requestor.py", line 620, in _interpret_response
self._interpret_response_line(
File "D:\Programme\Python310\lib\site-packages\openai\api_requestor.py", line 683, in _interpret_response_line
raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4118 tokens. Please reduce the length of the messages.
Thanks for reporting. I was mostly using GPT-4.
I wonder how we could solve this.
Even if we count the token length up front, what should gptdeploy do when the prompt is too long?
I guess we need to make the prompts shorter in general.
I will think about something.
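One rough direction, sketched with plain dict messages: drop the oldest turns until the prompt fits again. The helper names, the budget, and the trimming strategy are just an illustration, not how gptdeploy currently works.

```python
import tiktoken

MAX_CONTEXT = 4097
RESPONSE_BUDGET = 1024

def prompt_tokens(messages, model='gpt-3.5-turbo'):
    enc = tiktoken.encoding_for_model(model)
    # ~4 tokens of per-message overhead plus the encoded content (cookbook heuristic)
    return 2 + sum(4 + len(enc.encode(str(m.get('content', '')))) for m in messages)

def shorten_to_fit(system_message, messages):
    """Drop the oldest non-system turns until the prompt fits the budget."""
    budget = MAX_CONTEXT - RESPONSE_BUDGET
    trimmed = list(messages)
    while trimmed and prompt_tokens([system_message] + trimmed) > budget:
        trimmed.pop(0)  # discard the oldest turn first
    return [system_message] + trimmed
```

Trading away the oldest context like this loses information, of course, so shorter prompts overall would still be the better fix.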
This should become less of an issue thanks to https://github.com/jina-ai/gptdeploy/pull/70
Will close the ticket for now.