azure-search-openai-demo
OpenAI Chat Responses Limited
We see cases where lengthy responses from OpenAI are cut off at a certain token limit. Our current workaround is to ask the LLM to "continue from where it left off". Are there any plans or ways to address this and allow the full response without this workaround?
We currently set the response token limit to 1024:
```python
response_token_limit = 1024
messages = build_messages(
    model=self.chatgpt_model,
    system_prompt=rendered_answer_prompt.system_content,
    past_messages=rendered_answer_prompt.past_messages,
    new_user_content=rendered_answer_prompt.new_user_content,
    max_tokens=self.chatgpt_token_limit - response_token_limit,
    fallback_to_default=self.ALLOW_NON_GPT_MODELS,
)
```
If you need longer responses, you can increase `response_token_limit` there to your desired maximum. The trade-off is that the message history truncation logic may then need to drop earlier messages in the conversation to leave enough room for the response, since the prompt, history, and response must all fit within the model's context window.
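To illustrate that trade-off, here is a minimal sketch of the budgeting involved, not the repo's actual `build_messages` implementation: it reserves the response limit out of the context window, then keeps as many recent history messages as fit in the remainder, dropping the oldest first. The names `MODEL_CONTEXT_WINDOW`, `count_tokens`, and `truncate_history` are illustrative, and it assumes `tiktoken` for token counting:

```python
import tiktoken

MODEL_CONTEXT_WINDOW = 4096   # e.g. gpt-35-turbo; use your deployment's limit
RESPONSE_TOKEN_LIMIT = 2048   # raised from the default 1024

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def count_tokens(message: dict) -> int:
    # Rough per-message count; real accounting also adds a few tokens
    # of per-message overhead for the chat format.
    return len(enc.encode(message["content"])) + 4

def truncate_history(system_prompt: dict, history: list[dict], new_user: dict) -> list[dict]:
    # Whatever the response doesn't get is the budget for everything else.
    budget = MODEL_CONTEXT_WINDOW - RESPONSE_TOKEN_LIMIT
    budget -= count_tokens(system_prompt) + count_tokens(new_user)
    kept: list[dict] = []
    # Walk the history newest-first, keeping messages until the budget
    # runs out, so the oldest turns are the ones dropped.
    for message in reversed(history):
        cost = count_tokens(message)
        if cost > budget:
            break
        budget -= cost
        kept.append(message)
    return [system_prompt, *reversed(kept), new_user]
```

The key point is that raising `RESPONSE_TOKEN_LIMIT` directly shrinks the budget in the first line of `truncate_history`, so a larger response allowance means fewer past turns survive truncation.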