Module behaviour when OpenAI request content is too large
Currently, the OpenAI module might return:
```json
{
  "error": [
    {
      "message": "failed with status: 400 error: This model's maximum context length is 2046 tokens, however, you requested 142471 tokens (142471 in your prompt; 0 for the completion). Please reduce your prompt; or completion length."
    }
  ]
}
```
Can this be handled in the module? E.g., for this specific example, the 142471-token string could be split into ⌈142471/2046⌉ = 70 snippets, and a centroid could be calculated and stored.
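The split-and-average idea could be sketched roughly as follows. This is a minimal illustration, not the module's implementation: `chunk_tokens`, `centroid_embedding`, and `fake_embed` are hypothetical names, and a real version would call the embedding API once per chunk instead of the stand-in embedder used here.

```python
import numpy as np


def chunk_tokens(tokens, max_len):
    """Split a token sequence into consecutive chunks of at most max_len tokens."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


def centroid_embedding(chunks, embed_fn):
    """Embed each chunk separately and return the mean (centroid) vector."""
    vectors = np.stack([embed_fn(chunk) for chunk in chunks])
    return vectors.mean(axis=0)


# Stand-in embedder for demonstration only; a real implementation would
# request an embedding from the OpenAI API for each chunk.
def fake_embed(chunk):
    rng = np.random.default_rng(len(chunk))
    return rng.standard_normal(8)


tokens = list(range(142471))          # the oversized prompt from the error message
chunks = chunk_tokens(tokens, 2046)   # the model's maximum context length
print(len(chunks))                    # 70 chunks, matching ceil(142471 / 2046)
vector = centroid_embedding(chunks, fake_embed)
print(vector.shape)                   # (8,)
```

Averaging chunk vectors loses positional nuance, but it keeps each request under the model's context limit while still producing a single vector to store.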
Great find! Indeed the other modules are handling this as part of the inference, I'm surprised OpenAI does not handle this server-side. Do we know if the 2046 is a global limit that we can hard-code or does this depend on the model selected?
Asked it here
Thank you for your contribution to Weaviate. This issue has not received any activity in a while and has therefore been marked as stale. Stale issues will eventually be autoclosed. This does not mean that we are ruling out working on this issue, but it most likely has not been prioritized highly enough in the last months. If you believe that this issue should remain open, please leave a short reply. This lets us know that the issue is not abandoned and acts as a reminder for our team to consider prioritizing it again. Please also consider whether you can make a contribution to help with the solution of this issue. If you are willing to contribute but don't know where to start, please leave a quick message and we'll try to help you. Thank you, The Weaviate Team