semantic-kernel
Reduce requests in Copilot Chat sample app
Every time I ask a question, it takes a long time to get a reply. What happens behind the scenes for each chat message? Is there a way to reduce the number of requests, for example a no-memory mode?
Content: { "error": { "message": "Rate limit reached for default-gpt-3.5-turbo in organization org-xx1YQ6kT97CUfp444pfkgoM6 on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method.", "type": "requests", "param": null, "code": null }
The error message itself is clear about the fix: "Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method."
Several requests go on behind the scenes for each chat message that is sent, so a limit of 3 requests per minute is reached quickly. As @lextm mentions, you can add a credit card (payment method) to your OpenAI account to increase the rate limit.
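If adding a payment method is not an option, one client-side workaround is to wait for the delay the error message suggests and then retry. The following is only a minimal sketch in Python, not how the Copilot Chat sample app handles this. `retry_delay_seconds` and `call_with_retry` are hypothetical helpers, and it assumes your own calling code raises a `RuntimeError` carrying the API's error message.

```python
import re
import time

# Hypothetical helpers: not part of Semantic Kernel or any OpenAI SDK.
# Matches hints such as "try again in 20s" or "try again in 500ms".
_RETRY_HINT = re.compile(r"try again in (\d+(?:\.\d+)?)\s*(ms|s)\b")


def retry_delay_seconds(error_message, default=20.0):
    """Return the wait time hinted at in a rate-limit message, in seconds."""
    match = _RETRY_HINT.search(error_message)
    if match is None:
        return default  # no hint found, fall back to a fixed delay
    value, unit = float(match.group(1)), match.group(2)
    return value / 1000.0 if unit == "ms" else value


def call_with_retry(send, max_attempts=4, sleep=time.sleep):
    """Call send(); on a rate-limit error, wait for the hinted delay and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except RuntimeError as err:  # assumption: caller surfaces the API message this way
            message = str(err)
            if "Rate limit reached" not in message or attempt == max_attempts:
                raise  # not a rate-limit error, or out of attempts
            sleep(retry_delay_seconds(message))
```

This does not reduce the number of requests per chat message; it only spaces them out, so replies will still be slow at 3 requests per minute.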
Recycling my reply to one of your other posts re ways to reduce this: https://github.com/microsoft/semantic-kernel/issues/999#issuecomment-1550350983
I'll close this for now, feel free to reopen if there is more to discuss here.