langchaingo
How do you rate-limit calls to LLMs configured through langchaingo?
We're currently using langchaingo to connect to a GPT-4o model deployed on Azure. Is there any documentation, or are there any examples I can refer to, on how to set token rate limits?
Thanks!
I had the same question, but for Groq, and couldn't find anything in the docs.