[Issue]: sleep_on_rate_limit_recommendation is not working for Groq
Is there an existing issue for this?
- [ ] I have searched the existing issues
- [ ] I have checked #657 to validate if my issue is covered by community support
Describe the issue
I am using the Llama 3 8B model (llama3-8b-8192) with GraphRAG through Groq. Groq's limit for this model is 30k tokens per minute. When that limit is exceeded, the code should sleep for a while before retrying, but instead it keeps invoking the API.
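For context, this is roughly the behavior I expect. It is only a sketch, not GraphRAG's actual code: the client setup relies on Groq's OpenAI-compatible endpoint, and the retry-after header handling is an assumption about what Groq returns on a 429.

```python
import time
from openai import OpenAI, RateLimitError

# Groq exposes an OpenAI-compatible endpoint, so the standard OpenAI client can be reused.
client = OpenAI(
    api_key="YOUR_GROQ_API_KEY",  # placeholder
    base_url="https://api.groq.com/openai/v1",
)

def chat_with_backoff(messages, max_retries=3):
    """Call the model and sleep on rate-limit errors instead of failing immediately."""
    for attempt in range(max_retries + 1):
        try:
            return client.chat.completions.create(
                model="llama3-8b-8192",
                messages=messages,
                max_tokens=4000,
            )
        except RateLimitError as e:
            if attempt == max_retries:
                raise
            # Assumption: Groq may send a retry-after header; fall back to a fixed wait otherwise.
            wait = float(e.response.headers.get("retry-after", "30"))
            time.sleep(wait)
```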
Steps to reproduce
No response
GraphRAG Config Used
encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GROQ_API_KEY} # groq api key
  type: openai_chat # or azure_openai_chat
  model: llama3-8b-8192
  model_supports_json: true # recommended if this is available for your model.
  max_tokens: 4000
  # request_timeout: 180.0
  api_base: https://api.groq.com/openai/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  tokens_per_minute: 2000 # set a leaky bucket throttle
  requests_per_minute: 1 # set a leaky bucket throttle
  max_retries: 3
  max_retry_wait: 10000.0
  sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  concurrent_requests: 1 # the number of parallel inflight requests that may be made / default is 25 / reduce if using the groq
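For reference, this is how I understand the leaky-bucket throttle values above (2000 tokens and 1 request per minute). This is a generic illustration, not GraphRAG's implementation; the class and names are made up:

```python
import time

class LeakyBucket:
    """Generic leaky-bucket throttle: allow `capacity` units per `period` seconds."""

    def __init__(self, capacity: float, period: float = 60.0):
        self.capacity = capacity          # e.g. 2000 tokens or 1 request
        self.period = period
        self.level = 0.0                  # how much of the bucket is currently in use
        self.last = time.monotonic()

    def acquire(self, amount: float = 1.0) -> None:
        """Block until `amount` units fit under the per-period limit."""
        while True:
            now = time.monotonic()
            # Drain the bucket at capacity/period units per second.
            self.level = max(0.0, self.level - (now - self.last) * self.capacity / self.period)
            self.last = now
            if self.level + amount <= self.capacity:
                self.level += amount
                return
            time.sleep(0.25)  # wait for the bucket to drain a little

# With the config above: at most 2000 tokens and 1 request per minute.
token_bucket = LeakyBucket(capacity=2000)
request_bucket = LeakyBucket(capacity=1)
```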
Logs and screenshots
No response
Additional Information
- GraphRAG Version:
- Operating System:
- Python Version:
- Related Issues: