Docker MCP: OpenAI API - Too many requests
According to my logs, I'm getting hundreds of HTTP/1.1 429 Too Many Requests responses from the OpenAI API within a second. I hope I'm not getting banned, but Graphiti's API usage should be optimized. I tried to add a text of around 7k tokens to the graph.
graphiti-mcp-1 | 2025-05-20 13:24:14,694 - openai._base_client - INFO - Retrying request to /chat/completions in 1.512000 seconds
graphiti-mcp-1 | 2025-05-20 13:24:14,756 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:14,757 - openai._base_client - INFO - Retrying request to /chat/completions in 1.455000 seconds
graphiti-mcp-1 | 2025-05-20 13:24:15,648 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
graphiti-mcp-1 | 2025-05-20 13:24:16,525 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,527 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,528 - openai._base_client - INFO - Retrying request to /chat/completions in 2.061000 seconds
graphiti-mcp-1 | 2025-05-20 13:24:16,556 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,612 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,627 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,658 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,731 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,746 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,747 - openai._base_client - INFO - Retrying request to /chat/completions in 1.847000 seconds
graphiti-mcp-1 | 2025-05-20 13:24:16,747 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,748 - openai._base_client - INFO - Retrying request to /chat/completions in 1.843000 seconds
graphiti-mcp-1 | 2025-05-20 13:24:16,801 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,822 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,826 - openai._base_client - INFO - Retrying request to /chat/completions in 1.763000 seconds
graphiti-mcp-1 | 2025-05-20 13:24:16,873 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,880 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,881 - openai._base_client - INFO - Retrying request to /chat/completions in 1.701000 seconds
graphiti-mcp-1 | 2025-05-20 13:24:16,892 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,894 - openai._base_client - INFO - Retrying request to /chat/completions in 1.695000 seconds
graphiti-mcp-1 | 2025-05-20 13:24:16,895 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,896 - openai._base_client - INFO - Retrying request to /chat/completions in 1.681000 seconds
graphiti-mcp-1 | 2025-05-20 13:24:16,920 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:16,921 - openai._base_client - INFO - Retrying request to /chat/completions in 1.667000 seconds
graphiti-mcp-1 | 2025-05-20 13:24:17,203 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
graphiti-mcp-1 | 2025-05-20 13:24:18,887 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:18,909 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:18,952 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:18,968 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:18,974 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:19,049 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
graphiti-mcp-1 | 2025-05-20 13:24:19,074 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
Looks to be related to #290
Have you tried reducing SEMAPHORE_LIMIT in your environment? This defaults to 20.
Additionally, we've recently made improvements to Graphiti to reduce the number and size of LLM calls made. We've also introduced the concept of a "small model", which is used as a classifier rather than a more expensive model. In Zep's implementation, this is gpt-4.1-nano.
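For reference, lowering the limit in a Docker Compose setup could look like the sketch below. The service name and compose layout are assumptions inferred from the `graphiti-mcp-1` log prefix, not the official compose file:

```yaml
services:
  graphiti-mcp:
    environment:
      # Caps how many concurrent LLM calls Graphiti makes (default: 20).
      # Lower values mean slower episode processing but fewer 429s.
      - SEMAPHORE_LIMIT=5
```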
> Looks to be related to #290
I thought so as well, but it's still occurring for me with the latest release, v0.12.0.
My text file is about 25 KB, which is not that big for a 12-page business document.
2025-06-13 19:17:57 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-06-13 19:17:57 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-06-13 19:17:57 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-06-13 19:17:57 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-06-13 19:17:57 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-06-13 19:17:57 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
2025-06-13 19:17:57 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
2025-06-13 19:17:57 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
2025-06-13 19:17:57 - openai._base_client - INFO - Retrying request to /chat/completions in 0.863000 seconds
2025-06-13 19:17:57 - openai._base_client - INFO - Retrying request to /chat/completions in 0.865000 seconds
2025-06-13 19:17:57 - openai._base_client - INFO - Retrying request to /chat/completions in 0.863000 seconds
2025-06-13 19:17:57 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
2025-06-13 19:17:57 - openai._base_client - INFO - Retrying request to /chat/completions in 0.863000 seconds
2025-06-13 19:17:57 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-06-13 19:17:57 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-06-13 19:17:58 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-06-13 19:17:58 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
BTW - I'm not using MCP, just straight client calls to add episodes.
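For context, the calls are roughly of the shape below. This is a minimal sketch based on graphiti-core's quickstart; the connection details and file name are placeholders:

```python
import asyncio
from datetime import datetime, timezone

from graphiti_core import Graphiti

async def main() -> None:
    # Neo4j connection backing the graph (credentials are placeholders).
    graphiti = Graphiti("bolt://localhost:7687", "neo4j", "password")
    try:
        # A single ~25 KB episode fans out into many chat-completion calls
        # (entity extraction, deduplication, edge extraction, summaries),
        # which is how one add_episode call can trigger a burst of 429s.
        await graphiti.add_episode(
            name="business-document",
            episode_body=open("document.txt").read(),
            source_description="12-page business document",
            reference_time=datetime.now(timezone.utc),
        )
    finally:
        await graphiti.close()

asyncio.run(main())
```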
I have a partial fix for this that uses OpenAI's rate-limit response headers. Each API response includes headers reporting the remaining request and token quota:
"x-ratelimit-limit-requests": "10000", "x-ratelimit-limit-tokens": "200000", "x-ratelimit-remaining-requests": "9999", "x-ratelimit-remaining-tokens": "199992", "x-ratelimit-reset-requests": "8.64s", "x-ratelimit-reset-tokens": "2ms"
Using these, I added logic to openai_client.py that checks the response headers and, based on x-ratelimit-reset-requests and x-ratelimit-reset-tokens, waits that amount of time before making more calls to OpenAI to process episodes.
You can see in the screenshot that when the rate limit is hit, the client checks the headers and waits according to the reset time. But as you can see, with so many API calls the reset time becomes overly long (hours), and that's not scalable. It also depends on the RPM and TPM limits of the OpenAI model you're using; unless those values are very high, Graphiti doesn't scale to multiple users adding episodes in parallel.
Still, this way, when the OpenAI rate limit is hit, the client at least waits for it to reset before making calls again, so episodes are not left unprocessed.
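For anyone who wants to try the same approach, here is a minimal sketch of the header-based backoff. The function names are hypothetical and this is not the actual patch:

```python
import asyncio
import re

# Convert OpenAI reset durations such as "8.64s", "2ms", or "1m30s" into seconds.
def parse_reset(value: str) -> float:
    units = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
    return sum(float(n) * units[u] for n, u in re.findall(r"([\d.]+)(ms|s|m|h)", value))

# Inspect the rate-limit headers of the last response; if the request or token
# quota is exhausted, sleep until OpenAI says it resets before calling again.
async def wait_if_rate_limited(headers: dict[str, str]) -> None:
    if int(headers.get("x-ratelimit-remaining-requests", "1")) <= 0:
        await asyncio.sleep(parse_reset(headers.get("x-ratelimit-reset-requests", "0s")))
    if int(headers.get("x-ratelimit-remaining-tokens", "1")) <= 0:
        await asyncio.sleep(parse_reset(headers.get("x-ratelimit-reset-tokens", "0s")))
```

Checking the headers before each call, rather than relying only on the SDK's exponential backoff, keeps the client from burning through its retries while the quota is still exhausted.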
@TechupBusiness Is this still an issue? Please confirm within 14 days or this issue will be closed.