
Reduce Rate Requests - errors out with pro plan

Open bissella opened this issue 2 years ago • 5 comments

I have a paid account with OpenAI but receive this error message, suggesting I am exceeding my rate limit.

The limits for [gpt-4](https://platform.openai.com/account/limits) are:

gpt-4: 10,000 TPM, 3 RPM, 200 RPD

My situation is:

  • I have the pro account and want to use the repo
  • I cannot use the repo right now.
, line 877, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

bissella avatar Nov 16 '23 20:11 bissella

+1 Getting the same error from OpenAI.

timlangedesign avatar Nov 18 '23 10:11 timlangedesign

Introduce a Retry Mechanism for Rate Limit Errors: When a rate limit error is encountered, the code should wait for a period and then retry the request.

Increase the Time Delay in the Main Loop: To reduce the frequency of requests, increase the delay in the main loop. This will help in staying within the rate limits.

Add Exception Handling: Implement broader exception handling to catch any errors that may occur during the execution.

import os
from openai import OpenAI, RateLimitError
import base64
import json
import time
import simpleaudio as sa
import errno
from elevenlabs import generate, play, set_api_key, voices

client = OpenAI()

set_api_key(os.environ.get("ELEVENLABS_API_KEY"))

def encode_image(image_path):
    while True:
        try:
            with open(image_path, "rb") as image_file:
                return base64.b64encode(image_file.read()).decode("utf-8")
        except IOError as e:
            if e.errno != errno.EACCES:
                # Not a "file in use" error, re-raise
                raise
            # File is being written to, wait a bit and retry
            time.sleep(0.1)

def play_audio(text):
    audio = generate(text, voice=os.environ.get("ELEVENLABS_VOICE_ID"))

    unique_id = base64.urlsafe_b64encode(os.urandom(30)).decode("utf-8").rstrip("=")
    dir_path = os.path.join("narration", unique_id)
    os.makedirs(dir_path, exist_ok=True)
    file_path = os.path.join(dir_path, "audio.wav")

    with open(file_path, "wb") as f:
        f.write(audio)

    play(audio)

def generate_new_line(base64_image):
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image"},
                {
                    "type": "image_url",
                    "image_url": f"data:image/jpeg;base64,{base64_image}",
                },
            ],
        },
    ]

def analyze_image(base64_image, script):
    try:
        response = client.chat.completions.create(
            model="gpt-4-vision-preview",
            messages=[
                {
                    "role": "system",
                    "content": """
                    You are Sir David Attenborough. Narrate the picture of the human as if it is a nature documentary.
                    Make it snarky and funny. Don't repeat yourself. Make it short. If I do anything remotely interesting, make a big deal about it!
                    """,
                },
            ]
            + script
            + generate_new_line(base64_image),
            max_tokens=500,
        )
        response_text = response.choices[0].message.content
        return response_text
    except RateLimitError:
        # The v1 SDK exposes RateLimitError at the top level; there is no `openai.error` module anymore.
        print("Rate limit exceeded. Retrying in 60 seconds...")
        time.sleep(60)  # Wait for 60 seconds before retrying
        return analyze_image(base64_image, script)  # Recursive retry
    except Exception as e:
        print(f"Unexpected error occurred: {e}")
        return "Error in processing the image."

def main():
    script = []

    while True:
        try:
            image_path = os.path.join(os.getcwd(), "./frames/frame.jpg")
            base64_image = encode_image(image_path)

            print("👀 David is watching...")
            analysis = analyze_image(base64_image, script=script)

            print("🎙️ David says:")
            print(analysis)

            play_audio(analysis)

            script = script + [{"role": "assistant", "content": analysis}]

            # Increase delay to reduce frequency of requests
            time.sleep(30)  # Adjust the sleep time as needed

        except KeyboardInterrupt:
            print("Exiting...")
            break
        except Exception as e:
            print(f"An error occurred: {e}")

if __name__ == "__main__":
    main()

bubalina avatar Nov 20 '23 10:11 bubalina

@bubalina I tried your solution and the rate-limit errors continued. This seems to be a common problem with OpenAI API calls.

These OpenAI articles describe the problem:

https://help.openai.com/en/articles/6891829-error-code-429-rate-limit-reached-for-requests

https://help.openai.com/en/articles/6891753-rate-limit-advice

https://platform.openai.com/docs/guides/rate-limits?context=tier-free

However I haven't dug deep into solutions.

I appreciate your help.
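For anyone landing here later: the guidance in those links boils down to retrying with exponential backoff and jitter rather than a fixed wait. A minimal generic sketch of that idea (the helper name and delay values here are illustrative, not taken from the OpenAI docs or SDK):

```python
import time
import random

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying on any exception with exponentially growing, jittered delays."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # Out of retries: surface the last error
            # Double the delay each attempt, plus a small random jitter
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

In the script above, the fixed `time.sleep(60)` retry could then be replaced by wrapping the API call, e.g. `with_backoff(lambda: client.chat.completions.create(...))`. Note that this only helps with genuine rate limiting; an `insufficient_quota` error will keep recurring no matter how long you back off.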

bissella avatar Nov 23 '23 20:11 bissella

Hey @bissella if you're able to create multiple Azure instances, you could use LiteLLM's OpenAI-compatible server to load-balance across them:

Step 1: Put your instances in a config.yaml

model_list:
  - model_name: zephyr-beta
    litellm_params:
        model: huggingface/HuggingFaceH4/zephyr-7b-beta
        api_base: http://0.0.0.0:8001
  - model_name: zephyr-beta
    litellm_params:
        model: huggingface/HuggingFaceH4/zephyr-7b-beta
        api_base: http://0.0.0.0:8002
  - model_name: zephyr-beta
    litellm_params:
        model: huggingface/HuggingFaceH4/zephyr-7b-beta
        api_base: http://0.0.0.0:8003

Step 2: Install LiteLLM

$ pip install litellm

Step 3: Start litellm proxy w/ config.yaml

$ litellm --config /path/to/config.yaml

Docs: https://docs.litellm.ai/docs/simple_proxy
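Conceptually, the proxy spreads incoming requests across the `api_base` entries in the config above. LiteLLM's actual routing is more sophisticated (retries, cooldowns, multiple strategies), but the core round-robin idea can be illustrated with a stdlib-only sketch (this is not LiteLLM code, and the endpoints are the hypothetical ones from the config):

```python
from itertools import cycle

# The three api_base entries from the config above (assumed, not real endpoints).
backends = cycle([
    "http://0.0.0.0:8001",
    "http://0.0.0.0:8002",
    "http://0.0.0.0:8003",
])

def next_backend():
    """Return the next instance in round-robin order, wrapping around forever."""
    return next(backends)
```

Each client request would be forwarded to `next_backend()`, so no single instance absorbs all the traffic and per-instance rate limits are hit far less often.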

krrishdholakia avatar Nov 27 '23 17:11 krrishdholakia

> @bubalina I tried your solution and the rate-limit errors continued. This seems to be a common problem with OpenAI API calls.
>
> These OpenAI articles describe the problem:
>
> https://help.openai.com/en/articles/6891829-error-code-429-rate-limit-reached-for-requests
>
> https://help.openai.com/en/articles/6891753-rate-limit-advice
>
> https://platform.openai.com/docs/guides/rate-limits?context=tier-free
>
> However I haven't dug deep into solutions.
>
> I appreciate your help.

I would try generating a new API key

bubalina avatar Feb 13 '24 14:02 bubalina