Substack2Markdown

Fix Error Too Many Requests

Open alberba opened this issue 7 months ago • 0 comments

This commit improves how the get_url_soup function handles rate-limiting errors (HTTP 429). The changes are detailed below:

Changes Made:

  1. Added max_attempts Parameter:

    • Introduced a new parameter max_attempts with a default value of 5, allowing multiple retry attempts when temporary errors occur.
  2. Handling "Too Many Requests" (HTTP 429):

    • Implemented an exponential backoff retry mechanism that is used when the page content indicates "too many requests."
    • Added random jitter to the delay between retries so that retries are not sent at fixed, predictable intervals, reducing the likelihood of triggering the server's rate limit again.
  3. Improved Error Messages:

    • Enhanced error messages to include the URL and number of failed attempts for better debugging.
  4. Additional Logic:

    • Checks if the page contains a <pre> element with the text "too many requests" and retries after a delay if detected.
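
The retry behaviour described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the commit: the names fetch_with_retries, is_rate_limited, backoff_delay, base_delay, and max_jitter are hypothetical, the fetch callable stands in for whatever HTTP call get_url_soup actually makes, and a regular expression stands in for the HTML parser used to find the <pre> element. Only max_attempts (default 5) and the "too many requests" check come from the description.

```python
import random
import re
import time
from typing import Callable

# Assumption: a regex is used here only to keep the sketch stdlib-only;
# the real function presumably inspects the parsed page instead.
PRE_RE = re.compile(r"<pre[^>]*>(.*?)</pre>", re.IGNORECASE | re.DOTALL)


def is_rate_limited(html: str) -> bool:
    """Return True if any <pre> element contains "too many requests"."""
    return any("too many requests" in body.lower() for body in PRE_RE.findall(html))


def backoff_delay(attempt: int, base_delay: float = 1.0, max_jitter: float = 1.0) -> float:
    """Delay after failed attempt number `attempt` (1-based).

    Exponential part: base_delay * 2 ** (attempt - 1).
    Jitter part: a uniform random value in [0, max_jitter].
    Both parameter values are illustrative assumptions.
    """
    return base_delay * 2 ** (attempt - 1) + random.uniform(0, max_jitter)


def fetch_with_retries(
    url: str,
    fetch: Callable[[str], str],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Fetch `url`, retrying with backoff while the page reports rate limiting."""
    for attempt in range(1, max_attempts + 1):
        html = fetch(url)
        if not is_rate_limited(html):
            return html
        # Do not sleep after the final attempt; fail immediately instead.
        if attempt < max_attempts:
            sleep(backoff_delay(attempt))
    # Error message includes the URL and the number of failed attempts.
    raise RuntimeError(
        f"Too many requests for {url}: gave up after {max_attempts} attempts"
    )
```

Passing fetch and sleep as parameters is a design choice made for this sketch so that the retry logic can be exercised without network access or real waiting; with the default values above, the delays after successive failed attempts fall in the ranges 1 to 2, 2 to 3, 4 to 5, and 8 to 9 seconds.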

alberba · May 11 '25 08:05