fantiadl
Apply PR #134 to download functions
This PR includes #134 by @1223334444abc
- Related to #78
Changes
- Apply PR #134 to download functions.
- Bug fix for #134.
- Change the delay argument to a single float.
- Reformat some code.
Black-formatter
--line-length 260
I would prefer if we pick the changes without changing formatting. We should get an issue created for formatting in a pre-commit hook.
In this PR, the new handling of 404 errors causes fanclubs without a cover image to fail to download. Below are the differences between the main branch and this PR:
Downloading fanclub *****...
Collecting fanclub posts...
Collected 116 posts.
Downloading fanclub header...
Download URL returned 404. Skipping...
Downloading fanclub icon...
URL already downloaded. Skipping...
Downloading fanclub **********...
Collecting fanclub posts...
Collected 349 posts.
Downloading fanclub header...
Wait 1.25 s...
Encountered an error downloading fanclub. Skipping...
Traceback (most recent call last):
  File "G:\fantia\fantiadl-format-429\fantiadl\models.py", line 314, in download_followed_fanclubs
    self.download_fanclub(fanclub, limit)
  File "G:\fantia\fantiadl-format-429\fantiadl\models.py", line 290, in download_fanclub
    self.download_fanclub_metadata(fanclub)
  File "G:\fantia\fantiadl-format-429\fantiadl\models.py", line 270, in download_fanclub_metadata
    self.perform_download(header_url, header_filename, use_server_filename=self.use_server_filenames)
  File "G:\fantia\fantiadl-format-429\fantiadl\models.py", line 441, in perform_download
    request = self.safe_request("GET", url, stream=True)
  File "G:\fantia\fantiadl-format-429\fantiadl\models.py", line 244, in safe_request
    response.raise_for_status()
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://fantia.jp/images/fallback/fanclub/cover_image/_default3.png
Downloading fanclub **********...
Collecting fanclub posts...
Collected 203 posts.
Downloading fanclub header...
Wait 0.52 s...
Encountered an error downloading fanclub. Skipping...
Traceback (most recent call last):
  File "G:\fantia\fantiadl-format-429\fantiadl\models.py", line 314, in download_followed_fanclubs
    self.download_fanclub(fanclub, limit)
  File "G:\fantia\fantiadl-format-429\fantiadl\models.py", line 290, in download_fanclub
    self.download_fanclub_metadata(fanclub)
  File "G:\fantia\fantiadl-format-429\fantiadl\models.py", line 270, in download_fanclub_metadata
    self.perform_download(header_url, header_filename, use_server_filename=self.use_server_filenames)
  File "G:\fantia\fantiadl-format-429\fantiadl\models.py", line 441, in perform_download
    request = self.safe_request("GET", url, stream=True)
  File "G:\fantia\fantiadl-format-429\fantiadl\models.py", line 244, in safe_request
    response.raise_for_status()
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://fantia.jp/images/fallback/fanclub/cover_image/_default0.png
safe_request now only handles 429 errors internally; all other HTTP responses (including 404) are returned for external handling.
models.py
def safe_request(self, method, url, **kwargs):
    # Download Delay
    dl_delay = random.random() * self.delay
    print("Wait {:.2f} s...".format(dl_delay))
    time.sleep(dl_delay)
    for attempt in range(3):
        try:
            response = self.session.request(method, url, **kwargs)
            # Handle 429 errors
            if response.status_code == 429:
                self.consecutive_429 += 1
                self.output(f"HTTP 429 Too Many Requests (attempt {self.consecutive_429}/3)\n")
                if self.consecutive_429 >= 3:
                    self.output("Three consecutive 429 errors detected.\n")
                    choice = input("Continue waiting? (y/n): ").lower()
                    if choice != "y":
                        raise SystemExit("Aborted by user")
                    self.consecutive_429 = 0  # Reset counter
                self.output("Wait {:.2f} s...\n".format(self.retry_wait))
                time.sleep(self.retry_wait)
                continue
            self.consecutive_429 = 0  # Reset counter on success
            return response
        except requests.exceptions.RetryError as e:
            if attempt == 2:  # Final attempt
                raise Exception(f"Request failed after 3 attempts: {str(e)}")
            continue
    raise Exception("Max retries exceeded")
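Since safe_request now returns non-429 responses as-is, the caller has to decide what a 404 means. A minimal sketch of that caller-side check follows. It is an illustration only: `should_skip_download` and `FakeResponse` are hypothetical names that do not exist in fantiadl, and `FakeResponse` is just a stand-in for `requests.Response` so the example runs without network access. The skip message is copied from the main-branch log above.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for requests.Response, used only in this sketch."""

    status_code: int

    def raise_for_status(self):
        # Mimics requests: error statuses (4xx/5xx) raise, everything else passes.
        if self.status_code >= 400:
            raise RuntimeError(f"HTTP {self.status_code}")


def should_skip_download(response, output=print):
    """Return True when the caller should skip this URL instead of failing.

    Hypothetical helper: a 404 is treated as "nothing to download";
    any other error status is still raised to the caller.
    """
    if response.status_code == 404:
        output("Download URL returned 404. Skipping...\n")
        return True
    response.raise_for_status()  # other 4xx/5xx responses still raise
    return False
```

Under these assumptions, perform_download could call the helper right after `request = self.safe_request("GET", url, stream=True)` and return early when it yields True, which would let fanclubs with a missing cover image continue downloading as they do on the main branch.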