RedditDownloader
In praw_wrapper.py the function _praw_apply_filter is not handling 404 exceptions
Describe the bug
Some 404 responses stop all downloads: the UI hangs, and the RMD application has to be restarted.
Environment Info
- Ubuntu 20.04 LTS
- RMD 3.1.5
Screenshots/Information
The following is the last data dumped:
Sep 22 12:44:35 vm-rmd RMD-ubuntu[85316]: HTTPSConnectionPool(host='assets', port=443): Max retries exceeded with url: /favicon-16x16.png (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f1eabaa3b10>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: Process RedditElementLoader:
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: Traceback (most recent call last):
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "multiprocessing/process.py", line 297, in _bootstrap
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "processing/redditloader.py", line 30, in run
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "processing/redditloader.py", line 51, in load
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "processing/redditloader.py", line 65, in _scan_sources
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "sources/subreddit_posts_source.py", line 16, in get_elements
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "static/praw_wrapper.py", line 131, in subreddit_posts
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "static/praw_wrapper.py", line 221, in _praw_apply_filter
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "praw/models/listing/generator.py", line 63, in next
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "praw/models/listing/generator.py", line 73, in _next_batch
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "praw/reddit.py", line 566, in get
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "praw/reddit.py", line 672, in _objectify_request
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "praw/reddit.py", line 855, in request
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "prawcore/sessions.py", line 331, in request
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: File "prawcore/sessions.py", line 260, in _request_with_retries
Sep 22 12:44:42 vm-rmd RMD-ubuntu[85313]: prawcore.exceptions.NotFound: received 404 HTTP response
Additional context
It appears that in _praw_apply_filter you are only handling:
except TypeError as e:
When prawcore/sessions.py gets a 404 the exception bubbles up the stack and stops RMD dead in its tracks.
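For illustration only, here is a rough sketch of the kind of handling I mean. I have not seen how _praw_apply_filter is actually written, so apply_filter_safely and its accept parameter are made-up names, not RMD code. The only real identifier is prawcore.exceptions.NotFound, taken from the traceback; the fallback class just lets the sketch run without prawcore installed.

```python
import logging

try:
    # The exception type named in the traceback above.
    from prawcore.exceptions import NotFound
except ImportError:
    # Stand-in so this sketch still runs where prawcore is not installed.
    class NotFound(Exception):
        pass


def apply_filter_safely(listing, accept=lambda post: True):
    """Yield accepted posts from `listing`, ending quietly on a 404.

    Hypothetical helper, not the real _praw_apply_filter. Once the
    underlying generator raises it cannot be resumed, so a 404 ends
    this one source instead of killing the whole loader process.
    """
    iterator = iter(listing)
    while True:
        try:
            post = next(iterator)
        except StopIteration:
            return
        except NotFound as err:
            logging.warning("Source returned 404, skipping the rest of it: %s", err)
            return
        if accept(post):
            yield post
```

With something like this, a source that 404s partway through would be logged and skipped, and the remaining sources would still be scanned.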
Interesting. I wasn't aware it was possible to receive a 404 from Reddit mid-scan like that. Do you know what source is providing the problematic post?
I do not. I tried to figure it out, but the error logs said little about it, and the download queue was thousands of entries long.
It happens often, though. It may be related to using PIA VPN, but that is just wild conjecture.
Thanks for checking anyway. I'll try to hunt this down, but it is unlikely that I'll continue pushing out updates to the Python build of RMD for much longer, since the rewrite is nearing feature parity. It is prohibitively difficult to chase down all the edge cases the Python build seems to produce for some users.