bulk-downloader-for-reddit
[BUG] DuplicateReplaceException: A duplicate comment has been detected.
- [x] I am reporting a bug.
- [x] I am running the latest version of BDfR
- [x] I have read the "Opening an issue" guide.
Description
When using --all-comments, I get the following error after a few comments are downloaded:
praw.exceptions.DuplicateReplaceException: A duplicate comment has been detected. Are you attempting to call 'replace_more_comments' more than once?
Command
bdfr archive DIRECTORY --user me --submitted --all-comments --authenticate --file-scheme "{REDDITOR}_{POSTID}_{DATE}"
Environment (please complete the following information)
- OS: OSX 12.6.6
- Python version: Python 3.11.1
Logs
On console:
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/bdfrx/__main__.py", line 117, in cli_download
reddit_downloader = RedditDownloader(config, [stream])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/bdfrx/downloader.py", line 40, in __init__
super().__init__(args, logging_handlers)
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/bdfrx/connector.py", line 63, in __init__
self._setup_internal_objects()
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/bdfrx/connector.py", line 80, in _setup_internal_objects
self.create_reddit_instance()
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/bdfrx/connector.py", line 156, in create_reddit_instance
token = oauth2_authenticator.retrieve_new_token()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/bdfrx/oauth2.py", line 73, in retrieve_new_token
refresh_token = reddit.auth.authorize(params["code"])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/praw/models/auth.py", line 54, in authorize
authorizer.authorize(code)
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/prawcore/auth.py", line 242, in authorize
self._request_token(
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/prawcore/auth.py", line 155, in _request_token
response = self._authenticator._post(url, **data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/devon/.local/pipx/venvs/bdfrx/lib/python3.11/site-packages/prawcore/auth.py", line 38, in _post
raise ResponseException(response)
prawcore.exceptions.ResponseException: received 401 HTTP response
In log file:
[2023-06-22 16:58:43,763 - bdfr.connector - DEBUG] - Disabling the following modules:
[2023-06-22 16:58:43,763 - bdfr.connector - Level 9] - Created download filter
[2023-06-22 16:58:43,763 - bdfr.connector - Level 9] - Created time filter
[2023-06-22 16:58:43,763 - bdfr.connector - Level 9] - Created sort filter
[2023-06-22 16:58:43,768 - bdfr.connector - Level 9] - Create file name formatter
[2023-06-22 16:58:43,768 - bdfr.connector - DEBUG] - Using authenticated Reddit instance
[2023-06-22 16:58:43,963 - bdfr.oauth2 - Level 9] - Loaded OAuth2 token for authoriser
[2023-06-22 16:58:44,138 - bdfr.oauth2 - Level 9] - Written OAuth2 token from authoriser to /Users/devon/Library/Application Support/bdfr/default_config.cfg
[2023-06-22 16:58:44,354 - bdfr.connector - Level 9] - Resolved user to DevonAndChris
[2023-06-22 16:58:44,354 - bdfr.connector - Level 9] - Created site authenticator
[2023-06-22 16:58:44,354 - bdfr.connector - Level 9] - Retrieved subreddits
[2023-06-22 16:58:44,354 - bdfr.connector - Level 9] - Retrieved multireddits
[2023-06-22 16:58:44,354 - bdfr.archiver - DEBUG] - Retrieving comments of user DevonAndChris
[2023-06-22 16:58:44,355 - bdfr.connector - Level 9] - Retrieved user data
[2023-06-22 16:58:44,355 - bdfr.connector - Level 9] - Retrieved submissions for given links
[2023-06-22 16:58:46,095 - bdfr.archiver - DEBUG] - Attempting to archive submission jp1rmrr
[2023-06-22 16:58:52,610 - bdfr.archiver - DEBUG] - Writing entry jp1rmrr to file in JSON format at /Users/devon/Documents/archives/bdfr-auth9/BlockedAndReported/DevonAndChris_jp1rmrr_2023-06-21T23:25:13.json
[2023-06-22 16:58:52,610 - bdfr.archiver - INFO] - Record for entry item jp1rmrr written to disk
[2023-06-22 16:58:52,610 - bdfr.archiver - DEBUG] - Attempting to archive submission jp1t56r
[2023-06-22 16:58:58,440 - bdfr.archiver - DEBUG] - Writing entry jp1t56r to file in JSON format at /Users/devon/Documents/archives/bdfr-auth9/BlockedAndReported/DevonAndChris_jp1t56r_2023-06-21T23:39:07.json
[2023-06-22 16:58:58,440 - bdfr.archiver - INFO] - Record for entry item jp1t56r written to disk
[2023-06-22 16:58:58,440 - bdfr.archiver - DEBUG] - Attempting to archive submission jozxusg
[2023-06-22 16:59:01,372 - root - ERROR] - Archiver exited unexpectedly
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/bdfr/__main__.py", line 139, in cli_archive
reddit_archiver.download()
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/bdfr/archiver.py", line 49, in download
self.write_entry(submission)
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/bdfr/archiver.py", line 92, in write_entry
self._write_entry_json(archive_entry)
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/bdfr/archiver.py", line 103, in _write_entry_json
content = json.dumps(entry.compile())
^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/bdfr/archive_entry/comment_archive_entry.py", line 19, in compile
self.post_details = self._convert_comment_to_dict(self.source)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/bdfr/archive_entry/base_archive_entry.py", line 36, in _convert_comment_to_dict
in_comment.replies.replace_more(limit=None)
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/praw/util/deprecate_args.py", line 43, in wrapped
return func(**dict(zip(_old_args, args)), **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/praw/models/comment_forest.py", line 195, in replace_more
self._insert_comment(comment)
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/praw/models/comment_forest.py", line 80, in _insert_comment
raise DuplicateReplaceException
praw.exceptions.DuplicateReplaceException: A duplicate comment has been detected. Are you attempting to call 'replace_more_comments' more than once?
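The traceback above shows the exception escaping from in_comment.replies.replace_more(limit=None) inside _convert_comment_to_dict. One possible stopgap, sketched here under assumptions and not BDFR's actual fix, is to catch the exception and carry on with whatever replies are already loaded. The helper name expand_replies_tolerantly is hypothetical, and the stand-in exception class is only there so the sketch runs without PRAW installed. Note that swallowing the error may leave the archived reply tree incomplete.

```python
import logging

try:
    # Real exception class when PRAW is installed.
    from praw.exceptions import DuplicateReplaceException
except ImportError:
    # Stand-in so this sketch runs without PRAW; not part of any library.
    class DuplicateReplaceException(Exception):
        pass

logger = logging.getLogger(__name__)


def expand_replies_tolerantly(forest) -> bool:
    """Hypothetical helper: expand "more comments" placeholders in a forest.

    `forest` is assumed to expose replace_more(limit=...), as PRAW's
    CommentForest does. Returns True if expansion completed and False if
    PRAW reported a duplicate comment partway through.
    """
    try:
        forest.replace_more(limit=None)
    except DuplicateReplaceException:
        # Keep whatever was loaded so far instead of aborting the run.
        logger.warning("Duplicate comment while expanding replies; tree may be partial")
        return False
    return True
```

In this sketch the caller decides what to do with a False result, for example recording the entry as partial rather than exiting the whole archive run.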
Same issue, though I guess with only 7 hours left, it's pointless to hope for a fix.
There will be fixes; the BDFR will be maintained going forward. We're not stopping.
Do you have any other submission IDs for which this error occurs? The one in the logs does not exist.
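For anyone collecting more failing IDs from their own runs, the log can be scanned for the last "Attempting to archive submission" line before each "Archiver exited unexpectedly" error. This is a minimal sketch that assumes the log line format shown in the report above; the function name find_failing_ids is hypothetical and not part of BDFR.

```python
import re

# Patterns assume the log format shown in the report above.
ATTEMPT_RE = re.compile(r"- DEBUG\] - Attempting to archive submission (\w+)")
ERROR_RE = re.compile(r"- ERROR\] - Archiver exited unexpectedly")


def find_failing_ids(log_lines):
    """Return the IDs that were being archived when the archiver exited."""
    failing = []
    last_attempt = None
    for line in log_lines:
        match = ATTEMPT_RE.search(line)
        if match:
            # Remember the most recent ID the archiver started on.
            last_attempt = match.group(1)
        elif ERROR_RE.search(line) and last_attempt is not None:
            failing.append(last_attempt)
            last_attempt = None
    return failing


sample = [
    "[2023-06-22 16:58:52,610 - bdfr.archiver - DEBUG] - Attempting to archive submission jp1t56r",
    "[2023-06-22 16:58:58,440 - bdfr.archiver - INFO] - Record for entry item jp1t56r written to disk",
    "[2023-06-22 16:58:58,440 - bdfr.archiver - DEBUG] - Attempting to archive submission jozxusg",
    "[2023-06-22 16:59:01,372 - root - ERROR] - Archiver exited unexpectedly",
]
print(find_failing_ids(sample))  # prints ['jozxusg']
```

To run it against a real log, pass an open file object instead of the sample list, since the function only iterates over lines.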
I'm having the same issue, and I found another ID that causes it, but it's a comment, not a submission.
The ID causing the issue for me is jns7s0a, i.e. this comment: https://www.reddit.com/r/OutOfTheLoop/comments/146m5y0/whats_the_deal_with_so_many_people_mourning_the/jns7s0a/
After trimming down the command I was using, I got to this, which should be enough to reproduce it:
bdfr clone --link jns7s0a [directory] --file-scheme '{POSTID}'