bulk-downloader-for-reddit
Fix: Submissions with control character in title cannot be downloaded
bdfr archive --subreddit Unicode --sort new Z:/Reddit
fails on a Windows system.
If a submission contains an invalid character in a field used for the filename, the file cannot be saved and an exception occurs:
[2023-02-12 17:10:13,830 - root - ERROR] - Archiver exited unexpectedly
Traceback (most recent call last):
  File "Z:\bulk-downloader-for-reddit\bdfr\__main__.py", line 143, in cli_archive
    reddit_archiver.download()
  File "Z:\bulk-downloader-for-reddit\bdfr\archiver.py", line 60, in download
    self.write_entry(submission)
  File "Z:\bulk-downloader-for-reddit\bdfr\archiver.py", line 120, in write_entry
    self._write_entry_json(entry, content, hash)
  File "Z:\bulk-downloader-for-reddit\bdfr\archiver.py", line 132, in _write_entry_json
    self._write_content_to_disk(resource, content, hash)
  File "Z:\bulk-downloader-for-reddit\bdfr\archiver.py", line 167, in _write_content_to_disk
    with Path(file_path).open(mode="w", encoding="utf-8") as file:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Local\Programs\Python\Python\Lib\pathlib.py", line 1044, in open
    return io.open(self, mode, buffering, encoding, errors, newline)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 22] Invalid argument: "Z:\\Reddit\\Unicode\\Expert-Fun-2444_Unicode that doesn't exist \x03_vm3p1h.json"
The submission's title contains a control character (\x03 in the path above), which Windows does not allow in filenames. It needs to be either replaced or, in keeping with the existing style, removed.
The provided fix does the latter.
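
As a rough illustration of the "remove" approach, here is a minimal sketch. It is not the code from this PR: the helper name `strip_control_characters` and the choice to match only ASCII control characters 0-31 are assumptions made for this example.

```python
import re

# Hypothetical helper for illustration only; not bdfr's actual implementation.
# Assumption: only ASCII control characters (code points 0-31) are stripped.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f]")


def strip_control_characters(name: str) -> str:
    """Return `name` with ASCII control characters removed."""
    return _CONTROL_CHARS.sub("", name)


# Title taken from the failing path in the traceback above.
title = "Unicode that doesn't exist \x03"
cleaned = strip_control_characters(title)
print(repr(cleaned))
```

Applied to the title from the traceback, the cleaned string no longer contains \x03, so the filename built from it should no longer trigger the `OSError` on Windows.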