reddit-html-archiver

Issue getting write_html past generate_html

Open 5000thinmints opened this issue 4 years ago • 4 comments

Here's the error on Windows 10, having just installed the latest Python and snudown. My data folder is about a gig and a half. I assumed min score, min comments, and hide deleted are all set to something by default.

E:\Myfolder\reddittohtml>write_html.py
Traceback (most recent call last):
  File "E:\Myfolder\reddittohtml\write_html.py", line 774, in <module>
    generate_html(args.min_score, args.min_comments, hide_deleted_comments)
  File "E:\Myfolder\reddittohtml\write_html.py", line 114, in generate_html
    raw_links = load_links(d, sub, True)
  File "E:\Myfolder\reddittohtml\write_html.py", line 625, in load_links
    comments_file_path = daily_path + '/' + link_row['id'] + '.csv'
KeyError: 'id'

Here's what happens when I run it with the min score, min comments, etc. options set:

E:\Myfolder\reddittohtml>write_html.py --min-score -4 --min-comments 2 --hide-deleted-comments
Traceback (most recent call last):
  File "E:\Myfolder\reddittohtml\write_html.py", line 774, in <module>
    generate_html(args.min_score, args.min_comments, hide_deleted_comments)
  File "E:\Myfolder\reddittohtml\write_html.py", line 114, in generate_html
    raw_links = load_links(d, sub, True)
  File "E:\Myfolder\reddittohtml\write_html.py", line 625, in load_links
    comments_file_path = daily_path + '/' + link_row['id'] + '.csv'
KeyError: 'id'

5000thinmints, Jul 18 '20 15:07
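
For anyone hitting the same `KeyError: 'id'`: the traceback shows `load_links` indexing `link_row['id']`, so at least one row being read apparently has no `id` field. Below is a minimal sketch of a defensive reader, assuming the link rows come from a CSV file with a header line. The helper `load_rows_with_id` and the sample data are hypothetical; this is not the fix that was pushed to the repository.

```python
import csv
import io

def load_rows_with_id(csv_file):
    """Yield only rows that carry a non-empty 'id' field (hypothetical helper)."""
    reader = csv.DictReader(csv_file)
    for row in reader:
        # row.get() returns None when the header has no 'id' column,
        # and '' when the column exists but the cell is empty.
        if not row.get('id'):
            continue
        yield row

# Made-up sample data: one good row, one row with an empty id.
sample = io.StringIO("id,title\nabc123,First post\n,Row without an id\n")
rows = list(load_rows_with_id(sample))
print([r['id'] for r in rows])
```

Skipping bad rows keeps the run going, but it can also hide a malformed or empty data file, so logging the skipped rows may be the better choice in practice.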

I pushed a fix, will you try again?

libertysoft3, Jul 18 '20 19:07

So it took longer to print out an error this time. If it helps, some of the subreddits I'm pulling from frequently use non-English characters (e.g. Hindi, Korean).

E:\Myfolder\reddittohtml>write_html.py
Traceback (most recent call last):
  File "E:\Myfolder\reddittohtml\write_html.py", line 774, in <module>
    generate_html(args.min_score, args.min_comments, hide_deleted_comments)
  File "E:\Myfolder\reddittohtml\write_html.py", line 119, in generate_html
    write_link_page(subs, l, sub, hide_deleted_comments)
  File "E:\Myfolder\reddittohtml\write_html.py", line 288, in write_link_page
    '###BODY###': snudown.markdown(c['body'].replace('&gt;','>')),
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc2 in position 382: invalid continuation byte

5000thinmints, Jul 19 '20 01:07

Cool, forward progress. I'm not too hip with Windows, but can you try running the 2 commands listed here under "Windows users may need to run"? https://github.com/libertysoft3/reddit-html-archiver/blob/master/README.md#install

libertysoft3, Jul 19 '20 20:07

Running the commands in cmd as admin still gives the same error:

E:\Myfolder\reddittohtml>chcp 65001
Active code page: 65001

E:\Myfolder\reddittohtml>set PYTHONIOENCODING=utf-8

E:\Myfolder\reddittohtml>write_html.py
Traceback (most recent call last):
  File "E:\Myfolder\reddittohtml\write_html.py", line 774, in <module>
    generate_html(args.min_score, args.min_comments, hide_deleted_comments)
  File "E:\Myfolder\reddittohtml\write_html.py", line 119, in generate_html
    write_link_page(subs, l, sub, hide_deleted_comments)
  File "E:\Myfolder\reddittohtml\write_html.py", line 288, in write_link_page
    '###BODY###': snudown.markdown(c['body'].replace('&gt;','>')),
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc2 in position 382: invalid continuation byte

5000thinmints, Jul 20 '20 14:07
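
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: `chcp 65001` and `PYTHONIOENCODING` only affect console input and output. Python's `open()` called without an `encoding` argument falls back to the locale's preferred encoding, which on Windows is often a legacy code page such as cp1252, so text read from the data files could already be mis-decoded before it reaches snudown. The sketch below reads and writes a CSV with explicit UTF-8, assuming the data files are UTF-8. The file name `comments.csv` and the helper `read_rows_utf8` are hypothetical and are not taken from write_html.py.

```python
import csv
import locale
import os
import tempfile

def read_rows_utf8(path):
    """Read a CSV as UTF-8 regardless of the platform default (hypothetical helper)."""
    with open(path, 'r', encoding='utf-8', newline='') as f:
        return list(csv.DictReader(f))

# What open() would use if no encoding were passed (e.g. 'cp1252' on many Windows setups).
print(locale.getpreferredencoding(False))

# Write a small sample file containing Hindi and Korean text, then read it back.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'comments.csv')
with open(path, 'w', encoding='utf-8', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['id', 'body'])
    writer.writeheader()
    writer.writerow({'id': 'c1', 'body': 'नमस्ते 안녕하세요'})

rows = read_rows_utf8(path)
# ascii() keeps the print safe on consoles that cannot render these characters.
print(ascii(rows[0]['body']))
```

If the printed default encoding is not UTF-8, passing `encoding='utf-8'` wherever the script opens its data files would be the first thing to try.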