reddit-html-archiver

Restricting write_html.py

Open: ghost opened this issue 5 years ago • 3 comments

I think it would be nice to run something like this:

./fetch_links.py linux 2018-01-01 2018-12-31
./write_html.py linux 2018-01-01 2018-12-31

The first command already works fine, but the second one doesn't.

Unfortunately, running ./write_html.py without arguments requires too much memory on my system and takes too long. After more than 30 minutes I had to stop it manually.

Being able to process posts from just one single subreddit in a limited time interval would probably help.
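
To make the request concrete, here is a rough sketch of the kind of filtering I have in mind. It assumes each post is a dict with "subreddit" and "created_utc" keys; parse_args and keep_post are hypothetical names for illustration, not functions that exist in write_html.py.

```python
import argparse
from datetime import date, datetime, timezone


def parse_args(argv=None):
    # Hypothetical CLI: all three positional arguments are optional, so
    # calling the script with no arguments keeps the current behaviour.
    parser = argparse.ArgumentParser(description="filter posts before writing HTML")
    parser.add_argument("subreddit", nargs="?", default=None)
    parser.add_argument("start", nargs="?", type=date.fromisoformat, default=None)
    parser.add_argument("end", nargs="?", type=date.fromisoformat, default=None)
    return parser.parse_args(argv)


def keep_post(post, args):
    # Assumed post shape: {"subreddit": str, "created_utc": int}.
    if args.subreddit and post["subreddit"].lower() != args.subreddit.lower():
        return False
    day = datetime.fromtimestamp(post["created_utc"], tz=timezone.utc).date()
    if args.start and day < args.start:
        return False
    if args.end and day > args.end:
        return False
    return True
```

With something like this, posts outside the requested subreddit or date range could be skipped as they are read, instead of being held in memory.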

ghost avatar Oct 09 '19 18:10 ghost

So I see two issues here, the RAM issue and the issue of wanting to output a particular subreddit with write_html.py. Do you want to output multiple subs eventually? The issue is that write_html.py wants to know all of the subs that will be written, so it can create the index.html page as well as the 'subreddits' dropdown in the menu bar.

If you want to output multiple subs, then fixing the RAM issue would be the simplest and most straightforward way to move forward.

Maybe you can add some logging statements with memory used and figure out where it's going wrong? How big is all of your data in 'data'?
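
Something along these lines might be enough to see where the memory goes. It is only a sketch using the standard library's tracemalloc and logging modules; log_memory is a made-up helper name, and the list below is a stand-in for whatever write_html.py actually loads.

```python
import logging
import tracemalloc

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("memcheck")


def log_memory(label):
    # Report memory currently allocated by Python objects and the peak
    # since tracing started, both in MiB.
    current, peak = tracemalloc.get_traced_memory()
    log.info("%s: current=%.1f MiB peak=%.1f MiB",
             label, current / 2**20, peak / 2**20)
    return current, peak


tracemalloc.start()
log_memory("start")
posts = [str(i) * 10 for i in range(100_000)]  # stand-in for loading the data
log_memory("after load")
```

Calling log_memory after each major step (loading links, sorting, writing each subreddit) should show which step makes the numbers jump. Note that tracemalloc only counts allocations made through Python's allocator and slows the run down somewhat.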

libertysoft3 avatar Oct 10 '19 05:10 libertysoft3

The best thing would be to fix the RAM issue, if possible, keeping write_html.py usage as it currently is. In my test, write_html.py used all 8 gigabytes of RAM plus 5 gigabytes of swap before I stopped it. My data directory is currently 3.1 gigabytes, and it will keep growing because I'm always adding new subreddits. Should I create another issue specifically for the RAM problem?

ghost avatar Oct 10 '19 10:10 ghost

Ugh, yeah something is definitely wrong there with that RAM usage. Yeah let's make a new RAM fail issue. I'll try to fix that RAM thing sometime. I'll change this one to an 'enhancement'.

libertysoft3 avatar Oct 11 '19 06:10 libertysoft3