reddit-html-archiver
Thoughts
First, thanks for the app. I love it. I archive mostly text-based subreddits and it works great.
I've just had some thoughts on it and am wondering if you have any plans for a few things.
- Instead of CSV, shove the data into a SQLite db?
- Support markdown? i.e. makemd.py
I was thinking of maybe a quick Flask frontend to the SQLite db. That way I could do some sorting by top, newest, etc., like a little curated offline reddit.
To add to this a little: would it be possible for it to read the last archived date of a subreddit and automatically archive from there? I'm thinking of something like a daily cron job, so it pulls all missing days up to, say, today-1.
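For anyone who wants to try the "resume from the last archived date" idea, here is a minimal Python sketch. It is not part of reddit-html-archiver: the `created_utc` column name (epoch seconds) and both helper functions are assumptions made for illustration, so check them against the CSV the tool actually writes.

```python
import csv
import io
from datetime import date, datetime, timedelta, timezone


def last_archived_date(csv_file):
    """Return the UTC date of the newest row, or None if there are no rows."""
    reader = csv.DictReader(csv_file)
    # Hypothetical column: `created_utc` holding epoch seconds.
    newest = max((int(float(row["created_utc"])) for row in reader), default=None)
    if newest is None:
        return None
    return datetime.fromtimestamp(newest, tz=timezone.utc).date()


def days_to_fetch(last_date, today):
    """List the dates from the last archived day through yesterday, inclusive."""
    # The last archived day is fetched again, since it may have been incomplete.
    end = today - timedelta(days=1)
    days = []
    current = last_date
    while current <= end:
        days.append(current)
        current += timedelta(days=1)
    return days


sample = io.StringIO(
    "id,created_utc,title\nabc,1546300800,first\ndef,1546473600,second\n"
)
last = last_archived_date(sample)
print(last, days_to_fetch(last, date(2019, 1, 6)))
```

A daily cron job could call something like this and pass each returned date to whatever fetch command is already in use. Re-fetching the last archived day is a deliberate choice here, so a partially archived day gets completed.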
Hello, glad you're digging this little tool.
Instead of CSV, shove the data into a SQLite db?
Not a bad idea, that'd be cleaner.
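As a rough illustration of what the SQLite route could look like, the sketch below copies CSV rows into a table and sorts them by "top" or "new". It is only a sketch under assumptions: the `posts` table, the column names (`id`, `created_utc`, `score`, `title`), and both functions are hypothetical and not taken from the archiver's code.

```python
import csv
import io
import sqlite3


def load_posts(conn, csv_file):
    """Copy rows from an archive CSV into a `posts` table, skipping duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id TEXT PRIMARY KEY, created_utc INTEGER, score INTEGER, title TEXT)"
    )
    rows = [
        (r["id"], int(r["created_utc"]), int(r["score"]), r["title"])
        for r in csv.DictReader(csv_file)
    ]
    # INSERT OR IGNORE keeps re-runs from duplicating posts already stored.
    conn.executemany("INSERT OR IGNORE INTO posts VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def titles_by(conn, order):
    """Return titles sorted by 'top' (score) or 'new' (creation time)."""
    # The column name comes from a fixed mapping, never from user input.
    column = {"top": "score", "new": "created_utc"}[order]
    query = f"SELECT title FROM posts ORDER BY {column} DESC"
    return [title for (title,) in conn.execute(query)]


sample = "id,created_utc,score,title\na1,100,5,older\nb2,200,3,newer\n"
conn = sqlite3.connect(":memory:")
load_posts(conn, io.StringIO(sample))
load_posts(conn, io.StringIO(sample))  # second load adds nothing new
print(titles_by(conn, "top"), titles_by(conn, "new"))
```

Keeping the CSV as the source and loading it into SQLite as an extra step would make this an option rather than a replacement, and a small frontend could then run the same kind of sorted queries.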
Support markdown? i.e. makemd.py
It's already using reddit's official markdown renderer (https://github.com/chid/snudown). Maybe you are finding some of the CSS styles of the markdown lacking.
Yeah, it's a perfect little tool. It does one thing very well. Appreciate the effort. I'm archiving nosleep (and other story subs). Maybe it's the way the original author wrote it, but I'm seeing a lot of 1-2 sentences and then a paragraph.
Re SQLite, I'm not sure if it should replace the CSV or be an option.
Hope this isn't coming across as being a pest. I'm just thinking about things while I'm using it.
- Resolve links to internal ones: in a lot of the story subs, posts link to the next iteration or previous ones. Is there an easyish way to reformat those to internal links?
- Localized media would be great if possible. That may be a lot of work, though.
Maybe it's the way the original author wrote it, but I'm seeing a lot of 1-2 sentences and then a paragraph
You should be able to view the post on reddit and find the same comment and compare the layout. If you found anything wrong with how this archiver is doing things, I'd be into fixing/improving it.
I'm not sure if it should replace the CSV or be an option
Yeah I'd have to have a compelling reason to switch things around.
they link to the next iteration or previous ones
Oh, like posts linking to each other? Yeah, you might be able to pull that off pretty easily with a regex. I tried to keep the URLs very similar on purpose. I guess you could do it in JavaScript so you don't have to touch the archived data.
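The regex idea could be sketched roughly as follows, in Python rather than JavaScript, working on a string of HTML instead of the stored data. The local layout assumed here (`<root>/r/<subreddit>/comments/<id>/<slug>/index.html`), the `root` parameter, and the function name are all hypothetical; compare them with the paths the archiver really generates before relying on this.

```python
import re

# Matches href values that point at a reddit post (not a single comment).
REDDIT_POST_LINK = re.compile(
    r'href="https?://(?:www\.|old\.|np\.)?reddit\.com'
    r'(/r/[^/"?#]+/comments/[^/"?#]+(?:/[^/"?#]+)?)/?(?:[?#][^"]*)?"'
)


def localize_links(html, root=".."):
    """Rewrite reddit post links so they point at the assumed local layout."""
    return REDDIT_POST_LINK.sub(
        lambda m: f'href="{root}{m.group(1)}/index.html"', html
    )


page = '<a href="https://www.reddit.com/r/nosleep/comments/abc123/part_two/">next</a>'
print(localize_links(page))
```

Links to other sites, and permalinks to individual comments, do not match the pattern and are left unchanged. A link to a post that was never archived would still be rewritten and end up broken, so a real version would probably check that the target file exists first.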
Localized media would be great if possible. That may be a lot of work, though.
It would be doable, but after thumbnails and simple image posts it gets real involved real quick. And your archive size would explode. Pull requests accepted :)
I'm not very skilled with Python at all, but I'm getting more interested in it. I may play around a bit with the URL stuff, especially after I'm done archiving my story subs.