tumblr-crawler
Some suggestions
- It should support the https form of content addresses, like "https://tumblr.blahblah.com/blah". When access to a tumblr blog over plain http is blocked by the ISP, it should fall back to https://, or use https as the default.
- There should be a way to avoid retrying a download that has already failed once — for example, by saving a dummy file under the target file name.
- When the address is in the form "https://www.tumblr.com/dashboard/blog/blah", the crawler skips the downloads.
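Suggestion 2 could be sketched roughly as below — a minimal, hypothetical helper (the function and file-suffix names are my own, not from the crawler's code) that skips a URL whose file already exists or whose download previously failed, using an empty `.failed` marker file as the "dummy file":

```python
import os

def should_download(path):
    """Skip if the file is already present, or a failure marker exists for it."""
    return not os.path.exists(path) and not os.path.exists(path + ".failed")

def mark_failed(path):
    """Record a failed download so it is not retried on the next run."""
    open(path + ".failed", "w").close()
```

The download loop would call `should_download()` before fetching and `mark_failed()` when the fetch raises; deleting the `.failed` files would force a full retry.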
@meokbodizi Good suggestions. I don't have much bandwidth to implement this, although it's not very hard. Would you like to help with it? Thanks.
Let me try to find a way. Thanks for the fast response.
I agree with suggestion 2 the most. Having the ability to use something like an SQLite database to track which content has already been downloaded, and to skip it while processing the tumblr lists, would be fantastic! +1 vote.
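The SQLite idea might look something like this sketch, using only the standard-library `sqlite3` module (the class and table names are illustrative, not part of the project):

```python
import sqlite3

class DownloadLog:
    """Persistent record of URLs that have already been downloaded."""

    def __init__(self, db_path="downloads.sqlite"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS downloaded (url TEXT PRIMARY KEY)")

    def seen(self, url):
        """Return True if this URL was recorded in a previous run."""
        cur = self.conn.execute(
            "SELECT 1 FROM downloaded WHERE url = ?", (url,))
        return cur.fetchone() is not None

    def record(self, url):
        """Mark a URL as downloaded (idempotent)."""
        self.conn.execute(
            "INSERT OR IGNORE INTO downloaded (url) VALUES (?)", (url,))
        self.conn.commit()
```

The crawler would check `log.seen(url)` before each fetch and call `log.record(url)` afterwards; the same table could also store a status column to cover failed downloads from suggestion 2.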