tumblr-crawler
Some suggestions
- It should support the https form of content addresses, like "https://tumblr.blahblah.com/blah". When access to a tumblr blog over plain http is blocked by the ISP, it should fall back to https://, or use https as the default.
- There should be a way to avoid retrying a download that has already failed once — for example, by saving a dummy file under the target file name.
- When the address is in the form "https://www.tumblr.com/dashboard/blog/blah", the crawler skips the downloads.
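Suggestion 2 could be sketched roughly as below — a minimal, hypothetical helper (the function and file-suffix names are my own, not from the crawler's code) that skips a URL whose file already exists or whose download previously failed, using an empty `.failed` marker file as the "dummy file":

```python
import os

def should_download(path):
    """Skip if the file is already present, or a failure marker exists for it."""
    return not os.path.exists(path) and not os.path.exists(path + ".failed")

def mark_failed(path):
    """Record a failed download so it is not retried on the next run."""
    open(path + ".failed", "w").close()
```

The download loop would call `should_download()` before fetching and `mark_failed()` when the fetch raises; deleting the `.failed` files would force a full retry.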
@meokbodizi Good suggestions. I don't have much bandwidth to implement this, although it's not very hard. Would you like to help with it? Thanks.
Let me try to find a way. Thanks for the fast response.
I agree with suggestion 2 the most. Having the ability to use something like an SQLite database to track which content has already been downloaded, and to skip it while processing the tumblr lists, would be fantastic! +1 vote.
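The SQLite idea might look something like this sketch, using only the standard-library `sqlite3` module (the class and table names are illustrative, not part of the project):

```python
import sqlite3

class DownloadLog:
    """Persistent record of URLs that have already been downloaded."""

    def __init__(self, db_path="downloads.sqlite"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS downloaded (url TEXT PRIMARY KEY)")

    def seen(self, url):
        """Return True if this URL was recorded in a previous run."""
        cur = self.conn.execute(
            "SELECT 1 FROM downloaded WHERE url = ?", (url,))
        return cur.fetchone() is not None

    def record(self, url):
        """Mark a URL as downloaded (idempotent)."""
        self.conn.execute(
            "INSERT OR IGNORE INTO downloaded (url) VALUES (?)", (url,))
        self.conn.commit()
```

The crawler would check `log.seen(url)` before each fetch and call `log.record(url)` afterwards; the same table could also store a status column to cover failed downloads from suggestion 2.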