InstagramCrawler reworking how total post count is pulled from page; fixes issue #26

Open wwwehr opened this issue 7 years ago • 3 comments

Instagram has changed the format of the page. I could no longer detect a CSS_LOAD_MORE element, so I removed that as well. With this change, I can now run this command successfully:

$ python instagramcrawler.py -q '#breakfast' -n 5
dir_prefix: ./data/, query: #breakfast, crawl_type: photos, number: 5, caption: False, authentication: None
posts: 70478766, number: 5
Scraping photo links...
Number of photo_links: 33
Saving...
Downloading 5 images to ata/breakfast.hashtag
Quitting driver...

Tested using geckodriver 0.20.0 for OS X:

https://github.com/mozilla/geckodriver/releases/download/v0.20.0/geckodriver-v0.20.0-macos.tar.gz

Mar 18 '18 18:03 wwwehr
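
For readers following along, here is a minimal sketch of one way a total post count such as the `posts: 70478766` value in the output above could be pulled out of page text. It is not the code from this change: the function name `parse_post_count`, the assumption that the page shows text like "70,478,766 posts", and the handling of abbreviated forms such as "70.4m posts" are all hypothetical.

```python
import re

# Multipliers for abbreviated counts such as "70.4m posts" (assumed format).
_SUFFIXES = {"k": 1_000, "m": 1_000_000, "b": 1_000_000_000}


def parse_post_count(text):
    """Return the post count found in `text` as an int, or None if absent.

    Hypothetical helper: assumes the count appears as "<number> posts",
    with optional thousands separators or a k/m/b suffix.
    """
    match = re.search(r"(\d[\d,.]*)\s*([kmb])?\s+posts?\b", text, re.IGNORECASE)
    if not match:
        return None
    number, suffix = match.group(1), match.group(2)
    if suffix:
        # Abbreviated form: treat "." as a decimal point, then scale.
        value = float(number.replace(",", "")) * _SUFFIXES[suffix.lower()]
        return int(round(value))
    # Full form: "," and "." are thousands separators, so drop them.
    return int(re.sub(r"[,.]", "", number))
```

With Selenium, the input text would typically come from an element's `.text` attribute. Which element holds the count depends on Instagram's current markup, so no selector is suggested here.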

For me, instagramcrawler.py doesn't work. It fails with an error at line 117: self.scroll_to_num_of_posts(number)

May 09 '18 01:05 JoaquinMontesinos

This works only if the number of photos you want to crawl is less than the number of photos on the current page; it cannot automatically scroll down for us. In other words, this code can only crawl a single page. Any ideas?

Aug 30 '18 04:08 vuongducdai
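
One possible direction for the scrolling limitation described above, offered as a rough sketch rather than a fix for this repository: keep collecting links and scrolling until the requested number is reached or the page stops yielding new ones. The name `scroll_until` and its parameters are hypothetical. The two callables are injected so the loop stays independent of any particular Selenium version or page selector.

```python
import time


def scroll_until(collect_links, scroll_down, target, max_stalls=3, pause=0.0):
    """Collect up to `target` unique links, scrolling between attempts.

    collect_links: callable returning the links currently present on the page.
    scroll_down:   callable that scrolls the page so more content can load.
    max_stalls:    give up after this many consecutive rounds with no new links.
    pause:         seconds to wait after each scroll for content to load.
    """
    seen = []          # unique links in first-seen order
    seen_set = set()   # fast membership checks
    stalls = 0
    while len(seen) < target and stalls < max_stalls:
        before = len(seen)
        for link in collect_links():
            if link not in seen_set:
                seen_set.add(link)
                seen.append(link)
        if len(seen) >= target:
            break
        # Count rounds that produced nothing new, so the loop cannot spin forever.
        stalls = stalls + 1 if len(seen) == before else 0
        scroll_down()
        time.sleep(pause)
    return seen[:target]
```

With Selenium, `scroll_down` could be something like `lambda: driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")`, and `collect_links` would read the `href` attributes of whatever elements currently hold the photo links. The stall counter is there because a hashtag may simply have fewer posts than requested, or the page may stop loading more.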

I have the same problem here. Did you guys find a solution?

Nov 14 '18 10:11 Cupido10
