InstagramCrawler

Error when scraping captions

Open joaanna opened this issue 8 years ago • 4 comments

Hey, so far I crawled followers smoothly, but I have 2 issues:

  1. I get this when I try to crawl the captions:

     python instagramcrawler.py -d data -q 'viralnova365' -c -n 10
     dir_prefix: data, query: viralnova365, crawl_type: photos, number: 10, caption: True
     posts: 1660, number: 10
     Scraping photo links...
     Number of photo_links: 25
     Scraping captions...
     Traceback (most recent call last):
       File "instagramcrawler.py", line 297, in <module>
         main()
       File "instagramcrawler.py", line 293, in main
         caption=args.caption)
       File "instagramcrawler.py", line 85, in crawl
         self.click_and_scrape_captions(number)
       File "instagramcrawler.py", line 161, in click_and_scrape_captions
         FIREFOX_FIRST_POST_PATH).click()
       File "/InstagramCrawler/crawl/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 313, in find_element_by_xpath
         return self.find_element(by=By.XPATH, value=xpath)
       File "InstagramCrawler/crawl/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 791, in find_element
         'value': value})['value']
       File "InstagramCrawler/crawl/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 256, in execute
         self.error_handler.check_response(response)
       File "InstagramCrawler/crawl/lib/python3.4/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
         raise exception_class(message, screen, stacktrace)
     selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: //a[contains(@class, '_8mlbc _vbtk2 _t5r8b')]

  2. Also, I would like to crawl all the images, but it never downloads the number specified by -n. Do you have any suggestions?

joaanna avatar Jul 03 '17 17:07 joaanna

Hi @joaanna, thank you for telling me! I'll look into this when I have time...

tzuhsial avatar Jul 04 '17 03:07 tzuhsial

@joaanna I think I fixed the path to the caption, which makes captions crawlable now. (I guess I'll have to do this every time Instagram updates.)

And about the number issue, I am still looking for a robust way to detect if new posts are loaded. Any help is appreciated!

tzuhsial avatar Jul 07 '17 12:07 tzuhsial
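
One possible way to detect whether new posts were loaded is to compare the number of post links before and after each scroll, and to stop once the count has stopped growing for a few attempts in a row. The sketch below is only an illustration of that idea, not the crawler's actual code: `scroll_until_loaded`, `count_posts`, `scroll_down`, `max_stalls` and `pause` are hypothetical names, and the Selenium calls in the trailing comment assume an existing `driver` and a locator of your own.

```python
import time


def scroll_until_loaded(count_posts, scroll_down, target, max_stalls=3, pause=1.0):
    """Scroll until `target` posts are visible or the page stops growing.

    count_posts -- callable returning how many post links are currently on the page
    scroll_down -- callable that scrolls the page (or clicks "load more")
    max_stalls  -- consecutive scrolls without new posts before giving up
    pause       -- seconds to wait after each scroll so new posts can render
    """
    seen = count_posts()
    stalls = 0
    while seen < target and stalls < max_stalls:
        scroll_down()
        time.sleep(pause)
        current = count_posts()
        if current > seen:
            # New posts appeared: record the new count and reset the stall counter.
            seen = current
            stalls = 0
        else:
            # Nothing new yet: the page may be slow, or there may be no more posts.
            stalls += 1
    return seen


# Assumed wiring with Selenium (POST_LINK_XPATH is a placeholder, not a known-good path):
#   count = lambda: len(driver.find_elements(By.XPATH, POST_LINK_XPATH))
#   scroll = lambda: driver.execute_script(
#       "window.scrollTo(0, document.body.scrollHeight);")
#   loaded = scroll_until_loaded(count, scroll, target=number)
```

The return value is the number of posts actually visible, so the caller can tell whether fewer than the requested `-n` were available and report that instead of failing silently.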

Hi. I have the same problem: an error with the value of FIREFOX_FIRST_POST_PATH. Any suggestions, please?

anfiallos avatar Aug 22 '17 23:08 anfiallos

Hi, I got this problem too: selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: //div[contains(@class, '_8mlbc _vbtk2 _t5r8b')]

anakmalank avatar Sep 29 '18 05:09 anakmalank
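
The NoSuchElementException reported in this thread comes from an XPath that matches generated class names, which stop matching whenever Instagram changes its markup. A hedged workaround is to try several candidate locators in order and use the first one that matches. In the sketch below, `find_first` and `CANDIDATE_XPATHS` are hypothetical names; the first two XPaths are the ones quoted in this thread, and the third (matching links whose `href` contains `/p/`) is an assumption about how post links look, not something confirmed by the repository.

```python
CANDIDATE_XPATHS = [
    "//a[contains(@class, '_8mlbc _vbtk2 _t5r8b')]",    # from the original report
    "//div[contains(@class, '_8mlbc _vbtk2 _t5r8b')]",  # from the later report
    "//a[contains(@href, '/p/')]",                      # assumption: post links contain /p/
]


def find_first(find, xpaths, missing=(LookupError,)):
    """Return (xpath, element) for the first XPath that matches, else None.

    find    -- callable taking an XPath string and returning an element,
               raising one of `missing` when nothing matches
    missing -- exception types that mean "no such element"
    """
    for xpath in xpaths:
        try:
            return xpath, find(xpath)
        except missing:
            # This locator no longer matches the page; try the next one.
            continue
    return None


# Assumed wiring with Selenium (requires an existing `driver`):
#   from selenium.common.exceptions import NoSuchElementException
#   from selenium.webdriver.common.by import By
#   match = find_first(lambda xp: driver.find_element(By.XPATH, xp),
#                      CANDIDATE_XPATHS,
#                      missing=(NoSuchElementException,))
#   if match is not None:
#       match[1].click()
```

Returning `None` instead of raising lets the caller skip caption scraping with a clear message when none of the known paths match the current page.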