scrapy-selenium

KeyError: 'driver' or 'screenshot'

Open afperezp opened this issue 4 years ago • 9 comments

Hey, I just started scraping with scrapy-selenium, but I always get the same problem (screenshot: Bildschirmfoto 2020-09-14 um 11 11 24). My mentor suggested adding the WebDriver to the PATH, but that did not fix it. Any suggestions?

afperezp avatar Sep 14 '20 09:09 afperezp

What's your code ?

tristanlatr avatar Sep 14 '20 13:09 tristanlatr

Same thing. Can't access response.meta['screenshot'] or 'driver' in my middleware

uselessvevo avatar Dec 13 '20 14:12 uselessvevo

I'm facing the same issue. As requested by @tristanlatr above, here's my code -

`import scrapy
from scrapy_selenium import SeleniumRequest
from selenium import webdriver
from shutil import which
import requests


class LinkedinCrawlerSpider(scrapy.Spider):
    name = 'linkedin_crawler'
    allowed_domains = ['www.linkedin.com']

    def start_requests(self):
        # Route the request through the Selenium downloader middleware.
        yield SeleniumRequest(
            url='https://www.linkedin.com/sales/login',
            wait_time=5,
            callback=self.login
        )

    def login(self, response):
        # This lookup raises KeyError: 'driver'.
        print(response.request.meta['driver'].title)`

Screenshot of the error -

Screenshot 2020-12-22 at 1 12 27 PM

raghavsehgal1 avatar Dec 22 '20 07:12 raghavsehgal1

@raghavsehgal1 Did you activate the downloader middleware in the settings.py ?

tristanlatr avatar Dec 22 '20 14:12 tristanlatr

@tristanlatr Yes, I did. scrapy-selenium related Code snippet from settings.py -

    from shutil import which

    DOWNLOADER_MIDDLEWARES = {
        'scrapy_selenium.SeleniumMiddleware': 800
    }

    SELENIUM_DRIVER_NAME = 'chrome'
    SELENIUM_DRIVER_EXECUTABLE_PATH = which('chromedriver')
    SELENIUM_DRIVER_ARGUMENTS = ['--headless']  # '--headless' if using chrome instead of firefox

I have been trying multiple approaches but I have been unable to figure out why the issue persists. I tried to print the meta object without any keys. This is the object that was printed -

{'download_timeout': 180.0, 'download_slot': 'golden.com', 'download_latency': 0.34168577194213867}

raghavsehgal1 avatar Dec 22 '20 16:12 raghavsehgal1
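
The meta dict printed above contains only Scrapy's own download keys, which suggests the request never passed through SeleniumMiddleware (for example, because the middleware was disabled at startup). Below is a minimal sketch of a guard that reports this explicitly instead of failing with a bare KeyError. `require_driver` is a hypothetical helper name used for illustration, not part of scrapy-selenium.

```python
def require_driver(meta):
    """Return the Selenium driver stored in a request's meta dict.

    Hypothetical helper for illustration; not a scrapy-selenium API.
    """
    driver = meta.get('driver')
    if driver is None:
        raise RuntimeError(
            "no 'driver' in request.meta (keys present: %s); "
            "check the startup log to see whether SeleniumMiddleware was enabled"
            % sorted(meta)
        )
    return driver


# The meta dict reported in this thread: only Scrapy's own keys are present.
reported_meta = {
    'download_timeout': 180.0,
    'download_slot': 'golden.com',
    'download_latency': 0.34168577194213867,
}

try:
    require_driver(reported_meta)
except RuntimeError as exc:
    message = str(exc)
    print(message)
```

In a spider callback this would be called as `driver = require_driver(response.request.meta)`, so a disabled middleware shows up with a message that lists the keys that were actually set.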

I had a similar issue, and it was resolved by making sure the driver arguments are set to ['--headless'], i.e. with two dashes for Chrome.

blackwhiteman avatar Dec 29 '20 00:12 blackwhiteman

I had the same issue. It turned out that I didn't have the driver installed. (The log says that the driver name and driver path are not set, so the Selenium middleware is disabled.)

The issue was solved after installing chromedriver or geckodriver.

Hope this helps.

Lpaydat avatar Jan 27 '21 14:01 Lpaydat
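
The condition described above can be checked before starting a crawl. This is a rough sketch that assumes the settings shown earlier in the thread, where the executable path comes from `shutil.which`. `which` returns `None` when the binary is not on the PATH, which would leave the driver path unset. `find_driver_problems` is a hypothetical helper name, and the check only mirrors the log message quoted in this thread; it does not call into scrapy-selenium.

```python
from shutil import which


def find_driver_problems(settings):
    """List reasons the Selenium settings look incomplete.

    Hypothetical pre-flight check. `settings` is a plain dict that uses
    the SELENIUM_* names from the settings.py shown in this thread.
    """
    problems = []
    if not settings.get('SELENIUM_DRIVER_NAME'):
        problems.append('SELENIUM_DRIVER_NAME is not set')
    if not settings.get('SELENIUM_DRIVER_EXECUTABLE_PATH'):
        problems.append(
            'SELENIUM_DRIVER_EXECUTABLE_PATH is not set (driver binary not found?)'
        )
    return problems


settings = {
    'SELENIUM_DRIVER_NAME': 'chrome',
    # None if chromedriver is not installed or not on the PATH.
    'SELENIUM_DRIVER_EXECUTABLE_PATH': which('chromedriver'),
}
for problem in find_driver_problems(settings):
    print(problem)
```

If this prints nothing, both values are set, and the cause of the missing `'driver'` key is more likely elsewhere.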

I had the same issue. I get this error when debugging, but the code seems to be working fine.

tomato-ga avatar Mar 16 '22 07:03 tomato-ga

> ['--headless']

Hi @blackwhiteman, it's still the same issue. I changed my settings to the following:

    SELENIUM_DRIVER_NAME = "chrome"
    SELENIUM_DRIVER_EXECUTABLE_PATH = '/drivers/chromedriver'
    SELENIUM_DRIVER_ARGUMENTS = ['--headless']

mahmoudodoo avatar Mar 22 '22 17:03 mahmoudodoo