scrapy-playwright inspect_response not working in spider with scrapy

inspect_response not working in spider with scrapy_playwright

Open eckseller opened this issue 1 year ago • 1 comments

Hi all,

I have a simple example below which should work but doesn't.

    import scrapy
    from scrapy.shell import inspect_response


    class AwesomeSpider(scrapy.Spider):
        name = "test-playwright"

        def start_requests(self):
            # Route this request through scrapy-playwright.
            yield scrapy.Request("https://quotes.toscrape.com/", meta={
                "playwright": True,
            })

        def parse(self, response):
            # Open an interactive shell to inspect the response.
            inspect_response(response, self)

            for quote in response.css('div.quote'):
                yield {
                    'text': quote.css('span.text::text').get(),
                    'author': quote.css('small.author::text').get(),
                    'tags': quote.css('div.tags a.tag::text').getall(),
                }

It gives the following errors in the shell:

    ..........
    [s] Useful shortcuts:
    [s]   shelp()           Shell help (print this help)
    [s]   view(response)    View response in a browser
    2022-08-15 11:17:16 [scrapy.core.scraper] ERROR: Spider error processing <GET https://quotes.toscrape.com/> (referer: https://fonts.googleapis.com/)
    Traceback (most recent call last):
      File "/python/scrapy-projects/rightmove/venv/lib/python3.9/site-packages/twisted/internet/defer.py", line 1030, in adapt
        extracted = result.result()

    2022-08-15 11:17:16 [py.warnings] WARNING: /python/scrapy-projects/rightmove/venv/lib/python3.9/site-packages/IPython/core/displayhook.py:311: RuntimeWarning: coroutine 'Application.run_async' was never awaited
      gc.collect()

However, everything works fine if I start the Scrapy shell directly from the command line, like so: scrapy shell 'https://quotes.toscrape.com'

Any ideas? I'm stumped, I think it's something to do with asyncio. Thanks,

(edited for syntax highlighting)

Aug 15 '22 09:08 eckseller

The traceback is not exactly the same, but the overall situation looks very similar to https://github.com/scrapy/scrapy/issues/5447. Your post mentions IPython, so I'd recommend trying the regular interpreter instead by setting the SCRAPY_PYTHON_SHELL=python environment variable.
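
As a minimal sketch of that suggestion, the variable can also be set from Python, for example at the top of the spider module. This assumes Scrapy reads SCRAPY_PYTHON_SHELL at the moment the shell is started, so the assignment has to run before inspect_response() is called; where exactly to place it in your project is up to you.

```python
import os

# Ask Scrapy's shell to use the standard Python REPL rather than IPython.
# Assumption: the variable is read when the shell starts, so this line
# must execute before inspect_response() is called in the spider.
os.environ["SCRAPY_PYTHON_SHELL"] = "python"
```

Setting it for a single run on the command line should be equivalent, e.g. SCRAPY_PYTHON_SHELL=python scrapy crawl test-playwright. Whether this resolves the error in this particular case is untested here.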

Aug 15 '22 12:08 elacuesta
