scrapy-inline-requests

Yielding requests with callbacks

Open pawelmhm opened this issue 7 years ago • 0 comments

Since version 3.0 there are restrictions when a Request yielded from the generator has a callback or errback set. Why is that? What is the reason for this change?

I have some spiders like this:

# -*- coding: utf-8 -*-
import json
import scrapy
from inline_requests import inline_requests


class ToScrapeCSSSpider(scrapy.Spider):
    name = "toscrape-css"
    start_urls = [
        'http://quotes.toscrape.com/',
    ]

    @inline_requests
    def parse(self, response):
        # No callback here: the response is sent back into this generator.
        some_data = yield scrapy.Request('http://httpbin.org/headers')
        print(json.loads(some_data.body))
        next_page_url = response.css("li.next > a::attr(href)").extract_first()
        if next_page_url is not None:
            # Explicit callback: this is the request that triggers the warning.
            yield scrapy.Request(response.urljoin(next_page_url), callback=self.parse_page)

    def parse_page(self, response):
        print(response.url)
        print("hello")

This still works fine, but it prints warnings:

2017-12-28 12:33:04 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://httpbin.org/headers> (referer: http://quotes.toscrape.com/)
{u'headers': {u'Accept-Language': u'en', u'Accept-Encoding': u'gzip,deflate,br', u'Host': u'httpbin.org', u'Accept': u'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', u'User-Agent': u'Scrapy/1.4.0 (+http://scrapy.org)', u'Connection': u'close', u'Referer': u'http://quotes.toscrape.com/'}}
2017-12-28 12:33:05 [py.warnings] WARNING: /home/pawel/.virtualenvs/scrapy/local/lib/python2.7/site-packages/inline_requests/generator.py:59: UserWarning: Got a request with callback set, bypassing the generator wrapper. Generator may not be able to resume. <GET http://quotes.toscrape.com/page/2/>
  "be able to resume. %s" % ret)

2017-12-28 12:33:05 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://quotes.toscrape.com/page/2/> (referer: http://httpbin.org/headers)
http://quotes.toscrape.com/page/2/
hello

What can happen if the generator is not able to resume? Is there some way to preserve the behavior from before 3.0 and skip the warnings?
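
For the "skip the warnings" part, one stopgap I can think of is a filter from Python's standard `warnings` module. This is only a sketch. It assumes the message is raised as a `UserWarning` through `warnings.warn`, as the `[py.warnings]` log line above suggests. The helper name `silence_callback_warning` is made up for illustration and is not part of inline_requests. It hides the message only and does not restore the old behavior.

```python
import warnings

# Prefix of the message shown in the log above. filterwarnings() treats it as
# a regular expression matched against the start of the warning text.
CALLBACK_WARNING_PREFIX = r"Got a request with callback set"


def silence_callback_warning():
    """Ignore only the 'callback set' UserWarning; other warnings still show.

    Hypothetical helper, not part of inline_requests.
    """
    warnings.filterwarnings(
        "ignore",
        message=CALLBACK_WARNING_PREFIX,
        category=UserWarning,
    )
```

Calling `silence_callback_warning()` once before the crawl starts (for example at the top of the spider module) should be enough, since the filter is process-wide.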

@rmax

pawelmhm, Dec 28 '17 11:12