scrapy_doc_chs

A small error in the "Following links" section

Open pchjia opened this issue 9 years ago • 0 comments

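The documentation calls response.urljoin with two arguments. The method's first parameter, however, is self, the Response instance, which Python binds automatically and which cannot be passed explicitly from outside the class. After checking the source, the correct call here is response.urljoin(href.extract()).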

The relevant passages are quoted below:

def parse(self, response):
    for href in response.css("ul.directory.dir-col > li > a::attr('href')"):
        url = response.urljoin(response.url, href.extract())
        yield scrapy.Request(url, callback=self.parse_dir_contents)

class Response(object_ref):
    def urljoin(self, url):
        """Join this Response's url with a possible relative url to form an
        absolute interpretation of the latter."""
        return urljoin(self.url, url)
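
To make the difference concrete, here is a minimal sketch that runs without Scrapy installed. FakeResponse is a hypothetical stand-in that copies only the urljoin body quoted above, and the example.com URL and href are made up for illustration. It suggests that the one-argument call resolves the link, while the two-argument call from the documentation raises a TypeError.

```python
from urllib.parse import urljoin


class FakeResponse:
    """Hypothetical stand-in for scrapy's Response; only urljoin is mirrored."""

    def __init__(self, url):
        self.url = url

    def urljoin(self, url):
        # Same body as the quoted Response.urljoin: self.url is the base.
        return urljoin(self.url, url)


# Assumed page URL and relative href, chosen only for illustration.
response = FakeResponse("http://www.example.com/Computers/Programming/")
href = "Languages/Python/Books/"

# Corrected call: pass only the (possibly relative) href.
fixed = response.urljoin(href)

# Call as written in the documentation: one positional argument too many.
try:
    response.urljoin(response.url, href)
    error = None
except TypeError as exc:
    error = exc

print(fixed)
print(type(error).__name__)
```

In a real spider the same fix is the single line url = response.urljoin(href.extract()) inside parse.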

pchjia, Jan 13 '16 04:01