scrapy-splash
scrapy-splash doesn't render the page, but splash does
scrapy-splash does not render the page fully, even though Splash on its own does. I want to render the contents of a table with id "grid". I can see that it is rendered correctly in the Splash browser UI at: http://localhost:8050/info?wait=0.5&images=1&expand=1&timeout=90.0&url=https%3A%2F%2Fwww.homeinspector.org%2FHomeInspectors%2FFind%2FResults%3FMetroAreaID%3D5%26NeighborhoodID%3D%26SearchType%3DMetroArea&lua_source=function+main%28splash%2C+args%29%0D%0A++assert%28splash%3Ago%28args.url%29%29%0D%0A++assert%28splash%3Await%280.5%29%29%0D%0A++return+splash%3Ahtml%28%29%0D%0Aend
However, it is not rendered at all in my Scrapy spider: the grid comes back empty.
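For comparison, the Splash UI URL above can be decoded to see exactly which target URL and Lua script the working render used. This is only a small sketch using the Python 3 standard library; the variable names are illustrative and not part of the spider.

```python
from urllib.parse import urlsplit, parse_qs

# The Splash UI URL quoted above, split across lines for readability.
splash_ui_url = (
    "http://localhost:8050/info?wait=0.5&images=1&expand=1&timeout=90.0"
    "&url=https%3A%2F%2Fwww.homeinspector.org%2FHomeInspectors%2FFind%2FResults"
    "%3FMetroAreaID%3D5%26NeighborhoodID%3D%26SearchType%3DMetroArea"
    "&lua_source=function+main%28splash%2C+args%29%0D%0A"
    "++assert%28splash%3Ago%28args.url%29%29%0D%0A"
    "++assert%28splash%3Await%280.5%29%29%0D%0A"
    "++return+splash%3Ahtml%28%29%0D%0Aend"
)

# parse_qs percent-decodes each value and returns a list per key.
params = parse_qs(urlsplit(splash_ui_url).query)
target_url = params["url"][0]
lua_source = params["lua_source"][0]

print(target_url)   # the page Splash was asked to render
print(lua_source)   # the Lua script that produced the working render
```

Printing both values makes it easy to compare them against the URL and script that the spider sends.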
I am talking about the parse_find_page function:
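from scrapy.spiders import CrawlSpider
from scrapy_splash import SplashRequest

# Lua script run by Splash's "execute" endpoint.
script = """
function main(splash, args)
  splash.private_mode_enabled = false
  splash.plugins_enabled = true
  splash.indexeddb_enabled = true
  splash.html5_media_enabled = true
  assert(splash:go(splash.args.url))
  -- Give the page's JavaScript time to fill the grid.
  assert(splash:wait(7))
  splash:set_viewport_full()
  return splash:html()
end
"""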
class Homeinspectors(CrawlSpider):
    name = 'instructors'

    def start_requests(self):
        # Render the search form through Splash so the dropdown can be read.
        return [SplashRequest('https://www.homeinspector.org/HomeInspectors/Find', self.parse_find_page,
                              args={
                                  'wait': 5,
                                  'http_method': 'GET',
                                  'timeout': 30
                              },
                              )]

    def parse_find_page(self, response):
        # Drop the first <option> value of the metro-area dropdown.
        values = response.xpath('//select[@id="ddlMetroArea"]/option/@value').extract()[1:]
        # Query parameters must be separated by "&".
        url = "https://www.homeinspector.org/HomeInspectors/Find/Results?MetroAreaID={}&NeighborhoodID=&SearchType=MetroArea"
        for v in values:
            yield SplashRequest(url.format(v), self.parse_search_results,
                                endpoint='execute',
                                args={'lua_source': script,
                                      'wait': 5,
                                      'http_method': 'GET'},
                                headers={'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"})

    def parse_search_results(self, response):
        # print(response.body)
        instructors_urls = response.xpath('//*[@id="grid"]').extract()
        print(instructors_urls)
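Since the results URL is assembled by hand in parse_find_page, a dropped separator is easy to miss. One way to rule that out is to let the standard library build the query string. The sketch below assumes Python 3; build_results_url is a hypothetical helper, not part of the spider or of scrapy-splash.

```python
from urllib.parse import urlencode

RESULTS_URL = "https://www.homeinspector.org/HomeInspectors/Find/Results"


def build_results_url(metro_area_id):
    """Hypothetical helper: return the search-results URL for one metro area."""
    # urlencode joins the pairs with "&" and escapes the values.
    query = urlencode({
        "MetroAreaID": metro_area_id,
        "NeighborhoodID": "",
        "SearchType": "MetroArea",
    })
    return RESULTS_URL + "?" + query


print(build_results_url(5))
```

For metro area 5 this produces the same target URL that is encoded in the working Splash UI link, so the two requests can be compared directly.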
Can you still reproduce this issue?
I'm having the same issue. What should I do?