Chained Requests
How can Crawlee be used when requests need to be sent in sequence, as in most ASP.NET applications? Scrapy handles these cases with inline requests, without a callback.
E.g., here a couple of sequenced requests need to be made to get the desired data.
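Presumably this refers to the scrapy-inline-requests package, where yielding a Request from a decorated callback hands the response back inline. A rough sketch of that pattern (the URL, selector, and spider name are invented for illustration):

from inline_requests import inline_requests
from scrapy import Request, Spider

class ExampleSpider(Spider):
    name = "example"
    start_urls = ["https://example.com/step1"]

    @inline_requests
    def parse(self, response):
        # yield a Request and receive its response right here, no callback
        step2 = yield Request(response.urljoin("/step2"))
        # continue sequentially with the second response
        yield {"title": step2.css("title::text").get()}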
Hello, and thank you for your interest in Crawlee! I assume (correct me if I'm wrong) that you need to perform additional HTTP requests on each page you visit. Would the send_request helper work for you?
import json

from bs4 import BeautifulSoup

@crawler.router.default_handler
async def handler(context: BeautifulSoupCrawlingContext) -> None:
    # fire an extra request from inside the handler ("/foo" is a placeholder)
    response = await context.send_request(url="/foo", method="post")
    # parse the body as JSON...
    response_json = json.loads(response.read())
    # ...or as HTML
    response_soup = BeautifulSoup(response.read(), "html.parser")
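For the sequenced-requests case from the question, a handler can chain several send_request calls, feeding data from one response into the next. A minimal sketch, assuming invented endpoints and JSON fields (the import path may differ between Crawlee versions):

import json

from crawlee.beautifulsoup_crawler import BeautifulSoupCrawler, BeautifulSoupCrawlingContext

crawler = BeautifulSoupCrawler()

@crawler.router.default_handler
async def handler(context: BeautifulSoupCrawlingContext) -> None:
    # first extra request: fetch a listing (hypothetical endpoint)
    listing = await context.send_request("https://example.com/api/items")
    items = json.loads(listing.read())
    # second request depends on data from the first response
    detail = await context.send_request(f"https://example.com/api/items/{items[0]['id']}")
    await context.push_data(json.loads(detail.read()))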
Thanks.
BasicCrawlingContext doesn't give access to the response, so the proposed solution will only work with an extended class like BeautifulSoupCrawlingContext.
I'm new to Crawlee, so correct me if I'm wrong.
You can actually do this with BasicCrawlingContext as well: it also provides the send_request helper, and the response is returned from that call. Don't confuse this with context.http_response, which is the response for the original request URL, fetched before the request_handler was invoked.
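A sketch of the same helper on BasicCrawlingContext (the import path, endpoint, and logging call are assumptions and may vary by Crawlee version):

from crawlee.basic_crawler import BasicCrawler, BasicCrawlingContext

crawler = BasicCrawler()

@crawler.router.default_handler
async def handler(context: BasicCrawlingContext) -> None:
    # send_request is available here too; its return value is the response
    # for this extra request, separate from the original page's response
    response = await context.send_request("https://example.com/api/status")
    context.log.info(response.read().decode())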
@Ehsan-U is your question answered? If so, please close the issue 🙂