scrape-it
HTTP errors are not treated as errors
If a URL gives an error response, such as a 404 or a 502, the Promise returned by the scrapeIt function does not reject. Instead it resolves, runs its .then chain, and passes a basically empty object as the data parameter. For example, this code prints "success" even though the URL returns a 404:
const scrapeIt = require("scrape-it");

scrapeIt("http://google.com/404.html", {}).then(({ data, response }) => {
    // Runs even for the 404 response; `data` is essentially empty.
    console.log("success");
}).catch(() => {
    // Never reached for HTTP error responses.
    console.log("error");
});
When the URL returns an HTTP status code outside the 2xx range, I feel the promise should automatically reject so client .catch handlers run instead of the .then chain.
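In the meantime, a caller can approximate this by checking the status on the resolved response and rejecting manually. Below is a minimal sketch; the scrapeItStrict wrapper is made up for illustration, and it assumes the resolved response object exposes the HTTP status as statusCode (Node http response) or status (axios response):

const scrapeIt = require("scrape-it");

// Hypothetical wrapper (not part of scrape-it) that rejects on non-2xx
// responses. Assumes `response` exposes the HTTP status as `statusCode`
// (Node http) or `status` (axios); adjust for your version.
function scrapeItStrict(url, opts) {
    return scrapeIt(url, opts).then(({ data, response }) => {
        const status = response.statusCode || response.status;
        if (status < 200 || status >= 300) {
            throw new Error("HTTP " + status + " for " + url);
        }
        return { data, response };
    });
}

scrapeItStrict("http://google.com/404.html", {}).then(() => {
    console.log("success");
}).catch((err) => {
    console.log("error:", err.message); // now runs for the 404
});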
Yes, this approach has downsides. I remember choosing it for simplicity, and I can see how it can break things. However, people should still be able to scrape error pages too (maybe they really want to do that).
We can add an option that enables the behaviour you expect by default. 🚀 Contributions are welcome!
Is anybody else working on this? I would like to try my hand at this issue.
@cukejianya Doesn't seem like anyone is, so go for it!
In 6.x.x HTTP errors will eventually throw, as long as axios does that.
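Once that lands, the example from the issue should behave as expected. A rough sketch, assuming the axios-backed version rejects on non-2xx responses and attaches the response to the error the way axios normally does:

const scrapeIt = require("scrape-it");

// With an axios-backed scrape-it that rejects on HTTP errors, the .catch
// branch runs for the 404 instead of the .then chain. The err.response
// shape here assumes axios-style errors.
scrapeIt("http://google.com/404.html", {}).then(({ data }) => {
    console.log("success");
}).catch((err) => {
    console.log("error", err.response ? err.response.status : err.message);
});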