actor-google-search-scraper

Return an error if HTML parsing of results is broken

Open jancurn opened this issue 4 years ago • 3 comments

When Google changed the layout, we returned invalid results to our users instead of failing (see https://github.com/apify/actor-google-search-scraper/issues/71). We should have some sanity checks in the code to ensure we're parsing the HTML correctly, and if they don't pass, fail and return an error to the user rather than an invalid result.

I've discussed this issue with potential customers and this was quite important for them.

jancurn, Jan 22 '21 14:01

This is hard to do correctly. What does "broken" mean? If paid results are missing but organic results are correct, is it broken? What about the reverse? And so on. Isn't it easier for the caller to do this rudimentary check on their side?

Just for context: Google usually changes only a few selectors so it breaks only parts of the parser.
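To make the "which part is broken?" question concrete, here is a minimal sketch that reports which sections of a parsed page came back empty and leaves the decision to the caller. The `empty_sections` helper and the field names `organicResults` and `paidResults` are assumptions for illustration, not the actor's confirmed output schema.

```python
def empty_sections(page, sections=("organicResults", "paidResults")):
    """Return the names of the listed sections that are missing or empty.

    `page` is assumed to be a dict produced by the parser; the section
    names are hypothetical and only serve this example.
    """
    return [name for name in sections if not page.get(name)]


# Example: organic results parsed fine, paid results came back empty.
page = {"organicResults": [{"title": "Example"}], "paidResults": []}
print(empty_sections(page))
```

An empty paid section is normal for many queries, so a report like this is a signal for the caller to interpret, not an error by itself.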

metalwarrior665, Jan 22 '21 20:01

What should happen is that we have a test that runs often and a maintainer who fixes breakages within hours (both are in progress).

But it would be good to hear it from the customers.

metalwarrior665, Jan 22 '21 20:01

Well, the test could check that either organic results are present, or there is an info message that no results were found. That would make it fairly bulletproof, what do you think?

Of course, having regular tests would be best...
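The check described above could look roughly like this. It is only a sketch: `ParsingError`, `check_parsed_page`, and the `noResultsMessage` field are hypothetical names, and it assumes the parser already extracts Google's "no results" notice into the page object.

```python
class ParsingError(Exception):
    """Raised when a parsed results page fails the sanity check."""


def check_parsed_page(page):
    """Return the page unchanged if it looks sane, otherwise raise.

    A page passes when it has at least one organic result, or when the
    parser found a "no results" notice (hypothetical `noResultsMessage`).
    """
    has_organic = bool(page.get("organicResults"))
    has_no_results_notice = bool(page.get("noResultsMessage"))
    if has_organic or has_no_results_notice:
        return page
    raise ParsingError(
        "No organic results and no 'no results' notice found; "
        "the page layout may have changed."
    )


# Usage: fail the run instead of storing a page that was probably mis-parsed.
try:
    check_parsed_page({"organicResults": [], "paidResults": []})
except ParsingError as err:
    print(f"Parsing failed: {err}")
```

This deliberately ignores paid results and other optional sections, so a change that breaks only those selectors would still pass.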

jancurn, Jan 25 '21 23:01