
google_scraper times out if one search term returns no results

Open bokostmu opened this issue 3 years ago • 0 comments

Hi,

First, thanks for the great work done here. This really makes scraping a lot simpler.

The issue I have is that if a search term returns no results, a timeout is thrown, which 1. costs a lot of time and 2. burns a lot of credits in scrapeulous, especially when querying for multiple items.

POC:

Request: { "function": "https://raw.githubusercontent.com/NikolaiT/scrapeulous/master/google_scraper.js", "items": ["dasdagsgfdagdfghfsdfgdfaf"] }

Response:

[
  {
    "item": "dasdagsgfdagdfghfsdfgdfaf",
    "result": {
      "error_message": "TimeoutError: waiting for selector \"#center_col .g\" failed: timeout 30000ms exceeded",
      "error_trace": "TimeoutError: waiting for selector \"#center_col .g\" failed: timeout 30000ms exceeded\n at new WaitTask (/var/task/node_modules/puppeteer-core/lib/cjs/puppeteer/common/DOMWorld.js:394:34)\n at DOMWorld._waitForSelectorOrXPath (/var/task/node_modules/puppeteer-core/lib/cjs/puppeteer/common/DOMWorld.js:326:26)\n at DOMWorld.waitForSelector (/var/task/node_modules/puppeteer-core/lib/cjs/puppeteer/common/DOMWorld.js:309:21)\n at Frame.waitForSelector (/var/task/node_modules/puppeteer-core/lib/cjs/puppeteer/common/FrameManager.js:801:51)\n at Page.waitForSelector (/var/task/node_modules/puppeteer-core/lib/cjs/puppeteer/common/Page.js:1215:33)\n at GoogleScraper.wait_for_results (eval at run (/var/task/dist/crawler/src/handler.js:37:31), <anonymous>:338:21)\n at GoogleScraper.crawl (eval at run (/var/task/dist/crawler/src/handler.js:37:31), <anonymous>:43:18)\n at process._tickCallback (internal/process/next_tick.js:68:7)",
      "proxy": null
    }
  }
]
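One possible way around this, sketched here under assumptions rather than taken from google_scraper.js: instead of waiting only for the results selector, wait for either the results selector or a "no results" marker, and return as soon as one of them appears. The helper name `waitForResultsOrEmpty` is hypothetical, and the "no results" selector is left as a parameter because I don't know which element Google currently renders for an empty result page. The only Puppeteer call used is `page.waitForSelector(selector, { timeout })`, the same one that shows up in the trace. `Promise.any` needs Node 15 or newer, which may not match the runtime in the trace above.

```javascript
// Sketch only: waitForResultsOrEmpty is a hypothetical helper, not part of google_scraper.js.
// `page` is any object exposing Puppeteer's page.waitForSelector(selector, { timeout }).
// `noResultsSelector` is an assumption the caller must supply.
async function waitForResultsOrEmpty(page, resultsSelector, noResultsSelector, timeout = 30000) {
  // Tag each wait so the caller learns which selector matched first.
  const tagged = (selector, kind) =>
    page.waitForSelector(selector, { timeout }).then(() => kind);

  try {
    // Promise.any fulfils with the first wait that succeeds and
    // rejects only when both waits have failed (i.e. both timed out).
    return await Promise.any([
      tagged(resultsSelector, 'results'),
      tagged(noResultsSelector, 'empty'),
    ]);
  } catch (err) {
    // Neither selector showed up within the timeout.
    return 'timeout';
  }
}
```

With something like this in `wait_for_results`, an `'empty'` outcome could be turned into an empty result list for that item right away, so only a genuinely unexpected page would still run into the full 30000 ms wait.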

bokostmu, Aug 10 '20 09:08