
Distributed crawler powered by Headless Chrome

33 headless-chrome-crawler issues, sorted by recently updated

Bumped to the highest version that I was able to make work without changing anything else

**What is the current behavior?** Not sure how to add auth to a page via a proxy. Normally in puppeteer I would do: `await page.authenticate({ username: user, password: pass });`...
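Because headless-chrome-crawler serializes crawl options rather than exposing the underlying `page`, the usual route is to pass credentials as queue options. A minimal sketch, assuming the library's documented `username`/`password` options are forwarded to `page.authenticate()` under the hood (verify against your installed version; the helper name is mine):

```javascript
// Sketch: build options for crawler.queue(), assuming headless-chrome-crawler's
// `username` / `password` options end up in page.authenticate() (an assumption
// about the installed version, not confirmed by the issue above).
function withAuth(url, user, pass) {
  return { url, username: user, password: pass };
}

// Usage (launch/teardown omitted):
// await crawler.queue(withAuth('https://example.com/', user, pass));
```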

- Version: [email protected]
- Platform / OS version: Windows 10
- Node.js version: v10.15.0
- Link: http://f4b1.com/ (note: same error for other sites)

> f4b1.com
>
> Login - f4b1.com...

bug

**What is the current behavior?** The crawler is closed when the queue is empty. **What is the expected behavior?** Is there a possibility not to close the crawler but to...

feature
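One pattern for the request above is a keep-alive loop: wait for more work instead of closing when the queue drains, and close only on an explicit stop. A sketch, assuming the crawler exposes `queue`/`onIdle`/`close` as in the HCCrawler API (any object with those methods works, which also makes it testable without Chrome):

```javascript
// Sketch of a keep-alive loop: instead of letting the crawler close when
// the queue drains, poll for more URLs and close only on an explicit stop.
// The method names queue/onIdle/close are assumptions about the HCCrawler
// API; a real poller would also sleep between empty polls.
async function runUntilStopped(crawler, getMoreUrls, shouldStop) {
  while (!shouldStop()) {
    const urls = await getMoreUrls();
    if (urls.length > 0) {
      await crawler.queue(urls);
      await crawler.onIdle(); // resolves when the queue is empty again
    }
  }
  await crawler.close(); // close only once told to stop
}
```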

I have a list of 1.8k URLs, but `await crawler.queue(urls)` seems to get stuck randomly, without a timeout. `const fs = require('fs') const _ = require('lodash') const...`

bug
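A common workaround when a large queue stalls is to feed the list in batches and drain each batch before queuing the next. A sketch under the assumption that `crawler.queue()` accepts an array and `crawler.onIdle()` resolves when the queue empties (the batch size of 100 is arbitrary):

```javascript
// Sketch: queue a large URL list in batches instead of all 1.8k at once,
// draining each batch before the next so the queue never balloons.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

async function queueInBatches(crawler, urls, size = 100) {
  for (const batch of chunk(urls, size)) {
    await crawler.queue(batch);
    await crawler.onIdle(); // drain this batch before queuing the next
  }
}
```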

**What is the current behavior?**

```
(node:61647) UnhandledPromiseRejectionWarning: Error: Protocol error (Target.closeTarget): Target closed.
    at Promise (/Users/random/Documents/GitHub/test/node_modules/puppeteer/lib/Connection.js:86:56)
    at new Promise ()
    at Connection.send (/Users/random/Documents/GitHub/test/node_modules/puppeteer/lib/Connection.js:85:12)
    at Page.close (/Users/random/Documents/GitHub/test/node_modules/puppeteer/lib/Page.js:888:38)
    at Crawler.close (/Users/random/Documents/GitHub/test/node_modules/headless-chrome-crawler/lib/crawler.js:80:22)
    at...
```

bug

Currently the Content-Security-Policy header is disabled by default when crawling. It's possible to re-enable it via the crawler configuration `jQuery: false`. Is this the intended way to re-enable the CSP...

feature
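For reference, the configuration the issue describes is a launch option. A minimal sketch; note this relies on the side effect the issue reports (skipping jQuery injection avoids the CSP bypass), not on a dedicated CSP switch:

```javascript
// Sketch: launch configuration that skips jQuery injection which, per the
// issue above, leaves the page's Content-Security-Policy enforced.
const launchOptions = {
  jQuery: false, // no script injection, so CSP is not bypassed
  // evaluatePage / onSuccess etc. go here as usual
};
// const crawler = await HCCrawler.launch(launchOptions);
```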

**What is the current behavior?** Getting an unhandled exception:

```
{ Error: Navigation Timeout Exceeded: 30000ms exceeded
    at Promise.then (/.../node_modules/puppeteer/lib/NavigatorWatcher.js:73:21)
  options: { maxDepth: 1, priority: 0, delay: 0, retryCount: 1, retryDelay:...
```

feature
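One way to keep a navigation timeout from surfacing as an unhandled rejection is to register an error callback and loosen the timeout. A sketch; `timeout`, `retryCount` and `onError` are documented crawler options, but treat the exact names and defaults as assumptions about the installed version (the URL is a placeholder):

```javascript
// Sketch: contain "Navigation Timeout Exceeded" instead of letting it
// crash the process as an unhandled rejection.
const failures = [];
const crawlerOptions = {
  onError: (err) => failures.push(err.message), // record instead of crashing
};
const queueOptions = {
  url: 'https://example.com/', // placeholder URL
  timeout: 60000,   // allow slow pages 60s instead of the 30s default
  retryCount: 3,    // retry transient timeouts before giving up
};
```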

**What is the current behavior?** If I specify a domain such as "http://www.domainname.com/" but the server's preferred domain settings omit the "www.", then the crawling process stops. The...

bug
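A workaround sketch for the issue above: normalize hostnames so "www.domainname.com" and "domainname.com" count as the same site before deciding a redirect left the crawl scope. The helper is plain Node, not part of headless-chrome-crawler's API; one could feed both forms into an allowed-domains filter instead.

```javascript
// Sketch: treat "www.example.com" and "example.com" as the same site.
// Uses Node's global URL class; strips only a leading "www." label.
function sameSite(a, b) {
  const strip = (u) => new URL(u).hostname.replace(/^www\./, '');
  return strip(a) === strip(b);
}
```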