
Distributed crawler powered by Headless Chrome

33 headless-chrome-crawler issues, sorted by recently updated

Bumped to the highest version that I was able to make work without changing anything else

**What is the current behavior?** Not sure how to add auth to a page via a proxy. Normally in puppeteer I would do: `await page.authenticate({ username: user, password: pass });`...
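Because headless-chrome-crawler serializes crawl options rather than exposing the underlying `page`, the usual route is to pass credentials as queue options. A minimal sketch, assuming the library's documented `username`/`password` options are forwarded to `page.authenticate()` under the hood (verify against your installed version; the helper name is mine):

```javascript
// Sketch: build options for crawler.queue(), assuming headless-chrome-crawler's
// `username` / `password` options end up in page.authenticate() (an assumption
// about the installed version, not confirmed by the issue above).
function withAuth(url, user, pass) {
  return { url, username: user, password: pass };
}

// Usage (launch/teardown omitted):
// await crawler.queue(withAuth('https://example.com/', user, pass));
```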

- Version: [email protected]
- Platform / OS version: Windows 10
- Node.js version: v10.15.0
- Link: http://f4b1.com/ (note: same error for other sites)

> f4b1.com
>
> Login - f4b1.com...

bug

**What is the current behavior?** The crawler is closed when the queue is empty. **What is the expected behavior?** Is there a possibility not to close the crawler but to...

feature
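One pattern for the request above is a keep-alive loop: wait for more work instead of closing when the queue drains, and close only on an explicit stop. A sketch, assuming the crawler exposes `queue`/`onIdle`/`close` as in the HCCrawler API (any object with those methods works, which also makes it testable without Chrome):

```javascript
// Sketch of a keep-alive loop: instead of letting the crawler close when
// the queue drains, poll for more URLs and close only on an explicit stop.
// The method names queue/onIdle/close are assumptions about the HCCrawler
// API; a real poller would also sleep between empty polls.
async function runUntilStopped(crawler, getMoreUrls, shouldStop) {
  while (!shouldStop()) {
    const urls = await getMoreUrls();
    if (urls.length > 0) {
      await crawler.queue(urls);
      await crawler.onIdle(); // resolves when the queue is empty again
    }
  }
  await crawler.close(); // close only once told to stop
}
```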

I have a list of 1.8k URLs, but `await crawler.queue(urls)` seems to get stuck randomly, without a timeout. `const fs = require('fs') const _ = require('lodash') const...`

bug
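A common workaround when a large queue stalls is to feed the list in batches and drain each batch before queuing the next. A sketch under the assumption that `crawler.queue()` accepts an array and `crawler.onIdle()` resolves when the queue empties (the batch size of 100 is arbitrary):

```javascript
// Sketch: queue a large URL list in batches instead of all 1.8k at once,
// draining each batch before the next so the queue never balloons.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

async function queueInBatches(crawler, urls, size = 100) {
  for (const batch of chunk(urls, size)) {
    await crawler.queue(batch);
    await crawler.onIdle(); // drain this batch before queuing the next
  }
}
```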

**What is the current behavior?**

```
(node:61647) UnhandledPromiseRejectionWarning: Error: Protocol error (Target.closeTarget): Target closed.
    at Promise (/Users/random/Documents/GitHub/test/node_modules/puppeteer/lib/Connection.js:86:56)
    at new Promise ()
    at Connection.send (/Users/random/Documents/GitHub/test/node_modules/puppeteer/lib/Connection.js:85:12)
    at Page.close (/Users/random/Documents/GitHub/test/node_modules/puppeteer/lib/Page.js:888:38)
    at Crawler.close (/Users/random/Documents/GitHub/test/node_modules/headless-chrome-crawler/lib/crawler.js:80:22)
    at...
```

bug

Currently the Content-Security-Policy header is disabled by default when crawling. It's possible to re-enable it via the crawler configuration `jQuery: false`. Is this the intended way to re-enable the CSP...

feature
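For reference, the configuration the issue describes is a launch option. A minimal sketch; note this relies on the side effect the issue reports (skipping jQuery injection avoids the CSP bypass), not on a dedicated CSP switch:

```javascript
// Sketch: launch configuration that skips jQuery injection which, per the
// issue above, leaves the page's Content-Security-Policy enforced.
const launchOptions = {
  jQuery: false, // no script injection, so CSP is not bypassed
  // evaluatePage / onSuccess etc. go here as usual
};
// const crawler = await HCCrawler.launch(launchOptions);
```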

**What is the current behavior?** Getting an unhandled exception:

```
{ Error: Navigation Timeout Exceeded: 30000ms exceeded
    at Promise.then (/.../node_modules/puppeteer/lib/NavigatorWatcher.js:73:21)
  options: { maxDepth: 1, priority: 0, delay: 0, retryCount: 1, retryDelay:...
```

feature
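One way to keep a navigation timeout from surfacing as an unhandled rejection is to register an error callback and loosen the timeout. A sketch; `timeout`, `retryCount` and `onError` are documented crawler options, but treat the exact names and defaults as assumptions about the installed version (the URL is a placeholder):

```javascript
// Sketch: contain "Navigation Timeout Exceeded" instead of letting it
// crash the process as an unhandled rejection.
const failures = [];
const crawlerOptions = {
  onError: (err) => failures.push(err.message), // record instead of crashing
};
const queueOptions = {
  url: 'https://example.com/', // placeholder URL
  timeout: 60000,   // allow slow pages 60s instead of the 30s default
  retryCount: 3,    // retry transient timeouts before giving up
};
```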

**What is the current behavior?** If I specify a domain such as "http://www.domainname.com/" but the server's preferred domain settings omit the "www.", then the crawling process stops. The...

bug
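A workaround sketch for the issue above: normalize hostnames so "www.domainname.com" and "domainname.com" count as the same site before deciding a redirect left the crawl scope. The helper is plain Node, not part of headless-chrome-crawler's API; one could feed both forms into an allowed-domains filter instead.

```javascript
// Sketch: treat "www.example.com" and "example.com" as the same site.
// Uses Node's global URL class; strips only a leading "www." label.
function sameSite(a, b) {
  const strip = (u) => new URL(u).hostname.replace(/^www\./, '');
  return strip(a) === strip(b);
}
```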