browsertrix-crawler

Stalled after puppeteer error

Open rgaudin opened this issue 3 years ago • 3 comments

Pretty sure this has happened in the past: puppeteer raises an error and the process then just hangs forever. I'm not sure what the optimal behavior would be, but even just crashing/exiting would be better for my use case.

Page Load Failed: https://www.almaany.com/ar/dict/ar-ar/%D8%A3%D9%86%D8%B5%D8%A8%D8%A9/, Reason: Error: Timeout hit: 180000
Error: Protocol error (Runtime.callFunctionOn): Session closed. Most likely the page has been closed.
    at CDPSession.send (/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/Connection.js:285:35)
    at ExecutionContext._ExecutionContext_evaluate (/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/ExecutionContext.js:210:46)
    at ExecutionContext.evaluate (/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/ExecutionContext.js:106:113)
    at IsolatedWorld.evaluate (/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/IsolatedWorld.js:174:24)
Page Load Failed: https://www.almaany.com/ar/dict/ar-ar/%D8%AA%D9%8E%D9%86%D9%8E%D8%A7%D8%B5%D9%8F%D8%A8/, Reason: Error: Timeout hit: 180000
Error: Protocol error (Runtime.callFunctionOn): Session closed. Most likely the page has been closed.
    at CDPSession.send (/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/Connection.js:285:35)
    at ExecutionContext._ExecutionContext_evaluate (/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/ExecutionContext.js:210:46)
    at ExecutionContext.evaluate (/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/ExecutionContext.js:106:113)
    at IsolatedWorld.evaluate (/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/IsolatedWorld.js:174:24)
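
To illustrate the "crash instead of hang" behavior asked for above, here is a minimal sketch in plain Node.js. It is not browsertrix-crawler's actual code: `withTimeout` and `runOrExit` are hypothetical helper names, and the sketch assumes the hang comes from an awaited promise that never settles after the session is closed.

```javascript
// Hypothetical helper (not part of browsertrix-crawler or puppeteer):
// rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label = "operation") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins; always clear the timer so it cannot
  // keep the event loop alive after the race is decided.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical wrapper: run one crawl step and, if it throws or hangs,
// log the error and exit instead of waiting forever.
// `onFatal` defaults to process.exit so a supervisor can restart the crawl.
async function runOrExit(step, ms, onFatal = (code) => process.exit(code)) {
  try {
    return await withTimeout(step(), ms, "page step");
  } catch (err) {
    console.error(`Fatal: ${err.message}`);
    onFatal(1);
  }
}
```

A caller would wrap each per-page operation, for example `await runOrExit(() => loadPage(url), 200000)`, where `loadPage` stands in for whatever function drives the page. Exiting with a non-zero code is only one possible policy; skipping the page and continuing would be another.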

rgaudin, Sep 19 '22 08:09

@rgaudin do you have any repro steps? How long did it take for it to happen? Which site? I assume it would take a while, so probably hard to reproduce.

ikreymer, Oct 11 '22 23:10

I have been able to repro the 'Page Load Failed' error, but the crawler does exit after that in my test. Hopefully I will be able to find the cause of the hang eventually.

ikreymer, Oct 12 '22 00:10

url=https://www.almaany.com/
include=https://www.almaany.com/ar/dict/ar-ar/.*
sizeLimit=4294967296
timeLimit=7200

Failing URL is https://www.almaany.com/ar/dict/ar-ar/أنصبة

I don't know how much time it took to get there.

== Start:     2022-09-17 22:05:29.514
== Now:       2022-09-19 08:47:42.655 (running for 1.4 days)
== Progress:  1420 / 10137 (14.01%), errors: 670 (47.18%)
== Remaining: 8.9 days (@ 0.01 pages/second)
== Sys. load: 64.3% CPU / 22.7% memory
== Workers:   1
   #0 WORK https://www.almaany.com/ar/dict/ar-ar/%D8%AA%D9%8E%D9%86%D9%8E%D8%B

rgaudin, Oct 12 '22 08:10