browsertrix-crawler
Crawl error: missing context with id
I've seen this a couple of times with v1.1.1 and thought it might be worth noting down. Unfortunately, it doesn't seem to be easily reproducible:
Error: INTERNAL ERROR: missing context with id = 14
at assert (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/util/assert.js:15:15)
at FrameManager.executionContextById (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/FrameManager.js:199:9)
at #onBindingCalled (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/Page.js:640:44)
at file:///app/node_modules/puppeteer-core/lib/esm/third_party/mitt/mitt.js:36:7
at Array.map (<anonymous>)
at Object.emit (file:///app/node_modules/puppeteer-core/lib/esm/third_party/mitt/mitt.js:35:20)
at CdpCDPSession.emit (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/common/EventEmitter.js:77:23)
at CdpCDPSession._onMessage (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/CDPSession.js:79:18)
at Connection.onMessage (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/Connection.js:138:25)
at Immediate.<anonymous> (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/node/NodeWebSocketTransport.js:37:36)
Node.js v20.11.1
In case it's useful, here is the command I ran, with the original site being crawled replaced by a placeholder URL:
docker compose run --build -p 9037:9037 crawler crawl --url http://www.example.com --scopeType prefix --generateWACZ --screencastPort 9037 --collection buffon --scopeExcludeRx 'search=.=search=' --scopeExcludeRx '.*fig=.*fig=.*' --workers 4 --pageLoadTimeout 30 --text to-warc --screenshot view