serverless-chrome
Example of how to use Puppeteer
Try this one: https://codeburst.io/a-guide-to-automating-scraping-the-web-with-javascript-chrome-puppeteer-node-js-b18efb9e9921
https://github.com/balmbees/suspicious-serverless/blob/master/src/services/content_dispatcher.ts
Here is our service code which uses serverless-chrome with puppeteer :)
Thank you very much @mooyoul, your code works great! I managed to copy it into my Puppeteer-powered project, and now I can deploy it on Lambda!
The only thing is that I have to remember every time to set the env variable to stop Puppeteer from downloading Chrome during the Serverless packaging process. I don't know why, but while "packing external modules", Puppeteer retries the download every time, making the final package too big to be deployed on Lambda. Maybe it's because serverless-webpack re-runs npm install for everything? So, I'm using:
PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1 serverless deploy
Do you have any insight into this, or did you fix the issue differently?
@lorenzos
I'm not sure about your specific case, but creating a .npmrc file in the project root directory with the line
puppeteer_skip_chromium_download=1
works for me.
@jakub300 I tried both puppeteer_skip_chromium_download=1 and PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1 in both .npmrc and .serverlessrc, but slss commands still trigger Puppeteer downloading Chrome :man_shrugging:
Hi @lorenzos, Glad to hear that my code was helpful :)
As you know, the serverless-webpack plugin installs external dependencies.
In my case, I specified @serverless-chrome/lambda as an external module (and here), because the @serverless-chrome/lambda package ships an additional Chrome binary. With that external definition, the built-in Chrome binary is packaged alongside the script bundled by webpack, and the other dependencies are packaged into a single bundle file as well, without any npm install attempts.
Anyway, I haven't experienced your issue (serverless-webpack trying to install the puppeteer package during bundling) with my configuration.
Could you refer to my setup and try again?
@mooyoul My serverless.yml is identical, but my webpack.config.js is quite different: I have externals: [require("webpack-node-externals")()] instead, as shown here. I'm not familiar with Webpack, so I'm not sure what it does or whether it's equivalent.
Anyway, during Serverless packaging, built-in Chrome is downloaded and bundled by serverless-chrome. The problem is that the same is done by Puppeteer too...
$ slss package ### More than 100MB, will fail to be uploaded
$ PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1 slss package ### About ~40 MB, Puppeteer will work thanks to your code
So, I don't know where I should change my setup.
@lorenzos One more thing: for npm/yarn to load settings from .npmrc, you need to run the command through them. Could you try defining a script in package.json, similar to the example from the repository?
https://github.com/adieuadieu/serverless-chrome/blob/daa22a3978cb2be79a73dd283d278ef56c3053c0/examples/serverless-framework/aws/package.json#L21
@jakub300 Uh, that's right. But now that I think about it, I can just set the env directly in that deploy script. Thank you!
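For anyone who prefers a script file over an inline env prefix, here is a minimal sketch of the same idea in Node. The function names deployEnv and deploy are hypothetical (not from this thread or any library), and it assumes the serverless CLI is available on the PATH:

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: copy the given environment and add the flag
// that tells Puppeteer's install script to skip the Chromium download.
function deployEnv(baseEnv = process.env) {
  return { ...baseEnv, PUPPETEER_SKIP_CHROMIUM_DOWNLOAD: '1' };
}

// Hypothetical helper: run `serverless deploy` with that environment.
// Assumes the `serverless` binary is on the PATH.
function deploy() {
  return spawn('serverless', ['deploy'], {
    stdio: 'inherit',
    env: deployEnv(),
  });
}
```

Calling deploy() from a small script (or wiring that script into a package.json "scripts" entry) should behave like PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1 serverless deploy.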
In most cases, I highly recommend not using webpack-node-externals with webpack. By default, webpack-node-externals marks all of your project's dependencies as externals (see description). That's why, every time you build your project via webpack, serverless-webpack tries to install all of the dependencies.
@mooyoul Thanks. Puppeteer doesn't download Chrome when packaging if, in the Webpack config, I use only:
externals: [
  'aws-sdk',
  'es6-promise',
  '@serverless-chrome/lambda',
  'ws'
]
The reason for the additional two modules there is to avoid some warnings during building (i.e. when invoking local, packaging or deploying). The deployed function worked correctly anyway.
WARNING in ./node_modules/ws/lib/BufferUtil.js
Module not found: Error: Can't resolve 'bufferutil' in '/home/lorenzo/Dev/yamm-importer/node_modules/ws/lib'
@ ./node_modules/ws/lib/BufferUtil.js 35:21-42
@ ./node_modules/ws/lib/Receiver.js
@ ./node_modules/ws/index.js
@ ./node_modules/puppeteer/node6/Connection.js
@ ./node_modules/puppeteer/node6/Launcher.js
@ ./node_modules/puppeteer/node6/Puppeteer.js
@ ./lib/utils/puppeteer-launcher.js
@ ./lib/test.js
@ ./handler.js
WARNING in ./node_modules/ws/lib/Validation.js
Module not found: Error: Can't resolve 'utf-8-validate' in '/home/lorenzo/Dev/yamm-importer/node_modules/ws/lib'
@ ./node_modules/ws/lib/Validation.js 10:22-47
@ ./node_modules/ws/lib/Receiver.js
@ ./node_modules/ws/index.js
@ ./node_modules/puppeteer/node6/Connection.js
@ ./node_modules/puppeteer/node6/Launcher.js
@ ./node_modules/puppeteer/node6/Puppeteer.js
@ ./lib/utils/puppeteer-launcher.js
@ ./lib/test.js
@ ./handler.js
WARNING in ./node_modules/es6-promise/dist/es6-promise.js
Module not found: Error: Can't resolve 'vertx' in '/home/lorenzo/Dev/yamm-importer/node_modules/es6-promise/dist'
@ ./node_modules/es6-promise/dist/es6-promise.js 140:16-26
@ ./node_modules/es6-promisify/dist/promise.js
@ ./node_modules/es6-promisify/dist/promisify.js
@ ./node_modules/agent-base/index.js
@ ./node_modules/https-proxy-agent/index.js
@ ./node_modules/puppeteer/utils/ChromiumDownloader.js
@ ./node_modules/puppeteer/node6/Launcher.js
@ ./node_modules/puppeteer/node6/Puppeteer.js
@ ./lib/utils/puppeteer-launcher.js
@ ./lib/test.js
@ ./handler.js