chromeless
Use-Case: Serverless Chrome In Local Mode
Not sure if this is a case you guys have considered, but I don't see any reference to it in the open Issues or README. I work with a medium-sized (few million pages a day) crawler system that runs almost entirely in AWS Lambda. I would love to use Chromeless for its great APIs on top of Serverless Chrome, but I don't need all the fancy WebSocket stuff of proxy mode since everything is already running inside a Lambda function. Creating an RPC from one Lambda to another seems like overkill.
Is it possible to use Chromeless in "local" mode with serverless-chrome acting as the local Chrome? Are there downsides or limitations to this approach? Will this be supported moving forward?
Hi @neekolas yes of course! You're right that we haven't documented this very well, but Chromeless can be used within a Lambda function. The Chromeless Proxy uses the serverless-plugin-chrome package for Serverless. However, you can almost as easily go "vanilla" with the @serverless-chrome/lambda package.
For example (this is untested code I just cobbled together, but should convey the idea):
const launchChrome = require('@serverless-chrome/lambda')
// The chromeless package exposes the class as a named export
const { Chromeless } = require('chromeless')

module.exports.handler = function handler (event, context, callback) {
  launchChrome({
    flags: ['--window-size=1280x1696', '--hide-scrollbars'],
  })
    .then((chrome) => {
      // Chrome is now running on localhost:9222
      const chromeless = new Chromeless({
        launchChrome: false,
      })

      chromeless
        .goto('https://www.google.com')
        .type('chromeless', 'input[name="q"]')
        .press(13)
        .wait('#resultStats')
        .evaluate(() => {
          // this will be executed in headless chrome
          const links = [].map.call(document.querySelectorAll('.g h3 a'), a => ({
            title: a.innerText,
            href: a.href,
          }))
          return JSON.stringify(links)
        })
        .then((urls) => {
          // Return the chain so a failure during cleanup reaches .catch(callback)
          return chromeless
            .end() // Chromeless ends a session with end(), not close()
            .then(chrome.kill) // https://github.com/adieuadieu/serverless-chrome/issues/41#issuecomment-317989508
            .then(() => {
              callback(null, urls)
            })
        })
        .catch(callback)
    })
    .catch((error) => {
      // Chrome didn't launch correctly
      callback(error)
    })
}
I just ran into the same problem of running headless Chrome locally. The solution by @adieuadieu works. Instead of using @serverless-chrome/lambda, I used chrome-launcher, which is suggested by the Google Chrome team. Here is my code sample:
const chromeLauncher = require('chrome-launcher');
const Chromeless = require('chromeless').Chromeless;

chromeLauncher.launch({
  // port: 9222, // Uncomment to force a specific port of your choice.
  chromeFlags: [
    '--window-size=1200,800',
    '--disable-gpu',
    '--headless'
  ]
})
  .then(function (chrome) {
    console.log('Chrome debuggable on port: ' + chrome.port);
    const chromeless = new Chromeless({
      launchChrome: false,
      // chrome-launcher picks a random port unless `port` is set above,
      // so point Chromeless at the port Chrome actually got.
      cdp: { host: 'localhost', port: chrome.port, secure: false, closeTab: true }
    });
    var url = '[SOME URL FOR TESTING]';
    return chromeless.goto(url)
      .then(function () {
        // Test runner script
      });
  })
  .catch(console.error);
Interesting...Sure beats the wrangling and shoehorning I had to do to get nightmare running smoothly in our Docker Cluster and CircleCI. Thanks @adieuadieu @ryancat!
I'll do a bit of poking around on https://github.com/adieuadieu/serverless-chrome/issues/41. I'm more familiar than I'd like to be with the innards of the Lambda execution environment.
The best workaround for adieuadieu/serverless-chrome#41 I've come up with so far is: https://github.com/neekolas/chromeless-testbed/pull/1. Still reliably fails on the 5th invocation, but it at least gives you 4 invocations before you have to recreate. Will keep digging.
Persisting a Chrome instance for more than 5 invocations is still giving me trouble, but I was able to get Chromeless working in Alpine Linux without any special docker run flags. The image weighs in at a totally reasonable 350 MB uncompressed. https://github.com/neekolas/chromeless-testbed/blob/feature/docker/Dockerfile
I tried @adieuadieu's snippet, but when deploying with Serverless it tries to upload a service .zip that is 546 MB, which fails due to the size restriction on Lambda. Does anyone have a tutorial or anything else to overcome this issue?
Thanks!
@mexin Make sure you only .zip relevant dependencies. E.g. are you shipping a huge node_modules folder? (with devDependencies?)
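One way to check what is inflating the package is to measure each top-level folder under node_modules before zipping. The sketch below is only an illustration of that idea; `dirSize` and `largestPackages` are hypothetical helper names, and it uses nothing beyond Node's built-in fs and path modules. Scoped packages (folders starting with `@`) are reported as a single entry.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively sum the sizes of regular files under `dir`, in bytes.
// Symlinks are skipped, so linked packages are not followed.
function dirSize(dir) {
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) total += dirSize(full);
    else if (entry.isFile()) total += fs.statSync(full).size;
  }
  return total;
}

// List the immediate subdirectories of `root`, largest first.
function largestPackages(root, limit = 10) {
  return fs.readdirSync(root, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => ({
      name: entry.name,
      bytes: dirSize(path.join(root, entry.name)),
    }))
    .sort((a, b) => b.bytes - a.bytes)
    .slice(0, limit);
}

// Intended use, from the service directory:
//   for (const { name, bytes } of largestPackages('node_modules')) {
//     console.log((bytes / 1e6).toFixed(1) + ' MB  ' + name);
//   }
```

If the biggest entries turn out to be build or test tooling, installing only production dependencies before packaging is the usual next step.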
const chromeLauncher = require('chrome-launcher');
const Chromeless = require('chromeless').Chromeless;

chromeLauncher.launch({
  // port: 9222, // Uncomment to force a specific port of your choice.
  chromeFlags: [
    '--window-size=1200,800',
    '--disable-gpu',
    '--headless'
  ]
})
  .then(function (chrome) {
    console.log('Chrome debuggable on port: ' + chrome.port);
    var port = chrome.port;
    console.log(port);
    const chromeless = new Chromeless({
      //cdp:{host: 'localhost', port: port, secure: false, closeTab: true},
      launchChrome: false
    });
    var url = 'https://xyz.com';
    chromeless.goto(url)
      .then(function () {
        // Test runner script
        console.log("opened");
        chromeless.end();
      });
  });
After running the code above I get the error below. chrome-launcher started Chrome on port 45417 (a random port), and I created the Chromeless object with the option launchChrome: false. Why am I getting this error? Could anyone help me out? Thank you.
(node:16697) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Error: connect ECONNREFUSED 127.0.0.1:9222
(node:16697) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
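The ECONNREFUSED on 127.0.0.1:9222 suggests Chromeless is still dialing its default DevTools port while chrome-launcher picked a random one (45417 here). A minimal sketch of one way to line the two up, assuming the `cdp` option that appears commented out in the snippet above is what Chromeless reads for the connection. `cdpOptionsFor` is a hypothetical helper name, not part of either library:

```javascript
// Hypothetical helper: build Chromeless constructor options that point at
// an already-running Chrome, given the object chrome-launcher resolves with.
function cdpOptionsFor(chrome, host = 'localhost') {
  if (!chrome || !Number.isInteger(chrome.port)) {
    throw new TypeError('expected a launched Chrome with a numeric port');
  }
  return {
    launchChrome: false, // Chrome is already running, do not start another
    cdp: { host, port: chrome.port, secure: false, closeTab: true },
  };
}

// Intended use (needs chrome-launcher and chromeless installed):
//   chromeLauncher.launch({ chromeFlags: ['--headless'] })
//     .then((chrome) => new Chromeless(cdpOptionsFor(chrome)).goto(url))
```

Alternatively, uncommenting `port: 9222` in the launch call pins Chrome to the port Chromeless already expects, so no `cdp` override is needed.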
@adieuadieu jumping in here because I didn't want to open a new issue (since it's not really an issue), and seems related (a "what next" kind of question).
I followed the #setup without a problem, but tbh, I'm lost as to what the next step is.
I've set up Lambda functions in the past whereby I route requests through API Gateway to them, but now that it's installed, how do I actually use the service? Any docs you can point me to?
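On the "what next" question: as far as I can tell from the Chromeless README, the deployed proxy is not called through API Gateway by hand. Instead, a local Chromeless client is pointed at it through the `remote` option, using the endpoint URL and API key printed by the deploy step. A sketch under that assumption; `remoteOptionsFromEnv` is a hypothetical helper, and the environment variable names are assumed rather than confirmed anywhere in this thread:

```javascript
// Hypothetical helper: read the proxy endpoint and API key from the
// environment and shape them into Chromeless constructor options.
// The variable names below are an assumption, adjust them to your setup.
function remoteOptionsFromEnv(env = process.env) {
  const endpointUrl = env.CHROMELESS_ENDPOINT_URL;
  const apiKey = env.CHROMELESS_ENDPOINT_API_KEY;
  if (!endpointUrl || !apiKey) {
    throw new Error(
      'set CHROMELESS_ENDPOINT_URL and CHROMELESS_ENDPOINT_API_KEY first'
    );
  }
  return { remote: { endpointUrl, apiKey } };
}

// Intended use (needs chromeless installed and the proxy deployed):
//   const { Chromeless } = require('chromeless')
//   const chromeless = new Chromeless(remoteOptionsFromEnv())
//   chromeless.goto('https://example.com').screenshot().then(console.log)
```

With that in place, the script itself runs on your machine (or in another function) while Chrome runs in the Lambda behind the proxy.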