puppeteer-cluster icon indicating copy to clipboard operation
puppeteer-cluster copied to clipboard

Support for Lighthouse

Open kushalhalder opened this issue 4 years ago • 6 comments

Essentially, lighthouse needs the port at which the chrome instance is running to run a job on the tab. Is that possible here?

kushalhalder avatar Nov 09 '20 17:11 kushalhalder

You can get the port by getting the browser from the page and then the port from the browser as demonstrated here:

flags.port = (new URL(page.browser().wsEndpoint())).port;
const result = await lighthouse(url, flags, config);

The problem I'm finding is that when Lighthouse takes over the Puppeteer instance it kills the puppeteer-cluster task before it has completed. This results in Lighthouse not being able to find the browser.

EDIT I found my issue was the concurrency model. CONCURRENCY_CONTEXT is tied to an incognito page which gets closed during the handoff to Lighthouse. The other models seem to work fine.

jasonreiche avatar Nov 24 '20 01:11 jasonreiche

Yeah, the integration looked quiet cumbersome. So I just created two files:

  1. chrome.js for launching chrome and getting the port
const puppeteer = require('puppeteer');
// const lighthouse = require('lighthouse');
const {URL} = require('url');

(async() => {
	const url = 'https://www.chromestatus.com/features';

	// Use Puppeteer to launch headful Chrome and don't use its default 800x600 viewport.
	const browser = await puppeteer.launch({
		headless: false,
		defaultViewport: null,
	});

	// // Wait for Lighthouse to open url, then inject our stylesheet.
	browser.on('targetchanged', async target => {
			const page = await target.page();
	//   if (page && page.url() === url) {
	//     await page.addStyleTag({content: '* {color: red}'});
	//   }
		console.log(page.url());
	});

	console.log((new URL(browser.wsEndpoint())).port)
})();
  1. lighthouse.js for connecting to the chrome
const lighthouse = require('lighthouse');

var myArgs = process.argv.slice(2);
console.log('myArgs: ', myArgs);
(async() => {

	port = myArgs[0]
	url = myArgs[1]
	const {lhr} = await lighthouse(url, {
		port: port,
		output: 'json',
		logLevel: 'info',
	});
})();

kushalhalder avatar Nov 24 '20 06:11 kushalhalder

Hi @jasonreiche & @kushalhalder

Could you please describe in more detail how were you able to access wsEndpoint simultaneously running Puppeteer Cluster (not a single instance)? I tried both of your examples but I without any success.

kacbrz avatar Dec 12 '22 00:12 kacbrz

@kacbrz, I was not able to get multiple instances of Lighthouse to run concurrently using puppeteer-cluster. Lighthouse is not designed to run concurrently as that would skew performance results. Lighthouse conflicts with itself if multiple instances are running in the same Node process. You could look at using lighthouse-batch-parallel. When I need to run Lighthouse, I just use puppeteer-cluster to manage the pool and run a concurrency of 1. In my case, I'm just looking for accessibility scores so we are switching over to axe-core moving forward.

jasonreiche avatar Dec 13 '22 14:12 jasonreiche

@jasonreiche Thank you for replying! Last couple of days I was thinking about the performance in the exact same context you've provided. Despite that I still have to build a cluster of Lighthouses and somehow handle the performance issue. My small project just evolved and got a little bit more difficult...

kacbrz avatar Dec 13 '22 23:12 kacbrz

Hi @kacbrz, my use case was different. I did not have a cluster, and I also had only one machine. What I wanted was that I launch a chrome instance, which exposes a port through which I can push lighthouse events which run in different tabs as separate sessions. The above snippets simulate this.

kushalhalder avatar Dec 14 '22 06:12 kushalhalder