[REQUEST] ESM Support

Open giomarino opened this issue 2 years ago • 14 comments

Is ESM support available? Projects like SST require ESM to use all framework features (such as resource binding).

giomarino avatar Jan 03 '23 11:01 giomarino

Works out of the box, at least as an AWS Lambda, with a layer.

import puppeteer from "puppeteer-core";
import chromium from "@sparticuz/chromium";

export async function shot (event){
	const browser = await puppeteer.launch({
		args: chromium.args,
		defaultViewport: chromium.defaultViewport,
		executablePath: await chromium.executablePath(),
		headless: chromium.headless,
		ignoreHTTPSErrors: true,
	});

	const page = await browser.newPage();
	await page.goto("https://example.com");
	const pageTitle = await page.title();
	await browser.close();

	console.log(pageTitle, "Example Domain");
};

tomchiverton avatar Feb 17 '23 22:02 tomchiverton

I don't have any problems either, but the package is only set up for CJS, not ESM. I'll probably switch it to ESM-only eventually; I've heard there are problems with dual packaging.

Sparticuz avatar Feb 17 '23 23:02 Sparticuz

Yes I am also using it with imports. I have written an article and also made the code open source. Feel free to check it out. https://adapttive.com/blog/serverless-scraping#how-to-debug https://github.com/openwishlist/preview

milindsingh avatar Mar 19 '23 14:03 milindsingh

I've just hit difficulties using this with Astro on AWS Lambda (included as a standard dependency, rather than as a layer), as I kept getting ReferenceError: __dirname is not defined. This appears to be coming from chromium.executablePath, but I've been able to get around it by providing the path to the chromium binary:

const browser = await puppeteer.launch({
  ...
    executablePath: await chromium.executablePath(`${path.dirname(fileURLToPath(import.meta.url))}/node_modules/@sparticuz/chromium/bin`),
  ...
});
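
For reference, the path expression in that workaround can be pulled into a small helper. This is only a sketch: chromiumBinDir is a hypothetical name, and the node_modules/@sparticuz/chromium/bin layout is assumed from the snippet above rather than guaranteed for every deployment.

```javascript
import path from "node:path";
import { fileURLToPath } from "node:url";

// Hypothetical helper (not part of @sparticuz/chromium): resolve the bundled
// bin directory relative to the module whose URL is passed in.
function chromiumBinDir(moduleUrl) {
  // ES modules have no __dirname, so derive the directory from the module URL.
  const moduleDir = path.dirname(fileURLToPath(moduleUrl));
  return path.join(moduleDir, "node_modules", "@sparticuz", "chromium", "bin");
}

// Usage inside an ES module:
//   executablePath: await chromium.executablePath(chromiumBinDir(import.meta.url)),
```

Passing import.meta.url in from the caller keeps the helper independent of where it is defined, which matters once a bundler has merged everything into one file.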

bnjns avatar Apr 06 '23 20:04 bnjns

That variable isn't defined in ES module scope in Node; that's not specific to this library.

tomchiverton avatar Apr 07 '23 09:04 tomchiverton

Would you have any thoughts on how to fix the issue with __dirname? My code is exactly like the one above. However, I am using esbuild, and for some bizarre reason the output is only 1.2 MB, so I wonder whether it is even bundling successfully. Any advice would be appreciated.

I'm on "puppeteer-core": "20.1.0" and "@sparticuz/chromium": "113.0.1", using @sparticuz/chromium as a Lambda layer.

import puppeteer from 'puppeteer-core'
import chromium from '@sparticuz/chromium'
import { fileURLToPath } from 'url';
import path from 'path';

export const handler = async () => {
    chromium.setGraphicsMode = false

    try {
        const browser = await puppeteer.launch({
            args: chromium.args,
            defaultViewport: chromium.defaultViewport,
            executablePath: await chromium.executablePath(`${path.dirname(fileURLToPath(import.meta.url))}\\chromium\\bin`),
            headless: chromium.headless,
            ignoreHTTPSErrors: true,
        });

        const page = await browser.newPage();
        await page.setContent('<p>Hello World</p>');
        await browser.close();
    } catch (err) {
        console.log('Some error happened: ', err);
    }
}

Regardless of what I do, I get the following error:

{
    "errorType": "ReferenceError",
    "errorMessage": "__dirname is not defined in ES module scope",
    "stack": [
        "ReferenceError: __dirname is not defined in ES module scope",
        "    at file:///var/task/index.mjs:104:5294",
        "    at file:///var/task/index.mjs:3:588",
        "    at file:///var/task/index.mjs:127:4524",
        "    at ModuleJob.run (node:internal/modules/esm/module_job:194:25)"
    ]
}

rdzidziguri avatar May 06 '23 22:05 rdzidziguri

Quick update, the moment I switched it to produce commonjs output, it worked without any issues, so the problem is related to the ESM bundle. And I do not have to specify any custom path, as this is done through the Lambda layer.

rdzidziguri avatar May 07 '23 10:05 rdzidziguri

I'm using this with TypeScript and ESM output with a Lambda layer without any issues. Generally, the cases where I've seen issues involve some type of bundler.

Sparticuz avatar May 08 '23 13:05 Sparticuz

esbuild is the bundler AWS recommends, and CDK uses it as well. When I set the TS config compilerOptions module to CommonJS, it works without any issues; as soon as I set it to esnext, it fails with the error above.

rdzidziguri avatar May 08 '23 23:05 rdzidziguri

fwiw we are running into the same error. Thought we had it working at one point w/ ESM output, not sure if something changed or if it was actually broken the whole time.

dobrynin avatar May 22 '23 22:05 dobrynin

Okay, I got it working again. Not sure what the issue was, but I:

  1. Nuked the node_modules, dist, and cdk.out directories
  2. Tweaked our package.json (puppeteer-core went from ^19.8.1 to ^19.8.0 and @sparticuz/chromium went from ^112.0.2)
    • note that @sparticuz/chromium is just a dev dependency; we use a lambda layer in prod.
  3. Updated our chromium binary to v112.0.2

For some reason node_modules/@puppeteer/browsers/node_modules/yargs was removed from our package-lock.json. Our stack trace for the __dirname error listed yargs as the originating package for the error so it makes sense that the error is gone now, although I don't know why the package-lock.json changed.
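
If you want to check whether a nested copy like that is present in your own lockfile, a scan along these lines may help. It is a rough sketch: findLockEntries is a hypothetical helper, and it assumes a package-lock.json with the "packages" map (lockfileVersion 2 or 3).

```javascript
// Hypothetical diagnostic: list every install path in a parsed package-lock.json
// whose final segment is node_modules/<packageName>.
function findLockEntries(lock, packageName) {
  const suffix = `node_modules/${packageName}`;
  // Older lockfiles may have no "packages" map; treat that as no entries.
  return Object.keys(lock.packages ?? {}).filter(
    (installPath) => installPath === suffix || installPath.endsWith(`/${suffix}`)
  );
}

// Usage (in an ES module):
//   import { readFile } from "node:fs/promises";
//   const lock = JSON.parse(await readFile("package-lock.json", "utf8"));
//   console.log(findLockEntries(lock, "yargs"));
```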

dobrynin avatar May 22 '23 23:05 dobrynin

In case someone would like to use esbuild, the only way it worked for me in AWS was as follows.

import * as esbuild from "esbuild";

esbuild.build({
  entryPoints: ["out-rollup/index.js"],
  minify: true,
  bundle: true,
  platform: "node",
  format: "esm",
  target: "node18",
  external: [
    "puppeteer-core",
    "puppeteer-extra",
    "puppeteer-extra-plugin-stealth",
    "puppeteer-extra-plugin-adblocker",
    "@sparticuz/chromium",
  ],
  outdir: "dist",
  banner: {
    js: "import { createRequire } from 'module';import path from 'path';import { fileURLToPath } from 'url';const __filename = fileURLToPath(import.meta.url);const __dirname = path.dirname(__filename);const require = createRequire(import.meta.url);",
  },
  outExtension: {
    ".js": ".mjs",
  },
});

With the code in banner, the errors about __dirname, __filename and require in ESM are gone.
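
Written out one statement per line, the banner does the following. Nothing here is new; it is the same code as the string in the config, with comments added.

```javascript
// These statements are prepended to the top of the generated .mjs bundle.
import { createRequire } from 'module';
import path from 'path';
import { fileURLToPath } from 'url';

// Recreate the CommonJS globals that bundled dependencies may still reference.
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

// Give the ES module a working require() for dependencies that call it at runtime.
const require = createRequire(import.meta.url);
```

Because the shims are declared at the top level of the bundle, any CommonJS code that esbuild inlined below them resolves __dirname, __filename and require to these definitions.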

The downside I found is that adding @sparticuz/chromium to external grew my package from 16MB to 72MB compressed, and from 75MB to 131MB uncompressed.

dvalbuena1 avatar Jun 23 '23 18:06 dvalbuena1

That's great info. Thanks for posting! If you are on a platform that allows you to use layers, you can use the layer, then use @sparticuz/chromium-min. It's the same package, just without the binaries.

Sparticuz avatar Jun 23 '23 18:06 Sparticuz

@sparticuz/chromium-min

You are right, I hadn't realized that! Now the size stays at 20 MB compressed. The only extra step is to pass the full path to the binaries in the layer:

chromium.executablePath("/opt/nodejs/node_modules/@sparticuz/chromium/bin")
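
Put together with the launch options used earlier in the thread, that might look like the sketch below. buildLaunchOptions and LAYER_BIN_DIR are hypothetical names, and the layer path is the one quoted above, so treat it as an assumption about how your layer is laid out.

```javascript
// Layer path taken from the comment above; adjust if your layer is laid out differently.
const LAYER_BIN_DIR = "/opt/nodejs/node_modules/@sparticuz/chromium/bin";

// Hypothetical helper: build puppeteer launch options from a chromium object
// (@sparticuz/chromium-min in this setup) and an explicit binary directory.
async function buildLaunchOptions(chromium, binDir = LAYER_BIN_DIR) {
  return {
    args: chromium.args,
    defaultViewport: chromium.defaultViewport,
    // With the -min package the binaries are not bundled, so the directory is passed in.
    executablePath: await chromium.executablePath(binDir),
    headless: chromium.headless,
  };
}

// Usage (in an ES module):
//   import puppeteer from "puppeteer-core";
//   import chromium from "@sparticuz/chromium-min";
//   const browser = await puppeteer.launch(await buildLaunchOptions(chromium));
```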

dvalbuena1 avatar Jun 23 '23 19:06 dvalbuena1