Expose playwright `BrowserContext` options

Open alexkreidler opened this issue 2 years ago • 1 comments

Which package is the feature request for? If unsure which one to select, leave blank

None

Feature

Expose the Playwright browser.newContext options to Crawlee users so they can use more advanced Playwright features.

Motivation

Playwright has many features, such as recording and replaying network requests, using Chrome extensions, and emulating different devices, that are only available through the options passed to browser.newContext, which creates a BrowserContext.

A classic Playwright library example is this:

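import { chromium, devices } from 'playwright';

// Example launch options; any Playwright launch options would go here.
const browserOpts = { headless: true };
const browser = await chromium.launch(browserOpts);
// Context-level options (here a device descriptor) go to newContext, not launch.
const context = await browser.newContext(devices['iPhone 11']);
const page = await context.newPage();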

Currently AFAICT Crawlee allows users to modify browser.launch options through the Crawlee PlaywrightLaunchContext API, but does not allow users to modify the Playwright BrowserContext.

Ideal solution or implementation, and any additional constraints

One implementation could be to extend the Crawlee LaunchContext options for Playwright with an additional browserContext field that has the Playwright browser.newContext options type. Then @crawlee/browser-pool would use that to launch the browser context.

It appears that the current code reuses the LaunchContext.launchOptions type when launching the browser context, even though that type describes Playwright's launch function, which creates a browser. https://github.com/apify/crawlee/blob/2f9aa4e22017d08a396c1bca948b0c5c1e3ab84c/packages/browser-pool/src/playwright/playwright-plugin.ts#L109

That line calls launchPersistentContext, which takes similar (maybe identical) options to browser.newContext.

However I tried testing to see if these get passed through by adding these options to my PlaywrightCrawler:

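import { PlaywrightCrawler } from 'crawlee';
import { firefox } from 'playwright';
import type { BrowserContextOptions } from 'playwright';

// DATA_DIR is a directory path defined elsewhere in my project.
const crawler = new PlaywrightCrawler({
  launchContext: {
    launcher: firefox,
    launchOptions: {
      recordHar: {
        path: DATA_DIR + "/data.har",
      },
    } as BrowserContextOptions,
  },
});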

but it didn't work.

So you could move these options to a new field, or update the types so it is a union of the browser and context launch options.
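To make the two proposals concrete, here is a minimal sketch of both shapes. It is not Crawlee's actual API: LaunchOptionsLike and ContextOptionsLike are small stand-in types (the real ones come from the playwright package), and LaunchContextWithField, CombinedLaunchOptions, and toCombinedOptions are hypothetical names used only for illustration. The "union" idea is expressed here as an intersection type so that one object can carry keys from both option sets.

```typescript
// Stand-in types for illustration only; the real LaunchOptions and
// BrowserContextOptions types come from the 'playwright' package.
interface LaunchOptionsLike {
    headless?: boolean;
    slowMo?: number;
}

interface ContextOptionsLike {
    recordHar?: { path: string };
    userAgent?: string;
}

// Option A (hypothetical): a dedicated browserContext field next to launchOptions.
interface LaunchContextWithField {
    launchOptions?: LaunchOptionsLike;
    browserContext?: ContextOptionsLike;
}

// Option B (hypothetical): a single launchOptions object typed as both sets combined.
type CombinedLaunchOptions = LaunchOptionsLike & ContextOptionsLike;

// Flattens the option A shape into the option B shape, i.e. one object
// holding both browser-level and context-level keys.
function toCombinedOptions(ctx: LaunchContextWithField): CombinedLaunchOptions {
    return { ...ctx.launchOptions, ...ctx.browserContext };
}

const combined = toCombinedOptions({
    launchOptions: { headless: true },
    browserContext: { recordHar: { path: './data.har' } },
});
console.log(combined);
```

With option A the two kinds of options stay visibly separate for users and could be merged internally; with option B users keep a single launchOptions object and only the type changes.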

Thanks for this great library!

Alternative solutions or implementations

No response

Other context

No response

alexkreidler, Mar 27 '23 02:03

Actually, it did write a 240MB HAR file to data.har! I still think it would be best to separate the browserContext options from the general browser options, or alternatively to make the type a union.

alexkreidler, Mar 27 '23 16:03

I tried to introduce a new option for this, but it is far from trivial, so in the end I gave up and only improved the type of launchOptions.

B4nan, Jun 11 '24 13:06