web
web copied to clipboard
Concurrent tests don't call `requestAnimationFrame` consistently with Puppeteer
Took a while to debug this one, but I was writing some tests which I discovered to be flaky for uncelar reasons. After a lot of digging, I was able to confirm a minimal test case where the requestAnimationFrame callback just isn't invoked sometimes.
Reproduction
It fails at least 1/10 of the time with a timeout error. Patched requestAnimationFrame shows that one requestAnimationFrame callback is just never invoked for almost a second (longer timeouts do not help, the callback is simply never invoked):
$ npm test
dist/src/1.test.html:
🚧 Browser logs:
start @ 1703484191525
requestAnimationFrame - 1 @ 1703484191526
callback - 1 @ 1703484191529
requestAnimationFrame - 3 @ 1703484191529
callback - 3 @ 1703484191530
requestAnimationFrame - 5 @ 1703484191530
callback - 5 @ 1703484191546
requestAnimationFrame - 7 @ 1703484191546
cancelAnimationFrame - 7 @ 1703484191546
dist/src/2.test.html:
🚧 Browser logs:
start @ 1703484191535
requestAnimationFrame - 1 @ 1703484191535
callback - 1 @ 1703484191550
requestAnimationFrame - 3 @ 1703484191550
❌ Error: FAILED: Timeout after 1000 milliseconds @ 1703484192535!
at src/timeout.ts:15:12
at async timeout (src/timeout.ts:7:9)
at dist/async%20http:/localhost:8000/src/2.test.js:3:1
Chromium: |██████████████████████████████| 2/2 test files | 1 passed, 0 failed
I was only able to experience this issue in @web/test-runner-puppeteer. @web/test-runner-playwright does not appear to be affected. @web/test-runner-chrome may or may not be affected, I wasn't able to easily try it out as I'm working in a WSL environment which doesn't have a local install of Chrome.
Root Cause
It seems that by default, @web/test-runner-puppeteer will create a single browser context and use multiple tabs within it for concurrent tests. This unfortunately means that only one page/test is in the foreground at any given time and requestAnimationFrame in particular is highly dependent on page visibility. In my case this means that the backgrounded page just doesn't receive the RAF callback sometimes and its tests timeout. I believe this can also affect setTimeout and I did see some weird behavior there, but was unable to narrow down a minimal test case with it.
Workaround
From https://github.com/puppeteer/puppeteer/issues/2694, this behavior seems to be intended from Puppeteer, with the workaround being to create a new browser context. I was able to fix this via:
import { puppeteerLauncher } from '@web/test-runner-puppeteer';
export default {
browsers: [
puppeteerLauncher({
createPage: async ({ context, config }) => {
return (await context.browser().createIncognitoBrowserContext()).newPage();
},
}),
],
};
This creates a new browser context (basically a window?) for every page. This means all pages are always foregrounded and RAF is always invoked as expected.
Note that createBrowserContext: ({ browser, config }) => browser.createIncognitoBrowserContext() is not sufficient to fix the issue, as that context will be shared across all pages, so only one page remains foregrounded and RAF still gets dropped sometimes.
Solutions
I'm not 100% on the ideal solution here, but I spent a lot of time and energy debugging this issue so I'd like to spare the next developer from going through the same thing. I see a few potential solutions here:
1. Create a new context by default for each page.
- This avoids the problem by default.
- Potentially slower since each test gets its own context.
- Might make headfull testing more annoying since each concurrent test would get its own window.
- I'm not sure it's actually possible to "see" though as
--manualrequires the user to supply their own browser and--watch > Dopens on a single test, not multiple.
- I'm not sure it's actually possible to "see" though as
- Confusing that
@web/test-runner-puppeteeruses a non-Puppeteer default option forcreatePage. If users override this value, they would need to apply the same logic in their override or risk the same bug. However Puppeteer isn't really specific to testing, so I think it's fair that their defaults might be more generally focused on browser automation while WTR can have defaults focused on testing use cases, where test isolation of RAF is a bigger concern. - Likely still want to do 3. for users who override the
createPagehook.
2. Disable concurrency by default for Puppeteer.
The issue only occurs when multiple tests are being run at the same time, so disabling concurrency fixes the issue. This would negatively impact performance, definitely not an ideal solution. It doesn't necessarily need to disable concurrency by default for Playwright (or Chrome is that is indeed not affected), however it might make documentation of the default value a little more awkward. Users opting in to concurrency will still encounter the same problem. Likely still want to do 3. anyways.
3. Document the issue.
Specifically we could update https://modern-web.dev/docs/test-runner/browser-launchers/puppeteer/ and any other pages to document the issue and the workaround.
Anyone setting up Puppeteer is likely to just copy-paste the suggested configuration, so if that includes the createPage workaround, then projects are likely to include that configuration. Of course not everyone reads the docs, and some might ignore the workaround thinking it doesn't apply to them. Also weird caveats like this are exactly the kind of thing which would turn a user away from Puppeteer and/or Web Test Runner in the first place, especially if they don't understand the issue and how it may or may not affect them.
4. Do nothing.
We can accept this issue and https://github.com/puppeteer/puppeteer/issues/2694 as sufficient documentation in and of themselves. This is an admittedly rare issue, only occurring with requestAnimationFrame and concurrent tests (worth keeping in mind that concurrency is enabled by default).
I don't like this approach because this is a really difficult problem to debug. It requires a developer to go through the process:
- Discover a test is occasionally getting timeouts for no clear reason.
- Isolate the flakiness and take the effort to debug it.
- Discover that RAF isn't being called correctly.
- Identify that the issue only occurs for multiple pages with
concurrency > 1. - Narrow down the issue to Web Test Runner and/or Puppeteer.
- Realize background pages specifically don't invoke RAF.
- Discover the
createPageworkaround.
Those are some non-obvious cognitive leaps. They might always get lucky with a Google search which jumps them a few steps ahead, but I think most devs are likely to get to step 2 or 4 and just turn off the test or disable concurrency without understanding why. Given the amount of time I invested in debugging this, I probably should have too.
requestAnimationFrame also isn't a super obscure API. It's not terribly common either, but definitely something used frequently enough that users will encounter this problem.
5. Ask Puppeteer to change the default.
I considered opening an issue against Puppeteer but decided against it for now. I think there's definitely an argument to be made that Puppeteer should provide stronger isolation by default here, but I think their APIs are a more direct translation of the browser. The fact that multiple tabs in a window are coupled together via RAF timing is confusing and awkward, but also exactly what the user asked for when they did browser.newPage(). We could go that route if we don't think this is WTR's problem to solve, but I expect that to be a more difficult argument to make.
Ultimately Puppeteer is a browser automation tool, and that has more generic use cases than testing. Since WTR is focused on testing, I think it makes sense to have slightly more opinionated defaults here as make sense for testing use cases.
Proposal
If it's not obvious already, I think 1. is the best option. This issue is way too complicated and nuanced to expect users to fix themselves, no matter how effectively WTR can document the problem. I think we also should do 3. at least for the documentation which explicitly discusses overriding createPage, since that's where users will encounter this edge case.
Apologies for the very long issue, I wanted to make sure I captured all the relevant information here. Hopefully this is something we can address to avoid anyone else having this pain in the future.
Additional info: https://github.com/modernweb-dev/web/issues/2520
One other thought: I'm not sure if/how my proposed workaround differs from concurrentBrowsers. Maybe it would make more sense to deprecate / discourage concurrency for Puppeteer and suggest concurrentBrowsers instead? That might have the same net effect.