feat: better support for visual regression testing
Playwright Test has a built-in toMatchSnapshot() method to power Visual Regression Testing (VRT). However, VRT is still challenging due to variance in host environments. There are a number of measures we can take right away to drastically improve the experience in @playwright/test:
- [ ] support for a docker test fixture to run browsers inside a docker image
- [ ] support for blur in snapshot matching to counteract antialiasing
- [x] better UI for reviewing snapshot diffs
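For readers new to the feature, the golden-file workflow behind toMatchSnapshot() can be sketched like this. This is a toy in-memory illustration of the concept only, not Playwright's implementation: the first run records a baseline, and later runs compare against it.

```typescript
// Toy sketch of a golden-file workflow (NOT Playwright's implementation):
// the first call for a given name records the baseline, later calls compare.
const goldenStore = new Map<string, Buffer>();

function matchSnapshot(name: string, actual: Buffer): "written" | "pass" | "fail" {
  const expected = goldenStore.get(name);
  if (expected === undefined) {
    goldenStore.set(name, actual); // first run: record the baseline
    return "written";
  }
  return expected.equals(actual) ? "pass" : "fail";
}
```

In real Playwright Test the baseline lives on disk next to the spec file and is updated with `--update-snapshots`.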
Interesting context:
I think https://github.com/americanexpress/jest-image-snapshot provides a nice suite of options for various VRT scenarios. Test scenarios vary widely, depending on the context (testing components, whole pages, text-heavy or not, etc).
Besides blurring, which helps a lot with antialiasing, it would be nice if multiple image comparisons (e. g. SSM) were possible. Alternative image comparison algorithms could be left to userland, if they can be plugged into toMatchSnapshot via a common interface.
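The "common interface" idea could look something like the sketch below. All names here are hypothetical, not a real Playwright API; the example comparator is deliberately trivial (byte equality) just to show the plug-in shape.

```typescript
// Hypothetical plug-in interface for swapping comparison algorithms
// (pixelmatch, SSIM, ...). Illustrative only, not a real Playwright API.
interface ImageComparator {
  name: string;
  // Returns a similarity score in [0, 1]; 1 means identical.
  compare(expected: Uint8Array, actual: Uint8Array): number;
}

// Trivial example comparator: fraction of matching bytes.
const byteComparator: ImageComparator = {
  name: "byte-equality",
  compare(expected, actual) {
    if (expected.length !== actual.length) return 0;
    let same = 0;
    for (let i = 0; i < expected.length; i++) {
      if (expected[i] === actual[i]) same++;
    }
    return expected.length === 0 ? 1 : same / expected.length;
  },
};
```

An SSIM or CIEDE2000 implementation could then be dropped in behind the same `compare` signature.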
Besides bluring which helps a lot with antialiasing it would be nice if multiple image comparisons (e. g. SSM) would be possible.
@florianbepunkt What's SSM? Is it structural similarity measurement (SSIM)?
@aslushnikov Yes, typo.
Solid integration with Storybook would be beneficial for the work I do. Chromatic and Percy do this really well.
Also a UI for reviewing the diffs would be great.
Also a UI for reviewing the diffs would be great.
@kevinmpowell What's the one that you find most handy? Is it a "slider" diff like here:
I actually prefer the pixel highlighting (like Playwright already does), but organize all the failing tests in a UI so I can see what failed without having to poke around three different images.
Also being able to A/B toggle the baseline and the test image is nice in some cases.
Slider is rarely useful for me. An onion-skin (transparency overlay) would be more useful.
@aslushnikov Why is toMatchSnapshot() not available in the documentation? It cannot be found in the API list. And the article that was in 1.13, https://playwright.dev/python/docs/1.13.0/test-snapshots, is no longer available for 1.14.
Thanks for thinking about Visual Regression testing. That's important!
On a related note: It would be great if tests could be run cross-platform. Currently the OS platform name is baked into the snapshot filename, so our CI tests sometimes fail due to a name mismatch. https://github.com/microsoft/playwright/issues/7575
support for blur in matching snapshot to counteract antialiasing
It would be nice if we could choose whether such image filters are applied before the snapshot is saved, or only when doing the comparison. I would prefer the first option, as it keeps the diff small when creating new snapshots, even for images that change randomly / are flaky.
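To make the blur discussion concrete, here is a minimal sketch of what a pre-comparison blur does: a 3-tap box blur over a grayscale row. Real implementations (e.g. a Gaussian blur over RGBA pixels) are more involved; this just shows why blurring dampens single-pixel antialiasing differences.

```typescript
// Minimal 3-tap box blur over a grayscale row (values 0-255).
// Edge pixels clamp to the nearest neighbor. Illustrative only.
function boxBlurRow(row: number[]): number[] {
  return row.map((_, i) => {
    const left = row[Math.max(0, i - 1)];
    const right = row[Math.min(row.length - 1, i + 1)];
    return Math.round((left + row[i] + right) / 3);
  });
}
```

A lone bright antialiasing pixel (e.g. `[0, 90, 0]`) gets smeared into its neighbors, so two renderings that differ only in such pixels end up much closer after blurring.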
Please allow an auto-generated filename when toMatchSnapshot has no name input, similar to how toMatchSnapshot works in Jest.
- [ ] Auto-generate the filename when no name is specified for toMatchSnapshot
- [ ] Set default toMatchSnapshot file extension in playwright.config.ts
E.g.

```ts
// in foo.spec.ts
toMatchSnapshot(); // => foo.spec.ts.snap (default extension customizable in playwright.config.ts)
```
When you have a lot of screenshot assertions in one file, we can avoid writing a lot of filename inputs:
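One way the requested auto-naming could work is sketched below. This is hypothetical, not a real Playwright API: derive snapshot files from the spec filename plus a configurable extension, with a counter so several unnamed assertions in one spec get distinct files, similar to Jest's `foo.spec.ts.snap` convention.

```typescript
// Hypothetical auto-namer for unnamed snapshot assertions.
// Each call yields the next filename for the given spec file.
function makeSnapshotNamer(specFile: string, ext = ".snap") {
  let counter = 0;
  return (): string => `${specFile}-${++counter}${ext}`;
}

const nextName = makeSnapshotNamer("foo.spec.ts");
// nextName() -> "foo.spec.ts-1.snap", then "foo.spec.ts-2.snap", ...
```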
Thanks for thinking on this. The blur feature is something that will help us; we had something similar before with Puppeteer that helped us do comparisons on animated pages. In addition, something that could be really useful is being able to ignore specific parts of the screen, especially those parts where we have more dynamic data (videos/images).
Blur would help us greatly, and the slider view would be incredible as well.
We're also really interested in these improvements. We had to disable visual tests for now because they fail randomly when a few pixels are off, even after increasing the threshold. Hopefully blur will help here.
I suggest solving the biggest pain point, which is how to store this stuff in a git repo so it doesn't blow up in size (i.e. storing only the last snapshot). Git LFS kinda works, but it's painful. Maybe something else would work better? For reference: https://github.com/americanexpress/jest-image-snapshot/issues/92
Would be great if these snapshot dirs were automatically marked in git to only store last revision.
We're using Git LFS, what's your issue with it? Once we had it set up for everyone (we're using Mac, Windows and Linux), it worked fine. We're storing all images in the repo using Git LFS (*.png), so there's no work involved when adding snapshots to new tests either.
The only issue I have is comparing the image diffs in VS Code when committing new images as the old image is not shown in the diff view. The diff is working fine in the GitLab merge request view though so that's not a big issue.
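For anyone following along, the Git LFS setup described above boils down to tracking the snapshot file pattern, e.g.:

```
# .gitattributes (written by `git lfs track "*.png"`)
*.png filter=lfs diff=lfs merge=lfs -text
```

With this in place, new snapshots are stored as LFS pointers automatically on commit.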
Hi @aslushnikov! This was pushed to the next version a few times now, could you please add this to the roadmap (if there is one?) so we can have a rough estimate on when this is coming?
I need to implement some visual tests soon™️ and it would be great if I wouldn't need another tool for that. I need to know if there will be improvements to this in 2 months or 2 years though.
Hey @z0n, there's no roadmap. My guesstimate is that we'll have all the pieces together by summer 2022; the priority of VRT keeps rising.
We're using Git LFS, what's your issue with it?
It works for me, but for example I wanted to use it at one company that had poor infrastructure, and it didn't work well with Jenkins, so I couldn't easily bypass it.
Also, Git LFS worked weirdly with rebases, and people had a lot of trouble with it when jumping between branches, if I remember correctly.
It works, but the experience is suboptimal.
Hey folks! Here's an update on screenshots and blurring.
I see lots of you requested a "blur" option to pre-blur images before comparison. While I imagine it can help with certain issues, it's a very big hammer, so I wonder if we can do a more delicate job.
I'd appreciate it if you could share screenshots (actual / expected) that fail with a regular diff but pass with pre-blur. This way we'll have some real-world data to play with!
Many folks mentioned that they want pre-blur to avoid snapshot failures due to a few pixel differences.
Two new options have landed on tip-of-tree: pixelCount and pixelRatio. These are supposed to help in these cases. Please give them a try and let me know if you still need pre-blur!
```sh
npm i @playwright/test@next
```
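The semantics of such thresholds (as I understand them from the comment above; this is an assumption, not the exact Playwright implementation) can be sketched as: tolerate a bounded number or fraction of differing pixels instead of requiring an exact match.

```typescript
// Sketch of pixelCount / pixelRatio-style thresholds (assumed semantics):
// the comparison passes if the number of differing pixels is within either
// an absolute budget (pixelCount) or a fractional one (pixelRatio).
function withinThreshold(
  diffPixels: number,
  totalPixels: number,
  opts: { pixelCount?: number; pixelRatio?: number },
): boolean {
  if (opts.pixelCount !== undefined && diffPixels <= opts.pixelCount) return true;
  if (opts.pixelRatio !== undefined && diffPixels / totalPixels <= opts.pixelRatio) return true;
  return false;
}
```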
Thank you for improving visual regression features, @aslushnikov!
You may find the implementation experience of gemini-testing project useful. Some pointers:
- CIEDE2000 tolerance measure
- ignoring (font?) anti-aliasing, based on a research paper
- ignoring text input caret
Several years ago we had great success using Gemini for visual regression testing. We used Gemini's built-in web UI (either Gemini GUI or html-reporter, I don't remember which) to choose which changed images were worth committing to Git. And during PR review we used GitHub's built-in image diff. We had very few false positives in image diffing. Unfortunately, the false positive rate was not zero, mostly due to subtle browser timing/random fluctuations.
Gemini is deprecated now, replaced by Hermione, from the same authors. I haven't used it, but it seems to use the same approach for image diffing. The core is in the looks-same and gemini-core libraries.
Thanks @shamrin for the pointers! I'll read your links in more detail later to get a better understanding, but so far we already do all of these:
- instead of using CIEDE2000, pixelmatch uses color difference in the YIQ color space
- pixelmatch uses the same algorithm based on the same whitepaper to ignore anti-aliasing
- we hide the text input caret on the browser level before taking a screenshot
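For context on the YIQ point: YIQ-based diffing builds on the standard NTSC RGB-to-YIQ conversion shown below (pixelmatch uses coefficients very close to these). Y approximates perceived brightness, which dominates the perceptual difference between two pixels.

```typescript
// Standard NTSC RGB -> YIQ conversion. Y is luma (perceived brightness),
// I and Q carry chroma. Inputs are 0-255 RGB channels.
function rgbToYiq(r: number, g: number, b: number): [number, number, number] {
  const y = 0.299 * r + 0.587 * g + 0.114 * b;
  const i = 0.596 * r - 0.274 * g - 0.322 * b;
  const q = 0.211 * r - 0.523 * g + 0.312 * b;
  return [y, i, q];
}
```

A YIQ-space difference can then weight the Y delta more heavily than I and Q, matching how humans notice brightness changes more than hue shifts.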
Hey! @aslushnikov I updated @florianbepunkt's original port of jest-image-snapshot to the Playwright test runner here: https://github.com/ayroblu/playwright-image-snapshot. Basically it looks VERY similar to Playwright's existing golden.ts compare API, as you can see in matcher.ts.
The main benefit is that it uses SSIM. I also updated how the diff is done, so it's similar to pixelmatch's greyscale background, which is super useful.

```ts
expect(await page.screenshot()).toMatchImageSnapshot(test.info(), [
  name,
  "1-initial-load.png",
]);
```
Would love to have this SSIM option ported to Playwright Test, as TestInfo is not exposed implicitly, which makes the API usage a bit ugly. Made a PR #12258.
I'm also hoping not to need to supply a file name by default, seems unnecessary.
For the record: docker integration depends on global fixtures, so moving them forward.
Hi, @aslushnikov! Is it possible that in one of the next releases you will implement a "slider" diff in the HTML report? There are cases where the slider is more convenient than the pixel highlighting method, especially when the lengths of the expected and actual screenshots differ.
Would it be possible to implement one more tab in the report, by analogy with Diff/Actual/Expected?

Or you could display all 3 states on one tab in the report (as it looks in the attachments of this comment).
@bezyakina not sure for 1.21 (we're about to finalize this version), but still possible! It all depends on how much our users need it.
So could you please file this separately to our bug tracker as a feature request? The more likes / upvotes it will collect, the higher priority will be for us, and the faster we'll implement it!
thanks for your reply, created a new feature request - https://github.com/microsoft/playwright/issues/13176
Hey there! Not sure if it would be better to open another feature request, but https://github.com/jz-jess/RobotEyes has an interesting feature to ignore an array of UI elements in the image comparison: these elements are blurred, helping to achieve a higher fidelity comparison (+95%). RobotEyes uses ImageMagick in the background, which is a really powerful tool for image comparison. The idea is to ignore data elements from the screen before the comparison is done. Taking that into account would require setting a different tolerance for each web page in the application, as each one can have a different number of UI elements with data. I've seen comments about blur, but it doesn't seem to be related to this... Thank you.
@AllanMedeiros you can use the mask API to mask elements on the screenshot. This should help!
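Conceptually, masking paints a region of the image with a fixed color before comparison so dynamic content (ads, timestamps, videos) cannot cause diffs. The sketch below operates on a flat grayscale buffer for brevity; Playwright's real mask option takes locators for the elements to cover instead.

```typescript
// Conceptual mask: overwrite a rectangle of a width*height grayscale
// buffer with a fill value so that region never contributes to a diff.
// Illustrative only; Playwright masks elements via locators.
function maskRect(
  pixels: number[],
  width: number,
  rect: { x: number; y: number; w: number; h: number },
  fill = 0,
): number[] {
  const out = pixels.slice();
  for (let y = rect.y; y < rect.y + rect.h; y++)
    for (let x = rect.x; x < rect.x + rect.w; x++)
      out[y * width + x] = fill;
  return out;
}
```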