site-kit-wp icon indicating copy to clipboard operation
site-kit-wp copied to clipboard

Resolve VRT issues for local runs

Open aaemnnosttv opened this issue 3 years ago • 4 comments

Bug Description

We recently updated our VRT infrastructure to support running on ARM/M1 machines which are becoming more common rather quickly.

While the build runs on these machines where it didn't before, it seems to have introduced some degree of instability in local runs on Intel machines (and M1s?).

At current, I see ~10 scenario failures when running locally, most of which seem to be very small differences in font rendering although some appear to have timeouts as well, albeit inconsistently.

I do see some amount of failures consistently though. See https://github.com/google/site-kit-wp/issues/4619#issuecomment-1241116909

Steps to reproduce

  1. Run npm run test:visualtest locally
  2. See failures

Additional Context

  • OS: MacOS 12.5.1

Do not alter or remove anything below. The following sections will be managed by moderators only.

Acceptance criteria

  • Visual regression tests should consistently run successfully, both locally (on Intel + ARM) and on GitHub actions

Implementation Brief

Test Coverage

QA Brief

Changelog entry

aaemnnosttv avatar Sep 12 '22 20:09 aaemnnosttv

Update after spending some time on the ticket:

  • Have yet to determine the cause of the issue
  • Tried loading fonts within Storybook differently, using webfontloader and trigger backstop when the fonts are loaded via the custom wf-active class name.
  • Might not be related to whether fonts are already loaded.
  • Taking a lot of time to run the tests to check.

asvinb avatar Sep 21 '22 08:09 asvinb

@FlicHollis I think we can remove this from Sprint 85 as it looks like it needs more work on the IB :-)

eclarke1 avatar Sep 26 '22 08:09 eclarke1

This has been removed, thanks @eclarke1

FlicHollis avatar Sep 27 '22 09:09 FlicHollis

Latest updates:

  • Fonts are not rendering even though technically they are loaded. Reference: image

Test: image

  • No improvements noted when increasing the delay in tests/backstop/config.js
  • The same tests consistently fail (for e.g Global admin bar) which means that potentially we are missing something in those tests.

asvinb avatar Sep 28 '22 12:09 asvinb

I am also seeing failures on my Linux workstation(Ubuntu 22.04, AMD64), similar to the description but more, ~30 failures here.

techanvil avatar Oct 26 '22 09:10 techanvil

@aaemnnosttv Not sure if am missing something here and how the GH actions are all passing, but if you look at one of the reference image on develop and you compare it with what we currently have in Storybook for that image, they are different, with the GA4 CTA present in Storybook and not in the reference image. Screenshots in case things change:

Storybook: image

BackstopJS Reference: image

Weird how it VRT fails locally on develop whereas in GH actions, everything works fine 🤔 Do you have any idea @aaemnnosttv @tofumatt @eugene-manuilov ?

asvinb avatar Nov 03 '22 16:11 asvinb

Wow, that is indeed strange. Perhaps @techanvil or @nfmohit could investigate this further?

aaemnnosttv avatar Nov 09 '22 20:11 aaemnnosttv

Having discovered that the VRT suite is not actually running as expected in CI, I am parking this issue until that has been resolved: https://github.com/google/site-kit-wp/issues/6151

techanvil avatar Nov 15 '22 17:11 techanvil

Based on the work on #6151, it looks like this issue will still be relevant, with a reduction in the number of test failures but still a couple of cases that need fixing and also a more general issue of stability with unpredictable timeouts occurring locally.

techanvil avatar Nov 21 '22 16:11 techanvil

@asvinb @aaemnnosttv We fixed the VRT font loading issue with #6324 So I think we can close this issue.

derweili avatar Jan 16 '23 14:01 derweili

I think it could be worth keeping to see about fixing the timeouts that occur when running locally...

techanvil avatar Jan 16 '23 16:01 techanvil

I tried running on develop today and most of them failed for me with what seemed to be impercievable font-related differences.

       report | Test completed...
       report | 16 Passed
       report | 152 Failed

E.g.

image

When scrubbing the diff, I can't really notice a difference 🤔

aaemnnosttv avatar Jan 16 '23 18:01 aaemnnosttv

@aaemnnosttv what machine do you use? For me on mac M1 the VRT tests do not fail.

derweili avatar Jan 16 '23 19:01 derweili

@aaemnnosttv @derweili I just checked locally (Mac with Intel chip) and there are no failures: image

I see images have been updated last night, can you try again @aaemnnosttv ?

asvinb avatar Jan 17 '23 08:01 asvinb

@asvinb the updates were only about removing unused reference images. This should not change anything for this issue.

derweili avatar Jan 17 '23 13:01 derweili

That's odd @asvinb – I'm also on a Intel based Mac. I ran it again on develop just now and got the same results 😞

aaemnnosttv avatar Jan 18 '23 22:01 aaemnnosttv

@aaemnnosttv is your VRT docker image up to date? Maybe you could try rebuilding it.

derweili avatar Jan 19 '23 08:01 derweili

@derweili I deleted the images I had and ran it again and everything's passing now 🤷‍♂️ 🎉

I'll close this out then but do you know what changed in the image? It's built based on the specific version of backstop but something must have changed 🤔

aaemnnosttv avatar Jan 23 '23 22:01 aaemnnosttv

@aaemnnosttv actually I don't really know. With #6324 only images were updated that had visual changes, mostly due to font rendering errors.

The VRT image from your example was updated 1 month ago but also without visual changes. https://github.com/google/site-kit-wp/pull/6303/files

However, I experienced the error you had (errors without visual change), when I ran backstopjs locally without docker. I did this to debug font loading. So backstop used a local instance of chromium. When running the VRT tests this way, almost all the tests failed even though there were no visual changes. So I thought, this could be related to a different font rendering/antialising setting or so.

So I though it might be related to an outdated VRT docker image with different settings.

But if deleting the test images helped for you, should be delete the test images before the test? Why do we need to keep the test images for all test runs?

derweili avatar Jan 24 '23 08:01 derweili

But if deleting the test images helped for you, should be delete the test images before the test?

I didn't delete the test images, I deleted the local Docker image! When running again, it rebuilt the image which seemed to do the trick. It must have been something in the alpine:latest base image that changed 🤔

Why do we need to keep the test images for all test runs?

I think this is just how Backstop works, but I could be wrong. It references the generated test files in the report it creates but I'm really not sure why it accumulates them the way it does.

aaemnnosttv avatar Jan 28 '23 02:01 aaemnnosttv