shortest Add caching

Issue #124

This PR introduces basic caching mechanism to reduce costs and increase effectiveness of running test suites.

Performance Boost: Achieves an average speedup of 400%-600%, automations like "Find Lionel Messi Wikipedia page" now completing about 5x faster

Flow diagram

Note: This PR is WIP and still on early stage. It's there early to gather feedback. Future tests, refinements and enhancements will be made to make it reliable and even more performant

Dec 26 '24 23:12 nazargladish

@gladyshcodes is attempting to deploy a commit to the Antiwork Team on Vercel.

A member of the Team first needs to authorize it.

Dec 26 '24 23:12 vercel[bot]

All committers have signed the CLA.

Dec 26 '24 23:12 CLAassistant

One challenge is:

async getComponentStringByCoords(x: number, y: number) {
    return await this.getPage().evaluate(
      ([x, y]) => {
        const elem = document.elementFromPoint(x, y);
        return elem?.outerHTML.trim().replace(/\s+/g, ' ');
      },
      [x, y]
    );
  }

The function above retrieves normalized component string given X and Y coordinates. The reason it's implemented this way is that currently, Playwright does not support selecting DOM elements using coordinates (see this closed issue). The only way I have found was to use native document func. If you have any other ideas in mind, please suggest 👍

Dec 27 '24 00:12 nazargladish

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
shortest	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Dec 30, 2024 7:50pm

Dec 27 '24 07:12 vercel[bot]

Would be nice to resolve the conflicts and then I can test your branch locally.

Dec 27 '24 07:12 m2rads

Would be nice to resolve the conflicts and then I can test your branch locally.

@m2rads I have resolved conflicts now. You can check it out.

Dec 27 '24 16:12 nazargladish

Refinements / Improvements:

Write new text execution to cache file only after it has been successfully completed. That way we can be sure cached tests are only successful ones
Delete cache entry if it fails to run (e.g UI change) and rewrite it

Dec 27 '24 17:12 nazargladish

@gladyshcodes sorry we made some more changes. Please resolve the conflicts again and I will test soon. Thank you :)

Dec 27 '24 22:12 m2rads

@gladyshcodes sorry we made some more changes. Please resolve the conflicts again and I will test soon. Thank you :)

Hi there @m2rads. Those conflicts actually do not interfere with functionality added. I resolved them 👍 .

Dec 28 '24 19:12 nazargladish

Seems like Vercel deployment is failing. I am also getting some build errors locally.

Steps to reproduce this error:

Remove these files: pnpm-lock.json node_modules packages/shortest/node_modules packages/shortest/dist

Then run pnpm install

Let me know if you need Vercel logs for troubleshooting.

Dec 29 '24 11:12 m2rads

@m2rads Build issue resolved, but Vercel pipeline does not re-run. I guess your approval is needed

Dec 29 '24 15:12 nazargladish

Thanks the build issue is resolved now but I think there might be another issue. When running a test, it caches the steps successfully, but on a consecutive run, it deletes the cache and does the same process. It doesn't seem like the consecutive test is being executed from the cache as it takes the same amount of time to run.

https://github.com/user-attachments/assets/ba574583-5194-4827-8bb6-8550e636d3d6

Dec 29 '24 22:12 m2rads

@m2rads Hmm that's weird. I have just tried running several tests and it worked fine for me. I didn't try it with new dashboard test though. I will see now and let you know

shortest("Open Google"); runs:

Dec 29 '24 22:12 nazargladish

I see. The new test needs setup with Mailosaur. Maybe try a more complicated test that has multiple steps and see if this issue happens.

Dec 29 '24 22:12 m2rads

@m2rads I have tried with this test, it works:

shortest("Open first article on Hackernews");

Dec 29 '24 22:12 nazargladish

@m2rads I have tried with this test, it works:

shortest("Open first article on Hackernews");

Hhhm seems like this happens when you have multiple test cases. Can you try with more than one test case?

Dec 30 '24 09:12 m2rads

@m2rads Thank you for caching the bug! I've realized the problem was caused by several factors:

Missing sleep time after each browserTool.execute. Similar to how it's done in runner, we need to address this. However, I believe we should eliminate such code altogether since it affects the performance of both cached and uncached tests. Sometimes, waiting less than a second is sufficient for the next action, while other times, potentially, a longer wait might be needed
For dashboard.test.ts:

I managed to get it running by referencing to the source code of #183, as the relevant documentation appears to be missing. It would be helpful under the services configuration section section. Also, Clerk configuration needs updating: you’ll need to disable the “Require the same device and browser” checkmark in the Clerk dashboard for the tests with mailosaur to work (at least in my case)
Issue itself was the difference in components from email letter as their href attrs differ every time. We now recursively clean up href attributes from the tags. The idea that if the URL changes and we need to click the link, the next step will fail

I ran several tests and they pass now:

Perf boost: 5.6x

@slavingia If you have time, take a look as well, maybe any suggestions?

Dec 30 '24 16:12 nazargladish

Merged in auto fix which led to some new failures

Dec 30 '24 17:12 slavingia

Bug report:

however I've removed the button I was testing and it still pass

maybe I did something wrong, my test is just this: shortest("Open the home page and signin via email").expect("A message to check your email");

what I did now is that I removed the "Signing via Email" link and it still pass

it seems the second time I run the tests they are run against the screenshots but don't hit the server

You can see the video here: https://x.com/madarco/status/1874141497404363128

I guess there needs to be some logic to destroy the cache, so that the spec starts to fail. We'll need to think more on the best way to do that dynamically; we can start by making it very easy (one command) to nuke the cache?

Dec 31 '24 18:12 slavingia

shortest shortest copied to clipboard

Add caching

shortest
shortest copied to clipboard