ungoogled-chromium Integrating automated testing into our development workflow

Open Eloston opened this issue 4 years ago • 10 comments

Motivation

We have a number of issues like #931, #843, #834, #835, #568, #395, and more closed issues that could be caught with automated tests (e.g. unit-tests) included with the browser code.

Furthermore, the size and scale of the patches in ungoogled-chromium continue to grow. Notably, Safe Browsing continues to grow and integrate itself deeper into the browser with every new version. This is resulting in enormous patches that are increasingly difficult to reason through.

If we want ungoogled-chromium to grow while remaining stable and reliable, we need to grow our development practices. I believe the most impactful action to take towards this goal is to integrate automated testing.

Plan

Since we have never considered automated testing before, there will be quite a bit of work involved. This will take time, so we will want to break this up into multiple stages. Here's a rough plan:

1. Learn about Google's testing workflow. Brainstorm ways of modifying/integrating their workflow given our development patterns and resources.
2. Determine the scope of changes in ungoogled-chromium to accomplish this. Here are some thoughts that come to mind:
   - There will be a lot of problems with domain substitution and binary pruning that will need to be sorted through. For example, quite a few files that are pruned are for automated tests.
   - We will also need to see what supporting tooling/scripts we will need to write to make the automated tests easy to run for anyone.
   - Our practices for writing patches will need to be updated, so that the corresponding test code is included with the actual changes.
   - I'm wondering if there are tests that are too resource-intensive for any consumer machine to run?
3. Implement the changes in this repo (ungoogled-chromium). We will probably do this in conjunction with at least one other platform repo, such as Portable Linux.
4. Inspired by https://github.com/Eloston/ungoogled-chromium/pull/907#issuecomment-575833587 we need a way to verify changes to this repo across all platforms. Otherwise we may forget to consider some platform-specific behavior. For example:
   1. Have this repo's CI checks trigger builds on all platforms.
   2. For each platform, CI will try to refresh the patches. If any patches are refreshed (regardless of whether they can be refreshed automatically), a PR will be submitted to that platform repo with the changes.
   3. Also for each platform, CI will try to build and run the unit-tests. The results of the unit-tests will be reported as a CI status on this repo's PR.
   4. For major Chromium revisions, we should probably skip the CI check if the patches fail to refresh properly (but we can still submit a PR of the patches that did refresh).
   5. Since developers on ungoogled-chromium update the platform repo patches and this repo's patches all at once, we could either close the PR automatically or not create one to begin with.
5. Implement the changes in platform repos.

I will update this plan as we continue to learn more.
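
To get a first feel for how much of the binary pruning touches test files, something like the following could be run over the entries of the pruning list. This is only a rough sketch: the marker names and the example paths are assumptions made for illustration, not an agreed-upon rule for this repo.

```python
from pathlib import PurePosixPath

# Hypothetical heuristic: directory names that usually indicate test-only files.
TEST_MARKERS = ("test", "tests", "testdata", "test_data", "testing")


def looks_test_related(path: str) -> bool:
    """Return True if any path component suggests the file exists for tests."""
    for part in PurePosixPath(path).parts:
        lowered = part.lower()
        if lowered in TEST_MARKERS:
            return True
        # File names such as foo_unittest.cc or bar_browsertest.cc
        if "unittest" in lowered or "browsertest" in lowered:
            return True
    return False


def split_pruned(entries):
    """Partition pruning-list entries into (test_related, other)."""
    test_related, other = [], []
    for entry in entries:
        entry = entry.strip()
        if not entry:
            continue  # skip blank lines
        if looks_test_related(entry):
            test_related.append(entry)
        else:
            other.append(entry)
    return test_related, other


if __name__ == "__main__":
    # Made-up example paths, not taken from the real pruning list.
    sample = ["chrome/test/data/example.bin", "third_party/example/tool.jar"]
    print(split_pruned(sample))
```

The output would only be a starting point for manual review, since a directory name alone does not prove that a test actually needs the file.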

Impact

We will solve a subset of logical errors in our patches that cause instability/malfunctions.
Given that ungoogled-chromium is mainly about stubbing/removing code, we should not need to write a lot of new code. However, there will be increased burden on developers to update testing code with any regular code changes. This opens up a class of tedious and potentially painful bugs regarding failing tests, but I believe it is a worthwhile trade-off for increased stability and reliability.

Nov 01 '19 00:11 Eloston

disregard my previous comment, thanks

Nov 01 '19 22:11 jstkdng

I am wondering how it will be possible to write tests that work without actually building the whole thing. From my experience, a lot of these problems pop up very late in the build, even in the final linking step. Then again, if the tests need to run the build process, they will be too resource-intensive (the CircleCI free plan has a maximum of 2 shared CPUs and 4 GB of RAM, which I believe would make building the Linux version take ~10 hours), so it seems a very difficult task.

Jan 19 '20 04:01 wchen342

@wchen342

I am wondering how it will be possible to write tests that work without actually building the whole thing. From my experience, a lot of these problems pop up very late in the build, even in the final linking step.

It might actually require more compilation than a regular build. My understanding is that there are two kinds of tests: browser tests, and unit tests. Running the browser tests requires building most of the browser and the tests themselves, so it may require more time than a regular build.

Also, a large project like Chromium has a lot of tests. If the time to run tests is a problem, we can consider optimizing the runtime by excluding tests that are not affected by our changes.
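
As a rough illustration of narrowing a test run to what our patches touch, the sketch below turns a list of changed files into a googletest `--gtest_filter` argument. The directory-to-suite mapping and the suite name patterns are hypothetical placeholders; a real mapping would have to be curated by hand.

```python
# Hypothetical mapping from source directories touched by our patches to
# googletest suite-name patterns. These names are assumptions for illustration.
AFFECTED_SUITES = {
    "components/safe_browsing/": ["SafeBrowsing*"],
    "net/": ["URLRequest*", "HttpNetwork*"],
}


def gtest_filter_for(changed_files):
    """Build a --gtest_filter argument covering suites affected by the changes.

    Returns None when no known directory is touched, so the caller can decide
    whether to skip the test run or fall back to running everything.
    """
    patterns = []
    for prefix, suites in AFFECTED_SUITES.items():
        if any(path.startswith(prefix) for path in changed_files):
            for suite in suites:
                pattern = suite + ".*"  # every test case in the suite
                if pattern not in patterns:
                    patterns.append(pattern)
    if not patterns:
        return None
    # googletest separates multiple positive patterns with ':'
    return "--gtest_filter=" + ":".join(patterns)


if __name__ == "__main__":
    print(gtest_filter_for(["net/http/example.cc"]))
```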

Then again, if the tests need to run the build process, they will be too resource-intensive (the CircleCI free plan has a maximum of 2 shared CPUs and 4 GB of RAM, which I believe would make building the Linux version take ~10 hours), so it seems a very difficult task.

I believe this is currently possible with GitHub Actions; the macOS repo still builds on GitHub Actions (although the builds are failing due to a macOS version incompatibility). I think some other services like OBS can handle this workload too.

In the worst case scenario, we would not have any CI and run the tests on our own machines. I believe this is still better than our informal testing.

Jan 19 '20 20:01 Eloston

@Eloston

So I tried to use GitHub Actions to run the whole build process on the Arch repo: see this. In summary, it kind of works, but it almost hit the 6-hour maximum running time for GitHub Actions. For some reason the Linux build takes much longer than the macOS version. It looks like my Android version will certainly exceed the maximum runtime.

I also took a look at the documents here, here and here. It seems building the test targets separately without building the whole browser is partially possible (at least for blink). According to this, running all the tests will take way too long, so it will be good news if each of these tests can be built and run separately (GitHub Actions can run 20 jobs per repo in parallel). However, since the current patches tend to eliminate code related to tests, and pruning/domain substitution also changes some of those files, it will probably take a lot of changes to successfully build and run the tests.
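
To sketch how the tests could be spread over parallel jobs: the helper below assigns test targets to jobs round-robin, and a second helper sets the `GTEST_TOTAL_SHARDS` and `GTEST_SHARD_INDEX` environment variables that googletest uses to split a single test binary across runs. The target names in the example are assumptions about what would be built, and I have not checked that they build with our patches applied.

```python
import os


def assign_targets(targets, jobs):
    """Distribute test targets across `jobs` buckets in round-robin order."""
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    buckets = [[] for _ in range(jobs)]
    for index, target in enumerate(sorted(targets)):
        buckets[index % jobs].append(target)
    return buckets


def shard_env(shard_index, total_shards, base_env=None):
    """Return an environment that makes a googletest binary run one shard."""
    if not 0 <= shard_index < total_shards:
        raise ValueError("shard_index must be in [0, total_shards)")
    env = dict(os.environ if base_env is None else base_env)
    env["GTEST_TOTAL_SHARDS"] = str(total_shards)
    env["GTEST_SHARD_INDEX"] = str(shard_index)
    return env


if __name__ == "__main__":
    # Assumed target names, for illustration only.
    print(assign_targets(["net_unittests", "base_unittests", "blink_unittests"], 2))
    print(shard_env(0, 20, base_env={}))
```

Round-robin ignores how long each target takes, so in practice the slow targets would probably need to be sharded internally while the small ones are grouped together.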

Jan 23 '20 06:01 wchen342

@wchen342, the macOS VMs have more resources (4 CPUs, 12 GB RAM) than the Linux (and Windows?) VMs. You could try running via Docker on macOS; I think it used to be installed in the beta, but it's currently not listed. Maybe you can install it via brew or brew cask on a macOS VM.

Hardware:

    Hardware Overview:

      Model Name: Apple device
      Model Identifier: VMware7,1
      Processor Speed: 3.33 GHz
      Number of Processors: 4
      Total Number of Cores: 4
      L2 Cache (per Processor): 256 KB
      L3 Cache (per Processor): 12 MB
      Memory: 12 GB

Jan 23 '20 11:01 kramred

@kramred Interesting, because I looked at GitHub help and it says "Each virtual machine has the same hardware resources." Maybe they changed this recently?

Jan 23 '20 12:01 wchen342

@wchen342 They have stated that since the beta, but from the first time I checked during the beta until recently (I think still now) the macOS VMs have had higher specs. I think they ran the macOS VMs on non-Apple hardware during the beta and then switched to MacStadium but mostly kept the specs (except for the OS version and some of the installed software). I have not checked the actual specs for the Windows VMs, but the Linux ones are as stated in the support document. As this is owned/run by Microsoft now, I think they might actually be using Windows as the host system and VMware for virtualisation (not that it matters much).

Jan 23 '20 13:01 kramred

@kramred I just tried and you are right about the hardware. However, as far as I can tell, I cannot get Docker to run properly on macOS, because Docker needs VirtualBox as its driver but macOS blocks installation of VirtualBox by default, and either the GUI or Recovery mode is needed to unblock that. The legacy xhyve kind of works, but it is deprecated and unstable, so I think that's probably not a good solution.

Edit: It seems xhyve works most of the time, but occasionally it will time out for an unknown reason. I guess it's still better than nothing. I still need to test the overhead of running under a hypervisor though.

Jan 24 '20 03:01 wchen342

@Eloston

Since I have successfully tested auto-build for archlinux and android, I think the next step is to try to put together a bigger system across the repos. There are two problems to solve from here, quoting the original post:

Learn about Google's testing workflow. Brainstorm ways of modifying/integrating their workflow given our development patterns and resources.

So it turns out chromium comes with a bunch of unit tests, and they can be built with ninja just like the usual binary builds. I haven't tested this yet, since I imagine it will require a lot of tweaks throughout the source code because of the patches/domain substitution.
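
For reference, building and running a single test target would look roughly like the sketch below. It only assembles the command lines and does not run them. The output directory `out/Default` and the target name `base_unittests` are assumptions, and as said above I have not verified that any test target builds with our patches applied.

```python
def build_and_run_commands(target, out_dir="out/Default", gtest_filter=None):
    """Return (build_argv, run_argv) for one ninja test target.

    The test binary is assumed to be written to out_dir under the target's name.
    """
    build = ["ninja", "-C", out_dir, target]
    run = [f"{out_dir}/{target}"]
    if gtest_filter:
        # Optionally restrict the run to matching tests.
        run.append(f"--gtest_filter={gtest_filter}")
    return build, run


if __name__ == "__main__":
    build, run = build_and_run_commands("base_unittests")
    print(" ".join(build))
    print(" ".join(run))
    # To actually execute them from the source tree root, one could use:
    #   import subprocess
    #   subprocess.run(build, check=True)
    #   subprocess.run(run, check=True)
```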

Inspired by #907 (comment) we need a way to verify changes to this repo across all platforms. Otherwise we may forget to consider some platform-specific behavior.

Since I have played with Actions for a while now, I have a vague idea of how this should work:

First of all, we need a bot account which has write access to both the base repo (ungoogled-chromium) and platform-specific repos.
When a PR is created in the main repo, it shall trigger a workflow that sends out a repository_dispatch to all platform repos (those that have a workflow configured, of course).
Upon receiving the event, workflows in the platform repos will start running builds with code pulled from the PR branch.
Upon completion, the sub-workflows will each send a repository_dispatch back to the main repo with the build status, triggering the bot to post comments containing the build status under the PR.
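
The dispatch step could be sketched as below. It builds a request for GitHub's "create a repository dispatch event" endpoint without sending it. The event type `uc-pr-build`, the payload fields, and the repo names are hypothetical placeholders chosen for illustration; the token would be the bot account's.

```python
import json
import urllib.request


def make_dispatch_request(owner, repo, token, event_type, payload):
    """Build (but do not send) a repository_dispatch request for one repo."""
    url = f"https://api.github.com/repos/{owner}/{repo}/dispatches"
    body = json.dumps(
        {"event_type": event_type, "client_payload": payload}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical platform repo and payload; nothing is sent here.
    request = make_dispatch_request(
        "example-org",
        "example-platform-repo",
        "BOT_TOKEN_PLACEHOLDER",
        "uc-pr-build",
        {"pr_number": 1, "ref": "refs/pull/1/head"},
    )
    print(request.get_method(), request.full_url)
    # Sending would be: urllib.request.urlopen(request)
```

The same helper would work in the other direction, with a platform repo dispatching a status event back to the main repo once its build finishes.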

I think I will start testing the second part first since it's easier and we already have auto-build for archlinux up and running. If this sounds like a reasonable plan then I will go ahead and start working on it.

Feb 20 '20 22:02 wchen342

If this sounds like a reasonable plan then I will go ahead and start working on it.

Sounds good to me. Feel free to proceed.

So it turns out chromium comes with a bunch of unit tests, and they can be built with ninja just like the usual binary builds. I haven't tested this yet, since I imagine it will require a lot of tweaks throughout the source code because of the patches/domain substitution.

I see. I am somewhat interested in this aspect from a technical standpoint, so I'll try to look into this at some point and scope out the work required. However if you do get the chance to investigate unit-tests, don't let me block your progress.

Feb 27 '20 09:02 Eloston
