Implement integration & e2e tests for Argo Rollouts
Execute integration and E2E tests against stage and production canaries.
┆Issue is synchronized with this Jira Task
➤ Katrina Anderson commented:
Rachael Crook I added some acceptance criteria. Creating GCP buckets is part of Phase 3 of the Ecosystem Test Metric Project (https://docs.google.com/document/d/12ALt57EV1IPmurMXlE8H5xQle-gfTsagpn1NHiu2R4k/edit?tab=t.0#heading=h.zcehyebhjk69), as is determining the file name format we will need. Please follow up with us for specifics when the time comes.
➤ Katrina Anderson commented:
The GCP bucket for test results has been created. JUnit XML reports should be uploaded here → https://console.cloud.google.com/storage/browser/ecosystem-test-eng-metrics/autopush-rs/junit
The report names must follow a strict naming convention: {job_number}__{utc_epoch_datetime}__{workflow}__{test_suite}__results{-index}.xml
- The test suite in this case is 'e2e'.
- The index is optional; it is used for parallel test execution, which I don't think will be the case here.
Examples:
- 15592__1724283071__autopush-rs__stage-deployment__e2e__results.xml
- 15592__1724283071__autopush-rs__production-deployment__e2e__results.xml
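A minimal Python sketch of how a report name matching the examples above could be assembled. The field names are assumptions read off the examples (which carry the repository name `autopush-rs` ahead of the workflow), the `-N` index suffix is a guess at how parallel runs would be numbered, and `report_name` / `destination_uri` are hypothetical helpers. The bucket and prefix come from the console URL above.

```python
from typing import Optional

# Bucket and object prefix taken from the console URL in this thread.
BUCKET = "ecosystem-test-eng-metrics"
PREFIX = "autopush-rs/junit"


def report_name(
    job_number: int,
    utc_epoch: int,
    repository: str,
    workflow: str,
    test_suite: str,
    index: Optional[int] = None,
) -> str:
    """Join the fields with double underscores, as in the examples."""
    # Assumption: the optional index is appended as "-<n>" after "results".
    suffix = f"-{index}" if index is not None else ""
    fields = [
        str(job_number),
        str(utc_epoch),
        repository,
        workflow,
        test_suite,
        f"results{suffix}",
    ]
    return "__".join(fields) + ".xml"


def destination_uri(name: str) -> str:
    """Build the gs:// URI the report would be uploaded to."""
    return f"gs://{BUCKET}/{PREFIX}/{name}"


if __name__ == "__main__":
    name = report_name(15592, 1724283071, "autopush-rs", "stage-deployment", "e2e")
    print(destination_uri(name))
```

The upload itself is left out on purpose; any GCS client or `gsutil`/`gcloud` step could take the URI from `destination_uri`.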
➤ Rachael Crook commented:
Working on deploying a basic configuration of Argo Rollouts to prod first, which is being done today and next week. When we go to implement tests, we'll need to determine how to trigger the tests and report on them. Argo Rollouts has built-in integration with Prometheus and Grafana, so we can, for example, have it check that there are no alerts before deploying a canary.
➤ Katrina Anderson commented:
A while back, the team mentioned that both the integration and e2e tests can be run against stage and production environments as part of the deployment. Adding that detail.
➤ Rachael Crook commented:
This will be done as an Argo Analysis job template.
➤ Philip Jenvey commented:
https://mozilla-hub.atlassian.net/browse/PUSH-326 containerized the integration test suite.
So the Argo Analysis job (https://argo-rollouts.readthedocs.io/en/stable/analysis/job/) would invoke the autopush-integration-test Docker image. We haven't tested this against a live environment before, but the suite is made to run against an existing environment by specifying the following env vars:
docker image: us-docker.pkg.dev - autopush-integration-tests
- AUTOPUSH_CN_SERVER=
- AUTOPUSH_EP_SERVER=
- AUTOPUSH_MP_SERVER=
# we can probably stub this out and also set SKIP_SENTRY=true
- DB_DSN=None
➤ Rachael Crook commented:
Philip Jenvey Are there any commands that need to be run when the container starts to kick it off?
➤ Philip Jenvey commented:
Rachael Crook Let’s override the Docker CMD to:
"sh", "-c", "poetry run pytest tests/integration/test_integration_all_rust.py --junit-xml=integration__results.xml -v"
➤ Rachael Crook commented:
Thanks Philip Jenvey for adding the command to run. I'll try to get this set up this week.
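As a rough sketch of how the env vars and command from the comments above might fit into a job-based Argo Rollouts AnalysisTemplate, the Python below builds the manifest as a plain dict and prints it as JSON. This is not the template that was actually deployed: the template and metric names, `backoffLimit`, and the example hosts are assumptions, and the image reference is passed in by the caller because the full path is not given in this thread.

```python
import json

# Command quoted from the thread.
PYTEST_CMD = (
    "poetry run pytest tests/integration/test_integration_all_rust.py "
    "--junit-xml=integration__results.xml -v"
)


def analysis_template(image: str, cn_server: str, ep_server: str, mp_server: str) -> dict:
    """Return an AnalysisTemplate manifest with a single job metric."""
    env = {
        "AUTOPUSH_CN_SERVER": cn_server,
        "AUTOPUSH_EP_SERVER": ep_server,
        "AUTOPUSH_MP_SERVER": mp_server,
        "SKIP_SENTRY": "true",
        "DB_DSN": "None",  # stubbed out, as suggested in the thread
    }
    container = {
        "name": "integration-tests",  # hypothetical container name
        "image": image,
        # Kubernetes `command` replaces the image ENTRYPOINT; using `args`
        # instead would be the literal equivalent of overriding Docker CMD.
        "command": ["sh", "-c", PYTEST_CMD],
        "env": [{"name": k, "value": v} for k, v in env.items()],
    }
    job_spec = {
        "backoffLimit": 0,  # assumption: do not retry a failed test run
        "template": {"spec": {"containers": [container], "restartPolicy": "Never"}},
    }
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "AnalysisTemplate",
        "metadata": {"name": "autopush-integration-tests"},  # hypothetical name
        "spec": {
            "metrics": [
                {"name": "integration-tests", "provider": {"job": {"spec": job_spec}}}
            ]
        },
    }


if __name__ == "__main__":
    # Placeholder image and hosts; substitute the real values.
    manifest = analysis_template(
        "IMAGE_PLACEHOLDER",
        "https://cn.example.invalid",
        "https://ep.example.invalid",
        "https://mp.example.invalid",
    )
    print(json.dumps(manifest, indent=2))
```

With a job metric, the analysis is considered successful when the Job completes with exit code zero, so a failing pytest run would fail the AnalysisRun.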
➤ Rachael Crook commented:
Philip Jenvey Sorry for the delay, got busy with the merino gcpv2 migration last week. I will try for this week.
➤ Rachael Crook commented:
Merged https://github.com/mozilla/webservices-infra/pull/6870 to set up the Argo analysis template job. Currently verifying whether it is working.
➤ Rachael Crook commented:
Another PR to fix a naming mismatch: https://github.com/mozilla/webservices-infra/pull/6930
➤ Rachael Crook commented:
The AnalysisRun shows as successful. Philip Jenvey, is there a good way to verify this other than the successful status?
https://webservices.argocd.global.mozgcp.net/applications/autopush-stage-us-west1-autoconnectrs?resource=&node=argoproj.io%2FAnalysisRun%2Fautopush-stage%2Fautoconnectrs-fb65b7b44-21-2%2F0&tab=summary
I don't think so? Essentially, this test replicates the CI integration tests. If there's a problem, those should fail and produce errors before getting to this point. CI presents the outcome of those tests.
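If more detail than the run status were ever wanted, one possibility (an assumption, not something this thread settles on) would be to read the JUnit XML that the pytest command writes. The sketch below totals the counters on each `testsuite` element; `summarize_junit` is a hypothetical helper and `SAMPLE` is made-up input, not real output from this job.

```python
import xml.etree.ElementTree as ET

# Made-up report in the shape pytest's --junit-xml option produces.
SAMPLE = (
    '<testsuites>'
    '<testsuite name="pytest" tests="3" failures="1" errors="0" skipped="0"/>'
    '</testsuites>'
)


def summarize_junit(xml_text: str) -> dict:
    """Total the tests/failures/errors/skipped counters across suites."""
    root = ET.fromstring(xml_text)
    # The root may be a single <testsuite> or a <testsuites> wrapper.
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    for suite in suites:
        for key in totals:
            totals[key] += int(suite.get(key, 0))
    return totals


if __name__ == "__main__":
    print(summarize_junit(SAMPLE))
```

This only helps if the report file is retrievable after the Job finishes, for example by uploading it to the metrics bucket mentioned earlier.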