autopush-rs Implement integration & e2e tests for Argo Rollouts

Execute integration and E2E tests against stage and production canaries.

┆Issue is synchronized with this Jira Task

May 21 '25 18:05 data-sync-user

➤ Katrina Anderson commented:

Rachael Crook I added some acceptance criteria. Creating GCP buckets is part of Phase 3 of the Ecosystem Test Metric Project ( https://docs.google.com/document/d/12ALt57EV1IPmurMXlE8H5xQle-gfTsagpn1NHiu2R4k/edit?tab=t.0#heading=h.zcehyebhjk69 ), as is determining the file name format we will need. Please follow up with us for specifics when the time comes.

May 21 '25 18:05 data-sync-user

➤ Katrina Anderson commented:

The GCP bucket for test results has been created. JUnit XML reports should be uploaded here → https://console.cloud.google.com/storage/browser/ecosystem-test-eng-metrics/autopush-rs/junit

The report names must follow a strict naming convention: {job_number}__{utc_epoch_datetime}__{workflow}__{test_suite}__results{-index}.xml

The test suite in this case is ‘e2e’.
The index is optional and only needed for parallel test execution, which I don’t think will be the case here.

Examples:
15592__1724283071__autopush-rs__stage-deployment__e2e__results.xml
15592__1724283071__autopush-rs__production-deployment__e2e__results.xml
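
A minimal sketch of how a reporting step might assemble a name in this format. The function and parameter names are hypothetical. Treating `autopush-rs` as a separate repository component is an assumption read off the two examples above, as is rendering the optional index as `-1`, `-2`, and so on.

```python
from typing import Optional


def junit_report_name(job_number: int, epoch_seconds: int, repository: str,
                      workflow: str, test_suite: str,
                      index: Optional[int] = None) -> str:
    """Join the name components with double underscores, as in the examples."""
    parts = [str(job_number), str(epoch_seconds), repository, workflow, test_suite]
    for part in parts:
        # A component containing the separator would make the name ambiguous.
        if not part or "__" in part:
            raise ValueError(f"invalid name component: {part!r}")
    # Assumption: the optional index is appended as "-<n>" for parallel runs.
    suffix = "" if index is None else f"-{index}"
    return "__".join(parts) + f"__results{suffix}.xml"
```

Calling `junit_report_name(15592, 1724283071, "autopush-rs", "stage-deployment", "e2e")` reproduces the first example name above.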

May 21 '25 18:05 data-sync-user

➤ Rachael Crook commented:

I’m working on deploying a basic configuration of Argo Rollouts to prod first, which is happening today and next week. When we go to implement tests, we’ll need to determine how we can trigger the tests and report on them. Argo Rollouts has built-in integration with Prometheus and Grafana, so we can, for example, have it check that there are no alerts before deploying a canary.

May 21 '25 18:05 data-sync-user

➤ Katrina Anderson commented:

A while back, the team mentioned that both the integration and e2e tests can be run against stage and production environments as part of the deployment. Adding that detail.

May 21 '25 18:05 data-sync-user

➤ Rachael Crook commented:

This will be done as an Argo Analysis job template.

May 21 '25 18:05 data-sync-user

➤ Philip Jenvey commented:

https://mozilla-hub.atlassian.net/browse/PUSH-326 containerized the integration test suite.

So the Argo Analysis job ( https://argo-rollouts.readthedocs.io/en/stable/analysis/job/ ) would invoke the autopush-integration-test Docker image. We haven’t tested this against a live environment before, but the suite is made to run against an existing environment by specifying the following env vars:

docker image: us-docker.pkg.dev - autopush-integration-tests

AUTOPUSH_CN_SERVER=
AUTOPUSH_EP_SERVER=
AUTOPUSH_MP_SERVER= # we can probably stub this out, and also set SKIP_SENTRY=true
DB_DSN=None
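
As a rough illustration, a pre-flight check along these lines could fail fast when the job is started without the server settings. The helper name is hypothetical. Treating only the CN and EP servers as required is an assumption based on the note above that the MP server can probably be stubbed out.

```python
from typing import List, Mapping

# Assumption: only these two settings are strictly required for a live run.
REQUIRED_VARS = ("AUTOPUSH_CN_SERVER", "AUTOPUSH_EP_SERVER")


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Passing `os.environ` at container start-up and exiting when the returned list is non-empty would surface a misconfigured job before any test runs.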

Jul 02 '25 18:07 data-sync-user

➤ Rachael Crook commented:

Philip Jenvey Are there any commands that need to be run when the container starts to kick it off?

Jul 16 '25 17:07 data-sync-user

➤ Philip Jenvey commented:

Rachael Crook Let’s override the Docker CMD to:

"sh", "-c", "poetry run pytest tests/integration/test_integration_all_rust.py --junit-xml=integration__results.xml -v"

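For context, here is a hedged sketch of how that command and the env vars might fit into an Argo Rollouts AnalysisTemplate using the job provider, built as a Python dict. It is not necessarily the configuration the team uses. The function name, the metric and container names, and `backoffLimit: 0` are assumptions, and the image reference and env values are left to the caller. Kubernetes `args` is the field that overrides a Docker image's CMD.

```python
from typing import Dict

PYTEST_CMD = (
    "poetry run pytest tests/integration/test_integration_all_rust.py "
    "--junit-xml=integration__results.xml -v"
)


def analysis_template(name: str, image: str, env: Dict[str, str]) -> dict:
    """Build an AnalysisTemplate manifest whose single metric runs the test job."""
    container = {
        "name": "integration-tests",  # hypothetical container name
        "image": image,
        # `args` replaces the image's CMD; the ENTRYPOINT, if any, is kept.
        "args": ["sh", "-c", PYTEST_CMD],
        "env": [{"name": k, "value": v} for k, v in sorted(env.items())],
    }
    job_spec = {
        "backoffLimit": 0,  # assumption: do not retry a failed test run
        "template": {
            "spec": {"containers": [container], "restartPolicy": "Never"}
        },
    }
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "AnalysisTemplate",
        "metadata": {"name": name},
        "spec": {
            "metrics": [
                {"name": "integration-tests", "provider": {"job": {"spec": job_spec}}}
            ]
        },
    }
```

Serializing the result with `json.dumps(template, indent=2)` gives a manifest that can be reviewed or applied, since Kubernetes accepts JSON as well as YAML. With the job provider, the metric succeeds when the job completes with a zero exit code, which is what pytest returns when all tests pass.
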
Jul 28 '25 20:07 data-sync-user

➤ Rachael Crook commented:

Thanks Philip Jenvey for adding the command to run. I’ll try to get this set up this week.

Jul 28 '25 20:07 data-sync-user

➤ Rachael Crook commented:

Philip Jenvey Sorry for the delay, got busy with the merino gcpv2 migration last week. I will try for this week.

Aug 04 '25 20:08 data-sync-user

➤ Rachael Crook commented:

Merged https://github.com/mozilla/webservices-infra/pull/6870 to set up the Argo analysis template job. Currently verifying that it works.

Aug 08 '25 15:08 data-sync-user

➤ Rachael Crook commented:

Another PR to fix a naming mismatch: https://github.com/mozilla/webservices-infra/pull/6930

Aug 11 '25 17:08 data-sync-user

➤ Rachael Crook commented:

The AnalysisRun shows as successful. Philip Jenvey, is there a good way to verify this other than the successful status?

https://webservices.argocd.global.mozgcp.net/applications/autopush-stage-us-west1-autoconnectrs?resource=&node=argoproj.io%2FAnalysisRun%2Fautopush-stage%2Fautoconnectrs-fb65b7b44-21-2%2F0&tab=summary

Aug 27 '25 20:08 data-sync-user

I don't think so? Essentially, this test replicates the CI integration tests. If there's a problem, those should fail and produce errors before getting to this point. CI presents the outcome of those tests.

Sep 02 '25 15:09 jrconlin