bug(e2e): Tenant Replications

Open oliverbaehler opened this issue 1 year ago • 5 comments

Bug description

The e2e tests for the tenant replications often fail:

Summarizing 1 Failure:
  [FAIL] Creating a GlobalTenantResource object [It] should replicate resources to all Tenants
  /home/runner/work/capsule/capsule/e2e/globaltenantresource_test.go:276

They should be fixed so that we can rely on the status of our pipeline.

oliverbaehler avatar Aug 20 '24 08:08 oliverbaehler

I witnessed this failure several times, despite the feature working as expected.

This could be a nice issue for newcomers. I'm happy to share more information on how to debug the failure and provide a fix; something race-ish is definitely going on.

prometherion avatar Aug 24 '24 10:08 prometherion

Hi @prometherion ,

I'd like to work on this issue. Could you please provide more details on how to reproduce the failure and any tips for debugging?

Looking forward to contributing!

dev-saw99 avatar Sep 07 '24 16:09 dev-saw99

Sorry for this late answer, @dev-saw99!

Unfortunately, I've no idea why this is happening 😢

What I could suggest is running the failing test over and over: after setting up the local environment, you can easily do this with the following command: ./bin/ginkgo run -v --tags=e2e --focus 'should replicate resources to all Tenants' ./e2e

When it fails, the Capsule logs and the current state need to be investigated, since I suspect something race-ish is happening 🤔
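
Since the failure is intermittent, a single run of the command above will usually pass. A minimal sketch of a retry loop follows, assuming bash is available; the helper name run_until_failure and the attempt limit are hypothetical and not part of the Capsule repository. It reruns any command until it exits non-zero, so the state can be inspected right after the first failing run.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Capsule): rerun a command until it fails
# or until a maximum number of attempts has been reached.
# Usage: run_until_failure MAX_ATTEMPTS COMMAND [ARGS...]
run_until_failure() {
  local max_attempts="$1"
  shift
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    echo "attempt ${attempt}/${max_attempts}"
    # Stop at the first non-zero exit so logs and state can be inspected.
    if ! "$@"; then
      echo "failed on attempt ${attempt}"
      return 1
    fi
  done
  echo "no failure in ${max_attempts} attempts"
  return 0
}
```

For example, run_until_failure 20 ./bin/ginkgo run -v --tags=e2e --focus 'should replicate resources to all Tenants' ./e2e would repeat the focused test up to 20 times and stop at the first failure.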

prometherion avatar Sep 16 '24 07:09 prometherion

@prometherion Hey, I tried but I was not able to reproduce this issue.

I will spend more time on this over the weekend and post an update here.

Meanwhile, if you have any suggestions please let me know.

dev-saw99 avatar Sep 18 '24 21:09 dev-saw99

Unfortunately, it seems it could be related to running the whole test suite rather than this single test.

The only suggestion I have is to run the entire suite, wait for a failure, and then check the Capsule logs.

prometherion avatar Sep 20 '24 09:09 prometherion

@oliverbaehler let's see how the upcoming e2e runs go, but I think #1264 will solve this issue, and its explanation sounds plausible: a race condition, happening from time to time, with a timeout for meeting the required condition.

If you agree, we could close this and reopen it if we face this kind of failure again in the future.

prometherion avatar Dec 04 '24 12:12 prometherion

This pull request has been automatically closed because it has been inactive for more than 60 days. Please reopen if you still intend to submit this pull request.

github-actions[bot] avatar May 21 '25 00:05 github-actions[bot]