kepler icon indicating copy to clipboard operation
kepler copied to clipboard

5s is not enough for e2e test on Unit test Mac

Open sunya-ch opened this issue 2 years ago • 5 comments

Describe the bug Regarding this PR https://github.com/sustainable-computing-io/kepler/pull/225, the Unit test Mac keeps failing on e2e test (while the other unit tests were fine) when fixing first waiting time to 5s. https://github.com/sustainable-computing-io/kepler/actions/runs/3262694289/jobs/5360198723

To Reproduce

Not trivial but it happens only on github actions when I add some initial logics of power model before starting the Kepler exporter server.

Expected behavior

The test should wait until the server is ready or the program is failed and terminated.

Desktop (please complete the following information):

  • OS: Linux (but test pass in local)
  • Browser: Safari

Additional context I added kFlag as a keyword for Eventually function to periodically (every 5s for 1 minute) check the status instead of fixing waiting time and now, it can pass the test at 25s. Note that, in the other e2e unit test (except Mac), the kepler is ready right after the first check (at 5s).

Eventually(keplerSession.Out, 30, 5).Should(gbytes.Say(kFlag))

kFlag commit: https://github.com/sustainable-computing-io/kepler/pull/225/commits/15b693b5a246955b6103522a35dcecb5f3de0e12 pass test result: https://github.com/sustainable-computing-io/kepler/actions/runs/3266241292/jobs/5369687391

p.s. I can separate kFlag commit for another PR if needed.

sunya-ch avatar Oct 17 '22 15:10 sunya-ch

As we discussed before, we don't need the string to say kepler failed, but to say it has started... And then the test will try the health check to see if the kepler is still alive... So, what we want then is a log that prints after the server is started

We can fix that in another PR....

marceloamaral avatar Oct 18 '22 02:10 marceloamaral

If we use eventually without reporting when the kepler failed and exits, it can wait forever (or until timeout without necessity).

sunya-ch avatar Oct 18 '22 02:10 sunya-ch

This should be fine, 30s is a very short timeout interval Making the logic more transparent is better for this case

marceloamaral avatar Oct 18 '22 02:10 marceloamaral

But we can check both, if it failed or succeeded

marceloamaral avatar Oct 18 '22 02:10 marceloamaral

I created the PR #307 to fix this

marceloamaral avatar Oct 18 '22 04:10 marceloamaral

@marceloamaral, @sunya-ch

I want to make it clear for case below: for kepler, we have a health check request and a general service request. I am agree that it will take some time for kepler ready to provide service. but during the data preparation, should the health check response or not?

SamYuan1990 avatar Oct 24 '22 12:10 SamYuan1990

In fact, PR #307 fixed this...

The health check should only start responding when kepler is fully initialized

marceloamaral avatar Oct 25 '22 02:10 marceloamaral