coreos-assembler icon indicating copy to clipboard operation
coreos-assembler copied to clipboard

`kola testiso` tests should check for badness in console/journal output

Open jlebon opened this issue 1 year ago • 5 comments

Based on the report in https://github.com/coreos/fedora-coreos-tracker/issues/1722, this is not currently happening, but should.

(This is another thing we should get for free if we fold kola testiso tests into kola run.)

jlebon avatar May 01 '24 17:05 jlebon

If it's not pressing, I would love to look into this when I'm done with the issues I am currently working on.

c4rt0 avatar May 07 '24 10:05 c4rt0

@c4rt0 folding testiso into regular tests is on my radar let's collab on this if you want !

jbtrystram avatar May 15 '24 07:05 jbtrystram

@c4rt0 folding testiso into regular tests is on my radar let's collab on this if you want !

Great idea, I'd appreciate that.

c4rt0 avatar May 15 '24 10:05 c4rt0

I have a question. In kola/testiso.go we have the result
which simply checks if there was an error while writing test results to output. Going further, the reporter sends FAIL if atLeastOneFailed. 
When comparing this to kola.go I see that this functionality is much more expanded. Once tests fail we implement runRerun, with even more checks within that function. At first I started implementing something like cmdRerun (https://github.com/coreos/coreos-assembler/blob/main/mantle/cmd/kola/kola.go#L157) but that is obviously outside of spec of this particular issue. My silly question is: what exactly do you mean by checking badness in console/journal output @jlebon? Or maybe I should attempt implementing something like the mentioned above cmdRerun, simillar to the one in kola.go?

c4rt0 avatar Aug 06 '24 16:08 c4rt0

Specifically what we want is two bits from kola:

  1. calling CheckConsole() on the console output.
  2. calling CheckMachine() on the system. That one is trickier because it takes a Machine, which we don't have here. That's one thing that'd get transparently fixed once we make the testiso tests regular tests. Kinda yucky, but one thing we could do for now is to have CheckMachine() take a closure instead for running ssh commands to the node.

jlebon avatar Aug 07 '24 21:08 jlebon

Specifically what we want is two bits from kola:

1. calling [`CheckConsole()`](https://github.com/coreos/coreos-assembler/blob/7ebc623d824bb5543754c25017dac6b40d5175af/mantle/kola/harness.go#L1920-L1953) on the console output.

I feel like the above is now covered with this PR.

2. calling [`CheckMachine()`](https://github.com/coreos/coreos-assembler/blob/7ebc623d824bb5543754c25017dac6b40d5175af/mantle/platform/platform.go#L449) on the system. That one is trickier because it takes a `Machine`, which we don't have here.

Just to make sure I'm understanding this correctly @jlebon: Passing m platform.Machine into awaitCompletion will make me change all of the function parameters which call it. Sorry for asking, but is that what you mean (I prefer to ask first rather than wasting time while working on something else than you meant), or should I rather think of something else?

Kinda yucky, but one thing we could do for now is to have CheckMachine() take a closure instead for running ssh commands to the node.

I am a little lost here. With the mentioned above PR, the CheckConsole is called in the awaitCompletion - which is executed while tests are complete; It returns time and error. What Im confused about is the part with ssh commands to the node. Do you mean that I should move partof the awaitCompletion into the CheckMachine now, and then make that return the same time and error which awaitCompletion does? This somehow does not make sense to me, hence asking 😕

c4rt0 avatar Nov 11 '24 15:11 c4rt0

I'd say let's not worry about this for now and we'll get the checkMachine bits when (if) we fold testiso into the regular kola tests.

We can open a new issue for checkMachine to track that, but the console checks are done with https://github.com/coreos/coreos-assembler/pull/3895

dustymabe avatar Nov 12 '24 14:11 dustymabe

Awesome, thank you.

c4rt0 avatar Nov 12 '24 14:11 c4rt0

I'd say let's not worry about this for now and we'll get the checkMachine bits when (if) we fold testiso into the regular kola tests.

We can open a new issue for checkMachine to track that, but the console checks are done with #3895

Since the badness is now checked with CheckConsole on both console/journal output I am closing this one as complete. As per comment above, we will open a new issue for the checkMachine when working on folding testiso tests into regular kola tests.

c4rt0 avatar Nov 12 '24 14:11 c4rt0