Feature: dmesg-like test check plugin using the journal

Open dennisbrendel opened this issue 1 year ago • 1 comments

There are tests out there that as part of their logic clear the kernel message ring buffer. This could render the dmesg check plugin useless, because it only checks the delta between test start and test end. In addition it's not always desired that the ring buffer is cleared when using the dmesg check plugin.

A good way to not be intrusive and get everything that was logged during test execution is getting the information from the journal, if the system has journald running.

An alternative to that is to watch and parse /dev/kmsg, but that means re-implementing partially what dmesg or journald do, so I think it's not the desired way.

dennisbrendel avatar Oct 08 '24 10:10 dennisbrendel

> There are tests out there that as part of their logic clear the kernel message ring buffer. This could render the dmesg check plugin useless, because it only checks the delta between test start and test end.

I'm afraid cases like this will be hard to fix in general: as long as a test messes with what a check observes, there will always be room for misunderstanding and unexpected outcomes. Reading your comments below, there seems to be a way to avoid the problem in this case, but "you're on your own" might be a valid answer.

> In addition it's not always desired that the ring buffer is cleared when using the dmesg check plugin.

We can add a key to control this behavior. The clearing is there to limit the scope of saved data to the duration of a single test.

> A good way to not be intrusive and get everything that was logged during test execution is getting the information from the journal, if the system has journald running.

The journal could be the first option, but without a running journald, the dmesg check may very well fall back to the current implementation.
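A minimal sketch of that fallback, for illustration only. The function names `journald_available` and `choose_source` are hypothetical and not part of tmt, and probing the service with `systemctl is-active` is just one possible way to detect a running journald:

```python
import shutil
import subprocess


def journald_available() -> bool:
    """Best-effort probe: are the tools present and is journald active?"""
    if shutil.which("journalctl") is None or shutil.which("systemctl") is None:
        return False
    # `systemctl is-active --quiet UNIT` exits with 0 only when the unit is active.
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", "systemd-journald.service"],
        check=False,
    )
    return result.returncode == 0


def choose_source(journald_running: bool) -> str:
    """Prefer the journal; otherwise keep the current dmesg-based behavior."""
    return "journal" if journald_running else "dmesg"
```

A caller could then use `choose_source(journald_available())` to decide which implementation to run on the guest.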

> An alternative to that is to watch and parse /dev/kmsg, but that means re-implementing partially what dmesg or journald do, so I think it's not the desired way.

+1, definitely not what I would like to work on.

happz avatar Oct 08 '24 10:10 happz

This would be a problem if we fail the test based on what journalctl provides. Some tests create some of the bad kernel conditions we filter for, and I believe that is the reason they clear dmesg after the test completes, so that Beaker does not flag the test as a failure.

However, we still need this information, so maybe it should not be part of a check but something like a check that is run for each test. Maybe an info module gets created. We can then run journalctl with time constraints for each test.

So in the end, run it with these parameters:

  -S --since=DATE            Show entries not older than the specified date
  -U --until=DATE            Show entries not newer than the specified date

We would also need to make sure we do not run this with -k, since -k implies -b (the current boot only): some tests reboot, and we would only capture information from the final boot.
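A rough sketch of how such a per-test invocation could be assembled. The helper name `build_journalctl_command` is hypothetical, and using the `_TRANSPORT=kernel` match (the match that -k adds, but here without the implied -b) to limit output to kernel messages is an assumption about what the plugin would want, not something decided in this thread:

```python
from datetime import datetime


def build_journalctl_command(since: datetime, until: datetime) -> list[str]:
    """Build a journalctl command limited to the test's time window."""
    # journalctl accepts timestamps in the "YYYY-MM-DD HH:MM:SS" form.
    fmt = "%Y-%m-%d %H:%M:%S"
    return [
        "journalctl",
        "--no-pager",
        f"--since={since.strftime(fmt)}",
        f"--until={until.strftime(fmt)}",
        # Kernel messages only, but across all boots (no -k, so no implied -b).
        "_TRANSPORT=kernel",
    ]


# Example: a test that ran for five minutes.
command = build_journalctl_command(
    datetime(2024, 10, 24, 14, 0, 0),
    datetime(2024, 10, 24, 14, 5, 0),
)
```

Messages from earlier boots are only available if the journal is stored persistently, so a test that reboots would still lose them on a system with a volatile journal.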

sbertramrh avatar Oct 24 '24 14:10 sbertramrh

This is an essential feature escalated by RHIVOS; proposing it with must priority for the next release.

psss avatar Jul 16 '25 12:07 psss