Kernel aborts / panics detection

Open juk opened this issue 3 years ago • 5 comments

TMT should be able to detect kernel aborts and panics and report them as a test result. This could be done by analyzing console logs.

There should also be a way to run tests that ignores kernel panics, so that a panic is not reported as a failure. This is necessary for kernel tests that trigger panics intentionally.
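
To make the request concrete, here is a minimal sketch of console-log analysis in shell. It is not how tmt works; the function names, the pattern list, and the result labels are all hypothetical, and the patterns cover only a few common kernel messages.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the console log contains a panic or oops.
# The pattern list is an assumption and is far from exhaustive.
console_log_has_panic() {
    local log_file="$1"
    grep -Eq 'Kernel panic - not syncing|BUG: unable to handle|Oops: ' "$log_file"
}

# Hypothetical wrapper: turn the scan into a label.
# The second argument ("yes"/"no") says whether panics should be ignored,
# which is what tests that panic on purpose would need.
classify_console_log() {
    local log_file="$1"
    local ignore_panic="${2:-no}"
    if ! console_log_has_panic "$log_file"; then
        echo "clean"
    elif [ "$ignore_panic" = "yes" ]; then
        echo "panic-ignored"
    else
        echo "panic-detected"
    fi
}
```

A caller could run `classify_console_log console.log` for ordinary tests and `classify_console_log console.log yes` for tests that panic intentionally.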

juk avatar Jun 07 '22 12:06 juk

Thanks for filing this. Link to downstream issue for Testing Farm: https://issues.redhat.com/browse/TFT-891

thrix avatar Jun 07 '22 12:06 thrix

The kernel can be configured so that a panic is followed by a reboot. This essentially means that once TMT can gracefully handle an unexpected/hard reboot, it will be able to handle kernel panics too. I would also find this useful for the FIPS testing we are doing.
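
For reference, the reboot-after-panic behaviour is controlled by the `kernel.panic` sysctl, which holds the number of seconds to wait before rebooting (0 means no automatic reboot). A minimal sketch, assuming a sysctl.d drop-in is acceptable on the guest; the function name, the default timeout, and the file name are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: write a sysctl drop-in that makes the kernel
# reboot a given number of seconds after a panic.
# The default path and the 10-second timeout are assumptions.
write_panic_reboot_conf() {
    local timeout="${1:-10}"
    local conf_file="${2:-/etc/sysctl.d/90-panic-reboot.conf}"
    printf 'kernel.panic = %s\n' "$timeout" > "$conf_file"
}
```

Calling `write_panic_reboot_conf 10` as root and then running `sysctl --system` would apply the setting; it could also be placed in a prepare step.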

The-Mule avatar Sep 13 '23 09:09 The-Mule

I noticed that tmt-1.34 can restart the test after I trigger a kernel panic. I can confirm that coiby/kdump-tests (Kdump tmt tests) has passed 10 consecutive times:

# tmt run tests discover provision -h virtual -c system prepare execute report finish 
/var/tmp/tmt/run-017

/plans/kdump
    discover
        how: fmf
        name: client-setup
        directory: /root/kdump-tests
        tests: /setup/kdump
        how: fmf
        name: server-setup
...
        execute task #4: server-test on server
        how: tmt

    
        summary: 4 tests executed
    report
        how: display
        summary: 4 tests passed
    finish

coiby avatar Jul 19 '24 06:07 coiby

@coiby that sounds nice, thanks for testing it!

@The-Mule @juk would you mind spending some time on checking this feature, whether it's good enough for your use cases too?

happz avatar Jul 20 '24 13:07 happz

@happz You are welcome! Btw, I forgot to mention that I use tmt-reboot -c "echo c > /proc/sysrq-trigger" (instead of rlRun "echo c > /proc/sysrq-trigger") to trigger the kernel panic. If I use rlRun "echo c > /proc/sysrq-trigger", tmt somehow fails due to an error from beakerlib:

# tmt run tests discover provision -h virtual -c system prepare execute report finish 
                        ...
                        kdump: Starting kdump: [OK]
                        :: [ 10:08:20 ] :: [   PASS   ] :: Command 'kdumpctl restart' (Expected 0, got 0)
                        :: [ 10:08:20 ] :: [  BEGIN   ] :: Running 'echo 1 > /proc/sys/kernel/sysrq'
                        :: [ 10:08:20 ] :: [   PASS   ] :: Command 'echo 1 > /proc/sys/kernel/sysrq' (Expected 0, got 0)
                        client_loop: send disconnect: Broken pipe
                    # the errr could also be 00:00:28 errr /client-test/tests/client (on client) (beakerlib: State 'incomplete') [1/1]
                    00:00:28 errr /client-test/tests/client (on client) (beakerlib: State 'started') [1/1]
                    journal.txt: /var/tmp/tmt/run-035/plans/kdump/execute/data/guest/client/client-test/tests/client-4/journal.txt
            summary: 4 tests passed and 1 error
# echo $?
2
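
The working variant can be sketched as a test script fragment. This is an illustration only, not the code from kdump-tests: the function names are hypothetical, the verification step is a placeholder, and it relies on tmt exporting `TMT_REBOOT_COUNT` to tell the first run from the run after the reboot.

```shell
#!/bin/bash
# Hypothetical helper: decide which phase of the test to run.
# TMT_REBOOT_COUNT is 0 on the first run and is incremented after each
# reboot requested through tmt-reboot.
panic_test_step() {
    if [ "${TMT_REBOOT_COUNT:-0}" -eq 0 ]; then
        echo "trigger"
    else
        echo "verify"
    fi
}

# Hypothetical entry point: call this from the test body.
run_panic_test() {
    if [ "$(panic_test_step)" = "trigger" ]; then
        # Enable sysrq, then let tmt-reboot issue the crash so that tmt
        # expects the connection to drop.
        echo 1 > /proc/sys/kernel/sysrq
        tmt-reboot -c "echo c > /proc/sysrq-trigger"
    else
        # Placeholder: real checks (for example, that a vmcore was saved)
        # would go here.
        echo "back after panic, running checks"
    fi
}
```

The point of routing the crash through tmt-reboot is that tmt then treats the lost connection as an expected reboot, whereas with rlRun it surfaces as a beakerlib error.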

coiby avatar Jul 20 '24 23:07 coiby

#3284 has been created to track the issue that tmt will fail unless the kernel panic is triggered by tmt-reboot.

coiby avatar Oct 14 '24 10:10 coiby

This should be covered by the restart key:

  • https://tmt.readthedocs.io/en/stable/spec/tests.html#restart
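
As a rough illustration, test metadata using the restart keys might be generated like this. The key names and values are written from memory of the linked spec and should be checked against it; the helper name, the summary, the `test.sh` script, and the choice of exit code 255 (what ssh returns when the connection is lost) are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: write a main.fmf that asks tmt to restart the test
# when it ends with the given exit code. Verify the key names against the
# restart section of the tmt test specification before relying on them.
write_restart_metadata() {
    local test_dir="$1"
    cat > "$test_dir/main.fmf" <<'EOF'
summary: Trigger a kernel panic and verify the system afterwards
test: ./test.sh
restart-on-exit-code: 255
restart-max-count: 1
restart-with-reboot: true
EOF
}
```

Running `write_restart_metadata .` in the test directory would produce the file; whether `restart-with-reboot` is needed depends on whether the guest already reboots on its own after the panic.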

Closing. Feel free to reopen if there are still any gaps.

psss avatar Oct 03 '25 06:10 psss