
Add support for monitors

Open sathnaga opened this issue 5 years ago • 11 comments

Let's add support for monitors to the framework. Monitors capture snapshots of system details by running user-provided commands at predefined intervals and storing the output in a file under test-reports, where it can be used for further processing later.

Usage:

The monitors file in the basepath documents how a user can create a monitor instance. Running the tests with --enable-monitors makes the framework start the monitor threads, which run in parallel with the tests and collect the command output. The monitor threads are stopped at the end of the tests, and an additional regular expression can be used to extract the useful information into the final output file.
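As a rough illustration of the idea (not the code in this PR), a monitor thread that re-runs a command at a fixed interval until it is told to stop could look like the sketch below. The class name CommandMonitor, its parameters, and the choice to run the command locally with subprocess are all assumptions made for the example; the framework itself runs the commands on the system under test.

```python
import subprocess
import threading


class CommandMonitor(threading.Thread):
    """Hypothetical monitor: re-runs `command` every `interval` seconds until stopped."""

    def __init__(self, command, interval, logfile):
        super().__init__(daemon=True)
        self.command = command
        self.interval = interval
        self.logfile = logfile
        self._stop_event = threading.Event()

    def run(self):
        with open(self.logfile, "a") as log:
            while not self._stop_event.is_set():
                # Assumption: the command runs locally; the real framework
                # would run it on the system under test instead.
                result = subprocess.run(
                    self.command, shell=True, capture_output=True, text=True
                )
                log.write(result.stdout)
                log.flush()
                # wait() returns early as soon as stop() sets the event.
                self._stop_event.wait(self.interval)

    def stop(self):
        self._stop_event.set()
        self.join()


if __name__ == "__main__":
    import time

    monitor = CommandMonitor("date", 1, "date_monitor.log")
    monitor.start()      # monitor runs in parallel with the "test"
    time.sleep(3)        # stand-in for the test run
    monitor.stop()       # stopped at the end of the test
```

Using threading.Event for the wait means stop() does not have to sit out a full interval before the thread exits.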

Signed-off-by: Satheesh Rajendran [email protected]

sathnaga avatar Sep 19 '19 07:09 sathnaga

Test Config:

#cat ci1.conf
[op-test]
bmc_type=FSP
bmc_ip=X.X.X.X
bmc_username=abc
bmc_password=xyz
bmc_passwordipmi=pass
host_ip=X.X.X.X
host_user=root
host_password=pass
host_cmd_timeout=60
host_cmd=sleep 20
machine_state=OS

Test: ./op-test -c ci1.conf --run testcases.RunHostTest.RunHostTest --enable-profiler

Test Output: profiler_test_result.txt

#cat profilers |grep -v ^#
date,2,sut,date1
vmstat 1,0,sut,test1
vmstat,3,sut
date,5,sut
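
The entries above appear to follow a command,interval,target[,name] layout, with the log name falling back to the command when no name is given. That reading is inferred from this sample and the log file names in the listing, not from the framework's documentation. A minimal Python sketch of parsing such entries under that assumption (MonitorEntry and parse_entries are hypothetical names, not part of op-test):

```python
from typing import NamedTuple


class MonitorEntry(NamedTuple):
    command: str   # command to run, e.g. "vmstat 1"
    interval: int  # second field of the entry (assumed to be seconds)
    target: str    # where to run it, e.g. "sut"
    name: str      # label used for the log file


def parse_entries(text):
    """Parse monitor entries, skipping blank lines and '#' comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(",")
        command, interval, target = fields[0], int(fields[1]), fields[2]
        # Assumption: with no explicit name, the command itself is used,
        # with spaces replaced by underscores.
        if len(fields) > 3 and fields[3]:
            name = fields[3]
        else:
            name = command.replace(" ", "_")
        entries.append(MonitorEntry(command, interval, target, name))
    return entries


SAMPLE = """\
# comment lines are skipped
date,2,sut,date1
vmstat 1,0,sut,test1
vmstat,3,sut
date,5,sut
"""

for entry in parse_entries(SAMPLE):
    print(entry)
```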

test-reports

$ ls -lrt
total 92
drwxrwxr-x. 2 satheesh satheesh  4096 Sep 19 12:54 host-results
-rw-rw-r--. 1 satheesh satheesh     0 Sep 19 12:54 console.log
-rw-rw-r--. 1 satheesh satheesh  1805 Sep 19 12:55 20190919125348.log
-rw-rw-r--. 1 satheesh satheesh  6653 Sep 19 12:55 20190919125407-test1.log
-rw-rw-r--. 1 satheesh satheesh 17874 Sep 19 12:57 20190919125407-vmstat.log
-rw-rw-r--. 1 satheesh satheesh  7860 Sep 19 12:57 20190919125407-date1.log
-rw-rw-r--. 1 satheesh satheesh   125 Sep 19 12:57 op-test-esel.Thu_Sep_19_12:55:02_2019
-rw-rw-r--. 1 satheesh satheesh  3038 Sep 19 12:57 20190919072348957229.main.log
-rw-rw-r--. 1 satheesh satheesh  4345 Sep 19 12:57 20190919125407-date.log
-rw-rw-r--. 1 satheesh satheesh 24744 Sep 19 12:57 20190919072348957769.debug.log
[satheesh@sathnaga86 latest]$ 

sathnaga avatar Sep 19 '19 07:09 sathnaga

I like the idea, but can you rename this to something like "monitoring jobs" or similar? It's not really clear what "profiler" is supposed to mean in the context of op-test.

oohal avatar Nov 28 '19 04:11 oohal

How about system_monitors?

sathnaga avatar Nov 28 '19 06:11 sathnaga

That works I guess. It's a little redundant since there's not much else we'd be monitoring, but I'm not that fussed.

oohal avatar Nov 28 '19 06:11 oohal

Yeah, right, monitors alone would suffice. I have a few more improvements as well; I'll make those changes too and send them along.

sathnaga avatar Nov 28 '19 09:11 sathnaga

test pending...

sathnaga avatar Dec 09 '19 08:12 sathnaga

result: $ ./op-test -c ltcalpine-lp8.conf --run testcases.RunHostTest.RunHostTest --enable-monitors

...
...
[console-expect]#which whoami && whoami
/usr/bin/whoami
root
[console-expect]#echo $?
echo $?
0
[console-expect]#stress-ng --cpu 80 --timeout 20s
stress-ng --cpu 80 --timeout 20s
stress-ng: info:  [3159] dispatching hogs: 80 cpu
stress-ng: info:  [3159] cache allocate: using defaults, can't determine cache details from sysfs
stress-ng: info:  [3159] successful run completed in 21.40s
[console-expect]#echo $?
echo $?
0
ok

----------------------------------------------------------------------
Ran 1 test in 46.821s

OK
2019-12-09 13:42:36,160:op-test:<module>:INFO:Exit with Result errors="0" and failures="0"
2019-12-09 13:42:36,161:op-test.common.OpTestUtil:cleanup:INFO:OpTestSystem Starting to Gather ESEL's
2019-12-09 13:42:36,162:op-test.common.OpTestUtil:dump_versions:INFO:Log Location: /home/satheesh/data/gits/github/op-test-framework/test-reports/test-run-20191209134012/*debug*
2019-12-09 13:42:36,162:op-test.common.OpTestUtil:dump_versions:INFO:
----------------------------------------------------------
OpTestSystem Firmware Versions Tested
(if flashed things like skiboot.lid, may not be accurate)
----------------------------------------------------------
Firmware Versions Unavailable
----------------------------------------------------------
----------------------------------------------------------

2019-12-09 13:42:36,163:op-test.common.OpTestMonitor:stop:INFO:Stopping monitor lparstat_1

$ cat monitors |grep -v ^#
lparstat 1,2,sut,,.*---\n([(\d+.\d+)\s+]+),

$ ls -lrt test-reports/latest/

total 48
drwxrwxr-x. 2 satheesh satheesh  4096 Dec  9 13:41 host-results
-rw-rw-r--. 1 satheesh satheesh     0 Dec  9 13:41 console.log
-rw-rw-r--. 1 satheesh satheesh  1118 Dec  9 13:42 20191209134012.log
-rw-rw-r--. 1 satheesh satheesh  4853 Dec  9 13:42 20191209134147-lparstat_1.log
-rw-rw-r--. 1 satheesh satheesh  3523 Dec  9 13:42 20191209081012516995.main.log
-rw-rw-r--. 1 satheesh satheesh   761 Dec  9 13:42 lparstat_1
-rw-rw-r--. 1 satheesh satheesh 23647 Dec  9 13:43 20191209081012517512.debug.log

$ tail -10 test-reports/latest/20191209134147-lparstat_1.log

System Configuration
type=Shared mode=Uncapped smt=8 lcpu=2 mem=51301376 kB cpus=47 ent=2.00 

%user  %sys %wait    %idle    physc %entc lbusy  vcsw phint
----- ----- -----    -----    ----- ----- ----- ----- -----
 0.12  0.00  0.00    99.88 0.024454 1.222700  0.12 1079279773     0
[console-expect]#echo $?
echo $?
0

$ tail -10 test-reports/latest/lparstat_1

 0.00  0.00  0.00   100.00 0.016303 0.815150  0.00 1079273755     0
 0.00  0.19  0.00    99.81 0.032600 1.630000  0.19 1079274873     0
 0.06  0.19  0.00    99.75 0.065200 3.260000  0.25 1079277417     0
 0.06  0.00  0.00    99.94 0.024453 1.222650  0.06 1079278867     0
99.94  0.06  0.00     0.00 1.993052 99.652600 100.00 1079279218     0
99.94  0.06  0.00     0.00 1.999028 99.951400 100.00 1079279218     0
99.89  0.11  0.00     0.00 2.004106 100.205300 100.00 1079279218     0
99.94  0.06  0.00     0.00 2.002501 100.125050 100.00 1079279218     0
99.95  0.05  0.00     0.00 2.004444 100.222200 100.00 1079279218     0
 0.12  0.00  0.00    99.88 0.024454 1.222700  0.12 1079279773     0
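
To illustrate how the regular expression in the lparstat entry could reduce the raw log to the data rows shown here, the sketch below applies that pattern with Python's re module to a short excerpt of the raw log. How the framework itself applies the pattern is not shown in this thread, so extract_samples is a hypothetical helper and only one plausible reading.

```python
import re

# Pattern copied from the lparstat monitor entry: skip to the end of the
# dashed header line, then capture the run of numbers that follows it.
PATTERN = re.compile(r".*---\n([(\d+.\d+)\s+]+)")

# Excerpt of the raw monitor log shown above.
RAW_LOG = (
    "%user  %sys %wait    %idle    physc %entc lbusy  vcsw phint\n"
    "----- ----- -----    -----    ----- ----- ----- ----- -----\n"
    " 0.12  0.00  0.00    99.88 0.024454 1.222700  0.12 1079279773     0\n"
    "[console-expect]#echo $?\n"
)


def extract_samples(raw):
    """Return one list of column values per match of PATTERN."""
    return [match.group(1).split() for match in PATTERN.finditer(raw)]


print(extract_samples(RAW_LOG))
# [['0.12', '0.00', '0.00', '99.88', '0.024454', '1.222700', '0.12', '1079279773', '0']]
```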

sathnaga avatar Dec 09 '19 08:12 sathnaga

@oohal Have addressed all comments, pls merge if no further comments, Tnx!

sathnaga avatar Dec 11 '19 11:12 sathnaga

rebased...

sathnaga avatar Dec 20 '19 04:12 sathnaga

@oohal pls help merge if no further comments, tnx!

sathnaga avatar May 04 '20 12:05 sathnaga