scaphandre icon indicating copy to clipboard operation
scaphandre copied to clipboard

Qemu exporter always outputs 0W

Open gpgn opened this issue 3 years ago • 3 comments

Bug description

Following https://hubblo-org.github.io/scaphandre-documentation/references/exporter-qemu.html, I first try stdout exporter on the host which seems to successfully retrieve rapl metrics:

sudo RUST_BACKTRACE=full ./target/debug/scaphandre stdout
Scaphandre stdout exporter
Sending ⚡ metrics
Measurement step is: 2s
Host:   0 W
        package         core            dram            uncore
Top 5 consumers:
Power           PID     Exe
No processes found yet or filter returns no value.
------------------------------------------------------------

Host:   6.279447 W
        package         core            dram            uncore
Socket0 6.279447 W |    5.607462 W      1.159405 W      0 W

Top 5 consumers:
Power           PID     Exe
0.080505 W      91      "ksmd"
0.080505 W      1288    "NetworkManager"
0.080505 W      1294    "iio-sensor-prox"
0.080505 W      1329    "thermald"
0 W     1       "systemd"
------------------------------------------------------------

However, when running qemu exporter, it does seem to run w/o issues, but 0W is written.

On the host:

sudo RUST_BACKTRACE=full ./target/debug/scaphandre qemu  
Scaphandre qemu exporter
Sending ⚡ metrics

Inside the VM (it does pick up on the processes running inside the VM):

vagrant@cluster-0-master-0:~/git/scaphandre$ sudo RUST_BACKTRACE=1 ./target/debug/scaphandre --vm stdout -t 15
scaphandre::sensors::powercap_rapl: Powercap_rapl path is: /var/scaphandre
Scaphandre stdout exporter
Sending ⚡ metrics
Measurement step is: 2s
Host:   0 W
Top 5 consumers:
Power           PID     Exe
No processes found yet or filter returns no value.
------------------------------------------------------------

Host:   0 W
Socket0 ---- W |

Top 5 consumers:
Power           PID     Exe
0 W     2065    "scaphandre"
0 W     1502    "containerd"
0 W     1       "systemd"
0 W     2       "kthreadd"
0 W     3       "rcu_gp"

Mount is there, but energy_uj always 0:

vagrant@cluster-0-master-0:~/git/scaphandre$ cat /var/scaphandre/intel-rapl\:0/energy_uj 
0

To Reproduce

I can recreate a simple Vagrantfile if it is helpful, but Im hoping I just missed a step somewhere.

Expected behavior

Power output showing values >0W.

Environment

On the host:

Ubuntu 20.04.4 LTS
5.15.0-41-generic
Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz

Using:

Vagrant 2.2.19
generic/ubuntu2004 (libvirt, 4.1.0)
(kvm) QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.23)

In the VM:

Ubuntu 20.04.4 LTS
5.4.0-122-generic

gpgn avatar Jul 25 '22 14:07 gpgn

qemu exporter output, I guess it does retrieve rapl correctly (going by the topo uj diff), but ratio is 0 (when evaluating 7/404? given Time_pdiff=7 time_tdiff=404):

path: /var/lib/libvirt/scaphandre
Got topo uj diff: Record { timestamp: 1658761502.873728237s, value: "23439820", unit: MicroJoule }
Got 465 processes to filter.
Found a process with /usr/bin/qemu-system-x86_64
Number of filtered qemu processes: 1
Working on [ProcessRecord { process: Process { pid: 15167, stat: Stat { _private: (), pid: 15167, comm: "qemu-system-x86", state: 'S', ppid: 1, pgrp: 15166, session: 15166, tty_nr: 0, tpgid: -1, flags: 138412416, minflt: 35210, cminflt: 0, majflt: 1, cmajflt: 0, utime: 13453, stime: 2050, cutime: 0, cstime: 0, priority: 20, nice: 0, num_threads: 7, itrealvalue: 0, starttime: 161143, vsize: 3777171456, rss: 332498, rsslim: 18446744073709551615, startcode: 94247876923392, endcode: 94247882338253, startstack: 140726210869440, kstkesp: 0, kstkeip: 0, signal: 0, blocked: 268444224, sigignore: 4096, sigcatch: 16963, wchan: 0, nswap: 0, cnswap: 0, exit_signal: Some(17), processor: Some(11), rt_priority: Some(0), policy: Some(0), delayacct_blkio_ticks: Some(0), guest_time: Some(9940), cguest_time: Some(0), start_data: Some(94247886257888), end_data: Some(94247889329520), start_brk: Some(94247912988672), arg_start: Some(140726210876573), arg_end: Some(140726210879061), env_start: Some(140726210879061), env_end: Some(140726210879452), exit_code: Some(0) }, owner: 64055, root: "/proc/15167" }, timestamp: 1658761502.926784707s }, ProcessRecord { process: Process { pid: 15167, stat: Stat { _private: (), pid: 15167, comm: "qemu-system-x86", state: 'S', ppid: 1, pgrp: 15166, session: 15166, tty_nr: 0, tpgid: -1, flags: 138412416, minflt: 35120, cminflt: 0, majflt: 1, cmajflt: 0, utime: 13447, stime: 2049, cutime: 0, cstime: 0, priority: 20, nice: 0, num_threads: 6, itrealvalue: 0, starttime: 161143, vsize: 3777171456, rss: 332498, rsslim: 18446744073709551615, startcode: 94247876923392, endcode: 94247882338253, startstack: 140726210869440, kstkesp: 0, kstkeip: 0, signal: 0, blocked: 268444224, sigignore: 4096, sigcatch: 16963, wchan: 0, nswap: 0, cnswap: 0, exit_signal: Some(17), processor: Some(11), rt_priority: Some(0), policy: Some(0), delayacct_blkio_ticks: Some(0), guest_time: Some(9935), cguest_time: Some(0), start_data: Some(94247886257888), end_data: Some(94247889329520), start_brk: Some(94247912988672), arg_start: Some(140726210876573), arg_end: Some(140726210879061), env_start: Some(140726210879061), env_end: Some(140726210879452), exit_code: Some(0) }, owner: 64055, root: "/proc/15167" }, timestamp: 1658761497.871368763s }, ProcessRecord { process: Process { pid: 15167, stat: Stat { _private: (), pid: 15167, comm: "qemu-system-x86", state: 'S', ppid: 1, pgrp: 15166, session: 15166, tty_nr: 0, tpgid: -1, flags: 138412416, minflt: 35120, cminflt: 0, majflt: 1, cmajflt: 0, utime: 13442, stime: 2048, cutime: 0, cstime: 0, priority: 20, nice: 0, num_threads: 6, itrealvalue: 0, starttime: 161143, vsize: 3777171456, rss: 332498, rsslim: 18446744073709551615, startcode: 94247876923392, endcode: 94247882338253, startstack: 140726210869440, kstkesp: 0, kstkeip: 0, signal: 0, blocked: 268444224, sigignore: 4096, sigcatch: 16963, wchan: 0, nswap: 0, cnswap: 0, exit_signal: Some(17), processor: Some(11), rt_priority: Some(0), policy: Some(0), delayacct_blkio_ticks: Some(0), guest_time: Some(9931), cguest_time: Some(0), start_data: Some(94247886257888), end_data: Some(94247889329520), start_brk: Some(94247912988672), arg_start: Some(140726210876573), arg_end: Some(140726210879061), env_start: Some(140726210879061), env_end: Some(140726210879452), exit_code: Some(0) }, owner: 64055, root: "/proc/15167" }, timestamp: 1658761492.820526859s }]
Time_pdiff=7 time_tdiff=404
Ratio is 0
Adding 0 uJ
()
Updated /var/lib/libvirt/scaphandre/scaphandre-test_cluster-0-master-0/intel-rapl:0
...

gpgn avatar Jul 25 '22 15:07 gpgn

Converting the division here to floats worked, but Im unsure if that is following correct power calculation logic (or Rust practices as I don't know the language). https://github.com/hubblo-org/scaphandre/blob/main/src/exporters/qemu.rs#L84

let ratio = (time_pdiff as f64) / (tdiff as f64);

Now I see in the VM:

Host:   1.861659 W
Socket0 1.861659 W |

Top 5 consumers:
Power           PID     Exe
1.861659 W      2102    "scaphandre"
0.169241 W      1502    "containerd"
0 W     1       "systemd"
0 W     2       "kthreadd"
0 W     3       "rcu_gp"

gpgn avatar Jul 25 '22 15:07 gpgn

Had the same issue and thought about the same fix in Line 84.

However, I don't thinkt that the results displays the consumption in Watts but instead returns the consumed power in uJ between two records, though I'm also not quite sure about it

philippHecht98 avatar Nov 03 '22 11:11 philippHecht98