meta-balena
Investigate cpu performance
It would be nice to investigate CPU/IO/memory/network performance on devices and compare it against standard distributions. There may be overhead on our side that we can reduce.
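As a rough sketch, comparable numbers could be gathered with something like the following sysbench 0.4.x invocations (the same tool used further down; the file size and test mode here are just placeholder choices), run on both balenaOS and a stock distribution on the same hardware:

# CPU: time to check primes up to 10000
sysbench --test=cpu run

# Memory: sequential memory throughput
sysbench --test=memory run

# Disk IO: random read/write over a small set of test files
sysbench --test=fileio --file-total-size=256M prepare
sysbench --test=fileio --file-total-size=256M --file-test-mode=rndrw run
sysbench --test=fileio --file-total-size=256M cleanup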
pi0w, no app container, 2.29.2
Even after 5 minutes, the supervisor and the balena daemon seem to be doing something:
Mem: 254004K used, 237460K free, 3864K shrd, 19484K buff, 100412K cached
CPU: 87% usr 12% sys 0% nic 0% idle 0% io 0% irq 0% sirq
Load average: 2.07 2.24 1.62 2/143 1510
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
1081 1069 root R 154m 32% 62% node /usr/src/app/dist/app.js
782 1 root S 873m 182% 25% /usr/bin/balenad --experimental --log-driver=journald -s aufs -H fd:// -H unix:///var/run/balena.sock -H unix:///var/run/balena-engine.sock -H
647 1 root S 9868 2% 5% @sbin/plymouthd --tty=tty1 --mode=boot --pid-file=/run/plymouth/pid --attach-to-session --kernel-command-line=plymouth.ignore-serial-consoles s
1487 666 root R 2992 1% 4% top
780 1 root S 857m 178% 1% /usr/bin/balenad --delta-data-root=/mnt/sysroot/active/balena --delta-storage-driver=aufs --log-driver=journald -s aufs --data-root=/mnt/sysroo
848 782 root S 856m 178% 1% balena-engine-containerd --config /var/run/balena-engine/containerd/containerd.toml
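To attribute that load to services rather than individual processes, something like systemd-cgtop could help here, assuming it is shipped in the image (the busybox top used above also has a batch mode for logging):

# per-cgroup CPU/memory usage, 5 samples at 2-second intervals
systemd-cgtop -d 2 -n 5

# busybox top in batch mode, 3 iterations
top -b -n 3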
After pushing a simple Raspbian container that just sleeps, CPU usage has settled down. plymouthd seems to be eating some unnecessary CPU cycles:
Mem: 400416K used, 91048K free, 3960K shrd, 30104K buff, 221028K cached
CPU: 4% usr 4% sys 0% nic 90% idle 0% io 0% irq 0% sirq
Load average: 2.55 3.25 2.58 1/154 1831
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
647 1 root S 9868 2% 4% @sbin/plymouthd --tty=tty1 --mode=boot --pid-file=/run/plymouth/pid --attach-to-session --kernel-command-line=plymouth.ignore-serial-consoles s
1831 666 root R 2992 1% 2% top
782 1 root S 887m 185% 1% /usr/bin/balenad --experimental --log-driver=journald -s aufs -H fd:// -H unix:///var/run/balena.sock -H unix:///var/run/balena-engine.sock -H
848 782 root S 865m 180% 1% balena-engine-containerd --config /var/run/balena-engine/containerd/containerd.toml
842 780 root S 865m 180% 1% balena-engine-containerd --config /var/run/balena-host/containerd/containerd.toml
780 1 root S 857m 178% 1% /usr/bin/balenad --delta-data-root=/mnt/sysroot/active/balena --delta-storage-driver=aufs --log-driver=journald -s aufs --data-root=/mnt/sysroo
1081 1069 root S 140m 29% 1% node /usr/src/app/dist/app.js
1486 2 root IW 0 0% 1% [kworker/0:0]
1376 2 root IW 0 0% 1% [kworker/0:3]
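Regarding plymouthd: the boot splash should not still be consuming CPU this long after boot, so it may be worth checking whether its unit ever exits and asking it to quit (the unit name is an assumption and may differ on balenaOS):

systemctl status plymouth-start.service
plymouth quit                           # ask the splash daemon to exit
systemctl stop plymouth-start.service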
Inside an rpi-raspbian container, after apt-get update && apt-get install sysbench:
root@62d0b9a:/usr/src/app# sysbench --test=cpu run
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 1
Doing CPU performance benchmark
Threads started!
Done.
Maximum prime number checked in CPU test: 10000
Test execution summary:
total time: 264.7861s
total number of events: 10000
total time taken by event execution: 264.7184
per-request statistics:
min: 22.91ms
avg: 26.47ms
max: 121.75ms
approx. 95 percentile: 38.18ms
Threads fairness:
events (avg/stddev): 10000.0000/0.00
execution time (avg/stddev): 264.7184/0.00
root@62d0b9a:/usr/src/app#
On raspbian lite,
pi@raspberrypi:~$ uname -a
Linux raspberrypi 4.14.79+ #1159 Sun Nov 4 17:28:08 GMT 2018 armv6l GNU/Linux
pi@raspberrypi:~$ cat /etc/rpi-issue
Raspberry Pi reference 2018-11-13
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 7e0c786c641ba15990b5662f092c106beed40c9f, stage2
pi@raspberrypi:~$ sysbench --test=cpu run
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 1
Doing CPU performance benchmark
Threads started!
Done.
Maximum prime number checked in CPU test: 10000
Test execution summary:
total time: 228.4990s
total number of events: 10000
total time taken by event execution: 228.4667
per-request statistics:
min: 22.76ms
avg: 22.85ms
max: 33.24ms
approx. 95 percentile: 22.96ms
Threads fairness:
events (avg/stddev): 10000.0000/0.00
execution time (avg/stddev): 228.4667/0.00
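Comparing the two runs: total time is 264.79 s on balenaOS vs 228.50 s on Raspbian Lite, i.e. roughly (264.79 - 228.50) / 228.50 ≈ 16% slower. The minimum per-request time is almost identical (22.91 ms vs 22.76 ms) while the maximum is much higher (121.75 ms vs 33.24 ms), which points at intermittent background load on the single core rather than a uniformly slower CPU.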
So CPU load is unusually high on my pi0 on balenaOS:
15:09:46 up 0:15, 1 user, load average: 0.78, 0.84, 0.66
It's sensible on the pi0 on Raspbian Lite:
15:16:54 up 21:29, 2 users, load average: 0.07, 0.19, 0.16
I have a docker container running a sleep on both.
I'm going to try bringing down various services in the OS to see what is causing it.
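A rough way to do that elimination (unit names are guesses and will differ per image, so list them first):

systemctl list-units --type=service --state=running
systemctl stop plymouth-start.service   # example candidate
sleep 300
uptime                                  # did the load average drop?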
note to self: check out health monitoring using netdata.
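For the netdata idea, a minimal sketch based on netdata's documented Docker usage (on balenaOS the engine is invoked as balena rather than docker, and the mounts/capabilities may need adjusting):

balena run -d --name=netdata -p 19999:19999 \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  --cap-add SYS_PTRACE \
  netdata/netdata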
[jakogut] This issue has an attached support thread: https://jel.ly.fish/d8c3be4e-68a1-4943-9a8a-e27f9e5bd26d
I investigated a little and the high initial CPU usage (at least on my system, an RPi3) originates from rngd.
Further investigation showed that for some reason the jitter entropy source was being used even though the hardware randomness source hwrng is available. To solve this, one could change the command used to run rngd from /usr/sbin/rngd -f -r /dev/hwrng to /usr/sbin/rngd -f -r /dev/hwrng -x jitter on Raspberry Pi systems.
Sadly I know exactly nothing about Yocto, so I cannot implement this myself.
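For anyone who wants to try this without touching Yocto, a systemd drop-in on a running device should work as a quick test, assuming /etc is writable there (the unit name is an assumption - depending on the build it may be rngd.service or rng-tools.service; the permanent fix would go through the rng-tools recipe in meta-balena):

mkdir -p /etc/systemd/system/rngd.service.d
cat > /etc/systemd/system/rngd.service.d/exclude-jitter.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/sbin/rngd -f -r /dev/hwrng -x jitter
EOF
systemctl daemon-reload
systemctl restart rngd.service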
Hi @Tom-Julux, what I think happens is that all entropy sources need to be initialized before the best performing (hardware) engine is chosen. So even though the jitter source won't be used when a hardware entropy source is available, it still has to be initialized, unless you pass the -x jitter option. I will look into adding the -x jitter exclusion to the device types that have a hardware entropy source to improve boot time.
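To check whether a given board actually exposes a hardware source before changing the defaults, the kernel's hwrng sysfs entries can be inspected (rngd -l also lists the entropy sources it sees on rng-tools 6.x builds):

cat /sys/class/misc/hw_random/rng_available   # e.g. the BCM2835 RNG on Raspberry Pi
cat /sys/class/misc/hw_random/rng_current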
[cywang117] This has an attached support thread: https://jel.ly.fish/a13bcce9-7878-465c-940f-cd12b1d27a37
It might be worth pulling in a new version of rngd too - my Pi Zeros still have 6.15 despite being on the latest production balenaOS, and the 6.16 changelog includes "Fix jitterentropy long timeout failures on low power hardware."
Hey @srd424, the version of rngd comes from Poky Kirkstone - we will update it as part of updating to the new Scarthgap LTS release, which is scheduled to be released in April.