htop

Maybe warn about performance issues with metrics read from smaps/smaps_rollup (PSS/SWAP/SWAPPSS)?

Open thejh opened this issue 6 months ago • 1 comment

Every time /proc/{pid}/smaps_rollup or /proc/{pid}/smaps is read, the kernel has to traverse all the page tables of the target process and look at all the struct page instances that the entries in these page tables point to. That is not great for two reasons:

  • Most importantly, this takes more time the more memory the process is using - disabling the smaps-based columns in htop reduces htop's CPU usage a lot on a machine that is using lots of memory.
  • This causes lock contention with memory management operations in the process whose maps are being read; so running htop could induce latency spikes in other processes on the system.
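A rough way to see the first point on your own machine is to time full reads of /proc/self/smaps_rollup against a cheap file such as /proc/self/stat, before and after touching a chunk of memory. The sketch below is only an illustration, not how htop reads these files: the helper names parse_rollup and time_read, the 64 MiB buffer, and the 4096-byte page stride are assumptions made for the example, and it needs a Linux kernel that provides smaps_rollup.

```python
import time


def parse_rollup(text):
    """Return {field: kB} for lines shaped like 'Pss:   1234 kB'."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        # The first line of smaps_rollup is an address-range header; skip it
        # and anything else that is not a 'Name: <number> kB' line.
        if (len(parts) == 3 and parts[0].endswith(":")
                and parts[1].isdigit() and parts[2] == "kB"):
            fields[parts[0][:-1]] = int(parts[1])
    return fields


def time_read(path, repeats=20):
    """Average wall-clock seconds for one full read, or None if unreadable."""
    try:
        start = time.perf_counter()
        for _ in range(repeats):
            with open(path, "rb") as f:
                f.read()
        return (time.perf_counter() - start) / repeats
    except OSError:
        return None


def report(label):
    for path in ("/proc/self/stat", "/proc/self/smaps_rollup"):
        t = time_read(path)
        shown = "unavailable" if t is None else f"{t * 1e6:.1f} us/read"
        print(f"{label:>14}  {path:<26} {shown}")


if __name__ == "__main__":
    report("baseline")
    # Assumption for the demo: 64 MiB, touched once per 4096-byte page so the
    # pages are actually mapped and show up in the page tables.
    buf = bytearray(64 * 1024 * 1024)
    for i in range(0, len(buf), 4096):
        buf[i] = 1
    report("after touching")
    try:
        with open("/proc/self/smaps_rollup") as f:
            fields = parse_rollup(f.read())
        print({k: fields.get(k) for k in ("Rss", "Pss", "SwapPss")})
    except OSError:
        print("smaps_rollup not available on this system")
```

If the description above holds, the smaps_rollup read time should grow after the buffer is touched while the stat read time stays roughly flat. Actual numbers depend on the kernel and the machine.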

It might be a good idea to put these metrics behind a warning somehow - obviously some users will want to know these numbers even if that makes htop several times more CPU-intensive, but there are probably also people like me who just thought "oh I have more space on my screen, might as well add those numbers".

thejh avatar May 31 '25 13:05 thejh

I was trying to track down serious responsiveness issues and measured that this indeed adds a good 30% of processing overhead on my system. However, even with the default configuration, the first refresh (about 1240 processes running, 48GB RAM / 32GB ZRAM SWAP / 27GB SWAP on SSD and a 6.12.30-amd64 Debian kernel) takes more than 5 seconds (instead of >7 with the SWAP column), meaning (with the default delay of 1.5s seemingly being relative to wall clock and input buffering) any interaction is excruciatingly sluggish 🫠

Here's a measurement of the first refresh (which, it turns out, takes 2-3x longer than subsequent refreshes, but I don't know how to measure those), done with { sleep 2; tmux send-key q; } & HTOPRC= perf stat htop -d 100 and with strace -c in an analogous way:

Image
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 54.00    1.126597           5    207656           read
 26.29    0.548528           4    114811     26549 openat
 10.61    0.221298           2     88314           close
  3.69    0.077055           6     11081        42 newfstatat
  1.82    0.037906           5      7419           getdents64
  1.20    0.024954           2      9498           capget
  1.09    0.022680           2      8754           fstat
  0.92    0.019098           2      9242           fcntl
  0.19    0.003933           4       942        50 readlinkat
  0.04    0.000898          10        85           mmap
…
------ ----------- ----------- --------- --------- ----------------
100.00    2.086173           4    458694     26693 total
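To put the table in perspective, here is a small sketch that parses the rows shown above and derives two figures: the share of syscall time spent in read/openat/close, and the approximate number of calls per process. The function name parse_strace_summary is made up for this example, and dividing by 1240 reuses the "about 1240 processes" figure from above, so the per-process numbers are only rough (the trace also includes htop's startup).

```python
STRACE_SUMMARY = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 54.00    1.126597           5    207656           read
 26.29    0.548528           4    114811     26549 openat
 10.61    0.221298           2     88314           close
  3.69    0.077055           6     11081        42 newfstatat
  1.82    0.037906           5      7419           getdents64
  1.20    0.024954           2      9498           capget
  1.09    0.022680           2      8754           fstat
  0.92    0.019098           2      9242           fcntl
  0.19    0.003933           4       942        50 readlinkat
  0.04    0.000898          10        85           mmap
…
------ ----------- ----------- --------- --------- ----------------
100.00    2.086173           4    458694     26693 total
"""

PROCESS_COUNT = 1240  # "about 1240 processes", taken from the comment above


def parse_strace_summary(text):
    """Return a list of (syscall, pct_time, seconds, calls) data rows."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 columns, or 6 when the errors column is filled in.
        # Skip the header, the elided rows and the total line.
        if len(parts) not in (5, 6) or parts[-1] in ("syscall", "total"):
            continue
        try:
            pct, secs, calls = float(parts[0]), float(parts[1]), int(parts[3])
        except ValueError:
            continue  # ruler lines made of dashes end up here
        rows.append((parts[-1], pct, secs, calls))
    return rows


if __name__ == "__main__":
    rows = parse_strace_summary(STRACE_SUMMARY)
    top3 = rows[:3]
    share = sum(pct for _, pct, _, _ in top3)
    names = "/".join(name for name, *_ in top3)
    print(f"{names}: {share:.2f}% of syscall time")
    for name, _, _, calls in top3:
        print(f"{name}: ~{calls / PROCESS_COUNT:.0f} calls per process")
```

With the rows above, read, openat and close together account for about 91% of the measured syscall time, and there are roughly 167 read calls per process.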

Guess a separate UI thread and a self-monitoring data collection loop would be cool, some day™ .. When I have some more time I'll try to generate a 🔥flamegraph as explained in #467 ..

eMPee584 avatar Jul 05 '25 11:07 eMPee584