htop icon indicating copy to clipboard operation
htop copied to clipboard

Dynamic scaling & Graph meter coloring (new)

Open Explorer09 opened this issue 2 years ago • 94 comments

This pull request implements dynamic scaling support and a "stacked" color for Graph meters. In addition, the Bar meter drawing routine is slightly improved for dynamically expanding the "total" value when needed.

This code would replace the current, one-color graph drawing algorithm so that all Graph meters are drawn with colors.

The data structures and algorithms for the color graphs and dynamically scaling are brand new, totally reworked from htop-dev#129 and the colors are intended to represent the meter data as close as possible. It still uses braille dots to display the graph, but they now have finer details (I think).

Screenshot


Caveats

  • The graph now shows only half the number of records after this PR. The old code used to show two records per terminal column (utilizing the two columns of a braille character), but due to coloring in terminal display, I have to change it to one record per terminal column, both in ASCII and Unicode terminal display.
  • The new code no longer keeps the graph "values" in the C double (floating point) data type. It now stores the precomputed colors and braille dots instead, and the data are specific to a graph height setting.
  • The "GRAPH_1" and "GRAPH_2" colors defined in CRT.h would become unused.

Technical details

  • The main method of determining the colors of all cells (terminal characters) is the largest remainder method. It ensures the color cells chosen would be representative to the original values of a meter. (Just like how it is used in electing representatives in an electoral system)
  • When dynamic scaling and color graph are combined, the graph color computing routine would perform multiple times to render the colors in multiple scales. The "color cells in multiple scales" are then stored in an array with an indexing method mimicking the "in-order traversal" of a complete binary tree. This allows at most 2 × "graph height" cells to be used for storing colors.

The following are potential features that can be implemented after this pull request, but I doubt if they are good ideas or if I have the ability to implement them:

  • A new color for the scale unit text (%, 32K, 16M, etc.), separate from the caption color. I was considering making the text blue to separate it from the cyan text of the caption.
  • Graph height adjustment at runtime. At least allow the htoprc file to specify the graph height. The settings UI would be the next step to implement.
  • A "colorblind" display of the graph, for the "monochrome" color scheme. The bar mode characters (|#*@$%&) can be reused for this.

Explorer09 avatar Jul 29 '21 09:07 Explorer09

Having the caption above the unit makes more sense IMHO.

BenBE avatar Jul 29 '21 09:07 BenBE

  1. I've done moving the caption to above the unit. It's an easy change.
  2. The code for checking __builtin_clz in configure is also done.
AC_MSG_CHECKING(for __builtin_clz)
AC_COMPILE_IFELSE([
   AC_LANG_PROGRAM([], [__builtin_clz(1); /* Requires gcc 3.4 or later */])],
   [AC_DEFINE([HAVE_BUILTIN_CLZ], 1, [Define to 1 if the compiler supports `__builtin_clz' function.])
   AC_MSG_RESULT(yes)],
   AC_MSG_RESULT(no))
  1. I think I can move the prevPowerOf2 function to XUtils.c, but I don't think it's good idea to make it static inline as the code looks non-trivial.
  2. I am reserved on using builtin_popcount. The reason is, for x86 family, POPCNT requires SSE 4.2 (instruction set) or -msse4.2 compiler flag to be efficient. If POPCNT is not available, gcc emits a libgcc call. My PopCount8 implementation is optimized (for 8-bit input specifically) and is smaller than libgcc's popcount. Do you suggest me to check for processor instruction instead of just check the compiler's popcount builtin?

Explorer09 avatar Jul 29 '21 11:07 Explorer09

  1. I've done moving the caption to above the unit. It's an easy change.

k. :)

  1. I think I can move the prevPowerOf2 function to XUtils.c, but I don't think it's good idea to make it static inline as the code looks non-trivial.

As inline typically just a hint to the compiler it can still decide to ignore it. Also most recent optimizers do code folding anyway, thus would remove the parts imported from different code units upon linking. Also there's some more magic going on with linkers nowadays, thus I doubt this is a real issue with recent compilers. Compilers can still opt to ignore the inline anytime and will happily do if the code becomes too complex or too wasteful when inlining (unless forced to).

  1. I am reserved on using builtin_popcount. The reason is, for x86 family, POPCNT requires SSE 4.2 (instruction set) or -msse4.2 compiler flag to be efficient. If POPCNT is not available, gcc emits a libgcc call. My PopCount8 implementation is optimized (for 8-bit input specifically) and is smaller than libgcc's popcount. Do you suggest me to check for processor instruction instead of just check the compiler's popcount builtin?

Similar argument with popcount: Link to godbolt

For any reasonable recent compiler the libgcc function will likely be optimized enough (and likely even use SSE2 when the system has it available), thus only the first call to such a function will receive a slowdown. Also most architectures by now have a POPCNT instruction available, which likely results in that instruction used on corresponding optimization levels. It may even be, that on higher optimization levels the compiler will replace your implementation by that instruction anyway. Better to convey the intention to the compiler than have it figure things out via pattern matching IMHO.

NB: Modern compilers are usually better in optimizing your code, than you are. Better optimize the algorithm by replacing it with a better one than trying to be smart …

BenBE avatar Jul 29 '21 12:07 BenBE

For any reasonable recent compiler the libgcc function will likely be optimized enough (and likely even use SSE2 when the system has it available), thus only the first call to such a function will receive a slowdown. Also most architectures by now have a POPCNT instruction available, which likely results in that instruction used on corresponding optimization levels. It may even be, that on higher optimization levels the compiler will replace your implementation by that instruction anyway. Better to convey the intention to the compiler than have it figure things out via pattern matching IMHO.

NB: Modern compilers are usually better in optimizing your code, than you are. Better optimize the algorithm by replacing it with a better one than trying to be smart …

If I have not read the disassembly of __popcountdi2 I would not have made the argument in the first place. The problem is gcc doesn't provide the "popcount 8 bit" for me, so I implement one by hand. (The usual "popcount" algorithm works great for 32, 64 and 128 bits but not optimal for 8 bits.) And I really don't like the call to the sub-optimal __popcountdi2 anyway.

Explorer09 avatar Jul 29 '21 14:07 Explorer09

Update: I figured out what to do with the popCount8 problem. Commit amended and rebased. See XUtils.h in commit b160428323c9f649fbaa7d32289b7ef9236d7673 to see what I mean. (Update (v3): 28b0a60521093f889882329081ce683eb36e9629)

Explorer09 avatar Jul 29 '21 15:07 Explorer09

Did some tests with the prevPowerOfTwo function on godbolt: Depending on your compiler and architecture and settings any of the three variants at times produces the shortest assembly with my tweaked variant being faster on x86, but slower on ARM, except when CLZ is actually used on ARM, where it depends heavily on the compiler version used.

Similar effects are likely visible with the popcount implementation too.

BenBE avatar Jul 29 '21 16:07 BenBE

Did some tests with the prevPowerOfTwo function on godbolt: Depending on your compiler and architecture and settings any of the three variants at times produces the shortest assembly with my tweaked variant being faster on x86, but slower on ARM, except when CLZ is actually used on ARM, where it depends heavily on the compiler version used.

prevPowerOf2(0) should (in theory) output 0, not 1. I've experimented with what you did before, but your code is not equivalent to mine, and you didn't assert(x > 0).

Explorer09 avatar Jul 29 '21 16:07 Explorer09

@Explorer09 Why write everything onto screen directly instead of using RichString as buffers? Advantage would be that you could abuse double-buffering there to shift the graph by two units (data points), thus with two sets of RichStrings you could just shift everything by one character and just draw the last cell …

BenBE avatar Jul 30 '21 10:07 BenBE

@BenBE

Why write everything onto screen directly instead of using RichString as buffers? Advantage would be that you could abuse double-buffering there to shift the graph by two units (data points), thus with two sets of RichStrings you could just shift everything by one character and just draw the last cell

It would break dynamic scaling. Look at Load average meter for example, the whole graph needs to be redrawn when the scale (or unit) changes.

Explorer09 avatar Jul 30 '21 11:07 Explorer09

Sure, but those situations can be tracked/detected …

BenBE avatar Jul 30 '21 12:07 BenBE

@BenBE The second reason for not using a RichString buffer is the use may change color scheme of htop at runtime, and the RichString buffer is not designed to reflect color scheme changes. Detecting when to clean the buffers like the cases above is too much work, and so I consider it doesn't worth the trouble.

Explorer09 avatar Jul 30 '21 12:07 Explorer09

On the other hand, re-using existing code makes maintaining the code easier once it got merged. And in the current state I'm not too convinced that these 650 LOC of new code are in a state that makes maintaining them easy enough to justify the complexity in this code. Just skimming over I saw several routines, that are justifiably quite long. And I also understand you are hesitant to refactor major parts of that code after that much effort that went in.

I had some PRs of mine that needed refactoring and bug fixing for about half a year until I could finally merge them. And even then there were small issues that needed to be addressed. So given the amount of code in this PR it should be one priority to ease maintaining the code and another one to have it working. Even if this sound's like I'm a killjoy here both are requirements to proceed with merging (this) new code.

BenBE avatar Jul 30 '21 13:07 BenBE

First of all: Thanks a lot for picking up on this again!

The new code works and it gives even more precision. On the other hand it looks strange in some situations... Not sure I like it that way. After all the meters are just to give a good overview, the latest bit of precision is not needed IMHO.

Either way I am happy if the makes its way into master.

eworm-de avatar Aug 02 '21 13:08 eworm-de

This pull request introduces 2 alerts when merging 24bd56e30dbc34ea43d0a248ac660498c25cbbb4 into ed82ce6456f0f904a0ab2b346b85d7cf46df109c - view on LGTM.com

new alerts:

  • 1 for Multiplication result converted to larger type
  • 1 for Comparison result is always the same

lgtm-com[bot] avatar Aug 03 '21 15:08 lgtm-com[bot]

There is one visual problem that I wish to solve after the next rebase. In the screenshot I tried with the test data of 4 items, the value of each being 25% (exact). When built with GRAPH_HEIGHT = 4 and 8, the graph results are perfect. The result when built with GRAPH_HEIGHT = 3 is acceptable, but the GRAPH_HEIGHT = 5 looks weird visually. When it comes to ties, the current algorithm prefers giving extra cells to the lower items. I'm considering changing the rule in tiebreak 1 to prevent graph distortions like in GRAPH_HEIGHT = 5. htop-graph-height-test

Explorer09 avatar Aug 09 '21 03:08 Explorer09

There is one visual problem that I wish to solve after the next rebase.

Take your time, this PR isn't targeted for 3.1.0 anyway. Depending on progress 3.1.1 may be possible. ;-)

Also would be nice if you could resolve those two code complexity issues I noted. Rough rule of thumb: 5 levels of indentation is usually too much. As are 3 pages of non-trivial loopy code. ;-)

BenBE avatar Aug 09 '21 06:08 BenBE

One thing that strikes me as a little 'clunky' in the user interface here is the way the shortened title is duplicated in the (very common I expect) case where we're displaying a bar/text/led meter followed by the equivalent graph meter. In the initial screenshot in this PR for example, we're repeating "CPU", "Tas", "Mem", and "Loa" unnecessarily such that its just becomes UI noise, really.

Since a graph is not very useful without some kind of legend (which we're implementing as a separate meter), maybe we should mandate that a graph must always have an associated bar/text/led meter (either above or below)? Alternatively, perhaps we need a generic way to optionally switch off those titles?

natoscott avatar Aug 09 '21 22:08 natoscott

Best example this isn't the case is with the CPU meters. I rarely ever see people use them with the respective text meters next to them …

BenBE avatar Aug 10 '21 06:08 BenBE

Best example this isn't the case is with the CPU meters. I rarely ever see people use them with the respective text meters next to them …

Hmm - are you asserting that graphs without legends are generally useful? I disagree. :) It's not by accident that the initial screenshot has the text meter above the same graph meter in every single case. I think its quite a neat hack to solve the lack-of-a-legend problem.

CPU meters are a bit of a special case because a/ its a utilization so the 0-100% bounds are fixed and well known, and b/ its such a common meter that people quickly learn the color mappings. However in most other cases, the need for a legend of some kind is definitely going to be a common scenario I think.

Anyway, that repeating text remains an eyesore that IMO is really not in keeping with the general aesthetics and minimalism strived for in htop. We should do something about that if we can.

natoscott avatar Aug 10 '21 06:08 natoscott

Though you shouldn't mandate the accompanying text meter directly above/below either, because e.g. I usually group the bar/graph stuff in one column and use another column for all the text stuff.

So why not skip the label requirement completely and make this fully up to the user; just show it by default after a meter is added?

BenBE avatar Aug 10 '21 06:08 BenBE

Though you shouldn't mandate the accompanying text meter directly above/below either, because e.g. I usually group the bar/graph stuff in one column and use another column for all the text stuff.

+1

So why not skip the label requirement completely and make this fully up to the user; just show it by default after a meter is added?

"up to the user" sounds good but I'm not following how we'd accomplish that via the current Setup screen - what would the UI steps be such that I could create a graph meter below its text meter without the repeating "CPU" (or "Tas" or "Mem" or "Loa") text?

natoscott avatar Aug 10 '21 07:08 natoscott

So why not skip the label requirement completely and make this fully up to the user; just show it by default after a meter is added?

"up to the user" sounds good but I'm not following how we'd accomplish that via the current Setup screen - what would the UI steps be such that I could create a graph meter below its text meter without the repeating "CPU" (or "Tas" or "Mem" or "Loa") text?

Have a dedicated function key for this? Because IIRC S+Enter, A+Enter and C+Enter are likely to be in use by the system or other programs that manage your input focus … And we've plenty of shortcuts available in that management screen. Or use S+F5 to add graph+label?

Nonetheles we need one key mapping to turn on the label for a given selected meter too.

BenBE avatar Aug 10 '21 07:08 BenBE

Regarding the graph label issue, my suggestion would be to keep things simple. Rather than introducing a hotkey, how about making a runtime option to toggle the label first? The option could be named "graph label display style", with choices like "aside" (the current, default style), "hidden" and "overlay". Of course the option would apply to all graph meters.

Explorer09 avatar Aug 11 '21 06:08 Explorer09

I think switching the label display on/off for each meter individually would be best … Just put one key for this in the function bar to turn the label on/off per meter.

BenBE avatar Aug 11 '21 11:08 BenBE

IMHO the label discussion can/should be handled in a separate PR …

BenBE avatar Aug 11 '21 11:08 BenBE

Running this on my workstation now. Looks like it is suffering reduced precision: A braille character always has the same height on left and right size. Both side used to represent different values before.

eworm-de avatar Aug 18 '21 08:08 eworm-de

@eworm-de

Looks like it is suffering reduced precision: A braille character always has the same height on left and right size. Both side used to represent different values before.

It's intentional. The new code now uses one braille character for one record. This reduces complexity for computing character colors as we have one record to concern about rather than two.

Explorer09 avatar Aug 18 '21 09:08 Explorer09

Hey @Explorer09, I've built, compiled and ran this (#714) version of htop, but I don't seem to find the graphs being colored. I've stress tested yet the graphs are still single colored. I've attached a screenshot for your reference, kindly let me know how I can figureout this issue.

Thanks in advance !

The command I used to stress test : stress -c 2 -i 2 -m 3 -t 20s image

AdityaSriram09 avatar Apr 04 '22 06:04 AdityaSriram09

@AdityaSriram09

Did you check out the right code? The screenshot you attached looks no different from the main branch. Remember you can see the Files changed page of this pull request to verify that you have checked out the right files.

Explorer09 avatar Apr 06 '22 04:04 Explorer09

Any chance for a rebase on main branch?

eworm-de avatar Apr 06 '22 07:04 eworm-de