Add support for displaying threads under OpenBSD

Open multiplexd opened this issue 7 years ago • 7 comments

This patch adds support for displaying individual threads in a process on OpenBSD. The following points should be noted:

  • A call to ProcessList_buildTree() has been changed so that the process's showChildren flag always controls whether child processes are shown in tree view. This is necessary because on OpenBSD process ID 1 (init(8)) is a child process of the kernel thread with PID 0 (unlike other systems, where PID 0 is not visible from userspace and the first kernel thread has PID 2). By default, if a process is marked hidden then all of its children are hidden too, which causes no processes at all to be displayed when hiding kernel threads is enabled. I do not know if this change will cause regressions elsewhere.

  • The flags passed to kvm_getprocs(3) have been changed to also return all threads visible from userspace, which includes kernel threads and threads within processes (similar to the current Linux code). The semantics of some of the fields in the Process object are Linux-specific -- pid (process ID) and tgid (thread group ID) -- and so they are used slightly differently in this patch. Note the following:

    1. On OpenBSD, processes are composed of one or more threads. All threads in a process share the same PID, but each thread has its own thread ID.
    2. Both individual threads and complete processes are returned by kvm_getprocs(3). The struct kinfo_proc returned for the overall process has a thread ID of -1. All other thread IDs are randomly assigned, as with PIDs.
    3. As all processes have at least one thread, this means that information for every single-threaded process is returned twice: once for the process and once for the singleton thread.
    4. The value of a PID is currently capped at 99999 (see #818). Thread IDs are always greater than this value, and are capped at 0x7ffff (524287 decimal).

    The patch therefore assigns the process ID and thread group ID as follows: if the struct returned represents the overall process (thread ID is set to -1), then the htop thread group ID is set to the process's kernel PID, and the htop PID and htop parent PID are set as normal. If the thread ID is not -1, then the htop thread group ID is set to the kernel PID, the htop PID is set to the kernel thread ID and the parent PID is set to the kernel PID. (This emulates the behaviour of examining threads under Linux in htop). Note that this means that with userland threads turned on every process in the process tree will show at least one thread, unlike Linux.

  • The maximum PID on OpenBSD is changed again in this patch (duplicating #818), as PIDs and thread IDs share the same numeric interval but occupy different ranges within it.
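
For illustration, the ID assignment described above can be sketched in C as follows. This is a minimal sketch and not the patch itself: `KinfoIds`, `HtopIds` and `mapIds()` are hypothetical names, and `KinfoIds` is only a stand-in modelled on the `p_pid`, `p_ppid` and `p_tid` fields of OpenBSD's struct kinfo_proc so that the example compiles on any system.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three struct kinfo_proc fields used here;
 * on OpenBSD the real structure comes from <sys/sysctl.h>. */
typedef struct {
   int p_pid;   /* kernel process ID */
   int p_ppid;  /* kernel parent process ID */
   int p_tid;   /* kernel thread ID, -1 for the whole-process entry */
} KinfoIds;

/* Hypothetical holder for the identifiers htop would store. */
typedef struct {
   int pid;
   int ppid;
   int tgid;
} HtopIds;

/* Map kernel identifiers to htop identifiers as described in the text. */
static HtopIds mapIds(const KinfoIds* kp) {
   assert(kp != NULL);

   HtopIds ids;
   ids.tgid = kp->p_pid;      /* thread group is the kernel PID in both cases */

   if (kp->p_tid == -1) {
      /* Entry for the overall process: PID and parent PID set as normal. */
      ids.pid = kp->p_pid;
      ids.ppid = kp->p_ppid;
   } else {
      /* Entry for a single thread: it appears as a child of its process. */
      ids.pid = kp->p_tid;
      ids.ppid = kp->p_pid;
   }
   return ids;
}
```

Because every thread entry gets its own process as parent, a single-threaded process ends up with exactly one child in tree view, which is the behaviour noted at the end of the description.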

multiplexd avatar Jul 17 '18 21:07 multiplexd

Thanks for the PR! It's great to see contributions made to the OS ports! :)

> Note that this means that with userland threads turned on every process in the process tree will show at least one thread, unlike Linux.

Do you think this could be confusing to users? A few questions to try to clarify this issue:

  • How does top behave in this regard on OpenBSD?
  • In a single-threaded process, are the stats (CPU%, memory, etc) shown by the "top level process" and the thread the same?
  • In a multi-threaded process, what is the relationship between the "top level process" entry and the threads? (are the "top level" process stats identical to the main thread's, or are they the aggregate of all threads?)

Thank you once again for your patches!

hishamhm avatar Jul 28 '18 02:07 hishamhm

I've examined OpenBSD top(1)'s behaviour, and I found the following:

  • OpenBSD's top(1) shows either processes or threads -- it does not show both at the same time. The list of threads therefore excludes all "top-level" processes and vice versa, so top(1) never has to work out a relationship between the two.

  • For single-threaded processes, the CPU and memory usage numbers are the same in both process and thread view.

  • For multi-threaded processes, the memory usage numbers are the same across all threads (as threads share address space), however the CPU usage numbers differ, as each thread is scheduled independently by the kernel.

I think the behaviour this patch implements might violate the "Principle of Least Astonishment" for those who have used htop on Linux (and don't have a working knowledge of the scheduling and threading model in the OpenBSD kernel). However, I think the best way to test this is to get other people to use the patched version and get their feedback.

multiplexd avatar Jul 28 '18 10:07 multiplexd

> For multi-threaded processes, the memory usage numbers are the same across all threads (as threads share address space), however the CPU usage numbers differ, as each thread is scheduled independently by the kernel.

What CPU usage values does your patch show for multi-threaded processes? More like this:

CPU Command
100 my_program (sum of all 4 threads)
 25 +-- my_program (main)
 25 +-- my_program (subthread 1)
 25 +-- my_program (subthread 2)
 25 +-- my_program (subthread 3)

or like this?

CPU Command
 25 my_program (same as main)
 25 +-- my_program (main)
 25 +-- my_program (subthread 1)
 25 +-- my_program (subthread 2)
 25 +-- my_program (subthread 3)

hishamhm avatar Jul 30 '18 12:07 hishamhm

The CPU values shown are similar to your first example, i.e. the CPU usage for the overall process is the sum total of the CPU usage of each individual thread.
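
The relationship between the process row and its thread rows can be written down as a small C sketch. It only illustrates the arithmetic of the first example; `sumThreadCpu()` is a hypothetical helper, and the thread does not say whether this total is computed by htop or reported directly by the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Add up per-thread CPU percentages to get the figure shown on the
 * process row. Hypothetical helper, for illustration only. */
static double sumThreadCpu(const double* threadCpu, size_t count) {
   assert(threadCpu != NULL || count == 0);

   double total = 0.0;
   for (size_t i = 0; i < count; i++)
      total += threadCpu[i];
   return total;
}
```

With four threads at 25% each, the process row would read 100%, matching the first layout.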

multiplexd avatar Aug 01 '18 17:08 multiplexd

> The CPU values shown are similar to your first example, i.e. the CPU usage for the overall process is the sum total of the CPU usage of each individual thread.

In that case, in the case of single-threaded processes, how about just not adding the main thread as a child of the process? In other words, if it has only 1 thread, just show the main process; if it has 2 or more, add all threads. I assume that most processes in the process list will be single-threaded, and the multi-threaded ones would have all their threads properly displayed (including the main one with its individual CPU stats). It's not exactly the same behavior as Linux, but I think the output would look sensible. What do you think?

hishamhm avatar Aug 25 '18 13:08 hishamhm

> In that case, in the case of single-threaded processes, how about just not adding the main thread as a child of the process? In other words, if it has only 1 thread, just show the main process; if it has 2 or more, add all threads.

That's a little trickier to implement properly, though it's not impossible. The process information returned by the kvm(3) interface on OpenBSD does not indicate whether a given process is single-threaded, or whether a given thread is the only thread within its process. Therefore, it would be necessary to sort and collate all processes and threads, and then handle single- and multi-threaded processes separately. It would complicate the implementation of the OpenBSD process list handling code a little, which might be a maintainability concern.

Personally, I'm not too bothered by single-threaded processes having a single child thread each, but that isn't representative of most OpenBSD users.

multiplexd avatar Aug 25 '18 15:08 multiplexd

Okay, I've modified the process listing code to first enumerate all processes on the system, and then find all the threads for each process. This means that single-threaded processes will now display as a single process with no child threads, while multi-threaded processes will display as a process with two or more child threads.
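
The resulting display rule can be sketched in C as below. This is a simplified illustration under stated assumptions, not the code in the patch: `Entry`, `countThreads()` and `markVisible()` are hypothetical names, the entries are assumed to be already collected into one array, and the quadratic scan stands in for whatever lookup the real implementation uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of one entry returned by the kernel. */
typedef struct {
   int pid;    /* kernel PID, shared by all threads of a process */
   int tid;    /* kernel thread ID, -1 for the whole-process entry */
   bool show;  /* filled in by markVisible() */
} Entry;

/* Count the thread entries (tid != -1) that belong to the given PID. */
static size_t countThreads(const Entry* entries, size_t n, int pid) {
   size_t count = 0;
   for (size_t i = 0; i < n; i++) {
      if (entries[i].tid != -1 && entries[i].pid == pid)
         count++;
   }
   return count;
}

/* Process entries are always shown; thread entries are shown only when
 * their process has two or more threads. */
static void markVisible(Entry* entries, size_t n) {
   assert(entries != NULL || n == 0);

   for (size_t i = 0; i < n; i++) {
      if (entries[i].tid == -1)
         entries[i].show = true;
      else
         entries[i].show = countThreads(entries, n, entries[i].pid) >= 2;
   }
}
```

A process with one thread therefore contributes a single visible row, while a process with several threads contributes its own row plus one row per thread.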

I've also encountered a problem where the process list hash table starts returning bogus pointers on x86_64 when the load factor is high, which I have opened a separate issue for (see #893).

multiplexd avatar Feb 21 '19 20:02 multiplexd