htop icon indicating copy to clipboard operation
htop copied to clipboard

Add Backtrace Screen

Open MrCirdo opened this issue 1 year ago • 29 comments

Hello everyone!

This is my first big pull request 😃

The goal of this PR is to add a backtrace screen for process or thread. Here's what it looks like for a thread : image

And for a process: image

Behind, I use the tool called eu-stack provided by elf-utils. The standard output is parsed and printed to the screen. Currently, I have implemented only the Refresh button. And my world is inspired by TraceScreen and OpenFilesScreen.

I still have some work to do before my work is ready (Formatting, bug fixes, ...). Currently, this is more of a demonstration than a cool feature 😄 .

What do you think about my work? Is it a feature that can be added?

MrCirdo avatar Jul 19 '23 16:07 MrCirdo

Thank you for your reactivity and your review.

I agree with you when you say eu-stack is not a good idea. I only used it to make a quick demo of my idea. I'm also unsure if eu-stack is supported on the BSD platforms.

Moreover, I think also the library unwind is more appropriate. I just have to deal with DWARF information. I will probably close all suggestions regarding the execution/parsing of eu-stack.

I saw very quickly the work of @natoscott and I will wait until his work is finished.

MrCirdo avatar Jul 20 '23 16:07 MrCirdo

Thank you for your reactivity and your review.

You're welcome.

I agree with you when you say eu-stack is not a good idea. I only used it to make a quick demo of my idea. I'm also unsure if eu-stack is supported on the BSD platforms.

There seems to be some patches re FreeBSD, but I'd not actually call this support … ;-)

That's with Darwin aside entirely … ;-)

Moreover, I think also the library unwind is more appropriate. I just have to deal with DWARF information. I will probably close all suggestions regarding the execution/parsing of eu-stack.

Sure, go ahead. Maybe check for similar places elsewhere in case something similar was missed in other places of this PR.

I saw very quickly the work of @natoscott and I will wait until his work is finished.

If you prepare your PR to anticipate that change, e.g. by preparing the data structures in a way that makes this move easy, you could even prepare things now. Given the remaining aspects in that other PR will take some time to sort out, don't feel obliged to wait for those changes to land. If the structures introduced in this PR are clean enough there's not too much work later to move things over to this new interface. Most of the work actually has already been done.

BenBE avatar Jul 20 '23 21:07 BenBE

Please don't merge the htop-dev:main branch. Rebase it instead. You know what git rebase is, don't you?

Explorer09 avatar Sep 05 '23 16:09 Explorer09

The flake commit is just for me. I will remove it at the end. I don't start my work. However, I keep updating my branch. So it's not ready to review.

MrCirdo avatar Sep 05 '23 20:09 MrCirdo

Hey,

I didn't have a lot of time recently for this PR. But I have some questions.

If you prepare your PR to anticipate that change, e.g. by preparing the data structures in a way that makes this move easy, you could even prepare things now. Given the remaining aspects in that other PR will take some time to sort out, don't feel obliged to wait for those changes to land. If the structures introduced in this PR are clean enough there's not too much work later to move things over to this new interface. Most of the work actually has already been done.

I'm looking for Row workflow. I don't know if it's a good idea to use it. Row_display uses the object Settings to get fields. Does it mean I have to make my own Settings object? With all process settings (like highlightThreads)? And do I have to make my ScreenSettings? Is it okay?

MrCirdo avatar Nov 16 '23 18:11 MrCirdo

I didn't have a lot of time recently for this PR. But I have some questions.

If you prepare your PR to anticipate that change, e.g. by preparing the data structures in a way that makes this move easy, you could even prepare things now. Given the remaining aspects in that other PR will take some time to sort out, don't feel obliged to wait for those changes to land. If the structures introduced in this PR are clean enough there's not too much work later to move things over to this new interface. Most of the work actually has already been done.

I'm looking for Row workflow. I don't know if it's a good idea to use it. Row_display uses the object Settings to get fields. Does it mean I have to make my own Settings object? With all process settings (like highlightThreads)? And do I have to make my ScreenSettings? Is it okay?

The Settings and ScreenSettings objects are shared between all tabs. The Settings provides with the global settings, applicable to all screens, while ScreenSettings is applicable to only to the current tab only. As you are basically inheriting the settings from the tab you are opening the backtrace details on, you can just copy the pointer to those settings. These are held in the Screen mostly for convenience and to reduce dependency scope. I.e. if you don't need any of the values from the settings (and none of the functions you're relying on) you could even skip these objects entirely.

BenBE avatar Nov 16 '23 18:11 BenBE

I didn't have a lot of time recently for this PR. But I have some questions.

If you prepare your PR to anticipate that change, e.g. by preparing the data structures in a way that makes this move easy, you could even prepare things now. Given the remaining aspects in that other PR will take some time to sort out, don't feel obliged to wait for those changes to land. If the structures introduced in this PR are clean enough there's not too much work later to move things over to this new interface. Most of the work actually has already been done.

I'm looking for Row workflow. I don't know if it's a good idea to use it. Row_display uses the object Settings to get fields. Does it mean I have to make my own Settings object? With all process settings (like highlightThreads)? And do I have to make my ScreenSettings? Is it okay?

The Settings and ScreenSettings objects are shared between all tabs. The Settings provides with the global settings, applicable to all screens, while ScreenSettings is applicable to only to the current tab only. As you are basically inheriting the settings from the tab you are opening the backtrace details on, you can just copy the pointer to those settings. These are held in the Screen mostly for convenience and to reduce dependency scope. I.e. if you don't need any of the values from the settings (and none of the functions you're relying on) you could even skip these objects entirely.

Thank you very much for the clarification. I was not sure I would be able to modify it (I mean If my changes would be accepted).

MrCirdo avatar Nov 17 '23 15:11 MrCirdo

Hi @BenBE , I completely forgot to say to not review my code. I just updated my branch. I deeply apologize.

MrCirdo avatar Jan 09 '24 09:01 MrCirdo

Hi @BenBE , I completely forgot to say to not review my code. I just updated my branch. I deeply apologize.

Don't worry. No need to apologize. The PR is marked as draft anyway, thus the only things I remarked upon was what I noticed immediately. I also ticked off some of the previous comments that no longer apply to the current state of this PR. Basically some maintenance.

NB: This PR would be scheduled for 3.4.x earliest anyway given that 3.3.0 is about to ship anytime soon.

BenBE avatar Jan 09 '24 12:01 BenBE

Hi,

This PR is ready for review. I closed all previous conversations.

Currently, I add only the support of Linux.

MrCirdo avatar Mar 16 '24 17:03 MrCirdo

Given that this feature will be available on multiple platforms and requires changes in different places I'd suggest to have a top-level feature flag for backtrace support and additional flags for the various libraries (iberty, eu-stack, …) The actual implementation for the stack traces should go in the platform code for each system, leaving the actual screen implementation mostly platform independent.

BenBE avatar Mar 17 '24 11:03 BenBE

Given that this feature will be available on multiple platforms and requires changes in different places I'd suggest to have a top-level feature flag for backtrace support and additional flags for the various libraries (iberty, eu-stack, …) The actual implementation for the stack traces should go in the platform code for each system, leaving the actual screen implementation mostly platform independent.

Ok I see what you mean, is it a good idea to be inspired by for example Machine_new?

MrCirdo avatar Mar 19 '24 21:03 MrCirdo

Given that this feature will be available on multiple platforms and requires changes in different places I'd suggest to have a top-level feature flag for backtrace support and additional flags for the various libraries (iberty, eu-stack, …) The actual implementation for the stack traces should go in the platform code for each system, leaving the actual screen implementation mostly platform independent.

Ok I see what you mean, is it a good idea to be inspired by for example Machine_new?

Maybe a better candidate may be the way in which Platform_getNetworkIO and Platform_getProcessLocks are implemented in <platform>/*Platform.[ch]

BenBE avatar Mar 19 '24 21:03 BenBE

Given that this feature will be available on multiple platforms and requires changes in different places I'd suggest to have a top-level feature flag for backtrace support and additional flags for the various libraries (iberty, eu-stack, …) The actual implementation for the stack traces should go in the platform code for each system, leaving the actual screen implementation mostly platform independent.

Ok I see what you mean, is it a good idea to be inspired by for example Machine_new?

Maybe a better candidate may be the way in which Platform_getNetworkIO and Platform_getProcessLocks are implemented in <platform>/*Platform.[ch]

Okay thank you, I'll take a look

MrCirdo avatar Mar 20 '24 20:03 MrCirdo

This is what it looks like now : image Do you have any recommendation?

MrCirdo avatar Jul 09 '24 19:07 MrCirdo

As you already have the various fields separated it would be nice to show this kinda tabulated. Also having addr module file line function+offs is a better order for the information (with unavailable items left blank). Maybe have a look at the Open Files list view for how a header could be handled.

Also with long filenames (especially on Nix) it would be nice to have this vie care about the "hide program path" setting and the "highlight basename"/"mark outdated" settings.

BenBE avatar Jul 09 '24 22:07 BenBE

Also with long filenames (especially on Nix) it would be nice to have this vie care about the "hide program path" setting and the "highlight basename"/"mark outdated" settings.

That's a good idea!

As you already have the various fields separated it would be nice to show this kinda tabulated. Also having addr module file line function+offs is a better order for the information (with unavailable items left blank). Maybe have a look at the Open Files list view for how a header could be handled.

In my original draft, it was tabulated. But I see it's a good idea. Currently, the lib unwind cannot get the line and in general programs are compiled without debug symbols. So I'm afraid that the line column will be blank

MrCirdo avatar Jul 10 '24 19:07 MrCirdo

Currently, the lib unwind cannot get the line and in general programs are compiled without debug symbols. So I'm afraid that the line column will be blank

Depending on how much dynamic you want to implement, you could hide the file+line columns if none of the rows includes information to display for those. Sometimes you have debug information available or even a service that can pull such information on-the-fly from separate debug info caches (although such long-lasting tasks should be on-demand). Thinking of fetching debug symbols like GDB does (based on buildinfo data embedded in the binaries). But that's something for a later improvement.

BenBE avatar Jul 11 '24 20:07 BenBE

Thinking of fetching debug symbols like GDB does (based on buildinfo data embedded in the binaries). But that's something for a later improvement.

That's a good idea too. I note that.

Also having addr module file line function+offs is a better order for the information (with unavailable items left blank).

I'm inspired by the output of GDB. I prefer this layout.

So now, it's look like : image

I have small improvements to do and I'll push the code. Is it better now?

MrCirdo avatar Jul 12 '24 22:07 MrCirdo

I'm inspired by the output of GDB. I prefer this layout.

So now, it's look like : image

I have small improvements to do and I'll push the code. Is it better now?

  1. There are unnecessary zero digits in the ADDRESS column. Perhaps it's good to give the column a dynamic width.
  2. What does the "+number" after the symbol name represents? Is it in lines or in bytes? Is it base 10 or base 16?

Explorer09 avatar Jul 13 '24 04:07 Explorer09

I'm inspired by the output of GDB. I prefer this layout. So now, it's look like : image I have small improvements to do and I'll push the code. Is it better now?

#NB should probably be Frame if you want to stick with the GDB terminology.

  1. There are unnecessary zero digits in the ADDRESS column. Perhaps it's good to give the column a dynamic width.

I think if we trim leading zeros for ADDRESS it's best to right-align that column. Furthermore I second that suggestion with the dynamic width.

  1. What does the "+number" after the symbol name represents? Is it in lines or in bytes? Is it base 10 or base 16?

The +number is the offset (in bytes) from the start of the function. Should be marked as hex by explicit 0x prefix IMO even though offsets are usually presented as hex when referring to memory addresses.

Okay, now the questions from my side:

  1. Do the colors have a special meaning? Like does the color of addresses encode memory protection stuff/mapping status?
  2. Given the function name can be quite large, wouldn't it be better to have it shown after the PATH column?
  3. Did you include the coloring for modules based on its deleted/obsolete status?
  4. The items in the menu bar should be much shorter. Demangle/Raw for the first and Full path/Basename fully suffice.
  5. Note that the shortcut for abbreviating the path names is p everywhere else.

BenBE avatar Jul 13 '24 16:07 BenBE

There are unnecessary zero digits in the ADDRESS column. Perhaps it's good to give the column a dynamic width.

I think if we trim leading zeros for ADDRESS it's best to right-align that column. Furthermore I second that suggestion with the dynamic width.

Yes there are a lot of zeros because my computer is 64-bit, so the address has 16 digits. However, you point out something interesting. What if htop is running on a 16-bit processor? And it's a good idea to have a dynamic column to handle this situation. Now I have multiple options to make this column dynamic. There is the right alignment (as you said @BenBE) but there is also zero-padded dynamic alignment. Let me try both.

What does the "+number" after the symbol name represents? Is it in lines or in bytes? Is it base 10 or base 16?

The +number is the offset (in bytes) from the start of the function. Should be marked as hex by explicit 0x prefix IMO even though offsets are usually presented as hex when referring to memory addresses.

Yes it's the offset in the functions. It's a good idea to show it with a hex representation. Fun fact GDB doesn't show it 😄

#NB should probably be Frame if you want to stick with the GDB terminology.

I'm concerned about the size of the name of this column. Because generally, there are less than 100 frames. And I can put FRAME or NB FRAME however, I'm afraid there will be a waste of space.

  1. Do the colors have a special meaning? Like does the color of addresses encode memory protection stuff/mapping status?

Euh no, I like colors so I tried to show the color that renders more beautiful and more visible.

  1. Given the function name can be quite large, wouldn't it be better to have it shown after the PATH column?

Yes, I can try to see what it would look like. In the GDB printing, it's at the end. This is why 😄

  1. Did you include the coloring for modules based on its deleted/obsolete status?

Euh no, but I'm not sure what do you want to say?

  1. The items in the menu bar should be much shorter. Demangle/Raw for the first and Full path/Basename fully suffice.

Yes It's definitely better.

MrCirdo avatar Jul 16 '24 12:07 MrCirdo

What if htop is running on a 16-bit processor?

Nope. htop doesn't support any 16-bit operating system at the moment, and it's unlikely to ever happen. You might be referring to 32-bit instead.

Euh no, I like colors so I tried to show the color that renders more beautiful and more visible.

The principle is, if there is any special information conveyed through color alone, it should be documented, and provide an alternative representation for monochrome terminals (in other words, think of colorblind accessibility).

Explorer09 avatar Jul 16 '24 12:07 Explorer09

#NB should probably be Frame if you want to stick with the GDB terminology.

I'm concerned about the size of the name of this column. Because generally, there are less than 100 frames. And I can put FRAME or NB FRAME however, I'm afraid there will be a waste of space.

We always can just call that column # or Frm, with # being the preferred one.

  1. Do the colors have a special meaning? Like does the color of addresses encode memory protection stuff/mapping status?

Euh no, I like colors so I tried to show the color that renders more beautiful and more visible.

If you do not have special meaning for a color you shouldn't be using one unnecessarily. It's not about being as colorful as possible, but to convey information in a concise and structured way.

  1. Did you include the coloring for modules based on its deleted/obsolete status?

Euh no, but I'm not sure what do you want to say?

Have a look at the options Highlight program basename and Highlight out-dated/removed programs / libraries in the settings dialog. Effect can be seen on the main process list in the COMMAND column.

BenBE avatar Jul 16 '24 13:07 BenBE