
The flame graph is displayed abnormally

Open ktxdgtx opened this issue 5 months ago • 15 comments

We introduced Photon and intended to capture flame graphs using perf to analyze some performance issues. However, the flame graph stacks captured look quite strange. There are many "unknown" stacks, and even when the stacks are visible, the call relationships seem odd.

unknown stack:

Image

stack looks so odd

Image

ktxdgtx avatar Aug 04 '25 10:08 ktxdgtx

For a photon thread created by photon::thread_create/photon::thread_create11, the top-most function in its call stack will have the function name "unknown".

Likewise, for a std::thread, the top-most function is usually clone.

beef9999 avatar Aug 04 '25 12:08 beef9999

As for the high proportion of [unknown], can you first try perf top -p <pid> with your program, and make sure you have built Photon with debug info (CMake)?
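
A minimal sketch of those two checks. The build directory name (build), the RelWithDebInfo build type, and the run helper are illustrative assumptions, not settings confirmed in this thread; -S/-B require CMake 3.13 or newer.

```shell
#!/bin/sh
# Sketch only. Assumptions: the source tree is the current directory,
# the build directory is ./build, and CMake is 3.13+ (for -S/-B).

# Hypothetical helper: echo each command by default; set DRY_RUN=0 to execute.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# RelWithDebInfo keeps optimizations but adds -g, so perf can map
# sampled addresses back to symbol names.
run cmake -S . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo
run cmake --build build

# Live view of the hottest symbols in the running process.
run perf top -p "${PID:-<pid>}"
```

By default the script only prints the commands so they can be reviewed; running it with DRY_RUN=0 and PID set to the target process executes them.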

beef9999 avatar Aug 04 '25 12:08 beef9999

Thanks, we compile Photon with the compiler flags -g -ggdb3 -fno-omit-frame-pointer.

With perf top, I can see a more normal performance dashboard. However, is there a way to generate a reasonably normal flame graph? The method we more commonly use to analyze performance issues is perf record + FlameGraph.

ktxdgtx avatar Aug 04 '25 14:08 ktxdgtx

The problem is caused by Thread11 storing the function and its parameters at the bottom of the stack, which makes it not look like a normal thread stack.

There may be a solution (such as adding some separating space between the parameters and the stack frame). We will try to fix that.

Coldwings avatar Aug 05 '25 02:08 Coldwings

Does ‘Thread11’ here refer to the thread created via thread_create11?

dusteye avatar Aug 05 '25 03:08 dusteye

Yes, but I don't think that's the reason. We are still investigating the problem.

lihuiba avatar Aug 05 '25 14:08 lihuiba

Agreed. We do not use thread_create11 and still reproduce the same problem, while Boost fiber works just fine.

dusteye avatar Aug 06 '25 10:08 dusteye

Boost.Context also has a 0x0000000000000000 in ?? () frame in gdb, which I believe corresponds to the [unknown] in perf.

Breakpoint 1, 0x000000000040bcd0 in jump_fcontext ()
(gdb) bt
#0  0x000000000040bcd0 in jump_fcontext ()
#1  0x000000000040318f in void boost::context::detail::fiber_entry<boost::context::detail::fiber_record<boost::context::fiber, boost::context::basic_fixedsize_stack<boost::context::stack_traits>, test_move()::{lambda(boost::context::fiber&&)#1}> >(boost::context::detail::transfer_t) ()
#2  0x000000000040bcbf in make_fcontext ()
#3  0x0000000000000000 in ?? ()
(gdb)

What does the flame graph look like?

lihuiba avatar Aug 09 '25 11:08 lihuiba

Sorry for the misleading statement. I have checked our implementation and, as you said, there is no problem with how the stack bottom is built.

The problem is not the existence of an [unknown] frame, but the [unknown] frames piling on top of each other like one huge stack. Boost.Fiber also shows an [unknown] frame, but only a single one in the flame graph.

Coldwings avatar Aug 11 '25 02:08 Coldwings

It turns out that recording with perf using

perf record --call-graph dwarf ....

works well and records the correct call stack. With -g, perf resolves the call graph using frame pointers (fp), which leads to the weird result here.

With dwarf mode resolving the call graph, the resulting flame graph is correct:

Image

This is a temporary workaround. We will look for a better solution that also works with -g or --call-graph fp.
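
A sketch of the full perf record + FlameGraph workflow with DWARF unwinding, under stated assumptions: FLAMEGRAPH_DIR is a hypothetical path to a local checkout of the FlameGraph scripts (stackcollapse-perf.pl and flamegraph.pl), the 99 Hz sampling rate and 30-second duration are arbitrary choices, and record_args and profile are helper names invented for this example.

```shell
#!/bin/sh
# Sketch only. Assumptions: FLAMEGRAPH_DIR points at a checkout of the
# FlameGraph scripts, and PID is the process to profile.
FLAMEGRAPH_DIR="${FLAMEGRAPH_DIR:-./FlameGraph}"
MODE="${MODE:-dwarf}"        # dwarf = unwind via debug info; fp = frame pointers
DURATION="${DURATION:-30}"   # seconds to sample

# Hypothetical helper: build the perf record arguments for a mode, PID, duration.
record_args() {
    printf '%s\n' "-F 99 --call-graph $1 -p $2 -o perf.data -- sleep $3"
}

# Record samples, then fold the stacks and render the SVG.
profile() {
    # Word splitting of the argument string is intended here.
    perf record $(record_args "$MODE" "$1" "$DURATION") &&
    perf script -i perf.data |
        "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" |
        "$FLAMEGRAPH_DIR/flamegraph.pl" > flame.svg
}

# Only profile when a PID is supplied; otherwise just show the command.
if [ -n "${PID:-}" ]; then
    profile "$PID"
else
    echo "perf record $(record_args "$MODE" '<pid>' "$DURATION")"
fi
```

Running it with PID set writes flame.svg to the current directory; setting MODE=fp switches to frame-pointer unwinding for comparison. Note that dwarf mode copies a chunk of the stack with every sample, so perf.data tends to be considerably larger than with fp.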

Coldwings avatar Aug 11 '25 03:08 Coldwings

In PR #937, we have tried to set up the stack bottom with a properly set frame pointer. Now, when built with -fno-omit-frame-pointer, perf with -g is able to generate a correct flame graph without [unknown] frames or stacked [unknown] parts, as shown below:

Image

(The [unknown] parts in this graph are symbols from libgtest)

It should work in all versions from 0.6 onward; still, we do not want to make critical changes to stable versions.

@ktxdgtx @dusteye You can also try this patch on your working branch.

Coldwings avatar Aug 11 '25 09:08 Coldwings

Great! I'll try it.

dusteye avatar Aug 11 '25 13:08 dusteye

Well done! It seems the display is normal now.

ktxdgtx avatar Aug 11 '25 14:08 ktxdgtx

It works fine on x86_64 now.

dusteye avatar Aug 12 '25 02:08 dusteye

@dusteye @ktxdgtx are you working on the same project now?

lihuiba avatar Aug 12 '25 05:08 lihuiba