Inline code view proposal

Open gregtatum opened this issue 7 years ago • 3 comments

perf.html is currently pretty good at showing how long each function took. However, once a slow function is found, it is important to be able to dive in right away and figure out what is going on, and the tool currently offers no way to do that. The only option is to switch to other tools for further analysis: reading source code on SearchFox, searching through a codebase for known function names, finding a way to view assembly code, or, even worse for third-party JavaScript, opening a bundle and manually navigating to the slow code. perf.html should make this next step of analysis trivially easy.

C++/Rust inline code view

We have access to the build system and symbols. Profiler samples are collected as raw memory addresses, and it is possible to turn these back into source locations and raw assembly information. For a given function call, we can collect the time spent on each line of the symbolicated C++/Rust code, which reveals exactly where time is spent within the function. In addition, we can collect the time spent on the individual assembly instructions that were generated. Assembly could be viewed directly alongside the C++ source code via a drop-down panel (see Prior Art below). This is a hugely powerful tool that can reveal performance issues that would be very opaque and difficult to extract using custom measurements and third-party tools. If the infrastructure is put in place, this analysis would be a click away.
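As a rough illustration of the address-to-line step, here is a minimal TypeScript sketch. The `LineTableEntry` shape and the function names are hypothetical and not part of perf.html or the symbol server. It assumes the symbol data can be reduced to a table of address ranges sorted by start address, and that each sample contributes one address that falls inside the function being inspected.

```typescript
// Hypothetical line table entry: each entry covers addresses from its
// startAddress up to (but not including) the next entry's startAddress.
interface LineTableEntry {
  startAddress: number; // address relative to the start of the library
  line: number; // source line this range of machine code came from
}

// Binary search for the last entry whose startAddress <= address.
// Returns null when the address lies before the first entry.
// Note: a real table would also need an end address for the last range.
function lineForAddress(
  table: LineTableEntry[],
  address: number
): number | null {
  let lo = 0;
  let hi = table.length - 1;
  let found: number | null = null;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (table[mid].startAddress <= address) {
      found = table[mid].line;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}

// Count how many samples landed on each source line.
function samplesPerLine(
  table: LineTableEntry[],
  sampleAddresses: number[]
): Map<number, number> {
  const counts = new Map<number, number>();
  for (const address of sampleAddresses) {
    const line = lineForAddress(table, address);
    if (line === null) {
      continue; // address not covered by the table
    }
    counts.set(line, (counts.get(line) ?? 0) + 1);
  }
  return counts;
}
```

The same counting would work per instruction instead of per line by keying the map on the address itself.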

JavaScript sources

JavaScript sources could have a similar per-line view to show the cost of function internals. The profiler will need to be augmented with additional line and column information (some work is currently in progress). With this information, source fetching, and sourcemaps, we should be able to correlate timing information exactly with individual instructions. This would probably be more accurate than a per-line view: with the column information and some JS parsing, we can tell exactly where time is spent inside the actual symbols of the code. In addition, JIT invalidations could be displayed as long as the per-line information is included.

Implementation details

C++/Rust:

  • The symbol servers will need to emit all the correct line/column information for mapping to original sources.
  • We will need servers that can serve the assembly for official builds:
    • Use TaskCluster as our server, assuming we can get CORS headers.
    • There would be a job on TaskCluster that creates assembly text files for every build of Firefox. The resulting text file(s) would be stored in the TaskCluster DB.
    • Surface a path in the TaskCluster index that would make them easy to find.
    • Allow perf.html to fetch those assembly sources as needed.

JavaScript:

  • All JS frames (interpreter, baseline, ion) should include column and line information.
  • We will need to gather sources and sourcemaps. There are a few different approaches here.
    • Collect network markers that include sources (using the HAR format). There are some privacy implications here, and we would need to find a way to get source maps as well.
    • Bring sources over from the DevTools workflow.
    • Fetch sources and sourcemaps using the Gecko profiler addon. This is probably the most straightforward approach, although the fetched sources could differ from what was actually running when the profile was captured, and then the results would be lying to you.
  • Ensure that JIT invalidations include per-line information.

Front-end work:

A React component will need to be made that can view sources. This should be pretty straightforward to set up by wrapping CodeMirror, as it has APIs to display sources nicely, and a column interface that we could hook information into.

There should be no reason to overload existing DevTools such as the debugger with this information; keeping the view inside perf.html lowers the complexity of the implementation. It will also allow for quick context switches within the tool. Any function location or stack should be able to link directly to the source from that context. Implementing the assembly view inside of the C++ view could be a bit trickier, since it means working around CodeMirror's internals, but it's a solvable problem.

Also, we have plenty of screen real estate, so it would be nice to include as many columns of information as necessary, such as % of overall time and % of function time.
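To make the two percentage columns concrete, here is a small TypeScript sketch. The `LineRow` shape and `buildLineRows` are hypothetical names, not existing perf.html code. It assumes per-line sample counts for one function are already available and that the total number of samples in the profile (or in the current selection) is known.

```typescript
// Hypothetical row for the source view: one row per source line.
interface LineRow {
  line: number;
  sampleCount: number;
  percentOfFunction: number; // share of the samples inside this function
  percentOfTotal: number; // share of all samples in the profile
}

function buildLineRows(
  lineCounts: Map<number, number>,
  totalSamples: number
): LineRow[] {
  // Sum of all samples attributed to this function.
  let functionSamples = 0;
  lineCounts.forEach((count) => {
    functionSamples += count;
  });

  const rows: LineRow[] = [];
  lineCounts.forEach((count, line) => {
    rows.push({
      line,
      sampleCount: count,
      // Guard against division by zero for empty inputs.
      percentOfFunction:
        functionSamples > 0 ? (count * 100) / functionSamples : 0,
      percentOfTotal: totalSamples > 0 ? (count * 100) / totalSamples : 0,
    });
  });

  // Present rows in source order.
  rows.sort((a, b) => a.line - b.line);
  return rows;
}
```

Each row could then feed one line of the column display next to the source text.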

Prior Art

Chrome's profiler

Click and go to the debugger with a new column of information:

image

Assembly view

This article has an excellent write-up

Notice the column information: image

Assembly mixed in with C++ code: image

OS X Shark profiler

https://ccrma.stanford.edu/~chanson9/tutorials/shark/shark.html

image

gregtatum avatar Feb 28 '18 19:02 gregtatum

> We will need servers that can serve the assembly for official builds

Wondering if we could have that locally through the addon.

(no other comments, all this sounds quite good actually)

julienw avatar Mar 09 '18 13:03 julienw

> > We will need servers that can serve the assembly for official builds
>
> Wondering if we could have that locally through the addon.

We could, but it would only work if you're the one who captured the profile, not if you're viewing somebody else's shared profile.

mstange avatar Mar 09 '18 16:03 mstange

> We will need servers that can serve the assembly for official builds

The binaries for Windows are already served (e.g. at https://symbols.mozilla.org/xul.dll/6132B96B70FD000/xul.dl_ , with 6132B96B70FD000 being the code ID, which is written out in xul.pdb's .sym file), and Ted has filed bug 1429871 about serving Mac + Linux binaries as well. We could have perf.html download the entire binary and extract the assembly from it, for example using capstone.js.

Edit: Having the front-end download the entire file doesn't seem very realistic - it would be slow to download, slow to unpack, and annoying to manage caching and expiration. But we could add functionality to the symbolication API to do the downloading and caching.
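For reference, the URL layout in the example above could be built like this in TypeScript. This is only a sketch: `compressedBinaryUrl` is a hypothetical helper, and the rule of replacing the last character of the file name with an underscore is inferred from the single xul.dll example, so it is an assumption that it holds for every binary on the server.

```typescript
// Hypothetical helper. The base URL comes from the example above; the
// "last character becomes an underscore" rule is inferred from that one
// example and may not apply to every file.
const SYMBOL_SERVER_BASE = "https://symbols.mozilla.org";

function compressedBinaryUrl(fileName: string, codeId: string): string {
  if (fileName.length === 0) {
    throw new Error("fileName must not be empty");
  }
  // e.g. "xul.dll" -> "xul.dl_"
  const compressedName = fileName.slice(0, -1) + "_";
  return [
    SYMBOL_SERVER_BASE,
    encodeURIComponent(fileName),
    encodeURIComponent(codeId),
    encodeURIComponent(compressedName),
  ].join("/");
}
```

Whichever side does the download (front-end or symbolication API), the code ID would be the natural cache key, since it identifies one specific build of the binary.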

mstange avatar May 11 '18 02:05 mstange