snabb icon indicating copy to clipboard operation
snabb copied to clipboard

Console-mode profiling in the RaptorJIT era

Open lukego opened this issue 7 years ago • 0 comments

Snabb v2019.01 will be the first release based on RaptorJIT. The good news is that this supports interactive graphical tooling with Studio. The bad news is that this does not support the old LuaJIT -jdump.

For the sake of the many lovers of plain text in the world I have extended Studio to support command-line decoding of Snabb shm directories into plain text dumps similar to -jdump. I hope that this will serve as a major upgrade to the old -jdump and make it much easier to profile and diagnose application performance.

Usage and profiling workflow example

Run Snabb and decode its shm folder into plain text files:

$ sudo SNABB_SHM_ROOT=~/shm SNABB_SHM_KEEP=1 ./snabb snabbmark basic1 1000e6
...

$ studio decode ~/shm/25003 ./dump
...
Finished.

$ ls dump
trace  vmprofile

$ ls dump/vmprofile
apps.basic.basic_apps.txt  engine.txt  program.txt

Inspect a profiler dataset:

$ cat dump/vmprofile/apps.basic.basic_apps.txt
Samples  | MCode    | VM       | GC       | Trace
   16860 | 100.0%   |   0.0%   |   0.0%   | trace 37 loop at (basic_apps)Tee:push:5
   14411 | 100.0%   |   0.0%   |   0.0%   | trace 39 loop at (basic_apps)Sink:push:3
   10667 | 100.0%   |   0.0%   |   0.0%   | trace 28 loop at (basic_apps)Source:pull:3
     574 | 100.0%   |   0.0%   |   0.0%   | trace 43 patch from (app)setvmprofile:2 to (basic_apps)Sink:push:4
     525 | 100.0%   |   0.0%   |   0.0%   | trace 23 patch from (app)setvmprofile:1 to (shm)resolve:2
     353 | 100.0%   |   0.0%   |   0.0%   | trace 46 patch from (app)with_restart:4 to (basic_apps)Tee:push:6
     347 | 100.0%   |   0.0%   |   0.0%   | trace 44 patch from (basic_apps)Sink:push:2 to (app)setvmprofile:2
     326 | 100.0%   |   0.0%   |   0.0%   | trace 52 patch from (app)with_restart:4 to (basic_apps)Source:pull:4
     296 | 100.0%   |   0.0%   |   0.0%   | trace 47 patch from (basic_apps)Source:pull:2 to (app)setvmprofile:2
     294 | 100.0%   |   0.0%   |   0.0%   | trace 48 patch from (basic_apps)Tee:push:4 to (app)setvmprofile:2
     153 | 100.0%   |   0.0%   |   0.0%   | trace 51 patch from (basic_apps)Sink:push:2 to (basic_apps)Sink:push:4
      37 | 100.0%   |   0.0%   |   0.0%   | trace 54 patch from (link)nreadable:3 to (basic_apps)Sink:push:4
      36 | 100.0%   |   0.0%   |   0.0%   | trace 53 patch from (link)nreadable:3 to (basic_apps)Tee:push:6
       5 |   0.0%   | 100.0%   |   0.0%   | trace 29 patch from (counter)add:1 to return
       2 | 100.0%   |   0.0%   |   0.0%   | trace 55 patch from (link)nreadable:3 to (basic_apps)Sink:push:4
       1 |   0.0%   | 100.0%   |   0.0%   | trace 27 loop at (histogram)clear:3

Inspect the "function call contour" of the hottest trace to understand which code paths it executes:

$ ls dump/trace/37
contour.txt  ir.txt  mcode.txt  name.txt

$ cat dump/trace/37/contour.txt 
trace 37 loop at (basic_apps)Tee:push:5
     (basic_apps)Tee:push:6 (apps/basic/basic_apps.lua:100)
         (link)receive:1 (core/link.lua:48)
             (counter)add:1 (core/counter.lua:91)
         (link)receive:7 (core/link.lua:54)
             (counter)add:1 (core/counter.lua:91)
         (link)receive:8 (core/link.lua:55)
     (basic_apps)Tee:push:7 (apps/basic/basic_apps.lua:101)
         (packet)clone:1 (core/packet.lua:158)
             (packet)allocate:1 (core/packet.lua:131)
                 (packet)freelist_nfree:1 (core/packet.lua:83)
             (packet)allocate:2 (core/packet.lua:132)
         (packet)from_pointer:2 (core/packet.lua:223)
             (main)<anonymous>:1 (core/main.lua:24)
                 (main)<anonymous>:2 (core/main.lua:25)
         (packet)append:3 (core/packet.lua:165)
     (basic_apps)Tee:push:9 (apps/basic/basic_apps.lua:103)
         (link)transmit:1 (core/link.lua:62)
             (link)full:1 (core/link.lua:81)
         (link)transmit:3 (core/link.lua:64)
             (counter)add:1 (core/counter.lua:91)
         (link)transmit:10 (core/link.lua:71)
             (counter)add:1 (core/counter.lua:91)
         (link)transmit:12 (core/link.lua:73)
     (basic_apps)Tee:push:8 (apps/basic/basic_apps.lua:102)
         (link)transmit:1 (core/link.lua:62)
             (link)full:1 (core/link.lua:81)
         (link)transmit:3 (core/link.lua:64)
             (counter)add:1 (core/counter.lua:91)
         (link)transmit:10 (core/link.lua:71)
             (counter)add:1 (core/counter.lua:91)
         (link)transmit:12 (core/link.lua:73)
     (basic_apps)Tee:push:8 (apps/basic/basic_apps.lua:102)

Limitations

Currently missing from the text representation:

  • Bytecode log for trace.
  • List of trace aborts.
  • ...?

Yell if it hurts and I'll try to improve this enough to fight off homesickness for -jdump.

lukego avatar Jan 16 '19 11:01 lukego