Console-mode profiling in the RaptorJIT era
Snabb v2019.01 will be the first release based on RaptorJIT. The good news is that this supports interactive graphical tooling with Studio. The bad news is that this does not support the old LuaJIT -jdump.
For the sake of the many lovers of plain text in the world I have extended Studio to support command-line decoding of Snabb shm directories into plain text dumps similar to -jdump. I hope that this will serve as a major upgrade to the old -jdump and make it much easier to profile and diagnose application performance.
Usage and profiling workflow example
Run Snabb and decode its shm folder into plain text files:
$ sudo SNABB_SHM_ROOT=~/shm SNABB_SHM_KEEP=1 ./snabb snabbmark basic1 1000e6
...
$ studio decode ~/shm/25003 ./dump
...
Finished.
$ ls dump
trace vmprofile
$ ls dump/vmprofile
apps.basic.basic_apps.txt engine.txt program.txt
Inspect a profiler dataset:
$ cat dump/vmprofile/apps.basic.basic_apps.txt
Samples | MCode | VM | GC | Trace
16860 | 100.0% | 0.0% | 0.0% | trace 37 loop at (basic_apps)Tee:push:5
14411 | 100.0% | 0.0% | 0.0% | trace 39 loop at (basic_apps)Sink:push:3
10667 | 100.0% | 0.0% | 0.0% | trace 28 loop at (basic_apps)Source:pull:3
574 | 100.0% | 0.0% | 0.0% | trace 43 patch from (app)setvmprofile:2 to (basic_apps)Sink:push:4
525 | 100.0% | 0.0% | 0.0% | trace 23 patch from (app)setvmprofile:1 to (shm)resolve:2
353 | 100.0% | 0.0% | 0.0% | trace 46 patch from (app)with_restart:4 to (basic_apps)Tee:push:6
347 | 100.0% | 0.0% | 0.0% | trace 44 patch from (basic_apps)Sink:push:2 to (app)setvmprofile:2
326 | 100.0% | 0.0% | 0.0% | trace 52 patch from (app)with_restart:4 to (basic_apps)Source:pull:4
296 | 100.0% | 0.0% | 0.0% | trace 47 patch from (basic_apps)Source:pull:2 to (app)setvmprofile:2
294 | 100.0% | 0.0% | 0.0% | trace 48 patch from (basic_apps)Tee:push:4 to (app)setvmprofile:2
153 | 100.0% | 0.0% | 0.0% | trace 51 patch from (basic_apps)Sink:push:2 to (basic_apps)Sink:push:4
37 | 100.0% | 0.0% | 0.0% | trace 54 patch from (link)nreadable:3 to (basic_apps)Sink:push:4
36 | 100.0% | 0.0% | 0.0% | trace 53 patch from (link)nreadable:3 to (basic_apps)Tee:push:6
5 | 0.0% | 100.0% | 0.0% | trace 29 patch from (counter)add:1 to return
2 | 100.0% | 0.0% | 0.0% | trace 55 patch from (link)nreadable:3 to (basic_apps)Sink:push:4
1 | 0.0% | 100.0% | 0.0% | trace 27 loop at (histogram)clear:3
Inspect the "function call contour" of the hottest trace to understand which code paths it executes:
$ ls dump/trace/37
contour.txt ir.txt mcode.txt name.txt
$ cat dump/trace/37/contour.txt
trace 37 loop at (basic_apps)Tee:push:5
(basic_apps)Tee:push:6 (apps/basic/basic_apps.lua:100)
(link)receive:1 (core/link.lua:48)
(counter)add:1 (core/counter.lua:91)
(link)receive:7 (core/link.lua:54)
(counter)add:1 (core/counter.lua:91)
(link)receive:8 (core/link.lua:55)
(basic_apps)Tee:push:7 (apps/basic/basic_apps.lua:101)
(packet)clone:1 (core/packet.lua:158)
(packet)allocate:1 (core/packet.lua:131)
(packet)freelist_nfree:1 (core/packet.lua:83)
(packet)allocate:2 (core/packet.lua:132)
(packet)from_pointer:2 (core/packet.lua:223)
(main)<anonymous>:1 (core/main.lua:24)
(main)<anonymous>:2 (core/main.lua:25)
(packet)append:3 (core/packet.lua:165)
(basic_apps)Tee:push:9 (apps/basic/basic_apps.lua:103)
(link)transmit:1 (core/link.lua:62)
(link)full:1 (core/link.lua:81)
(link)transmit:3 (core/link.lua:64)
(counter)add:1 (core/counter.lua:91)
(link)transmit:10 (core/link.lua:71)
(counter)add:1 (core/counter.lua:91)
(link)transmit:12 (core/link.lua:73)
(basic_apps)Tee:push:8 (apps/basic/basic_apps.lua:102)
(link)transmit:1 (core/link.lua:62)
(link)full:1 (core/link.lua:81)
(link)transmit:3 (core/link.lua:64)
(counter)add:1 (core/counter.lua:91)
(link)transmit:10 (core/link.lua:71)
(counter)add:1 (core/counter.lua:91)
(link)transmit:12 (core/link.lua:73)
(basic_apps)Tee:push:8 (apps/basic/basic_apps.lua:102)
Limitations
Currently missing from the text representation:
- Bytecode log for trace.
- List of trace aborts.
- ...?
Yell if it hurts and I'll try to improve this enough to fight off homesickness for -jdump.