tracer icon indicating copy to clipboard operation
tracer copied to clipboard

Tracing profiler for Common Lisp

#+title: Tracer

[[file:screenshot.png]]

  • Introduction

A tracing profiler for SBCL, with output suitable for display in Chrome's/Chromium's Tracing Viewer (chrome://tracing).

NOTE: in its current form, it's a pile of hacks altering the behavior of SBCL's implementation of Common Lisp's =TRACE= facility. Therefore it works only on SBCL - and I only tested it on SBCL 1.5.0 (AMD64, Linux). Help wanted in making it as portable as possible.

UPDATE: There's also an [[http://jacek.zlydach.pl/blog/2020-02-04-introducing-tracer-a-tracing-profiler-for-common-lisp.html][associated blog post]] with some thoughts about tracing profilers in general, and this one in particular.

=Tracer= is designed to be used in "drop-in" mode, i.e. no code in the profiled project needs to be modified to achieve its primary functionality. In the future, extended functionality may require adding some constructs to the profiled code.

=Tracer= is a tracing profiler - i.e. it collects information not just what was called and how often, but records actual call sequences, optionally with input and output arguments.

  • Use example The interface is currently highly unstable and subject to change at a whim. Future versions will not be backwards compatible.

    The main entry point is the macro =with-tracing=, which takes two arguments - a list of things to trace, and the body of code to profile. The list of things to trace follows the same convention as [[http://www.sbcl.org/manual/#Function-Tracing][SBCL's implementation]] of the built-in [[http://clhs.lisp.se/Body/m_tracec.htm][TRACE]] macro: each element "may be a symbol, denoting an individual function, or a string, denoting all functions fbound to symbols whose home package is the package with the given name".

    To collect traces of a REPL-bound calculation: #+BEGIN_SRC lisp (tracer:with-tracing ("MY-PROJECT" other-package/some-function) (my-project:some-long-operation)) #+END_SRC

    To collect traces from all running threads, e.g. to trace how a web application responds to HTTP requests: #+BEGIN_SRC lisp (tracer:with-tracing ("MY-PROJECT" "MY-WEBSERVER") (sleep 20)) #+END_SRC

    In complex projects, it may be useful to define a helper function:

    #+BEGIN_SRC lisp (defun interesting-traceables () `("MY-PROJECT" "MY-PROJECT/CORE" ,@(tracer:package-symbols-except "MY-PROJECT/OBJECTS" "SOME-ACCESSOR" "OTHER-ACCESSOR") jsown:to-json jsown:to-json* "SQLITE" "UIOP/RUN-PROGRAM" "CLACK.HANDLER")) #+END_SRC

    (=PACKAGE-SYMBOLS-EXCEPT= is provided by =Tracer= to make it easier to trace everything in a package except a specified blacklist of functions.)

    And use it like this (note the use of =#.= for read-time evaluation): #+BEGIN_SRC lisp (tracer:with-tracing #.(interesting-traceables) (my-project:some-long-operation)) #+END_SRC

    After that's done, in order to save the report to a file invoke: #+BEGIN_SRC lisp (tracer:save-report "report.json") #+END_SRC

    You can now view the results in Google Chrome or Chromium, by visiting [[chrome://tracing]] and loading the saved report with "Load" button.

    Sadly, there isn't much good documentation on how to use the tracing viewer UI. Best I can recommend is [[http://www.chromium.org/developers/how-tos/trace-event-profiling-tool/trace-event-reading#TOC-Navigating-the-Tracing-View][this]]. Fortunately, traces generated by =Tracer= use only the most basic features, so things should be self-evident.

    (Also note that Chrome's trace viewer is not without its own issues; in particular, you may need to load the report multiple times before it gets processed and displayed.)

** On storing input arguments and return values of function calls In order to minimize the size of the resulting trace (as well as maximize timing accuracy), by default saving arguments and return values of traced functions is disabled. The API for selecting what values to store and when is in flux. Currently, you can set a global default by installing appropriate function as the value of =DEFAULT-ARG-CONVERTER=, or specify a per-symbol policy using the =TRACING-ARG-CONVERTERS= hash table. See [[file:src/trace-event.lisp][trace-event.lisp]] for details. As an example, you can do:

#+BEGIN_SRC lisp (setf tracer::default-arg-converter tracer::+arg-converter-store-only-simple-objects-and-strings+) (tracer:with-tracing #.(interesting-traceables) (my-project:some-long-operation)) (tracer:save-report "report.json" :skip-args nil) #+END_SRC

in order to remember numbers, characters, booleans, symbols and strings. The fastest and most dangerous of current arg converters is =+ARG-CONVERTER-PASSTHROUGH+=, which records /all/ inputs and return values "by reference", without re-encoding them. The problem is, if any of such recorded values gets changed later, the trace will show these changes instead of the values at the time of the call. In extreme cases - such as use of destructive operations - this may damage or invalidate the recorded object, causing a condition to be signalled when attempting to print it.

The current interface is pretty crude and user-unfriendly, but it will be improved in the future.

  • Known issues =SAVE-REPORT= used with =:SKIP-ARGS NIL= may cause an error if =DEFAULT-ARG-CONVERTER= is left at its default value of =+ARG-CONVERTER-IGNORE-ALL+=. That's because the API for deciding what parameters to collect is in flux, and =:SKIP-ARGS= is a part of the old API.

  • Implementation notes =Tracer= works by abusing =TRACE= feature to time and record calls instead of printing them to =TRACE-OUTPUT=. To do so, it dynamically replaces a bunch of SBCL's internals implementing the =TRACE= functionality, and restores original implementations when done recording. Recording is protected by =UNWIND-PROTECT=, but if anything goes wrong, you can call =TRACER:STOP-TRACING= yourself just to be sure.

    =Tracer= also currently binds directly to SBCL's interface to [[http://man7.org/linux/man-pages/man2/gettimeofday.2.html][gettimeofday]] to provide a microsecond-resolution clock. This is a cludge that may or may not work on systems other than Linux (I think it should). This is an improvement over the initial approach, which was to use [[http://www.lispworks.com/documentation/HyperSpec/Body/f_get_in.htm][GET-INTERNAL-REAL-TIME]], hoping for millisecond resolution, and then to introduce jitter so that all traced samples are at least 1 microsecond long. The jittered maybe-millisecond-accurate clock is still available; to use it, do: #+BEGIN_SRC lisp (setf tracer::clock-reset-fun (lambda () (setf tracer::hack-clock-jitter 0)) tracer::clock-get-time-fun #'tracer::get-current-time-usec) #+END_SRC

    It's unknown how much damage is caused when =Tracer= is run when multiple threads are already running and have their functions wrapped by tracing infrastructure mid-flight. It might be a good idea to encourage/support running =Tracer= before most of the traced application is initialized. One way is to make tracing conditioned on a global variable, letting it run continuously during execution of the application, and only start saving traces when the flag gets flipped.

    (Really might want to consider doing it. Passive tracing guarded by flag would work well with in-code counters and block markers and whatever similar things may come up. But then again, that would mean having to define which functions are being traced at startup.)

    It might me just being a bit oversensitive, but I'm starting to have feelings that sometimes (rarely, but sometimes), some calls get missed by =Tracer=. I've started to grow these suspicions after looking at database calls in one of the applications I worked on; sometimes it seems that the actual CFFI calls are missing (but perhaps I misssed them in the trace; some of them are extremely fast).

    This is all a bunch of ugly hacks and I'm not proud of them. Except I kind of am. Despite its kludgy nature, it really does work well.

  • Planned features

    • Support for counters - adding information about memory use, open database connections, etc. Possibly in "auto-polling" mode, to keep the interface non-invasive. These would have their own specific display in Chrome's tracing viewer.
    • Support for connecting call constructs across threads (e.g. marking that a given =HTTP-REQUEST= is related to a handler function invocation in a different thread, visualized by arrows connecting them on the trace graph).
    • Tracking thread life time (to mark when a given thread was actually created or stopped).
    • More actual documentation.
    • Binding to some high-resolution clock that's both portable and doesn't require a third-party library. Currently, =Tracer= doesn't need a library, but the solution isn't portable.
    • API for reporting events to be put on the trace, e.g. to show when the user pressed a key or clicked a button, etc.
    • Actual API for safely and efficiently saving input arguments and return values of traced functions.
    • Handling of signals/conditions - currently exceptional exits break rendering of the trace report.
    • GZIPping saved reports to cut down on their size.
      • This should really be handled by external dependency; best we can probably offer is a way to output to stream, in such a way one can pass it to compression.
  • Dev notes for future changes (Not relevant to use of =Tracer=.)

    • RE clock jitter hack - it should be possible to remove its influence by going over traces in order they were recorded, counting the jitter that was applied and removing anything except the "+1 us for events shorter than clock resolution" fix. But alternatively, I could just not add jitter in the first place, and post-process saved samples by going over them and applying a per-thread offset to them; offset that increases for tasks with recorded length of 0!

    • Speaking of post-processing, here's an idea: how much of the analysis that Chrome does can I do myself? Can I tell, for any call, how much wall-time vs. self-time it has?

    • Anyways; what I'd like to have is a way to say, "what if calls to X took 10% of their time?", or "what if calls to X that are longer than Y took Y?", or in general, "what if calls to X with properties Y were Z(X, Y) long?". I'd like to generate an alternative, filtered trace from that, for side-by-side comparison.

    • Storage: I'm wondering about pros and cons of replacing a list of lists with a fixed-memory array of structs. That would definitely help reduce the memory load, perhaps even improving recording performance (but I'm not sure about this; consing is fast).

    • Recording arguments:

      • There's no good way to print an arbitrary object to string if there's a chance that the printing function will be traced, and its arguments will be printed too. This sounds like a recipe for an infinite loop.
      • I could perhaps work around this with some dynamic flags.
      • As a default, I'd like to move towards not even capturing arguments. But I want to capture some arguments - e.g. queries for database calls are interesting.
      • Could I instead provide a selector that could be used to optionally capture arguments for a given function(s)? Could it be provided to with-tracing macro directly? E.g. instead of ='foo=, I'd say ='(foo :trace-args t)=.