runtime_events_tools
Perform table-based translation of runtime event names and tags
This PR implements translation of runtime event names (and tags) using tables loaded at runtime, in order to allow consumption of runtime events from different versions of OCaml. It also adds a new command `olly{,_bare} gen-tables` to generate the table file from the `caml/runtime_events.h` C header file.
Currently, when Olly profiles a program compiled with a newer (or older) version of OCaml than Olly was compiled for, two bugs may occur:
- `olly trace` generates nonsensical names for the slices[^1]
- `olly gc-stats` silently generates garbage output if any of the runtime events it matches on have changed
This occurs because runtime events are read from a `mmap`ped ring buffer using C with absolutely zero regard for the version of the OCaml runtime it was produced with (and there is no way to check in the first place), and the integers written to the ring buffer are directly interpreted as elements of the enumerations in `runtime_events.ml`. Since runtime events may have been inserted in arbitrary positions in the enum (e.g. https://github.com/ocaml/ocaml/pull/12923), the integer values written into the ring buffer may differ from those that our version should interpret them as, leading to completely wrong `Runtime_events.(lifecycle, runtime_phase, and runtime_counter)` values.
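To make the failure mode concrete, here is a minimal sketch of what happens when the writer's enum has one extra entry in the middle. The event names and both tables are made up for illustration; they are not the real `Runtime_events` constructors, and the real reader does this in C rather than with arrays.

```ocaml
(* Hypothetical phase names as numbered by a newer runtime (the writer). *)
let writer_phases = [| "major"; "new_phase"; "minor"; "compact" |]

(* Hypothetical phase names as numbered by the reader's (older) build. *)
let reader_phases = [| "major"; "minor"; "compact" |]

(* Bounds-checked lookup of a raw integer in a name table. *)
let lookup (names : string array) (raw : int) : string option =
  if raw >= 0 && raw < Array.length names then Some names.(raw) else None

(* What the reader effectively does today: trust the raw integer and
   interpret it against its own numbering. *)
let naive_name (raw : int) : string option = lookup reader_phases raw

(* What the writer actually meant by that integer. *)
let intended_name (raw : int) : string option = lookup writer_phases raw
```

With these tables, the raw value `2` means `minor` to the writer but is read back as `compact`, and the raw value `3` has no meaning at all on the reader's side.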
Here is an annotated image of how the issue manifests and is fixed, using Olly built on `5.1.0` currently (on the left) and performing name translation (on the right):
Olly Perfetto Mistranslated Names Demo
`ring_pause` (seen on the left) isn't even a `runtime_phase` event name, but a `lifecycle` event name! Something has gone terribly wrong...
The implementation is in two parts:
- Firstly, I implement a library `olly_rte_shim` to unify all the different types of runtime events (`runtime_phase`, `runtime_counter`, `lifecycle`, `alloc`, and custom events) into a single manageable `event` type.
  - I had made the `Event.t` type for `olly trace` to abstract the name/argument extraction of events from the trace format backends
  - This is a generalisation (and replacement) of the `Event.t` type, which now also maintains the "tag" of the event (rather than just forgetting it for the name), i.e. the kind of `runtime_phase`, `runtime_counter`, custom event, etc. it is.
  - This is also useful for using alternate sources of runtime events, for instance I have an implementation that allows saving and replaying the full event trace to a text file here https://github.com/eutro/runtime_events_tools/pull/2, which could even be streamed over the network externally to `olly`
- Secondly, I implement table-based translation of event names and tags
  - This includes the new command `olly gen-tables`, which creates a YAML[^2] or OCaml file; the former can be read at runtime, and the latter is linked into Olly to generate the builtin table of events existing in this version
  - It also adds a new `--table` (no short form [yet?]) option to both `olly gc-stats` and `olly trace` to load a source table from a file
  - Event names are translated straightforwardly by using the integer value of the enums (via `Obj.magic`) to index the tables
  - Event tags are translated slightly less straightforwardly, by computing integer arrays from matching the indices of the names in the source/destination tables, and then converting the integer values (via `Obj.magic`) to the corresponding event tag enumerations
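The two translation steps can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the `phase` type, the table contents, and all function names are hypothetical, and the sketch converts indices to tags with an explicit `match` where the PR uses `Obj.magic`.

```ocaml
(* Hypothetical destination tag type, standing in for runtime_phase. *)
type phase = Major | Minor | Compact

(* Explicit index-to-tag conversion (the PR uses Obj.magic here). *)
let phase_of_index : int -> phase option = function
  | 0 -> Some Major
  | 1 -> Some Minor
  | 2 -> Some Compact
  | _ -> None

(* Destination table: names known to this build (made up for illustration). *)
let dest_names = [| "major"; "minor"; "compact" |]

(* Source table: names as numbered by the profiled program's runtime. *)
let source_names = [| "major"; "new_phase"; "minor"; "compact" |]

(* Position of [name] in [names], if present. *)
let index_of (names : string array) (name : string) : int option =
  let rec go i =
    if i >= Array.length names then None
    else if String.equal names.(i) name then Some i
    else go (i + 1)
  in
  go 0

(* remap.(i) is the destination index for source index i, or -1 when the
   destination table has no event with that name. *)
let build_remap ~(source : string array) ~(dest : string array) : int array =
  Array.map
    (fun name -> match index_of dest name with Some j -> j | None -> -1)
    source

(* Name translation: index the source table with the raw integer. *)
let translate_name ~(source : string array) (raw : int) : string option =
  if raw >= 0 && raw < Array.length source then Some source.(raw) else None

(* Tag translation: remap the raw integer, then convert it to a tag.
   An unmapped event (-1) falls through to None in phase_of_index. *)
let translate_tag ~(remap : int array) (raw : int) : phase option =
  if raw >= 0 && raw < Array.length remap then phase_of_index remap.(raw)
  else None
```

In this sketch an event that exists only in the source table still gets a correct name, but has no tag in the destination enumeration, so `translate_tag` returns `None` for it.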
[^1]: and could, I believe, possibly crash, under bad circumstances, though I haven't seen it happen
[^2]: it's just three lines of `kind: [event_name,...]` (and can only be parsed from that subset); having it be valid YAML means other tools can potentially use it, e.g. for diffing as JSON
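For a sense of how little parsing that subset needs, here is a minimal sketch that reads one `kind: [event_name,...]` line. It is not the parser in this PR; the function name is hypothetical, and it assumes unquoted names with no commas or brackets inside them.

```ocaml
(* Parse a single "kind: [a, b, c]" line into (kind, names).
   Returns None if the line is not in that shape. *)
let parse_table_line (line : string) : (string * string list) option =
  match String.index_opt line ':' with
  | None -> None
  | Some colon ->
    let kind = String.trim (String.sub line 0 colon) in
    let rest =
      String.trim
        (String.sub line (colon + 1) (String.length line - colon - 1))
    in
    let n = String.length rest in
    if n >= 2 && rest.[0] = '[' && rest.[n - 1] = ']' then
      (* Split the bracketed body on commas and drop empty entries. *)
      let names =
        String.sub rest 1 (n - 2)
        |> String.split_on_char ','
        |> List.map String.trim
        |> List.filter (fun s -> s <> "")
      in
      Some (kind, names)
    else None
```

Because each line is also a valid YAML flow sequence under a key, a general YAML tool should be able to read the same file, even though this sketch only accepts the narrow form.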