wazero icon indicating copy to clipboard operation
wazero copied to clipboard

Add hooks to allow trace logging

Open codefromthecrypt opened this issue 2 years ago • 11 comments

Even if only in interpreter mode, it would be helpful for someone to be able to trace by logging or call graph. This could be initially limited to entry-exit on host function boundaries, and possibly extended to allow tracing of each call or call_indirect even if solely implemented in wasm.

The hook should:

  • have a context.Context parameter (allows propagation of a correlation ID)
  • have a flag about the type of invocation it is (call root, wasm or host, indirect or not)
  • intercept around a function call (allows timing and value captures, as well mutating the context)

The hook should be allocated up-front with the following parameters:

  • the function debug name (module.function name)
  • the function signature (ideally parameter names)

The hook should not:

  • have a reference to something that can hold up GC
  • include any 3rd party library dependency

To ensure implementors can make more robust implementation, we can have a proof of concept in the test directory showing how it can be used in basic form.

Particularly, this proof of concept could on each call to an exported function, add the following k/v to messages outputted to console.

  • correlation ID scoped to the originating function call (random hex 64 or 128bit.. doesn't matter)
  • function DebugName (module.function name)
  • monotonic time

Note: We don't need to do anything fancy like span IDs in the proof of concept as nothing is async in wasm function invocation.

With this in place a more featureful logger could be made, which optionally dumps parameter values.

codefromthecrypt avatar Apr 11 '22 01:04 codefromthecrypt

Did some random hacking on this as a dive into the codebase and came up with this hacky POC.

https://github.com/tetratelabs/wazero/compare/main...anuraaga:testing

Some initial notes to remember

  • We currently only allow configuring wasm features, but not engine features orthogonal to wasm, like an interceptor. Just some additional wiring
  • InstantiateModuleFromCodeWithConfig probably needs to be allowed a Context because it can trigger a .start. Currently there doesn't seem to be a way to trace .start without the wazero initialization ending up in it. I see that .Call() maybe has the ability to pass a context in as part of the Module when calling functions directly that aren't .start, but I'm not sure on that yet

anuraaga avatar Apr 18 '22 08:04 anuraaga

interesting and good progress.

Your code made me notice that how things are called is engine-specific due to how the stacks work. We can continue with the interceptor pattern or go with a start/finish hook thing depending on what you think works best. Of course the onStart hook can accept a context and return an updated one to use for real. (it can also take the slice of params though can't effect them). For correlation between start and finish we can add another parameter as needed.

When this shapes up whatever base interface can go into the api package and that might allow some impl flexibility without circular deps tangling.

Give it a think and PS add a screen shot so others can see how cool this is :D

codefromthecrypt avatar Apr 18 '22 11:04 codefromthecrypt

oh PS you are totally right we aren't doing context correctly. having one stashed in RuntimeConfig is a sure way to have broken traces. I'll raise a PR for your review to break API in a good way 😎

codefromthecrypt avatar Apr 18 '22 11:04 codefromthecrypt

For record, tweaked POC to trace wasm functions to see what they look like

https://github.com/tetratelabs/wazero/compare/main...anuraaga:function-listener

Definitely noisy, especially with many garbage collection points, but some sampling of this may be useful. Notably having the entrypoint .start properly show up is much nicer than not.

image

https://github.com/tetratelabs/wazero/files/8511192/caac2efba1b5dffc6b3d999a8181bea1.json.txt

anuraaga avatar Apr 19 '22 08:04 anuraaga

some details to help shape things around names:

  • functions are only referenced by their numeric index, which is module-specific. ex an imported function will have a different number than the module defining it (which exported it)
  • except import and export there are no names used in wasm format, and also import/export names can be any UTF-8
  • the same function can be exported many times from the same module. So for example one module can import two different function name which actually are the same function in the exporting module
  • besides export names text and binary format may include function names and they aren't the same as export names unless someone chose to.
  • some export names (notably rust) are intentionally obfuscated meaning that they can't be used reliably as a metrics dimension
  • similarly module names in import and export may have no correlation to the module name in the text or binary format "custom name section"
  • Import module and function names can be re-written, so may not match the export names of the same which should be considered authoritative. Ex. module a is named "mod1" and exports func[0] as "fn1' and "fn2". It is possible that this is imported as "moda" "fnc"!
  • the custom name section may also include parameter names, which can help in kv logging when present. otherwise, again it is numeric index only (scoped to a function type)

The summary of these notes is that the factory which creates a function interceptor or event listener will have to deal with some realities on names. It likely needs a way to map an integer representing the function index to its type and any metadata about it including any export names, and data in the custom name section (ex module name, function name, parameter (local ) names). By using a factory, the returned value doesn't need to track names any more as it is already in the right scope.

Ex.

type FunctionInterceptor interface {
// hooks here which do not accept names or numeric indices as doing so
// would confuse scope per the above points
}


type Factory interface {
  // NewInterceptor returns an interceptor for this function or nil if one isn't needed.
  // This is called during Instantiate after resolving any final names, and before any
  // functions in the corresponding module are invoked. 
  NewInterceptor(FunctionMetadata) FunctionInterceptor
}

codefromthecrypt avatar Apr 25 '22 01:04 codefromthecrypt

#505 started this internally so we can practice a bit. Particularly, value types are changing between 1.0 and 2.0 so we need to solidify that before exporting the hooks

codefromthecrypt avatar Apr 26 '22 07:04 codefromthecrypt

Plan to expose this in #513 though the interface I think should change to a more intercepter vs event listener (for function lifecycle). We may also do an event listener for things like memory.grow, but I think listener for functions has problems.

Notably, we currently propagate errors via panic, and I'm almost certainly there's not going to be After hook for every Before. I can see why this is hard to do in the shape of the current code, but probably the interpreter code can be changed so it is easier.

codefromthecrypt avatar Apr 28 '22 22:04 codefromthecrypt

Yeah JIT was in mind when switching from interceptor to listener during the prototypes. If possible, would like to not block adding the hooks to JIT in the future, though maybe it's not possible haven't dug into it yet.

anuraaga avatar Apr 28 '22 23:04 anuraaga

I think listener can be done in JIT and possibly we can make a special call-out for an interceptor. In any case I think listener is the wrong design to ensure hooks surround a function invocation. It only works now as nothing causes an error.

Since this is experimental, nothing done here would block a JIT implementation changing things, but we want at least interpreter doing a safe implementation first, before assuming JIT which is harder should use it, Right?

codefromthecrypt avatar Apr 28 '22 23:04 codefromthecrypt

PS I'll be more specific on the issues, though added some to the PR that makes things playable in interpreter for now.

The main thing is JIT code is running natively, and the current design of the listener approach is stateful. There is a context update parameter returned. If we did that in JIT'd code, it would have to not only invoke the listener, but also track and supply the context parameters returned from it (via a stack or otherwise).. inside generated JIT code. While possible, that's probably going to be both hard code to review and also hard to make work. It is the stateful part that makes it harder. If not for that context return, it is a bit easier as it is mainly invoking a function.

Conversely, if you see how host functions are currently implemented in JIT, doing state around a call is actually better suited as an interceptor: https://github.com/tetratelabs/wazero/blob/main/internal/wasm/jit/engine.go#L710-L716

So, it is currently easy in JIT for this part and some of the interpreter code needs to be better organized around errors to proceed either. That leaves us with "call" and "call.indirect" inside JIT which target Wasm-defined (JIT compiled) functions. The current thinking is this.. regardless of whether we are invoking a host function in go from JIT, or an interceptor in Go from JIT, we are calling something in Go from JIT! So, we could make a special case that when the target has an interceptor, call a special go-defined function that wraps a Wasm-compiled function with it, similar to if it was initially a host function in the first place.

This sort of setup is why I was fairly adamant about precise hooks being known before instantiation as that allows this sort of weaving to happen before instantiation and only effect those functions who are actively being intercepted. So long as we also have a stateless event listener for other sorts of things, we can do both things as cheap as possible and also buy back the ability to do things typical in interceptors, such as obviating, rate limiting or timing calls. Trying to do that in a stateful listener is the wrong solution I think.

codefromthecrypt avatar Apr 29 '22 00:04 codefromthecrypt

The TL;DR; of above is I think the current design is a good toehold for a future design that supports the interceptor use case and also reduces the surface area of internal code impact. The next design to try is likely this:

  • define interceptor as something that wraps around a Call(ctx context.Context, params ...uint64) ([]uint64, error)
  • use the same factory approach as exists in listener to ensure only desired calls are intercepted.
  • internally, make a go function wrapper that invokes the interceptor around a go-defined function
  • make an engine-specific means that can use a go function to invoke a wasm-function
  • at instantiation time, replace intercepted functions with wrapped ones at the same index
    • important part is this is regardless of if they are wasm or go. This may need compile hooks..
  • prove this works by implementing a trivial timing interceptor and a call obviator, invoked directly, via wasm "call" and "call indirect" opcodes

Once this works, we delete the current listener api and swap with this one interceptor.

Once this works, our context parameters should be coherent for the whole function call. What this would allow us to do is build a stateless event listener API, which has the same context.Context pointer between "before" and "after" around a function. By separating stateful into interceptor and stateless into listener, it makes certain things like logging easier while harder things like tracing and call blocking possible.

codefromthecrypt avatar Apr 29 '22 00:04 codefromthecrypt

can this be closed?

mathetake avatar Jan 04 '23 05:01 mathetake