discover
separating startup and logic phases
I was playing with this on https://github.com/cockroachdb/cockroach (more concretely, `cd storage; discover test TestStoreSend`). It works fine, but since these tests have to actually start a server, there's a lot of code being run that isn't actually "logic".
I suspect that's often going to be the case for complicated systems, and that the high-level view is really where you'd want to use this tool most (as opposed to looking at a focused unit test, where reading the logic is easier).
So maybe this is crazy, but is there a way of setting "checkpoints" in tests that separate the setup and logic phases? Something like:
func TestSomething(t *testing.T) {
	setupServer()
	defer teardownServer()
	discover.Breakpoint()
	// do actual stuff; `discover` really only looks at what happens now
	discover.Teardown()
}
Maybe this could be achieved naively by running the test multiple times, panicking first at the first breakpoint, then at the second (and so on), and computing diffs from the accumulated coverage profiles?
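To make the naive version concrete: the diff is just the set of blocks covered in the full run's profile but not in the run that panicked at the breakpoint. A minimal sketch using `golang.org/x/tools/cover` (the profile file names here are made up):

package main

import (
	"fmt"
	"log"

	"golang.org/x/tools/cover"
)

// covered returns the set of covered blocks in a coverage profile,
// keyed by "file:start,end" position strings.
func covered(path string) (map[string]bool, error) {
	profiles, err := cover.ParseProfiles(path)
	if err != nil {
		return nil, err
	}
	set := map[string]bool{}
	for _, p := range profiles {
		for _, b := range p.Blocks {
			if b.Count > 0 {
				set[fmt.Sprintf("%s:%d.%d,%d.%d", p.FileName, b.StartLine, b.StartCol, b.EndLine, b.EndCol)] = true
			}
		}
	}
	return set, nil
}

func main() {
	setup, err := covered("upto_breakpoint.out") // run that panicked at the breakpoint
	if err != nil {
		log.Fatal(err)
	}
	full, err := covered("full.out") // complete run
	if err != nil {
		log.Fatal(err)
	}
	for block := range full {
		if !setup[block] {
			fmt.Println(block) // covered only after the breakpoint
		}
	}
}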
I think something like that would be a great addition, for several reasons. Another reason to support something like this is to be able to see the code that deals with the client and ignore the server. Your concrete idea would help with that a bit, but not entirely: if both the server and the client run in-process, the code between `discover.Breakpoint()` and `discover.Teardown()` would execute both client and server code.
It would be great to be able to take it even further: having a way to isolate the coverage between client and server. I think the problem with both approaches is how to achieve it. Currently `discover` just parses the code coverage file; `discover test` is just a shortcut for `go test -coverprofile=... && discover parse <profile>`.
I think the best way to support this would be to take over the code rewriting that `cmd/cover` does. With that, we could collect much richer data, which would allow `discover` to infer more about the code relationships. Looking at `cmd/cover`, this does not look terribly difficult, but I have not yet figured out how to incorporate running the tests against the rewritten code. If you have any ideas on how to accomplish that, that would be great.
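For a sense of what that rewriting involves, here is a hand-written imitation of the kind of code `cmd/cover` produces; the counter variable's name and layout are schematic and vary by Go version and `-covermode`:

package main

import "fmt"

// Counter block, modeled on the struct cmd/cover emits for each instrumented file.
var GoCover = struct {
	Count   [3]uint32     // one counter per basic block
	Pos     [3 * 3]uint32 // source positions of each block (unused in this sketch)
	NumStmt [3]uint16     // number of statements per block
}{}

func abs(x int) int {
	GoCover.Count[0] = 1 // block 0: function entry (set mode; atomic mode uses atomic.AddUint32)
	if x < 0 {
		GoCover.Count[1] = 1 // block 1: the if body
		return -x
	}
	GoCover.Count[2] = 1 // block 2: the statements after the if
	return x
}

func main() {
	abs(-4)
	fmt.Println(GoCover.Count) // [1 1 0]: which blocks ran for this input
}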
I've made a fair bit of progress on this. I think it should be workable. The current design I am thinking of is the following. Let me know what you think.
- Stop piggybacking on the test framework to do discovery. It was convenient, but ultimately to get good output you need to write custom "discover tests" anyway. Instead, take an approach similar to the test framework: look for functions of the form `func DiscoverFoo(d *discover.D) { ... }` and call them.
- Each `func DiscoverFoo(d *discover.D)` function generates its own trace, and the traces will be shown separately. These functions will be called by running `discover run` or similar, which will look for them in both _test.go and regular Go files. To actually enable tracing, call `d.Trace(func() { /* code goes here */ })`. That allows you to perform setup before and in between the code that gets traced, and teardown after.
As far as implementation goes, it will be done by annotating source just like the cover package does, but due to the additional tracing richness (conditional tracing and having different traces) it will be a bit slower than regular test coverage. I think it will just end up being a constant per-function overhead (to do a one-time "trace id" lookup), and otherwise be identical to `-covermode=atomic`.
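Schematically, and purely as an illustration of that cost model (none of these names are real API), an annotated function could end up looking something like this:

package main

import "sync/atomic"

// trace holds the per-block counters for one DiscoverFoo trace.
type trace struct{ Counters [8]uint32 }

// traces and currentTrace stand in for the one-time "trace id" lookup;
// in reality the active trace would depend on which DiscoverFoo is running.
var traces = map[string]*trace{"DiscoverFoo": {}}

func currentTrace() *trace { return traces["DiscoverFoo"] }

// abs, roughly as the annotator might rewrite it.
func abs(x int) int {
	t := currentTrace()                 // constant per-function overhead
	atomic.AddUint32(&t.Counters[0], 1) // otherwise identical to -covermode=atomic
	if x < 0 {
		atomic.AddUint32(&t.Counters[1], 1)
		return -x
	}
	atomic.AddUint32(&t.Counters[2], 1)
	return x
}

func main() { abs(-4) }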
Thoughts?
What (schematically) would that look like for a simple client-server thing (or something that has a setup phase and then does the actual test)?
Something like this for net/http:
func DiscoverHTTPGet(d *discover.D) {
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "Hello, world")
	}))
	defer ts.Close()
	d.Trace(func() {
		res, err := http.Get(ts.URL)
		if err != nil {
			d.Fatal(err)
		}
		res.Body.Close()
	})
}
The output would be a trace showing only the code that gets run for the HTTP client, not any of the server code, even though both get executed concurrently.
That seems really useful. For other tests where you're interacting with a client (potentially over the network) "snapshotting" should still be interesting. Say I'm doing a client request as in your example, but I'm really interested in what that request triggers on the server. Then maybe the API that I would want is:
func DiscoverHTTPGet(d *discover.D) {
	var ts *httptest.Server // keep in scope for later teardown
	// Trace the setup. This isn't collected, but `bg` captures the state so that we can diff on it.
	bg := d.Trace(func() {
		ts = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "Hello, world")
		}))
	})
	// Execute the client request, but trace only the server portion. This should be achievable
	// by internally creating a trace for the `func() {}` below (which gives the "isolated" code touched
	// by the client until it hits the network) and removing that from all of the code touched while the
	// code runs (relative to the code the init step touched).
	bg.Trace(func() {
		res, err := http.Get(ts.URL)
		if err != nil {
			d.Fatal(err)
		}
		res.Body.Close()
	})
	// More teardown here if desired (e.g. ts.Close()).
}
Your (simpler) example above would look the same in this API. Some more options are presumably needed for more obscure tests, for example whether to halt the global clock (so that timers don't fire), but that doesn't seem important enough to consider yet, and there may always remain some amount of randomness in these tests (unless we also make the scheduler, RNG, etc. deterministic). That seems okay, though.
Interesting idea. I am not sure how easy it is to accomplish :). Your example seems to hint at exposing some kind of set operation API, where any piece of code that was executed was part of zero or more traces, and then having control of the output by reasoning about those sets.
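To make "reasoning about those sets" concrete, such an API could be as small as the following sketch; the types and methods are hypothetical and don't exist in discover:

package discover

// A blockID identifies one instrumented basic block (file plus block index).
type blockID struct {
	file  string
	index int
}

// A TraceSet is the set of blocks executed while a given trace was active.
type TraceSet map[blockID]struct{}

// Minus returns the blocks in t that are not in other, e.g. "everything the
// client request touched, minus what the setup phase had already touched".
func (t TraceSet) Minus(other TraceSet) TraceSet {
	out := TraceSet{}
	for b := range t {
		if _, ok := other[b]; !ok {
			out[b] = struct{}{}
		}
	}
	return out
}

// Intersect returns the blocks common to both traces, e.g. code shared by the
// client and server paths.
func (t TraceSet) Intersect(other TraceSet) TraceSet {
	out := TraceSet{}
	for b := range t {
		if _, ok := other[b]; ok {
			out[b] = struct{}{}
		}
	}
	return out
}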
I wonder to what extent it's necessary, though. The implementation I am working on links together any started goroutines with the trace that started them, which would enable the HTTP server code to be captured by your `bg` trace above (the trace that runs `httptest.NewServer()`), without having to do some kind of negation of another trace.
It's possible to circumvent this by passing control to some background goroutine which was started, for example, in `func init() { ... }`, but that would be quite uncommon. All that code would still be traced, just with a different "trace id", so the information isn't lost; it would just require some other API to expose it.
If I am right, this would trace the HTTP server part:
func DiscoverHTTPServer(d *discover.D) {
	var ts *httptest.Server // keep in scope for later teardown
	d.BGTrace(func() {
		ts = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "Hello, world")
		}))
	})
	defer ts.Close()
	res, err := http.Get(ts.URL)
	if err != nil {
		d.Fatal(err)
	}
	res.Body.Close()
}
Where `(*discover.D).BGTrace(func())` is like `Trace()`, except it does not stop tracing when the function returns.
How would that avoid mixing the initialization part of the server with server-side code tickled by the client request? That's ultimately what motivated the above.
Yeah, I see what you mean. I am not sure there is a good way to avoid that. For atomicity, I think the number of traces needs to be computable at "compile time", so it will be hard to support multiple different traces for a single `func DiscoverFoo()`. I will try to finish the code I am working on and see what the output is.
The goal is ultimately to provide useful output, not perfect output. If we can get 80% or 90% of the way there, perhaps the little bit of initialization that will creep in sometimes can be disregarded or manually deleted if you want to use the output for illustration/teaching purposes.
> perhaps the little bit of initialization that will creep in sometimes can be disregarded or manually deleted if you want to use the output for illustration/teaching purposes

In the cases I was thinking of, you're looking at a pretty complicated system whose internals you may not have access to, so it'd be hard to manually delete the bootstrap code (because that's most of the code).
Is it really that difficult to implement, though? I was thinking of doing something similar to what `cmd/cover` does, but with multiple counters that are created and switched on and off dynamically at runtime:
- Each call to `Trace` (or `BGTrace`, or whatever API) mutates a package variable in `discover` which determines which coverage profiles are active. Ignoring performance, this could be `var Active map[string]struct{}` (the string is just a name for the trace, maybe "client" or "server"):
package discover

import "sync"

var (
	mu     sync.RWMutex
	Active = map[string]struct{}{} // names of the currently active traces
)

func Trace(name string, f func()) {
	mu.Lock()
	Active[name] = struct{}{} // should check for existence
	mu.Unlock() // release before f() so the annotation code can take RLock
	defer func() { mu.Lock(); delete(Active, name); mu.Unlock() }() // BGTrace would omit this
	f()
}
- The annotation code is something like:

mu.RLock()
for name := range Active {
	// Do the stuff cmd/cover does for each counter, but key it by name.
	// The variable GoCover would be a map[string]<type_of_original_GoCover>,
	// with keys populated as they first occur.
}
mu.RUnlock()
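Spelled out as a standalone sketch (with the `mu`/`Active` declarations from the `Trace` sketch folded in so it compiles on its own; `counterBlock`, `counters`, and `mark` are hypothetical names), the keyed version could look like:

package discover

import (
	"sync"
	"sync/atomic"
)

// counterBlock stands in for the per-file struct that cmd/cover emits
// (the size is illustrative).
type counterBlock struct{ Count [64]uint32 }

var (
	mu      sync.RWMutex
	Active  = map[string]struct{}{}      // as in the Trace sketch above
	GoCover = map[string]*counterBlock{} // keyed by trace name, populated on first use
)

// counters returns the counter block for a trace name, creating it on first use.
func counters(name string) *counterBlock {
	mu.Lock()
	defer mu.Unlock()
	c, ok := GoCover[name]
	if !ok {
		c = &counterBlock{}
		GoCover[name] = c
	}
	return c
}

// mark is what the annotation at the start of basic block i would expand to.
func mark(i int) {
	mu.RLock()
	names := make([]string, 0, len(Active))
	for name := range Active {
		names = append(names, name)
	}
	mu.RUnlock()
	for _, name := range names {
		// Same per-block cost as -covermode=atomic, just once per active trace.
		atomic.AddUint32(&counters(name).Count[i], 1)
	}
}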
I think this should be a fairly easy fork of `cmd/cover` and allow everything that we both want. Sure, you control the flow only through the runtime and not through static analysis (so you might catch a stray goroutine of one profile in the other), but that's likely rare in practice. I suppose if you wanted to go hard-core slow you could also grab the stack trace on entry into a function and figure out from there which `Trace` call you belong to, if any (free goroutines would have to inspect their creator's stack), but I doubt that would unlock many new use cases.
Your proposed implementation is very similar to the one I have been working on. You can see the progress in the tracing branch here on GitHub, which already implements most of the `cmd/cover` functionality. It's not quite done yet, but it's getting quite close.
Like you said, I think the approach is quite flexible, and with the ability to turn coverage profiles on and off at runtime we should be able to iterate on the API to come up with something useful. If I understand you correctly, your initial example should indeed be pretty easy to implement (`bg.Trace(foo)` would just enable tracing of the `bg` key, as opposed to a new key for `foo`). I agree that that would be useful and hadn't fully understood what you meant, so thanks for the explanation! It should definitely be possible to make something like that work.
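Pinning that down: a trace handle whose `Trace` method re-activates its existing key, reusing the `mu` and `Active` from the package-level `Trace` sketch above, could be as simple as this (hypothetical, just to fix the `bg.Trace(foo)` semantics):

// T is a handle for an existing trace key.
type T struct{ name string }

// Trace runs f with this trace's key active, then deactivates it again,
// rather than registering a new key for f.
func (t *T) Trace(f func()) {
	mu.Lock()
	Active[t.name] = struct{}{}
	mu.Unlock()
	defer func() { mu.Lock(); delete(Active, t.name); mu.Unlock() }()
	f()
}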
Cool, glad to hear! I'll keep my eyes peeled for the prototype. Let me know if you need a review.