BenchmarkDotNet icon indicating copy to clipboard operation
BenchmarkDotNet copied to clipboard

Custom logging/visualization during the benchmark run

Open caaavik-msft opened this issue 1 year ago • 2 comments

Currently, if we want to do some custom visualization or export of benchmark results, we have to wait until all the benchmarks have been completed to be able to interpret the results by using a custom Exporter. As an example, the only way to see the results of benchmarks as they run is by what is passed through to the logger by BDN (link to relevant source). In addition to this, the benchmark runner in BDN does lots of logging at various stages and there is no way to customize these logs or to use them meaningfully outside the context of a console/terminal.

For large projects which have lots of benchmarks that take a long time to run, it would be good to be able to hook into the various stages of the benchmark running so that we don't have to wait until the whole run is completed to be able to get custom exported information.

For this issue, I wanted to suggest a new feature to hook into the benchmark running to be able to subscribe to "events". An example of an event might be "RunStarted" or "RunCompleted" and those events would have read-only data attached to them containing the additional data about the event. I think as well, that all the logging code that is done today by BDN could be moved out to use this same abstraction, that way we can replace or extend it with our own custom logging. Not only would this be useful for customising the log output, but it could also provide the ability to build tools on top of BDN that display live execution results graphically rather than textually.

Diagnosers do somewhat implement this functionality with the Handle(HostSignal, DiagnoserActionParameters) method, but they are only run within the context of a single benchmark execution. If we want to subscribe to events such as when a benchmark case has completed running and the BenchmarkReport is available such as the example linked above, then it would be difficult to see how IDiagnoser could be extended to accommodate such an event.

I am willing to implement this but wanted to get some confirmation that this is a desirable feature for the project. If given the go-ahead, I can follow up with a proposal for how it could be implemented. Alternatively, if this feature is a bit too ambitious with wanting to have events for everything that needs to be logged, I think I still would like to look into something that lets me handle BenchmarkReport objects as they are created.

caaavik-msft avatar Jul 28 '22 07:07 caaavik-msft

I am willing to implement this but wanted to get some confirmation that this is a desirable feature for the project. If given the go-ahead, I can follow up with a proposal for how it could be implemented.

@caaavik-msft I confirm that this is a desirable feature. Please go ahead and send the proposal.

adamsitnik avatar Jul 28 '22 07:07 adamsitnik

I spent some time writing up a proposal and realised it was getting a bit complex, and also that during the functioning of a benchmark run there are lots of cases where logs are done so it wouldn't be possible to replace every log with something that can be dispatched to an event handler. Instead, I just brought it down to the most common events and I think it will cover 99% of use cases for someone wanting to build any tools on top of BDN. I think as part of implementing this we would leave the logging in as it is, and it would not be brought out into the event handler.

I have a commit on my fork implementing this proposal to show it in practice: https://github.com/caaavik-msft/BenchmarkDotNet/commit/ab86077940781acf17dc71eee14918573f037160

Proposal

My proposal is to add a new interface called an IBenchmarkEventHandler. This interface defines handle methods for all the events that occur during the benchmark running process. Since there isn't any concept of events currently in BDN, I had to make them myself based off my understanding of how BDN works. The result is the following interface:

public interface IBenchmarkEventHandler
{
    void HandleStartValidationStage();
    void HandleUnsupportedBenchmark(BenchmarkCase benchmarkCase);
    void HandleValidationError(ValidationError validationError);
    void HandleStartBuildStage();
    void HandleBuildFailed(BenchmarkCase benchmarkCase, BuildResult buildResult);
    void HandleStartRunStage();
    void HandleRunBenchmarksInType(Type type, IReadOnlyList<BenchmarkCase> benchmarks);
    void HandleCompletedBenchmarksInType(Type type, Summary summary);
    void HandleRunBenchmark(BenchmarkCase benchmarkCase);
    void HandleCompletedBenchmark(BenchmarkCase benchmarkCase, BenchmarkReport report);
}

Here is an explanation of what each event represents:

  • HandleStartValidationStage: called before the validators and GetSupportedBenchmarks is called
  • HandleUnsupportedBenchmark: called for every BenchmarkCase that is not supported by the toolchain
  • HandleValidationError: called for every validation error
  • HandleStartBuildStage: called before the benchmarks are split up into partitions and built
  • HandleBuildFailed: called for every BenchmarkCase that had a failed build, along with the build result
  • HandleStartRunStage: called after the build stage, and before the loop over all the BenchmarkRunInfos
  • HandleRunBenchmarksInType: called before running the benchmarks for a single BenchmarkRunInfo
  • HandleCompletedBenchmarksInType: called after running the benchmarks for a single BenchmarkRunInfo and we have a Summary object
  • HandleRunBenchmark: called before running a given BenchmarkCase
  • HandleCompletedBenchmark: called after running a given BenchmarkCase and passed in the resulting BenchmarkReport

This type will be registered the same way as the other types like IExporter, IDiagnoser in that they are stored on the IConfig and can be added using a AddEventHandler(IBenchmarkEventHandler eventHandler) method on ManualConfig or could be set using attributes. Multiple event handlers can be set at once, and they can also be configured to be enabled for specific benchmark classes.

The commit I linked above is more of a proof-of-concept at the moment, so I wouldn't bother reviewing it just yet until we have had some time to review this proposal and see if we want to change the names of anything or add/remove any events. There are some other events I was considering adding such as events for each iteration (Pilot, OverheadWarmup, etc.) inside the running of a BenchmarkCase but didn't want to go crazy with lots of events when I am not sure if there will be a need to get that level of detail. Let me know what you think!

caaavik-msft avatar Aug 02 '22 09:08 caaavik-msft

@albahari based on the integration that you have recently implemented for LINQPad (https://twitter.com/linqpad/status/1576879508078366720), is there anything you would like to add/change in this proposal?

adamsitnik avatar Oct 03 '22 13:10 adamsitnik

That's a very good question. To create the live results visualizer in LINQPad, I implemented BenchmarkDotNet.Loggers.ILogger (I also ended up creating an event listener). In the Write/WriteLine implementations of ILogger, I look for lines that start with "OverheadActual", WorkloadActual", "WorkloadPilot", "WorkloadWarmup" and "GC:", and parse them by calling Measurement.Parse and GcStats.Parse. In order to know which case the results apply to, I ended up using the EventListener enabled for "BenchmarkDotNet.EngineEventSource", matching on eventData.EventName == "BenchmarkStart".

An interface such as the one you describe would make this considerably simpler and more robust, although I guess it would require a method to capture measurements and GC stats. Most of all, it would be great if it had a hook that fired when an exception was thrown by the code being benchmarked. Detecting this, right now, is the most brittle part of the LINQPad implementation. BenchmarkDotNet currently writes exceptions to ILogger with a LogKind.Default (and without a textual prefix) and then later writes another entry (without the exception text) with a LogKind.Error and text that contains "ExitCode != 0". Detecting and parsing this is unreliable as you can imagine, and I'm not sure that it can work with non-English locales. A HandleUserException(Exception) method would solve this nicely.

Also, might it be an idea to define BenchmarkEventHandler as a class with virtual (non-abstract) methods rather than an interface, given that it consists of hook methods?

albahari avatar Oct 06 '22 06:10 albahari

Completed in #2420

caaavik-msft avatar Sep 21 '23 21:09 caaavik-msft

Thank you for your contribution @caaavik-msft !

adamsitnik avatar Sep 22 '23 11:09 adamsitnik

@albahari We won't be able to change this API easily in the future (we expect that more tooling will start depending on it very soon). It would be great if you could take a look at what we have right now, before we ship it and let us know if it meets LINQPad requirements.

The main type is define here: https://github.com/dotnet/BenchmarkDotNet/blob/master/src/BenchmarkDotNet/EventProcessors/EventProcessor.cs

You should be able to get the preview NuGet package from

<add key="bdn-nightly" value="https://www.myget.org/F/benchmarkdotnet/api/v3/index.json" />

adamsitnik avatar Sep 22 '23 11:09 adamsitnik

LINQPad updates its visualization by responding to each measurement, immediately after the measurement is logged. This enables early feedback on execution times and allows you see in real time how the performance figures stabilize as data from more iterations comes in. As far as I can see, none of the methods on EventProcessor fire until all measurements have been averaged and the test case has ended. Is there something that I'm missing here?

Also, I don't see a way to capture exception data, which is the most fragile part of parsing the output log. Maybe that belongs in a separate request.

albahari avatar Sep 23 '23 03:09 albahari

@caaavik-msft Do you want to extend the functionality to cover @albahari's needs?

timcassell avatar Sep 30 '23 07:09 timcassell