Establish a consistent way of handling the standard output across all the benchmarks
We should ensure that all standard output (except the iteration times) goes to a separate file that can be inspected after the benchmark finishes, unless the user explicitly requests to see the benchmark output (controlled by a flag).
This brings to mind another thing -- do we want to checksum benchmark outputs so that we can detect if they break?
Yes, this would be useful to have.
Although, I would say it might be useful to check the benchmark's results in a benchmark-specific manner - it would make it easier to track down bugs if they occur. I.e., have a set of unit-test-style assertions on the values produced by the benchmark. Perhaps the RenaissanceBenchmark class should have another method that does this?
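For instance, such a hook might look like the sketch below (the method name and signature are made up for illustration, not the actual harness API):

```java
// Hypothetical sketch of a per-benchmark validation hook on the existing
// RenaissanceBenchmark class; the method name and signature are placeholders.
public abstract class RenaissanceBenchmark {
  /**
   * Benchmark-specific, unit-test-style assertions on the result produced
   * by a repetition, e.g. comparing a checksum or an element count against
   * a known-good value. The default implementation accepts any result.
   */
  protected void validateResult(Object result) {
    // Individual benchmarks override this with their own checks.
  }
}
```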
I'm not sure how detailed we want things, but in general I would like to see a file with stdout and stderr for every phase and repetition of a benchmark. Alternatively, I would like to see two files (stdout and stderr) with the phases and repetitions clearly separated.
To keep the management of all textual output in the harness, we should (at least) swap out System.out and System.err for the benchmark. That is a band-aid meant to catch output produced by coarsely integrated benchmarks, where we reuse an existing benchmark implementation and basically pass command-line parameters to it (for example, STMBench).
Ideally (and to avoid reassigning System.out and System.err), we would provide each benchmark with an object that could be used instead of System.out and System.err, or multiple objects (e.g., PrintStream instances) that would serve the same purpose.
In the first case, this object could basically be an extremely simplified logger that provides formatting methods such as output(String format, Object... args), error(...), and possibly debug(...). In the second case, we could just stick with the PrintWriter interface, which supports both normal and formatted printing.
Either way, the harness would be in control of those objects and where their output goes and could then do whatever it wants, even providing dummy implementations that just discard everything.
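As a rough illustration of the first option, a minimal sketch in Java (all names here are hypothetical, and the exact method set is just an assumption):

```java
// Simplified logger handed to each benchmark by the harness; the harness
// decides where the text ends up (a file, the console, or nowhere).
public interface BenchmarkLogger {
  void output(String format, Object... args);
  void error(String format, Object... args);
  void debug(String format, Object... args);
}

// One possible harness-side implementation that simply discards everything.
final class SilentLogger implements BenchmarkLogger {
  public void output(String format, Object... args) { }
  public void error(String format, Object... args) { }
  public void debug(String format, Object... args) { }
}
```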
So in addition to a Config object, we could also provide benchmarks with a Context object, which would provide (references to) the bits of environment a benchmark is allowed to use (access to files comes to mind, in addition to stdout and stderr). If we only want to pass in a single object, then Config could be made part of Context. I don't really want to overcomplicate it; SPECjvm also had something like that.
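A possible shape for such a Context, sketched in Java (the names and the exact set of accessors are assumptions, not a settled design):

```java
import java.io.PrintStream;
import java.nio.file.Path;

// Stand-in for the existing Config class, just to keep the sketch compilable.
interface Config { }

// Hypothetical Context handed to each benchmark; it bundles the configuration
// with the environment the benchmark is allowed to use.
public interface BenchmarkContext {
  Config config();          // the existing per-run configuration
  PrintStream stdout();     // harness-controlled replacement for System.out
  PrintStream stderr();     // harness-controlled replacement for System.err
  Path scratchDirectory();  // per-benchmark scratch space (cf. issue 13 below)
}
```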
> I'm not sure how detailed we want things, but in general I would like to see a file with stdout and stderr for every phase and repetition of a benchmark. Alternatively, I would like to see two files (stdout and stderr) with the phases and repetitions clearly separated.
For convenience, it might also be useful to have a single file with all the output, so that users can quickly inspect everything.
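One simple way to get both the per-repetition files and a combined log would be a stream that duplicates its output to two destinations, roughly as sketched below (names are hypothetical; Apache Commons IO ships a class built on the same idea):

```java
import java.io.IOException;
import java.io.OutputStream;

// Duplicates every byte to two underlying streams, e.g. a per-repetition
// file and the combined log file.
final class TeeOutputStream extends OutputStream {
  private final OutputStream first;
  private final OutputStream second;

  TeeOutputStream(OutputStream first, OutputStream second) {
    this.first = first;
    this.second = second;
  }

  @Override public void write(int b) throws IOException {
    first.write(b);
    second.write(b);
  }

  @Override public void flush() throws IOException {
    first.flush();
    second.flush();
  }

  @Override public void close() throws IOException {
    try { first.close(); } finally { second.close(); }
  }
}
```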
> Ideally (and to avoid reassigning System.out and System.err), we would provide each benchmark with an object that could be used instead of System.out and System.err, or multiple objects (e.g., PrintStream instances) that would serve the same purpose.
Agreed, but it seems like we might have to do both: passing a special harness-provided PrintStream to the benchmark, and having the harness replace System.out and System.err as a best effort to prevent other frameworks from leaking output (since they will generally be unaware of our custom interface).
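The best-effort part could look roughly like this sketch (file handling and error handling are simplified, and the class and file names are hypothetical):

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

// Best-effort capture of output from code that is unaware of the
// harness-provided streams: the harness swaps System.out/System.err
// before a repetition and restores them afterwards.
final class StdStreamRedirector {
  private static PrintStream origOut;
  private static PrintStream origErr;

  static void redirect(String outFile, String errFile) throws IOException {
    origOut = System.out;
    origErr = System.err;
    System.setOut(new PrintStream(new FileOutputStream(outFile, true), true));
    System.setErr(new PrintStream(new FileOutputStream(errFile, true), true));
  }

  static void restore() {
    System.out.close();  // flush and close the redirected streams
    System.err.close();
    System.setOut(origOut);
    System.setErr(origErr);
  }
}
```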
> Either way, the harness would be in control of those objects and where their output goes and could then do whatever it wants, even providing dummy implementations that just discard everything.
+1
> If we only want to pass in a single object, then Config could be made part of Context.
The Context parameter, and embedding the Config into it, sounds good to me.
The Context class sounds related to the scratch-dir functionality discussed in: https://github.com/D-iii-S/renaissance-benchmarks/issues/13