Lubomír Bulej comments

Results: 22 comments of Lubomír Bulej

Design API that denotes which releases the benchmark belongs to

It seems to me that it is not really the responsibility of a benchmark to say which release it belongs to. Let me think out loud... To me, a release is...

Design API that denotes which releases the benchmark belongs to

OK, I assume it's not critical for the first release because we have no obsolete benchmarks yet. We'll come up with something (and I would lean towards a configuration file or annotations...

Spark worker scaling parameters

I was under the impression that the number of executors influenced the default number of partitions created when building RDDs directly from files on disk. It was also...

Spark worker scaling parameters

> ... seeing how our Spark instance is hardcoded to run locally (https://github.com/renaissance-benchmarks/renaissance/blob/510b3b9e8f01dc397f35b64a7b5f8a943b75d012/benchmarks/apache-spark/src/main/scala/org/renaissance/apache/spark/SparkUtil.scala#L61) I'm not sure what we're trying to do with the worker instance...

Spark worker scaling parameters

The thing with controlling the number of executor instances appears to originate from #145 and #147, and at that time it seemed to work for @farquet. Looking at the [allowed...

Spark worker scaling parameters

#274 removes the configuration of executor instances (along with the benchmark parameter) as well as explicit input data partitioning (for now). I was wondering whether it would make sense to...

Spark worker scaling parameters

I have updated the PR and the measurement bundle (plugins work now). For testing, I added `als-ml` benchmark, which uses `ml.ALS` instead of `mllib.ALS`. Both do a conversion to RDD,...

Provide a detailed description of each benchmark

Well, being able to plug the detailed description into a generated document should get us extra points :-)

Provide a detailed description of each benchmark

Just to keep track of progress, #135 introduces the `@Description` annotation which should be used for this purpose. Note that the one-line summary should go to `@Summary`.
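As a rough illustration of how such annotations could feed a generated document, here is a minimal, self-contained sketch. The `Summary` and `Description` annotation types below are hypothetical stand-ins defined only for this example; the actual definitions introduced in #135 may differ in package, retention, and element names. `DummyBenchmark` and `describe` are likewise made up.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class DescriptionSketch {

    // Hypothetical stand-in for the one-line summary annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface Summary {
        String value();
    }

    // Hypothetical stand-in for the detailed description annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface Description {
        String value();
    }

    // A made-up benchmark class carrying both annotations.
    @Summary("Runs a dummy workload.")
    @Description("A longer, multi-sentence explanation of what the dummy workload does.")
    public static class DummyBenchmark {
    }

    // Reads both annotations via reflection and formats them for a document.
    // Missing annotations are replaced by a placeholder instead of failing.
    public static String describe(Class<?> benchmark) {
        Summary summary = benchmark.getAnnotation(Summary.class);
        Description description = benchmark.getAnnotation(Description.class);
        String summaryText = (summary != null) ? summary.value() : "(no summary)";
        String descriptionText = (description != null) ? description.value() : "(no description)";
        return summaryText + "\n\n" + descriptionText;
    }

    public static void main(String[] args) {
        System.out.println(describe(DummyBenchmark.class));
    }
}
```

Runtime retention is assumed here so that a document generator can read the text by reflection; a build-time annotation processor would be another option.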

Establish a consistent way of handling the standard output across all the benchmarks

This brings to mind another thing -- do we want to checksum benchmark outputs so that we can detect if they break?
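One possible shape for such a check, as a sketch only: hash the captured output and compare it with a digest recorded from a known-good run. The class and method names below are hypothetical, the choice of SHA-256 is an assumption, and the approach only works for benchmarks whose output is deterministic.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class OutputChecksumSketch {

    // Returns the SHA-256 digest of the given output as a lowercase hex string.
    public static String sha256Hex(String output) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(output.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to treat the byte as unsigned before formatting.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // True if the output hashes to the digest recorded from a known-good run.
    public static boolean matches(String output, String expectedHex) {
        return sha256Hex(output).equals(expectedHex);
    }

    public static void main(String[] args) {
        // In practice the reference digest would be stored alongside the benchmark.
        String reference = sha256Hex("result: 42\n");
        System.out.println(matches("result: 42\n", reference)); // same output
        System.out.println(matches("result: 43\n", reference)); // changed output
    }
}
```

Output that contains timings, hash-ordered collections, or floating-point values that vary between runs would need to be normalized or excluded before hashing.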