Appropriate way to do pre-computation

Open lambdacalculator opened this issue 10 years ago • 8 comments

Can the tutorial be amended to include an example of how to do pre-computation before gathering benchmark data? For example, suppose my benchmark is mapping a function across several lists of 100000 values each, but I don't want to charge the costs of creating these lists to the function I'm benchmarking; that is, I want to compute the lists first, and then benchmark the function being mapped across these existing lists.

lambdacalculator avatar Nov 25 '14 17:11 lambdacalculator

I think this is explained in http://hackage.haskell.org/package/criterion-1.0.2.0/docs/Criterion.html#v:env.

osa1 avatar Jan 21 '15 16:01 osa1
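
For readers landing here, a minimal sketch of what the env approach from the linked docs might look like for the original question. The names setupLists and step are hypothetical stand-ins for the real data and the real function; only env, bench, bgroup, nf and defaultMain come from criterion.

```haskell
import Criterion.Main (bench, bgroup, defaultMain, env, nf)

-- Hypothetical workload: several lists of 100000 values each.
setupLists :: IO [[Int]]
setupLists = pure [[k .. k + 99999] | k <- [1, 2, 3]]

-- Hypothetical function under test.
step :: Int -> Int
step x = x * 2 + 1

main :: IO ()
main = defaultMain
  [ -- env runs setupLists and forces the result to normal form
    -- before the benchmark below is measured, so list construction
    -- is not charged to the mapped function.
    env setupLists $ \lists ->
      bgroup "map step"
        [ bench "all lists" $ nf (map (map step)) lists ]
  ]
```

The lambda binds its argument with a plain variable pattern, which keeps it lazy in the environment as the env documentation asks.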

Actually, if you use env in this way, you expose yourself to a very subtle cache-effects bug: if you are running a benchmark N times, the env action only gets run ONCE. So the data (1) is retained across all of the benchmark runs and (2) is NOT regenerated before each run. I have seen in practice that, due to GC effects, it can even be faster to do a cheap computation inside the loop rather than in an env, because holding onto a large amount of data between runs costs a lot in GC time.

ezyang avatar Jun 01 '15 21:06 ezyang

Just to add to this: it'd be great to be able to create a fresh environment for each run. If I've modified my environment using mutation, it'd be great to be able to throw it away and make a new one.

oliver-batchelor avatar Nov 19 '15 11:11 oliver-batchelor

Also, env requires the data to have an NFData instance. Any suggestions if the data cannot be evaluated to normal form?

hongchangwu avatar Jan 16 '17 04:01 hongchangwu

@hongchangwu Assuming you had a way to regenerate the data each run (which, as far as I can tell from this ticket, you don't), you can work around the required NFData instance by creating a newtype wrapper whose rnf is a no-op.

ezyang avatar Jan 16 '17 04:01 ezyang
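
A rough sketch of the newtype workaround described above, under the assumption that the data type (here a hypothetical Opaque) has no NFData instance. NoForce, Opaque, setup and total are illustrative names, not part of criterion or deepseq.

```haskell
import Control.DeepSeq (NFData (..))
import Criterion.Main (bench, defaultMain, env, whnf)

-- Hypothetical wrapper whose rnf does nothing, so env accepts it.
newtype NoForce a = NoForce a

instance NFData (NoForce a) where
  rnf _ = ()

-- Hypothetical type that lacks an NFData instance.
data Opaque = Opaque [Int]

setup :: IO (NoForce Opaque)
setup = pure (NoForce (Opaque [1 .. 100000]))

-- Hypothetical function under test.
total :: Opaque -> Int
total (Opaque xs) = sum xs

main :: IO ()
main = defaultMain
  [ env setup $ \wrapped ->
      -- The wrapper is only unpacked inside the benchmarked function.
      bench "total" $ whnf (\(NoForce o) -> total o) wrapped
  ]
```

Because rnf is a no-op, env no longer forces the data up front, so the cost of building it may show up in the first measured iteration.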

This issue should probably be closed since, as of 1.2, criterion can generate data per run; see the new perRunEnv and perBatchEnv functions.

merijn avatar May 29 '17 19:05 merijn
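
As an illustration only, here is how perRunEnv and perBatchEnv might be used when the benchmarked action mutates its environment. freshCell and sortInPlace are hypothetical names; the sketch assumes criterion 1.2 or later and a deepseq version that provides the NFData instance for IORef.

```haskell
import Criterion.Main (bench, bgroup, defaultMain, perBatchEnv, perRunEnv)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Data.List (sort)

-- Hypothetical setup: a fresh mutable cell holding an unsorted list.
freshCell :: IO (IORef [Int])
freshCell = newIORef [100000, 99999 .. 1]

-- Hypothetical action under test: it overwrites the cell's contents.
sortInPlace :: IORef [Int] -> IO Int
sortInPlace ref = do
  modifyIORef' ref sort
  sum <$> readIORef ref

main :: IO ()
main = defaultMain
  [ bgroup "sortInPlace"
      [ -- A new cell is created before every single run.
        bench "perRunEnv" $ perRunEnv freshCell sortInPlace
        -- A new cell is created once per batch; the batch size
        -- argument is ignored in this sketch.
      , bench "perBatchEnv" $ perBatchEnv (\_batchSize -> freshCell) sortInPlace
      ]
  ]
```

With perBatchEnv the cell is shared by every run in a batch, so later runs in the same batch see the already-sorted list; perRunEnv avoids that at the price of running the setup before each run.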

True. However, @lambdacalculator did request that the tutorial in particular be updated with an example of how to do this, and there doesn't appear to be anything of the sort for perRunEnv or perBatchEnv yet. Would you be willing to do this, @merijn? :)

RyanGlScott avatar May 29 '17 19:05 RyanGlScott

I'll put it on my todo list, unless I can sucker someone else into it...

merijn avatar May 29 '17 19:05 merijn