Evaluation bottlenecks

Open · bitfort opened this issue 6 years ago • 5 comments

Evaluation can sometimes be very time-consuming and is often performed by third-party code. How can we reduce the influence of third-party evaluation code performance on benchmark scores, and the engineering burden it creates?

bitfort · Jan 10 '19

SWG Notes:

Possible solutions (brainstorming; each has pros and cons):

  • Do not time evaluation
  • Evaluate less frequently (see the sketch after this list)
  • Provide/choose optimized implementations for evaluation
  • Let submitters figure out how to handle it
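
For concreteness, here is a minimal sketch of the "evaluate less frequently" option, assuming a generic run-to-target training loop; `train_one_epoch` and `evaluate` are hypothetical placeholders, not anything the MLPerf rules define:

```python
def train_to_target(model, num_epochs, target_accuracy, eval_interval=4):
    """Train, paying the evaluation cost only every eval_interval epochs."""
    for epoch in range(num_epochs):
        train_one_epoch(model)  # hypothetical per-epoch training step
        # Skip the (potentially slow, third-party) evaluation on most epochs.
        if (epoch + 1) % eval_interval == 0:
            if evaluate(model) >= target_accuracy:  # hypothetical evaluator
                return epoch + 1  # epochs needed to converge
    return None  # target accuracy not reached
```

The obvious trade-off is resolution: with `eval_interval=4`, a run can overshoot the true convergence epoch by up to three epochs, which feeds directly into the reported time-to-train.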

AI (Jacob): Present thoughts from HPC on this next week. AI (all submitters): This is a call for proposals :)

bitfort · Jan 17 '19

Can someone provide an example of a benchmark where third-party code is used for serial evaluation and becomes a bottleneck?

I've run into this issue with the translation and image-classification benchmarks, but haven't made it far enough in porting the other benchmarks to know which ones are most problematic.

jbalma · Jan 24 '19

SWG Notes:

We believe Mask R-CNN and SSD with COCO evaluation are at the top of the list.
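
For context, the serial path in question is roughly the standard pycocotools flow used by the reference implementations. A minimal sketch, with placeholder file names; this scoring is CPU-bound, single-threaded Python, which is why it dominates once the rest of the benchmark is accelerated:

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Placeholder paths: ground-truth annotations and the model's detections.
coco_gt = COCO("annotations/instances_val2017.json")
coco_dt = coco_gt.loadRes("detections.json")

coco_eval = COCOeval(coco_gt, coco_dt, iouType="bbox")
coco_eval.evaluate()      # per-image matching: the slow, serial part
coco_eval.accumulate()
coco_eval.summarize()     # prints the AP table
mAP = coco_eval.stats[0]  # headline metric, AP @ IoU=0.50:0.95
```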

bitfort · Jan 24 '19

SWG Notes:

Long term we'd like a unified solution, but for v0.6 it will be up to submitters to optimize evaluation code themselves if they deem it necessary. We intend to revisit this issue in the future to reduce the effort submitters have to put into evaluation.
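
One shape such a submitter-side optimization might take, sketched under assumptions: move the slow scoring into a child process so it overlaps with continued training. `train_one_epoch`, `run_inference`, and `compute_accuracy` are hypothetical stand-ins for benchmark-specific code, and the predictions must be picklable for this to work:

```python
import multiprocessing as mp

def score_predictions(predictions, epoch, result_queue):
    # Slow, CPU-bound third-party scoring (e.g. COCO-style matching).
    # compute_accuracy is a hypothetical stand-in.
    result_queue.put((epoch, compute_accuracy(predictions)))

def train_with_async_eval(model, num_epochs, target_accuracy):
    result_queue = mp.Queue()
    workers = []
    for epoch in range(num_epochs):
        train_one_epoch(model)              # hypothetical training step
        predictions = run_inference(model)  # fast, on-accelerator part
        # Score in a child process so training is not blocked.
        worker = mp.Process(target=score_predictions,
                            args=(predictions, epoch, result_queue))
        worker.start()
        workers.append(worker)
        # Drain any scores that have finished, without waiting.
        while not result_queue.empty():
            done_epoch, accuracy = result_queue.get()
            if accuracy >= target_accuracy:
                return done_epoch + 1  # epochs to converge
    for worker in workers:
        worker.join()
    return None
```

Whether overlapping evaluation with training like this is legal depends on what counts as inside the timed region, which is exactly the open question in this issue.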

bitfort · Apr 11 '19

This is backlogged, not a rec.

petermattson · May 29 '20