training_policies

Submitter should be responsible for showing submission does not converge better than reference.

jonathan-cohen-nvidia opened this issue 5 years ago • 2 comments

For each submitted hyperparameter (HP) configuration, the submitter should include logs from a reference-code run to demonstrate that convergence matches the reference.

Without this, the onus is on the reviewers to do it (which we all do), which is needlessly complex and error-prone. It would be simpler for the responsibility to lie with the organization that submits.

jonathan-cohen-nvidia · Jul 23 '19

SWG Notes:

Update from Special Topics: It would be nice if submitters could provide some evidence that their submissions do not converge better than the reference. In practice, this could be difficult to do: the references can be very slow, may not support the submitter's batch size, etc.

We agreed to look at the references and scope out the work to make them multi-node and to improve their performance. We will also start to look at the policy details (when and how to submit this evidence/measurement, how to interpret it, etc.).

bitfort · Aug 01 '19

SWG Notes:

  1. We need to figure out why the convergence variance exists and document it clearly.
  2. Consider swapping the models with the last submission.
  3. Submitters can use the reference models to verify their selected HPs.
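The check in point 3 could be sketched as a simple comparison of epochs-to-target between a reference run and a submission run. This is only an illustrative sketch: the function names, log representation (a list of per-epoch eval accuracies), and target value are assumptions, not anything MLPerf specifies.

```python
# Hypothetical sketch: given per-epoch eval accuracies parsed from a
# reference-run log and a submission-run log, check that the submission
# does not reach the quality target in fewer epochs than the reference.

def epochs_to_target(eval_accuracies, target):
    """Return the 1-based epoch at which accuracy first reaches target,
    or None if the run never converges."""
    for epoch, acc in enumerate(eval_accuracies, start=1):
        if acc >= target:
            return epoch
    return None

def converges_no_better(reference_accs, submission_accs, target):
    """True if the submission does not converge faster (in epochs)
    than the reference, i.e. no unexplained convergence advantage."""
    ref = epochs_to_target(reference_accs, target)
    sub = epochs_to_target(submission_accs, target)
    if sub is None:   # submission never converged: no advantage
        return True
    if ref is None:   # reference never converged but submission did
        return False
    return sub >= ref

# Example with illustrative numbers: both runs hit 0.75 at epoch 3.
reference = [0.60, 0.70, 0.76, 0.77]
submission = [0.58, 0.71, 0.75, 0.78]
print(converges_no_better(reference, submission, 0.75))  # True
```

A real version would also have to account for run-to-run variance (point 1 above), e.g. by comparing against the distribution of epochs-to-target over several reference seeds rather than a single run.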

frank-wei · Aug 15 '19