inference icon indicating copy to clipboard operation
inference copied to clipboard

Open Division Rule Clarification Request

Open nvyihengz opened this issue 2 years ago • 2 comments

According to https://github.com/mlcommons/inference/blob/master/Submission_Guidelines.md#expected-time-to-do-benchmark-runs

There is no constraint on the model used also except that the model must be trained on the dataset used in the corresponding MLPerf inference task.

By "the dataset used in the corresponding MLPerf inference task", does it mean the validation set used for accuracy testing in the closed division's same workload? It does not quite make sense to enforce this so that retraining of the open division model can be overfitted to the validation dataset.

If it means the original dataset used to train the closed division model, what would the submitters use if the original dataset is not publicly available? In this case this rule needs to be relaxed, such as the submitters can choose their own training dataset except the validation dataset and have to publish the dataset they used for training as a complementary material for open division submission.

nvyihengz avatar Nov 01 '23 22:11 nvyihengz

@mrmhodak We need to discuss this ASAP in WGM.

nv-ananjappa avatar Nov 13 '23 21:11 nv-ananjappa

except that the model must be trained

I think "trained" should be replaced with "validated".

such as the submitters can choose their own training dataset except the validation dataset and have to publish the dataset they used for training as a complementary material for open division submission.

Requiring to publish the training dataset may be a step too far?

psyhtest avatar Nov 13 '23 22:11 psyhtest