
Add a rule about DLRM training data shuffling

Open · johntran-nv opened this issue 4 years ago • 3 comments

The shuffling rules for DLRM training data were not clear enough in the v0.7 round and left a lot of room for interpretation. This update adds a clear rule that is easy to follow and should not impact the convergence or performance of DLRM implementations.

This was actually part of https://github.com/mlcommons/training_policies/pull/411, which we discussed, but I mistakenly closed that PR thinking it was only about packing, which we are no longer using it for. It is cleaner to break data shuffling out into its own PR anyway.
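For concreteness, here is a minimal Python sketch of what a well-defined shuffling rule could look like, namely a deterministic per-epoch permutation of sample indices derived from a fixed seed. This is only an illustration under that assumption; the actual rule for DLRM is the one spelled out in the PR text, and the function name `shuffle_indices` and the seed scheme below are hypothetical.

```python
import numpy as np

def shuffle_indices(num_samples: int, epoch: int, seed: int = 0) -> np.ndarray:
    """Return a deterministic permutation of sample indices for a given epoch.

    Illustration only: a fixed seed plus the epoch number yields a
    reproducible shuffle order. This is not the rule adopted by the
    MLPerf training policies; see the PR text for the actual wording.
    """
    rng = np.random.default_rng(seed + epoch)  # per-epoch deterministic RNG
    return rng.permutation(num_samples)

# Example: iterate over a toy training set in a shuffled, reproducible order.
for epoch in range(2):
    order = shuffle_indices(num_samples=10, epoch=epoch)
    print(epoch, order)
```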

johntran-nv · Apr 21 '21 20:04

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

github-actions[bot] · Apr 21 '21 20:04

+[email protected], +[email protected], could you please review/approve?

johntran-nv · Apr 21 '21 20:04

Deepak suggested that it is too late to change this for v1.0, which is fair. Let's defer the discussion to v1.1.

Separately, it looks like I inadvertently merged this, maybe as part of another PR. I'll go fix that now as well.

johntran-nv · Apr 30 '21 04:04