Ritika Borkar
I am assuming we make the check on train_samples & eval_samples match the reference values for all benchmarks, as noted for RN50 in https://github.com/mlcommons/submission_training_1.0/issues/48 as well. If you...
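A minimal sketch of the kind of check being discussed: verify that logged train_samples / eval_samples match the benchmark's reference values. The function name, the table, and its contents are illustrative assumptions, not the actual compliance checker's API or the official reference counts.

```python
# Hypothetical sketch of a train/eval sample-count compliance check.
# REFERENCE_SAMPLES values are placeholders, not official numbers.

REFERENCE_SAMPLES = {
    # benchmark: (train_samples, eval_samples) -- illustrative only
    "resnet": (1281167, 50000),
}

def check_sample_counts(benchmark, logged_train, logged_eval):
    """Return a list of error strings; an empty list means the check passed."""
    errors = []
    ref_train, ref_eval = REFERENCE_SAMPLES[benchmark]
    if logged_train != ref_train:
        errors.append(
            f"{benchmark}: train_samples={logged_train}, expected {ref_train}")
    if logged_eval != ref_eval:
        errors.append(
            f"{benchmark}: eval_samples={logged_eval}, expected {ref_eval}")
    return errors
```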
@shangw-nvidia , @emizan76 Will the compliance checker verify this for v1.1?
@emizan76 is this something you can help with?
Thanks Elias. Can we expect this for v1.0?
@sgpyc to keep me honest. LAMB is not an allowed optimizer for RN50; only LARS and SGD are allowed. https://github.com/mlcommons/training/tree/master/image_classification#optimizer The rules already allow apex.optimizers.FusedSGD [here](https://github.com/mlcommons/training_policies/blob/master/training_rules.adoc#15-appendix-allowed-closed-division-optimizers)
Thanks, the implementation looks good as long as Nesterov momentum is not used (the reference doesn't use Nesterov).
We also have some RCPs that break the non-decreasing requirement with respect to increasing batch size. This is possible if better hparams were not known at the time these RCPs were...
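To illustrate the non-decreasing requirement mentioned above: reference epochs to converge are expected not to decrease as batch size grows, so an RCP where a larger batch size converges in fewer epochs is suspect. The function below is a hypothetical sketch, not the actual RCP checker.

```python
# Illustrative check for RCPs that violate the non-decreasing
# epochs-vs-batch-size expectation. Data format is assumed.

def find_non_monotonic_rcps(rcps):
    """rcps: list of (batch_size, mean_epochs_to_converge) pairs, any order.

    Returns adjacent (by batch size) RCP pairs where the larger batch
    size converged in fewer epochs, i.e. candidates for re-tuning.
    """
    ordered = sorted(rcps)
    violations = []
    for (bs_a, ep_a), (bs_b, ep_b) in zip(ordered, ordered[1:]):
        if ep_b < ep_a:  # larger batch, fewer epochs -> suspicious RCP
            violations.append(((bs_a, ep_a), (bs_b, ep_b)))
    return violations
```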
@davidjurado is the issue you observed resolved now?
@davidjurado can you please address Shriya's feedback. We can then merge this PR.
Discussed in Training WG (3/28): @itayhubara is verifying whether setting this value correctly affects convergence, and whether it can improve convergence or reduce the coefficient of variation in the RCPs.