inference [automation and reproducibility taskforce] progress report for 20240130

Open gfursin opened this issue 1 year ago • 0 comments

Many thanks to MLPerf submitters and MLCommons members for their feedback over the past 2 weeks, which helped us improve the MLCommons CM automation for MLPerf inference:

Improvements and extensions to general CM automation recipes that all submitters can reuse to automate their benchmarking:

Download models

[x] cmr "get ml-model gptj"
[x] cmr "get ml-model stable-diffusion"
[x] cmr --tags=get,ml-model,llama2-70b
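
The list above mixes two ways of passing tags to `cmr`: a quoted, space-separated string and an explicit `--tags=` list. As far as we understand CM, both forms select the same script. The sketch below only prints the `--tags=` form of a quoted string and does not call CM; `tags_form` is a hypothetical helper for illustration, not a CM command.

```shell
# Hypothetical helper (not part of CM): print the --tags form of a quoted tag string.
tags_form() {
  # Replace each space between tags with a comma.
  printf '%s%s\n' '--tags=' "$(printf '%s' "$1" | tr ' ' ',')"
}

tags_form "get ml-model gptj"              # prints --tags=get,ml-model,gptj
tags_form "get dataset cnndm _validation"  # prints --tags=get,dataset,cnndm,_validation
```

Tags that start with an underscore, such as `_validation`, are passed through unchanged alongside the regular tags.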

Download datasets

[x] cmr "get dataset cnndm _validation"
[x] cmr "get dataset coco2014 _validation"
[x] cmr "get preprocessed dataset openorca _validation"

Detect/install frameworks

[x] cmr "get generic-python-lib _package.onnxruntime" --version_min=1.16.0
[x] cmr "get generic-python-lib _package.torch" --version=2.1.1
[x] cmr "get generic-python-lib _package.torchvision" --version=0.16.2
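
Recipes like the ones above can be chained in a small wrapper script. The following is a minimal sketch, not an official CM workflow: `run_recipe`, `setup_all` and the `DRY_RUN` variable are names made up for this example, and the choice of which three recipes to chain is arbitrary. By default it only prints the commands; set `DRY_RUN=0` to actually invoke `cmr`.

```shell
# Sketch: run a few of the setup recipes listed above one after another.
# DRY_RUN defaults to 1, so commands are only printed; set DRY_RUN=0 to call cmr.
DRY_RUN="${DRY_RUN:-1}"

run_recipe() {
  # "$@" is the full cmr argument list, e.g. "get ml-model gptj".
  if [ "$DRY_RUN" = "1" ]; then
    # Dry run: show the command (the quotes around the tag string are not printed).
    echo "cmr $*"
  else
    cmr "$@"
  fi
}

setup_all() {
  # The && chain stops at the first recipe that fails.
  run_recipe "get ml-model gptj" &&
  run_recipe "get dataset cnndm _validation" &&
  run_recipe "get generic-python-lib _package.onnxruntime" --version_min=1.16.0
}

setup_all
```

Keeping the dry run as the default makes it easy to review the exact command sequence before any model or dataset is downloaded.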

Continue repeatability study for v3.1

[x] GPT-J from Intel - thanks to feedback we are nearly done and have CM automation
[x] GPT-J from Nvidia - re-testing it because of some reports about potential issues
[ ] Google submission is under testing

Feel free to report issues or suggest submissions here or via our Discord server.

Suggestions that we may include in the new CM v1.6.1 release before the next meeting:

[x] Replace the current complex docs with a CM GUI that lets users select an implementation, target and model, and then generates CM commands or shows the current state of automation and reproducibility
- we have started prototyping this GUI and hope to show it at the next meeting.
[ ] Develop a high-level CM script to run different MLPerf implementations with all models for a given target in sequence, similar to SPEC benchmarks (in progress).
[ ] Update inference v4.0 READMEs with CM commands to download/detect/install models, datasets and frameworks

Longer term:

[x] Suggestion for MLPerf repeatability/reproducibility badges from ACM/IEEE/NeurIPS
[ ] Suggestion to restart the Automation and Reproducibility TF - contact us if you would like to participate, chair, develop ...

CM is a collaborative engineering effort based on your feedback - please don't hesitate to get in touch via our Discord server or open a ticket here. Thank you!

Jan 30 '24 16:01 gfursin
