algorithmic-efficiency
MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.
## Description We need to check whether the user experience of self-reporting is clear and easy. This might require additional code or scripts to make self-reporting as easy as possible....
Add workload variants for the base workloads. This is a tracking issue. 7/8 variants along with model-diff tests have been added already. Remaining work is to: - [x] Submit DeepSpeech...
We consistently observe an OOM error when running one of the NAdamW baselines on LibriSpeech Conformer with multiple trials in PyTorch on 8 V100s with 16GB each. This is...
Two copies of `criteo_resnet_pytorch` exist in .github/workflows/regression_tests_variants.yml. Is this intentional? If not, which is the correct version? Thanks!! ``` criteo_resnet_pytorch: runs-on: self-hosted needs: build_and_push_pytorch_docker_image steps: - uses: actions/checkout@v2 - name:...
## Description Is it possible to publish file hashes and directory layouts for all datasets after post-processing? I would like to run some checks to ensure that there are no...
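A check of the kind requested above could be sketched as follows. This is a minimal illustration, not part of the repository: the function name `build_manifest`, the choice of SHA-256, and the manifest format (relative POSIX path mapped to hex digest) are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def build_manifest(root, chunk_size=1 << 20):
    """Map each file under `root` to its SHA-256 hex digest.

    Hypothetical helper: keys are paths relative to `root` in POSIX form,
    so the sorted keys also describe the directory layout.
    """
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            # Read in chunks so very large dataset files are never held in memory.
            with path.open("rb") as f:
                for chunk in iter(lambda: f.read(chunk_size), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest
```

Comparing the dictionary built from a local dataset directory against a published one would flag missing, extra, or corrupted files in a single equality check.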
Do not merge this before changing base to dev. Running integration tests with these fixes.
The Conformer workload hangs when run with the Shampoo training algorithm. ## Description Traceback ``` I0505 23:26:00.158526 139795269302080 submission_runner.py:319] Starting training loop. I0505 23:26:00.373614 139795269302080 input_pipeline.py:20] Loading split = train-clean-100 I0505...
Add unit and integration tests to test the following requirements: In both strict=False and strict=True, to receive a finite score for a workload a submission must: - Reach the validation...
## Description Most of the code in data_setup.py is untested. There are a few challenges for these tests: - datasets are very large (just under 2TB in total, I believe)...