
MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.

50 algorithmic-efficiency issues, sorted by recently updated.

## Original issue Self-tuning submissions use default values for `dropout_rate` and `aux_dropout_rate` (see #753). We want to allow them to specify custom values for these hyperparameters. ## Solution Allow self-tuning...
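As a sketch of what allowing custom values could look like (all names below are illustrative, not the benchmark's actual API), a self-tuning submission might merge its own dropout values over the defaults it currently receives:

```python
# Hypothetical sketch: a self-tuning submission supplying its own dropout
# values instead of the benchmark defaults. Names are illustrative only.
DEFAULTS = {"dropout_rate": 0.1, "aux_dropout_rate": 0.0}

def resolve_dropout(overrides=None):
    """Merge submission-provided dropout values over the defaults."""
    config = dict(DEFAULTS)
    if overrides:
        for key in ("dropout_rate", "aux_dropout_rate"):
            if key in overrides:
                config[key] = overrides[key]
    return config

# A self-tuning submission could then request a custom value:
custom = resolve_dropout({"dropout_rate": 0.05})
```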

## Description Our current setup uses pre-commit hooks to enforce code quality checks before each commit. While this helps maintain consistency, it can slow down development. To balance consistency and...
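One common way to strike that balance (this is a generic illustration, not the repository's actual configuration) is to keep only fast checks in `.pre-commit-config.yaml` and defer slower ones to CI:

```yaml
# Illustrative only: run a fast lint locally; leave slow checks to CI.
repos:
  - repo: local
    hooks:
      - id: fast-lint
        name: fast lint
        entry: ruff check .
        language: system
        pass_filenames: false
```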

It took me a long time to figure out the basics of how this benchmark works, so I think a short description at the beginning of the README would be...

### Workload #### Task Text generation. #### Dataset TBD #### Model TBD Possible candidates include - [preferred starting point] [Nanodo](https://github.com/google-deepmind/nanodo) - NanoGPT - Meta’s [lingua](https://github.com/facebookresearch/lingua) - Keller Jordan’s [modded nanoGPT](https://github.com/KellerJordan/modded-nanogpt)...

⚙️ Workload

It is useful to shard optimizer state across devices (to save significant memory). This reflects current practice. We want to support it. * We want to switch from no sharding...

👷 In Progress


## Description We have so far been unable to reproduce the schedule-free AdamW results with JAX. There seem to be differences between the optax implementation of schedule-free AdamW and the...

The train diff tests are difficult to run at the moment.
- Add documentation on how to run them
- Eliminate IO errors related to writing temporary results to files

Good First Issue

Refactor the modeldiff tests so that:
- variable names are clear (e.g. `pytorch` instead of `pyt`)
- script names are clear and descriptive
- duplicated logic is removed

Good First Issue