# algorithmic-efficiency
MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.
## Original issue

Self-tuning submissions use default values for `dropout_rate` and `aux_dropout_rate` (see #753). We want to allow them to specify custom values for these hyperparameters.

## Solution

Allow self-tuning...
## Description

Our current setup uses pre-commit hooks to enforce code-quality checks before each commit. While this helps maintain consistency, it can slow down development. To balance consistency and...
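One common way to keep fast hooks on every commit while deferring slow ones is pre-commit's `stages: [manual]` mechanism. The sketch below is hypothetical: the hook ids, entries, and revs are illustrative, not this repository's actual configuration.

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: end-of-file-fixer   # fast check, runs on every commit
  - repo: local
    hooks:
      - id: slow-lint           # hypothetical slow hook
        name: full lint
        entry: pylint src       # illustrative command
        language: system
        stages: [manual]        # skipped on ordinary commits
```

Hooks marked `stages: [manual]` are skipped by `git commit` and can instead be run explicitly (for example in CI) with `pre-commit run --hook-stage manual --all-files`.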
It took me a long time to figure out the basics of how this benchmark works, so I think a short description at the beginning of the README would be...
### Workload

#### Task

Text generation.

#### Dataset

TBD

#### Model

TBD. Possible candidates include:

- [preferred starting point] [Nanodo](https://github.com/google-deepmind/nanodo)
- NanoGPT
- Meta’s [lingua](https://github.com/facebookresearch/lingua)
- Keller Jordan’s [modded nanoGPT](https://github.com/KellerJordan/modded-nanogpt)
- ...
It is useful to shard optimizer state across devices (to save significant memory). This reflects current practice, and we want to support it.

* We want to switch from no sharding...
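The memory saving can be illustrated with a toy plain-Python sketch of ZeRO-style state partitioning. This is not the repo's actual JAX/PyTorch sharding code; the function name and layout are made up for illustration:

```python
def shard_optimizer_state(num_params, num_devices):
    """Toy ZeRO-style sharding: each device owns the Adam moments (m, v)
    for only a contiguous ~1/num_devices slice of the parameters,
    instead of every device replicating the full state."""
    size = -(-num_params // num_devices)  # ceil division
    shards = []
    for d in range(num_devices):
        lo = d * size
        hi = min(lo + size, num_params)
        shards.append({
            "m": [0.0] * (hi - lo),  # first-moment slice owned by device d
            "v": [0.0] * (hi - lo),  # second-moment slice owned by device d
        })
    return shards

shards = shard_optimizer_state(num_params=10, num_devices=4)
replicated = 4 * 2 * 10  # no sharding: every device holds full m and v
sharded = sum(len(s["m"]) + len(s["v"]) for s in shards)  # 2 * 10 in total
```

With sharding, the total moment storage across all devices equals one copy of the state (here 20 entries) instead of one copy per device (here 80), at the cost of a gather step when applying updates.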
## Description

So far we have been unable to reproduce the schedule-free AdamW results with JAX. There seem to be differences between the optax implementation of schedule-free AdamW and the...
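For reference, the core schedule-free update (evaluate gradients at an interpolation between the raw iterate and a running average) can be sketched in plain Python. This is a simplified SGD variant for illustration only, not the optax or reference AdamW implementation being compared in this issue:

```python
def schedule_free_sgd(grad_fn, z0, lr, steps, beta=0.9):
    """Toy schedule-free SGD sketch (Defazio et al.-style averaging).

    Maintains two sequences: z (raw iterate) and x (running average).
    Gradients are evaluated at the interpolation y between x and z.
    """
    z = z0
    x = z0
    for t in range(1, steps + 1):
        y = (1 - beta) * z + beta * x   # evaluation point
        z = z - lr * grad_fn(y)         # raw SGD step on z
        c = 1.0 / t                     # equal-weight running average
        x = (1 - c) * x + c * z
    return x

# Minimize f(w) = w**2 (gradient 2w) starting from w = 5.0.
w = schedule_free_sgd(lambda w: 2 * w, z0=5.0, lr=0.1, steps=200)
```

Subtle differences in how implementations define the interpolation weight, the averaging coefficient `c`, or the warmup handling are exactly the kind of discrepancy that can make two "schedule-free AdamW" variants diverge.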
The train diff tests are difficult to run at the moment.

- Add documentation on how to run them.
- Eliminate I/O errors related to writing temporary results to files.
Refactor the modeldiff tests so that:

- variable names are clear (e.g. `pytorch` instead of `pyt`)
- script names are clear and descriptive
- duplicated logic is removed