
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.

28 issues, sorted by recently updated:

The [config file](https://github.com/bigscience-workshop/bigscience/blob/b4a4f4651771cb78297abe5074aaf2de1f92d6ce/train/tr11-176B-ml/setup-test-n2.slurm) lists the dataset's sample count as 220M and the global batch size as 2048, which equates to ~107K steps per epoch. The [main README](https://huggingface.co/bigscience/bloom/blob/main/README.md) says...
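As a quick sanity check on the arithmetic (a minimal sketch; the 220M sample count and 2048 global batch size are the values cited from the linked config):

```python
# Steps per epoch = dataset samples / global batch size.
dataset_samples = 220_000_000   # ~220M samples
global_batch_size = 2048

steps_per_epoch = dataset_samples // global_batch_size
print(f"{steps_per_epoch:,} steps per epoch")  # 107,421 -> ~107K
```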

A clean-up to avoid keeping SLURM scripts in the Meg-DS repo. I will clean up the Meg-DS repo & the evaluation-results repo if we merge this.

It's more common and easier to follow to put the participial phrase after the noun, I think.

Notes re: learning rate: T0 & FLAN use Adafactor, which automatically adjusts the step size: [Finally, while the learning rate in Adam denotes a target absolute step size, we follow...
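For context, a minimal sketch of how Adafactor's relative step sizing is typically enabled, using the Hugging Face `transformers` implementation; the model and hyperparameters here are illustrative placeholders, not the actual T0/FLAN settings:

```python
import torch
from transformers.optimization import Adafactor

model = torch.nn.Linear(1024, 1024)  # placeholder model

# With relative_step=True and lr=None, Adafactor derives the step size
# from the training step and the parameter scale, instead of using a
# fixed absolute learning rate the way Adam does.
optimizer = Adafactor(
    model.parameters(),
    lr=None,
    relative_step=True,
    scale_parameter=True,
    warmup_init=True,
)
```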

Add small arguments accepted by accelerate for better performance; in the previous script we were offloading to disk, which takes a lot of time. cc @Muennighoff
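For reference, a minimal sketch of steering accelerate's DeepSpeed offload to CPU rather than disk; the ZeRO stage and the choice to offload both optimizer state and parameters are assumptions for illustration, not the script's actual settings:

```python
from accelerate import Accelerator, DeepSpeedPlugin

# Offload to CPU instead of disk ("nvme"); disk offload is far slower.
# zero_stage=3 is an illustrative assumption, not the script's value.
ds_plugin = DeepSpeedPlugin(
    zero_stage=3,
    offload_optimizer_device="cpu",
    offload_param_device="cpu",
)
accelerator = Accelerator(deepspeed_plugin=ds_plugin)
```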

This PR updates the small model SLURM scripts with the ones used to finish their training. We could also make these separate files / mark somewhere that we continued with...