nebuly
Refactor of models and trainers with base class for common methods
- Refactored models and trainers to avoid code duplication.
- Added logging with the loguru package (see the logging sketch after this list).
- Fixed logging with multi-GPU trainers.
- Added support for LoRA via the PEFT library (a combined LoRA + 8-bit loading sketch follows the list).
- Added support for the load_8bit option with HF models.
- Added the self-instruct dataset from HF.
- Added CerebrasGPT and Decapoda LLaMA models from HF.
- Added mixed-precision training to reduce GPU memory requirements (sketch below).
- Fixed the RLHF KL divergence equation (the standard formulation is recalled below).
- Added support for keeping only the last n checkpoints in all trainings (sketch below).
- Added generation of negative examples when building the reward dataset, to improve the quality of the reward model.
- Improved stability of multi-GPU training with both HF Accelerate and DeepSpeed.
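For reference, a minimal sketch of loguru-based logging; the sink name, rotation policy, and messages here are illustrative and not the ones used by the trainers.

```python
from loguru import logger

# Log to stderr (default sink) and to a rotating file.
logger.add("training.log", rotation="10 MB", level="INFO")

logger.info("Starting epoch {}", 1)
logger.warning("Gradient norm is high: {:.3f}", 12.5)
```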
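The LoRA and load_8bit items usually go together: the base model is loaded in 8-bit and then wrapped with PEFT adapters. Below is a minimal sketch under that assumption; the model name, LoRA hyperparameters, and target modules are illustrative choices, not the defaults used by the trainers.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

# Example model from the PR; any HF causal LM should work the same way.
model_name = "cerebras/Cerebras-GPT-1.3B"

# Load the base model in 8-bit to cut GPU memory usage (requires bitsandbytes).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    device_map="auto",
)
model = prepare_model_for_int8_training(model)

# Wrap the model with LoRA adapters so only a small set of weights is trained.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],  # attention projection for GPT-2-style models
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```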
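A minimal sketch of the mixed-precision training pattern with `torch.cuda.amp`, using a toy model and random data; the actual trainers may delegate this to Accelerate or DeepSpeed instead.

```python
import torch
from torch import nn

# Toy model, optimizer, and data just to illustrate the AMP pattern.
model = nn.Linear(512, 512).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

for step in range(10):
    x = torch.randn(8, 512, device="cuda")
    target = torch.randn(8, 512, device="cuda")
    optimizer.zero_grad()
    # Forward pass runs in reduced precision where it is numerically safe.
    with torch.cuda.amp.autocast():
        loss = nn.functional.mse_loss(model(x), target)
    # Scale the loss to avoid fp16 gradient underflow, then step and update.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```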
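On the KL divergence fix: the standard InstructGPT-style RLHF objective penalizes the reward-model score with the KL divergence between the policy being trained and the frozen SFT reference policy. The exact form used in the code may differ; this is only the usual formulation.

```latex
% Reward-model score corrected by a KL penalty against the frozen SFT policy.
\[
R(x, y) \;=\; r_\theta(x, y)
\;-\; \beta \, \log \frac{\pi^{\mathrm{RL}}_{\phi}(y \mid x)}{\pi^{\mathrm{SFT}}(y \mid x)}
\]
```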
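A possible shape of the "keep only the last n checkpoints" behaviour; the helper name, directory layout, and checkpoint naming pattern below are hypothetical, not the ones in the PR.

```python
from pathlib import Path


def keep_last_n_checkpoints(checkpoint_dir: str, n: int) -> None:
    """Delete all but the n most recent checkpoint files (n >= 1 assumed)."""
    checkpoints = sorted(
        Path(checkpoint_dir).glob("checkpoint_*.pt"),
        key=lambda path: path.stat().st_mtime,
    )
    # Remove the oldest files, keeping only the last n.
    for old_checkpoint in checkpoints[:-n]:
        old_checkpoint.unlink()


# Example: after saving a new checkpoint, prune the directory to the last 3.
keep_last_n_checkpoints("./checkpoints", n=3)
```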
@PierpaoloSorbellini please add a description of what this PR is adding in terms of features and which bugs it is fixing.