nebuly
Refactor of models and trainers with base class for common methods
- Refactored models and trainers to avoid code duplication.
- Added logging with the loguru package (see the logging sketch after this list).
- Fixed logging with multi-GPU trainers.
- Added support for LoRA via the PEFT library (a combined LoRA + 8-bit loading sketch follows the list).
- Added support for the load_8bit option with HF models.
- Added the self-instruct dataset from HF.
- Added CerebrasGPT and Decapoda LLaMA models from HF.
- Added mixed-precision training to reduce GPU memory requirements (sketch below).
- Fixed the RLHF KL divergence equation (the standard formulation is recalled below).
- Added support for keeping only the last n checkpoints in all trainings (sketch below).
- Added generation of negative examples when building the reward dataset, to improve the quality of the reward model.
- Improved stability of multi-GPU training with both HF Accelerate and DeepSpeed.
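For reference, a minimal sketch of loguru-based logging; the sink name, rotation policy, and messages here are illustrative and not the ones used by the trainers.

```python
from loguru import logger

# Log to stderr (default sink) and to a rotating file.
logger.add("training.log", rotation="10 MB", level="INFO")

logger.info("Starting epoch {}", 1)
logger.warning("Gradient norm is high: {:.3f}", 12.5)
```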
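The LoRA and load_8bit items usually go together: the base model is loaded in 8-bit and then wrapped with PEFT adapters. Below is a minimal sketch under that assumption; the model name, LoRA hyperparameters, and target modules are illustrative choices, not the defaults used by the trainers.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

# Example model from the PR; any HF causal LM should work the same way.
model_name = "cerebras/Cerebras-GPT-1.3B"

# Load the base model in 8-bit to cut GPU memory usage (requires bitsandbytes).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    device_map="auto",
)
model = prepare_model_for_int8_training(model)

# Wrap the model with LoRA adapters so only a small set of weights is trained.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],  # attention projection for GPT-2-style models
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```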
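A minimal sketch of the mixed-precision training pattern with `torch.cuda.amp`, using a toy model and random data; the actual trainers may delegate this to Accelerate or DeepSpeed instead.

```python
import torch
from torch import nn

# Toy model, optimizer, and data just to illustrate the AMP pattern.
model = nn.Linear(512, 512).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

for step in range(10):
    x = torch.randn(8, 512, device="cuda")
    target = torch.randn(8, 512, device="cuda")
    optimizer.zero_grad()
    # Forward pass runs in reduced precision where it is numerically safe.
    with torch.cuda.amp.autocast():
        loss = nn.functional.mse_loss(model(x), target)
    # Scale the loss to avoid fp16 gradient underflow, then step and update.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```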
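On the KL divergence fix: the standard InstructGPT-style RLHF objective penalizes the reward-model score with the KL divergence between the policy being trained and the frozen SFT reference policy. The exact form used in the code may differ; this is only the usual formulation.

```latex
% Reward-model score corrected by a KL penalty against the frozen SFT policy.
\[
R(x, y) \;=\; r_\theta(x, y)
\;-\; \beta \, \log \frac{\pi^{\mathrm{RL}}_{\phi}(y \mid x)}{\pi^{\mathrm{SFT}}(y \mid x)}
\]
```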
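A possible shape of the "keep only the last n checkpoints" behaviour; the helper name, directory layout, and checkpoint naming pattern below are hypothetical, not the ones in the PR.

```python
from pathlib import Path


def keep_last_n_checkpoints(checkpoint_dir: str, n: int) -> None:
    """Delete all but the n most recent checkpoint files (n >= 1 assumed)."""
    checkpoints = sorted(
        Path(checkpoint_dir).glob("checkpoint_*.pt"),
        key=lambda path: path.stat().st_mtime,
    )
    # Remove the oldest files, keeping only the last n.
    for old_checkpoint in checkpoints[:-n]:
        old_checkpoint.unlink()


# Example: after saving a new checkpoint, prune the directory to the last 3.
keep_last_n_checkpoints("./checkpoints", n=3)
```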
@PierpaoloSorbellini please add a description of what this PR is adding in terms of features and which bugs it is fixing.