alignment-handbook
Robust recipes to align language models with human and AI preferences
When fine-tuning Mistral with LoRA, do you think FlashAttention2 helps in speeding up the process? If yes, how significant is the acceleration? Where is the primary acceleration achieved?
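For reference, a minimal sketch of enabling FlashAttention-2 when loading Mistral for LoRA fine-tuning; the model name, rank, and target modules below are illustrative assumptions rather than the handbook's exact recipe:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load Mistral with the FlashAttention-2 kernel enabled (requires fp16/bf16).
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)

# Wrap the base model with a LoRA adapter; hyperparameters here are illustrative.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```

On the modeling side, the `attn_implementation` flag is the only change needed to switch attention kernels; the rest of the LoRA setup stays the same.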
How to fine-tune or apply LoRA on a custom dataset
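A hedged sketch of the data side: the handbook's SFT scripts consume a `messages` column in chat format, so a custom dataset mainly needs to be converted into that shape. The example rows and model name below are purely illustrative:

```python
from datasets import Dataset
from transformers import AutoTokenizer

# Illustrative custom data in the "messages" chat format expected by the SFT scripts.
examples = [
    {
        "messages": [
            {"role": "user", "content": "What is LoRA?"},
            {"role": "assistant", "content": "A parameter-efficient fine-tuning method."},
        ]
    }
]
dataset = Dataset.from_list(examples)

# Render one example with the model's chat template to sanity-check the formatting.
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
print(tokenizer.apply_chat_template(dataset[0]["messages"], tokenize=False))
```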
I noticed in the model card for zephyr-7b-beta that you mentioned "removing the in-built alignment of these datasets boosted performance on MT Bench and made the model more helpful," resulting...
https://github.com/huggingface/alignment-handbook/blob/606d2e954fd17999af40e6fb4f712055ca11b2f0/src/alignment/data.py#L216-L221 Actual exception is `ValueError`:
```
[rank5]: Traceback (most recent call last):
[rank5]:   File "run_sft.py", line 251, in <module>
[rank5]:     main()
[rank5]:   File "run_sft.py", line 86, in main
[rank5]:     raw_datasets =...
```
Hello, I want to load the `training_args.bin` of [zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) and pass it to the `DPOTrainer` of `trl` to compute the implicit rewards and logps conveniently, but it seems to lack some private...
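As a hedged sketch (assuming the repo ships a standard `training_args.bin` pickled by `Trainer`), the file can at least be inspected with `torch.load`; whether the resulting object carries the DPO-specific fields `DPOTrainer` needs is exactly the open question here:

```python
import torch
from huggingface_hub import hf_hub_download

# Download and unpickle the saved TrainingArguments; on newer torch versions
# weights_only=False is required because this is a full Python object, not tensors.
path = hf_hub_download("HuggingFaceH4/zephyr-7b-beta", "training_args.bin")
training_args = torch.load(path, weights_only=False)
print(training_args.learning_rate, training_args.per_device_train_batch_size)
```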
This PR is related to #189. Currently `transformers` tries to convert every config object into a dictionary ([link](https://github.com/huggingface/transformers/blob/e0d82534cc95b582ab072c1bbc060852ba7f9d51/src/transformers/training_args.py#L2468C6-L2468C6)), but it does not handle the nested config case (`BitsAndBytesConfig`), failing with the following error...
Hello, I tried to fine-tune a model using the SFT/QLoRA method provided in the handbook. Everything runs fine until the beginning of the training phase, at which point the following error...
Hi, I just followed recipes/zephyr-7b-beta/dpo/config_qlora.yaml, hoping to replicate the experiments. I was training on a single A10G GPU, and the only modification I made was reducing the train_batch_size from...
Hi Team, this is an amazing handbook. In the continued pre-training script (`run_cpt.py`), I saw that it is not using the "mlm" (masked language modeling) parameter in the training process. I thought...
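For context, a minimal illustration of where the `mlm` flag normally appears: `transformers`' `DataCollatorForLanguageModeling` takes `mlm=False` for causal-LM (next-token prediction) training. Whether `run_cpt.py` builds this collator itself or relies on `trl` defaults is an assumption here:

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

# Illustrative only: for causal-LM continued pre-training the collator is
# typically created with mlm=False, so labels are just the input ids
# (shifted internally by the model) rather than masked tokens.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
tokenizer.pad_token = tokenizer.eos_token  # Mistral's tokenizer has no pad token by default
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
```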
Hello, I face a problem when training a Mistral model with SFT and the DeepSpeed ZeRO-3 config. Here is the error information: `Variable._execution_engine.run_backward(  # Calls into the C++ engine`...