
Robust recipes to align language models with human and AI preferences

90 alignment-handbook issues

Thank you for your work! I was using FSDP + QLoRA to fine-tune Llama 3 70B on 8× A100 80GB GPUs, and I encountered this error: ```shell Traceback (most recent call...

Hi @edbeeching, thanks for the great work in ablating the KTO/IPO/DPO algorithms in #104. I notice that in the referenced [blog](https://huggingface.co/blog/pref-tuning), it says the best-performing model for...

Hi team, great work! QDoRA seems to perform better than QLoRA; see [Efficient finetuning of Llama 3 with FSDP QDoRA](https://www.answer.ai/posts/2024-04-26-fsdp-qdora-llama3.html). I wonder whether there will be a demo /...

I added a 4-bit load option to the LoRA-training-with-ZeRO-3 command on two or more GPUs to combine QLoRA with ZeRO-3, but the program encountered the...

I downloaded a dataset from the Hugging Face Hub and want to load it locally, but the script still tries to download it from the Hub and place it in the cache. How can I...
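A minimal sketch of pointing the `datasets` library at a local copy instead of the Hub; the paths below are placeholders, and the handbook's own data-loading code may differ:

```python
from datasets import load_dataset, load_from_disk

# Option 1: a dataset previously saved with `Dataset.save_to_disk`
# (the path is a placeholder for wherever the local copy lives).
ds = load_from_disk("/data/my_dataset")

# Option 2: load local data files (e.g. Parquet) directly, without touching the Hub.
ds = load_dataset(
    "parquet",
    data_files={"train": "/data/my_dataset/train-*.parquet"},
)
```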

## Description As briefly discussed with @lewtun this morning, this PR adds the `scripts/run_kto.py` script to the `alignment-handbook` for fine-tuning LLMs with the `trl.KTOTrainer`. The script should work as is,...
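A minimal sketch of what such a KTO fine-tuning script might look like with `trl.KTOTrainer`; the model and dataset names are placeholders, and argument names (e.g. `tokenizer` vs. `processing_class`) vary across `trl` versions:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import KTOConfig, KTOTrainer

model_name = "mistralai/Mistral-7B-v0.1"  # placeholder base model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# KTO uses unpaired preference data: a prompt, a completion, and a boolean label
# marking each completion as desirable or undesirable.
train_dataset = load_dataset("trl-lib/kto-mix-14k", split="train")  # placeholder dataset

training_args = KTOConfig(
    output_dir="kto-model",
    beta=0.1,                # strength of the implicit KL penalty toward the reference model
    desirable_weight=1.0,    # loss weight for desirable completions
    undesirable_weight=1.0,  # loss weight for undesirable completions
)

trainer = KTOTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```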

## Description Since Mistral recently marked their repository at https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2 as gated, I'm afraid the account whose `HF_TOKEN` is set as a secret for the CI will...

I forgot the `do_train` flag when creating the CPT script, or left it out for some reason, but I think it would still be useful to add it.

Hello, thank you for sharing this awesome resource! I have a question regarding models that already have a chat template, like "mistralai/Mistral-7B-Instruct-v0.1". I'm planning on using the non-packed dataset....
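For a model that already ships a chat template, a minimal sketch of formatting a conversation with the built-in template (rather than setting a custom one); the example messages are made up:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

messages = [
    {"role": "user", "content": "What is preference tuning?"},
    {"role": "assistant", "content": "It aligns a model with human or AI preferences."},
]

# Uses the chat template already bundled with the tokenizer, so no custom
# template needs to be defined before SFT on a non-packed dataset.
text = tokenizer.apply_chat_template(messages, tokenize=False)
print(text)
```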

from the README in `/scripts`:
```yaml
datasets_mixer:
  dataset_1: 0.5   # Use 50% of the training examples
  dataset_2: 0.66  # Use 66% of the training examples
  dataset_3: 0.10  # Use 10%...
```
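As an illustration of the semantics only (not the handbook's actual mixer implementation), the fractions can be read as subsampling each dataset before concatenation; the dataset names below are placeholders:

```python
from datasets import concatenate_datasets, load_dataset

# Hypothetical mixer fractions, mirroring the YAML above.
mixer = {"dataset_1": 0.5, "dataset_2": 0.66, "dataset_3": 0.10}

parts = []
for name, frac in mixer.items():
    ds = load_dataset(name, split="train")
    # Keep only the requested fraction of the training examples.
    parts.append(ds.shuffle(seed=42).select(range(int(frac * len(ds)))))

train_dataset = concatenate_datasets(parts)
```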