alignment-handbook
Robust recipes to align language models with human and AI preferences
Running the code in a Python shell succeeds, but an error occurs when I use "accelerate launch"...
Current training uses ConstantLengthDataset. This dataset returns a fixed number of tokens (2048) at every step; however, the total number of steps is calculated from the number of samples. I...
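For context, a minimal sketch of the packing behaviour being described here (an illustration only, not the actual `trl` `ConstantLengthDataset` code; the GPT-2 tokenizer and the sample texts are placeholders):

```python
# Illustration of token packing -- NOT the actual trl ConstantLengthDataset.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder tokenizer

def pack_examples(texts, seq_length=2048):
    """Concatenate tokenized texts and slice them into fixed-length chunks."""
    buffer = []
    for text in texts:
        buffer.extend(tokenizer(text)["input_ids"])
        while len(buffer) >= seq_length:
            yield buffer[:seq_length]  # one fixed-length training example
            buffer = buffer[seq_length:]

texts = ["a short sample", "a much longer sample " * 2000]
chunks = list(pack_examples(texts))
# The number of packed chunks depends on the total token count, not on
# len(texts), so step counts derived from raw sample counts can diverge.
print(len(texts), len(chunks))
```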
You claim that "[In practice, we find comparable performance for both full and LoRA fine-tuning, with the latter having the advantage of producing small adapter weights that are fast to...
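For readers weighing the two options, a rough sketch of why LoRA adapters are lightweight to store and share, using `peft` (the base model and LoRA hyperparameters below are placeholders, not the handbook's recipe values):

```python
# Sketch: LoRA trains and saves only small low-rank adapter matrices,
# not the full model weights. Placeholder model and hyperparameters.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
peft_model = get_peft_model(
    model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")
)
peft_model.print_trainable_parameters()  # a small fraction of total params

# peft_model.save_pretrained("adapter/")  # writes only the adapter weights
```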
I am planning to run SFT on real chatlogs so naturally I don't have the `prompt` field like in the Ultrachat dataset. AFAICT, this field is not used to perform...
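A hedged sketch of what that would look like: formatting raw chat logs into the `messages` structure that the chat template consumes, with no top-level `prompt` column (the Zephyr tokenizer and the example conversation are assumptions for illustration):

```python
# Sketch: chat logs as a `messages` list; no separate `prompt` field needed
# when the chat template is applied to the conversation directly.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

example = {
    "messages": [
        {"role": "user", "content": "Hi, I need help with my order."},
        {"role": "assistant", "content": "Sure, can you share the order ID?"},
    ]
}

text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
print(text)
```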
Hi, when I ran the dpo finetuning code, I noticed that there is a warning in the logging output `[WARNING|tokenization_utils_base.py:3831] 2023-12-06 16:44:52,195 >> Token indices sequence length is longer than...
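A small sketch of where that warning typically originates: tokenizing a sequence longer than the tokenizer's `model_max_length` trips the length check in `tokenization_utils_base`, even if the sequence is later truncated or packed into shorter chunks (GPT-2 and the dummy text are placeholders):

```python
# Sketch: reproducing the "Token indices sequence length is longer than..."
# warning. Placeholder tokenizer; gpt2 has model_max_length = 1024.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

long_text = "word " * 5000
ids = tokenizer(long_text)["input_ids"]  # emits the warning
print(len(ids), tokenizer.model_max_length)

# Explicit truncation keeps the sequence within bounds; when downstream
# packing/truncation handles long inputs anyway, the warning is benign.
ids = tokenizer(long_text, truncation=True)["input_ids"]
print(len(ids))
```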
This issue collects links to community feedback on the type of content to include in the handbook. Feel free to post a comment below with other ideas / requests! *...
I noticed that the alignment-handbook doesn't mask out the loss computed on the user and system inputs. To my knowledge, many SFT implementations choose to ignore these tokens. I'm curious about...
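For reference, TRL provides a collator that does this kind of masking; a sketch of its use (the Zephyr tokenizer and the `<|assistant|>` response template are assumptions and must match your chat template's assistant marker):

```python
# Sketch: mask non-assistant tokens from the loss. Tokens before the
# response template get label -100, so only completions contribute.
from transformers import AutoTokenizer
from trl import DataCollatorForCompletionOnlyLM

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

collator = DataCollatorForCompletionOnlyLM(
    response_template="<|assistant|>",
    tokenizer=tokenizer,
)
```

Note that this collator only works with `packing=False`, which is one practical reason a packed-training setup may compute the loss over all tokens instead.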
Hi! Thanks again for the awesome repo. I have a small question regarding the global batch size of DPO training reported in the paper vs used in the code base....
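For anyone checking the numbers, the global batch size follows from three config values; a sketch of the arithmetic (the values below are assumptions, not the recipe's actual settings):

```python
# Sketch: effective (global) batch size =
#   per_device_train_batch_size x gradient_accumulation_steps x num_gpus
per_device_train_batch_size = 8  # assumed values, not the recipe's
gradient_accumulation_steps = 1
num_gpus = 8

global_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)
print(global_batch_size)  # 64
```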
I work as a CMS admin at my company. I have around 1 million emails back and forth with our customers. How can I utilize these emails to make a chatbot...
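One possible first step, sketched below: convert each email thread into the `messages` format used for SFT. The field names (`sender`, `body`) are assumptions about the email export, not part of the handbook:

```python
# Sketch: map an email thread onto chat roles for SFT-style training data.
# Field names are hypothetical; adapt them to your email export format.
def thread_to_messages(thread):
    messages = []
    for email in thread:
        role = "user" if email["sender"] == "customer" else "assistant"
        messages.append({"role": role, "content": email["body"]})
    return {"messages": messages}

thread = [
    {"sender": "customer", "body": "My invoice looks wrong."},
    {"sender": "admin", "body": "Thanks for flagging, we will reissue it."},
]
print(thread_to_messages(thread))
```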
It seems that the system prompt is left as `\n`, or rather blank. Inspecting UltraChat (https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k?row=5), it appears that no system prompt is added to the dataset. There must be...
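A sketch of the behaviour being described: if preprocessing prepends an empty system message when a row has none, the chat template renders a blank system block (the Zephyr tokenizer and the insertion step are assumptions for illustration):

```python
# Sketch: UltraChat rows carry only user/assistant turns; prepending an
# empty system message yields a blank "<|system|>" block when rendered.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

messages = [
    {"role": "user", "content": "What is RLHF?"},
    {"role": "assistant", "content": "Reinforcement learning from human feedback."},
]

# Assumed preprocessing step: insert an empty system prompt if none exists.
if messages[0]["role"] != "system":
    messages.insert(0, {"role": "system", "content": ""})

print(tokenizer.apply_chat_template(messages, tokenize=False))
```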