alignment-handbook
Robust recipes to align language models with human and AI preferences
Hi, I wonder which TruthfulQA task you focus on during evaluation: MC1, MC2, or the generation task?
I used the Hugging Face pipeline to run an inference task, but found that my fine-tuned model and the original `HuggingFaceH4/zephyr-7b-beta` generate exactly the same outputs. Does anyone have any clue about this?
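For anyone hitting the same thing, here is a minimal sketch for comparing two checkpoints side by side with the `transformers` pipeline (the fine-tuned path below is a placeholder). With greedy decoding, genuinely different weights should produce different continuations, so identical outputs usually mean the fine-tuned weights were never actually loaded or merged:

```python
# Minimal sketch: compare greedy outputs of the base model and a
# fine-tuned checkpoint. "path/to/my-finetuned-model" is a placeholder.
import torch
from transformers import pipeline

# Prompt formatted with Zephyr's chat markup.
prompt = "<|user|>\nSummarize DPO in one sentence.</s>\n<|assistant|>\n"

for name in ["HuggingFaceH4/zephyr-7b-beta", "path/to/my-finetuned-model"]:
    pipe = pipeline("text-generation", model=name,
                    torch_dtype=torch.bfloat16, device_map="auto")
    out = pipe(prompt, max_new_tokens=64, do_sample=False)
    print(f"=== {name} ===\n{out[0]['generated_text']}\n")
```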
I was looking at the logs of your training (from this [json](https://huggingface.co/HuggingFaceH4/mistral-7b-sft-beta/resolve/main/trainer_state.json?download=true) file) and realized that the learning-rate scheduling is messed up. It's related to TRL's `ConstantLengthDataset` not computing its...
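For context: when the train dataset is an iterable without a `__len__` (as with TRL's `ConstantLengthDataset`), the `Trainer` cannot infer the total number of training steps, so the learning-rate scheduler gets built over the wrong horizon. A sketch of the usual workaround, with illustrative values, is to pass an explicit `max_steps`:

```python
# Sketch: give the Trainer an explicit step budget so the LR scheduler
# is constructed over the right horizon when the dataset has no length.
# All values here are illustrative.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="sft-output",
    max_steps=2000,              # explicit horizon for the scheduler
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
)
```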
## Abstract
In the rapidly evolving field of artificial intelligence (AI), aligning AI systems with human values and intentions, known as AI alignment, is of paramount importance. This whitepaper introduces...
The script errors out only with Yi 34B Chat. I have tried Llama2 7/13B and SUSTech/SUS-Chat-34B and they all work. Yi 34B Chat has consistently been running into the following...
Hi, what is the best way to run this on my high-performance laptop? Should this work at all? Can I estimate how many days or weeks it will run? Thanks in advance...
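As a rough answer, training time can be bounded with the common ~6 · parameters · tokens training-FLOPs rule of thumb. A sketch with entirely illustrative numbers (model size, token budget, GPU throughput, and utilization are all assumptions, not measurements):

```python
# Back-of-the-envelope estimate using the ~6 * N * T training-FLOPs rule.
# Every number below is an assumption, not a measurement.
params = 7e9                  # 7B-parameter model
tokens = 1e9                  # tokens processed during fine-tuning
flops_needed = 6 * params * tokens

peak_flops = 20e12            # assumed laptop GPU peak: 20 TFLOP/s
utilization = 0.3             # assumed sustained utilization
seconds = flops_needed / (peak_flops * utilization)
print(f"roughly {seconds / 86400:.1f} days")   # ~81 days with these guesses
```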
Just wanted to report a crash while training. **Error message:** `[process exited with code 1 (0x00000001)]` **Command I used to start the process:** `ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/multi_gpu.yaml --num_processes=1 scripts/run_sft.py...
I've run the training on two machines without changing any hyperparameters except the per-device batch size and gradient accumulation steps, adjusted to keep the global batch size the same. The first run is exactly...
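For reference, the invariant being matched here is global batch size = per-device batch size × gradient-accumulation steps × number of GPUs. A quick sanity check with illustrative values:

```python
# Sanity-check that two launch configurations share the same global
# batch size. All values are illustrative.
def global_batch(per_device: int, grad_accum: int, num_gpus: int) -> int:
    return per_device * grad_accum * num_gpus

assert global_batch(8, 2, 8) == global_batch(4, 4, 8) == 128
```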
The paper evaluates on ARC, HellaSwag, MMLU, and TruthfulQA, but this repo does not reference these evals. Adding a short explanation of these evals (e.g., in https://github.com/huggingface/alignment-handbook/tree/main/scripts#evaluating-chat-models) would be nice.
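Until such docs land, here is a sketch of running those four tasks with EleutherAI's lm-evaluation-harness (v0.4+ Python API). Task names follow the harness's conventions; note the Open LLM Leaderboard uses task-specific few-shot counts (25/10/5/0), which this bare call does not set, so the exact paper settings may differ:

```python
# Sketch: run the four Open LLM Leaderboard-style tasks via lm-eval's
# Python API. Few-shot counts and prompt formats are left at defaults.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=HuggingFaceH4/zephyr-7b-beta,dtype=bfloat16",
    tasks=["arc_challenge", "hellaswag", "mmlu", "truthfulqa_mc2"],
    batch_size=8,
)
print(results["results"])
```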
Recently, I attempted to fit DPO on my own dataset. Initially, I tried to reproduce the results of your LoRA model (7.43 on MT-Bench). However, I encountered some issues...
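For anyone reproducing this, a minimal sketch of DPO with a LoRA adapter via TRL. The tiny inline dataset, hyperparameters, and target modules are illustrative, not the handbook's recipe, and argument names have moved between TRL releases (e.g. `beta` now lives in `DPOConfig`, and `tokenizer` was later renamed `processing_class`), so check your installed version:

```python
# Minimal DPO + LoRA sketch with TRL. Dataset and hyperparameters are
# illustrative placeholders, not the handbook's actual configuration.
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "HuggingFaceH4/mistral-7b-sft-beta"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Preference data needs plain-text "prompt", "chosen", "rejected" columns.
train_dataset = Dataset.from_dict({
    "prompt":   ["What is the capital of France?"],
    "chosen":   ["The capital of France is Paris."],
    "rejected": ["France has no capital city."],
})

peft_config = LoraConfig(
    r=16, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

args = DPOConfig(
    output_dir="zephyr-dpo-lora",
    beta=0.1,                        # strength of the KL penalty
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-6,
)

# With a peft_config, TRL builds the frozen reference model internally.
trainer = DPOTrainer(model=model, args=args, train_dataset=train_dataset,
                     tokenizer=tokenizer, peft_config=peft_config)
trainer.train()
```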