LLaMA-Factory
When running the MMLU eval, results.json is missing one result per subject
Reminder
- [X] I have read the README and searched the existing issues.
Reproduction
When running the MMLU evaluation, results.json is missing one entry per subject. Each subject in results.json has only four answers, as shown below:
"abstract_algebra": {
    "0": "B",
    "1": "C",
    "2": "A",
    "3": "A"
},
"anatomy": {
    "0": "D",
    "1": "C",
    "2": "C",
    "3": "B"
},
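A quick way to confirm the count for every subject is to tally the answers per key. The following is only a minimal sketch: `count_answers` is a hypothetical helper, not part of LLaMA-Factory, and the sample simply wraps the excerpt above in braces so that it parses as JSON.

```python
import json


def count_answers(results: dict) -> dict:
    """Map each subject name to the number of answers recorded for it."""
    return {subject: len(answers) for subject, answers in results.items()}


# The excerpt shown above, wrapped in braces to form valid JSON.
# For a real run, load the file instead, e.g. json.load(open(path_to_results_json)).
sample = json.loads("""
{
  "abstract_algebra": {"0": "B", "1": "C", "2": "A", "3": "A"},
  "anatomy": {"0": "D", "1": "C", "2": "C", "3": "B"}
}
""")

counts = count_answers(sample)
print(counts)
```

Any subject whose count is one lower than the number of rows in its data file is affected by the same problem.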
Each subject in the MMLU data should have five entries. abstract_algebra looks like this:
Find all c in Z_3 such that Z_3[x]/(x^2 + c) is a field. | 0 | 1 | 2 | 3 | B |
Statement 1 | If aH is an element of a factor group, then |aH| divides |a|. Statement 2 | If H and K are subgroups of G then HK is a subgroup of G. | True, True | False, False | True, False | False, True | B |
Statement 1 | Every element of a group generates a cyclic subgroup of the group. Statement 2 | The symmetric group S_10 has 10 elements. | True, True | False, False | True, False | False, True | C |
Statement 1| Every function from a finite set onto itself must be one to one. Statement 2 | Every subgroup of an abelian group is abelian. | True, True | False, False | True, False | False, True | A |
Find the characteristic of the ring 2Z. | 0 | 3 | 12 | 30 | A |
anatomy looks like this:
What is the embryological origin of the hyoid bone? | The first pharyngeal arch | The first and second pharyngeal arches | The second pharyngeal arch | The second and third pharyngeal arches | D |
Which of these branches of the trigeminal nerve contain somatic motor processes? | The supraorbital nerve | The infraorbital nerve | The mental nerve | None of the above | D |
The pleura | have no sensory innervation. | are separated by a 2 mm space. | extend into the neck. | are composed of respiratory epithelium. | C |
In Angle's Class II Div 2 occlusion there is | excess overbite of the upper lateral incisors. | negative overjet of the upper central incisors. | excess overjet of the upper lateral incisors. | excess overjet of the upper central incisors. | C |
Which of the following is the body cavity that contains the pituitary gland? | Abdominal | Cranial | Pleural | Spinal | B |
The command I ran:
python run/eval.py llama3_lora_eval.yaml
run/eval.py is just a thin wrapper around the run_eval() function, as follows:
from llamafactory.eval.evaluator import run_eval
run_eval()
llama3_lora_eval.yaml is as follows:
### model
model_name_or_path: /opt/gfbai/models/Meta-Llama-3-8B-Instruct
adapter_name_or_path: saves/llama3-8b/lora/sft/checkpoint-54
### method
finetuning_type: lora
### dataset
task: mmlu
split: train
template: fewshot
lang: en
n_shot: 5
### output
save_dir: saves/llama3-8b/lora/eval
### eval
batch_size: 2
download_mode: force_redownload
Expected behavior
results.json should contain the complete set of answers for every subject.
System Info
- transformers version: 4.40.2
- Platform: Linux-5.15.0-105-generic-x86_64-with-glibc2.35
- Python version: 3.10.14
- Huggingface_hub version: 0.23.0
- Safetensors version: 0.4.3
- Accelerate version: 0.30.1
- Accelerate config: not found
- PyTorch version (GPU?): 2.3.0+cu121 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Others
The MMLU dataset script appears to be the cause. In the file evaluation/mmlu/mmlu.py:
def _generate_examples(self, filepath):
    df = pd.read_csv(filepath)
    df.columns = ["question", "A", "B", "C", "D", "answer"]
    for i, instance in enumerate(df.to_dict(orient="records")):
        yield i, instance
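Changing it to the following fixes the problem (header=None stops pandas from treating the first data row as a header):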
def _generate_examples(self, filepath):
    df = pd.read_csv(filepath, header=None)
    df.columns = ["question", "A", "B", "C", "D", "answer"]
    for i, instance in enumerate(df.to_dict(orient="records")):
        yield i, instance
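The difference between the two versions can be reproduced without the dataset. The sketch below is illustrative only: it assumes the data files are plain comma-separated text with no header line, and it uses two of the abstract_algebra questions quoted earlier as a hypothetical in-memory CSV.

```python
import io

import pandas as pd

# Hypothetical headerless CSV built from two of the abstract_algebra rows above.
csv_text = (
    "Find all c in Z_3 such that Z_3[x]/(x^2 + c) is a field.,0,1,2,3,B\n"
    "Find the characteristic of the ring 2Z.,0,3,12,30,A\n"
)
columns = ["question", "A", "B", "C", "D", "answer"]

# Default behaviour: the first line is consumed as the header, so one row is lost.
with_default = pd.read_csv(io.StringIO(csv_text))
with_default.columns = columns

# header=None: every line is treated as data.
without_header = pd.read_csv(io.StringIO(csv_text), header=None)
without_header.columns = columns

print(len(with_default), len(without_header))
```

Passing `names=columns` to `pd.read_csv` should have the same effect, since pandas treats explicitly supplied names like `header=None`.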