
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.

instruct-eval issues (24)

Would you support the Chinese evaluation dataset C-Eval? It would be important for Chinese LLM evaluation.
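A minimal sketch of what loading C-Eval could look like via the HuggingFace datasets library; the dataset id `ceval/ceval-exam`, the subject name, and the field names used here are assumptions, not part of this repo.

```python
# Sketch: load one C-Eval subject with the datasets library.
# Dataset id, subject, and fields below are assumptions, not part of instruct-eval.
from datasets import load_dataset

subject = "computer_network"  # hypothetical example subject
data = load_dataset("ceval/ceval-exam", subject)

# Each example is a multiple-choice question with options A-D and an answer key.
sample = data["val"][0]
prompt = (
    f"{sample['question']}\n"
    f"A. {sample['A']}\nB. {sample['B']}\nC. {sample['C']}\nD. {sample['D']}\n"
    "Answer:"
)
print(prompt, sample["answer"])
```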

[baichuan-inc/baichuan-7B](https://huggingface.co/baichuan-inc/baichuan-7B)
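If baichuan-7B support were added, loading it would presumably go through the standard Transformers path with `trust_remote_code`, roughly as sketched below; this is a sketch, not something the repo currently does.

```python
# Sketch: loading baichuan-7B with HuggingFace Transformers.
# trust_remote_code is needed because the model ships custom modeling code.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "baichuan-inc/baichuan-7B"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(name, trust_remote_code=True)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```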

Good job! Could you please add multi-GPU support? Then we could test larger models, such as LLaMA 65B.
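One way multi-GPU loading could look, assuming Accelerate is installed, is to let Transformers shard the weights across available GPUs with `device_map="auto"`; the checkpoint name is only an example and this is a sketch, not the repo's implementation.

```python
# Sketch: sharding a large model (e.g. LLaMA 65B) across multiple GPUs.
# Requires `accelerate`; device_map="auto" splits layers over visible devices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "huggyllama/llama-65b"  # hypothetical checkpoint name
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    device_map="auto",          # spread layers across all visible GPUs
    torch_dtype=torch.float16,  # halve memory use
)

inputs = tokenizer("Q: What is 2 + 2?\nA:", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=5)[0]))
```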

The current version of the code base only returns the final evaluation metric to the user. However, it is not possible to see exactly what the model's predictions are....

Thanks for this neat repo, it is very convenient for evaluating LLMs! As a feature request, I would like to suggest adding an option to save the results of an evaluation for the...
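A rough sketch of what a prediction/result dump addressing the two requests above could look like; the function, field names, and file layout here are assumptions, not an existing option in the repo.

```python
# Sketch: saving per-example predictions alongside the final metric.
import json

def save_eval_results(path, task, predictions, labels, metric):
    """Write each example's prediction plus the aggregate metric to a JSON file."""
    records = [
        {"index": i, "prediction": p, "label": l}
        for i, (p, l) in enumerate(zip(predictions, labels))
    ]
    with open(path, "w") as f:
        json.dump({"task": task, "metric": metric, "examples": records}, f, indent=2)

# Hypothetical usage after an evaluation run:
save_eval_results("mmlu_results.json", "mmlu", ["A", "C"], ["A", "B"], 0.5)
```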

Regarding the README claim > Compared to existing libraries such as [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) and [HELM](https://github.com/stanford-crfm/helm), this repo enables simple and convenient evaluation for multiple models. Notably, we support most models from HuggingFace Transformers — isn't...

Hi, thank you very much for this clear code. I wonder whether you plan to integrate this code into the Transformers Trainer; that way, we could run this code during...
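One possible shape for such an integration is a `TrainerCallback` that runs the evaluation at each evaluation step; `run_instruct_eval` below is a hypothetical wrapper around this repo's evaluation code, so this is only a sketch.

```python
# Sketch: running an external evaluation inside the HuggingFace Trainer loop.
# `run_instruct_eval` is a hypothetical wrapper around this repo's evaluation code.
from transformers import TrainerCallback

class InstructEvalCallback(TrainerCallback):
    def __init__(self, run_instruct_eval, task="mmlu"):
        self.run_instruct_eval = run_instruct_eval
        self.task = task

    def on_evaluate(self, args, state, control, model=None, **kwargs):
        # Score the current model on the held-out task and log the result.
        score = self.run_instruct_eval(model, task=self.task)
        print(f"step {state.global_step}: {self.task} = {score:.4f}")

# Hypothetical usage: trainer.add_callback(InstructEvalCallback(run_instruct_eval))
```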

In newer versions of the transformers library, AutoModelForCausalLM can properly identify LLaMA models, so the LlamaModel class is no longer needed. LLaMA models run with --model_name causal. The only...
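For context, the newer behavior described above means something like the following works without a dedicated LLaMA wrapper; the checkpoint name is just an example, not one the repo prescribes.

```python
# Sketch: AutoModelForCausalLM resolving a LLaMA checkpoint directly.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "huggyllama/llama-7b"  # example checkpoint; any LLaMA-format repo works
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)  # resolves to LlamaForCausalLM

inputs = tokenizer("Instruction: say hello.\nResponse:", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=8)[0]))
```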