Yiyang comments

Results: 3 comments of Yiyang

Reproduce result of Boolq on LLaMA-7B

Hello, thanks for your nice work. We can reproduce the scores on most datasets except winograd_wsc and winogrande. The score of winograd_wsc is 0.8857142925262451 and the score of winogrande is...

Reproduce result of Boolq on LLaMA-7B

and metrics/jeopardy/0-shot/InContextLearningLMAccuracy: 0.36367926001548767, slightly different from the 0.334 shown in the picture for llama-7b. The command is "WORLD_SIZE=8 composer eval/eval.py eval/yamls/hf_eval.yaml" and the config is ``` max_seq_len: 2048 seed: 1 model_name_or_path: huggyllama/llama-7b # Tokenizer...

About the evaluation speed of Humaneval-x

I ran into this too: the scores for the Go language are all "timed out". Setting the timeout to 30 does not help.