LLM-eval-survey
The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".
Checked https://lmexam.com but found: ...
Hello, thank you for your excellent work on the survey paper! I am one of the authors of a paper you have listed, but we had a major title change....
Is all evaluation of LLMs done by changing the prompts? Are there other methods?
Hi there, thanks for the effort in putting together this survey on LLM evaluation. I'd like to suggest adding our work, SpyGame, a framework for evaluating language model intelligence. We...
Hi, I have read your insightful paper and found it to be a valuable contribution to the field. I would like to kindly suggest adding our recent work to your...
Included a reference concerning the reliability of LLMs as generative search engines; I hope it is relevant :)
Hi, this is a really comprehensive work. Could you add our recent work to your survey? [CMB: A Comprehensive Medical Benchmark in Chinese](https://arxiv.org/abs/2308.08833) **Thanks**
Paper: Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
Link: https://arxiv.org/pdf/2306.14565.pdf
Name: LRV-Instruction
Focus: Multimodal
Notes: A benchmark to evaluate hallucination and instruction-following ability
bib: @article{liu2023aligning,...