LLM-eval-survey
The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".
Checked https://lmexam.com but found: ...
Hello, thank you for your excellent work on the survey paper! I am one of the authors of a paper you have listed, but we had a major title change....
Is all evaluation of LLMs done by changing the prompts? Are there other methods?
Hi there, thanks for the effort in putting together this survey on LLM evaluation. I'd like to suggest adding our work, SpyGame, a framework for evaluating language model intelligence. We...
Hi, I have read your insightful paper and found it to be a valuable contribution to the field. I would like to kindly suggest adding our recent work to your...
Included a reference concerning the reliability of LLMs as generative search engines; I hope it is relevant :)
Hi, this is a really comprehensive work. Could you add our recent work to your survey? [CMB: A Comprehensive Medical Benchmark in Chinese](https://arxiv.org/abs/2308.08833) **Thanks**
Paper: Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
Link: https://arxiv.org/pdf/2306.14565.pdf
Name: LRV-Instruction
Focus: Multimodal
Notes: A benchmark to evaluate hallucination and instruction-following ability
bib: @article{liu2023aligning,...