Songyang Zhang
Thanks for the insightful suggestions. We will add this to our backlog. Contributions are also welcome.
> The multimodal evaluation described at https://opencompass.readthedocs.io/zh-cn/latest/advanced_guides/multimodal_eval.html uses `python run.py configs/multimodal/tasks.py --mm-eval` from opencompass. Is this still supported? It currently fails with an error, and the leaderboard mentions that VLMEvalKit is used.

Please try VLMEvalKit; evaluation for VLMs has been deprecated in the opencompass repo.
Feel free to re-open if needed.
How about using the official Yi?
How about evaluating llama2-70b-base without turbomind?
@zhangyikaii Hi, do you plan to update this PR soon?
Fixed in the latest version; you are welcome to try it. Feel free to re-open if needed.
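> ### Describe the feature
> Embedding models play a crucial role in knowledge retrieval, so a dedicated evaluation for embedding models would be very valuable.
>
> ### Would you like to implement this feature yourself?
> * [ ] I would like to implement this feature myself and contribute the code to OpenCompass!

More detailed suggestions about evaluating embedding models are welcome.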
> May I ask, are the results you reproduced consistent with those on the leaderboard?

We may be able to provide more help if you can share details of the evaluation experiments, such as the specific model and dataset.
Also, please fix the lint issue.