mlmm-evaluation issues

Doesn't work with any HF model

1

Hello, I've been trying with different LLMs but I haven't been able to make it work. Could you shed some light? ```shell luispoveda93@LUIS-PC:~/mlmm-evaluation$ bash scripts/run.sh es microsoft/Phi-3-mini-4k-instruct Selected Tasks: ['arc_es',...

PovedaAqui

Not Support English MMLU?

I set the language to English, but it fails to work. Is it possible to run with English MMLU?

moore3930

ollama installed models

1

Hello, I've been trying to run the framework using a model I installed with Ollama, but I haven't been able to. Maybe it's related to the model path,...

PovedaAqui

feature: evaluate Jais13B on ar_arc dataset

This PR adds the evaluation results for the [Jais13B](https://huggingface.co/core42/jais-13b) model on the ArabicArc dataset.

mouhandalkadri

Got stuck when evaluating MMLU

4

Thanks for open-sourcing this! I'm trying to evaluate `Llama-7b-hf` on `mmlu-fr`, a warning of `Token indices sequence length is longer than the specified maximum sequence length for this model...

zhangliang-04

ARC-Easy Dataset

Dear authors, thanks for your nice work. I am wondering if you also translated the ARC-Easy dataset, as currently the bash download script only yields the ARC-Challenge dataset. I really...

yaof20

Need to submit results

Hi, can I submit results one language at a time, or do I have to submit them all together? Thanks

Mugariya

Few Shot configuration

Hello! Is there a way to control how many examples are used to evaluate the models? Also, how are the evaluations currently set up? Are all benchmarks (ARC, MMLU, HellaSwag)...

Nkluge-correa

Add Azerbaijani(Arabic Script) to languages

Hello, if you could add Azerbaijani (Arabic script) as a language in your https://cohereforai-review-mmlu-translations.hf.space/dataset/97ce1e12-3204-4461-b865-b2fe1e879b95/annotation-mode?page=3&status=pending we can give you a large dataset. We have a big team for this language. Our latest paper....

Jalilnkh

Please add support for Adapter Models

1

The scripts look for `config.json` in the HF repo. But for models which are fine-tuned / adapter models, that file is `adapter_config.json`, wherein I might also need to give the...

1rsh
