
cannot reproduce siqa numbers

Open csarron opened this issue 4 months ago • 0 comments

Hello @OmkarThawakar, I used the LLM360 Analysis360 repo to run the evaluation for the SIQA (social_iqa) task:

python Analysis360/eval/harness/main.py --device cuda:0 --model=hf-causal-experimental --batch_size=auto:1 --model_args="pretrained=MBZUAI/MobiLlama-05B,trust_remote_code=True,dtype=bfloat16" --tasks=social_iqa --num_fewshot=0 --output_path=Analysis360-MobiLlama-05B.json

It only gives an accuracy of 0.3327, which is essentially random chance (1/3 ≈ 0.333), since the task has only three answer choices.

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| social_iqa | 0 | none | 0 | acc | 0.3327 | ± 0.0107 |
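
For reference, a quick sanity check on the "close to random" claim: a minimal sketch that compares the reported accuracy against the chance level for three choices, using the reported standard error. The helper name `chance_z_score` is hypothetical and not part of the evaluation harness; the numbers are copied from the results above.

```python
# Rough check: is the reported accuracy distinguishable from random guessing?
# chance_z_score is a hypothetical helper, not part of Analysis360 or the harness.

NUM_CHOICES = 3    # social_iqa offers three answer options per question
ACC = 0.3327       # accuracy reported by the run above
STDERR = 0.0107    # standard error reported by the run above


def chance_z_score(acc: float, stderr: float, num_choices: int) -> float:
    """Return how many standard errors `acc` lies from uniform random guessing."""
    chance = 1.0 / num_choices
    return (acc - chance) / stderr


z = chance_z_score(ACC, STDERR, NUM_CHOICES)
print(f"chance level = {1.0 / NUM_CHOICES:.4f}, z = {z:.2f}")
```

The accuracy sits well within one standard error of 1/3, so this run is statistically indistinguishable from guessing.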

Could you share how you ran the SIQA evaluation (command, harness version, and few-shot setting)? Thanks!

csarron, Mar 26 '24 03:03