Mostelk
This paper https://arxiv.org/pdf/2208.03299 also has an interesting code base that may be easier to integrate than lm-eval or tiny lm eval; just focus on the zero-shot cases for our case: https://github.com/facebookresearch/atlas?tab=readme-ov-file#tasks...
> > This paper https://arxiv.org/pdf/2208.03299 also has interesting code base that may be easier for integration than lm-eval or tiny lm eval, just focus on the zero-shot cases for our...
> Let us try to quantize these and report accuracy for Llama 3.1 8B Instruct and Llama 3.2 3B Instruct. We will use MMLU (5-shot) to report accuracies after quantizing these...
Let us check the mmlu-llama benchmark with 0 and 5 shots; we also need to decide on input and output sequence lengths.
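One way to pick a sensible input sequence length is to assemble a few 5-shot MMLU prompts and measure how long they come out. A minimal sketch below; the prompt template and the field names (`question`, `choices`, `answer`) are assumptions, loosely following the common lm-eval-style MMLU format, not the exact mmlu-llama template:

```python
def build_mmlu_prompt(dev_examples, question, choices, k=5):
    """Assemble a k-shot MMLU-style multiple-choice prompt.

    dev_examples: list of dicts with "question", "choices" (4 strings),
    and "answer" (index 0-3). Template is an assumption, not the exact
    mmlu-llama one.
    """
    letters = "ABCD"
    parts = []
    for ex in dev_examples[:k]:
        parts.append(ex["question"])
        parts.extend(f"{l}. {c}" for l, c in zip(letters, ex["choices"]))
        parts.append(f"Answer: {letters[ex['answer']]}")
        parts.append("")  # blank line between shots
    parts.append(question)
    parts.extend(f"{l}. {c}" for l, c in zip(letters, choices))
    parts.append("Answer:")
    return "\n".join(parts)

# Hypothetical example just to exercise the template:
demo_dev = [{"question": "2+2=?", "choices": ["3", "4", "5", "6"], "answer": 1}]
prompt = build_mmlu_prompt(demo_dev, "3+3=?", ["5", "6", "7", "8"])
# Crude proxy for input length until a real tokenizer is wired in:
approx_tokens = len(prompt.split())
```

Running this over a sample of the dev set with the actual tokenizer would tell us the input length we need to budget; the output side for MMLU is tiny (one answer letter).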
How about we use perplexity to measure the accuracy, similar to this ExecuTorch example for Llama 3.1 8B using LM_EVAL, and with similar settings to that example for max input...
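For reference, the perplexity that lm-eval-style harnesses report is just the exponentiated mean negative log-likelihood per token. A minimal self-contained sketch of the metric itself (not of the harness or the ExecuTorch runner):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood.

    token_logprobs: natural-log probabilities the model assigned to
    each target token in the evaluation text.
    """
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Two tokens each assigned probability 0.5 -> perplexity is exactly 2.0
print(perplexity([math.log(0.5), math.log(0.5)]))  # 2.0
```

Lower is better, so comparing the quantized model's perplexity against the fp16 baseline on the same text and sequence length gives a quick regression signal without running full MMLU.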
> How about we use perplexity to measure the accuracy, similar to this ExecuTorch example for Llama 3.1 8B: using LM_EVAL, and using similar settings in this example of max...
@swasson488 We would like your help on this, given we are putting the app on the Play Store.
Also, Stable Diffusion is currently not accelerated on Pixel phones; it needs to be quantized and delegated to the Edge TPU.
@farook-edev @anhappdev would like to test, but we don't have a download link for the reference models.

```
benchmark_setting {
  benchmark_id: "llm"
  framework: "TFLite"
  delegate_choice: {
    delegate_name: "CPU"
    accelerator_name: "cpu"
    accelerator_desc: "CPU"
    ...
```