Mostelk

Results: 39 comments of Mostelk

> The model downloaded from https://github.com/fatihcakirs/mobile_models/blob/main/v0_7/tflite/mobilebert_int8_384_20200602.tflite
>
> Some fully-connected weights have a non-zero zero-point (e.g. weight `bert/encoder/layer_0/attention/self/MatMul19` has zero-point = 6), which violates the [TFLite quantization spec](https://www.tensorflow.org/lite/performance/quantization_spec).
> ...
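The spec requires per-axis quantized weight tensors to have zero_point = 0, so a violation like the one above can be found by scanning the tensor details. A minimal sketch of such a checker; the dict shape matches what `tf.lite.Interpreter.get_tensor_details()` returns, but the sample data below is invented for illustration, not read from the actual model:

```python
def find_spec_violations(tensor_details):
    """Flag tensors whose quantization zero_point is non-zero.

    `tensor_details` is a list of dicts shaped like the output of
    tf.lite.Interpreter.get_tensor_details(): each entry has a 'name'
    and a 'quantization_parameters' dict with a 'zero_points' array.
    In practice this check applies only to weight tensors; activations
    may legitimately have non-zero zero-points.
    """
    violations = []
    for det in tensor_details:
        qp = det.get("quantization_parameters", {})
        for zp in qp.get("zero_points", []):
            if zp != 0:
                violations.append((det["name"], int(zp)))
                break
    return violations

# Illustrative data only -- mirrors the violation reported above.
details = [
    {"name": "bert/encoder/layer_0/attention/self/MatMul19",
     "quantization_parameters": {"zero_points": [6]}},
    {"name": "some/other/weight",
     "quantization_parameters": {"zero_points": [0]}},
]
print(find_spec_violations(details))
# -> [('bert/encoder/layer_0/attention/self/MatMul19', 6)]
```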

@anhappdev It is nice to have, not a must; I know `split` can do it. There is a reference here as well: https://stackoverflow.com/questions/37761543/how-i-can-split-file-in-android-case-the-file-is-large-to-upload-itsound-file
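The chunking itself is simple regardless of platform; a language-agnostic sketch in Python of what `split` does (function name and `.partN` suffix are my own choices, not anything from the app):

```python
import os

def split_file(path, chunk_size, out_dir):
    """Write path's contents as <name>.part0, <name>.part1, ... in out_dir."""
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            part = os.path.join(out_dir, f"{os.path.basename(path)}.part{index}")
            with open(part, "wb") as dst:
                dst.write(chunk)
            parts.append(part)
            index += 1
    return parts
```

Reassembly on the server side is just concatenating the parts in index order.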

What was the conclusion: is ProcessOutput counted in the loadgen latency? E.g. the topK calculation for ImageNet?

The doc still reads like this, so is ProcessOutput only called in accuracy mode? // ProcessOutput processes the output data before sending to mlperf. This // function only get...

Right now it looks like the ImageNet topK calculation is done in ProcessOutput: https://github.com/mlcommons/mobile_app_open/blob/8d92e1bd8a379f188c0db33fc22052be18a15eff/flutter/cpp/datasets/imagenet.cc#L116
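For reference, the topK check in question amounts to testing whether the ground-truth class is among the K highest-scoring outputs. A minimal sketch (not the actual imagenet.cc code):

```python
def in_top_k(scores, true_label, k=5):
    """Return True if true_label is among the k highest-scoring class indices."""
    topk = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return true_label in topk

scores = [0.1, 0.05, 0.6, 0.2, 0.05]
print(in_top_k(scores, 3, k=2))  # class 3 has the 2nd-highest score -> True
```

The work is linear in the number of classes, so whether it lands inside or outside the timed path matters for the latency question above.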

> @Mostelk Please provide a description for the Stable Diffusion benchmark.

Please check this description; we reviewed it in the Wednesday meeting: The Text-to-Image Gen AI benchmark adopts...

@farook-edev Please check https://pytorch.org/executorch/stable/llm/llama-demo-android.html and list what is needed to add an ExecuTorch backend to our app; once we know what is needed, we can contact the right person to get help.

Due to the small input/output sizes of IFEval and Tiny-MMLU zero-shot, I suggest we use Tiny-MMLU few-shot instead, for both performance and accuracy. Here is the...
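A few-shot Tiny-MMLU prompt is just the zero-shot question prefixed with a handful of worked examples, which is why the input sizes grow. A hedged sketch of the construction; the template and example data are invented for illustration, not the app's actual prompts:

```python
def build_few_shot_prompt(examples, question, choices):
    """Prepend answered example questions to the target question, MMLU-style."""
    lines = []
    for ex in examples:
        lines.append(ex["question"])
        for letter, choice in zip("ABCD", ex["choices"]):
            lines.append(f"{letter}. {choice}")
        lines.append(f"Answer: {ex['answer']}")
        lines.append("")  # blank line between shots
    lines.append(question)
    for letter, choice in zip("ABCD", choices):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")  # model completes with a single letter
    return "\n".join(lines)
```

Each added shot lengthens every prompt, so the few-shot variant trades input size for a better-conditioned answer distribution.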

@farook-edev Please provide a JSON or TXT file of the 100 few-shot Tiny-MMLU prompts implemented in the app.
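If it helps, dumping the prompt list to JSON is only a couple of lines. A sketch assuming the prompts are held as a list of strings (the function name and file layout are suggestions, not an agreed format):

```python
import json

def dump_prompts(prompts, path):
    """Write the prompt strings to a JSON array, one entry per prompt."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(prompts, f, ensure_ascii=False, indent=2)
    return path
```

A TXT variant would just join the prompts with a separator line instead.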

When using a template to enforce the first letter as the answer:

| Index | Prompt Length | Generation Length |
|------:|--------------:|------------------:|
| 0 | 641 | 1024 |
| 1 | ...