Mostelk

Results: 39 comments of Mostelk

> The model downloaded from https://github.com/fatihcakirs/mobile_models/blob/main/v0_7/tflite/mobilebert_int8_384_20200602.tflite
>
> Some fully-connected weights have a non-zero zero-point (e.g. weight `bert/encoder/layer_0/attention/self/MatMul19` has zero-point = 6), which violates the [TFLite quantization spec](https://www.tensorflow.org/lite/performance/quantization_spec).
> ...
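The spec requires per-axis quantized weight tensors to have zero_point = 0, so a violation like the one above can be found by scanning the tensor details. A minimal sketch of such a checker; the dict shape matches what `tf.lite.Interpreter.get_tensor_details()` returns, but the sample data below is invented for illustration, not read from the actual model:

```python
def find_spec_violations(tensor_details):
    """Flag tensors whose quantization zero_point is non-zero.

    `tensor_details` is a list of dicts shaped like the output of
    tf.lite.Interpreter.get_tensor_details(): each entry has a 'name'
    and a 'quantization_parameters' dict with a 'zero_points' array.
    In practice this check applies only to weight tensors; activations
    may legitimately have non-zero zero-points.
    """
    violations = []
    for det in tensor_details:
        qp = det.get("quantization_parameters", {})
        for zp in qp.get("zero_points", []):
            if zp != 0:
                violations.append((det["name"], int(zp)))
                break
    return violations

# Illustrative data only -- mirrors the violation reported above.
details = [
    {"name": "bert/encoder/layer_0/attention/self/MatMul19",
     "quantization_parameters": {"zero_points": [6]}},
    {"name": "some/other/weight",
     "quantization_parameters": {"zero_points": [0]}},
]
print(find_spec_violations(details))
# -> [('bert/encoder/layer_0/attention/self/MatMul19', 6)]
```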

@anhappdev It is nice to have, not a must; I know `split` can do it. There is a reference here as well: https://stackoverflow.com/questions/37761543/how-i-can-split-file-in-android-case-the-file-is-large-to-upload-itsound-file
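The chunking itself is simple regardless of platform; a language-agnostic sketch in Python of what `split` does (function name and `.partN` suffix are my own choices, not anything from the app):

```python
import os

def split_file(path, chunk_size, out_dir):
    """Write path's contents as <name>.part0, <name>.part1, ... in out_dir."""
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            part = os.path.join(out_dir, f"{os.path.basename(path)}.part{index}")
            with open(part, "wb") as dst:
                dst.write(chunk)
            parts.append(part)
            index += 1
    return parts
```

Reassembly on the server side is just concatenating the parts in index order.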

What was the conclusion: is ProcessOutput counted in the loadgen latency? E.g. the topK calculation for ImageNet?

The doc still reads like this, so is ProcessOutput only called in accuracy mode? // ProcessOutput processes the output data before sending to mlperf. This // function only get...

Right now it looks like the ImageNet topK calculation is done in ProcessOutput: https://github.com/mlcommons/mobile_app_open/blob/8d92e1bd8a379f188c0db33fc22052be18a15eff/flutter/cpp/datasets/imagenet.cc#L116
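For reference, the topK check in question amounts to testing whether the ground-truth class is among the K highest-scoring outputs. A minimal sketch (not the actual imagenet.cc code):

```python
def in_top_k(scores, true_label, k=5):
    """Return True if true_label is among the k highest-scoring class indices."""
    topk = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return true_label in topk

scores = [0.1, 0.05, 0.6, 0.2, 0.05]
print(in_top_k(scores, 3, k=2))  # class 3 has the 2nd-highest score -> True
```

The work is linear in the number of classes, so whether it lands inside or outside the timed path matters for the latency question above.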

> @Mostelk Please provide a description for the Stable Diffusion benchmark.

Please check this description; we reviewed it in the Wednesday meeting: The Text-to-Image Gen AI benchmark adopts...

@farook-edev Please check https://pytorch.org/executorch/stable/llm/llama-demo-android.html and list what is needed to add an ExecuTorch backend to our app; once we know what is needed, we can contact the right person to get help.

Due to the small input/output sizes of IFEval and Tiny-MMLU zero-shot, I suggest we use Tiny-MMLU few-shot instead, for both performance and accuracy. Here is the...
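A few-shot Tiny-MMLU prompt is just the zero-shot question prefixed with a handful of worked examples, which is why the input sizes grow. A hedged sketch of the construction; the template and example data are invented for illustration, not the app's actual prompts:

```python
def build_few_shot_prompt(examples, question, choices):
    """Prepend answered example questions to the target question, MMLU-style."""
    lines = []
    for ex in examples:
        lines.append(ex["question"])
        for letter, choice in zip("ABCD", ex["choices"]):
            lines.append(f"{letter}. {choice}")
        lines.append(f"Answer: {ex['answer']}")
        lines.append("")  # blank line between shots
    lines.append(question)
    for letter, choice in zip("ABCD", choices):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")  # model completes with a single letter
    return "\n".join(lines)
```

Each added shot lengthens every prompt, so the few-shot variant trades input size for a better-conditioned answer distribution.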

@farook-edev Please provide a JSON or TXT file of the 100 few-shot Tiny-MMLU prompts implemented in the app.
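If it helps, dumping the prompt list to JSON is only a couple of lines. A sketch assuming the prompts are held as a list of strings (the function name and file layout are suggestions, not an agreed format):

```python
import json

def dump_prompts(prompts, path):
    """Write the prompt strings to a JSON array, one entry per prompt."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(prompts, f, ensure_ascii=False, indent=2)
    return path
```

A TXT variant would just join the prompts with a separator line instead.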

When using a template to enforce the first letter as the answer:

| Index | Prompt Length | Generation Length |
|------:|--------------:|------------------:|
| 0 | 641 | 1024 |
| 1 | ...