Anton Lokhmotov
I can think of a situation in which an implementer refactors or integrates a reference script into their own script. For example, the reference script may hardcode using `/usr/bin/python3`, while they may want...
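To make this concrete, a minimal sketch of the kind of change an implementer might make: instead of a hardcoded interpreter path, reuse whichever interpreter is actually running (the script name below is illustrative, not from the reference code):

```python
import subprocess
import sys

# Instead of hardcoding the interpreter, e.g.
#   subprocess.run(["/usr/bin/python3", "accuracy.py", ...])
# reuse whichever interpreter is running this script, so virtualenv/conda
# environments are respected. "accuracy.py" is an illustrative name.
subprocess.run(
    [sys.executable, "accuracy.py", "--mlperf-accuracy-file", "mlperf_log_accuracy.json"],
    check=True,
)
```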
@arjunsuresh But you admit that in some cases it may not be straightforward:

> Yes, running the reference accuracy script standalone is fine, I believe. But this is not...
Yes, it is mandatory for the Closed division. However, for the reasons that you outlined, GPTJ might be dropped from MLPerf Inference too. (Normally, a benchmark needs to survive 4...
Hi @surbanqq! Reference code often supports only a single accelerator, but vendors optimize their submissions, including by scaling to multiple accelerators. In the case of NVIDIA, please take a look...
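To illustrate the scaling point, here is a minimal sketch of fanning queries out across several accelerators with PyTorch. This is not NVIDIA's actual harness; the replication and round-robin scheduling are placeholders for the much more sophisticated scheduling real submissions use:

```python
import copy
import torch

def replicate_model(model: torch.nn.Module):
    """Place one copy of the model on each visible GPU (CPU fallback for testing)."""
    n = torch.cuda.device_count()
    devices = [torch.device(f"cuda:{i}") for i in range(n)] or [torch.device("cpu")]
    return [(d, copy.deepcopy(model).to(d)) for d in devices]

@torch.no_grad()
def run(replicas, batches):
    """Round-robin batches across replicas; a real harness would pipeline and overlap."""
    results = []
    for i, batch in enumerate(batches):
        device, model = replicas[i % len(replicas)]
        results.append(model(batch.to(device)).cpu())
    return results
```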
# Benchmarking LoRA against baseline (no LoRA) throughput

We use NVIDIA's [GenAI-Perf](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/perf_analyzer/genai-perf/README.html) tool to force fixed-length inputs and outputs to produce "heatmap" plots as below. On TPU-v6e and H100 instances,...
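As a rough sketch of how such a grid can be collected, the loop below sweeps fixed input/output lengths with GenAI-Perf. The model name, concurrency, and export file name are assumptions for illustration, and flag names may differ by version; check `genai-perf profile --help` for your installation:

```python
import itertools
import subprocess

MODEL = "llama-3-8b"            # illustrative model name
INPUT_LENS = [128, 512, 2048]   # prompt lengths (tokens)
OUTPUT_LENS = [128, 512, 2048]  # generation lengths (tokens)

for isl, osl in itertools.product(INPUT_LENS, OUTPUT_LENS):
    subprocess.run([
        "genai-perf", "profile", "-m", MODEL,
        # Zero stddev forces (approximately) fixed-length inputs and outputs.
        "--synthetic-input-tokens-mean", str(isl),
        "--synthetic-input-tokens-stddev", "0",
        "--output-tokens-mean", str(osl),
        "--output-tokens-stddev", "0",
        "--concurrency", "8",
        "--profile-export-file", f"profile_{isl}x{osl}.json",
    ], check=True)
```

Each run's throughput then fills one cell of the (input length, output length) heatmap.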
MlPref -> MLPerf?

> Actually we are using the reference scripts in the CM workflow itself and they work...
> Reference implementations are not practically usable. While it is not practical to support all the hardware, ideally we should have an object-oriented device where a new submitter should...
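A minimal sketch of the kind of object-oriented device abstraction being suggested, where a new submitter only implements a small interface. The names here are illustrative, not an actual MLPerf or KILT API:

```python
from abc import ABC, abstractmethod

class Device(ABC):
    """Illustrative device abstraction: a new submitter implements only this."""

    @abstractmethod
    def load(self, model_path: str) -> None:
        """Load and prepare the model for this hardware."""

    @abstractmethod
    def infer(self, batch):
        """Run one batch of queries and return predictions."""

class EchoDevice(Device):
    """Trivial stand-in backend so the harness can be exercised anywhere."""
    def load(self, model_path: str) -> None:
        self.model_path = model_path  # nothing to load for this stub

    def infer(self, batch):
        return [0 for _ in batch]     # dummy predictions

def run_benchmark(device: Device, model_path: str, batches):
    """Hardware-agnostic harness: the same loop works for any Device subclass."""
    device.load(model_path)
    return [device.infer(b) for b in batches]
```

A new submitter would then subclass `Device` for their hardware and reuse the same harness unchanged.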
> I think KILT will be useful, particularly if it supports more hardware backends other than Qualcomm.

We are planning to release more backends after the v3.1 round. Some...