
AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.

Results: 13 ByteMLPerf issues

Add bert, albert, roberta, deberta, videobert, swin-transformer, widedeep, resnet50, yolov5, conformer.

1. Add the iluvatar operator backend. 2. Add the iluvatar LLM inference backend. 3. Add bert/resnet50/widedeep/yolov5 ixrt inference backends.

Modify the torchrun parameter _--nproc_per_node_ (number of worker processes per node) to the correct format. Thanks~
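For context, a typical single-node torchrun invocation using that parameter might look like the following (the entry script name and worker count here are placeholders, not the benchmark's actual values):

```shell
# Hypothetical single-node launch: one worker process per accelerator.
torchrun --nnodes=1 --nproc_per_node=8 your_benchmark_entry.py
```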

Issue 1: the script general_perf/prepare_model_and_dataset.sh does not download to the correct location:

```
wget -O general_perf/download/traced_gpt2.tar https://lf-bytemlperf.17mh.cn/obj/bytemlperf-zoo/traced_gpt2.tar
tar xf general_perf/download/gpt2.tar -C general_perf/model_zoo/sota/
```

The name of the downloaded tar archive does not match the name passed to tar, so the commands need to be changed to:

```
wget -O general_perf/download/traced_gpt2.tar -c https://lf-bytemlperf.17mh.cn/obj/bytemlperf-zoo/traced_gpt2.tar
mkdir general_perf/model_zoo/sota/traced_gpt2
tar xf general_perf/download/traced_gpt2.tar -C general_perf/model_zoo/sota/traced_gpt2/...
```

Could llm perf possibly support testing llama3, the 8B version? That version is quite widely used now.

Add batch_gemm and group_gemm; add the int8 dtype to the gemm ops; fix the case where world_size exceeds the number of available devices.
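The world_size fix above could be sketched as follows. This is a hypothetical illustration with made-up helper names, not the backend's actual code; a real backend would query its device runtime for the device count rather than take it as an argument.

```python
def clamp_world_size(requested_world_size: int, available_devices: int) -> int:
    """Clamp the requested world size to the number of devices actually present.

    Illustrative sketch of the fix: if a benchmark config asks for more ranks
    than there are accelerators, fall back to the available count instead of
    failing later at process-group initialization.
    """
    if requested_world_size <= 0:
        raise ValueError("world_size must be positive")
    return min(requested_world_size, available_devices)


# Example: a config requests 8 ranks but only 4 devices exist.
print(clamp_world_size(8, 4))  # -> 4
print(clamp_world_size(2, 4))  # -> 2
```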

update `async def get_result(self) -> GenerateResult` to `async def get_result(self) -> GenerateResult`

https://github.com/bytedance/ByteMLPerf/blob/main/byte_infer_perf/llm_perf/workloads/chatglm2-torch-fp16-6b.json

We run on an A100-40G to get output logits with the configuration below:

```json
{
  "model": "chatglm2-torch-fp16-6b",
  "test_accuracy": true,
  "test_perf": true,
  "min_new_tokens": 128,
  "max_new_tokens": 256,
  "tp_sizes": [1, 2],
  "batch_sizes": [1, 2,...
```
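A workload config like the one above can be read with plain `json` and its sweep enumerated as the cross product of `tp_sizes` and `batch_sizes`. This is a minimal sketch, not llm_perf's actual loader, and the `batch_sizes` value used here is an assumption since the original snippet is truncated:

```python
import itertools
import json

# Sketch of a workload config in the shape shown above (batch_sizes completed
# with an assumed value; the original snippet is cut off).
workload_json = """
{
  "model": "chatglm2-torch-fp16-6b",
  "test_accuracy": true,
  "test_perf": true,
  "min_new_tokens": 128,
  "max_new_tokens": 256,
  "tp_sizes": [1, 2],
  "batch_sizes": [1, 2, 4]
}
"""

workload = json.loads(workload_json)
# Each (tensor-parallel size, batch size) pair is one benchmark configuration.
combos = list(itertools.product(workload["tp_sizes"], workload["batch_sizes"]))
print(combos)  # -> [(1, 1), (1, 2), (1, 4), (2, 1), (2, 2), (2, 4)]
```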

add MTGPU backends for operators