Zhang Haoming
1. In Kubernetes, any Pod can add a Network Service using an annotation: annotations: networkservicemesh.io: "kernel://my-networkservice-1/nsm-1". If we want NSM to support SR-IOV, does the annotation need to change?...
bash ./run_benchmark.sh -m chatglm3-6b -d nf4 -s 1 -bs 1 -in 256 -out 256 -i 1
bash ./run_benchmark.sh -m chatglm3-6b -d int4 -s 1 -bs 1 -in 256 -out 256...
Use a real model, not a fake model. The test command is: "bash run_benchmark.sh -m llama-2-70b -d w8a8 -s 2 -bs 1 -in 2048 -out 2048 -kvd int8 -i 5". Use "numactl...
When dtype is bf16_int8 or bf16_fp16, running run.sh is killed immediately. [web_demo混合精度.zip](https://github.com/intel/xFasterTransformer/files/13373914/web_demo.zip)
The script is in the attachment: [llama2-7b.zip](https://github.com/intel/xFasterTransformer/files/13362390/llama2-7b.zip). The error output is shown below. 1. int8 memory node number: 16 HBM SNC4 mode llama2-7b.sh: 17: Bad substitution llama2-7b.sh: 17: Bad...