QuPengfei issues


Results: 2 issues by QuPengfei

Fix the calculation of the performance metric

The throughput/latency calculation is wrong when bs > 1: the values increase in an unexpected way. The tm_list computed below should be per token, not per batch. tm_list = np.array(perf_metrics.raw_metrics.m_durations) / 1000 / 1000...

llm_bench
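
To illustrate the per-token versus per-batch distinction raised in the first issue, here is a minimal sketch in plain Python. It rests on assumptions not confirmed by the truncated snippet: that each recorded duration is in microseconds, that it covers one generation step for the whole batch, and that each step produces one token per sequence. The helper names `per_token_durations_s` and `throughput_tokens_per_s` are hypothetical and are not part of llm_bench or of any library.

```python
def per_token_durations_s(step_durations_us, batch_size):
    """Convert per-step durations (microseconds) to per-token seconds.

    Assumption: one step yields `batch_size` tokens, so the time
    attributed to a single token is the step time divided by the batch size.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return [d / 1000 / 1000 / batch_size for d in step_durations_us]


def throughput_tokens_per_s(step_durations_us, batch_size):
    """Tokens generated per second across the whole batch."""
    if not step_durations_us:
        raise ValueError("need at least one duration")
    total_seconds = sum(step_durations_us) / 1000 / 1000
    total_tokens = len(step_durations_us) * batch_size
    return total_tokens / total_seconds


# Hypothetical measurements: two generation steps with a batch of 2.
durations_us = [20000, 40000]
tm_list = per_token_durations_s(durations_us, batch_size=2)
tokens_per_s = throughput_tokens_per_s(durations_us, batch_size=2)
```

With a NumPy array in place of the list, the same conversion is a single elementwise division by the batch size. Without that division, the per-step time is reported as if it were the time for one token, so the reported latency grows with the batch size.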

Issue with chatglm2-6b

I saw an issue with chatglm2-6b. It runs successfully with numactl -m 0 -C 0-23. It fails with numactl -m 0 -C 0-31, or 0-47, or...