
Fix to use token latency to measure performance

Open louie-tsai opened this issue 1 year ago • 0 comments

Type of Change

Bug fix: use token latency instead of total inference time to measure performance.

Description

The workshop notebook measures total inference time as its performance metric instead of token latency. This change switches the measurement to token latency.
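The difference between the two metrics can be sketched as follows. This is a minimal, self-contained illustration, not the notebook's actual code; the helper name `measure_token_latency` and the dummy decode step are hypothetical. Total inference time hides the distinction between first-token latency (prefill) and per-token decode latency, which is what the fix surfaces.

```python
import time

def measure_token_latency(generate_token, num_tokens):
    """Time each token-generation step separately (hypothetical helper).

    Returns (first_token_latency, avg_next_token_latency, total_time)
    in seconds. Total inference time alone would conflate the slower
    first (prefill) step with the subsequent decode steps.
    """
    latencies = []
    for _ in range(num_tokens):
        start = time.perf_counter()
        generate_token()  # stand-in for one model decode step
        latencies.append(time.perf_counter() - start)
    first = latencies[0]
    avg_next = sum(latencies[1:]) / max(len(latencies) - 1, 1)
    return first, avg_next, sum(latencies)

# Dummy decode step standing in for a real model call
first, avg_next, total = measure_token_latency(lambda: time.sleep(0.001), 8)
print(f"first-token: {first:.4f}s, avg next-token: {avg_next:.4f}s, total: {total:.4f}s")
```

Reporting first-token and average next-token latency separately is the common convention for LLM benchmarks, since the two phases have very different performance characteristics.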

Expected Behavior & Potential Risk

Token latency is used for performance measurement.

How has this PR been tested?

Manually tested on AWS.

Dependency Change?

No.

louie-tsai · Apr 27 '24 21:04