
Improve the benchmark test of the vineyard LLM KV cache

Open · dashanji opened this issue 1 year ago • 1 comment

What do these changes do?

Running the benchmark test now produces the following result:

Token list size is 17792
Total Update time is 2.22029s
Total Query time is 0.646123s
Average update time is 8013.38token/s
Average query time is 27536.5token/s

The query time includes both (1) querying the KV tensor pointer from vineyard and (2) the memcpy from the KV tensor pointer to the user's buffer.
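The two "average" figures reported by the benchmark are throughputs (tokens per second), and they appear to be the token list size divided by the corresponding total time. The following is a minimal sketch of that arithmetic, not the benchmark's actual code; the helper name `tokens_per_second` is hypothetical and only the three input numbers come from the reported result.

```python
def tokens_per_second(num_tokens: int, total_seconds: float) -> float:
    """Return throughput in tokens per second (hypothetical helper)."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return num_tokens / total_seconds


# Inputs copied from the reported benchmark result.
TOKENS = 17792
TOTAL_UPDATE_SECONDS = 2.22029
TOTAL_QUERY_SECONDS = 0.646123

update_tps = tokens_per_second(TOKENS, TOTAL_UPDATE_SECONDS)
query_tps = tokens_per_second(TOKENS, TOTAL_QUERY_SECONDS)

print(f"update throughput: {update_tps:.2f} token/s")
print(f"query throughput:  {query_tps:.2f} token/s")
```

The computed values agree with the reported 8013.38 token/s and 27536.5 token/s up to rounding of the printed times. Because the query time covers both the pointer lookup and the memcpy, the query throughput is an end-to-end figure rather than a pure lookup rate.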

dashanji · Mar 12 '24 12:03

/cc @sighingnow, this issue/PR has had no activity for a long time. Please help review its status and assign people to work on it.

github-actions[bot] · Apr 12 '24 00:04