v6d: Improve the benchmark test of the vineyard LLM KV cache

Open • dashanji opened this issue 1 year ago • 1 comment

What do these changes do?

Running the benchmark test produces the following result:

Token list size is 17792
Total Update time is 2.22029s
Total Query time is 0.646123s
Average update time is 8013.38 token/s
Average query time is 27536.5 token/s
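Although the last two lines are labelled "time", their unit is token/s, so they read as throughput. A minimal sketch, assuming each average is simply the token list size divided by the corresponding total time (the helper name `throughput` is hypothetical and not part of the benchmark):

```python
# Figures copied from the benchmark output in this issue.
token_list_size = 17792
total_update_seconds = 2.22029
total_query_seconds = 0.646123


def throughput(tokens: int, seconds: float) -> float:
    """Tokens processed per second of wall-clock time."""
    return tokens / seconds


update_tps = throughput(token_list_size, total_update_seconds)
query_tps = throughput(token_list_size, total_query_seconds)

print(f"update: {update_tps:.2f} token/s")
print(f"query:  {query_tps:.2f} token/s")
```

Under that assumption the recomputed values agree with the reported 8013.38 token/s and 27536.5 token/s up to rounding of the printed totals.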

The query time includes two steps: (1) querying the KV tensor pointer from vineyard, and (2) the memcpy from that KV tensor pointer into the user's buffer.
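To see how the two parts of the query time could be separated, here is a rough sketch. It does not use the vineyard API: the `cache` dict, `TENSOR_BYTES`, and `timed_query` are all hypothetical stand-ins, with a dict lookup playing the role of the pointer query and a slice assignment playing the role of the memcpy.

```python
import time

# Hypothetical stand-in for the KV cache: token id -> raw KV tensor bytes.
# This is NOT the vineyard API, only an in-memory placeholder.
TENSOR_BYTES = 1024
cache = {token: bytes([token % 256]) * TENSOR_BYTES for token in range(1000)}


def timed_query(tokens, cache, tensor_bytes):
    """Return (user_buffer, lookup_seconds, copy_seconds) for one query pass."""
    user_buffer = bytearray(len(tokens) * tensor_bytes)
    view = memoryview(user_buffer)
    lookup_seconds = 0.0
    copy_seconds = 0.0
    for i, token in enumerate(tokens):
        start = time.perf_counter()
        tensor = cache[token]  # step 1: look up the cached tensor
        mid = time.perf_counter()
        # step 2: copy the tensor bytes into the caller's buffer
        view[i * tensor_bytes:(i + 1) * tensor_bytes] = tensor
        end = time.perf_counter()
        lookup_seconds += mid - start
        copy_seconds += end - mid
    return user_buffer, lookup_seconds, copy_seconds


tokens = list(range(1000))
buffer, lookup_s, copy_s = timed_query(tokens, cache, TENSOR_BYTES)
total_s = lookup_s + copy_s
print(f"lookup: {lookup_s:.6f}s, copy: {copy_s:.6f}s, total: {total_s:.6f}s")
```

Reporting the two components separately would show whether the lookup or the copy dominates the total query time.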

Mar 12 '24 12:03 dashanji

/cc @sighingnow, this issue/PR has had no activity for a long time. Please help review its status and assign people to work on it.

Apr 12 '24 00:04 github-actions[bot]
