tensorrtllm_backend

The new kv cache related metrics are missing: allocTotalBlocks, allocNewBlocks, reusedBlocks

Open Pernekhan opened this issue 7 months ago • 0 comments

TensorRT-LLM exposes more KV cache stats than the backend currently reports.

Can the missing ones be added in next week's commits?

struct KvCacheStats
{
    SizeType32 maxNumBlocks;
    SizeType32 freeNumBlocks;
    SizeType32 usedNumBlocks;
    SizeType32 toksPerBlock;
    SizeType32 allocTotalBlocks;
    SizeType32 allocNewBlocks;
    SizeType32 reusedBlocks;
};
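
To make the request concrete, here is a minimal sketch of how the three fields could be picked out of the struct above and paired with metric labels. It does not show how the backend actually reports metrics: the helper `missingKvCacheMetrics` and the label strings `alloc_total`, `alloc_new`, and `reused` are hypothetical, and `SizeType32` is assumed to be a 32-bit signed integer.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Assumption: stand-in for the TensorRT-LLM typedef, taken to be a 32-bit signed integer.
using SizeType32 = std::int32_t;

// Copy of the struct quoted in this issue.
struct KvCacheStats
{
    SizeType32 maxNumBlocks;
    SizeType32 freeNumBlocks;
    SizeType32 usedNumBlocks;
    SizeType32 toksPerBlock;
    SizeType32 allocTotalBlocks;
    SizeType32 allocNewBlocks;
    SizeType32 reusedBlocks;
};

// Hypothetical helper: returns only the three fields this issue reports as missing,
// each paired with an illustrative label (the label names are made up for this sketch).
std::vector<std::pair<std::string, SizeType32>> missingKvCacheMetrics(KvCacheStats const& stats)
{
    return {
        {"alloc_total", stats.allocTotalBlocks},
        {"alloc_new", stats.allocNewBlocks},
        {"reused", stats.reusedBlocks},
    };
}
```

Each returned pair could then be passed to whatever metric-reporting path the backend already uses for `maxNumBlocks`, `freeNumBlocks`, `usedNumBlocks`, and `toksPerBlock`.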

Pernekhan, Jul 18 '24 22:07