
Does vtensor need a 64K/128K physical memory page policy?

Open nalinaly opened this issue 6 months ago • 0 comments

vAttention states that with a 2 MB page size, up to 128 MB of physical memory can be wasted per request in the worst case for Llama-3-8B (TP-1), whereas with a 64 KB page size the worst-case waste would be only 4 MB. Does vtensor have the same problem? Will vtensor support 64K/128K page sizes in the future?
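For reference, the figures quoted above are consistent with a simple back-of-the-envelope model. The sketch below assumes that each layer keeps separate K and V cache tensors, that each tensor can leave at most one partially filled physical page per request, and that Llama-3-8B has 32 layers. The function name and constants are illustrative and are not taken from vAttention or vtensor.

```python
KIB = 1024
MIB = 1024 * KIB

# Assumption: Llama-3-8B has 32 transformer layers.
LLAMA3_8B_LAYERS = 32


def worst_case_waste_bytes(num_layers: int, page_size_bytes: int,
                           tensors_per_layer: int = 2) -> int:
    """Upper bound on per-request waste under the stated assumptions.

    Each of the K and V tensors in every layer may end with one
    physical page that is almost entirely unused.
    """
    return num_layers * tensors_per_layer * page_size_bytes


waste_2m = worst_case_waste_bytes(LLAMA3_8B_LAYERS, 2 * MIB)
waste_128k = worst_case_waste_bytes(LLAMA3_8B_LAYERS, 128 * KIB)
waste_64k = worst_case_waste_bytes(LLAMA3_8B_LAYERS, 64 * KIB)

for label, waste in [("2M", waste_2m), ("128K", waste_128k), ("64K", waste_64k)]:
    print(f"page size {label}: worst-case waste {waste // MIB} MiB per request")
```

Under this model the waste scales linearly with page size, so the same question applies to any allocator that maps KV-cache memory in fixed-size physical pages.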

nalinaly, Aug 08 '24 07:08