xFasterTransformer
Add the environment variable KV_CACHE_LOCATION to control which NUMA node the KV cache memory is allocated on.
Usage: before launching an instance, run export KV_CACHE_LOCATION=<numa_node_id>, where <numa_node_id> is the ID of the memory NUMA node you want to use for the KV cache.
By default, the KV cache location is the same as for the other parts of the instance.
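
For launcher scripts that start several instances, the usage above could be wrapped in a small helper. The following is only a sketch under stated assumptions: `build_instance_env` is a hypothetical helper name, not part of xFasterTransformer, and it assumes the variable takes a non-negative integer NUMA node ID, as the usage line suggests.

```python
import os


def build_instance_env(kv_cache_node=None, base_env=None):
    """Return an environment dict for launching one instance.

    Hypothetical helper, not part of xFasterTransformer.
    kv_cache_node: NUMA node ID for the KV cache, or None for the default.
    base_env: environment to start from (defaults to the current process's).
    """
    env = dict(os.environ if base_env is None else base_env)
    if kv_cache_node is None:
        # Leave the variable unset so the KV cache stays in the same
        # location as the other parts of the instance (the default).
        env.pop("KV_CACHE_LOCATION", None)
    else:
        # Assumption: the value is a non-negative integer NUMA node ID.
        if kv_cache_node < 0:
            raise ValueError("NUMA node ID must be non-negative")
        env["KV_CACHE_LOCATION"] = str(kv_cache_node)
    return env
```

The returned dict can be passed as the `env` argument of `subprocess.run` together with whatever command starts the instance, which has the same effect as running export in the shell beforehand.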