
KV_cache offload

Open yuzhenmao opened this issue 11 months ago • 3 comments

Hi, I am using the latest Hugging Face transformers (version==4.48.0.dev0). When I try to run the demo from here, I get this error: AttributeError: 'LlamaForCausalLM' object has no attribute 'set_kv_cache_offload'.

Does anyone know how to solve this issue? Thanks!
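
The AttributeError above says that the loaded model object does not define set_kv_cache_offload. One possible explanation, which is an assumption and not confirmed in this thread, is that the demo expects a transformers build that adds this method. A minimal sketch of how one might check for the method before calling it follows; has_method, PatchedModel, and StockModel are hypothetical names used only for illustration and are not part of transformers or DeepSpeed.

```python
def has_method(obj, name="set_kv_cache_offload"):
    """Return True if `obj` exposes a callable attribute called `name`."""
    return callable(getattr(obj, name, None))


# Hypothetical stand-ins: one class that defines the method, one that does not.
# In practice `obj` would be the model returned by from_pretrained(...).
class PatchedModel:
    def set_kv_cache_offload(self, enabled):
        self.kv_cache_offload = enabled


class StockModel:
    pass


for model in (PatchedModel(), StockModel()):
    # Report whether each object could accept the call the demo makes.
    print(type(model).__name__, has_method(model))
```

If the check returns False for the real model, the installed transformers version does not provide the method, and the demo's own requirements are the place to look for the version it was written against.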

yuzhenmao • Dec 15 '24 06:12