DeepSpeedExamples
KV_cache offload
Hi, I am using the latest Hugging Face transformers (version 4.48.0.dev0). When I try to run the demo from here, I get this error: AttributeError: 'LlamaForCausalLM' object has no attribute 'set_kv_cache_offload'.
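For anyone trying to reproduce this, here is a minimal sketch of a check that could be run before the failing call. It assumes only that the demo calls `set_kv_cache_offload` on the model object. The names `supports_kv_cache_offload` and `DummyModel` are hypothetical, made up for illustration, and are not part of transformers or DeepSpeed.

```python
def supports_kv_cache_offload(model) -> bool:
    """Return True if the object exposes a callable `set_kv_cache_offload`."""
    return callable(getattr(model, "set_kv_cache_offload", None))


class DummyModel:
    """Hypothetical stand-in for a model class that lacks the method."""


if __name__ == "__main__":
    model = DummyModel()  # in the real demo this would be the loaded model
    if not supports_kv_cache_offload(model):
        # Same condition that raises the AttributeError in the demo.
        print(f"{type(model).__name__} has no set_kv_cache_offload")
```

Running the same check on the loaded `LlamaForCausalLM` instance should show whether the installed transformers build provides the method at all.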
Does anyone know how to solve this issue? Thanks!