
AttributeError: 'LlamaModel' object has no attribute '_use_flash_attention_2'

Open raghavgarg97 opened this issue 1 year ago • 2 comments

I was running speedup.sh with a Llama model and hit the following error:

[Screenshot of the error traceback, May 20, 2024]

The error comes from this line in Consistency_LLM/cllm/cllm_llama_modeling.py: https://github.com/hao-ai-lab/Consistency_LLM/blob/b2a7283bafd65121e868b92fbeb811aac140be17/cllm/cllm_llama_modeling.py#L154

The check needs to be updated to `if self.model.config._attn_implementation == 'flash_attention_2':`. Do I need to change the model config to measure the speed of the base model with Jacobi iteration? The base model is `meta-llama/Meta-Llama-3-8B-Instruct`.
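For reference, a version-agnostic check could look roughly like the sketch below. This is only a sketch, assuming that older transformers releases (around 4.36) expose a private `_use_flash_attention_2` flag on the model while newer releases record the backend choice in `config._attn_implementation`; `uses_flash_attention_2` is a hypothetical helper, not an existing function in Consistency_LLM or transformers.

```python
# Hypothetical compatibility helper, not the repo's actual fix.
def uses_flash_attention_2(model) -> bool:
    """Return True if the model was loaded with the FlashAttention-2 backend."""
    # Newer transformers record the attention backend on the config.
    attn_impl = getattr(model.config, "_attn_implementation", None)
    if attn_impl is not None:
        return attn_impl == "flash_attention_2"
    # Older transformers set a private flag directly on the model.
    return bool(getattr(model, "_use_flash_attention_2", False))
```

With a helper like this, the failing line could read `if uses_flash_attention_2(self.model):` instead of checking the private attribute directly.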

— raghavgarg97, May 20, 2024

Did you use the package versions we provided in requirements.txt? If not, what PyTorch and transformers versions are you using?

— snyhlxde1, May 21, 2024

With all due respect, the versions in requirements.txt are quite dated (transformers 4.36). Could you please find a solution that works with the current transformers version?

Also, please reduce requirements.txt from 180 packages to a manageable minimum, and avoid pinning to specific versions unless critical.

— poedator, Jul 13, 2024