Regarding the flash attention issue
When I ran it, I got a warning about flash attention, and I would like to know whether this is expected. While it was running, my GPU (an NVIDIA 4070 Super) stayed at 100% utilization for over 10 seconds. This is important to us because it relates to our team's model improvements. I followed your requirements.txt and the demo for the Python dependencies, so I can confirm that CUDA and PyTorch are set up correctly. However, my Python version is 3.11.12; could that be the problem? Thank you.
Hi, that is not normal; it means flash attention cannot be applied.
Essentially, the warning says that the built-in function `scaled_dot_product_attention` is not working correctly. After a quick search, this looks like the same problem as this issue:
https://github.com/CompVis/stable-diffusion/issues/850
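If it helps to confirm the diagnosis, here is a minimal probe sketch, assuming a PyTorch 2.x CUDA build (note that `torch.backends.cuda.sdp_kernel` is the older spelling of this context manager; newer releases moved it to `torch.nn.attention.sdpa_kernel`). It forces SDPA to use only the flash backend, so it either runs or surfaces the underlying error directly:

```python
import torch
import torch.nn.functional as F

# Small fp16 tensors on CUDA: flash attention requires half precision
# and a supported head dimension (64 here is safe).
q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

try:
    # Disable the math and memory-efficient fallbacks so SDPA must
    # dispatch to the flash kernel or fail loudly.
    with torch.backends.cuda.sdp_kernel(
        enable_flash=True, enable_math=False, enable_mem_efficient=False
    ):
        out = F.scaled_dot_product_attention(q, k, v)
    print("flash attention kernel ran, output shape:", out.shape)
except RuntimeError as e:
    print("flash attention unavailable:", e)
```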
I would recommend (a) double-checking your PyTorch version and installation, especially the CUDA version it was built against, and (b) using a Linux machine rather than Windows.
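For (a), a quick way to print the relevant versions and SDPA backend flags (plain `torch` introspection; these query functions are available in PyTorch 2.x):

```python
import torch

print("torch:", torch.__version__)
print("built with CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))

# Whether each SDPA backend is currently enabled for dispatch.
print("flash SDP enabled:", torch.backends.cuda.flash_sdp_enabled())
print("mem-efficient SDP enabled:", torch.backends.cuda.mem_efficient_sdp_enabled())
print("math SDP enabled:", torch.backends.cuda.math_sdp_enabled())
```

Posting that output here would also make it easier to compare against the setup in the linked issue.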
Alright, I really appreciate your assistance. I'm working on resolving this now; may I reach out to you again if I have additional questions later?