
Add an option to disable flash attention for older GPUs

Open apcameron opened this issue 10 months ago • 2 comments

When I run generation_inference.py I get the error below.

RuntimeError: FlashAttention only supports Ampere GPUs or newer.

Please add an option to disable it.

apcameron avatar Feb 14 '25 17:02 apcameron

Same problem. Is there a way to disable it manually?

shootheart avatar May 19 '25 03:05 shootheart

A simple way is to modify the checkpoint's `config.json`.

hongjx175 avatar May 19 '25 06:05 hongjx175
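For reference, one hedged sketch of the `config.json` edit: assuming the checkpoint follows the Hugging Face `transformers` convention of reading an `_attn_implementation` field from the model config, switching it to `"eager"` should make the model fall back to the standard attention path instead of FlashAttention. The exact field name and its nesting inside Janus's config are assumptions here; check the actual `config.json` shipped with the checkpoint.

```json
{
  "_attn_implementation": "eager"
}
```

An alternative, if you load the model yourself with `transformers`, is to pass `attn_implementation="eager"` to `from_pretrained`, which overrides the config value at load time.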