Janus
Add an option to disable flash attention for older GPUs
When I run generation_inference.py, I get the error below.
RuntimeError: FlashAttention only supports Ampere GPUs or newer.
Please add an option to disable it.
Same problem. Is there a way to disable it manually?
A simple way is to modify the model's config.json so it no longer selects FlashAttention.
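If you'd rather not edit the checkpoint files, you can also force the attention backend at load time. Below is a minimal sketch, assuming the model is loaded through Hugging Face transformers with `from_pretrained` (the model path, dtype, and exact loading code in generation_inference.py may differ from this):

```python
# Sketch: disable FlashAttention-2 by requesting the "eager" attention backend.
# Assumes a transformers-based from_pretrained call; adjust names to your setup.
import torch
from transformers import AutoModelForCausalLM

model_path = "deepseek-ai/Janus-Pro-7B"  # hypothetical path, use your checkpoint

vl_gpt = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    torch_dtype=torch.float16,       # pre-Ampere GPUs generally lack bfloat16 support
    attn_implementation="eager",     # avoids the FlashAttention kernel entirely
)
vl_gpt = vl_gpt.cuda().eval()
```

Editing config.json should have the same effect if it contains an attention-implementation entry set to FlashAttention (the exact key name varies by model, so check your downloaded config); passing `attn_implementation="eager"` simply overrides that choice without touching the files.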