syncode
Option to use flash_attention
It would be very useful to have the option to use flash attention to increase speed and lower memory usage.
If you use SynCode as a logits processor, you should be able to use flash attention even in the current version. See this example
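For readers without access to the linked example, here is a minimal sketch of the general idea, not SynCode's own code. It assumes a Hugging Face `transformers` model loaded with `attn_implementation="flash_attention_2"`, a CUDA GPU, and the `flash-attn` package installed. The helper names `build_model_kwargs` and `generate_with_processor` are hypothetical, and `make_processor` stands in for whatever builds the SynCode logits processor from the tokenizer; its construction and any per-prompt setup it needs are deliberately left out.

```python
def build_model_kwargs(use_flash_attention: bool) -> dict:
    """Extra keyword arguments for AutoModelForCausalLM.from_pretrained."""
    # attn_implementation is a Hugging Face argument, independent of SynCode.
    if use_flash_attention:
        return {"attn_implementation": "flash_attention_2"}
    return {}


def generate_with_processor(model_id, prompt, make_processor,
                            use_flash_attention=True, max_new_tokens=64):
    """Generate text with a caller-supplied logits processor.

    make_processor is a hypothetical callable: tokenizer -> logits processor
    (for example, one that builds the SynCode processor).
    """
    # Heavy imports are kept local so the helper above stays importable
    # on machines without torch or a GPU.
    import torch
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              LogitsProcessorList)

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # FlashAttention-2 needs half precision, so bfloat16 is assumed here.
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        **build_model_kwargs(use_flash_attention),
    ).to("cuda")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        logits_processor=LogitsProcessorList([make_processor(tokenizer)]),
    )
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

The attention backend is chosen when the model is loaded, while the logits processor only sees the scores at each decoding step, which is why the two should not interfere with each other.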