tft-torch

Usage of flash attention

Open • shaharbar1 opened this issue 1 year ago • 1 comment

Consider wrapping the call to `self.attention` in `InterpretableMultiHeadAttention` in a `with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=True):` block to improve speed and memory efficiency.
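For illustration, a minimal sketch of what this could look like. It rests on some assumptions: the context manager only influences calls to `torch.nn.functional.scaled_dot_product_attention`, so the sketch assumes the attention computation is routed through that function. The fused kernels also do not return attention weights, which may matter if the module exposes them for interpretability. `fused_attention` and `reference_attention` are hypothetical helper names, not part of tft-torch. Newer PyTorch releases deprecate `torch.backends.cuda.sdp_kernel` in favor of `torch.nn.attention.sdpa_kernel`.

```python
import contextlib
import math

import torch
import torch.nn.functional as F


def fused_attention(q, k, v):
    """Hypothetical helper: SDPA restricted to fused kernels when running on CUDA."""
    if q.is_cuda:
        # Flags from the suggestion: allow the flash and memory-efficient
        # kernels and disallow the plain math fallback.
        ctx = torch.backends.cuda.sdp_kernel(
            enable_flash=True, enable_math=False, enable_mem_efficient=True
        )
    else:
        # On CPU, leave backend selection to PyTorch.
        ctx = contextlib.nullcontext()
    with ctx:
        return F.scaled_dot_product_attention(q, k, v)


def reference_attention(q, k, v):
    """Unfused softmax(Q K^T / sqrt(d)) V, used only to check the result."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    return torch.softmax(scores, dim=-1) @ v


torch.manual_seed(0)
# Shape: (batch, heads, sequence length, head dimension)
q = torch.randn(2, 4, 8, 16)
k = torch.randn(2, 4, 8, 16)
v = torch.randn(2, 4, 8, 16)
out = fused_attention(q, k, v)
```

With `enable_math=False`, the call raises an error on CUDA if no fused kernel supports the given inputs (for example, the flash kernel requires half-precision tensors), so it may be worth keeping the math fallback enabled until the input dtypes and shapes are known to qualify.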

shaharbar1, Jan 14 '24 13:01