
Add softcap support to Flash Attention

Open Lzhang-hub opened this issue 1 year ago • 2 comments

Description

Flash Attention added softcap support in commit 8f873cc6. Gemma 2 uses this feature.
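For readers unfamiliar with softcapping, here is a minimal pure-Python sketch of the idea: attention scores are squashed with `softcap * tanh(score / softcap)` before the softmax, which bounds them to the range (-softcap, softcap). The helper names `softcap_scores` and `softmax` are hypothetical and written for illustration only. Treating a non-positive `softcap` as "disabled" is an assumption made here. This is not the fused kernel from flash-attn, only a reference for the math.

```python
import math


def softcap_scores(scores, softcap):
    """Apply tanh soft-capping to a list of raw attention scores.

    Assumption for this sketch: softcap <= 0 means the feature is disabled
    and the scores are returned unchanged.
    """
    if softcap <= 0.0:
        return list(scores)
    # tanh maps any real number into (-1, 1); scaling bounds it by softcap.
    return [softcap * math.tanh(s / softcap) for s in scores]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: cap one row of query-key scores, then normalise it.
raw_row = [-1000.0, 10.0, 1000.0]
capped_row = softcap_scores(raw_row, softcap=50.0)
attention_weights = softmax(capped_row)
```

Small scores pass through almost unchanged because `tanh(x)` is close to `x` near zero, while very large scores saturate at the cap instead of dominating the softmax.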

Fixes # (issue)

Type of change

  • [ ] New feature (non-breaking change which adds functionality)

Changes

Add the softcap argument to FlashAttention and update _flash_attn_max_version to 2.6.1.
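The change could be pictured roughly as follows. This is a hedged sketch, not the code in this PR: `parse_version`, `build_flash_attn_kwargs`, and both constants are hypothetical names, and the minimum version for softcap (2.6.0) is an assumption. Only the 2.6.1 upper bound comes from the description above. The sketch forwards `softcap` only when it is enabled and the installed flash-attn version is inside the supported range.

```python
def parse_version(text):
    """Turn '2.6.1' or '2.6.1.post1' into (2, 6, 1).

    Simplified on purpose: pre-release strings such as '2.6.0rc1'
    are not handled by this sketch.
    """
    return tuple(int(part) for part in text.split(".")[:3])


# Assumption for illustration: first release that accepts `softcap`.
SOFTCAP_MIN_VERSION = (2, 6, 0)
# Upper bound taken from the PR description.
FLASH_ATTN_MAX_VERSION = (2, 6, 1)


def build_flash_attn_kwargs(installed_version, softcap=0.0):
    """Return extra keyword arguments for the attention call."""
    version = parse_version(installed_version)
    if version > FLASH_ATTN_MAX_VERSION:
        raise RuntimeError("flash-attn %s is newer than supported" % installed_version)
    kwargs = {}
    if softcap > 0.0:
        if version < SOFTCAP_MIN_VERSION:
            raise ValueError("softcap needs a newer flash-attn release")
        # Only pass the argument when the backend understands it.
        kwargs["softcap"] = softcap
    return kwargs
```

Keeping the argument out of the call when softcapping is off means older flash-attn installs keep working for models that do not use it.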

Checklist:

  • [ ] I have read and followed the contributing guidelines
  • [ ] The functionality is complete
  • [ ] I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] My changes generate no new warnings
  • [ ] I have added tests that prove my fix is effective or that my feature works

Lzhang-hub, Jul 14 '24 02:07