TransformerEngine
[Feature Request] Is there a roadmap for supporting FP8 attention computation?
Currently, only FP16/BF16 are supported in the FusedAttention class.
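For context, here is a minimal sketch of the current behavior, assuming the public `transformer_engine.pytorch.DotProductAttention` API (the module that dispatches to backends such as FusedAttention) and illustrative tensor shapes: even inside an `fp8_autocast` region, the query/key/value tensors must be FP16/BF16, so the attention math itself is not performed in FP8.

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common.recipe import DelayedScaling, Format

# Illustrative shapes, chosen arbitrarily for this sketch.
seq_len, batch, heads, head_dim = 128, 2, 16, 64

attn = te.DotProductAttention(num_attention_heads=heads, kv_channels=head_dim)

# Inputs must be FP16/BF16; there is no FP8 code path for the attention kernel.
# Default qkv_format is "sbhd": (seq, batch, heads, head_dim).
q = torch.randn(seq_len, batch, heads, head_dim, dtype=torch.bfloat16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

fp8_recipe = DelayedScaling(fp8_format=Format.HYBRID)

# fp8_autocast enables FP8 for the GEMM-based layers (e.g. te.Linear),
# but the fused attention computation here still runs in BF16.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = attn(q, k, v)

print(out.dtype)  # torch.bfloat16 -- attention compute stays in BF16
```

It would be great to know whether FP8 attention (e.g. FP8 inputs/outputs for the fused kernel) is planned, and roughly on what timeline.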