Sana
About kernel fusion
You mentioned in the report that Triton was used for kernel fusion, but the corresponding functions for attn and ffn in the config were not called. Could you tell me why?
```yaml
attn_type: linear
ffn_type: glumbconv
```
Some of our users on Windows systems are not comfortable with Triton, so we set the default to the original linear attention implementation for a better experience for everyone.
If I want to use Triton, attn_type can be set to triton_linear, but what should ffn_type be set to? Will loading pretrained weights be affected?
> Some of our users on Windows systems are not comfortable with Triton, so we set the default to the original linear attention implementation for a better experience for everyone.
Triton now runs on Windows via this repository, and it is working:
https://github.com/woct0rdho/triton-windows/releases
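A minimal sketch (not from the Sana repo) for verifying the install before switching the config; it assumes only that the triton package exposes __version__:

```python
# Check that Triton is importable before switching attn_type to
# triton_linear; fall back to the default linear attention otherwise.
try:
    import triton
    print(f"Triton {triton.__version__} detected; triton_linear should be usable")
except ImportError:
    print("Triton not installed; keep the default attn_type: linear")
```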
> If I want to use Triton, attn_type can be set to triton_linear, but what should ffn_type be set to? Will loading pretrained weights be affected?
I have the same question. Could you let me know if you've resolved this issue, or do you have any recommended settings for the ffn_type configuration?
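Pending an official answer, here is a hedged sketch of the switch discussed above. The attn_type value is the one named in this thread; ffn_type is left at the known default (glumbconv), since whether a Triton-fused FFN variant exists, and what it is called, is exactly the open question:

```yaml
# Sketch only: attn_type: triton_linear is confirmed in this thread.
# ffn_type stays at the documented default because the name of a
# Triton-fused FFN variant (if any) is unverified.
attn_type: triton_linear
ffn_type: glumbconv
```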